In my case I ended up modifying the API workflow_execute.py script and
used it to create symlinks I could use to run processes outside of
Galaxy. If the new processing machine doesn't have access to the
directory tree, I use rsync with '--copy-links' to download the
datasets.
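For reference, the dereferencing that '--copy-links' performs can be sketched locally with Python's shutil, which follows symlinks by default (the paths and file content below are made-up examples, not real Galaxy datasets):

```python
import os
import shutil
import tempfile

# made-up example paths, standing in for a Galaxy dataset and its symlink
workdir = tempfile.mkdtemp()
real_file = os.path.join(workdir, 'dataset_001.dat')
link = os.path.join(workdir, 'my_sample.fastq')
copy = os.path.join(workdir, 'downloaded.fastq')

with open(real_file, 'w') as f:
    f.write('@read1\nACGT\n+\nIIII\n')

# symlink using the human-readable dataset name
os.symlink(real_file, link)

# shutil.copy() dereferences the symlink, like rsync --copy-links:
# the destination is a regular file with the real content
shutil.copy(link, copy)
print(os.path.islink(copy))                              # → False
print(open(copy).read() == open(real_file).read())       # → True
```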

This is the important part of my script:
    import os
    from re import match
    # get() is the helper from Galaxy's scripts/api/common.py

    r = get(api_key, galaxy_url)
    # iterate over each dataset in the library or history
    for item in r:
        # check if the name matches the regular expression
        if item['type'] == 'file' and match(regex, item['name']) is not None:
            # if so, go and get the details for the dataset
            item_details = get(api_key, galaxy_url + '/' + item['id'])
            # symlink to the dataset's real file name
            os.symlink(item_details['file_name'], item_details['name'])
            # if this is a BAM file, also symlink to the BAM index file
            if item_details['data_type'] == 'bam':
                os.symlink(item_details['metadata_bam_index'],
                           item_details['name'] + '.bai')

This approach has a few advantages over simply scp'ing the files. First,
you get the same name you are using in Galaxy rather than a generic
dataset_000.dat. Also, you can copy multiple files whose names match a
provided regular expression; with scp you would have to copy them one
at a time.
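To illustrate the regex point, this is the same kind of name filtering the script does, on a made-up list of dataset names and a made-up pattern:

```python
from re import match

# made-up dataset names and pattern, for illustration only
names = ['sample1.bam', 'sample2.bam', 'controls.txt', 'sample1.bai']
regex = r'sample\d+\.bam'

# keep only the names matching the pattern, as the script does
matched = [n for n in names if match(regex, n) is not None]
print(matched)  # → ['sample1.bam', 'sample2.bam']
```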

If you are interested I can send you the whole script. I'm sure you
will find several ways to break it, but in general it does its job.

Regards,
Carlos

On Wed, Jun 6, 2012 at 11:20 AM, Hans-Rudolf Hotz <h...@fmi.ch> wrote:
> Hi Jean-Francois
>
> My random guess: this is a web browser issue, struggling to download big
> files? - although, 3000 lines is not big
>
> If required, we use scp on the command line to get a copy of the dataset
>
>
> Regards, Hans
>
>
>
> On 06/06/2012 03:53 PM, Jean-Francois Payotte wrote:
>>
>> Hi folks,
>>
>> Some of our local Galaxy instance users seem to be experiencing some
>> strange behaviour lately. I searched the mailing-list archive but I
>> didn't find anything related, so I'd be interested to know if somebody
>> has already had the same issue.
>>
>> The problem is that sometimes, when people are trying to download their
>> datasets from their history, the downloaded file turns out to be
>> incomplete even though the download appears to succeed (for example, a
>> 3000-line text file will show only maybe 2000 lines on the first
>> download, 1600 lines on the second, and so on), though eventually the
>> file will download completely.
>>
>> This issue happened with more than one user and with different tools.
>>
>> Has anybody ever had this kind of issue? Or would somebody have an
>> idea of where to look to solve this problem?
>>
>> Best regards,
>> Jean-François
>>
>>
>> ___________________________________________________________
>> Please keep all replies on the list by using "reply all"
>> in your mail client.  To manage your subscriptions to this
>> and other Galaxy lists, please use the interface at:
>>
>>   http://lists.bx.psu.edu/
>
