Re: [galaxy-dev] Tools that make datasets

2015-10-21 Thread Peter van Heusden
These names have meaning in CollectedDatasetMatch:

designation means designation if that exists else name
name means name
dbkey means dbkey
ext means ext
visible means visible
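
A rough sketch of that lookup, for illustration only. `resolve_fields` is a made-up helper, not Galaxy's CollectedDatasetMatch; it just applies "designation if that exists else name" to the named groups of a regexp match. The two patterns are written with the group names their keys suggest, which is an assumption.

```python
import re

def resolve_fields(pattern, filename):
    """Return the named groups of a match, with designation falling back to name."""
    match = re.match(pattern, filename)
    if match is None:
        return None
    groups = match.groupdict()
    fields = {key: groups.get(key) for key in ("name", "ext", "dbkey", "visible")}
    # "designation means designation if that exists else name"
    fields["designation"] = groups.get("designation") or groups.get("name")
    return fields

# A name-style pattern and a designation-style pattern on the same file.
by_name = resolve_fields(r"(?P<name>.*)\.(?P<ext>[^\.]+)?", "one.txt")
by_designation = resolve_fields(r"(?P<designation>.*)\.(?P<ext>[^\._]+)?", "one.txt")
```

Under this sketch both calls end up with the same designation, which would be consistent with the two patterns appearing to have the same effect in a simple example.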


On 21 October 2015 at 08:44, Steve Cassidy  wrote:

> Thanks, yes that works, though I'm not really sure what the difference is
> between __name__ and __designation__. They both seem to have the same
> effect in my example.
>
> Thanks for your help.
>
> Steve
>
>
> On 21 October 2015 at 16:49, Peter van Heusden  wrote:
>
>> Since you're using a directory, you can use one of the built in patterns:
>>
>> DEFAULT_EXTRA_FILENAME_PATTERN = r"primary_DATASET_ID_(?P<designation>[^_]+)_(?P<visible>[^_]+)_(?P<ext>[^_]+)(_(?P<dbkey>[^_]+))?"
>>
>> NAMED_PATTERNS = {
>>     "__default__": DEFAULT_EXTRA_FILENAME_PATTERN,
>>     "__name__": r"(?P<name>.*)",
>>     "__designation__": r"(?P<designation>.*)",
>>     "__name_and_ext__": r"(?P<name>.*)\.(?P<ext>[^\.]+)?",
>>     "__designation_and_ext__": r"(?P<designation>.*)\.(?P<ext>[^\._]+)?",
>> }
>>
>> In terms of docs, I don't know what the future is - the Galaxy wiki or 
>> http://galaxy.readthedocs.org/en/master/
>>

Re: [galaxy-dev] Tools that make datasets

2015-10-21 Thread Steve Cassidy
A brief writeup of my experiences:

http://web.science.mq.edu.au/~cassidy/wordpress/2015/10/21/galaxy-tool-generating-datasets/

Steve


Re: [galaxy-dev] Tools that make datasets

2015-10-20 Thread Peter van Heusden
Just a quick check - did you refresh your history to confirm that the
dataset *is* empty? We had the same thing at SANBI but it turns out that
Galaxy creates an empty output collection and then only populates it
sometime after job completion (this is a known UI bug).

See:
http://pvh.wp.sanbi.ac.za/2015/09/18/adventures-in-galaxy-output-collections/

___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] Tools that make datasets

2015-10-20 Thread Steve Cassidy
Thanks Peter,
  I did see that proviso somewhere but no, refreshing doesn't help.

That page was one of those that I referred to in getting to this point.

Steve

-- 
Department of Computing, Macquarie University
http://web.science.mq.edu.au/~cassidy/

Re: [galaxy-dev] Tools that make datasets

2015-10-20 Thread Peter van Heusden
I suspect that the problem might be in the <discover_datasets> then. I'm
not an expert on this, but "__name_and_ext__" turns into the
regexp r"(?P<name>.*)\.(?P<ext>[^\.]+)?" in
lib/galaxy/tools/parameters/output_collect.py, and is used by the
DatasetCollector (line 358). This looks like it should match the filenames
you're creating, but I'm not 100% sure how that code works. One thing I
notice is the "directory" argument. If you write the files to the current
directory instead of "output_path" can you get it to work?
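
To see what that pattern does with a few filenames, here is a small sketch. It assumes the named groups are `name` and `ext`, as the pattern's key suggests; `split_name_and_ext` is a made-up helper and not part of Galaxy.

```python
import re

# "__name_and_ext__" pattern, with group names assumed to be name and ext.
NAME_AND_EXT = r"(?P<name>.*)\.(?P<ext>[^\.]+)?"

def split_name_and_ext(filename):
    """Return (name, ext) if the pattern matches the filename, else None."""
    match = re.match(NAME_AND_EXT, filename)
    if match is None:
        return None
    return match.group("name"), match.group("ext")

filenames = ["one.txt", "archive.tar.gz", "README"]
results = {f: split_name_and_ext(f) for f in filenames}
```

Because `.*` is greedy, the split happens at the last dot, and a filename with no dot at all is not matched.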

Peter


Re: [galaxy-dev] Tools that make datasets

2015-10-20 Thread Steve Cassidy
Yes, I'm sure that's where the problem lies. Writing out to the current
directory doesn't work.  The files get written to
'job_working_directory/000/1/' but if I run the Upload File tool the result
is placed in 'files/000/'.  I think I need to work out where to write the
files. I found some references to $__new_file_path__ but that doesn't seem
to help.

Steve



-- 
Department of Computing, Macquarie University
http://web.science.mq.edu.au/~cassidy/

[galaxy-dev] Tools that make datasets

2015-10-20 Thread Steve Cassidy
Hi all,
  I'm trying to understand how to write a tool that generates a dataset
rather than a single output file.  I've tried following all of the examples
but I'm stuck, so I thought I would distil down the simplest example I
could write and ask for help here.

So here's my example:

https://gist.github.com/stevecassidy/0fa45ad5853faacb5f55

it's a simple python script that writes three files to a directory named
for the single input parameter.

I think one of the problems I'm having is knowing where to write the output
to. I've run this under planemo serve and the job runs, creating the output
directory within the 'job_working_directory/000/1/SampleDataset' directory,
however my dataset doesn't contain anything so clearly my outputs directive
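isn't working:

<outputs>
    <collection ...>
        <discover_datasets pattern="__name_and_ext__" directory="$job_name" />
    </collection>
</outputs>
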
($job_name is the name of the directory that is being written to,
SampleDataset in this case)

Any help in getting this example working would be appreciated.

Thanks,

Steve





-- 
Department of Computing, Macquarie University
http://web.science.mq.edu.au/~cassidy/

Re: [galaxy-dev] Tools that make datasets

2015-10-20 Thread Steve Cassidy
Ah, thank you, yes, I can now get results by using patterns to match the
output.  I used your example but prepended 'simple' to the filename and then
searched for that with:


this solves the problem for the sample script but not generally since in
general I can't predict the filenames that will be generated - this is a
tool for downloading data from a repository which could be text, audio or
video data.

If I don't use the 'simple' prefix and omit the file extension I still get
my data but I also get three other files which are temporary scripts
generated by Galaxy and placed in the working directory.  So, back to
trying to put things in a subdirectory. It turns out that the issue I was
having was, as you pointed out earlier, that the directory attribute of
discover_datasets doesn't allow variables, so I need to write to a fixed
directory name:


This now works!

I had thought that I'd need to use a unique directory name but since Galaxy
runs each job in a separate directory, this isn't required.  My real tool
now works too after following the same pattern.
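
A minimal sketch of that arrangement from the tool script's side. The directory name `outputs` and the helper `write_outputs` are hypothetical; the only point is that a fixed name is safe because each job gets its own working directory. Two temporary directories stand in for two jobs.

```python
import os
import tempfile

OUTPUT_DIR = "outputs"  # hypothetical fixed name; would have to match directory= in the tool XML

def write_outputs(contents, output_dir=OUTPUT_DIR):
    """Write each filename/text pair into output_dir under the current directory."""
    os.makedirs(output_dir, exist_ok=True)
    written = []
    for filename, text in contents.items():
        path = os.path.join(output_dir, filename)
        with open(path, "w") as handle:
            handle.write(text)
        written.append(path)
    return written

# Simulate two jobs, each running in its own working directory.
listings = []
for _ in range(2):
    with tempfile.TemporaryDirectory() as job_dir:
        previous = os.getcwd()
        os.chdir(job_dir)
        try:
            write_outputs({"one.txt": "1\n", "two.txt": "2\n"})
            listings.append(sorted(os.listdir(OUTPUT_DIR)))
        finally:
            os.chdir(previous)
```

Both simulated jobs use the same directory name without seeing each other's files, and nothing else in the working directory ends up in the listing.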

Thanks for your help.  I'll see if I can write this up in a blog post.

Steve




Re: [galaxy-dev] Tools that make datasets

2015-10-20 Thread Peter van Heusden
I poked around at your tool XML and the code a bit and the problem is
directory="$job_name". Galaxy expects to collect files from the job's
working directory - basically the current working directory the job runs
in. The directory= argument doesn't have variables expanded as far as I can
tell. In any event it is used in walk_over_extra_files() that is in
lib/galaxy/tools/parameters/output_collect.py - if you look there you see
that it is simply appended to the job's working directory.
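
Roughly what that amounts to, as a simplified sketch and not the actual walk_over_extra_files() code. `discover` is a made-up function; it joins the directory value onto the working directory exactly as given, so a literal "$job_name" never matches a real directory.

```python
import os
import re
import tempfile

def discover(job_working_directory, directory, pattern):
    """List files in job_working_directory/directory whose names match pattern."""
    # The directory value is used literally: "$job_name" is not expanded.
    search_dir = os.path.join(job_working_directory, directory)
    if not os.path.isdir(search_dir):
        return []
    return sorted(f for f in os.listdir(search_dir) if re.match(pattern, f))

with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for the job writing its three files into SampleDataset.
    os.mkdir(os.path.join(workdir, "SampleDataset"))
    for name in ("one.txt", "two.txt", "three.txt"):
        with open(os.path.join(workdir, "SampleDataset", name), "w") as handle:
            handle.write("example\n")
    unexpanded = discover(workdir, "$job_name", r".*\.txt")
    fixed = discover(workdir, "SampleDataset", r".*\.txt")
```

With the unexpanded variable nothing is found, while the fixed directory name picks up all three files.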

So if you use:



(note the &lt; and &gt; - this is effectively the regexp
r"(?P.*)\.txt" with the < and > escaped out)

And alter the code so that it just writes files to the current directory,
then you'll pick up the files one.txt, two.txt and three.txt.
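
For anyone unsure about the escaping, here is a small sketch using Python's standard library. The group name `name` is an assumption made for the example, and the one-line XML is only a stand-in for a real tool file.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Regexp as it would be written in Python; the group name "name" is assumed.
regexp = r"(?P<name>.*)\.txt"

# Form that goes inside the XML attribute, with < and > escaped.
attribute_value = escape(regexp)

# Parsing the XML gives back the original regexp.
tool_xml = '<discover_datasets pattern="%s" />' % attribute_value
recovered = ET.fromstring(tool_xml).get("pattern")
```

So the escaped form is only a matter of XML syntax; the pattern seen after parsing is the plain regexp.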

Peter

On 20 October 2015 at 12:28, Steve Cassidy  wrote:

> Sorry, it was just an example of a tool that works - the file that it
> writes out is put into that directory, so I assume that's where my files
> should end up too.
>
> Steve
>
> On 20 October 2015 at 21:12, Peter van Heusden  wrote:
>
>> Sorry, I don't understand - what does the Upload File tool have to do
>> with this?
>>