[galaxy-dev] options from file

2011-05-06 Thread SHAUN WEBB


Hi, I have created a tool that will fetch sequences for selected IDs  
from a tabular file containing multiple IDs and additional info.


I want the tool config to scan the first column of the tab file for  
IDs and provide the user with a selection box where they can select a  
single ID or multiple IDs and get output for all selected.


The following method does this:


<param name="tabfile" type="data" format="tabular" label="ID File"/>
<param name="selection" type="select" multiple="true" accept_default="true" label="ID">
  <options from_dataset="tabfile">
    <column name="name" index="0"/>
    <column name="value" index="0"/>
  </options>
</param>
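
What the options block above asks for can be sketched in plain Python: read column 0 of a tab-separated file and use each entry as both the display name and the value of a select option. This is only an illustration of the idea, not Galaxy's own code; the function name first_column_options and the sample data are made up for the example.

```python
import io


def first_column_options(handle, column=0):
    """Build (name, value) pairs from one column of a tab-separated file.

    Hypothetical helper for illustration; the same column supplies both
    the label and the value, as in the two <column> tags of the config.
    """
    options = []
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        # Skip blank lines and rows that are too short to hold the column.
        if len(fields) <= column or not fields[column].strip():
            continue
        entry = fields[column].strip()
        options.append((entry, entry))
    return options


# Made-up sample standing in for a small tabular ID file.
sample = io.StringIO("geneA\t100\tchr1\ngeneB\t250\tchr2\n")
print(first_column_options(sample))
```

Note that every row of the file becomes one option, which is why a very large dataset produces a very large selection box.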



The issue is that if the top file in my history is a SAM file containing
~30,000 IDs in the first column, the tool initially attempts to load
all of them into the selection box and effectively crashes my local
instance.


I only want to use this on tab files that will have ~100 IDs at most. I
have got around this by creating a new datatype, indexfile, as a subclass
of Tabular in tabular.py:


class IndexFile( Tabular ):
    file_ext = 'indexfile'

    def sniff( self, filename ):
        return False

And changing the input file to:
<param name="tabfile" type="data" format="indexfile" label="ID File"/>


This means I must first set the tabular file to type indexfile, then  
it will be the only dataset shown under tabfile.
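
Why a sniffer that always returns False forces the type to be set by hand can be shown with a simplified model of format detection. The classes and the guess_ext function below are stand-ins written for this example, not Galaxy's real implementation, and the tab check in Tabular.sniff is an assumed heuristic.

```python
import os
import tempfile


class Tabular:
    """Stand-in for a tabular datatype (not Galaxy's real class)."""
    file_ext = "tabular"

    def sniff(self, filename):
        # Assumed heuristic: the first line contains a tab character.
        with open(filename) as handle:
            return "\t" in handle.readline()


class IndexFile(Tabular):
    """Same content as Tabular, but never chosen by auto-detection."""
    file_ext = "indexfile"

    def sniff(self, filename):
        return False


def guess_ext(filename, datatypes):
    """Return the extension of the first datatype whose sniffer accepts the file."""
    for datatype in datatypes:
        if datatype.sniff(filename):
            return datatype.file_ext
    return "data"  # generic fallback when nothing matches


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("geneA\t100\n")
    path = tmp.name
try:
    # IndexFile is tried first but declines, so the file is seen as tabular.
    print(guess_ext(path, [IndexFile(), Tabular()]))  # prints: tabular
finally:
    os.remove(path)
```

In this model no dataset is ever detected as indexfile, so only datasets whose type was changed manually would match an input restricted to that format.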



Selecting options from a file is really useful. I was wondering if
there is a better workaround for this, or if a similar indexfile
datatype could be included in Galaxy.


Thanks
Shaun








--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

 http://lists.bx.psu.edu/


Re: [galaxy-dev] options from file

2011-05-06 Thread Peter Cock
On Fri, May 6, 2011 at 10:17 AM, SHAUN WEBB swe...@staffmail.ed.ac.uk wrote:

> Hi, I have created a tool that will fetch sequences for selected IDs from a
> tabular file containing multiple IDs and additional info.
>
> I want the tool config to scan the first column of the tab file for IDs and
> provide the user with a selection box where they can select a single ID or
> multiple IDs and get output for all selected.
>
> The following method does this:
>
> <param name="tabfile" type="data" format="tabular" label="ID File"/>
> <param name="selection" type="select" multiple="true" accept_default="true"
>        label="ID">
>   <options from_dataset="tabfile">
>     <column name="name" index="0"/>
>     <column name="value" index="0"/>
>   </options>
> </param>
>
> The issue is, if the top file in my history is a SAM file containing ~30,000
> IDs in the first column the tool initially attempts to load these all into
> the selection box and effectively crashes my local instance.
>
> I only want to use this on tab files that will have ~100 IDs at most.

Maybe the <options from_dataset="tabfile"> tag could have a max
setting? e.g. <options from_dataset="tabfile" max="100"> could
load just the first 100 entries in the tabular file. That seems much
more general than the new filetype idea.

It could have a default max value, which should be useful.
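
As a rough sketch of the proposed cap: stop reading after a fixed number of rows, so a 30,000-line file costs no more than a 100-line one. The max setting is only a suggestion in this thread, so the names capped_options and DEFAULT_MAX_OPTIONS and the default of 100 are hypothetical.

```python
import io
from itertools import islice

DEFAULT_MAX_OPTIONS = 100  # hypothetical default, not an existing Galaxy setting


def capped_options(handle, column=0, max_options=DEFAULT_MAX_OPTIONS):
    """Build (name, value) pairs from at most the first max_options rows."""
    options = []
    # islice stops after max_options lines, so the rest of the file is never read.
    for line in islice(handle, max_options):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) > column and fields[column].strip():
            entry = fields[column].strip()
            options.append((entry, entry))
    return options


# Made-up stand-in for a large SAM-like file with 30,000 rows.
big = io.StringIO("".join(f"read{i}\t0\tchr1\n" for i in range(30000)))
print(len(capped_options(big)))
```

A cap like this silently drops entries beyond the limit, so a real implementation would probably want to tell the user that the list was truncated.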

Thinking of the example of a tabular file of gene IDs for an organism,
you might well want 20 to 30 thousand entries. Since Galaxy puts
a search function on the selection, the UI should be OK. It's just the
performance we need to worry about.

Peter
