Re: [galaxy-dev] galaxy core services vs wrapper.py duplication & caching of .fai files

Jeremy Goecks Tue, 29 Nov 2011 15:06:52 -0800

Curtis,

> [curtish@cheaha galaxy]$ find . -name "*.py"  | xargs grep sam_fa_indices.loc
> ./tools/samtools/sam_pileup.py:    seqFile = '%s/sam_fa_indices.loc' % 
> GALAXY_DATA_INDEX_DIR
...
> ./tools/ngs_rna/cufflinks_wrapper_without_gtf.py:        
> cached_seqs_pointer_file = os.path.join( options.index_dir, 
> 'sam_fa_indices.loc' )
>  
> Is there any place in galaxy-core where such a core service lives and could 
> be used by all these adaptors, rather than replicating the code everywhere?


Not yet, but this is definitely needed. However, tools and Galaxy must remain 
independent, so the location of needed indices should be passed to the tool 
via the command line rather than having tools call into Galaxy.
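
As a rough sketch of that separation (an illustration only, not existing
Galaxy code): a small shared helper that a wrapper could import, taking the
index directory from the command line. The function names, the --index-dir
and --dbkey options, and the assumed .loc layout (tab-separated
"index", dbkey, path) are all assumptions made for this example.

```python
import argparse
import os


def find_cached_fasta(loc_path, dbkey):
    """Return the FASTA path listed for dbkey in a .loc file, or None."""
    with open(loc_path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            fields = line.split("\t")
            # Assumed layout: index <tab> dbkey <tab> path
            if len(fields) >= 3 and fields[0] == "index" and fields[1] == dbkey:
                return fields[2]
    return None


def parse_args(argv=None):
    """Parse the (hypothetical) options the wrapper receives from the tool config."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--index-dir", required=True,
                        help="directory containing sam_fa_indices.loc")
    parser.add_argument("--dbkey", required=True,
                        help="genome build to look up")
    return parser.parse_args(argv)


def lookup_from_command_line(argv=None):
    """Resolve the cached FASTA using only what was passed on the command line."""
    args = parse_args(argv)
    loc_path = os.path.join(args.index_dir, "sam_fa_indices.loc")
    return find_cached_fasta(loc_path, args.dbkey)
```

With something like this, each wrapper would call lookup_from_command_line()
instead of carrying its own copy of the parsing loop, and nothing in the tool
would need to import from Galaxy.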
 
> As a related question, for fasta genomes from the current history, these 
> wrappers compute the .fai file on the fly, in TMP, then throw it away, every 
> time. Has there been any discussion about storing such derived indices in the 
> dataset’s metadata (like the .bai file on a .bam data set), so it gets 
> computed once, then re-used?


Converted datasets, which subsume indices-as-metadata, can store dataset 
indices. Extending converted datasets to store indices created on the fly is 
also very much needed.
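
To make the "compute once, then re-use" idea concrete, here is a minimal
sketch. It is not the converted-datasets mechanism; it simply keeps the .fai
file next to the FASTA file and rebuilds it only when it is missing or older
than the FASTA. The function names are hypothetical, and the default builder
assumes samtools is on the PATH.

```python
import os
import subprocess


def run_samtools_faidx(fasta_path):
    """Build <fasta_path>.fai by calling samtools (assumed to be on the PATH)."""
    subprocess.check_call(["samtools", "faidx", fasta_path])


def ensure_fai(fasta_path, build=run_samtools_faidx):
    """Return the path to the .fai index, building it only if missing or stale."""
    fai_path = fasta_path + ".fai"
    if (os.path.exists(fai_path)
            and os.path.getmtime(fai_path) >= os.path.getmtime(fasta_path)):
        # An up-to-date index already exists, so re-use it.
        return fai_path
    # No usable index yet: build it once and keep it for later runs.
    build(fasta_path)
    return fai_path
```

The build argument is there so the indexing step can be swapped out, for
example with a stub when testing without samtools installed.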

Any community contributions that address these issues would be most welcome.

Best,
J.

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/
