Hello,

I wonder if anybody has thought about providing large data sets, such as genomes 
and microarray data, as Debian packages, in a way that makes it easy for users 
to get those data sets onto their machines and use them with various tools. I 
can think of many ways this would be useful.

For example, suppose a user has high-throughput sequencing data that they need 
to align to a genome. There is a tool available in Debian called bowtie that 
will do the job, but the user first needs to 1) download the genome and 
2) generate the bowtie index. Wouldn't it be great if they could just type:

apt-get install bowtie-human-genome-index

which would install the genome and the pre-built indexes, so they could run 
bowtie directly?
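To make the comparison concrete, here is a sketch of the manual route such a 
package would replace. The download URL and file names are placeholders; 
bowtie-build is bowtie's real index builder. The commands are only printed 
(a dry run), not executed:

```shell
# The manual route a "bowtie-human-genome-index" package would replace.
# The URL below is a placeholder, not a real genome mirror.
STEP1='wget http://example.org/human_genome.fa.gz'  # 1) fetch the genome
STEP2='gunzip human_genome.fa.gz'
STEP3='bowtie-build human_genome.fa human_genome'   # 2) build the index files
printf '%s\n' "$STEP1" "$STEP2" "$STEP3"            # dry run: just show the plan
```

A package could run the equivalent of these steps once, at build time, and ship 
the resulting index files to every user.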

Another example: if you want to run your own BLAST searches, why not a package 
that ships the BLAST database indexes:

apt-get install BLAST-human-genome

What is nice is that all of these data sets could be maintained in a global 
directory space, such as /usr/share, so all of the tools could share the same 
copy, preventing duplication, and the data would be available to every user on 
the system. Right now each user has to figure out how to manage their data 
individually, which can be difficult for biologists.

What do you think?

Scott


-- 
To UNSUBSCRIBE, email to debian-med-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/de3c6125-dce5-4414-8d0c-41df50764...@mac.com