Re: [EMBOSS] [Open-bio-l] Common Sample Data Collection, was: SCF files (Staden)

Peter Rice Wed, 30 Nov 2011 03:39:50 -0800

On 11/30/2011 11:32 AM, Pjotr Prins wrote:

Git is not very good for storing large data files, which we would want
to fetch partially. My suggestion would be to have a plain old file
repo, e.g. on S3, which can be mirrored by others.

We had issues with large files in the EMBOSS release, and make those
available via rsync to add to the developers CVS checkout. They include
the NCBI taxonomy source and index files and the ontology source and
index files.

The next EMBOSS release will include http and ftp URLs as valid inputs
for any data type, so EMBOSS could use remote files for format tests.
I'll look into how other repositories could be added.
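
As a rough illustration of the idea only (this is not EMBOSS code, and
the names is_remote and open_input are hypothetical), a test harness
could accept either a local path or an http/ftp URL along these lines,
sketched in Python with the standard library:

```python
import urllib.request
from urllib.parse import urlparse

# Schemes treated as remote inputs; the set is an assumption for this sketch.
REMOTE_SCHEMES = {"http", "ftp"}


def is_remote(source):
    """Return True if source looks like an http or ftp URL."""
    return urlparse(source).scheme.lower() in REMOTE_SCHEMES


def open_input(source):
    """Return a binary file-like object for a URL or a local path."""
    if is_remote(source):
        # urlopen handles both http and ftp URLs.
        return urllib.request.urlopen(source)
    return open(source, "rb")
```

A format test could then call open_input() on whatever it is given and
read from the result without caring where the data lives.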

I had to add some extra qualifiers to allow queries and offsets to be
specified, and rewrote the query language parsing to merge very similar
code segments.


regards,

Peter Rice
EMBOSS Team
_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
