On 11/30/2011 11:32 AM, Pjotr Prins wrote:

Git is not very good for storing large data files, which we would want
to fetch partially. My suggestion would be to have a plain old file
repo, e.g. on S3, which can be mirrored by others.

We had issues with large files in the EMBOSS release, so we make them available via rsync for developers to add to their CVS checkout. They include the NCBI taxonomy source and index files and the ontology source and index files.

The next EMBOSS release will accept http and ftp URLs as valid inputs for any data type, so EMBOSS could use remote files for format tests. I'll look into how other repositories could be added.

I had to add some extra qualifiers to allow queries and offsets to be specified, and rewrote the query language parsing to merge very similar code segments.

regards,

Peter Rice
EMBOSS Team
_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
