A common test-data package for genome assemblers

Afif Elghraoui Wed, 13 Jul 2016 08:25:06 -0700

Hi, all,

I've had a couple of packages that point to data available outside of
the source distribution that can be used to try out the software (and
make sure that it actually runs). I didn't think it was a good idea to
bundle the data with the package itself, since the data doesn't change
between releases and would take up too much space on the archive if it
were bundled with every upstream tarball.


For example, at <http://canu.readthedocs.io/en/stable/quick-start.html>,
there are a few reduced datasets that can be used to run assemblies for
PacBio and Nanopore sequencing data. Those files can also be used for
tests of the sprai package, and possibly also for other long-read genome
assemblers. There's also the option of packaging the Assemblathon data
for this purpose, or using simulators to generate datasets for testing.

Does anyone have suggestions or thoughts on this?

Regards,
Afif

-- 
Afif Elghraoui | عفيف الغراوي
http://afif.ghraoui.name
