bcbio - about where we are

Steffen Möller Sat, 03 Aug 2019 07:54:53 -0700

Hello,

bcbio (https://github.com/bcbio/bcbio-nextgen) references a set of
gold-standard packages for the interpretation of next-generation
sequencing data. The actual set of packages required varies with the
exact data at hand and the workflow run on it, but Debian Med should
show that it can run this. For the moment we are still missing quite a
few bits and, frankly, it does not look too good:


 * The NEW queue has a serious bandwidth problem.
 * Much work: many packages come with embedded external packages that
the DFSG requires to be separate packages, which means adapting
upstream's build scripts. Of more concern, it is at times uncertain
whether upstream has modified the embedded source tree, so the DFSG
version may have inadvertent scientific consequences.
 * bcbio has 2nd degree dependencies like Vienna-RNA in non-free. And
the testing of these packages also makes these non-free bits part of
the build dependencies.
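
To make "2nd degree" concrete, here is a minimal sketch of how non-free
packages could be located in a dependency tree together with their
distance from bcbio. The graph and the section table below are
hypothetical and written by hand for illustration (including the
"nim-hts" name and every edge); they are not read from the Debian
archive, and the function name is my own.

```python
from collections import deque

# Hypothetical dependency graph, for illustration only.
# These edges are NOT taken from the real Debian archive.
DEPENDS = {
    "bcbio": ["seqcluster", "mosdepth"],
    "seqcluster": ["vienna-rna"],
    "mosdepth": ["nim-hts"],
    "vienna-rna": [],
    "nim-hts": [],
}

# Assumed archive sections; anything not listed is treated as "main".
SECTION = {"vienna-rna": "non-free"}


def non_free_dependencies(root, depends, section):
    """Breadth-first walk from root; return {package: degree} for
    every dependency that lives in non-free."""
    found = {}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        pkg, degree = queue.popleft()
        if pkg != root and section.get(pkg, "main") == "non-free":
            found[pkg] = degree
        for dep in depends.get(pkg, []):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, degree + 1))
    return found


if __name__ == "__main__":
    print(non_free_dependencies("bcbio", DEPENDS, SECTION))
```

With the made-up graph above this reports vienna-rna at degree 2. On
real data the graph would have to come from the package metadata
instead of a hand-written table.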

Details on how far we got are given at
https://salsa.debian.org/med-team/bcbio/blob/master/debian/TODO .  That
document was never finished, since every package one looks at in detail
uncovers more dependencies that should also be listed. It is not
exactly clear where to go from here. Emerging ideas are:

  * a repository outside Debian main to harbor packages that have not
yet made it into the distribution (evolving on
http://med.functional.domains)
  * maybe not package what is needed only for autotests when these are
not invoked by the bcbio workflows (ouch! - a very pragmatic approach,
isn't it?)
  * just don't ignore the DFSG and get stuff functional (I have already
used up all my words for that in the line above - speechless)

We just cleared one big hurdle, which was seqcluster. The next shall be
mosdepth, with its many "nim" dependencies already prepared at
https://salsa.debian.org/nim-team. I'll then send an update on how the
bcbio tests go.

Best,

Steffen
