Hello,

bcbio (https://github.com/bcbio/bcbio-nextgen) references a set of gold-standard packages for the interpretation of next-generation sequencing data. The actual set of packages required varies with the exact data at hand and the workflow run on it, but Debian Med should show that it can run this. For the moment we are still missing quite a few bits and, frankly, it does not look too good:
 * The NEW queue has a serious bandwidth problem.
 * Much work: many packages embed external packages that the DFSG requires to be packaged separately, which means adapting upstream's build scripts. Of greater concern, it is at times uncertain whether upstream has fiddled with the embedded source tree, so the DFSG version may have inadvertent scientific consequences.
 * bcbio has second-degree dependencies, such as Vienna-RNA, in non-free. Testing these packages also makes these non-free bits part of the build dependencies.

Details on how far we got are given at https://salsa.debian.org/med-team/bcbio/blob/master/debian/TODO . That document was never finished, since every package one looks at in detail uncovers more dependencies that should also be listed.

It is not entirely clear where to go from here. Emerging ideas are:

 * a repository outside Debian main to harbor packages that have not yet made it into the distribution (evolving on http://med.functional.domains)
 * maybe not package what is needed only for autotests, when it is not invoked by the bcbio workflows (ouch! - a very pragmatic approach, isn't it?)
 * just don't ignore the DFSG and get stuff functional (have used all words for that in the line above already - speechless)

We just cleared one big hurdle, which was seqcluster. The next shall be mosdepth, with its many "nim" dependencies already prepared on https://salsa.debian.org/nim-team. I'll then send an update on how the bcbio tests go.

Best,
Steffen

