Hello,

I had an exchange with Stian yesterday about which CWL workflow from his database he would propose as an example for gathering experience. He proposed the GATK workflow by Farah Zaib Khan et al., since it is a good one to cite on workflows and reproducibility.
https://doi.org/10.1186/s12859-017-1747-0
https://github.com/skanwal/GATK-CaseStudy/tree/master/CWL

From what I understand, we already have BWA, GATK and the Picard toolkit in Debian (I am not sure about the state of GATK). Stian pointed to
https://github.com/h3abionet/h3agatk/blob/master/workflows/GATK/GATK-complete-WES-Workflow-h3abionet.cwl
as a current variant of the same workflow, but then again, I would not mind starting with a smaller one. Any comments?

The main point for me is to have a small test case for running this workflow repeatedly. We will therefore also need to decide on appropriate test data at some point. Should we also introduce a package like "genome-human"?

Best,
Steffen