Robert,

Leaf's webpage seems to be dead, but bpipe looks surprisingly close in
feature set to what we were thinking about, right down to the "plain
command" idea. Their implementation is different, but that's not a big
barrier. This *may* be a winner; I'll have to do some experiments.

Thanks!

-- 
Gabriel A. Devenyi B.Eng. Ph.D.
Research Computing Associate
Computational Brain Anatomy Laboratory
Cerebral Imaging Center
Douglas Mental Health University Institute
McGill University
t: 514.761.6131x4781
e: [email protected]

On Fri, Sep 26, 2014 at 9:21 PM, Robert M. Flight <[email protected]>
wrote:

> Have you considered leaf (http://www.biomedcentral.com/1471-2105/14/201)
> or bpipe
> (http://m.bioinformatics.oxfordjournals.org/content/early/2012/04/11/bioinformatics.bts167.abstract)?
> These seem to be closer to your description of what you are looking for.
>
> FWIW, I have not used either tool, but have only read their publications.
>
> Robert
> On Sep 26, 2014 8:57 PM, "Gabriel A. Devenyi" <[email protected]>
> wrote:
>
>> Hi Software-Carpentry Discuss,
>>
>> At the COmputational BRain Anatomy Lab at the Douglas Institute in
>> Montreal, the Kimel Family Translational Imaging-Genetics Lab at CAMH in
>> Toronto, and in neuroscience in general, we have a great need to stitch
>> together many small command-line data processing tools (minc-toolkit,
>> etc.) and run them against very large datasets. At some points in the
>> pipeline, these tools can run against all the input subjects in
>> parallel, but at other points the previous steps must be complete so
>> that we can aggregate across subjects.
>>
>> In searching for a tool to manage this workflow, we have found a few
>> (nipype, ruffus, taverna, pydpiper, joblib), but these tools require
>> either programming the file input/output management or writing new
>> classes for the pipeline tool. This doesn't fit well with our user base
>> of non-programmers who have a general understanding of scripting. We
>> want to enable them to transform a serial bash script, as easily as
>> possible, into something that can run in parallel on a supercomputer.
>>
>> Having found no suitable tool, we are considering developing our own,
>> which we have dubbed "Pipeliner - The stupid pipeline maker" and which
>> will live at https://github.com/CobraLab/pipeliner
>>
>> We have posted a "functional" prototype of what Pipeliner would do; see
>> https://github.com/CobraLab/pipeliner/issues/1
>>
>> Below is an example of serial bash code we'd like to be able to
>> parallelize:
>> ```sh
>> # correct all images before we begin
>> for image in input/atlases/* input/subjects/*; do
>>     correct "$image" "output/nuc/$(basename "$image")"
>> done
>>
>> # register all atlases to each subject
>> for atlas in input/atlases/*; do
>>     for subject in input/subjects/*; do
>>         register "$atlas" "$subject" \
>>             "output/registrations/$(basename "$atlas")/$(basename "$subject")/reg.xfm"
>>     done
>> done
>>
>> # create an average transformation for each subject
>> for subject in input/subjects/*; do
>>     subjectname=$(basename "$subject")
>>     xfmaverage output/registrations/*/"$subjectname"/reg.xfm \
>>         "output/averagexfm/$subjectname.xfm"
>> done
>> ```
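
For comparison, the three loops above can be run stage by stage in plain
shell, with background jobs inside a stage and `wait` as the barrier between
stages. This is only a rough sketch of the parallel/aggregate structure, not
what Pipeliner does. The `correct`, `register` and `xfmaverage` functions are
stand-ins so the example runs without the real tools, and `run_pipeline` and
its arguments are hypothetical names.

```sh
#!/bin/sh
# Stand-ins for the real tools (assumptions, for illustration only).
correct()  { cp "$1" "$2"; }
register() { cat "$1" "$2" > "$3"; }
xfmaverage() {
    # The last argument is the output file; all earlier ones are inputs.
    for last in "$@"; do :; done
    echo "averaged $(( $# - 1 )) transforms" > "$last"
}

# run_pipeline INPUT_DIR OUTPUT_DIR
run_pipeline() {
    in=$1
    out=$2
    mkdir -p "$out/nuc" "$out/averagexfm"

    # Stage 1: each correction is independent, so start them all at once.
    for image in "$in"/atlases/* "$in"/subjects/*; do
        correct "$image" "$out/nuc/$(basename "$image")" &
    done
    wait    # barrier: every correction has finished

    # Stage 2: one registration per (atlas, subject) pair, all in parallel.
    for atlas in "$in"/atlases/*; do
        for subject in "$in"/subjects/*; do
            dir="$out/registrations/$(basename "$atlas")/$(basename "$subject")"
            mkdir -p "$dir"
            register "$atlas" "$subject" "$dir/reg.xfm" &
        done
    done
    wait    # barrier: the aggregation below needs every registration

    # Stage 3: aggregate across atlases, one job per subject.
    for subject in "$in"/subjects/*; do
        name=$(basename "$subject")
        xfmaverage "$out"/registrations/*/"$name"/reg.xfm \
            "$out/averagexfm/$name.xfm" &
    done
    wait
}
```

Calling `run_pipeline input output` mirrors the serial script. Note that a
bare `wait` does not report failures of the background jobs and nothing here
limits how many jobs run at once, which is part of why a dedicated tool or a
cluster scheduler is still attractive.
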
>>
>> This tool would generate an internal representation of a set of commands
>> and then use a number of output plugins to generate bash scripts,
>> GridEngine jobs, slurm jobs, or other outputs.
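
As a rough illustration of the "internal representation plus output plugins"
idea, the sketch below assumes a deliberately simple representation: one
fully expanded command per line on standard input, with a line containing
only `---` marking a barrier between stages. That format and the names
`render_bash`, `render_slurm` and `shquote` are invented here and are not
taken from the Pipeliner prototype. The Slurm renderer relies only on the
`sbatch` options `--parsable`, `--wrap` and `--dependency=afterok:...`.

```sh
#!/bin/sh
# Wrap a string in single quotes, escaping any embedded single quotes.
shquote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Plugin 1: a local script that runs each stage in the background and
# uses `wait` as the barrier.
render_bash() {
    echo '#!/bin/sh'
    while IFS= read -r line; do
        case "$line" in
            ---) echo 'wait' ;;
            '')  ;;                      # skip blank lines
            *)   echo "$line &" ;;
        esac
    done
    echo 'wait'
}

# Plugin 2: a submission script for Slurm. Each command becomes one job,
# and each stage depends on all jobs of the previous stage.
# Assumes every stage contains at least one command.
render_slurm() {
    echo '#!/bin/sh'
    echo 'dep=""; ids=""'
    while IFS= read -r line; do
        case "$line" in
            ---) echo 'dep="--dependency=afterok$ids"; ids=""' ;;
            '')  ;;
            *)   printf 'ids="$ids:$(sbatch --parsable $dep --wrap=%s)"\n' \
                     "$(shquote "$line")" ;;
        esac
    done
}
```

With a command list saved in a file such as `plan.txt` (a hypothetical
name), `render_bash < plan.txt > run.sh` produces a script for a single
machine and `render_slurm < plan.txt > submit.sh` produces one that submits
the same stages to a cluster. Keeping the representation as plain text is
one way to let the same list feed every output plugin.
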
>>
>> Does anyone have experience creating workflows like this, or know of an
>> existing tool we could use instead of rolling our own? We welcome
>> comments, suggestions, pointers to projects that already do this, and
>> collaborators to help build this tool. Thanks, everyone, for your help!
>>
>>
>> --
>> Gabriel A. Devenyi B.Eng. Ph.D.
>> e: [email protected]
>>
>> _______________________________________________
>> Discuss mailing list
>> [email protected]
>>
>> http://lists.software-carpentry.org/mailman/listinfo/discuss_lists.software-carpentry.org
>>
>