On the other hand, I found this September 2015 paper on EBSCOHost, and it looks promising:
Title: Scientific workflow optimization for improved peptide and protein identification.
Authors: Holl, Sonja; Mohammed, Yassene; Zimmermann, Olav; Palmblad, Magnus ([email protected])
Source: BMC Bioinformatics. 9/3/2015, Vol. 16 Issue 1, p1-13. 13p. 1 Color Photograph, 1 Diagram, 3 Charts, 4 Graphs.
Author-Supplied Keywords: Optimization, Scientific workflow, Tandem mass spectrometry, Taverna workbench, X!Tandem
Abstract:
Background: Peptide-spectrum matching is a common step in most data processing workflows for mass spectrometry-based proteomics. Many algorithms and software packages, both free and commercial, have been developed to address this task. However, these algorithms typically require the user to select instrument- and sample-dependent parameters, such as mass measurement error tolerances and number of missed enzymatic cleavages. In order to select the best algorithm and parameter set for a particular dataset, in-depth knowledge about the data as well as the algorithms themselves is needed. Most researchers therefore tend to use default parameters, which are not necessarily optimal.
Results: *We have applied a new optimization framework for the Taverna scientific workflow management system* (http://ms-utils.org/Taverna_Optimization.pdf) to find the best combination of parameters for a given scientific workflow to perform peptide-spectrum matching. The optimizations themselves are non-trivial, as demonstrated by several phenomena that can be observed when allowing for larger mass measurement errors in sequence database searches. On-the-fly parameter optimization embedded in scientific workflow management systems enables experts and non-experts alike to extract the maximum amount of information from the data. The same workflows could be used for exploring the parameter space and compare algorithms, not only for peptide-spectrum matching, but also for other tasks, such as retention time prediction.
Conclusion: Using the optimization framework, we were able to learn about how the data was acquired as well as the explored algorithms. We observed a phenomenon identifying many ammonia-loss b-ion spectra as peptides with N-terminal pyroglutamate and a large precursor mass measurement error. These insights could only be gained with the extension of the common range for the mass measurement error tolerance parameters explored by the optimization framework. [ABSTRACT FROM AUTHOR]

DOI: 10.1186/s12859-015-0714-x

On Mon, Jan 11, 2016 at 12:42 PM Gale Naylor <[email protected]> wrote:

> I looked at the links to the Google Scholar searches above and mostly
> found oblique references to Apache Taverna, along the lines of what Alan
> mentioned. The only one that looked promising to me was this one:
>
> Local Graph Patterns for Scientific Workflow Similarity Search
>
> https://www.informatik.hu-berlin.de/de/forschung/gebiete/wbi/teaching/studienDiplomArbeiten/running/expose_wiegandt_151116.pdf
>
> But, it looked more like a detailed abstract and not a complete paper.
>
> The other searches seemed only to return older references to Taverna (2013
> and earlier).
>
> I can make an Apache Taverna page for citations, but I will need help with
> what publications you think are good to include.
>
> Gale
>
> On Mon, Jan 11, 2016 at 8:21 AM alaninmcr <[email protected]>
> wrote:
>
>> Just a warning that you need to be careful when doing a search. A lot of
>> the citations will be unhelpful like "Popular workflow systems include
>> X, Y, Taverna and Z" or "Our system is better than X, Y, Taverna and Z
>> because ...".
>>
>> Alan
>>
