Today we assembled the first Blue Obelisk workflow on the Taverna system (thanks to YY). It is described in:
http://sourceforge.net/mailarchive/forum.php?forum_id=35847&max_rows=25&style=flat&viewmonth=200503&viewday=29 (may not be posted yet).
and
"
With Tom's guidance, Yong Zhang ("YY") in our group has written a chemoinformatics workflow example and sent it to Tom for inclusion in the examples.
Since this is the first chemoinformatics example, please excuse the bandwidth in the following explanation, which may be useful to list members. Note that it relies primarily on web services (WS) at UCC rather than LocalWorker, as we haven't sorted out the minimum set of jars yet...
The problem is a typical one in chemoinformatics. The user has a structure (connection table) for a molecule and wants an accurate 3D molecular structure. This requires format conversions and callouts to services. The steps are as follows:
* read a SMILES string (or a series of SMILES). Each represents the chemical structure.
* convert it to MDLMol format, as this is required for the InChI. There is a generic tool, OpenBabel, which converts almost any format to almost any format. OpenBabel is written in C++. Although we can run it from Java with Runtime.exec, this requires a different exe for each platform. Here we call a WS on our site.
* generate the IUPAC InChI identifier from the MDLMol file. The InChI is a canonicalized unique identifier generated from the chemical structure (connection table). Again, the source code is in C, so it is attractive to use a WS on our site. The result is a string which is cleaned by the WS wrapper.
* The WWMM database of 3D compounds is then searched with the InChI. Over 200,000 compounds with 3D structures have been optimized using the MOPAC PM5 method. They have been indexed by their InChIs which are unique and ideally suited for databases. We use the Xindice XML repository, but are now moving to eXist.
* the 3D structure in CML is then passed to Jmol (already a Taverna component). Since Taverna originally only dealt with proteins, today's pictures had a Zaphod-like quality - black atoms and black bonds on a black background. Tom says he will fix the scripts so that there is a "small molecule" Jmol.
"
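As a rough sketch of how the steps in the quoted mail chain together: the Python below is not the actual Taverna workflow, nor the interface of the services at UCC. Every name in it (`run_workflow`, `clean_inchi`, the callables and the index) is hypothetical, the two remote services are passed in as plain functions, and the InChI-indexed database is stood in for by a dictionary. The whitespace stripping in `clean_inchi` is only an assumption about what "cleaned by the WS wrapper" might involve.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical stand-in for a remote service: takes a string, returns a string.
Service = Callable[[str], str]


def clean_inchi(raw: str) -> str:
    """Strip surrounding whitespace from a raw InChI string (assumed cleaning)."""
    return raw.strip()


def run_workflow(
    smiles_list: Iterable[str],
    smiles_to_molfile: Service,       # e.g. a wrapper around a format-conversion service
    molfile_to_inchi: Service,        # e.g. a wrapper around an InChI-generation service
    structure_index: Dict[str, str],  # InChI -> 3D structure (CML text)
) -> Dict[str, Optional[str]]:
    """Map each SMILES to the stored 3D structure, or None if the InChI is not indexed."""
    results: Dict[str, Optional[str]] = {}
    for smiles in smiles_list:
        molfile = smiles_to_molfile(smiles)             # SMILES -> MDL molfile
        inchi = clean_inchi(molfile_to_inchi(molfile))  # molfile -> InChI key
        results[smiles] = structure_index.get(inchi)    # look up by InChI
    return results
```

The returned CML text would then be handed to the viewer. Keeping the services as parameters means the same chain could be driven by WS calls or by local code once the jar question is settled.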
It uses the following Blue Obelisk components:
* OpenBabel
* (InChI wrapper)
* (JUMBO)
* WWMM
* Jmol
Egon is also actively looking at how to interface CDK. I think that you will find that Taverna + WS is very easy to set up. It will be slightly more difficult for client-side as we have to work out how to distribute our jars without clashing.
P.
this mail re-usable under CreativeCommons
Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069 Fax: +44 1223 763076
