At 14:42 15/09/2007, Christoph Steinbeck wrote:
>Peter,
>
>thanks very much for the interesting posting.

and thanks for answering. This is a good example of avoiding repeating work. I'll let Henry comment.

>Warren Hehre from Wavefunction has been calculating spectra for all
>NMRShiftDB structures using Spartan on Hartree-Fock level. We have a CD
>with all of them (Stefan, is that right) but we never got all the
>information on the format, I think.

Is this data open?

>I myself have been calculating about 2000 Gaussian spectra for
>NMRShiftDB on a 16-node cluster which came for exactly this purpose with
>my NMRShiftDB grant.
>
>BTW, there has been an interesting article on the topic:
>
>Structure Validation of Natural Products by Quantum-Mechanical GIAO
>Calculations of 13C NMR Chemical Shifts
>Giampaolo Barone, et al.
>Chemistry - A European Journal
>Volume 8, Issue 14, Pages 3233 - 3239

Thanks - I am offline so will comment when I get in to work...

>I would suggest an extended protocol, which slows down things a bit but
>makes a better research project: We should start with the generation of a
>number of lowest energy conformers (probably not for molecules like
>alpha-Pinene, http://en.wikipedia.org/wiki/Alpha-Pinene), but for more
>floppy compounds, do a Gaussian calculation for each of them and try
>averaging shifts.
>The protocol will be a bit more fragile, though, I guess.

This is probably too extended for Nick's timescale

>Another issue will be if there is an option for simulating polar
>solvents and their effects on the shifts?!
>
>There seems to be a general agreement that you get best results with
>calculating both the geometry optimization as well as the GIAO shifts on
>B3LYP/6-31G(d) level, but I trust whatever Henry suggests :-)
>With my own Gaussian calculations, I got excellent results for rigid and
>nonpolar compounds, just as Barone suggested in his paper.
>
>Please let me know if I can help.

Would be useful to expose the data.

P.

>Cheers,
>
>Chris
>
>
>peter murray-rust wrote:
> > Is the NMRShiftDB list active?
> >
> > Here's what we want to do anyway...
> >
> > It is now possible to compute the 13C chemical shifts of organic
> > compounds to an exciting degree of accuracy. Henry and I were talking
> > about this yesterday and if the compound is fairly rigid the results
> > are believable enough to assign most peaks and to correct errors.
> >
> > We want to apply this method to as many spectra in NMRShiftDB as we
> > can. It will be limited by time, flexibility and lots of scientific
> > effects that we haven't thought of and which will be discovered
> > by the process.
> >
> > The MW limit is ca 500 (several days) though we'd prefer smaller ones
> > to start with in which case they might take half a day each. Nick Day
> > has already setup an effective workflow for crystals from crystalEye
> > and runs these on much the same timescale - we have processed many
> > thousand jobs on the Condor system. We now have Gaussian which -
> > according to Henry - should be easy to configure.
> >
> > Nick is finishing the experimental part of his thesis and this would
> > form an interesting final piece.
> >
> > So what we have to do is (and it must be automatic)
> >   * extract spectra and connection table from NMRShiftDB
> >   * generate 3D coordinates
> >   * generate Gaussian input according to Henry's protocols.
> >   * run the jobs on Condor (or elsewhere)
> >   * collect the output
> >   * parse into CML
> >   * expose the results on our web site (this is where the Open
> >     Science comes in).
> >   * annotate the results (humans and machine)
> >   * display the results (primarily agreement between observed and
> >     calculated values, but also much else).
> >
> > The immediate difficulties we can see are:
> >   * not knowing the stereochemistry. *** WHAT IS THE POSITION IN
> >     NMRSHIFTDB? We can filter out anything that has more than one
> >     potential stereocentre.
> >   * assumptions about completeness of data in NMRShiftDB
> >   * syntactic problems (almost certain to occur in any large data set)
> >   * generating initial 3d coordinates.
> >     There are several simple approaches:
> >     - look up the moiety in crystaleye
> >     - join moieties in crystaleye
> >     - use CDK - I think Christoph had something here?
> >     - use the 2D coordinates for "flat" molecules
> >     *** NEW APPROACHES - MUST BE BLUEOBELISK
> >   * optimising the coordinates. Probably a cheap level of theory (PM3)
> >     would work.
> >   * parsing the Gaussian output (though JUMBOMarker has already been
> >     tuned for some Gaussian jobs). It may also be that the archive is enough
> >   * scale. After a few hundred jobs SOME effect of scale will hit us.
> >     We don't know what but every project of this sort has these scale problems
> >   * wikifying the results. Ideally we like to expose the results in a
> >     very similar way to crystaleye with 2D and 3D coordinates and with an
> >     observed/calc graph. Then we have to protect against spam.
> >     Suggestions would be welcome
> >
> > At present I'd like to know of any immediate problems that we haven't
> > thought of. If not I suspect Nick will simply download all the data.
> >
> > Compute resources are probably not the problem at present. But later
> > we may ask for volunteers.
> >
> > If this is a success there is a much wider vision. You don't need me
> > to spell it out, and it's probably a good idea to keep things low key.
> >
> > P.
> >
> >
> >
> > Peter Murray-Rust
> > Unilever Centre for Molecular Sciences Informatics
> > University of Cambridge,
> > Lensfield Road, Cambridge CB2 1EW, UK
> > +44-1223-763069
> >
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Microsoft
> > Defy all challenges. Microsoft(R) Visual Studio 2005.
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > _______________________________________________
> > Blueobelisk-discuss mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
> >
>--
>PD Dr. Christoph Steinbeck
>Lecturer in Chemoinformatics
>Univ.
>Tuebingen, WSI-RA, Sand 1, D-72076 Tuebingen, Germany
>Phone: (+49/0) 7071-29-78978 Fax: (+49/0) 7071-29-5091
>
>What is man but that lofty spirit - that sense of enterprise.
>... Kirk, "I, Mudd," stardate 4513.3..

Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road, Cambridge CB2 1EW, UK
+44-1223-763069
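
To make the "generate Gaussian input" step in the quoted workflow a little more concrete, here is a minimal sketch in Python. It only writes the text of a single-point GIAO NMR job at the B3LYP/6-31G(d) level that Chris mentions; it is not Henry's protocol, which is not spelled out in this thread. The function name, its parameters and the rough methane geometry are assumptions made up for illustration, and a real run would presumably optimise the geometry first.

```python
def gaussian_giao_input(title, atoms, charge=0, multiplicity=1,
                        method="B3LYP", basis="6-31G(d)", chk=None):
    """Return the text of a Gaussian input file for a GIAO NMR single point.

    atoms is an iterable of (element_symbol, x, y, z) in Angstrom.
    All names and defaults here are illustrative assumptions, not the
    protocol actually used in the project.
    """
    lines = []
    if chk:
        lines.append(f"%chk={chk}")                 # optional checkpoint file
    lines.append(f"# {method}/{basis} NMR=GIAO")    # route section
    lines.append("")                                # blank line ends the route
    lines.append(title)                             # title section
    lines.append("")                                # blank line ends the title
    lines.append(f"{charge} {multiplicity}")        # charge and spin multiplicity
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")                                # blank line ends the geometry
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Rough, unoptimised methane geometry, purely as a placeholder.
    methane = [
        ("C", 0.000, 0.000, 0.000),
        ("H", 0.629, 0.629, 0.629),
        ("H", -0.629, -0.629, 0.629),
        ("H", -0.629, 0.629, -0.629),
        ("H", 0.629, -0.629, -0.629),
    ]
    print(gaussian_giao_input("methane GIAO test", methane))
```

Keeping the generator as a pure function that returns text makes it easy to check the input files before thousands of them are handed to Condor.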

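Chris's suggestion of averaging shifts over several low-energy conformers, and the planned comparison of observed and calculated values, could be sketched as below. Note the assumptions: the thread only says "averaging", so the Boltzmann weighting is one possible choice rather than anything agreed here; shifts are taken as the reference shielding minus the computed shielding; and every function name and every number in the example is hypothetical.

```python
import math

K_B_HARTREE_PER_K = 3.166811563e-6  # Boltzmann constant in hartree per kelvin


def boltzmann_weights(energies_hartree, temperature=298.15):
    """Return normalised Boltzmann weights for conformer energies in hartree."""
    e_min = min(energies_hartree)  # shift by the minimum for numerical stability
    factors = [
        math.exp(-(e - e_min) / (K_B_HARTREE_PER_K * temperature))
        for e in energies_hartree
    ]
    total = sum(factors)
    return [f / total for f in factors]


def averaged_shifts(conformer_shieldings, energies_hartree,
                    reference_shielding, temperature=298.15):
    """Average per-atom shieldings over conformers and convert to shifts.

    conformer_shieldings holds one list of isotropic shieldings (ppm) per
    conformer, with atoms in the same order in every conformer.
    reference_shielding is the shielding of the reference (e.g. TMS carbon)
    computed at the same level of theory.
    """
    weights = boltzmann_weights(energies_hartree, temperature)
    n_atoms = len(conformer_shieldings[0])
    shifts = []
    for i in range(n_atoms):
        sigma = sum(w * conf[i] for w, conf in zip(weights, conformer_shieldings))
        shifts.append(reference_shielding - sigma)  # delta = sigma_ref - sigma
    return shifts


def mean_absolute_error(observed, calculated):
    """Mean absolute deviation between observed and calculated shifts (ppm)."""
    return sum(abs(o - c) for o, c in zip(observed, calculated)) / len(observed)


if __name__ == "__main__":
    # Hypothetical numbers for two conformers of a two-carbon fragment.
    shieldings = [[100.0, 150.0], [110.0, 160.0]]
    energies = [-154.000000, -154.000000]
    calc = averaged_shifts(shieldings, energies, reference_shielding=190.0)
    print(calc, mean_absolute_error([85.0, 36.0], calc))
```

A single summary number such as the mean absolute error is only a starting point; the per-atom deviations are what would show a mis-assigned peak.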