Peter, thanks very much for the interesting posting.
Warren Hehre from Wavefunction has been calculating spectra for all NMRShiftDB structures using Spartan at the Hartree-Fock level. We have a CD with all of them (Stefan, is that right?), but I think we never got all the information on the format. I myself have calculated about 2000 Gaussian spectra for NMRShiftDB on a 16-node cluster that came with my NMRShiftDB grant for exactly this purpose.

BTW, there has been an interesting article on the topic:

Structure Validation of Natural Products by Quantum-Mechanical GIAO Calculations of 13C NMR Chemical Shifts
Giampaolo Barone, et al.
Chemistry - A European Journal, Volume 8, Issue 14, Pages 3233 - 3239

I would suggest an extended protocol, which slows things down a bit but makes a better research project: for the more floppy compounds (probably not for molecules like alpha-Pinene, http://en.wikipedia.org/wiki/Alpha-Pinene), we should start by generating a number of lowest-energy conformers, do a Gaussian calculation for each of them, and try averaging the shifts. The protocol will be a bit more fragile, though, I guess.

Another issue is whether there is an option for simulating polar solvents and their effects on the shifts.

There seems to be general agreement that you get the best results by calculating both the geometry optimization and the GIAO shifts at the B3LYP/6-31G(d) level, but I trust whatever Henry suggests :-)
With my own Gaussian calculations, I got excellent results for rigid and nonpolar compounds, just as Barone suggested in his paper.

Please let me know if I can help.

Cheers,

Chris

peter murray-rust wrote:
> Is the NMRShiftDB list active?
>
> Here's what we want to do anyway...
>
> It is now possible to compute the 13C chemical shifts of organic
> compounds to an exciting degree of accuracy. Henry and I were talking
> about this yesterday and if the compound is fairly rigid the results
> are believable enough to assign most peaks and to correct errors.
>
> We want to apply this method to as many spectra in NMRShiftDB as we
> can. It will be limited by time, flexibility and lots of scientific
> effects that we haven't thought of and which will be discovered by the
> process.
>
> The MW limit is ca 500 (several days) though we'd prefer smaller ones
> to start with in which case they might take half a day each. Nick Day
> has already setup an effective workflow for crystals from crystalEye
> and runs these on much the same timescale - we have processed many
> thousand jobs on the Condor system. We now have Gaussian which -
> according to Henry - should be easy to configure.
>
> Nick is finishing the experimental part of his thesis and this would
> form an interesting final piece.
>
> So what we have to do is (and it must be automatic)
> * extract spectra and connection table from NMRShiftDB
> * generate 3D coordinates
> * generate Gaussian input according to Henry's protocols.
> * run the jobs on Condor (or elsewhere)
> * collect the output
> * parse into CML
> * expose the results on our web site (this is where the Open Science comes
>   in).
> * annotate the results (humans and machine)
> * display the results (primarily agreement between observed and
>   calculated values, but also much else).
>
> The immediate difficulties we can see are:
> * not knowing the stereochemistry. *** WHAT IS THE POSITION IN
>   NMRSHIFTDB? We can filter out anything that has more than one
>   potential stereocentre.
> * assumptions about completeness of data in NMRShiftDB
> * syntactic problems (almost certain to occur in any large data set)
> * generating initial 3d coordinates. There are several simple approaches:
>    - look up the moiety in crystaleye
>    - join moieties in crystaleye
>    - use CDK - I think Christoph had something here?
>    - use the 2D coordinates for "flat" molecules
>    *** NEW APPROACHES - MUST BE BLUEOBELISK
> * optimising the coordinates. Probably a cheap level of theory (PM3)
>   would work.
> * parsing the Gaussian output (though JUMBOMarker has already been
>   tuned for some Gaussian jobs). It may also be that the archive is enough
> * scale. After a few hundred jobs SOME effect of scale will hit us.
>   We don't know what but every project of this sort has these scale problems
> * wikifying the results. Ideally we like to expose the results in a
>   very similar way to crystaleye with 2D and 3D coordinates and with an
>   observed/calc graph. Then we have to protect against spam.
>   Suggestions would be welcome
>
> At present I'd like to know of any immediate problems that we haven't
> thought of. If not I suspect Nick will simply download all the data.
>
> Compute resources are probably not the problem at present. But later
> we may ask for volunteers.
>
> If this is a success there is a much wider vision. You don't need me
> to spell it out, and it's probably a good idea to keep things low key.
>
> P.
>
>
>
> Peter Murray-Rust
> Unilever Centre for Molecular Sciences Informatics
> University of Cambridge,
> Lensfield Road, Cambridge CB2 1EW, UK
> +44-1223-763069
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Blueobelisk-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss

--
PD Dr. Christoph Steinbeck
Lecturer in Chemoinformatics
Univ. Tuebingen, WSI-RA, Sand 1,
D-72076 Tuebingen, Germany
Phone: (+49/0) 7071-29-78978 Fax: (+49/0) 7071-29-5091

What is man but that lofty spirit - that sense of
enterprise. ... Kirk, "I, Mudd," stardate 4513.3..
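
A rough sketch of the "averaging shifts" step for floppy compounds, in case it helps the discussion. It assumes the averaging is Boltzmann-weighted by relative conformer energy; the message above only says "averaging", so the weighting scheme is an assumption, as are the function names, the kcal/mol energy unit and the 298.15 K default. None of this is taken from NMRShiftDB, Gaussian or any existing tool; the per-conformer shifts and energies would have to come from the parsed Gaussian output, with atoms in the same order for every conformer.

```python
import math

# Gas constant in kcal/(mol*K); energies are assumed to be given in kcal/mol.
R_KCAL = 0.0019872041


def boltzmann_weights(energies, temperature=298.15):
    """Return normalised Boltzmann populations for a list of conformer energies."""
    if not energies:
        raise ValueError("need at least one conformer energy")
    lowest = min(energies)  # shift to the lowest energy for numerical stability
    factors = [math.exp(-(e - lowest) / (R_KCAL * temperature)) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def average_shifts(conformer_shifts, energies, temperature=298.15):
    """Population-weighted mean shift per atom.

    conformer_shifts: one list of shifts (ppm) per conformer, same atom order.
    energies: one energy (kcal/mol) per conformer, in the same conformer order.
    """
    if len(conformer_shifts) != len(energies):
        raise ValueError("one energy is required per conformer")
    n_atoms = len(conformer_shifts[0])
    if any(len(shifts) != n_atoms for shifts in conformer_shifts):
        raise ValueError("all conformers must list the same atoms")
    weights = boltzmann_weights(energies, temperature)
    return [
        sum(w * shifts[i] for w, shifts in zip(weights, conformer_shifts))
        for i in range(n_atoms)
    ]
```

With equal energies this reduces to a plain mean over conformers; as the energy gap grows, the result approaches the shifts of the lowest-energy conformer, which is the rigid-molecule case.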
