[Rdkit-discuss] molecular descriptors in C++
Hi, I'm trying to calculate molecular descriptors in C++ with the RDKit. Does anyone have a code example that could help in this case? Thanks a lot, Gonzalo Colmenarejo -- Introducing Performance Central, a new site from SourceForge and AppDynamics. Performance Central is your source for news, insights, analysis and resources for efficient Application Performance Management. Visit us today! http://pubads.g.doubleclick.net/gampad/clk?id=48897511iu=/4140/ostg.clktrk___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] molecular descriptors in C++
Also, to add to my previous email: when I started out with the C++ RDKit I found it really useful to dig into the source code. In particular, look through the code written to test the descriptors; more often than not you can adapt what Greg and co. have done already to do what you want.

Best,
Nick

Nicholas C. Firth | PhD Student | Cancer Therapeutics | The Institute of Cancer Research
Re: [Rdkit-discuss] molecular descriptors in C++
Hi Gonzalo,

I shamelessly only have an example using PBF. Forgive the slightly dirty C++ coding, but you can get the idea of descriptor calculation.

    #include <string>
    #include <GraphMol/RDKitBase.h>
    #include <GraphMol/FileParsers/MolSupplier.h>
    #include <GraphMol/FileParsers/MolWriters.h>

    using namespace std;
    using namespace RDKit;

    // PBFRD (the PBF descriptor) is not defined in this message;
    // this declaration is an assumed signature for it.
    double PBFRD(ROMol &mol);

    int main(int argc, char *argv[]) {
      if (argc < 2) return 1;  // expects the input SD file as first argument
      string fileName = argv[1];
      // sanitize=false, removeHs=false: read the molecules as stored
      SDMolSupplier reader(fileName, false, false);
      // drop the 4-character extension and build the output file name
      fileName = fileName.substr(0, fileName.size() - 4);
      fileName += "_Scored_PBF.sdf";
      SDWriter *writer = new SDWriter(fileName);
      while (!reader.atEnd()) {
        ROMol *m = reader.next();
        // MolOps::removeHs(*m);
        if (!m) continue;  // skip records that could not be read
        double dpbf = PBFRD(*m);
        m->setProp("PBF_Score", dpbf);  // store the descriptor as an SD property
        writer->write(*m);
        delete m;
      }
      writer->flush();
      writer->close();
      delete writer;
      return 0;
    }

I hope that is helpful.

Best,
Nick

On 27 Aug 2013, at 15:05, Gonzalo Colmenarejo-Sanchez wrote:
> Hi, I'm trying to calculate molecular descriptors in C++ with the RDKit. Does anyone have a code example that could help in this case?
Re: [Rdkit-discuss] molecular descriptors in C++
Thanks to both for the fast and extremely helpful replies!

Gonzalo

From: Greg Landrum
Sent: 27 August 2013 16:39
To: rdkit-discuss@lists.sourceforge.net
Cc: Gonzalo Colmenarejo-Sanchez
Subject: Re: [Rdkit-discuss] molecular descriptors in C++

> Nick beat me to it (thanks Nick!). I was going to suggest that a good place to look is the testing code:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/test.cpp
>
> -greg
>
> On Tue, Aug 27, 2013 at 4:32 PM, Nicholas Firth wrote:
>> Also to add to my previous email, when I started out with C++ RDKit I found it really useful to dig into the source code. [...]
[Rdkit-discuss] name generator
Hi, is there any IUPAC name generator in RDKit? e.g. for transforming CC(C)O into propan-2-ol ?

Many thanks
Sergio
Re: [Rdkit-discuss] name generator
Thanks Greg, indeed, I only found commercial software for it:
http://www.chemaxon.com/marvin/help/applications/molconvert.html

Cheers,
Sergio

On 27 August 2013 16:45, Greg Landrum wrote:
> Dear Sergio,
>
> On Tue, Aug 27, 2013 at 5:21 PM, Sergio Martinez Cuesta wrote:
>> is there any IUPAC name generator in RDKit? e.g. for transforming CC(C)O into propan-2-ol ?
>
> There is not. In fact, I'm not aware of any open source structure-name converters.
>
> -greg
Re: [Rdkit-discuss] name generator
Hi Sergio,

here is a solution that uses a free web service offered by the NIH. It is independent of the RDKit but rather slow. Still, if you don't need to process too many molecules at a time, or if time is not the critical factor, it could serve as an interim solution:

    # Python 2 (urllib2)
    import urllib2

    def smi_to_iupac(smi):
        try:
            url = 'http://cactus.nci.nih.gov/chemical/structure/' + smi + '/iupac_name'
            iupacName = urllib2.urlopen(url).read()
            # print iupacName
            return iupacName
        except urllib2.HTTPError, e:
            print "HTTP error: %d" % e.code
            return None
        except urllib2.URLError, e:
            print "Network error: %s" % e.reason.args[1]
            return None
        except:
            print "conversion failed for smiles " + smi
            return None

    smiles = ["CC(O)C", "CC(=O)O", "O=C2OCC(=C2\c1c1)\c3ccc(cc3)S(=O)(=O)C"]
    for s in smiles:
        print smi_to_iupac(s)

returns

    Propan-2-ol
    acetic acid
    4-(4-methylsulfonylphenyl)-3-phenyl-5H-furan-2-one

By the way, this service offers conversions between many different molecule formats/identifiers. I have used it in the past for CAS number look-up.

Best,
Markus

On 08/27/2013 05:21 PM, Sergio Martinez Cuesta wrote:
> Hi, is there any IUPAC name generator in RDKit? e.g. for transforming CC(C)O into propan-2-ol ?
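For readers on Python 3, where urllib2 no longer exists, the script above could be ported roughly as follows. This is only a sketch: the base URL is the one given in the thread, while `cir_url`, `CIR_BASE` and the `timeout` parameter are names introduced here for illustration. It also percent-encodes the SMILES, since characters such as `#` would otherwise be interpreted as part of the URL syntax; whether the service accepts every encoded character is an assumption.

```python
import urllib.error
import urllib.parse
import urllib.request

CIR_BASE = "http://cactus.nci.nih.gov/chemical/structure"  # base URL from the thread


def cir_url(smi, representation="iupac_name"):
    """Build the request URL, percent-encoding the SMILES (hypothetical helper)."""
    # safe="" encodes every reserved character, including "/" and "#".
    return "%s/%s/%s" % (CIR_BASE, urllib.parse.quote(smi, safe=""), representation)


def smi_to_iupac(smi, timeout=30):
    """Return the name reported by the service, or None on any HTTP/network error."""
    try:
        with urllib.request.urlopen(cir_url(smi), timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except urllib.error.HTTPError as err:  # must come before URLError (its base class)
        print("HTTP error: %d" % err.code)
    except urllib.error.URLError as err:
        print("Network error: %s" % err.reason)
    return None
```

Calling `smi_to_iupac("CC(C)O")` issues one HTTP request per molecule, so the remark about speed still applies.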
Re: [Rdkit-discuss] name generator
I think this is not an actual structure-to-name converter but a look-up service based on a predefined dictionary. If that is true, it won't return anything for novel/unseen structures. Give it a try and let us know.

George.

Sent from my giPhone

On 27 Aug 2013, at 18:39, David Hall wrote:
> Not sure what software is behind it, but the NCI's Chemical Identifier Resolver may suit your needs.
>
> For your example, the URL:
> http://cactus.nci.nih.gov/chemical/structure/CC(C)O/iupac_name
> returns
> Propan-2-ol
>
> -David
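If the service is indeed a dictionary look-up, a batch run needs to separate structures that have a stored name from those that do not. The sketch below shows one way to do that in Python 3. It assumes that a missing entry is reported as HTTP 404, which the thread does not confirm; `lookup_name`, `split_hits` and the injectable `fetch` argument are hypothetical names, the latter added so the logic can be exercised without network access.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE = "http://cactus.nci.nih.gov/chemical/structure"  # base URL from the thread


def fetch_text(url):
    """Default fetcher: plain HTTP GET returning the decoded response body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def lookup_name(smi, fetch=fetch_text):
    """Return the stored name for a SMILES, or None if the service has no entry."""
    url = "%s/%s/iupac_name" % (BASE, urllib.parse.quote(smi, safe=""))
    try:
        return fetch(url).strip()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumption: "not in the database" is signalled as 404
            return None
        raise  # anything else is a real failure, not a miss


def split_hits(smiles, fetch=fetch_text):
    """Partition SMILES into {smiles: name} for hits and a list of misses."""
    named, missing = {}, []
    for smi in smiles:
        name = lookup_name(smi, fetch)
        if name is None:
            missing.append(smi)
        else:
            named[smi] = name
    return named, missing
```

Keeping the misses in their own list makes it easy to pass them to a different tool afterwards, rather than silently dropping them.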
Re: [Rdkit-discuss] name generator
Hi, did you try http://opsin.ch.cam.ac.uk/ ?

Vladimir Chupakhin

On Tue, Aug 27, 2013 at 6:48 PM, Markus Hartenfeller wrote:
> Hi Sergio, here is a solution that uses a free web service offered by the NIH. [...]
Re: [Rdkit-discuss] name generator
Yes, in this direction (structure to name) the Resolver is only a database lookup. In the other direction (name to structure), it first uses OPSIN (Daniel Lowe's library), which can resolve correct IUPAC names generically; if OPSIN "fails", it does a database lookup, too.

Markus
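The name-to-structure direction described here goes through the same URL pattern, with the name as the identifier and a structure format as the requested representation. A minimal Python 3 sketch, assuming the `smiles` representation is what is wanted; `resolver_url` and `name_to_smiles` are names made up for this example, and the base URL is the one quoted in the thread:

```python
import urllib.parse
import urllib.request

CIR_BASE = "http://cactus.nci.nih.gov/chemical/structure"  # base URL from the thread


def resolver_url(identifier, representation):
    """Build a Resolver URL; names may contain spaces, so percent-encode them."""
    return "%s/%s/%s" % (CIR_BASE, urllib.parse.quote(identifier, safe=""), representation)


def name_to_smiles(name, timeout=30):
    """Ask the Resolver for a SMILES for the given chemical name.

    Per the message above, the service tries OPSIN first and then a
    database lookup; that happens server-side and is not visible here.
    """
    with urllib.request.urlopen(resolver_url(name, "smiles"), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()
```

Because systematic names are parsed generically on this side, a correct IUPAC name can resolve even when the compound is in no database, which is the asymmetry the message points out.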
Re: [Rdkit-discuss] name generator
Oc(:[nH2]):[nH2] does not seem to be in the database:
http://cactus.nci.nih.gov/chemical/structure/Oc(:[nH2]):[nH2]/iupac_name

molconvert does not generate a name either.

On 27 August 2013 18:54, Markus Sitzmann wrote:
> Yes, in this direction (structure to name) the Resolver is only a database lookup, in the other direction (name to structure), it first uses OPSIN (Daniel Lowe's library) [...]
Re: [Rdkit-discuss] name generator
On Tue, Aug 27, 2013 at 10:32 PM, Sergio Martinez Cuesta wrote:
> Oc(:[nH2]):[nH2] does not seem to be in the database
> http://cactus.nci.nih.gov/chemical/structure/Oc(:[nH2]):[nH2]/iupac_name
> molconvert does not generate a name either.

That's not actually a stable molecule; it is, at best, a piece of a molecule. OC(N)N works fine with the NCI lookup.

What molecule are you trying to name?

-greg