Re: [Rdkit-discuss] Building RDKit on Windows
Thanks Greg - that did the trick! (I still see pythonTestDbCLI - as previously posted)

Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938

To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.
__

Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Building RDKit on Windows
On Wednesday, March 5, 2014, James Davidson j.david...@vernalis.com wrote:

> Thanks Greg - that did the trick! (I still see pythonTestDbCLI - as previously posted)

Those problems are due to Windows not being able to delete files that it has just closed. I need to put the deletion bit in a try..except block. The failures probably do not indicate any actual problem.

-greg
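The try..except idea mentioned above could look roughly like the sketch below. This is only an illustration under assumptions: `remove_quietly` is a hypothetical helper name and the scratch file is made up; it is not the code used in the RDKit test suite.

```python
import os
import tempfile


def remove_quietly(path):
    """Try to delete `path`; return True on success, False if the OS refuses."""
    try:
        os.remove(path)
        return True
    except OSError:
        # On Windows a file that was only just closed can still be locked.
        # PermissionError and FileNotFoundError are both subclasses of
        # OSError, so either case is swallowed here instead of failing a test.
        return False


# Usage: create a scratch file, close it, then clean it up.
fd, scratch = tempfile.mkstemp(suffix=".tmp")
os.close(fd)
removed = remove_quietly(scratch)
```

Returning a flag rather than raising lets a test's cleanup step note the leftover file without turning it into a test failure.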
[Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.
Hi,

About ready to push a changeset for implementing mol_to_ctab(), but I would like it to play nice and preserve input depictions. Ideally I would like the following

    select mol_to_ctab(mol_from_ctab(input-molfile));

to output a molfile where the coordinates of input-molfile are preserved. If I do that in Python it works:

    >>> from rdkit import Chem
    >>> m = Chem.MolFromMolBlock("""chiral1.mol
    ...   ChemDraw04200416412D
    ...
    ...   5  4  0  0  0  0  0  0  0  0999 V2000
    ...    -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    ...     0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    ...    -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0
    ...    -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    ...    -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    ...   1  2  1  0
    ...   1  3  1  0
    ...   1  4  1  1
    ...   1  5  1  0
    ... M  END""")
    >>> m
    <rdkit.Chem.rdchem.Mol object at 0x1240980>
    >>> m.GetNumConformers()
    1
    >>> Chem.MolToMolBlock(m)
    'chiral1.mol\n     RDKit          2D\n\n  5  4  0  0  0  0  0  0  0  0999 V2000\n   -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0\n   -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0\n   -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0\n   -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n  1  2  1  6\n  1  3  1  0\n  1  4  1  0\n  1  5  1  0\nM  END\n'
    >>> quit()

In the PG cartridge I lose the conformer of the input.
My implementation looks like this:

rdkit_io.c:

    PG_FUNCTION_INFO_V1(mol_to_ctab);
    Datum mol_to_ctab(PG_FUNCTION_ARGS);
    Datum
    mol_to_ctab(PG_FUNCTION_ARGS) {
      CROMol mol;
      char *str;
      int len;

      fcinfo->flinfo->fn_extra = SearchMolCache(
          fcinfo->flinfo->fn_extra,
          fcinfo->flinfo->fn_mcxt,
          PG_GETARG_DATUM(0),
          NULL, &mol, NULL);

      bool createDepictionIfMissing = PG_GETARG_BOOL(1);
      str = makeCtabText(mol, &len, createDepictionIfMissing);

      PG_RETURN_CSTRING( pnstrdup(str, len) );
    }

adapter.cpp:

    extern "C" char *
    makeCtabText(CROMol data, int *len, bool createDepictionIfMissing) {
      ROMol *mol = (ROMol*)data;

      try {
        ereport(NOTICE,
                (errcode(ERRCODE_SUCCESSFUL_COMPLETION),
                 errmsg("mol conformer count = %d", mol->getNumConformers())));
        if (createDepictionIfMissing && mol->getNumConformers() == 0) {
          RDDepict::compute2DCoords(*mol);
        }
        StringData = MolToMolBlock(*mol);
      } catch (...) {
        ereport(WARNING,
                (errcode(ERRCODE_WARNING),
                 errmsg("makeCtabText: problems converting molecule to CTAB")));
        StringData="";
      }

      *len = StringData.size();
      return (char*)StringData.c_str();
    }

If I run the Python example equivalent from psql:

    postgres=# select mol_to_ctab(mol_from_ctab('chiral1.mol
      ChemDraw04200416412D

      5  4  0  0  0  0  0  0  0  0999 V2000
       -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0
       -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
       -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0
      1  3  1  0
      1  4  1  1
      1  5  1  0
    M  END', false));
    NOTICE:  mol conformer count = 0
     mol_to_ctab
    ---
                                                                           +
         RDKit          2D                                                 +
                                                                           +
      5  4  0  0  0  0  0  0  0  0999 V2000                                +
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0  +
       -1.5000    0.0000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0  +
       -0.0000   -1.5000    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0  +
        0.0000    1.5000    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0  +
        1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0  +
      1  2  1  6                                                           +
      1  3  1  0                                                           +
      1  4  1  0                                                           +
      1  5  1  0                                                           +
    M  END                                                                 +

    (1 row)

    postgres=#

Something I missed about querying a mol for conformers? As of now I lose the input conformer and the code will always output a calculated-from-scratch depiction.
Cheers
-- Jan
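One way to check whether a round trip such as mol_to_ctab(mol_from_ctab(...)) preserved the input depiction is to compare the atom coordinates of the two molfiles directly. The following is a rough, standard-library-only sketch, assuming V2000 fixed-width atom lines; `molblock_coords` and `same_depiction` are hypothetical helper names, not part of RDKit or the cartridge, and `RECOMPUTED` only mimics the first atom of the recomputed output shown above.

```python
def molblock_coords(molblock):
    """Return [(symbol, x, y, z), ...] from the atom block of a V2000 molfile."""
    lines = molblock.split("\n")
    n_atoms = int(lines[3][0:3])      # counts line follows the 3 header lines
    atoms = []
    for line in lines[4:4 + n_atoms]:
        # x, y and z each occupy a 10-character fixed-width field.
        x, y, z = (float(line[i:i + 10]) for i in (0, 10, 20))
        atoms.append((line[31:34].strip(), x, y, z))
    return atoms


def same_depiction(block_a, block_b, tol=1e-4):
    """True if both molfiles list the same atoms at the same coordinates."""
    a, b = molblock_coords(block_a), molblock_coords(block_b)
    if len(a) != len(b):
        return False
    return all(
        sa == sb and all(abs(p - q) <= tol for p, q in zip(ca, cb))
        for (sa, *ca), (sb, *cb) in zip(a, b)
    )


ORIGINAL = """chiral1.mol
  ChemDraw04200416412D

  5  4  0  0  0  0  0  0  0  0999 V2000
   -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
   -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0
   -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
   -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  1  3  1  0
  1  4  1  1
  1  5  1  0
M  END"""

# Stand-in for a recomputed depiction: the first atom is moved to the origin.
RECOMPUTED = ORIGINAL.replace("   -0.0141    0.0553", "    0.0000    0.0000")

print(same_depiction(ORIGINAL, ORIGINAL))
print(same_depiction(ORIGINAL, RECOMPUTED))
```

Comparing parsed coordinates rather than raw text avoids false alarms from header differences such as the program name on the second line.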
Re: [Rdkit-discuss] SMARTS/SMARTS and SMILES/SMARTS substructure matching
Hi Greg,

Thanks a lot for the explanation. It makes things clearer now.

Well, the reason I'm doing a SMARTS-SMARTS match is that I would like to match functional groups with the reactants in reactions.

Regards,

Christos

Christos Kannas
Researcher
Ph.D Student

On 5 March 2014 04:44, Greg Landrum greg.land...@gmail.com wrote:

> Hi Christos,
>
> On Tue, Mar 4, 2014 at 3:46 PM, Christos Kannas chriskan...@gmail.com wrote:
>
>> Hi all,
>>
>> Why does the following happen?
>>
>> In [1]: from rdkit import Chem
>> In [2]: from rdkit.Chem import AllChem
>> In [3]: from rdkit.Chem import Draw
>> In [4]: patt = Chem.MolFromSmarts("[CH;D2;!$(C-[!#6;!#1])]=O")
>> In [5]: z2 = Chem.MolFromSmarts("[*]-C-C([H])(=O)", 1)
>> In [6]: print Chem.MolToSmiles(z2)
>> [*]CC=O
>> In [7]: print Chem.MolToSmarts(z2)
>> *-C-[C!H0]=O
>> In [9]: z2.HasSubstructMatch(patt)
>> Out[9]: False
>> In [10]: z3 = Chem.MolFromSmiles(Chem.MolToSmiles(z2))
>> In [11]: print Chem.MolToSmiles(z3)
>> [*]CC=O
>> In [12]: print Chem.MolToSmarts(z3)
>> [*]-[#6]-[#6]=[#8]
>> In [13]: z3.HasSubstructMatch(patt)
>> Out[13]: True
>>
>> Shouldn't z2 and z3 have the same information?
>
> The way SMARTS/SMARTS matches are handled is different from the way SMARTS/SMILES matches work.
>
> The short answer is that when doing a SMARTS/SMARTS match, the RDKit compares the queries to each other; when doing a SMARTS/SMILES match, on the other hand, it checks to see if the atoms in the SMILES molecule match the queries in the SMARTS molecule.
>
> A bit longer answer: Molecules built using MolFromSmiles contain Atoms, molecules built using MolFromSmarts contain QueryAtoms. Both Atoms and QueryAtoms have a Match() method that takes another Atom or QueryAtom as an argument and returns whether or not the two match. The substructure matching code makes heavy use of this Match() method.
>
> QueryAtom.Match(Atom) checks to see if the Atom satisfies the query.
> QueryAtom.Match(QueryAtom) checks to see if the queries on the atoms are the same. This uses a crude approach that is easy to fool, but I assume that a SMARTS-SMARTS match is not a frequent thing someone wants to do. Query-query matching is also not a particularly easy problem to solve in a general way.
>
> -greg
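The difference between the two kinds of Match() described above can be illustrated with a toy model. This is only a sketch of the idea, not RDKit's implementation: the `Atom` and `QueryAtom` classes below are stand-ins written for this example, and comparing query labels as strings is an assumed simplification of the "are the queries the same" check.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Atom:
    """A concrete atom, as a SMILES-derived molecule would hold."""
    symbol: str
    total_hs: int


@dataclass(frozen=True)
class QueryAtom:
    """A query: a label plus a predicate over concrete atoms."""
    label: str
    predicate: Callable[[Atom], bool]

    def match(self, other):
        if isinstance(other, QueryAtom):
            # Query vs. query: a crude "are the queries the same?" test.
            return self.label == other.label
        # Query vs. atom: does the atom satisfy the query?
        return self.predicate(other)


exactly_one_h = QueryAtom("[CH]", lambda a: a.symbol == "C" and a.total_hs == 1)
at_least_one_h = QueryAtom("[C!H0]", lambda a: a.symbol == "C" and a.total_hs != 0)
aldehyde_carbon = Atom("C", 1)

print(exactly_one_h.match(aldehyde_carbon))   # the atom satisfies the query
print(at_least_one_h.match(aldehyde_carbon))  # it satisfies this one as well
print(exactly_one_h.match(at_least_one_h))    # but the two queries differ
```

In this toy, one concrete carbon satisfies both queries, yet the queries do not match each other, which is the same shape of result as z3 matching patt while z2 does not.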
Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring
Thanks all for the informative and helpful responses; the behaviour I was struggling to understand now makes perfect sense.

Toby Wright
--
InhibOx Ltd

On 4 March 2014 04:06, Greg Landrum greg.land...@gmail.com wrote:

> Bob hit the nail on the head.
>
> The first case, N1N=CC=C1, is aromatic because the RDKit sees that the first nitrogen has two bonds to it, assigns a hydrogen, and then sees a conjugated pi system with 6 electrons that is flagged as aromatic. Something similar would happen with the aromatic form [nH]1nccc1: first the ring system is kekulized to yield N1N=CC=C1, then the sanitization proceeds from there. The same thing would happen with the equivalent n1[nH]ccc1.
>
> The second case, N1=NC=CC1, has a C (the last one) that only has single bonds to it. This is assigned sp3 hybridization, so there's no conjugated ring system for aromaticity to be perceived in.
>
> The final case, n1nccc1, is an instance of the pyrrole problem: aromatic N's that need an implicit H on them should have that implicit H present in the aromatic SMILES.
>
> -greg
>
> On Mon, Mar 3, 2014 at 5:59 PM, Bob Funchess bfunch...@kelaroo.com wrote:
>
>> Hi Toby,
>>
>> I'd say it's more of a limitation inherent in Kekule representations than an actual bug in RDKit. Trying to get too clever in figuring out what the user meant usually causes more harm than good.
>>
>> I'm not sure what version of RDKit you're using, but the aromatic specification with an explicit hydrogen on one of the nitrogen atoms works for me:
>>
>> >>> Chem.MolFromSmiles('n1[nH]ccc1').Debug();
>> Atoms:
>>   0 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
>>   1 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
>>   2 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
>>   3 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
>>   4 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
>> Bonds:
>>   0 0->1 order: 12 conj?: 1 aromatic?: 1
>>   1 1->2 order: 12 conj?: 1 aromatic?: 1
>>   2 2->3 order: 12 conj?: 1 aromatic?: 1
>>   3 3->4 order: 12 conj?: 1 aromatic?: 1
>>   4 4->0 order: 12 conj?: 1 aromatic?: 1
>>
>> The double bonds in the Kekule representations here can be between atom pairs 1,2 and 3,4 or between atom pairs 2,3 and 4,0. Putting one between pair 0,1 leaves atom 4 with two single bonds to it (and therefore, to satisfy valence requirements, two implicit hydrogens); I'm not horribly surprised that RDKit perceives that as aliphatic. You can see that's what's happening in your second example where the hybridization of atom 4 is 4 (sp3) instead of 3 (sp2).
>>
>> Regards,
>> Bob
>>
>> --
>> Bob Funchess, Ph.D.
>> Senior Scientist
>> Kelaroo, Inc
>> www.kelaroo.com
>> bfunch...@kelaroo.com
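Bob's count of the possible Kekule forms can be reproduced by brute force. The sketch below is a hand-rolled enumeration for this one five-membered ring, using the atom numbering from the Debug() output; it is not how RDKit perceives aromaticity, and the function names are made up for the illustration.

```python
from itertools import combinations

RING = ["N", "N", "C", "C", "C"]              # atom indices 0..4
BONDS = [(i, (i + 1) % 5) for i in range(5)]  # ring bonds 0-1, 1-2, 2-3, 3-4, 4-0


def kekule_forms():
    """Yield (double_bonds, leftover_atom) for every way to place two
    non-adjacent double bonds in the five-membered ring."""
    for pair in combinations(BONDS, 2):
        atoms = [a for bond in pair for a in bond]
        if len(set(atoms)) == 4:              # the double bonds share no atom
            leftover = (set(range(5)) - set(atoms)).pop()
            yield pair, leftover


def aromatic_candidates():
    """Keep the forms whose leftover atom is a nitrogen. An N-H there can
    contribute a lone pair (2 + 2*2 = 6 pi electrons); a leftover carbon
    would instead carry two hydrogens and be sp3."""
    return [(pair, left) for pair, left in kekule_forms() if RING[left] == "N"]


for pair, leftover in kekule_forms():
    kind = "N-H, conjugated" if RING[leftover] == "N" else "CH2, sp3"
    print(pair, "leftover atom", leftover, kind)
```

Only two of the five placements leave a nitrogen without a double bond, and they are the pairs 1,2 with 3,4 and 2,3 with 4,0 named in the message above.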
Re: [Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.
Hi Jan,

The below behavior is the result of a bug (https://github.com/rdkit/rdkit/issues/229).

mol_from_ctab() takes an (undocumented) optional argument that is supposed to determine whether or not the molecule's conformation is stored in the database. The default is to not store the conformation; this reduces the size of the database and the time needed to depickle molecules. The bug is that even if you try to keep the conformation the argument is ignored and the conformation is discarded.

I'll get this fixed tomorrow morning. Alternatively, if you want to fix it now, the change just needs to be made in the definition of mol_from_ctab() in rdkit_io.c.

-greg

On Wed, Mar 5, 2014 at 10:27 AM, Jan Holst Jensen j...@biochemfusion.com wrote:

> About ready to push a changeset for implementing mol_to_ctab(), but I would like it to play nice and preserve input depictions. Ideally I would like the following
>
>     select mol_to_ctab(mol_from_ctab(input-molfile));
>
> to output a molfile where the coordinates of input-molfile are preserved.
> [...]
> In the PG cartridge I lose the conformer of the input.
> [...]
Re: [Rdkit-discuss] SMARTS/SMARTS and SMILES/SMARTS substructure matching
Hi,

This is probably related to the above so I thought I'd post it on this thread. I am noticing inconsistent behaviour when a molecule created via SMARTS that contains an 'or' statement has HasSubstructMatch called on it, as opposed to it being the argument to HasSubstructMatch. A simple example follows:

    >>> O_or_C = Chem.MolFromSmarts('[O,C]')
    >>> O = Chem.MolFromSmiles('O')
    >>> C = Chem.MolFromSmiles('C')
    >>> O_or_C.HasSubstructMatch(O)
    True
    >>> O_or_C.HasSubstructMatch(C)
    False
    >>> O.HasSubstructMatch(O_or_C)
    True
    >>> C.HasSubstructMatch(O_or_C)
    True

We also see:

    >>> C_or_O = Chem.MolFromSmarts('[C,O]')
    >>> C_or_O.HasSubstructMatch(O)
    False
    >>> C_or_O.HasSubstructMatch(C)
    True

so the order of elements in a SMARTS 'or' statement changes the behaviour, which is unexpected.

Yours,

Toby Wright
--
InhibOx Ltd

On 5 March 2014 10:10, Christos Kannas chriskan...@gmail.com wrote:

> Hi Greg,
>
> Thanks a lot for the explanation. It makes things clearer now.
>
> Well the reason I'm doing SMARTS-SMARTS match is because I would like to match functional groups with the reactants in reactions.
>
> Regards,
>
> Christos
> [...]
Re: [Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.
Hi Greg,

Thanks for the explanation. I added this to rdkit_io.c in mol_from_ctab():

    + bool keepConformer = PG_GETARG_BOOL(1);
    - mol = parseMolCTAB(data,false,true);
    + mol = parseMolCTAB(data,keepConformer,true);

and then I can get the expected behavior and have my tests complete successfully. Yes :-).

I will go ahead and create a pull request for mol_to_ctab(). The tests for mol_to_ctab() will assume that mol_from_ctab() uses the optional parameter to keep the conformer.

Cheers
-- Jan

On 2014-03-05 13:25, Greg Landrum wrote:

> Hi Jan,
>
> The below behavior is the result of a bug (https://github.com/rdkit/rdkit/issues/229).
> [...]
> Alternatively, if you want to fix it now, the change just needs to be made in the definition of mol_from_ctab() in rdkit_io.c
>
> -greg
> [...]
Re: [Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.
Thanks Jan. The fix and pull request have both been integrated.

-greg

On Wed, Mar 5, 2014 at 7:35 PM, Jan Holst Jensen j...@biochemfusion.com wrote:

> Hi Greg,
>
> Thanks for the explanation. I added this to rdkit_io.c in mol_from_ctab():
> [...]
> I will go ahead and create a pull request for mol_to_ctab(). The tests for mol_to_ctab() will assume that mol_from_ctab() uses the optional parameter to keep the conformer.
>
> Cheers
> -- Jan
> [...]