Re: [Rdkit-discuss] rdDeprotect & DeprotectData
Hi Katrina,

I must confess I haven't actually used rdDeprotect before (I have always created reactions and called the RunReactants() method)... I just tried your use case, and I think it is working as you would like (I can't immediately see what is wrong in the original code you posted). Here is a gist showing it (I hope): https://gist.github.com/jepdavidson/ec1664a8bfa8b921262fc844c0e523e4

Kind regards
James

From: Katrina Lexa
Sent: 21 August 2023 14:58
To: James Davidson
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] rdDeprotect & DeprotectData

Hi James,

Thanks for the quick reply! You're quite right, I'm simply interested in the virtual reaction to remove the boronates. Thank you for fixing my incorrect mapping. At some point, I had had the aryl carbon properly specified, but I clearly lost my way with it along my quest. Sadly, the

    reaction_smarts = "[c:1][B;R0](O)O>>[*:1]"

still does not remove any of the boronates from my input SMILES, but it sounds like everything else about the specification of the reaction is correct, so I'll get there at some point with the right reaction_smarts.

Thanks again,
Katrina

PLEASE READ - This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. Vernalis (R) Limited (no. 1985479) Granta Park, Great Abington Cambridge, CB21 6GB, United Kingdom Tel: +44 (0)1223 895 555
Re: [Rdkit-discuss] rdDeprotect & DeprotectData
Hi Katrina,

I'm slightly unsure what "deprotection" you are trying to represent, but I think there are a couple of problems with the reaction SMARTS...

    reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"

This is looking for an aromatic carbon with one hydrogen AND connected to a non-ring boron. This pattern will never be found, because an aromatic ring carbon that carries the boron substituent has no hydrogen left. Also, you have a mapped atom on the reactant side, but no mapped atoms on the product side.

If your reaction is aiming to hydrolyse non-cyclic boronic esters (and return the alcohols), then you should map the oxygen atom on the product side as well - something like:

    reaction_smarts = "c[B;R0](O)[O:1]>>[*:1]"

If, instead, you are interested in the virtual reaction that removes boronates from aryl R-groups (perhaps to calculate R-group fingerprints, etc) - then you should map the aryl carbon on both sides instead:

    reaction_smarts = "[c:1][B;R0](O)O>>[*:1]"

In either case you probably want to deduplicate products (the boronic acids and esters will match the pattern twice).

Kind regards
James

From: Katrina Lexa
Sent: 21 August 2023 06:03
To: RDKit Discuss
Subject: [Rdkit-discuss] rdDeprotect & DeprotectData

Hi All,

I don't know why I'm struggling so much with this, as it seems like it should be pretty straightforward. I'm trying to add some additional deprotection SMIRKS to a data-cleaning Python script and I'm not having success with the new reactions actually transforming my reactants to deprotected SMILES. I have about 10 I'd like to add, so I know I could do it with simple reactions, but I'd rather figure out where I'm going wrong here.

My definition of deprotect data:

    # deborylation
    deprotection_class = "boron"
    reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"
    abbreviation = "BOO"
    full_name = "deboron"
    bdata = rdDeprotect.DeprotectData(deprotection_class, reaction_smarts, abbreviation, full_name)
    assert bdata.isValid()

I tried adding this line:

    newDeprotect = rdDeprotect.DeprotectDataVect().append(bdata)

but it seems to make no difference:

    try:
        #result = rdDeprotect.Deprotect(dep_m,deprotections=[bdata])
        result = rdDeprotect.Deprotect(dep_m,newDeprotect)

As an example, this is one of the SMILES strings in the SMILES file I'm reading in that I would expect to deprotect:

    Cc1cc(B(O)O)ccc1OC(C)C

Maybe I'm just awful at writing SMIRKS?

Thanks for the help here,
Katrina

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
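The suggested reaction SMARTS and the deduplication advice can be tried outside rdDeprotect with a plain reaction and RunReactants(), the route James says he normally takes. The following is only a sketch under that assumption; the function name remove_aryl_boronates is made up for illustration and is not part of RDKit.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def remove_aryl_boronates(smiles, reaction_smarts="[c:1][B;R0](O)O>>[*:1]"):
    """Apply the reaction to one molecule and return the unique product SMILES.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    rxn = AllChem.ReactionFromSmarts(reaction_smarts)
    unique = set()
    # The two boronate oxygens are equivalent, so the pattern matches twice and
    # the same product comes back twice; canonical SMILES removes the duplicate.
    for products in rxn.RunReactants((mol,)):
        for product in products:
            Chem.SanitizeMol(product)
            unique.add(Chem.MolToSmiles(product))
    return sorted(unique)


products = remove_aryl_boronates("Cc1cc(B(O)O)ccc1OC(C)C")
print(products)
```

Separately, note that append() on a Python list or on an RDKit vector wrapper returns None, so a variable assigned from `DeprotectDataVect().append(bdata)` holds None rather than the vector; building the container first and appending on a second line avoids that.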
Re: [Rdkit-discuss] Problem finding potential stereo-centres in bridged bicyclics involving 4-membered rings?
Hi Greg,

Thanks for the response (and sorry to be the bearer of bad news!). Issue added: https://github.com/rdkit/rdkit/issues/4155

Kind regards
James

From: Greg Landrum
Sent: 19 May 2021 14:59
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Problem finding potential stereo-centres in bridged bicyclics involving 4-membered rings?

Hi James,

I don't think that's the same bug as #3490. I think it's something different; "yay". ;-) It would be great if you could file a github issue for this.

Thanks,
-greg
[Rdkit-discuss] Problem finding potential stereo-centres in bridged bicyclics involving 4-membered rings?
Dear All,

I've got a strong suspicion that what I am seeing is related to the open issue 3490 (https://github.com/rdkit/rdkit/issues/3490), but as I can't seem to find a mention of a non-spiro problem I thought I would share. Tested in 2020.09.4 and 2021.03.2 with the same result.

    from rdkit import Chem

    smi_list = ['CC1CCC(CC1)C(N)=O', 'CC12CCC(CC1)(C2)C(N)=O', 'CC1CC(C1)C(N)=O', 'CC12CC(C1)(CC2)C(N)=O']
    for smi in smi_list:
        mol = Chem.MolFromSmiles(smi)
        display(show_mol(mol, size=(450, 200)))  # show_mol: my wrapper function for the new drawing code in Jupyter
        print(list(Chem.FindPotentialStereo(mol)))
        print(Chem.FindMolChiralCenters(mol, includeUnassigned=True, useLegacyImplementation=False))

The 4 cases are:

* Symmetrically-disubstituted 6-membered ring
* A bridged version (using a 1-atom bridge to avoid a completely symmetrical product)
* Symmetrically-disubstituted 4-membered ring
* A bridged version (this time using a 2-atom bridge to avoid symmetry)

And this is what I see:

[inline image of the four depictions and their output; not preserved]

In the case of the bridged 4-membered ring (or bridged 5-membered ring, depending on your viewpoint!), FindPotentialStereo() fails to identify the two potential stereo atoms. If anyone can spot whether this is the same issue as 3490, or something different, then that would be appreciated!

Kind regards
James
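For anyone wanting to repeat the comparison without the Jupyter display wrapper, here is a minimal sketch that reduces the FindPotentialStereo() output to the atom indices of potential tetrahedral centres. The helper name potential_tetrahedral_centres is an assumption made for illustration, and what it returns for the bridged cases will depend on the RDKit version, so no expected output is shown for them.

```python
from rdkit import Chem


def potential_tetrahedral_centres(smiles):
    """Return sorted atom indices that FindPotentialStereo flags as tetrahedral.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    return sorted(
        info.centeredOn
        for info in Chem.FindPotentialStereo(mol)
        if info.type == Chem.StereoType.Atom_Tetrahedral
    )


smi_list = [
    "CC1CCC(CC1)C(N)=O",       # symmetrically disubstituted 6-membered ring
    "CC12CCC(CC1)(C2)C(N)=O",  # bridged version, 1-atom bridge
    "CC1CC(C1)C(N)=O",         # symmetrically disubstituted 4-membered ring
    "CC12CC(C1)(CC2)C(N)=O",   # bridged version, 2-atom bridge
]
report = {smi: potential_tetrahedral_centres(smi) for smi in smi_list}
for smi, centres in report.items():
    print(smi, centres)
```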
Re: [Rdkit-discuss] Stereochemistry problem with spiro centre
Thanks Paolo - I should have found this before posting!

Cheers
James

On 9 May 2021, at 15:05, Paolo Tosco wrote:

Hi James,

IIRC that's a known open issue with the way spirocyclic pseudochiral centers are handled: https://github.com/rdkit/rdkit/issues/3490

Cheers,
p.
[Rdkit-discuss] Stereochemistry problem with spiro centre
Dear All,

I am having some issues with tetrahedral stereochemistry perception in RDKit (2020.09.4) for a certain class of molecule. Here's an example (rendered using cdk-depict):

[inline image of the structure; not preserved]

https://www.simolecule.com/cdkdepict/depict/bot/svg?smi=F%5BC%40H%5D1C%5BC%40%40%5D2(C1)C%5BC%40H%5D(Cl)C2=-1=-1=off=bridgehead=false=3.65=cip=0

If I try to work with this class of molecule in RDKit, it seems impossible(?) to assign stereo information to the central, spirocyclic stereo centre. Exporting back out as SMILES shows that the information is not present:

    m = Chem.MolFromSmiles('F[C@H]1C[C@@]2(C1)C[C@H](Cl)C2')
    print(Chem.MolToSmiles(m))
    # output: F[C@H]1CC2(C1)C[C@H](Cl)C2

The spiro-atom is clearly being identified as a potential stereo-centre (strangely, the CIP labels aren't being generated for the other two centres - just the parity info is returned):

    centers = Chem.FindMolChiralCenters(m, includeUnassigned=True, useLegacyImplementation=False)
    print(centers)
    # output: [(1, 'Tet_CW'), (3, '?'), (6, 'Tet_CCW')]

If I look at the atom properties for the central atom, I can see _ChiralityPossible == 1, but I also see _ringStereochemCand == False. Is this the problem?

Any help/advice greatly appreciated!

Kind regards
James
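One way to notice this kind of silent loss in a batch is to compare how many stereo centres the input SMILES specifies with how many atoms still carry a chiral tag after parsing. The sketch below assumes a simple regular-expression count of bracket atoms containing "@" is good enough for the inputs at hand; the function name is hypothetical, and the result for the spiro example depends on the RDKit version in use.

```python
import re

from rdkit import Chem


def stereo_centres_in_and_out(smiles):
    """Return (centres written in the SMILES, atoms with a chiral tag after parsing).

    Hypothetical helper for illustration only.
    """
    # Count bracket atoms that contain at least one '@' (so '@@' counts once).
    written = len(re.findall(r"\[[^\]]*@[^\]]*\]", smiles))
    mol = Chem.MolFromSmiles(smiles)
    kept = sum(
        atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED
        for atom in mol.GetAtoms()
    )
    return written, kept


written, kept = stereo_centres_in_and_out("F[C@H]1C[C@@]2(C1)C[C@H](Cl)C2")
print("specified in SMILES:", written, "kept after parsing:", kept)
```

A difference between the two numbers flags a molecule worth inspecting by hand.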
[Rdkit-discuss] A question regarding double bonds and reading molblocks
Dear All,

I think this question is in some way related to the following closed issue: https://github.com/rdkit/rdkit/pull/3015

I am working with 2020.09.1, but see the following error when calling EnumerateStereoisomers():

    RuntimeError: Pre-condition Violation
        Stereo atoms should be specified before specifying CIS/TRANS bond stereochemistry
        Violation occurred on line 288 in file Code/GraphMol/Bond.h
        Failed Expression: what <= STEREOE || getStereoAtoms().size() == 2
        RDKIT: 2020.09.1
        BOOST: 1_72

I may be wrong, but I think my issue has something to do with incoming STEREOANY bonds *terminating* at double bonds. Here is an example (not very pretty, I know!):

[inline image of the structure; not preserved]

The intention for the wavy bonds is (probably) to say that nothing is known about the configuration at the 3 stereocentres. But the intention for the double bonds is that they are as drawn (the hydrazone is trans, and the alkene is cis). Here is the corresponding molblock:

    
      Mrv1921 1012592D
    
     10 11  0  0  0  0            999 V2000
       15.6513    1.4711    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
       16.0635    0.7558    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
       16.8890    0.7546    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       17.3010    0.0394    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
       17.3788   -0.6754    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       18.0262   -1.1519    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
       18.7209   -1.2321    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       18.6301   -0.4794    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       17.3831   -1.5741    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       17.8753   -0.3874    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  2  0  0  0  0
      4  3  1  4  0  0  0
      4  5  1  0  0  0  0
      6  5  1  4  0  0  0
      6  7  1  0  0  0  0
      7  8  2  0  0  0  0
      6  9  1  0  0  0  0
     10  9  1  4  0  0  0
     10  4  1  0  0  0  0
     10  8  1  0  0  0  0
    M  END

And we can see 3 atoms are set with atom parity = 3 (either or unmarked; ignored when read). And 3 single bonds are set with bond stereo = 4 (either). Both double bonds are set with bond stereo = 0 (use coords to determine cis or trans).

If I read this into RDKit, however, I see one of the double bonds (the hydrazone one) is interpreted as STEREOANY and not STEREONONE as the molblock intended:

    mol = Chem.MolFromMolBlock(test_mb_20)
    for bond in mol.GetBonds():
        if bond.GetBondType() == Chem.BondType.DOUBLE:
            print(bond.GetStereo())

    STEREOANY
    STEREONONE

And if I make a call to EnumerateStereoisomers() I see the above error. Is there a step (or understanding) I am missing?

Kind regards
James
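The bond stereo values discussed above sit in the fourth three-character field of each V2000 bond line, so they can be checked without any toolkit. Below is a minimal, standard-library-only sketch; the function name and the four-atom example block are made up for illustration (a wavy single bond ending on a double-bond atom, mimicking the situation described) and are not the molblock from the post.

```python
def bond_stereo_flags(molblock):
    """Return (begin, end, bond_type, stereo) for every bond in a V2000 molblock.

    Hypothetical helper for illustration only.
    """
    lines = molblock.splitlines()
    counts = lines[3]  # the fourth line is the counts line
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    first_bond = 4 + n_atoms
    bonds = []
    for line in lines[first_bond:first_bond + n_bonds]:
        # Fixed-width fields: begin atom, end atom, bond type, bond stereo
        begin, end, order, stereo = (int(line[i:i + 3]) for i in (0, 3, 6, 9))
        bonds.append((begin, end, order, stereo))
    return bonds


# Hypothetical example block, written as a list so the blank header lines are explicit.
EXAMPLE = "\n".join([
    "",                        # title line
    "  hypothetical example",  # program/timestamp line
    "",                        # comment line
    "  4  3  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.2990    0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.5981    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    3.8971    0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  4  0  0  0",  # single bond, stereo 4 = either (wavy)
    "  2  3  2  0  0  0  0",  # double bond, stereo 0 = use coordinates
    "  3  4  1  0  0  0  0",
    "M  END",
])

flags = bond_stereo_flags(EXAMPLE)
double_bonds = [b for b in flags if b[2] == 2]
print(flags)
print(double_bonds)
```

Comparing these raw flags with bond.GetStereo() after reading the same block shows directly which bonds the reader reinterpreted.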
[Rdkit-discuss] Simple question about double bond stereo in molblock output
Dear All,

I wonder if I can quickly sanity-check something(?). I have noticed that symmetrical double bonds are output with a bond stereo setting of "3" (cis or trans (either) double bond) in the standard molblock output. Is this expected/intentional? I would have expected a setting of "0" (use coords to determine cis or trans) for a non-stereo double bond. (I am using 2020.09.1)

Here's a simple example:

    m = Chem.MolFromSmiles('FC(F)=CC1=CC=CC=C1')
    print(Chem.MolToMolBlock(m))

    
         RDKit          2D
    
     10 10  0  0  0  0  0  0  0  0999 V2000
        5.2500   -1.2990    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
        3.7500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.0000   -2.5981    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
        3.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.7500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.7500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.7500    1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.7500    1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0
      2  3  1  0
      2  4  2  3
      4  5  1  0
      5  6  2  0
      6  7  1  0
      7  8  2  0
      8  9  1  0
      9 10  2  0
     10  5  1  0
    M  END

This behaviour is maybe what I would expect if the bond was explicitly set using bond.SetStereo(Chem.BondStereo.STEREOANY), but in the absence of this I would expect the bond to default to STEREONONE, and I guess I would expect this to be bond stereo "0" in the output molblock. What am I missing?

Kind regards
James
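To separate what the molecule object holds from what the writer emits, the in-memory stereo of each double bond can be listed before writing the molblock. This is a minimal sketch using the example SMILES from the post; after sanitisation the ring bonds are aromatic, so only the exocyclic C=C bond is reported as a double bond. The bond stereo value that appears in the written block may differ between RDKit versions, so it is printed rather than asserted.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("FC(F)=CC1=CC=CC=C1")

# Collect (begin atom index, end atom index, stereo) for every non-aromatic double bond.
double_bond_stereo = [
    (bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), bond.GetStereo())
    for bond in mol.GetBonds()
    if bond.GetBondType() == Chem.BondType.DOUBLE
]
print(double_bond_stereo)

# The written block can then be inspected for the bond stereo field of that bond.
print(Chem.MolToMolBlock(mol))
```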
[Rdkit-discuss] Open3DAlign scoring of existing alignment?
Dear All (especially Paolo!),

I have a strong suspicion I have already asked this at some point in the past - so apologies in advance (but I can't seem to find the answer)...

I am interested in taking an existing overlay of two RDKit molecules in 3D and scoring the overlay using the Open3DAlign scoring scheme (eg with MMFF atom-types), but *without* trying to optimise the alignment or score. I thought setting maxIters=0 in the call to AllChem.GetO3A() would do the trick (I even tried setting options=3 to "trigger local optimization"). Eg

    o3a = AllChem.GetO3A(prb_mol, ref_mol, maxIters=0, options=3)
    o3a.Matches()  # Show the matches

But while the options setting certainly changes the matching atoms (and the score), the matches don't seem to correspond well to my starting alignment...

Any advice is greatly appreciated (including, of course, simply pointing me to the old answer that I am likely missing!)

Kind regards
James
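For experimenting with the call described above, here is a self-contained sketch that builds two small conformers and queries the O3A object without calling Align(). The two molecules and the embed() helper are arbitrary choices made for illustration, and the sketch does not settle whether the reported score reflects the starting pose, which is the open question in the post.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed(smiles, seed=42):
    """Build one 3D conformer with hydrogens (hypothetical helper for illustration)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


ref_mol = embed("CCOc1ccccc1")  # arbitrary reference molecule
prb_mol = embed("CCNc1ccccc1")  # arbitrary probe molecule

# Same arguments as in the post; Align() is deliberately not called.
o3a = AllChem.GetO3A(prb_mol, ref_mol, maxIters=0, options=3)
score = o3a.Score()
matches = o3a.Matches()
print("score:", score)
print("matched atom pairs (probe, reference):", matches)
```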
[Rdkit-discuss] Is it possible to get a breakdown of conformational energy terms?
Dear All,

Recently I have been assessing some ligand conformations from crystal structures to identify any non-ideal bond lengths, angles, torsions, or non-bonded contacts. What I am doing at the moment is adding some positional constraints to the crystallographic heavy atom positions, and calculating the energy before and after minimisation:

    >>> from rdkit import Chem
    >>> from rdkit.Chem import AllChem
    >>> m = Chem.MolFromMolFile('input.mol')
    >>> mh = AllChem.AddHs(m, addCoords=True)
    >>> mp = AllChem.MMFFGetMoleculeProperties(mh, mmffVariant='MMFF94s')
    >>> ff = AllChem.MMFFGetMoleculeForceField(mh, mp)
    >>> ff.CalcEnergy()

This gives the 'raw' energy.

    >>> for atom in mh.GetAtoms():
    ...     if not atom.GetAtomicNum() == 1:
    ...         idx = atom.GetIdx()
    ...         ff.MMFFAddPositionConstraint(idx, maxDispl=0.5, forceConstant=100)
    >>> ff.Minimize(maxIts=1)
    >>> ff.CalcEnergy()

And this gives the energy after applying a moderate restraint (100 kcal/mol, with a maximum displacement of 0.5 Å). So I think this is ok(?), and I can compare the two energies and inspect the conformations visually.

What I was wondering was whether there is a way of obtaining the individual energy terms (ie each bonded and non-bonded term, angle, and torsion)? Because what I'd really like to do is identify the areas of the molecule that contribute the most to the pre- and post-minimisation energy difference.

Any suggestions would be greatly appreciated!

Kind regards
James
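One possible route to a coarse breakdown, offered as a sketch rather than an answer from the thread, is to switch MMFF term classes on one at a time through the MMFFMolProperties setters and evaluate the energy for each. This gives totals per term class (bond, angle, and so on), not per-atom contributions. The helper name and the embedded test molecule are assumptions made for illustration; a real use would read the crystallographic conformation instead.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

TERM_SETTERS = [
    "SetMMFFBondTerm", "SetMMFFAngleTerm", "SetMMFFStretchBendTerm",
    "SetMMFFOopTerm", "SetMMFFTorsionTerm", "SetMMFFVdWTerm", "SetMMFFEleTerm",
]


def mmff_energy_by_term(mol, variant="MMFF94s"):
    """Return {term class: energy}, enabling one MMFF term class at a time.

    Hypothetical helper for illustration only.
    """
    energies = {}
    for active in TERM_SETTERS:
        props = AllChem.MMFFGetMoleculeProperties(mol, mmffVariant=variant)
        for setter in TERM_SETTERS:
            getattr(props, setter)(setter == active)  # only `active` stays on
        ff = AllChem.MMFFGetMoleculeForceField(mol, props)
        name = active[len("SetMMFF"):-len("Term")]  # e.g. "Bond", "VdW"
        energies[name] = ff.CalcEnergy()
    return energies


# Arbitrary test molecule with an embedded conformer, standing in for 'input.mol'.
mh = Chem.AddHs(Chem.MolFromSmiles("CC(=O)Nc1ccccc1"))
AllChem.EmbedMolecule(mh, randomSeed=42)

by_term = mmff_energy_by_term(mh)
total = AllChem.MMFFGetMoleculeForceField(
    mh, AllChem.MMFFGetMoleculeProperties(mh, mmffVariant="MMFF94s")
).CalcEnergy()
for name, energy in by_term.items():
    print(f"{name:12s} {energy:10.3f}")
print(f"{'sum':12s} {sum(by_term.values()):10.3f}  (total {total:.3f})")
```

Running the same breakdown before and after the restrained minimisation would show which term classes account for most of the energy difference.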
Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)
Hi Greg (et al.),

Thanks for looking into it. And thanks to Paolo, who gave me a good workaround suggestion, which was to desymmetrise the spirocyclic centre by modifying the isotope on one of the neighbours. This is good for attended processing of single molecules, but not so good for unattended processing of unknown molecules… Reading in molecules with sanitize=False is a good start, but my first thought was then to do some sort of rSMARTS transform to automate the isomer assignment. It soon became apparent that this wasn’t the way to go, as abilities are limited with an unsanitised molecule(!). So I ended up with the following:

    from rdkit import Chem

    m3 = Chem.MolFromSmiles('O[C@H]1CC[C@]11CC[C@@](Cl)(Br)CC1', sanitize=False)
    for atom in m3.GetAtoms():
        print("Stereo:", atom.GetChiralTag(), "Neighbours:", [n.GetSymbol() for n in atom.GetNeighbors()])
    # chiral centres currently intact

    # Find possible spirocentres
    for atom in m3.GetAtoms():
        # GetChiralTag() returns an enum value, so compare against the enum rather than a string
        if (len(atom.GetNeighbors()) == 4 and atom.IsInRing()
                and atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED):
            # We have found a candidate spirocentre: modify its first neighbour
            first_neighbour = atom.GetNeighbors()[0]
            first_neighbour.SetIsotope(100)

    Chem.SanitizeMol(m3)  # Now we can sanitise
    test3_mols = summarise_conformers(m3)  # and generate the conformers (as before; helper defined in the gist)
    sdf = Chem.SDWriter('test3.sdf')  # and write them out (but resetting the isotopes first)
    for mol in test3_mols:
        for atom in mol.GetAtoms():
            if atom.GetIsotope() == 100:
                atom.SetIsotope(0)
        sdf.write(mol)
    sdf.close()

GIST is updated to include this: https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442

Kind regards
James

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 08 February 2017 03:45
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)

Hi James,

This is definitely a bug. The problem seems to be connected to the way what the RDKit calls "ring stereochemistry" is handled when there are spiro linkages. Here's the github issue: https://github.com/rdkit/rdkit/issues/1294 I'll take a look.

Best,
-greg

On Tue, Feb 7, 2017 at 8:32 PM, James Davidson <j.david...@vernalis.com> wrote:

Dear All, I have hit what I think is a problem with stereochemistry perception/handling for certain types of pseudochiral and/or spirocyclic systems. Basically I am observing that some types of input tetrahedral stereochemical information gets lost when an RDKit molecule is generated. But I only realised this because I was wanting to generate conformers and was seeing stereochemical scrambling… Anyway, an example with pictures will probably explain things better: https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442 Any help/advice appreciated.

Kind regards
James
[Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)
Dear All,

I have hit what I think is a problem with stereochemistry perception/handling for certain types of pseudochiral and/or spirocyclic systems. Basically I am observing that some types of input tetrahedral stereochemical information gets lost when an RDKit molecule is generated. But I only realised this because I was wanting to generate conformers and was seeing stereochemical scrambling... Anyway, an example with pictures will probably explain things better: https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442 Any help/advice appreciated.

Kind regards
James
Re: [Rdkit-discuss] RDKit patch releases in conda?
Hi Riccardo,

> are you working on Windows? Pre-built conda packages targeting the 2016.03 patch releases are at this time only available for linux and osx.

Yes, I'm afraid so...

> an additional patch release was tagged before the UGM, and I think it wasn't yet pushed to the anaconda channel. if there's interest for making this revision available, I can try to include some windows packages (for the amd64 platform at least), otherwise it might make sense to wait for the upcoming release?

I would definitely be interested in this (py2 and py3) for win64, but if it is anything more than a small amount of work, then please don't do it just for me!

Kind regards
James
Re: [Rdkit-discuss] RDKit patch releases in conda?
Hi Greg (and Riccardo),

> Riccardo had pushed binaries for Linux and I have done most of the mac versions, but doing windows builds, which I suspect is what you want, is enough of a barrier that we haven't done those.
> There is an ongoing discussion about resolving this problem, but it is non-trivial.

It sounds like there is a chance that Riccardo will look at a win64 2016_03 patch build for conda (which would be great!)

> P.S. Obligatory plug: this is a matter of focusing resources on a less-than-pleasant task; exactly the kind of thing that RDKit maintenance/support customers can reasonably request.

That sounds fair...

Kind regards
James

From: James Davidson <j.david...@vernalis.com>
Sent: Wednesday, November 2, 2016 2:32 PM
Subject: [Rdkit-discuss] RDKit patch releases in conda?
To: <rdkit-discuss@lists.sourceforge.net>

Dear All, I think I probably know the answer to this already, but wanted to double check - did any of the four 2016_03 patch releases ever get pushed to conda? I only seem to get 2016_03_1 with "conda update -c https://conda.anaconda.org/rdkit rdkit" (if not available then I guess this is academic with 2016_09_1 around the corner?)

Kind regards
James
[Rdkit-discuss] RDKit patch releases in conda?
Dear All,

I think I probably know the answer to this already, but wanted to double check - did any of the four 2016_03 patch releases ever get pushed to conda? I only seem to get 2016_03_1 with "conda update -c https://conda.anaconda.org/rdkit rdkit" (if not available then I guess this is academic with 2016_09_1 around the corner?)

Kind regards
James
[Rdkit-discuss] Problem adding hydrogens to peptides
Dear All,

Enthused by all the great talks at the UGM, for the last couple of days I have been getting more hands-on with RDKit than I have in quite a while! I was keen to work with some peptides/proteins in 3D, but am having some problems when adding hydrogens... I have uploaded a GIST to demonstrate the issue (apologies - the py3Dmol js doesn't render in the nbviewer, but this doesn't affect understanding): https://gist.github.com/jepdavidson/f5220187c18be0fc9e119f9da2e7d955

The main problem is that added hydrogens don't automatically get assigned monomer info from the monomer they are being added to, but there are other issues as well (the hydrogens are marked 'HETATM', the occupancy for the ATOM blocks are set to "-nan", and the CONECT block doesn't list the added Hs). Propagating the monomer info from the amino acids to the added Hs isn't too difficult (can call atom.GetNeighbors() and take the info from the neighbouring atom) - but there are also some preferred (or required?) naming and numbering conventions to adhere to ("H" for the backbone NH, "HA" for the hydrogen on the alpha carbon, etc).

Perhaps I am missing something here (a secret 'flavour' option? :)) - but if not, it would be interesting to hear what behaviour others would expect when adding explicit hydrogens (I think the same issues will relate to any sequence where monomer information is present).

Kind regards
James
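The neighbour-based propagation described in this message could be sketched roughly as follows. This is a minimal illustration rather than a definitive fix: it assumes a reasonably recent RDKit in which `Atom.SetMonomerInfo()` and `Chem.AtomPDBResidueInfo` are available from Python, the helper name `copy_residue_info_to_hydrogens` is made up, and every hydrogen simply gets the generic atom name " H  " (the "H"/"HA" naming conventions are deliberately not handled). Setting the occupancy to 1.0 and clearing the HETATM flag are likewise assumptions about what is wanted.

```python
from rdkit import Chem


def copy_residue_info_to_hydrogens(mol):
    """Copy residue-level info from each hydrogen's neighbour onto the hydrogen.

    Hypothetical helper: only hydrogens without existing PDB residue info are
    touched. Returns the number of hydrogens that were updated.
    """
    n_fixed = 0
    for atom in mol.GetAtoms():
        if atom.GetAtomicNum() != 1 or atom.GetPDBResidueInfo() is not None:
            continue
        neighbours = atom.GetNeighbors()
        if not neighbours:
            continue
        parent_info = neighbours[0].GetPDBResidueInfo()
        if parent_info is None:
            continue
        info = Chem.AtomPDBResidueInfo()
        info.SetName(' H  ')  # generic name; real PDB names (H, HA, ...) are not derived here
        info.SetResidueName(parent_info.GetResidueName())
        info.SetResidueNumber(parent_info.GetResidueNumber())
        info.SetChainId(parent_info.GetChainId())
        info.SetIsHeteroAtom(False)  # assumption: write the Hs as ATOM, not HETATM
        info.SetOccupancy(1.0)       # assumption: avoid an undefined occupancy
        info.SetTempFactor(0.0)
        atom.SetMonomerInfo(info)
        n_fixed += 1
    return n_fixed


# A two-residue peptide built from sequence, so the heavy atoms carry residue info.
mol = Chem.AddHs(Chem.MolFromSequence('GA'))
n_fixed = copy_residue_info_to_hydrogens(mol)
print(n_fixed, 'hydrogens updated')
```

Taking the first neighbour is enough here because each added hydrogen has exactly one bonded atom; correct PDB atom names would need a lookup keyed on the residue and the parent atom name.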
Re: [Rdkit-discuss] New problem compiling RDKit on Windows
Hi again, Greg

> If you still have problems with this (or hc.c), please let me know,

hc.c fails to compile. The errors are shown below, and then I get related linking errors. I'm hoping all the errors are related(?) The first line affected is line 42:

static doublereal inf = 1e20;

Kind regards
James

Error 2426  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  42  1  hc
Error 2427  error C2275: 'integer' : illegal use of this type as an expression  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error 2428  error C2146: syntax error : missing ';' before identifier 'i__1'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error 2429  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error 2430  error C2065: 'i__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error 2431  error C2275: 'doublereal' : illegal use of this type as an expression  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error 2432  error C2146: syntax error : missing ';' before identifier 'd__1'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error 2433  error C2065: 'd__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error 2434  error C2065: 'd__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error 2435  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  49  1  hc
Error 2436  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  50  1  hc
Error 2437  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  51  1  hc
Error 2438  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  52  1  hc
Error 2439  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  53  1  hc
Error 2440  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  54  1  hc
Error 2441  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  55  1  hc
Error 2442  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  56  1  hc
Error 2443  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  72  1  hc
Error 2444  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  73  1  hc
Error 2445  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  73  1  hc
Error 2446  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  74  1  hc
Error 2447  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  75  1  hc
Error 2448  error C2065: 'ncl' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  77  1  hc
Error 2449  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  79  1  hc
Error 2450  error C2065: 'ind' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  80  1  hc
Error 2451  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  80  1  hc
Error 2452  error C2065: 'ind' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  81  1  hc
Error 2453  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  90  1  hc
Error 2454  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  91  1  hc
Error 2455  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  91  1  hc
Error 2456  error C2065: 'dmin__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  92  1  hc
Error 2457  error C2065: 'inf' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  92  1  hc
Error 2458  error C2065: 'i__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  93  1  hc
Error 2459  error C2065: 'j' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  94  1  hc
Error 2460  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  94  1  hc
Error 2461  error C2065: 'i__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  94  1  hc
Error 2462  error C2065: 'ind' : undeclared
Re: [Rdkit-discuss] New problem compiling RDKit on Windows
That looks like a leftover from a source-control conflict. I can't find it in github: https://github.com/rdkit/rdkit/blob/master/Code/RDBoost/Wrap.h#L133 Could it be that you are pulling from github and that you had local modifications to the file that led to a conflict?
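If leftover conflict markers are the suspicion, a quick scan of the source tree can confirm it before rebuilding. The script below is only a sketch under stated assumptions: the marker shapes (a run of three or more '<' or '>' followed by a space, or a line of exactly seven '=') are a guess at what the version-control tool writes, the function names are made up, and the demonstration file and its ".r1234" suffix are invented for the example.

```python
import os
import re
import tempfile

# Assumption: conflict markers start a line, e.g. "<<<<<<< .mine", "=======",
# ">>>>>>> .r1234". Three or more angle brackets are accepted so that a
# shortened marker such as "<<< .mine" is also caught.
MARKER = re.compile(r'^(<{3,}|>{3,}) |^={7}\s*$')


def find_conflict_markers(path):
    """Return (line_number, text) pairs for lines that look like conflict markers."""
    hits = []
    with open(path, 'r', errors='replace') as handle:
        for number, line in enumerate(handle, start=1):
            if MARKER.match(line):
                hits.append((number, line.rstrip('\n')))
    return hits


def scan_tree(root, extensions=('.h', '.c', '.cpp')):
    """Walk a source tree and map each affected file to its marker lines."""
    report = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(extensions):
                full_path = os.path.join(folder, name)
                hits = find_conflict_markers(full_path)
                if hits:
                    report[full_path] = hits
    return report


# Self-contained demonstration on a throwaway file (contents are invented).
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, 'Wrap.h')
with open(demo_file, 'w') as handle:
    handle.write('int a;\n<<<<<<< .mine\nint b;\n=======\nint c;\n>>>>>>> .r1234\n')
report = scan_tree(demo_dir)
print(report)
```

In practice `scan_tree` would be pointed at the checkout (for example the `Code` directory); a line of seven '=' can also occur in ordinary comments, so hits still need a manual look.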
[Rdkit-discuss] New problem compiling RDKit on Windows
Dear All,

For quite some time I have been successfully compiling RDKit on Windows using Visual Studio 2012. However, recently (and perhaps triggered by a recent VS update that I accepted) I am getting errors. The problem seems to be in Wrap.h (line 133):

<<< .mine

VS is complaining "error C2059: syntax error:'<<'" and the corresponding error when inspecting the code is "Error: expected a declaration". Does anyone have any suggestions for working through this?

Kind regards
James
[Rdkit-discuss] Rev. 5775 (Windows) - pyGraphMolWrap test fails
Dear All,

I have just built revision 5775 on Windows, and the pyGraphMolWrap test fails. The relevant bit of the verbose output is below:

78: ERROR: testGithub498 (__main__.TestCase)
78: --
78: Traceback (most recent call last):
78: File C:/RDKit/Code/GraphMol/Wrap/rough_test.py, line 3033, in testGithub498
78: outf = gzip.open(tempfile.mktemp(),'wt+')
78: File C:\Anaconda\lib\gzip.py, line 34, in open
78: return GzipFile(filename, mode, compresslevel)
78: File C:\Anaconda\lib\gzip.py, line 94, in __init__
78: fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
78: ValueError: Invalid mode ('wt+b')

This seems to be due to a difference in python/gzip behaviour on Windows vs. eg Linux:

On Ubuntu (Anaconda python):

In [1]: import tempfile, gzip
In [2]: outf = gzip.open(tempfile.mktemp(), 'wt+')
In [3]:

On Windows (again Anaconda python):

In [1]: import tempfile, gzip
In [2]: outf = gzip.open(tempfile.mktemp(), 'wt+')
---
ValueError Traceback (most recent call last)
ipython-input-2-6bee12287576 in module()
1 outf = gzip.open(tempfile.mktemp(), 'wt+')
C:\Anaconda\lib\gzip.pyc in open(filename, mode, compresslevel)
32
33
--- 34 return GzipFile(filename, mode, compresslevel)
35
36 class GzipFile(io.BufferedIOBase):
C:\Anaconda\lib\gzip.pyc in __init__(self, filename, mode, compresslevel, fileobj, mtime)
92 mode += 'b'
93 if fileobj is None:
--- 94 fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
95 if filename is None:
96 # Issue #13781: os.fdopen() creates a fileobj with a bogus name
ValueError: Invalid mode ('wt+b')
In [3]:

Is this an easy one to fix?

Kind regards
James
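One way to sidestep platform differences in how the mode string is validated is to open the gzip file in plain binary mode and do the text encoding explicitly. The snippet below is only a sketch of that idea, not the change that was made to rough_test.py: it assumes Python 3, uses `tempfile.mkstemp()` in place of `tempfile.mktemp()`, and the helper names and sample text are made up.

```python
import gzip
import os
import tempfile


def write_gzip_text(path, text, encoding='utf-8'):
    """Write text to a gzip file using binary mode ('wb') only."""
    with gzip.open(path, 'wb') as handle:
        handle.write(text.encode(encoding))


def read_gzip_text(path, encoding='utf-8'):
    """Read a gzip file back in binary mode and decode it to text."""
    with gzip.open(path, 'rb') as handle:
        return handle.read().decode(encoding)


# mkstemp() creates the file up front; close the descriptor and reuse the path.
fd, path = tempfile.mkstemp(suffix='.gz')
os.close(fd)
try:
    write_gzip_text(path, 'CCO ethanol\n')
    round_trip = read_gzip_text(path)
finally:
    os.remove(path)
print(repr(round_trip))
```

Because only 'wb' and 'rb' ever reach the underlying `open()` call, the combined 'wt+b' string from the traceback is never constructed.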
[Rdkit-discuss] rev5771 and boost_chrono?
Dear All,

I recently rebuilt RDKit under 64bit Windows and things worked great for me. However, I found that when I shared the build with another user, things weren't so good - from rdkit.Chem import AllChem gave a DLL error that pointed to rdForceFieldHelpers.pyd. So I then ran Dependency Walker and, as well as pointing at the usual boost libraries (python, system, thread), it also pointed at chrono. This is the first time I had seen this. Adding boost_chrono-vc110-mt-1_56.dll to the other user's rdkit/lib folder sorted the issue - which is great. So this is a heads-up, in case it helps others; but also a question: is there a good way to figure out all the boost dependencies ahead of deploying?

Kind regards
James
Re: [Rdkit-discuss] Python GetShortestPath()?
Hi Greg,

I just built the latest revision - and the functionality is exposed - thanks (and, of course, thanks Paolo!).

Kind regards
James
[Rdkit-discuss] Python GetShortestPath()?
Dear All,

I might be having a 'moment' here, but for the life of me I can't seem to find the equivalent of RDKit::MolOps::getShortestPath exposed in python(?). I want to pass in two atom ids, and get back a list of atom ids in the shortest path. I could possibly try to roll my own by using GetDistanceMatrix() and GetAdjacencyMatrix(), but I think I may struggle(!). So, any pointer to GetShortestPath() greatly appreciated!

Kind regards
James
Re: [Rdkit-discuss] Python GetShortestPath()?
Hi Nick, Well on the plus side I don't get a segfault(!) On the minus side - unfortunately I think this approach only gives the length of the shortest path, rather than a list of the atom ids in the shortest path. Kind regards James -Original Message- From: Nicholas Firth [mailto:nicholas.fi...@icr.ac.uk] Sent: 21 April 2015 17:44 To: James Davidson; rdkit-discuss@lists.sourceforge.net Subject: RE: Python GetShortestPath()? Dear James, I tried to be helpful and show you how I do it with GetAdjacencyMatrix, however I ran into my old friend the segmentation fault 11 as there is still some weird error with this function. Here's what I have though, should work for you.
    from rdkit import Chem
    m = Chem.MolFromSmiles('CC[C@H](CO)NC1=NC(=C2C(=N1)N(C=N2)C(C)C)NCC3=CC=CC=C3')
    atomIdx1 = 0
    atomIdx2 = 10
    print(Chem.GetAdjacencyMatrix(m)[atomIdx1][atomIdx2])
    Segmentation fault: 11
Best, Nick Nicholas C. Firth | PhD Student | Cancer Therapeutics The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | Surrey | SM2 5NG T 020 8722 4033 | E nicholas.fi...@icr.ac.uk | W www.icr.ac.uk | Twitter @ICRnews From: James Davidson [j.david...@vernalis.com] Sent: 21 April 2015 17:06 To: rdkit-discuss@lists.sourceforge.net Subject: [Rdkit-discuss] Python GetShortestPath()? Dear All, I might be having a 'moment' here, but for the life of me I can't seem to find the equivalent of RDKit::MolOps::getShortestPath exposed in python(?). I want to pass in two atom ids, and get back a list of atom ids in the shortest path. I could possibly try to roll my own by using GetDistanceMatrix() and GetAdjacencyMatrix(), but I think I may struggle(!). So, any pointer to GetShortestPath() greatly appreciated! Kind regards James
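For readers who want the atom indices along the path rather than just its length, here is a minimal sketch of the "roll my own" route mentioned in this thread: a breadth-first search over an adjacency matrix. It is plain Python and does not import RDKit; the function name `shortest_path` is hypothetical, and the assumption is that in practice the matrix would come from something like Chem.GetAdjacencyMatrix(mol). Where the Python wrappers expose GetShortestPath directly, that is likely the simpler choice.

```python
from collections import deque


def shortest_path(adjacency, start, end):
    """Return the atom indices on one shortest path from start to end.

    `adjacency` is a square 0/1 matrix (nested lists or a NumPy array).
    An empty list is returned if the two atoms are not connected.
    """
    previous = {start: None}  # visited atom -> atom it was reached from
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == end:
            break
        for neighbour, bonded in enumerate(adjacency[atom]):
            if bonded and neighbour not in previous:
                previous[neighbour] = atom
                queue.append(neighbour)
    if end not in previous:
        return []
    # Walk back from end to start, then reverse the result.
    path = []
    atom = end
    while atom is not None:
        path.append(atom)
        atom = previous[atom]
    return path[::-1]


# Hand-written adjacency matrix for the heavy-atom graph of CC(CC)C
# (bonds 0-1, 1-2, 2-3 and 1-4); an assumption for illustration only.
adjacency = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]
print(shortest_path(adjacency, 4, 3))
```

Because breadth-first search visits atoms in order of increasing bond distance, the first time the end atom is reached the recorded predecessors already describe a shortest path; if several paths tie, only one of them is returned.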
Re: [Rdkit-discuss] Problem building recent revisions on Windows
Thanks Greg I have now tried using boost 1.56 (the cmake configuration once BOOST_ROOT is set is a little different vs. 1.55…). Either way I can’t seem to build with threadsafe/multithreaded support – but perhaps we should draw a line under it for now / follow-up ‘off-line’(?). The good news is that (with the thread settings OFF) the changes you checked-in (rev5616) have indeed sorted the MolHash piece, and all tests pass – thanks! Kind regards James From: Greg Landrum [mailto:greg.land...@gmail.com] Sent: 14 April 2015 05:35 To: James Davidson Cc: rdkit-discuss@lists.sourceforge.net Subject: Re: [Rdkit-discuss] Problem building recent revisions on Windows Hi James Thanks for your patience here. I will invest the time in trying to get automated builds happening on this Windows side so that this stops happening. I just pushed some changes that should clear up the MolHash-related build problems. I had forgotten that that bit of code still needed to be tested on Windows. I haven't tested building the cartridge on Windows (I'm not set up to do that), but the python wrappers definitely do build for me now with both RDK_BUILD_THREADSAFE_SSS and RDK_TEST_MULTITHREADED set to ON. I don't think it should make a difference, but just in case: I am doing this using boost 1.56. Best, -greg On Mon, Apr 13, 2015 at 11:30 AM, James Davidson <j.david...@vernalis.com> wrote: Here’s an update: Tried building rev5211, but saw similar linking errors (to do with Boost threading libraries). The most recent build that I have successfully managed without the errors is rev5016. It occurred to me that a couple of my cmake options relate to threading (RDK_BUILD_THREADSAFE=ON; RDK_TEST_MULTITHREADED=ON) – and perhaps other people (Greg, Paolo) with successful Windows builds had these set OFF (default)(?). Indeed, if I set both of these to OFF, I can successfully build rev5211, and all of the tests pass. 
So this is good – but I still don’t understand what has changed (in relation to Boost threading) that means that later versions don’t build… Threading aside, I was now feeling pretty confident that I would be ok to build the latest revision (rev5611). Unfortunately this was not the case – I initially hit a problem with MolHash.cpp, which I thought was down to MSVC being stupid about snprintf() (line 265). After a little stack-overflowing, I thought changing this to _snprintf() would solve the problem – which it appears to (at least MolHash.cpp now compiles), but then I get a couple of further errors down-stream (probably related to the change I made?):
Error 1 error LNK2019: unresolved external symbol class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl RDKit::Descriptors::calcMolFormula(class RDKit::ROMol const &,bool,bool) (?calcMolFormula@Descriptors@RDKit@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVROMol@2@_N1@Z) referenced in function void __cdecl RDKit::MolHash::generateMoleculeHashSet(class RDKit::ROMol const &,struct RDKit::MolHash::HashSet &,class std::vector<unsigned int,class std::allocator<unsigned int> > const *,class std::vector<unsigned int,class std::allocator<unsigned int> > const *) (?generateMoleculeHashSet@MolHash@RDKit@@YAXAEBVROMol@2@AEAUHashSet@12@PEBV?$vector@IV?$allocator@I@std@@@std@@2@Z) C:\RDKit\build\Code\GraphMol\MolHash\Wrap\MolHash.lib(MolHash.obj) rdMolHash
Error 2 error LNK1120: 1 unresolved externals C:\RDKit\build\rdkit\Chem\Release\rdMolHash.pyd rdMolHash
So I guess it would still be good to understand the threading issue (or at least for someone else to be able to reproduce it); and perhaps the observed MolHash issue is an easier one to sort(?) Kind regards James
Re: [Rdkit-discuss] Problem building recent revisions on Windows
Here's an update: Tried building rev5211, but saw similar linking errors (to do with Boost threading libraries). The most recent build that I have successfully managed without the errors is rev5016. It occurred to me that a couple of my cmake options relate to threading (RDK_BUILD_THREADSAFE=ON; RDK_TEST_MULTITHREADED=ON) - and perhaps other people (Greg, Paolo) with successful Windows builds had these set OFF (default)(?). Indeed, if I set both of these to OFF, I can successfully build rev5211, and all of the tests pass. So this is good - but I still don't understand what has changed (in relation to Boost threading) that means that later versions don't build... Threading aside, I was now feeling pretty confident that I would be ok to build the latest revision (rev5611). Unfortunately this was not the case - I initially hit a problem with MolHash.cpp, which I thought was down to MSVC being stupid about snprintf() (line 265). After a little stack-overflowing, I thought changing this to _snprintf() would solve the problem - which it appears to (at least MolHash.cpp now compiles), but then I get a couple of further errors down-stream (probably related to the change I made?):
Error 1 error LNK2019: unresolved external symbol class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl RDKit::Descriptors::calcMolFormula(class RDKit::ROMol const &,bool,bool) (?calcMolFormula@Descriptors@RDKit@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVROMol@2@_N1@Z) referenced in function void __cdecl RDKit::MolHash::generateMoleculeHashSet(class RDKit::ROMol const &,struct RDKit::MolHash::HashSet &,class std::vector<unsigned int,class std::allocator<unsigned int> > const *,class std::vector<unsigned int,class std::allocator<unsigned int> > const *) (?generateMoleculeHashSet@MolHash@RDKit@@YAXAEBVROMol@2@AEAUHashSet@12@PEBV?$vector@IV?$allocator@I@std@@@std@@2@Z) C:\RDKit\build\Code\GraphMol\MolHash\Wrap\MolHash.lib(MolHash.obj) rdMolHash
Error 2 error LNK1120: 1 unresolved externals C:\RDKit\build\rdkit\Chem\Release\rdMolHash.pyd rdMolHash
So I guess it would still be good to understand the threading issue (or at least for someone else to be able to reproduce it); and perhaps the observed MolHash issue is an easier one to sort(?) Kind regards James
Re: [Rdkit-discuss] Problem building recent revisions on Windows
Hi Greg,
James: one odd thing I notice about the error messages you posted is that they are all referencing a boost library that seems to be present in your build directory:
Error 2651 error LNK2005: public: virtual __cdecl boost::detail::thread_data_base::~thread_data_base(void) (??1thread_data_base@detail@boost@@UEAA@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom
Is this just MSVC being odd about how it reports errors or is there really a version of the boost threading library in your build\Code\GraphMol\DistGeomHelpers\Wrap directory?
No boost threading library there... And I clear-out the build folder between builds anyway. So I guess it is a quirk of the reporting. I will have another go at building the latest version and report back on success or otherwise(!) Kind regards James
Re: [Rdkit-discuss] Problem building recent revisions on Windows
Hi Paolo,
Unfortunately I have the impression that James' problem is related to neither of those. Might it be a boost/libboost naming issue?
Perhaps, but cmake seems happy (see below)...
James, could it be that you have multiple versions of boost on your Windows machine and CMake is not picking the correct one? You might try to explicitly define on the CMake command line both BOOST_ROOT and BOOST_LIBRARYDIR location as I do on my system:
C:\Program Files (x86)\CMake\bin\cmake -DBOOST_LIBRARYDIR=c:\32\boost_1_55_0_py34\lib32-msvc-12.0 -DBOOST_ROOT=c:\32\boost_1_55_0_py34 ..
I do have multiple versions around, but I have the following set (from the cmake GUI):
BOOST_LIBRARYDIR C:/local/boost_1_55_0-msvc-11.0-64/lib64-msvc-11.0
BOOST_ROOT C:/local/boost_1_55_0-msvc-11.0-64
and cmake reports that this version of boost is found:
Boost version: 1.55.0
Found the following Boost libraries: regex
So I am not sure what change is giving me the issue... Let's wait and see what Greg finds as well(!) Kind regards James
[Rdkit-discuss] Problem building recent revisions on Windows
Dear All, I just tried building the latest RDKit build (rev. 5204) from the github repository, and hit a lot of link errors... So (somewhat at random) I tried an older build (5042), and saw very similar things (errors for this attempt are below). I am running on 64-bit Windows, and use cmake and Visual Studio 2012 - my build process hasn't changed since the last time I successfully built (rev. 4274 - and I can confirm that if I roll-back to this revision, the build is once again successful), so I wondered if anyone more skilled in the art than me could suggest what the problem might be from the errors below(?) These are the errors when building the 'ALL_BUILD' project: Error 2651 error LNK2005: public: virtual __cdecl boost::detail::thread_data_base::~thread_data_base(void) (??1thread_data_base@detail@boost@@UEAA@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2652 error LNK2005: public: void __cdecl boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2653 error LNK2005: class boost::thread::id __cdecl boost::this_thread::get_id(void) (?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2654 error LNK2005: public: class boost::thread::id __cdecl boost::thread::get_id(void)const (?get_id@thread@boost@@QEBA?AVid@12@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2656 error LNK2005: 
private: bool __cdecl boost::thread::join_noexcept(void) (?join_noexcept@thread@boost@@AEAA_NXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2658 error LNK2005: public: bool __cdecl boost::thread::joinable(void)const (?joinable@thread@boost@@QEBA_NXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2659 error LNK2005: private: bool __cdecl boost::thread::start_thread_noexcept(void) (?start_thread_noexcept@thread@boost@@AEAA_NXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2660 error LNK1169: one or more multiply defined symbols found C:\RDKit\build\rdkit\Chem\Release\rdDistGeom.pydrdDistGeom Error 2675 error LNK2005: public: virtual __cdecl boost::detail::thread_data_base::~thread_data_base(void) (??1thread_data_base@detail@boost@@UEAA@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdForceFieldHelpers Error 2676 error LNK2005: public: void __cdecl boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdForceFieldHelpers Error 2677 error LNK2005: class boost::thread::id __cdecl boost::this_thread::get_id(void) (?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) 
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdForceFieldHelpers Error 2678 error LNK2005: public: class boost::thread::id __cdecl boost::thread::get_id(void)const (?get_id@thread@boost@@QEBA?AVid@12@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdForceFieldHelpers Error 2679 error LNK2005: private: bool __cdecl boost::thread::join_noexcept(void) (?join_noexcept@thread@boost@@AEAA_NXZ) already defined in
Re: [Rdkit-discuss] Problem building recent revisions on Windows
Hi Greg – thanks! One extra piece: as of a few minutes ago, I can confirm that revision 4947 (last revision in Feb) builds, and passes all of the tests. Kind regards James From: Greg Landrum [mailto:greg.land...@gmail.com] Sent: 08 April 2015 15:14 To: James Davidson Cc: rdkit-discuss@lists.sourceforge.net Subject: Re: [Rdkit-discuss] Problem building recent revisions on Windows I will fire up windows tomorrow morning and ensure that things can build. It's been a couple weeks since I last did that. -greg On Wed, Apr 8, 2015 at 3:43 PM, James Davidson <j.david...@vernalis.com> wrote: Dear All, I just tried building the latest RDKit build (rev. 5204) from the github repository, and hit a lot of link errors… So (somewhat at random) I tried an older build (5042), and saw very similar things (errors for this attempt are below). I am running on 64-bit Windows, and use cmake and Visual Studio 2012 – my build process hasn’t changed since the last time I successfully built (rev. 4274 – and I can confirm that if I roll-back to this revision, the build is once again successful), so I wondered if anyone more skilled in the art than me could suggest what the problem might be from the errors below(?) 
These are the errors when building the ‘ALL_BUILD’ project: Error 2651 error LNK2005: public: virtual __cdecl boost::detail::thread_data_base::~thread_data_base(void) (??1thread_data_base@detail@boost@@UEAA@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2652 error LNK2005: public: void __cdecl boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2653 error LNK2005: class boost::thread::id __cdecl boost::this_thread::get_id(void) (?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2654 error LNK2005: public: class boost::thread::id __cdecl boost::thread::get_id(void)const (?get_id@thread@boost@@QEBA?AVid@12@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2656 error LNK2005: private: bool __cdecl boost::thread::join_noexcept(void) (?join_noexcept@thread@boost@@AEAA_NXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2658 error LNK2005: public: bool __cdecl boost::thread::joinable(void)const (?joinable@thread@boost@@QEBA_NXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2659 error LNK2005: 
private: bool __cdecl boost::thread::start_thread_noexcept(void) (?start_thread_noexcept@thread@boost@@AEAA_NXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom Error 2660 error LNK1169: one or more multiply defined symbols found C:\RDKit\build\rdkit\Chem\Release\rdDistGeom.pydrdDistGeom Error 2675 error LNK2005: public: virtual __cdecl boost::detail::thread_data_base::~thread_data_base(void) (??1thread_data_base@detail@boost@@UEAA@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdForceFieldHelpers Error 2676 error LNK2005: public: void __cdecl boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdForceFieldHelpers Error 2677 error LNK2005: class boost::thread::id __cdecl boost::this_thread::get_id(void) (?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) C:\RDKit\build\Code\GraphMol\ForceFieldHelpers
Re: [Rdkit-discuss] Tests failing on Windows: more info
Hi Paolo, Greg, et al. I have also been having some problems recently building (64-bit Windows) from recent github versions, but I don't know if this is related to what you see, Paolo... My environment is Win 7 64-bit, CMake 3.0.0, boost_1_55_0-msvc-11.0-64, MS Visual Studio Express 2012. I have done a bit of version rolling-back and forwards to see if I can pinpoint the last version that builds with no errors, and this is what I have found so far (sorted by revision, not by sequence of attempts!): 4577 - compiles fine, - passes all tests 4618 - as above 4649 - some errors during compile, -passes all tests except the molDraw2D bits (which are also involved in the errors) 4651 - as above 4743 - as above 4765 - as above 4780 - pyGraphMolWrap now fails test 4826 - this is where significant problems start (for me at least). pyGraphMolWrap still fails, but now with a segfault 4859 - same segfault as above. Also pymolDraw2D test fails... The errors I start to see for molDraw2D are this sort of thing (is this expected?): Error 49 error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp 341 1 MolDraw2D Error 50 error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp 353 1 MolDraw2D Error 51 error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp 357 1 MolDraw2D Error 61 error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp 544 1 MolDraw2D Error 63 error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp 591 1 MolDraw2D Error 131 error LNK1181: cannot open input file '..\..\..\lib\Release\MolDraw2D.lib' C:\RDKit\build\Code\GraphMol\MolDraw2D\LINK moldraw2DTest1 Error 149 error LNK1181: cannot open input file '..\..\..\lib\Release\MolDraw2D.lib' 
C:\RDKit\build\Code\GraphMol\Wrap\LINK rdmolops If I see the above errors when building 'ALL_BUILD', I also see the following error when building the 'INSTALL' section: Error 41 error MSB3073: The command setlocal C:\Program Files (x86)\CMake\bin\cmake.exe -DBUILD_TYPE=Release -P cmake_install.cmake if %errorlevel% neq 0 goto :cmEnd :cmEnd endlocal call :cmErrorLevel %errorlevel% goto :cmDone :cmErrorLevel exit /b %1 :cmDone if %errorlevel% neq 0 goto :VCEnd :VCEnd exited with code 1.C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets 134 5 INSTALL Anyway, 4618 is the latest revision that I have tested where I see no build errors, and 4765 is the latest revision I've found before I start to see pyGraphMolWrap tests failing (or segfaults). For now, I have rolled-back my installation to 4618 (but would be very happy if anyone can figure-out what causes the problems with later revisions). Kind regards James __ PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. 
The Vernalis Group of Companies 100 Berkshire Place Wharfedale Road Winnersh, Berkshire RG41 5RD, England Tel: +44 (0)118 938 To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.
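The manual rolling back and forwards described in the message above is, in effect, a search for the first revision that no longer builds cleanly. As a minimal sketch, that search can be written as a bisection, assuming the breakage is monotonic (once a revision fails, all later ones fail too). The names `first_bad_revision` and `replay` are hypothetical, and the pass/fail table simply replays the "builds with no errors" results listed in the message rather than running a real build:

```python
from typing import Callable, Optional, Sequence


def first_bad_revision(revisions: Sequence[int],
                       builds_ok: Callable[[int], bool]) -> Optional[int]:
    """Return the earliest revision for which builds_ok() is False.

    Assumes `revisions` is sorted and that failures are monotonic
    (every revision after the first bad one is also bad).
    """
    lo, hi = 0, len(revisions)
    while lo < hi:
        mid = (lo + hi) // 2
        if builds_ok(revisions[mid]):
            lo = mid + 1   # first bad revision is to the right
        else:
            hi = mid       # this one is bad; look at or before it
    return revisions[lo] if lo < len(revisions) else None


# Results taken from the message: True means "compiles with no errors".
OBSERVED = {4577: True, 4618: True, 4649: False, 4651: False, 4743: False,
            4765: False, 4780: False, 4826: False, 4859: False}

tested = []


def replay(rev: int) -> bool:
    """Stand-in for a real checkout-and-build step (hypothetical)."""
    tested.append(rev)
    return OBSERVED[rev]


first_bad = first_bad_revision(sorted(OBSERVED), replay)
print("first revision with build errors:", first_bad)
print("revisions that had to be built:", tested)
```

In practice `replay` would be replaced by something that checks out the revision and runs the build, which is the expensive step the bisection is trying to minimise.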
[Rdkit-discuss] Avalon test failing(?)
Hi Greg,

I wondered if you (or anyone else) have been seeing any issues with the win64 build of the RDKit - with Avalon toolkit support - recently? Yesterday I updated my local SVN copy of RDKit (to rev4274) and rebuilt. Everything seemed to go ok, but the testAvalonLib1 test is now failing (the pyAvalonTools test passes) - see below. I can see that test1.cpp has changed recently, but my AvalonTools source hasn't... Has a problem been introduced into the test?

Kind regards
James

    C:\RDKit\build>ctest -R testAvalon -V
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    Test project C:/RDKit/build
    Constructing a list of tests
    Done constructing a list of tests
    Checking test dependency graph...
    Checking test dependency graph end
    test 2
        Start 2: testAvalonLib1
    2: Test command: C:\RDKit\build\External\AvalonTools\Release\testAvalonLib1.exe
    2: Test timeout computed to be: 9.99988e+006
    2: [12:31:18] testing canonical smiles generation
    2: [12:31:18] done
    2: [12:31:18] testing coordinate generation
    2: [12:31:18] done
    2: [12:31:18] testing fingerprint generation
    2: [12:31:18] c1n1 18
    2: returning
    2: [12:31:18] c1n1 6
    2: [12:31:18] c1nnccc1 28
    2: [12:31:18] c1ncncc1 25
    2: [12:31:18] c1cccnc1 18
    2: [12:31:18] c1c1 6
    2: [12:31:18] c1cccnc1 19
    2: [12:31:18] c1cocc1 48
    2: [12:31:18]
    2:
    2:
    2: Test Assert
    2: Expression Failed:
    2: Violation occurred on line 146 in file ..\..\..\External\AvalonTools\test1.cpp
    2: Failed Expression: bv.getNumOnBits()==53
    2:
    2:
    1/1 Test #2: testAvalonLib1 ...***Failed 2.87 sec
    0% tests passed, 1 tests failed out of 1
    Total Test time (real) = 2.98 sec
    The following tests FAILED:
          2 - testAvalonLib1 (Failed)
    Errors while running CTest

Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
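When a full ctest run produces a lot of output, it can help to pull out just the names of the failing tests before re-running them individually with -R and -V. A small sketch of doing that from captured ctest output follows; `failed_tests` is a hypothetical helper, and the sample text is an excerpt of the summary lines from the message above:

```python
import re


def failed_tests(ctest_output: str) -> list:
    """Return (number, name) pairs listed under 'The following tests FAILED:'."""
    failed = []
    in_section = False
    for line in ctest_output.splitlines():
        if "The following tests FAILED" in line:
            in_section = True
            continue
        if in_section:
            # Summary lines look like: "      2 - testAvalonLib1 (Failed)"
            match = re.match(r"\s*(\d+)\s+-\s+(\S+)\s+\(", line)
            if match:
                failed.append((int(match.group(1)), match.group(2)))
    return failed


SAMPLE = """\
0% tests passed, 1 tests failed out of 1
Total Test time (real) = 2.98 sec
The following tests FAILED:
          2 - testAvalonLib1 (Failed)
Errors while running CTest
"""

for number, name in failed_tests(SAMPLE):
    print(f"test {number} failed: {name}")
```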
Re: [Rdkit-discuss] Avalon test failing(?)
Hi Greg,

> The new version of the test code is targeting the 1.2 avalon toolkit version. Here's the commit that did that.
> https://github.com/rdkit/rdkit/commit/42dab414ee6fbe5489078e5e52046608bbf785cb
> As an FYI, to make these tests pass on windows, you need to edit the code to fix a bug: you need to comment out line 1446 of reaccsio.c:
> //MyFree((char *)tempdir);

Following your advice, I downloaded the 1.2 source from Sourceforge (http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/); commented out the line in reaccsio.c; and then reconfigured in cmake and rebuilt in VS. The tests pass now - thanks!

Kind regards
James
[Rdkit-discuss] MMFF constraints question
Dear All (but mainly Paulo!),

I have really been appreciating the MMFF implementation in RDKit - particularly now with the ability to add position / distance / angle / torsional constraints! I have a couple of naïve questions; and apologise in advance if I have missed answers to these in the documentation / method doc-strings...

1. This is a simple one - but just to categorically confirm - ff.CalcEnergy() gives results in kcal/mol units, right?

2. Now onto force constants... I see from the unittest (aka documentation if you can't find anything else) 'testConstraints.py' that the value of 1.0e5 is used in the tests. 1.0e5 what? And should this be viewed as a strong, modest, or weak restraint? Presuming this constant is somewhat like saying how springy a spring is(?), then what is a sensible value to give me a super-strong steel joist that would essentially resist everything (or is 1.0e5 it)?

3. Final thing: the problem with making something really good / useful is that people get used to it and then start wanting more! How's the GBSA implicit solvation coming on? : )

Kind regards
James
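On the "how springy is the spring" question, a rough way to build intuition is to assume the restraint is a plain harmonic penalty, E = 0.5 * k * dx^2, with E in kcal/mol and k in kcal/mol/Å^2. Both the functional form and the units are assumptions of this sketch and are not confirmed anywhere in this message, so they should be checked against the RDKit force-field documentation before relying on the numbers. The helper names are hypothetical:

```python
import math


def harmonic_penalty(k: float, displacement: float) -> float:
    """Energy penalty for moving `displacement` away from the restraint target.

    Assumes E = 0.5 * k * dx**2 (an assumption, see the note above).
    """
    return 0.5 * k * displacement ** 2


def displacement_for_energy(k: float, energy: float) -> float:
    """Displacement at which the assumed harmonic penalty reaches `energy`."""
    return math.sqrt(2.0 * energy / k)


# Compare a few force constants for the same 0.1 Angstrom displacement.
for k in (1.0, 100.0, 1.0e5):
    penalty = harmonic_penalty(k, 0.1)
    reach = displacement_for_energy(k, 0.6)  # 0.6 kcal/mol is roughly kT at room temperature
    print(f"k={k:>9.1f}  penalty at 0.1 A = {penalty:10.3f}  "
          f"displacement costing 0.6 kcal/mol = {reach:.4f} A")
```

Under these assumptions, a force constant of 1.0e5 makes a displacement of only a few thousandths of an ångström cost about as much as thermal energy at room temperature, which is one way to think about whether a given value behaves like a "steel joist".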
Re: [Rdkit-discuss] MMFF constraints question
Hi Paolo,

First of all - please see this time my brain has engaged quicker than my English-biased touch-typing - and I have spelt your name correctly(!). Thanks for the very clear explanation on force constants - this is really helpful! And, regarding your new non-academic position vs continued 'forcefield tools' development in RDKit - I kind of suspected the answer before I asked! Oh well, my non-academic position of course gives me access to commercial implementations of MMFF + implicit solvation... It's just not nearly as fun! : (

Kind regards
James
Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)
Hi Greg,

> What these are telling you is that the second query is not using the index: it's a sequential scan, so it has to test all rows of the database. This happens because the index is defined for the operator %, but not for the function tanimoto_sml(). There may be an approach to get the index set up using that function, but there we reach the limits of my expertise.

Well, I will stick to the recommended operator use then!

> One final advanced topic: if you are planning on making regular use of the similarity features in the cartridge and are running on a linux system or Mac I would recommend recompiling the cartridge with some optimizations for tanimoto similarity. To do this, you need to edit the cartridge Makefile from:
>
>     PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='007200' ${INCHIFLAGS} #-DUSE_BUILTIN_POPCOUNT -msse4.2
>
> to:
>
>     PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='007200' ${INCHIFLAGS} -DUSE_BUILTIN_POPCOUNT -msse4.2
>
> (I just removed a comment character here). This speeds the Tanimoto calculation up a fair bit (it's still not nearly as fast as Andrew's chemfp, but it's better than the default behavior).

I'm on linux (Ubuntu), and have just re-built with the above recommendation. I'll see what the speeds look like afterwards (out of interest, I presume the timings in your examples were with this optimisation in place?). Does this also affect dice? And final question - after rebuilding the cartridge, does the extension need to be dropped and then re-created in all databases; does the PostgreSQL server need restarting; or neither?

> Hope this helps,
> -greg

It does - thanks!

Kind regards
James
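For context on why a popcount-related compile flag matters here: the Tanimoto similarity of two bit-vector fingerprints comes down to counting the set bits in their AND and in their OR, so the speed of that bit count dominates the inner loop. A pure-Python sketch of the arithmetic is below. It is illustrative only, it is not the cartridge's C++ code, and the two fingerprints are made-up integers:

```python
def popcount(bits: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(bits).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity: |A and B| / |A or B|."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def dice(a: int, b: int) -> float:
    """Dice similarity: 2 * |A and B| / (|A| + |B|)."""
    total = popcount(a) + popcount(b)
    return 2 * popcount(a & b) / total if total else 0.0


# Two made-up 8-bit fingerprints, each with four bits set.
fp_a = 0b10110010
fp_b = 0b10010110

print("tanimoto:", tanimoto(fp_a, fp_b))
print("dice:    ", dice(fp_a, fp_b))
```

Dice is built from the same bit counts as Tanimoto, which is why the question about Dice in the message above is a natural one; whether the cartridge's Dice code path is affected by the flag is not something this sketch can answer.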
[Rdkit-discuss] RDKit cartridge similarity search speeds(?)
Dear All,

I have recently been spending a bit more time with the RDKit cartridge, and have what is probably a very naïve question... Having built some RDKit fingerprints for ChEMBL_18, I see the following behaviour (for clarification - 'ecfp4_bv' is the column in my rdk.fps table that has been generated using morganbv_fp(mol, 2)):

    chembl_18=# \timing on
    Timing is on.
    chembl_18=# set rdkit.tanimoto_threshold=0.5;
    SET
    Time: 0.167 ms
    chembl_18=# select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
      chembl_id
    -------------
     CHEMBL15719
    (1 row)
    Time: 2033.348 ms
    chembl_18=# select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
      chembl_id
    -------------
     CHEMBL15719
    (1 row)
    Time: 6843.605 ms

I can see that the query plans are different in the two cases, but I don't fully understand why - see below:

QUERY 1 (with explain analyze)

    chembl_18=# explain analyze select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
                                   QUERY PLAN
    ------------------------------------------------------------------------
     Bitmap Heap Scan on fps (cost=106.91..5298.31 rows=1352 width=13) (actual time=1774.986..1774.987 rows=1 loops=1)
       Recheck Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
       -> Bitmap Index Scan on fps_ecfp4bv_idx (cost=0.00..106.57 rows=1352 width=0) (actual time=1774.969..1774.969 rows=1 loops=1)
            Index Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
     Total runtime: 1775.035 ms
    (5 rows)
    Time: 1776.133 ms

QUERY 2 (with explain analyze)

    chembl_18=# explain analyze select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
                                   QUERY PLAN
    ------------------------------------------------------------------------
     Seq Scan on fps (cost=0.00..388808.17 rows=450793 width=13) (actual time=1278.115..6953.977 rows=1 loops=1)
       Filter: (tanimoto_sml(ecfp4_bv, '\x0100084200048204'::bfp) > 0.5::double precision)
       Rows Removed by Filter: 1352377
     Total runtime: 6954.010 ms
    (4 rows)
    Time: 6955.103 ms

It seems conceptually 'easier' to add the similarity value as part of the query, rather than setting it as a variable ahead of the query; but clearly I should be doing it the latter way for performance reasons. So even if I don't fully understand why at the moment, am I correct in thinking that queries of this sort should always be run with the similarity operators (%, #)? And if so, is the rdkit.tanimoto_threshold variable set at the level of the session, the user, or the database?

Kind regards
James
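The faster pattern in the message above is a pair of statements: set the threshold first, then query with the % operator so that the index can be used. A minimal sketch of generating that pair from Python is shown below. `similarity_search_sql` is a hypothetical helper, the table and column names are the ones from the message, and the statements are returned as plain strings rather than executed, so no database driver is assumed:

```python
def similarity_search_sql(smiles: str, threshold: float,
                          table: str = "rdk.fps",
                          column: str = "ecfp4_bv",
                          radius: int = 2) -> list:
    """Build the 'set threshold, then use the % operator' statement pair."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    quoted = smiles.replace("'", "''")  # minimal escaping for a SQL string literal
    return [
        f"set rdkit.tanimoto_threshold={threshold};",
        f"select chembl_id from {table} "
        f"where {column} % morganbv_fp('{quoted}'::mol,{radius});",
    ]


for statement in similarity_search_sql("c1nnccc1", 0.5):
    print(statement)
```

Keeping the two statements together in one helper makes it harder to run the operator query against a threshold left over from an earlier query on the same connection.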
Re: [Rdkit-discuss] Building RDKit on Windows
Thanks Greg - that did the trick! (I still see the pythonTestDbCLI failure - as previously posted)

Kind regards
James
Re: [Rdkit-discuss] Building RDKit on Windows
Hi All,

I have just rebuilt RDKit on Windows using the latest source, and am seeing a problem with smaTest1 failing (as well as still seeing the same DbCLI failure posted previously...). The smaTest1 failure seems a little strange because it actually throws a Windows executable error ('smaTest1.exe has stopped working', etc). If I run ctest -V -R smaTest1 I see the output below. Any thoughts?

Kind regards
James

    C:\RDKit\build>ctest -V -R smaTest1
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    Test project C:/RDKit/build
    Constructing a list of tests
    Done constructing a list of tests
    Checking test dependency graph...
    Checking test dependency graph end
    test 32
        Start 32: smaTest1
    32: Test command: C:\RDKit\build\Code\GraphMol\SmilesParse\Release\smaTest1.exe
    32: Test timeout computed to be: 9.99988e+006
    32: [17:42:57] -
    32: [17:42:57] Testing patterns which should parse.
    32: [17:42:57] SMARTS Parse Error: syntax error for input: c1b1
    32: [17:42:57]
    32:
    32:
    32: Invariant Violation
    32: c1b1
    32: Violation occurred on line 90 in file ..\..\..\..\Code\GraphMol\SmilesParse\smatest.cpp
    32: Failed Expression: mol
    32:
    32:
    1/1 Test #32: smaTest1 .***Failed 4.03 sec
    0% tests passed, 1 tests failed out of 1
    Total Test time (real) = 4.29 sec
    The following tests FAILED:
         32 - smaTest1 (Failed)
    Errors while running CTest
Re: [Rdkit-discuss] Building RDKit on Windows
Hi Greg,

> Try:
> ctest -V -R DbCLI
> that should run the test in Verbose mode so that you can see the failures.

Thanks - I have pasted the output below - looks like a file access issue (but I don't know why...).

Kind regards
James

    C:\RDKit\build>ctest -V -R DbCLI
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    Test project C:/RDKit/build
    Constructing a list of tests
    Done constructing a list of tests
    Checking test dependency graph...
    Checking test dependency graph end
    test 76
        Start 76: pythonTestDbCLI
    76: Test command: C:\Python27\python.exe C:/RDKit/Projects/test_list.py --testDir C:/RDKit/Projects
    76: Test timeout computed to be: 9.99988e+006
    76: [10:13:51] INFO: Reading molecules and constructing molecular database.
    76: [10:13:51] INFO: Generating molecular database in file testData/bzr\Compounds.sqlt
    76: [10:13:51] INFO: Processing 10 molecules
    76: [10:13:51] INFO: Generating fingerprints and descriptors:
    76: [10:13:51] INFO: Finished.
    76: [10:13:52] INFO: Reading molecules and constructing molecular database.
    76: [10:13:52] INFO: Generating molecular database in file testData/bzr\Compounds.sqlt
    76: [10:13:52] INFO: Processing 163 molecules
    76: [10:13:53] INFO: done 100
    76: [10:13:53] INFO: Generating fingerprints and descriptors:
    76: [10:14:02] INFO: Finished.
    76: [10:14:02] INFO: Reading query molecules and generating fingerprints
    76: [10:14:03] INFO: Finding Neighbors
    76: [10:14:04] INFO: The search took 0.6 seconds
    76: [10:14:04] INFO: Creating output
    76: [10:14:04] INFO: Done!
    76: [10:14:05] INFO: Reading query molecules and generating fingerprints
    76: [10:14:05] INFO: Finding Neighbors
    76: [10:14:05] INFO: The search took 0.3 seconds
    76: [10:14:05] INFO: Creating output
    76: [10:14:05] INFO: Done!
    76: [10:14:06] INFO: Reading query molecules and generating fingerprints
    76: [10:14:06] INFO: Finding Neighbors
    76: [10:14:06] INFO: The search took 0.1 seconds
    76: [10:14:06] INFO: Creating output
    76: [10:14:06] INFO: Done!
    76: [10:14:07] INFO: Doing property query
    76: [10:14:07] INFO: Found 30 molecules matching the query
    76: [10:14:07] INFO: Creating output
    76: [10:14:07] INFO: Done!
    76: [10:14:08] INFO: Doing property query
    76: [10:14:08] INFO: Found 30 molecules matching the query
    76: [10:14:08] INFO: Creating output
    76: [10:14:08] INFO: Done!
    76: [10:14:08] INFO: Doing substructure query
    76: [10:14:09] INFO:Fingerprint screenout rate: 112 of 163 (%68.71)
    76: [10:14:09] INFO: Found 49 molecules matching the query
    76: [10:14:09] INFO: Creating output
    76: [10:14:09] INFO: Done!
    76: [10:14:09] INFO: Doing substructure query
    76: [10:14:09] INFO:Fingerprint screenout rate: 112 of 163 (%68.71)
    76: [10:14:09] INFO: Found 49 molecules matching the query
    76: [10:14:09] INFO: Creating output
    76: [10:14:09] INFO: Done!
    76: [10:14:10] INFO: Doing substructure query
    76: [10:14:10] INFO: Found 114 molecules matching the query
    76: [10:14:10] INFO: Creating output
    76: [10:14:10] INFO: Done!
    76: [10:14:11] INFO: Doing substructure query
    76: [10:14:11] INFO:Fingerprint screenout rate: 23 of 30 (%76.67)
    76: [10:14:11] INFO: Found 5 molecules matching the query
    76: [10:14:11] INFO: Creating output
    76: [10:14:11] INFO: Done!
    76: [10:14:12] INFO: Doing substructure query
    76: [10:14:12] INFO: Found 25 molecules matching the query
    76: [10:14:12] INFO: Creating output
    76: [10:14:12] INFO: Done!
    76: [10:14:13] INFO: Reading query molecules and generating fingerprints
    76: [10:14:18] INFO: Finding Neighbors
    76: [10:14:19] INFO: The search took 0.9 seconds
    76: [10:14:19] INFO: Creating output
    76: [10:14:19] INFO: Done!
    76: [10:14:20] INFO: Reading query molecules and generating fingerprints
    76: [10:14:20] INFO: Finding Neighbors
    76: [10:14:20] INFO: The search took 0.0 seconds
    76: [10:14:20] INFO: Creating output
    76: [10:14:20] INFO: Done!
    76: [10:14:21] INFO: Reading molecules and constructing molecular database.
    76: [10:14:21] INFO: Generating molecular database in file testData/bzr\Compounds.sqlt
    76: [10:14:21] INFO: Processing 10 molecules
    76: [10:14:21] INFO: Generating fingerprints and descriptors:
    76: [10:14:21] INFO: Finished.
    76: [10:14:23] INFO: Reading molecules and constructing molecular database.
    76: [10:14:23] INFO: Generating molecular database in file testData/bzr\Compounds.sqlt
    76: [10:14:23] INFO: Processing 10 molecules
    76: [10:14:23] INFO: Generating fingerprints and descriptors:
    76: [10:14:23] INFO: Finished.
    76: .Traceback (most recent call last):
    76:   File "CreateDb.py", line 460, in <module>
    76:     CreateDb(options,dataFilename)
    76:   File "CreateDb.py", line 214, in CreateDb
    76:     startAnew=not options.updateDb
    76:   File "C:\RDKit\rdkit\Chem\MolDb\Loader_sa.py", line 111, in LoadDb
    76:     os.unlink(dbName)
    76: WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'testData/bzr\\Compounds.sqlt'
    76: [10:14:24] INFO: Reading
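The WindowsError [Error 32] raised at os.unlink() in the traceback above means that something still has the .sqlt file open when the loader tries to delete it. The thread does not say what holds the file, so the following is only a generic sketch of one common cause and its fix, under the assumption that an unclosed SQLite connection is responsible: close the connection explicitly before the file is removed. `rebuild_db` and the table definition are hypothetical and this is not the RDKit loader code:

```python
import contextlib
import os
import sqlite3
import tempfile


def rebuild_db(db_name: str) -> None:
    """Delete any existing database file, then create a fresh one."""
    if os.path.exists(db_name):
        # On Windows this raises an error if any handle to the file is still open.
        os.unlink(db_name)
    # contextlib.closing() guarantees conn.close() runs, even on an exception,
    # so no handle to the file outlives this function.
    with contextlib.closing(sqlite3.connect(db_name)) as conn:
        conn.execute("create table molecules (id integer primary key, smiles text)")
        conn.commit()


db_path = os.path.join(tempfile.mkdtemp(), "Compounds.sqlt")
rebuild_db(db_path)
rebuild_db(db_path)  # the second call has to unlink the file written by the first
print("database rebuilt at", db_path)
```

Note that `with sqlite3.connect(...) as conn:` on its own only manages the transaction and does not close the connection, which is why `contextlib.closing` is used here.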
[Rdkit-discuss] Building RDKit on Windows
Dear All, As part of a New Year's resolution, I decided I should try to enjoy the benefits of a cutting-edge version of RDKit built from source(!) So far this has proven to be much more realistic than eg 'not drinking for January' - as I now have a working build to show for my efforts. However, I wonder if I could quickly list the steps I took; and also ask a couple of questions (relating to InChi and Avalon)? For reference I am running on Windows7 64-bit, but use python 2.7.6 32bit, so am building 32-bit RDKit. I essentially followed the guide on the wiki (https://code.google.com/p/rdkit/wiki/BuildingOnWindows) but thought the version info of boost, etc may be of use to others, and the steps may help put my questions into context: 1. Downloaded Visual Studio Express 2012 for Desktop, installed, and accepted the updates 2. Downloaded matching version of Windows boost binaries (boost_1_55_0-msvc-11.0-32.exe) from http://sourceforge.net/projects/boost/files/boost-binaries/ and extracted to the default path 3. Used TortoiseSVN to add a repository link to https://github.com/rdkit/rdkit.git/trunk (and not the SF path as currently shown in the wiki guide) in C:/RDKit 4. Set the environment variables as described on the wiki. 5. Downloaded the INCHI src as described in the wiki and set the RDK_BUILD_INCHI_SUPPORT option later in cmake. Incidentally, the location for the downloads from IUPAC have changed (ie the info in the README is out of date): http://www.iupac.org/home/publications/e-resources/inchi/download.html 6. Ran CMake configure (GUI) following the wiki, and based on the output, made some boost-related additions to environment variables a. Added C:\local\boost_1_55_0\lib32-msvc-11.0 to PATH b. Created BOOST_ROOT=C:\local\boost_1_55_0 c. Created BOOST_LIBRARYDIR=C:\local\boost_1_55_0\lib32-msvc-11.0 7. Re-ran configure, then generate, then followed the rest of the wiki instructions to build and test - all tests passed except the dbCli one. 
So now for the questions: I thought I did everything right for adding INCHI support. However, I see the following: In [1]: from rdkit import Chem In [2]: Chem.inchi.INCHI_AVAILABLE Out[2]: False CMake shows: Could NOT find InChI in system locations (missing: INCHI_LIBRARY INCHI_INCLUDE_DIR) Found InChI software locally Do I also need to download the InChi binary and set these two variables appropriately in CMake? Also, I am struggling to build with Avalon support... Choosing the RDK_BUILD_AVALON_SUPPORT appears to configure fine, but when I try to 'Generate' I see the following error: CMake Error at Code/cmake/Modules/RDKitUtils.cmake:26 (add_library): Cannot find source file: /common/layout.c Tried extensions .c .C .c++ .cc .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp .hxx .in .txx Call Stack (most recent call first): External/AvalonTools/CMakeLists.txt:43 (rdkit_library) Any thought on how to get around this? Do I need to download the Avalon project src and put it somewhere? And final question - can I happily ignore the CMake messages about not finding FLEX and BISON, or are these needed when incorporating any of the non-default entries (SWIG wrappers, etc)? Kind regards James __ PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. 
Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. The Vernalis Group of Companies 100 Berkshire Place Wharfedale Road Winnersh, Berkshire RG41 5RD, England Tel: +44 (0)118 938 To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.
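As a quick sanity check after a rebuild, the INCHI_AVAILABLE flag quoted in the questions above can be tested from Python before any InChI call is made. This is only a minimal sketch, assuming an importable RDKit build; the helper name inchi_or_none is made up for illustration and is not part of RDKit.

```python
from rdkit import Chem
from rdkit.Chem import inchi


def inchi_or_none(smiles):
    """Return the InChI for a SMILES, or None if this build lacks InChI support.

    Hypothetical helper for illustration only; not an RDKit function.
    """
    if not inchi.INCHI_AVAILABLE:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    return inchi.MolToInchi(mol)


print("InChI support compiled in:", inchi.INCHI_AVAILABLE)
ethanol_inchi = inchi_or_none("CCO")
print(ethanol_inchi)
```

If the flag prints False, the build was configured without the InChI sources being picked up, which matches the symptom described in the message.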
[Rdkit-discuss] Minimising bits of molecules?
Dear All,

I think this is probably one for Paolo - I was looking at fixing certain atoms during MMFF minimisation, but couldn't find the option... Then I re-read the UGM slides, and found the one titled 'Force-field wish list', and fixed atoms were one of the listed items!

My intended use-case is the following:

1. Load protein-ligand complex into PyMOL
2. Make some changes to the bound ligand (using the Builder functionality)
3. Select atoms that are allowed to move (manual selection, then use of PyMOL's 'flag' command)
4. Pass the molecule over to RDKit (already incorporated in a plugin we use), to minimise and then pass back (either as a new object, or apply the new coordinates to the existing object in situ)

Actually, this process is already well-used by some of our chemists here - as a way of doing some simple modelling / idea exploration - but is currently using a much 'flakier' MMFF implementation. So I would definitely like to move to RDKit for the minimisation - any idea when a 'fixed atoms' option is likely to be added?

Kind regards
James
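In RDKit releases newer than the one discussed in this message, the Python force-field object exposes AddFixedPoint(), which covers the 'only some atoms may move' use-case. The following is a minimal sketch under that assumption: ethanol stands in for the edited ligand, and fixed_idxs is a made-up name standing in for the atoms that were not flagged as movable in PyMOL.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in for the edited ligand: ethanol with explicit hydrogens and 3D coordinates.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# Hypothetical selection: atoms that must NOT move (here the two carbons).
fixed_idxs = [0, 1]

conf = mol.GetConformer()


def coords(idx):
    """Return the current (x, y, z) of atom idx as a plain tuple."""
    p = conf.GetAtomPosition(idx)
    return (p.x, p.y, p.z)


before = {i: coords(i) for i in fixed_idxs}

# Build the MMFF force field and pin the selected atoms before minimising.
props = AllChem.MMFFGetMoleculeProperties(mol)
ff = AllChem.MMFFGetMoleculeForceField(mol, props)
for idx in fixed_idxs:
    ff.AddFixedPoint(idx)
ff.Minimize(maxIts=500)

after = {i: coords(i) for i in fixed_idxs}
max_shift = max(abs(a - b) for i in fixed_idxs for a, b in zip(after[i], before[i]))
print("largest coordinate shift of a fixed atom (Angstrom):", max_shift)
```

The minimised coordinates are written back to the molecule's conformer, so they can be handed back to PyMOL either as a new object or applied to the existing one.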
Re: [Rdkit-discuss] Chemistry 101 question...
Greg wrote:
> This is what it looks like the state of play at the moment is:
> - Adding nitro groups tends to make molecules more lipophilic, at least as measured by retention time in chromatography.
> - Nitro groups are H-bond acceptors, at least according to the papers I found above and the evidence one finds in the CSD.
>
> This seems like an argument for having nitro groups in the default fdef file as both lumped hydrophobes (the whole group) and acceptors (the Os). Make sense?

Makes sense to me!

Cheers
James
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi Sereina,

Sereina wrote:
> Regarding the AssignBondOrdersFromTemplate() method: As far as I understood, the PDB reader assigns bond orders to the amino acids in a protein, but if a ligand is present it puts all bonds of it to SINGLE bonds as auto bond-type perception is not trivial (see Roger's comments). However, usually one knows which ligand was crystallized (i.e. the SMILES is available), so the AssignBondOrdersFromTemplate() method can be used to set the bond orders based on the known ligand structure. This is the idea of the method.
>
> Now, to your real-world application. I'm sorry but I don't think I understand it completely. Do you want to set only the bond orders of a specific substructure? Or would you like to give the function a set of ligands and a set of templates and it figures out which template belongs to which ligand and sets the bond orders accordingly?

This is very likely to be me being stupid - so please bear with me!

If I read in a complex (pdb), and already have my reference ligand (lig), then AllChem.AssignBondOrdersFromTemplate(lig, pdb) fails because the reference ligand has not been matched to the ligand in the pdb 'complex' (dot-separated list of molecules). The doc-string states that the method works on two molecules - but I want to work on a reference molecule (lig) and a *substructure* of the macromolecule (pdb). How should I be getting the bound ligand out as a molecule object to then use the AssignBondOrdersFromTemplate() method? Am I missing some new PDB-related methods, or have I forgotten some fundamental RDKit methods for dealing with multi-component molecules?

I guess a sensible process would be:

1. Identify any HETATM residues
2. For each residue (or at least those that have bonds!) extract or copy the mol (unless it can be addressed 'in place'?)
3. Use AssignBondOrdersFromTemplate() - relying on lookup by eg residue name, etc
4. Insert the molecule back into the complex (or update the info if it has been modified 'in place')

Is this how the method is intended to be used with complexes (and if so, do you have an example for steps 2 and 4)?

Thanks
James
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi Greg (et al.),

Thanks for the beta! I have been going through some of the recently-added functionality, and had a couple of questions regarding the PDB reading / writing.

1. Do I remember correctly that there was a proposal (from Roger) to add some auto bond-type perception to the PDB parser for ligands (or is that just wishful thinking!)?
2. If not, I notice that there is an AssignBondOrdersFromTemplate() method - but the example in the doc-string only shows (I think) the case where the input PDB is just a single small molecule - so the matching is pretty easy! I think a more real-world case is when one wants to set the bond orders for multiple ligands (HETATM residues) based on substructure matches - which will then return an atom index selection that can be used as a start point. Is there any way to have the AssignBondOrdersFromTemplate() convenience function optionally accept a list of atom indexes to specify a substructure?
3. Is there some explanation for what the 'flavor' option does for reading/writing PDB?
4. Having read in a PDB file I see the correct atoms flagged as HETATM (from GetIsHeteroAtom()). But when I call Chem.MolToPDBBlock() these atoms get written as ATOM records... Also, a Chem.MolToPDBFile() method would be nice for completeness / symmetry : )
5. It seems to me that GetResidueNumber() and GetSerialNumber() may have got mixed-up at some point(?). At least, when I call GetSerialNumber() I see what appears to be the residue number; and when I call GetResidueNumber() I get 0!
6. I also seem to be seeing all of the bonds (for all residues) being written out in CONECT records - such that they all appear as single bonds in eg PyMOL - is this expected behaviour at the moment?

Cheers
James
Re: [Rdkit-discuss] Chemistry 101 question...
Hi JP, Nik, Greg, RDKitters

The question about the lipophilicity (or otherwise) of nitro groups was interesting to me... I came from a CNS background, where there was, of course, a stricter requirement for molecules to be suitably lipophilic to cross the blood-brain barrier. My recollection was that the observed lipophilicity of nitro groups was dependent on their local environments (ie electron rich / +m gave more polar character, and electron poor / -m gave more polar character)... But rather than rely on my hazy recollections, I decided to have a quick look back at some historical reverse-phase analytical LC data.

What I did was take all retention times (in mins) under one well-used gradient method, and generate the matched-molecular pairs using George's KNIME node. I was then only interested in *[H]>>[*][N+](=O)[O-] transformations, so filtered-down to just those changes involving 5 atoms in the transformation (because this was quicker than chemically searching!). I then grouped across the examples of transformations to give some average changes in retention time, plus n, range, sd:

Transformation            Mean RT change (min)   RT range (min)   SD       n
*[H]>>[*]CCC                     2.5                 3.3          0.999    28
*[H]>>[*]C(C)C                   2.19                5.47         1.09     37
*[H]>>[*]CCCl                    1.91                1.5          1.06      2
*[H]>>[*]C(F)F                   1.22                1.36         0.748     3
*[H]>>[*]C1CC1                   1.18                1.04         0.436     4
*[H]>>[*]N(C)C                   1.08                1.21         0.472     6
*[H]>>[*]CSC                     0.67                0            0         1
*[H]>>[*]OCC                     0.479               4.67         1.17     15
*[H]>>[*][N+](=O)[O-]            0.169               2.82         0.645    35
*[H]>>[*]NCC                     0.0625              0.045        0.0318    2
*[H]>>[*]CCO                     0.06                0.04         0.0283    2
*[H]>>[*]COC                    -0.001               2.46         0.62     14
*[H]>>[*]CC=C                   -0.357               0            0         1
*[H]>>[*]C(C)=O                 -0.397               1.21         0.696     3
*[H]>>[*]C(=O)O                 -0.848               4.3          2.17      3
*[H]>>[*]CC#N                   -1.3                 2.35         1.66      2
*[H]>>[*]C(N)=O                 -2.72                0            0         1
*[H]>>[*]CCN                    -2.77                0            0         1

So on average over the 35 examples of H --> NO2 the change made the molecules slightly more lipophilic (or, at least, they were retained slightly longer on a C18 column).
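The grouping step described above (mean, range, SD and n per transformation) can be sketched in plain Python. This is only an illustration, not the KNIME workflow that was actually used: the function name summarise and the individual retention-time changes are assumptions. The values for the two n = 2 rows are back-calculated from the tabulated mean and range, and the SD is taken as the sample standard deviation, which is what reproduces the tabulated figures for those rows.

```python
import statistics
from collections import defaultdict


def summarise(pairs):
    """Group (transformation, delta_rt) pairs and return per-transformation stats.

    Hypothetical helper: rows are sorted by mean RT change, largest first.
    """
    grouped = defaultdict(list)
    for transformation, delta_rt in pairs:
        grouped[transformation].append(delta_rt)

    rows = []
    for transformation, deltas in grouped.items():
        rows.append({
            "transformation": transformation,
            "mean": statistics.mean(deltas),
            "range": max(deltas) - min(deltas),
            # Sample SD is undefined for a single example; report 0 as in the table.
            "sd": statistics.stdev(deltas) if len(deltas) > 1 else 0.0,
            "n": len(deltas),
        })
    rows.sort(key=lambda row: row["mean"], reverse=True)
    return rows


# Illustrative per-pair RT changes (min), reconstructed from the summary rows.
pairs = [
    ("*[H]>>[*]NCC", 0.04),
    ("*[H]>>[*]NCC", 0.085),
    ("*[H]>>[*]CCO", 0.04),
    ("*[H]>>[*]CCO", 0.08),
    ("*[H]>>[*]CSC", 0.67),
]

rows = summarise(pairs)
for row in rows:
    print("%-16s mean=%.4f range=%.3f sd=%.4f n=%d" % (
        row["transformation"], row["mean"], row["range"], row["sd"], row["n"]))
```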
I expect there is much more data-digging that could be done - particularly with larger data sets, and (maybe) with proper logP / logD measurements; but for now I am going to stick to thinking NO2 groups can be lipophilic additions(!)

Cheers
James
Re: [Rdkit-discuss] Chemistry 101 question...
Hi Nik,

Nik wrote:
> Interesting. I wonder if this is also dependent on the transport phase that was used. Do you have any info on that? Was it a typical 10% MeOH or more something with dichloromethane?

I dug out the conditions:

LC retention time Method A refers to elution of a sample through an XTERRA RP18 (50 mm x 4.6 mm) 5 µm column under gradient conditions. The initial eluent comprises 50% Methanol (pump-A) and 50% of a 10 mM aqueous ammonium acetate solution containing 5% IPA (pump-B) at a flow rate of 2 mL/min. After 1 min, a gradient is run over 5 min to an end point of 80% pump-A and 20% pump-B, which is isocratically maintained for a further 3 min. UV peak detection is generally carried out at a wavelength of 220 nm.

I should also say that, in my experience, even under normal-phase conditions (ie silica column and organic eluent) nitro-aromatics tend to behave 'greasily'.

Who in big pharma wants to mine some nitration reaction data to pull out TLC plate Rf data (normal phase) + LC retention (reverse phase)? I think your DB may be bigger than ours! : )

Cheers
James
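To relate a retention time to the eluent composition at that moment, the Method A program described above can be written as a small function. This is just a sketch: the function name is made up, and it assumes the 5-minute gradient is linear, which the method description does not state explicitly.

```python
def percent_methanol(t_min):
    """Return the % of pump-A (methanol) at time t_min (minutes) for Method A.

    Assumptions (not stated in the method text): the ramp is linear, and the
    method ends after the 3 min isocratic hold, ie at 9 min.
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 1.0:
        return 50.0                                   # initial 50:50 hold
    if t_min <= 6.0:
        return 50.0 + (t_min - 1.0) * (80.0 - 50.0) / 5.0   # 5 min ramp to 80%
    if t_min <= 9.0:
        return 80.0                                   # isocratic hold
    raise ValueError("beyond the end of the method (9 min)")


for t in (0.0, 1.0, 3.5, 6.0, 8.0):
    print("t = %.1f min: %.1f%% MeOH" % (t, percent_methanol(t)))
```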
Re: [Rdkit-discuss] Handling reaction stereochemistry
Hi Greg,

> Correct, relative (or other forms of enhanced) stereochemistry is not possible. It's worth talking about how to deal with this, but it's going to be more than a little bit of work, I suspect.

I suspect so, too!

> The conversation about representation of and handling of enhanced stereochemistry, and what the actual use cases are, would be a good one to have. I think it's probably going to be difficult via email though. Maybe a topic for the UGM...

I agree re: email. A topic for discussion at the UGM sounds like a very good idea - that gives everybody 6 months to mull it over!

Kind regards
James
Re: [Rdkit-discuss] Handling reaction stereochemistry
Hi Greg

> I should have provided a bit more context around what the current behavior is, or at least what it's supposed to be. Sorry I forgot that.

My fault - I should have (re)read the manual (I thought it seemed a bit familiar..!)

> Currently, when creating a reaction from rxnSMARTS, inversion/retention is handled by looking at the relative stereochemistry of atoms in the reactants and products. If they're different you get inversion (apologies for the extremely bogus example reaction):
>
>     In [13]: rxn = AllChem.ReactionFromSmarts('[C@:1]>>[C@@:1]')
>     In [14]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
>     In [15]: Chem.MolToSmiles(ps[0][0],True)
>     Out[15]: 'F[C@@](Cl)(Br)I'
>     In [16]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@@](Cl)(Br)I'),))
>     In [17]: Chem.MolToSmiles(ps[0][0],True)
>     Out[17]: 'F[C@](Cl)(Br)I'
>
> and if they're the same you get retention:
>
>     In [7]: rxn2 = AllChem.ReactionFromSmarts('[C@:1]>>[C@:1]')
>     In [8]: ps = rxn2.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
>     In [9]: Chem.MolToSmiles(ps[0][0],True)
>     Out[9]: 'F[C@](Cl)(Br)I'
>     In [10]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
>     In [11]: ps = rxn3.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
>     In [12]: Chem.MolToSmiles(ps[0][0],True)
>     Out[12]: 'F[C@](Cl)(Br)I'
>
> This much feels logical to me, though of course it can be changed if there's disagreement.

It sort of does to me too, but I can't shift the sensation that there might be a can of worms here - more on that in a moment...

> If you call the reaction with non-chiral starting material, you get non-chiral output:
>
>     In [20]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
>     In [21]: ps = rxn3.RunReactants((Chem.MolFromSmiles('FC(Cl)(Br)I'),))
>     In [22]: Chem.MolToSmiles(ps[0][0],True)
>     Out[22]: 'FC(Cl)(Br)I'
>
> This is probably also ok; it certainly reflects what would happen in the lab (er, at least I think it does).
Just to be a pedant for a moment (but actually, this could be important later) - this is actually calling the reaction with *chiral* (albeit presumably racemic) starting material.

> So far so good. We've got inversion of stereochemistry and retention of stereochemistry. There are two cases left: resolution/creation and scrambling. One obvious thing to do here would be:
>
>     [C@:1]>>[C:1]    scrambling
>     [C:1]>>[C@:1]    resolution/induction
>
> This is where my extremely bogus example starts to make things more difficult to understand, so here's a more real example of the induction case:
>
>     [#6:1]/[C:2]=[C:3](/[#6:4])>>[#6:1][C@H:2](Br)[C@H:3](Br)[#6:4]
>
> Seem right?

Can of worms alert 1!! At first sight this seems perfectly ok(?) - as long as we accept that we know what we mean by the (R) flags on the carbons (by my reckoning we probably mean syn addition of Br2 across a double-bond?). But - problems of symmetry and atom priorities aside(!) - what do I do if I want to employ the same transformation but with no absolute stereo-control (ie if I don't have the same wonder-catalyst)? At the moment I guess there is no way to represent relative stereochemistry in the absence of an enhanced stereochemistry model?

This brings me on to the main can of worms sensation - and I think it may revolve around trying to service both real and 'virtual/fake' reactions in the same system, as well as some obvious concerns about enhanced stereochemistry. So some examples / questions:

1. I have a super-useful enzyme that will only hydrolyse (R)-esters (or more precisely I should say it won't hydrolyse (S)-esters). So:

    CC[C@H](C)C(=O)OC>>CC[C@H](C)C(=O)O       ## R gets hydrolysed
    CC[C@@H](C)C(=O)OC>>CC[C@@H](C)C(=O)OC    ## S doesn't
    CCC(C)C(=O)OC                             ## Oh dear, what do we want to happen here?

I know what my enzyme will do - but we do have to assume that we are implying a racemic mix (it gets more worrying if we might mean a single, but unknown, enantiomer, or we might know nothing at all - we're back to enhanced stereochemistry again!)
    CCC(C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

So this is what the enzyme would do - because we have treated the chiral centre as a racemic mix - essentially expanding out to:

    CC[C@H](C)C(=O)OC.CC[C@@H](C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

The problem with this is that it doesn't fit with the existing rSMARTS nomenclature for retention and inversion, because the absolute stereochemistry of the starting material affects the outcome of the reaction! But I guess my enzyme reaction above would be represented as something like

    [C@:1][C:2](=[O:3])[O:4]C>>[C@:1][C:2](=[O:3])[O:4]H

But we would have to (a) assume now that '@' in the starting material only matched (R), and (b) treat incoming racemates intrinsically as two-component mixtures of (R) and (S) to then apply the transformation to just the (R) and add the (S) starting material to the products...

2. I am a database admin, and I want to transform some mis-assigned racemates to the (S) enantiomers Eg
Re: [Rdkit-discuss] Handling reaction stereochemistry
Hi Greg,

> I've got a question for the community about how chirality should be handled in reactions. This morning I managed to fix one of the outstanding reaction stereochemistry problems in the RDKit: the loss of chirality when one bond to a stereocenter is to an unmapped atom. Here's a quick demo of the new behavior (not yet checked in; there are still a couple things to be cleaned up):
>
>     In [7]: rxn = AllChem.ReactionFromSmarts('[C:1]-O>>[C:1]-S')
>     In [8]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@H](O)Cl'),))
>     In [9]: Chem.MolToSmiles(ps[0][0],True)
>     Out[9]: 'F[C@H](S)Cl'
>
> It seems nice to be able to preserve chirality in these cases. The question that comes up is: "*Should* we be preserving chirality in these cases?" The change makes it impossible to indicate a reaction that scrambles stereochemistry. That doesn't seem right.
>
> So... the question to you guys: How should stereochemistry inversion/retention/loss be indicated in Reaction SMARTS?

Good question - and let me be the first to jump in, feet first, without thinking enough! : )

Instinctively, I would say it would be good to (a) scramble stereochemistry if not otherwise specified - at least this way we default to losing information rather than risking keeping incorrect information; (b) use a flag at each centre if we want to retain stereochemistry (what about '@'?); (c) use another flag if we want to invert (and, inventive I know, what about '@@'?).

So in the above example, let's say I want to always invert (eg to represent an SN2 reaction) - the rSMARTS could then be something like [C:1]-O>>[C:1@@]-S, and the example input above would give F[C@@H](S)Cl out. The same input with no specification could give FC(H)(S)Cl and, of course, achiral input would always give achiral output - regardless of the flag in the rSMARTS.

> Bonus points to anyone who can explain to me how the inversion/retention flags in RXN files should be handled. At the moment the RDKit uses what's in the products and ignores them in the reactants.

Something like the above? (I told you I hadn't thought about it enough!)

Kind regards
James
[Rdkit-discuss] Polymers, S-Groups, and molblock-parsing (oh my!)
Dear All,

I just wanted to raise an observation about the behaviour of the molblock parser. I was running some SMARTS-based substructure queries in KNIME, and happened to be looking for aromatic N-oxides - the query was just nO - which should maybe be the answer as well! : )

Anyway, I was actually searching DrugBank (via the SDF - http://www.drugbank.ca/system/downloads/current/structures/small_molecule.sdf.zip) and found Heparin was a hit for my query - which I thought was a bit funny as there are no aromatic nitrogens. It seems, however, that the match is due to the * atoms in the molblock (see below) that are representing the polymer repeat points (leading to *-O, which is matching n-O).

As I understand it, the rest of the info about the polymer is stored as S-Group data - and I am presuming that RDKit is not currently interpreting this(?) So I guess the simple question is - should polymers, etc be handled by the parser (maybe if not fully, just partially - eg by deleting the * atoms if the S-Group data are found)?

Kind regards
James

Mrv0541 09201117322D
14 0 0 1 0999 V2000
12.8725 -11.15210. C 0 0 1 0 0 0 0 0 0 0 0 0
13.5903 -11.56670. C 0 0 1 0 0 0 0 0 0 0 0 0
12.8725 -10.32720. C 0 0 2 0 0 0 0 0 0 0 0 0
11.8517 -11.74930. O 0 0 0 0 0 0 0 0 0 0 0 0
14.2992 -11.15210. C 0 0 2 0 0 0 0 0 0 0 0 0
13.5903 -12.39140. O 0 0 0 0 0 0 0 0 0 0 0 0
13.5903 -9.91720. O 0 0 0 0 0 0 0 0 0 0 0 0
12.1547 -9.91720. C 0 0 0 0 0 0 0 0 0 0 0 0
10.8307 -12.33350. C 0 0 1 0 0 0 0 0 0 0 0 0
14.2992 -10.32720. C 0 0 2 0 0 0 0 0 0 0 0 0
14.9729 -11.91850. N 0 0 0 0 0 0 0 0 0 0 0 0
11.4415 -10.32720. O 0 0 0 0 0 0 0 0 0 0 0 0
10.8307 -13.15820. C 0 0 1 0 0 0 0 0 0 0 0 0
10.1175 -11.92320. O 0 0 0 0 0 0 0 0 0 0 0 0
15.3200 -9.74330. O 0 0 0 0 0 0 0 0 0 0 0 0
16.1934 -11.91390. S 0 0 0 0 0 0 0 0 0 0 0 0
10.1175 -13.57280. C 0 0 2 0 0 0 0 0 0 0 0 0
11.7684 -14.14450. O 0 0 0 0 0 0 0 0 0 0 0 0
9.3996 -12.33350. C 0 0 2 0 0 0 0 0 0 0 0 0
16.3409 -9.15040. C 0 0 2 0 0 0 0 0 0 0 0 0
16.1889 -12.73870. O 0 0 0 0 0 0 0 0 0 0 0 0
16.1889 -11.08920. O 0 0 0 0 0 0 0 0 0 0 0 0
17.0225 -11.91390. O 0 0 0 0 0 0 0 0 0 0 0 0
9.3996 -13.15820. C 0 0 1 0 0 0 0 0 0 0 0 0
10.1175 -14.39750. O 0 0 0 0 0 0 0 0 0 0 0 0
8.6864 -11.92320. C 0 0 0 0 0 0 0 0 0 0 0 0
16.3409 -8.32570. C 0 0 2 0 0 0 0 0 0 0 0 0
17.0586 -9.56500. C 0 0 1 0 0 0 0 0 0 0 0 0
8.6819 -13.56830. O 0 0 0 0 0 0 0 0 0 0 0 0
7.9730 -12.33350. O 0 0 0 0 0 0 0 0 0 0 0 0
8.6864 -11.09850. O 0 0 0 0 0 0 0 0 0 0 0 0
17.0586 -7.91540. O 0 0 0 0 0 0 0 0 0 0 0 0
15.6276 -7.91540. C 0 0 0 0 0 0 0 0 0 0 0 0
17.7720 -9.15040. C 0 0 2 0 0 0 0 0 0 0 0 0
17.0586 -10.39420. O 0 0 0 0 0 0 0 0 0 0 0 0
6.9121 -13.55940. * 0 0 0 0 0 0 0 0 0 0 0 0
17.7720 -8.32570. C 0 0 2 0 0 0 0 0 0 0 0 0
14.9099 -8.32570. O 0 0 0 0 0 0 0 0 0 0 0 0
15.6276 -7.09070. O 0 0 0 0 0 0 0 0 0 0 0 0
19.3208 -10.04870. O 0 0 0 0 0 0 0 0 0 0 0 0
18.7974 -7.73260. O 0 0 0 0 0 0 0 0 0 0 0 0
19.8138 -7.14420. C 0 0 2 0 0 0 0 0 0 0 0 0
19.8138 -6.31940. C 0 0 2 0 0 0 0 0 0 0 0 0
20.5314 -7.55890. C 0 0 1 0 0 0 0 0 0 0 0 0
20.5314 -5.90930. O 0 0 0 0 0 0 0 0 0 0 0 0
19.1005 -5.90930. C 0 0 0 0 0 0 0 0 0 0 0 0
21.2449 -7.14420. C 0 0 2 0 0 0 0 0 0 0 0 0
20.5314 -8.38800. O 0 0 0 0 0 0 0 0 0 0 0 0
21.2449 -6.31940. C 0 0 0 0 0 0 0 0 0 0 0 0
18.2713 -5.90040. O 0 0 0 0 0 0 0 0 0 0 0 0
22.5298 -7.73480. N 0 0 0 0 0 0 0 0 0 0 0 0
22.7828 -5.43680. * 0 0 0 0 0 0 0 0 0 0 0 0
17.4465 -5.89590. S 0 0 0 0 0 0 0 0 0 0 0 0
22.5342 -8.56390. C 0 0 0 0 0 0 0 0 0 0 0 0
17.4421 -6.72070. O 0 0 0 0 0 0 0 0 0 0 0 0
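One possible workaround for the behaviour described above is to remove the * (atomic number 0) atoms before running the substructure query. The sketch below is only an illustration: the SMILES is a made-up repeat-unit fragment rather than the Heparin record, the helper name strip_dummy_atoms is hypothetical, and whether a * atom in the target matches an aromatic-nitrogen query at all may depend on the RDKit version in use.

```python
from rdkit import Chem


def strip_dummy_atoms(mol):
    """Return a sanitized copy of mol without its atomic-number-0 ('*') atoms."""
    dummy = Chem.MolFromSmarts('[#0]')
    cleaned = Chem.DeleteSubstructs(mol, dummy)
    # Recompute implicit hydrogens etc. on the atoms that lost a neighbour.
    Chem.SanitizeMol(cleaned)
    return cleaned


# Hypothetical polymer-like fragment with two repeat points (not from DrugBank).
polymer_like = Chem.MolFromSmiles('*OCC(O)CO*')
cleaned = strip_dummy_atoms(polymer_like)

query = Chem.MolFromSmarts('nO')  # the aromatic N-oxide query from the message
print(cleaned.GetNumAtoms(), cleaned.HasSubstructMatch(query))
```

This throws away the attachment-point information, so it is only appropriate when the query result matters more than the polymer representation.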
Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release
Thanks Greg, and George. I have not tested the new win-py27 binary fully - but it does at least behave itself when importing AllChem! Kind regards James __ PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. The Vernalis Group of Companies Oakdene Court 613 Reading Road Winnersh, Berkshire RG41 5UA. Tel: +44 118 977 3133 To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.. __-- All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2d-oct___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release
Hi Greg,

I probably should have picked this up in the beta (but didn't...) When I try to import AllChem, I see the following:

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    from rdkit.Chem import AllChem
  File "C:\Python27\RDKit_2011_09_1\rdkit\Chem\AllChem.py", line 28, in <module>
    from rdkit.Chem.rdSLNParse import *
ImportError: DLL load failed: The specified module could not be found.

Any advice?

Kind regards
James
Re: [Rdkit-discuss] Beta of Q3 2011 Release Available
Hi Greg,

> If there's demand for it, I will also put up a windows binary.

As usual, I'd appreciate a Windows build against python 2.7 : )

Thanks
James
Re: [Rdkit-discuss] Lipinski HBD count
Hi Greg,

Greg wrote:
> You actually don't need to add the Hs:
>
> p1 = Chem.MolFromSmarts('[#7,#8;H1]')
> p2 = Chem.MolFromSmarts('[#7,#8;H2]')
> p3 = Chem.MolFromSmarts('[#7,#8;H3]')
> m = Chem.MolFromSmiles('CC(=O)N')
> m2 = Chem.MolFromSmiles('OCC(=O)N')
> def NHOHCount(mol):
>     return len(mol.GetSubstructMatches(p1))+2*len(mol.GetSubstructMatches(p2))+3*len(mol.GetSubstructMatches(p3))
> ...
> >>> NHOHCount(m)
> 2
> >>> NHOHCount(m2)
> 3

I think this system works well in almost all cases : )

However, I had a nagging concern over a couple of 'edge' cases - namely water, and ammonia (and for that matter, the oxonium and ammonium ions). I guess the simple inclusion of P4 = Chem.MolFromSmarts('[#8;H4]') would make sure all cases were covered(?).

Out of interest, I decided to compile a small list of 'normal' and 'edge' case SMILES, and ran it through the MOE descriptor node in KNIME. For all these cases, lip_don behaves as I would expect (tab-separated output included below).

Kind regards
James

SMILES      a_acc  a_don  lip_acc  lip_don
CO          1.0    1.0    1.0      1.0
C(=O)N      1.0    1.0    2.0      2.0
O           1.0    1.0    1.0      2.0
CN          1.0    1.0    1.0      2.0
[O+]        1.0    0.0    1.0      3.0
C[O+]       1.0    0.0    1.0      2.0
[N+]        0.0    0.0    1.0      4.0
C[N+]       0.0    0.0    1.0      3.0
[N-]        0.0    1.0    1.0      2.0
[O-]        0.0    1.0    1.0      1.0
C(=O)[N-]   0.0    1.0    2.0      1.0
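An alternative to adding one SMARTS pattern per hydrogen count is to sum the hydrogens on every N and O atom directly, which covers the H4 case (ammonium) without a separate pattern. This is a minimal sketch under that assumption, not the implementation RDKit itself uses; the function name nhoh_count is hypothetical. Note that for ammonium the extra pattern would presumably need to include nitrogen (#7) rather than only oxygen.

```python
from rdkit import Chem


def nhoh_count(mol):
    """Sum the hydrogens (implicit and explicit) attached to every N and O atom."""
    return sum(atom.GetTotalNumHs()
               for atom in mol.GetAtoms()
               if atom.GetAtomicNum() in (7, 8))


# The two molecules from the quoted message plus the edge cases mentioned above.
for smi in ('CC(=O)N', 'OCC(=O)N', 'O', 'N', '[NH4+]', '[OH3+]'):
    print(smi, nhoh_count(Chem.MolFromSmiles(smi)))
```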
Re: [Rdkit-discuss] Lipinski HBD count
Hi Greg,

Greg wrote:
> For what it's worth: the results here are definitely not correct for the SMILES as provided. Atoms in SMILES that are in square brackets have no implicit Hs, so [N+] actually has zero hydrogens. I guess you actually provided the molecules to MOE in some other form.

Oops - you're quite right - I converted them to MOL format with ChemAxon MolConverter. However, the point about implicit hydrogens for atoms in square brackets had completely passed me by - thanks!

> Output with the SVN version of the RDKit:
>
> Smiles      NOCount  NHOHCount
> CO          1        1
> C(=O)N      2        2
> O           1        2
> CN          1        2
> [OH3+]      1        3
> C[OH2+]     1        2
> [NH4+]      1        4
> C[NH3+]     1        3
> [NH2-]      1        2
> [OH-]       1        1
> C(=O)[NH-]  2        1

Looks great!

Kind regards
James
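The point about square brackets can be checked directly by asking each atom for its hydrogen count. The sketch below assumes a current RDKit build and uses a hypothetical helper name, total_hs; it compares an organic-subset atom, a bare bracket atom, and a bracket atom with its hydrogens written out.

```python
from rdkit import Chem


def total_hs(smiles):
    """Return the hydrogen count of each atom in the parsed SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    return [atom.GetTotalNumHs() for atom in mol.GetAtoms()]


print(total_hs('N'))       # organic-subset atom: hydrogens are filled in
print(total_hs('[N+]'))    # bracket atom: only the hydrogens written in the brackets
print(total_hs('[NH4+]'))  # bracket atom with an explicit H count
```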
Re: [Rdkit-discuss] Beta of Q2 2011 Release Available
Hi Greg,

>> windows binary (py27, please : ) )
>
> It's up on the google download page; hopefully I remembered all the DLLs this time. :-S
> -greg

The binary works a treat - no sign of missing DLLs - thanks!
Re: [Rdkit-discuss] rdkit.Chem.Draw.spingCanvas.py (and py27 aggdraw / cairo help?)
Hi Greg,

Greg wrote:
> The attached .pyd is 32-bit aggdraw build for python2.7 on windows. I tested it very briefly and it seems to work; let me know if you have problems with it.

It works a treat - very much appreciated! My molecules have never looked better : )
Re: [Rdkit-discuss] rdkit.Chem.Draw.spingCanvas.py (and py27 aggdraw / cairo help?)
Dear Greg, Riccardo, et al.

Riccardo wrote:
> I don't know exactly about the other problems, but this one should be related to the version of the installed PIL. If I remember correctly, BGRA raw mode requires PIL 1.1.7.

@Riccardo - Thanks for the advice, Riccardo. I think I was already on 1.1.7 - but maybe an alpha release(?) Anyway, I have now standardised across Python 2.6 and 2.7 with the latest PIL installers from http://www.pythonware.com/products/pil/.

Greg wrote:
> I will try to do an aggdraw build for 2.7. If I succeed, I'll post something.

@Greg - Thanks for the kind offer. I for one would be very pleased to be using aggdraw again (as I think the image quality seems the best). I am pleased to say, however, that it is less critical now that I have worked through my cairo issues! : )

Previously I was getting:

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem, Draw
>>> mol = Chem.MolFromSmiles('c1ccccc1')
>>> AllChem.Compute2DCoords(mol)
>>> im = Draw.MolToImage(mol)
!!!PYTHONW.EXE CRASH!!!

Finally, after a morning of going round in circles (and following red-herring Dependency Walker trails(!)), things are running well, with RDKit now happily calling cairo! I thought it might be useful for others to list the versions of software / DLLs that I finally found to work:

- Windows XP Pro. SP3 (32-bit)
- Python 2.7.1 (http://www.python.org/ftp/python/2.7.1/python-2.7.1.msi)
- PIL 1.1.7 (installer - http://effbot.org/downloads/PIL-1.1.7.win32-py2.7.exe)
- Pycairo-1.8.10.win32-py2.7 (installer - http://ftp.gnome.org/pub/GNOME/binaries/win32/pycairo/1.8/pycairo-1.8.10.win32-py2.7.exe)
- Libcairo-2.dll (get from the following archive: http://wxpython.org/cairo/cairo_1.8.6-1_win32.zip)
- libpng12-0.dll (get from the following archive: http://wxpython.org/cairo/libpng_1.2.34-1_win32.zip)
- Zlib1.dll (get from the following archive: http://wxpython.org/cairo/zlib123-dll.zip)

I then put the 3 DLLs into the C:\Python27\Lib\site-packages\cairo\ folder, and made sure that this is on the system path. The use of the wxPython DLLs seemed to be the key to sorting things out (I certainly tried a few other versions!) - thanks to Alex Matan's blog-post for the instructions (http://electromagnetictelegraph.com/install-cairo-wxpyton-pycairo-python-windows).

The setup under 2.6 was exactly the same - except I used the corresponding 2.6 installer for PIL, and used the Pycairo-1.8.4.win32-py26 from the wxPython site (http://wxpython.org/cairo/pycairo-1.8.4.win32-py2.6.exe) as detailed in Alex's blog.

I can add these instructions to the wiki if you like(?)

Kind regards
James
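When working through a setup like the one listed above, it can help to confirm which of the optional drawing modules Python can locate before calling into RDKit. The following is a rough sketch only: it assumes a Python 3 interpreter (importlib.util is not available in the Python 2.6/2.7 installs discussed in the thread), and the helper name importable is hypothetical. It only checks that a module can be found; it does not load the module, so it would not detect a missing or mismatched DLL of the kind that caused the crash.

```python
import importlib.util


def importable(names=('cairo', 'aggdraw', 'PIL')):
    """Map each module name to True/False depending on whether it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report which of the optional drawing-related modules are visible to this interpreter.
for name, found in importable().items():
    print(name, 'found' if found else 'not found')
```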
[Rdkit-discuss] rdkit.Chem.Draw.spingCanvas.py (and py27 aggdraw / cairo help?)
Dear All,

I am in the process of upgrading to python 2.7 under Windows, and part of this has included moving to the RDKit_2011_03_2 (py27) build. I had previously done most work with earlier versions of RDKit under python 2.6, but have found a problem with calling Draw.MolToImage() with the latest RDKit binary for both py26 and py27:

Traceback (innermost last):
  File "C:\Python26\lib\site-packages\Pmw\Pmw_1_3\lib\PmwBase.py", line 1747, in __call__
    return apply(self.func, args)
  File "C:\Python26\lib\site-packages\pmg_tk\startup\VerMOL.py", line 1188, in <lambda>
    command=lambda s=self:s.draw_ligand(self.modelling_chainlist.listbox, self.ligcanvas, self.smiles, '3D',200,200, 'modelling_lig_image'))
  File "C:\Python26\lib\site-packages\pmg_tk\startup\VerMOL.py", line 2871, in draw_ligand
    im = Draw.MolToImage(mol, size=(x,y))
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\__init__.py", line 71, in MolToImage
    drawer.AddMol(mol,**kwargs)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\MolDrawing.py", line 361, in AddMol
    color=color,width=width,color2=color2)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\MolDrawing.py", line 190, in _drawBond
    dash=self.dash)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\MolDrawing.py", line 169, in _drawWedgedBond
    self.canvas.addCanvasDashedWedge(poly[0],poly[1],poly[2],color=color)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\spingCanvas.py", line 104, in addCanvasDashedWedge
    pts1 = _getLinePoints(p1,p2,dash)
<type 'exceptions.NameError'>: global name '_getLinePoints' is not defined

Not a big problem to sort - I think spingCanvas.addCanvasDashedWedge() should read:

    pts1 = self._getLinePoints(p1,p2,dash)
    pts2 = self._getLinePoints(p1,p3,dash)

on lines 104, 105 instead of:

    pts1 = _getLinePoints(p1,p2,dash)
    pts2 = _getLinePoints(p1,p3,dash)

Anyway, this is only a problem if spingCanvas is being called - which I think only happens as a last resort if aggdraw or cairo aren't found.

So on that note, the reason I was calling spingCanvas was that I don't have a build of aggdraw for python 2.7, and I have found that when cairo/pycairo are available to python 2.7 I get a pythonw.exe Application Error at the point of calling Draw.MolToImage(). Under python 2.6 I thought I would see what would happen if I removed aggdraw to force cairo into play (different version of PIL, different version of cairo - not an ideal comparison!):

  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\__init__.py", line 54, in MolToImage
    canvas = Canvas(img)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\cairoCanvas.py", line 38, in __init__
    imgd = image.tostring("raw","BGRA")
  File "C:\Python26\lib\site-packages\PIL\Image.py", line 516, in tostring
    e = _getencoder(self.mode, encoder_name, args)
  File "C:\Python26\lib\site-packages\PIL\Image.py", line 389, in _getencoder
    return apply(encoder, (mode,) + args + extra)
<type 'exceptions.SystemError'>: unknown raw mode

If it helps, I can follow up with more details on exact versions of DLLs, etc; but for now wondered if:

(a) anybody had a version of aggdraw for windows, built with python 2.7?
(b) or any recommendations for reliable PIL / cairo / pycairo combinations for python 2.7 / windows?

Kind regards
James
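The NameError reported above is the usual symptom of calling a method by its bare name from inside another method: Python then looks for a module-level function and finds none. The toy class below reproduces that pattern. It is a hypothetical stand-in written for this illustration (ToyCanvas and its method names are not the RDKit spingCanvas code), and the point interpolation is only there to give the methods something to return.

```python
class ToyCanvas:
    """Hypothetical stand-in used to illustrate the bug; not the RDKit canvas class."""

    def _get_line_points(self, p1, p2, n_segments):
        """Return n_segments + 1 evenly spaced points from p1 to p2 (inclusive)."""
        (x1, y1), (x2, y2) = p1, p2
        return [(x1 + (x2 - x1) * i / n_segments,
                 y1 + (y2 - y1) * i / n_segments)
                for i in range(n_segments + 1)]

    def dashed_wedge_broken(self, p1, p2, p3):
        # Bare name: looked up as a global, so this raises NameError when called.
        return _get_line_points(p1, p2, 4), _get_line_points(p1, p3, 4)

    def dashed_wedge_fixed(self, p1, p2, p3):
        # Going through self finds the method on the instance.
        return self._get_line_points(p1, p2, 4), self._get_line_points(p1, p3, 4)


canvas = ToyCanvas()
try:
    canvas.dashed_wedge_broken((0, 0), (4, 0), (4, 2))
except NameError as err:
    print('broken version:', err)

pts1, pts2 = canvas.dashed_wedge_fixed((0, 0), (4, 0), (4, 2))
print('fixed version:', len(pts1), len(pts2))
```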
Re: [Rdkit-discuss] Sample RD Files?
Hi Greg,

Thanks for the python-full reply!

> # let's test the reaction to make sure it works.
> # due to a (already reported) bug in the way atom properties are handled, nrxn cannot be directly used,
> # so we use a hack and reparse it:
> nrxn = AllChem.ReactionFromSmarts(AllChem.ReactionToSmarts(nrxn))
> nrxn.Validate()
> # now we can run a molecule through to make sure it works:
> nmol = Chem.MolFromSmiles('c1ccccc1C')
> nps = nrxn.RunReactants((nmol,))
> print Chem.MolToSmiles(nps[0][0])
> # output is: BrCc1ccccc1
>
> Is that what you're looking for?

It certainly allows me to do what I want - which is get a mapped RXN out. And this can even be done with coordinates - which I have added below as a reminder to anyone (which included me until about 10 mins ago!) who had forgotten:

    AllChem.Compute2DCoordsForReaction(nrxn)
    rxnBlock = AllChem.ReactionToRxnBlock(nrxn)

So thanks very much! : ) - and thanks for the reminder about sanitizing products from reactions...

> The molecules that come back from reactions have not been sanitized, so all you need to do is add a call to Chem.SanitizeMol first:
>
> Chem.SanitizeMol(prod)

Kind regards
James
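Putting the pieces from this exchange together, the sketch below runs a reaction, sanitizes each product before using it, and then writes the reaction out as an RXN block with 2D coordinates. The reaction SMARTS is an assumption chosen to reproduce the toluene to benzyl bromide example (the original nrxn was read from a file that is not shown in the thread), and the helper name run_and_sanitize is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Assumed reaction for illustration only: bromination of an aryl methyl group.
rxn = AllChem.ReactionFromSmarts('[c:1][CH3:2]>>[c:1][C:2]Br')


def run_and_sanitize(reaction, reactant):
    """Run a one-reactant reaction and return the unique product SMILES."""
    smiles = set()
    for products in reaction.RunReactants((reactant,)):
        for prod in products:
            # Products come back unsanitized, so sanitize before using them.
            Chem.SanitizeMol(prod)
            smiles.add(Chem.MolToSmiles(prod))
    return sorted(smiles)


toluene = Chem.MolFromSmiles('Cc1ccccc1')
product_smiles = run_and_sanitize(rxn, toluene)
print(product_smiles)

# Add 2D coordinates to the reaction templates, then export as an RXN block.
AllChem.Compute2DCoordsForReaction(rxn)
rxn_block = AllChem.ReactionToRxnBlock(rxn)
print(rxn_block.splitlines()[0])
```

Collecting the SMILES in a set also removes duplicate products, which appear whenever the reactant pattern matches the same site in more than one way.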
[Rdkit-discuss] Problem with ConstrainedEmbed()
Dear All,

I am currently having some problems using AllChem.ConstrainedEmbed() - which I have previously used successfully in version 2010_09_1 (Windows py26 binary). The following example demonstrates the issue:

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> template = Chem.MolFromSmiles('c1cnn(Cc2ccccc2)c1')
>>> mol = Chem.MolFromSmiles('c1ccc(Cn2ncc(-c3ccccc3)c2)cc1')

Now I give the template some 3D coordinates:

>>> AllChem.EmbedMolecule(template)
>>> AllChem.UFFOptimizeMolecule(template)

and finally, try to force an overlay of 'mol' onto 'template':

>>> AllChem.ConstrainedEmbed(mol, template, True)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    AllChem.ConstrainedEmbed(mol, template, True)
  File "C:\RDKit_2010_12_1\rdkit\Chem\AllChem.py", line 294, in ConstrainedEmbed
    rms = AlignMol(mol,core,atomMap=algMap)
RuntimeError: Range Error

I am not getting a ValueError - so I think this shows the substructure match is ok, but wasn't sure where to go to dig into the AlignMol() function...

As the error message above shows, this is running 2010_12_1. I have just tried with 2011_03_1beta1 and 2011_03_2 and get the same thing. In 2010_09_1 the ConstrainedEmbed() passes back an RDKit molecule fine.

Kind regards
James
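For anyone wanting to rerun this report, the sketch below is the same sequence as a standalone script with two defensive additions: an explicit substructure check before embedding, and a try/except around the constrained step so that a failure is reported rather than raised. It assumes a current RDKit release, where the third argument is the useTethers keyword and the RMS to the template is stored on the molecule as the EmbedRMS property. The random seed is an arbitrary choice for reproducibility.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

template = Chem.MolFromSmiles('c1cnn(Cc2ccccc2)c1')
mol = Chem.MolFromSmiles('c1ccc(Cn2ncc(-c3ccccc3)c2)cc1')

# Confirm the template really is a substructure before trying to embed onto it.
has_core = mol.HasSubstructMatch(template)
print('template matches:', has_core)

# Give the template 3D coordinates (fixed seed so repeated runs agree).
AllChem.EmbedMolecule(template, randomSeed=42)
AllChem.UFFOptimizeMolecule(template)

try:
    AllChem.ConstrainedEmbed(mol, template, useTethers=True)
    print('RMS to template:', mol.GetProp('EmbedRMS'))
except (ValueError, RuntimeError) as err:
    # ValueError: no match or embedding failed; RuntimeError: the error class reported above.
    print('constrained embedding failed:', err)
```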
Re: [Rdkit-discuss] Beta of Q1 2011 Release Available
Hi Greg - great news about the beta / new functionality! Greg wrote: This morning I tagged the beta for the Q1 2011 (2011.03 in the new numbering) release in svn: http://rdkit.svn.sourceforge.net/viewvc/rdkit/tags/Release_201 1_03_1beta1/ and uploaded a source distribution to the google code site: http://code.google.com/p/rdkit/downloads/detail?name=RDKit_201 1_03_1beta1.tgz If there's demand for it, I will also put up a windows binary. As usual, yes, please for a python 2.6 windows binary if possible : ) Kind regards James __ PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. The Vernalis Group of Companies Oakdene Court 613 Reading Road Winnersh, Berkshire RG41 5UA. Tel: +44 118 977 3133 To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.. __ -- Create and publish websites with WebMatrix Use the most popular FREE web apps or write code yourself; WebMatrix provides all the features you need to develop and publish your website. 
[Rdkit-discuss] Beta of Q4 2010 release up
Hi Greg,

Greg wrote:
> If there's demand for it, I will also put up a windows binary. As usual: if no show-stopper bugs appear, I will do the release itself in about a week.

I would appreciate a Windows binary to check out the beta release - but if it is just me, I can obviously wait for the full release (presuming a Windows binary would be available at that point?)

Kind regards
James
Re: [Rdkit-discuss] Canonical smiles for medium and large rings?
Hi Greg,

On Sat, Dec 18, 2010 at 6:27 AM, Greg Landrum greg.land...@gmail.com wrote:
> I just checked in a set of changes that should get this (mostly) working correctly. Here's a demonstration with Geldanamycin:
>
>     In [7]: smi=r'NC(=O)o...@h]1c(/C)=C/[...@h](C)[C@@H](O)[C@@H](OC)c...@h](C)C\C2=C(/OC)C(=O)\C=C(\NC(=O)C(\C)=C\C=C/[C@@H]1OC)C2=O'
>     In [8]: print Chem.CanonSmiles(smi)
>     COC1=C2C[C@@H](C)c...@h](OC)[...@h](O)[C@@H](C)/C=C(\C)[...@h](OC(N)=O)[C@@H](OC)/C=C\C=C(/C)C(=O)NC(=CC1=O)C2=O

Thanks for looking into this so quickly!

> It would be *really* useful to have some more real-world cases like this one to use as tests. So if you happen to have others you can send I would be quite happy to have them.

On that note, I have added a comment to the bug tracker (https://sourceforge.net/tracker/?func=detail&aid=3139534&group_id=160139&atid=814650) - but was not sure how to attach a file (eg sdf) there, so apologies for it ending up on more lines than I intended... Also, I logged in with my google account, but it looks like it may not be clear who it is!

The first two examples are two marine natural products that only differ in the geometry of the double bond in the medium ring. The final example is a cis- analogue that I synthesised during my PhD, for which a crystal structure was also obtained. The stereochemistry in these systems is 'challenging' to say the least, so I thought they would make reasonable test cases. I should say that even for the cis- double bond cases, RDKit does a rather ugly job of the 2D depiction - but I am not sure if other depictors will perform much better...

On a related note, I was keen to manually double-check the stereochemistry that had been assigned to each of the chiral centres (particularly the ones involving the 9-5 ring connections, as these are potentially troublesome), and found myself wishing there was a way to easily label a 2D depiction of the molecules with the atom ID. What I ended up doing was the following:

1. Getting the R/S info + atomIdx back from RDKit (example output):

       Chem.FindMolChiralCenters(mol)
       [(3, 'R'), (7, 'R'), (8, 'S'), (9, 'R'), (11, 'R'), (18, 'R'), (24, 'R')]

2. Opening the molfile in a program where I know how to label with atom IDs (pymol).

3. Checking which atom is which manually (I had to add 1 to the RDKit atomIdx values as they start at 0), then double-checking against reference values.

RDKit performed admirably - but I presume this is dependent on the quality of the wedge info coming in from the SDF(?)

Kind regards
James
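The manual labelling step described above can probably be avoided in more recent RDKit versions, where atoms can carry a note that the 2D drawer renders next to them. The sketch below is only an illustration: it assumes a recent RDKit (the 'atomNote' property is not available in the releases discussed in this thread) and uses alanine as a small stand-in molecule.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D

# Alanine is used here only as a small stand-in with one stereocentre.
mol = Chem.MolFromSmiles('C[C@H](N)C(=O)O')

# (atom index, CIP label) pairs, as in the message above.
centers = Chem.FindMolChiralCenters(mol)
print(centers)

# Work on a copy and attach each atom's 0-based index as a drawing note.
labelled = Chem.Mol(mol)
for atom in labelled.GetAtoms():
    atom.SetProp('atomNote', str(atom.GetIdx()))

# Render to SVG; the notes appear next to the atoms in the depiction.
rdDepictor.Compute2DCoords(labelled)
drawer = rdMolDraw2D.MolDraw2DSVG(300, 300)
drawer.DrawMolecule(labelled)
drawer.FinishDrawing()
svg = drawer.GetDrawingText()
```

Because the notes use RDKit's own 0-based indices, they line up directly with the output of FindMolChiralCenters(), with no need to add 1.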
[Rdkit-discuss] Canonical smiles for medium and large rings?
Dear All,

I have been investigating an issue that a colleague of mine identified. He was working with the RDKit Canon Smiles node in Knime, and found that for the natural product Geldanamycin, the double-bond geometry information was being lost during canonicalisation. I repeated this result outside of knime:

    from rdkit import Chem
    from rdkit.Chem import AllChem
    smi = r'NC(=O)o...@h]1c(/C)=C/[...@h](C)[C@@H](O)[C@@H](OC)c...@h](C)C\C2=C(/OC)C(=O)\C=C(\NC(=O)C(\C)=C\C=C/[C@@H]1OC)C2=O'
    AllChem.CanonSmiles(smi)
    'COC1=C2C[C@@H](C)c...@h](OC)[...@h](O)[C@@H](C)C=C(C)[...@h](OC(N)=O)[C@@H](OC)C=CC=C(C)C(=O)NC(=CC1=O)C2=O'

The simpler example below may be better:

    smi1 = r'O1CC/C=C\CCCC1'   # cyclic ether
    smi2 = r'OCC/C=C\CCCC'     # corresponding acyclic alcohol
    AllChem.CanonSmiles(smi1)
    'C1C=CCCOCCC1'             # stereochemistry lost
    AllChem.CanonSmiles(smi2)
    'CCCC/C=C\\CCO'            # stereochemistry retained

So, I am guessing that double bonds in rings are being 'ignored'(?) by the canonicaliser? For 'classic' aliphatic systems, double bonds in 3-7-membered rings can only sensibly exist in the cis orientation, so 'ignoring' them would be OK. However, for 8-membered rings and above, both cis and trans are certainly possible, so it becomes more important to keep track - particularly if canonical SMILES are being used to check for unique structures, as my colleague was doing with the geldanamycin example above.

Any thoughts / suggestions are much appreciated as always!

Kind regards
James
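The ring-size argument in the message above can be checked with a pair of small test molecules. This is a sketch under the assumption of a recent RDKit, where double-bond stereochemistry is, as far as I know, only kept for rings of eight or more atoms. The cyclooctene and cyclohexene SMILES are illustrative choices and do not come from the thread.

```python
from rdkit import Chem

# cis- and trans-cyclooctene: both geometries are chemically sensible.
cis8 = Chem.CanonSmiles(r'C1CCCC/C=C\C1')
trans8 = Chem.CanonSmiles(r'C1CCCC/C=C/C1')

# Cyclohexene: only cis is sensible, so the bond directions carry no information.
ring6 = Chem.CanonSmiles(r'C1CC/C=C\C1')

print(cis8, trans8, ring6)
```

If the canonicaliser keeps ring stereochemistry for medium rings, the first two strings differ, which is the property needed when canonical SMILES are used to detect duplicates.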
[Rdkit-discuss] Handling certain sterochemistry in reactions
Dear All,

I wonder if anybody can help with the following? I am trying to figure out how to handle double-bond stereochemistry in reactions when the stereochemistry is involved with the bond being made or broken. Hopefully this example will explain better than that sentence(!):

    rxn = AllChem.ReactionFromSmarts('[c:1][Cl,Br,I].[#6:2][B]>>[*:1][*:2]')
    mol1 = Chem.MolFromSmiles('c1ccccc1Br')
    mol2 = Chem.MolFromSmiles(r'C\C=C\B(O)O')
    ps = rxn.RunReactants((mol1, mol2))
    Chem.MolToSmiles(ps[0][0], True)
    ---> 'CC=Cc1ccccc1'   (stereochemical information lost)

whereas using

    mol2 = Chem.MolFromSmiles(r'C\C=C\c1ccccc1B(O)O')

gives

    ---> 'C/C=C/c1c(-c2ccccc2)cccc1'   (stereochemical information retained)

Not quite the same, but I have read through some related SMIRKS info here: http://www.daylight.com/dayhtml/doc/theory/theory.smirks.html. However, this explains how to handle stereo centres and stereo bonds in reactions when they are explicitly defined on both sides of the reaction. I guess what I am looking for is a shortcut for saying 'retain' or 'invert' stereochemistry at the reacting centre (sp3) or at the bond attached to the reacting centre (sp2)...

Having got to the end of explaining that, I am thinking that the way I should handle this is to check for 'problem' reactants and pass them to a more specific rSMARTS when required!

Kind regards
James
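The closing idea in the message above, checking for 'problem' reactants before choosing a reaction definition, could be sketched roughly as follows. The SMARTS pattern and the function name are hypothetical, chosen only for illustration, and the check only looks at the double bond on the boron-bearing carbon.

```python
from rdkit import Chem

# Hypothetical pattern: a carbon that is bonded to boron and is part of a C=C bond.
BORON_ON_ALKENE = Chem.MolFromSmarts('[#6;$([#6]=[#6])][B]')


def has_stereo_at_reacting_alkene(mol):
    """True if the boron-bearing carbon sits on a double bond with defined E/Z stereo."""
    for c_idx, _b_idx in mol.GetSubstructMatches(BORON_ON_ALKENE):
        for bond in mol.GetAtomWithIdx(c_idx).GetBonds():
            if (bond.GetBondType() == Chem.BondType.DOUBLE
                    and bond.GetStereo() != Chem.BondStereo.STEREONONE):
                return True
    return False


vinyl_e = Chem.MolFromSmiles('C/C=C/B(O)O')      # defined E geometry next to boron
vinyl_unspec = Chem.MolFromSmiles('CC=CB(O)O')   # geometry not specified
aryl = Chem.MolFromSmiles('OB(O)c1ccccc1')       # no alkene at the reacting carbon

flags = [has_stereo_at_reacting_alkene(m) for m in (vinyl_e, vinyl_unspec, aryl)]
print(flags)
```

Reactants flagged in this way could then be routed to a reaction definition that spells out the alkene geometry on both sides, while everything else goes through the general pattern.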
Re: [Rdkit-discuss] Beta of RDKit knime nodes available
Hi Greg and Thorsten,

Thorsten wrote:
> On the other hand, 4000 rows should not take that long in KNIME. How much time does it currently take?

Greg wrote:
> I just did 1000 rows on my macbook. Assuming I'm reading the knime log correctly, that took about a minute.

Thanks for testing this out, Greg. I must confess, I didn't wait for the hierarchical clustering to finish for the 4000! Going back and selecting a random 1000-molecule subset, I reproduce your result of ~1 min (I get 67 secs). If I then go to 2000, it takes 520 secs - so to me this looks like cubic complexity, which is what the documentation for the node states (this would mean about 1 hr for my original 4000...).

For completeness: this result was with the Hierarchical Clustering (DistMatrix) node set with 'Tanimoto' similarity and 'Complete Linkage' for cluster comparison. Changing the comparison to 'Single Linkage' did not reduce the time. Interestingly, the documentation for the 'standard' Hierarchical Clustering (ie non-distance matrix) node states that it operates with n-squared complexity.

I guess other clustering algorithms available in knime must scale better than cubically as well (k-means, fuzzy c-means?) - but as far as I can see they don't currently operate on distance matrices (or directly on bit vectors). If they could, then this may be a solution; or implementing the Murtagh algorithm (I am guessing the scaling is below cubic, from my recollection of the speeds observed in rdkit).

Kind regards
James
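The scaling estimate in the message above can be reproduced from the two timings quoted there (67 s for 1000 molecules, 520 s for 2000). The short calculation below assumes a simple power law, t = k * n^p, which is an idealisation rather than anything stated by the node documentation.

```python
import math

# Timings quoted in the message, in seconds.
t_1000 = 67.0
t_2000 = 520.0

# Fit t = k * n**p through the two points: p = log(t2/t1) / log(n2/n1).
exponent = math.log(t_2000 / t_1000) / math.log(2000 / 1000)

# Extrapolate to the original 4000-compound set.
t_4000_est = t_2000 * (4000 / 2000) ** exponent

print(round(exponent, 2), round(t_4000_est / 60.0), "minutes")
```

The fitted exponent comes out just under 3, and the extrapolated time for 4000 compounds is a little over an hour, consistent with the estimate in the message.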
[Rdkit-discuss] Beta of RDKit knime nodes available
Dear Greg (and, of course, Thorsten and Bernd!)

Great job on the Knime nodes! I have been giving these a go and am impressed (and excited about the future development!). A couple of observations / comments / questions:

1. I have observed that sometimes the FP node seems to generate blank fingerprints (it doesn't appear to just be the rendering - eg they are still blank if I swap to the 'Bit Scratch' renderer as well). I have mainly been trying the default Morgan FPs, and find that if I reset the node and re-run, the FP is still blank. If, however, I swap the node to eg atompair, run, then swap back to Morgan, it seems to work... I am running knime 2.2.2 on Windows 32-bit.

2. The next point is probably down to cheminformatics / knime naivety, but I must confess I am struggling a little to cluster compounds based on the FP... I have used the 'Distance Matrix Calculate' node (with Tanimoto similarity) to get a matrix that can be used by the 'Hierarchical Clustering (DistMatrix)' or 'k-Medoids' nodes. However, both of these appear to perform VERY slowly for a set of ~4000 compounds. I also attempted to cluster on the fingerprints directly, using the Neighborgrams nodes - but must confess I am some way off understanding what I am doing! My limited experience of using the RDKit functionality to cluster compounds and eg select a representative set (based on the FP Tanimoto distances and the Murtagh clustering) was that it performed rather rapidly. Is there the intention to expose this functionality in knime (or is the functionality already there and I just don't know how)?

3. Any plans for Windows 64-bit support?

4. I would be interested to know what the team views as the next priorities - property calcs, 3D conformations, pharmacophores, rendering? So much great stuff to choose from! :-)

Kind regards
James
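For readers who want to try fingerprint clustering directly in Python rather than through knime, here is a minimal sketch. Note that it swaps in RDKit's Butina clustering, not the Murtagh hierarchical code mentioned in the message, because that is the interface I am confident about. The six SMILES, the fingerprint settings, and the 0.7 distance cutoff are arbitrary illustrative choices.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina

# Small illustrative set: three alcohols and three aromatics.
smiles = ['CCO', 'CCCO', 'CCCCO', 'c1ccccc1', 'Cc1ccccc1', 'CCc1ccccc1']
mols = [Chem.MolFromSmiles(s) for s in smiles]
fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=1024) for m in mols]

# Build the condensed lower-triangle distance list (1 - Tanimoto)
# in the order Butina.ClusterData expects.
dists = []
for i in range(1, len(fps)):
    sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
    dists.extend(1.0 - s for s in sims)

# Each cluster is a tuple of molecule indices; the first index is the centroid.
clusters = Butina.ClusterData(dists, len(fps), 0.7, isDistData=True)
print(clusters)
```

Taking the first index of each cluster gives one simple way to pick a representative subset.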
Re: [Rdkit-discuss] How can I escape to this error
Hi Greg,

Apologies for resurrecting a rather old thread, but I have been investigating the Q32010_1beta1 release on a set of commercial amines (from ACD) and came across the 'hypervalent P' issue as well.

Greg wrote:
> To continue and try to answer Christian's question: it is currently impossible to really work with this hypervalent molecule in the RDKit. The only real solution is to tell the RDKit that P is allowed to have 7 substituents. If you really want to do this you can edit the file $RDBASE/Code/GraphMol/atomic_data.cpp and change the allowed valence list for P from "3 5" to "3 5 7". After you do this, you will need to rebuild the code.

With the new release currently in beta, I wondered whether this would be a good time to consider if the change you suggest above for P should make it into the release code(?) Having said that, I expect that your comment "it is currently impossible to really work with this hypervalent molecule in the RDKit" suggests that a robust solution is not as simple as just changing the allowed valence list...

Anyway, what I was finding from my list of amines was that the hexafluorophosphate counterion [PF6]- was triggering the error. Not a particularly common counterion, so I can certainly live without it(!) - but not particularly esoteric either :-)

Kind regards
James
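When screening a supplier set like the one described above, it can help to separate parsing from sanitization so that a single troublesome counterion is reported rather than silently dropped. The sketch below is illustrative only: the helper name try_sanitize is hypothetical, and whether hexafluorophosphate sanitizes cleanly depends on the RDKit version, so the code makes no assumption either way.

```python
from rdkit import Chem


def try_sanitize(smiles):
    """Parse without sanitization, then report whether sanitization succeeds.

    Returns (mol, ok). On failure the molecule is returned unsanitized,
    or only partially sanitized, so it should be used with care.
    """
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return None, False
    try:
        Chem.SanitizeMol(mol)
        return mol, True
    except Exception as exc:  # valence problems surface here
        print('sanitization failed for %s: %s' % (smiles, exc))
        return mol, False


pf6, pf6_ok = try_sanitize('F[P-](F)(F)(F)(F)F')
ethanol, ethanol_ok = try_sanitize('CCO')
print(pf6_ok, ethanol_ok)
```

Entries that come back with ok set to False can be collected for manual inspection or counterion stripping.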
[Rdkit-discuss] Cleaning SD files
Dear All,

Today I have spent some time processing a freely-available SDF that contains many compounds and melting points / ranges (http://www.mdpi.org/molmall/mdpi1-51sd.zip). The reason for doing this is that I wanted to implement a melting-point predictor following the work of Andreas Bender (J. Chem. Inf. Model. 2005, 45, 581-590) and more recently Reifeng Liu at AZ (J. Chem. Inf. Model. 2008, 48, 981-987).

I have attached the python script that I have at the moment (a) in case it is of some use to anybody else, (b) in the hope that I can improve my python and rdkit abilities through any suggested alterations (I'm sure there are many!), and (c) to form the basis of a couple of questions.

At the moment, the script just runs through each compound, checks whether the molecule is valid and, if so, notes how many components it has and whether any of the atoms are outside of the desired list. These two results are then written out to a new SDF. I am then using this to make sure my data set contains only compounds that I would say are 'reasonable' to build a melting-point model with.

Now for the questions:

1. In RDKit, has the 'cleaning / washing / salt-stripping' of molecules already been formalised based on a set of rules, etc?

2. When identifying compounds that contain a non-allowed atom type, why do I find that the SMARTS definition [!H;!C;!N;!O;!F;!S;!Cl;!Br;!I] gives unexpected results, but [!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53] works as I would expect?

Kind regards
James

Attachment: inorg_or_mix.py
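On the second question above, one likely explanation (offered here as background, not as the list's reply) is that element symbols in SMARTS are case-sensitive about aromaticity: C, N, O and S match only aliphatic atoms, so aromatic atoms satisfy !C, !N and so on, and a bare H inside a negated bracket is generally read as a hydrogen-count primitive rather than as the element. Atomic-number primitives such as #6 ignore aromaticity. The sketch below shows the difference on pyridine, along with the component count described in the message; the molecules are illustrative stand-ins.

```python
from rdkit import Chem

# Shortened versions of the two patterns from the question.
symbol_patt = Chem.MolFromSmarts('[!C;!N;!O]')     # symbols: aliphatic atoms only
number_patt = Chem.MolFromSmarts('[!#6;!#7;!#8]')  # atomic numbers: any C, N or O

pyridine = Chem.MolFromSmiles('c1ccncc1')

# Aromatic c and n are "not aliphatic C/N/O", so the symbol form flags them.
symbol_hit = pyridine.HasSubstructMatch(symbol_patt)
# Every atom is element 6 or 7, so the atomic-number form finds nothing.
number_hit = pyridine.HasSubstructMatch(number_patt)

# Counting components, as the script in the message does for each record.
salt = Chem.MolFromSmiles('CC[NH3+].[Cl-]')
n_components = len(Chem.GetMolFrags(salt))

print(symbol_hit, number_hit, n_components)
```

This is why a filter written with atomic numbers behaves as intended on aromatic compounds, while the symbol-based one flags nearly every aromatic molecule.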
[Rdkit-discuss] Align SDF to user-supplied template coordinates (2D)
Dear All,

I am currently struggling with something that I expect is very easy to solve (I have just got back from holiday, so I think my brain isn't quite in the zone!)

I am trying to read in an SDF and align each molecule to a template scaffold provided in molfile format. I want to supply a tool that allows a user to sketch in a template and view their SDF entries in 2D, all aligned (where there is a match) to the supplied template. I have essentially followed this entry in the Chemistry Toolkit Rosetta - http://ctr.wikia.com/wiki/Align_the_depiction_using_a_fixed_substructure - which in essence is pretty much the same as the info in the RDKit documentation.

However, when I use pre-supplied 2D coordinates for the template, rather than generating them from the first substructure match (as in the CTR example), I find that the alignment proceeds as required, but there is a mismatch between the scaling of the bond lengths in the aligned substructure and the rest of the molecule...

Is there a way to 'scale' the molecules according to the template mol (or alternatively scale the template according to the RDKit default)? Or is it that I am tackling this in the wrong way?

Kind regards
James
Re: [Rdkit-discuss] Align SDF to user-supplied template coordinates (2D)
Thanks Greg,

Greg wrote:
> Ah yes, the depictions that you get look rather silly, no?

Yes they do!

> You're doing it correctly; no worries there. The problem is that most pieces of chemical drawing software generate 2D coordinates for molecules such that a C-C single bond is 1.0A long. The RDKit, on the other hand, sets the C-C single bond to be 1.5A long. The consequence is a depiction with a core that's smaller than it should be.

I was using ISISDraw for sketching the core motif. It seems that the single-bond length (from the Origdraw settings) is ~0.825 A (!) So I modified the scalar to 1.818 (1.5/0.825) in your code and it works beautifully!

> It should be possible to specify the scale used in the RDKit depictions so that these contortions are not necessary. I will put a feature request in for this and get it in the next version.

Thanks for this - I certainly think this would be a worthwhile feature. What I may implement in the meantime is running through all the bonds of type SINGLE in the core molecule, taking the average, then using 1.5/(average) as the scalar to protect against differing user settings!

Kind regards
James
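The averaging idea at the end of the message above might look something like the sketch below. It is not the code exchanged in the thread: the helper names are hypothetical, the cyclohexane core is a stand-in for a user-sketched template, and the 0.55 shrink factor simply imitates a sketcher that draws shorter bonds (0.825 / 1.5).

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Geometry import Point3D


def mean_single_bond_length(mol, conf_id=-1):
    """Average length of all SINGLE bonds in the given conformer."""
    conf = mol.GetConformer(conf_id)
    lengths = []
    for bond in mol.GetBonds():
        if bond.GetBondType() == Chem.BondType.SINGLE:
            p1 = conf.GetAtomPosition(bond.GetBeginAtomIdx())
            p2 = conf.GetAtomPosition(bond.GetEndAtomIdx())
            lengths.append((p1 - p2).Length())
    if not lengths:
        raise ValueError('template has no single bonds to measure')
    return sum(lengths) / len(lengths)


def rescale_coords(mol, factor, conf_id=-1):
    """Multiply every atom coordinate in the conformer by factor (in place)."""
    conf = mol.GetConformer(conf_id)
    for i in range(mol.GetNumAtoms()):
        p = conf.GetAtomPosition(i)
        conf.SetAtomPosition(i, Point3D(p.x * factor, p.y * factor, p.z * factor))


# Stand-in for a user-sketched core with non-RDKit bond lengths.
core = Chem.MolFromSmiles('C1CCCCC1')
AllChem.Compute2DCoords(core)
rescale_coords(core, 0.55)

# Bring the core to the 1.5 A single-bond length quoted in the thread.
factor = 1.5 / mean_single_bond_length(core)
rescale_coords(core, factor)

# Depict a molecule with its core atoms pinned to the rescaled template.
mol = Chem.MolFromSmiles('CC1CCCCC1O')
AllChem.GenerateDepictionMatching2DStructure(mol, core)
```

Measuring the template rather than hard-coding 1.818 means the same code should cope with sketches from different drawing programs or user settings.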
Re: [Rdkit-discuss] Reading Molfiles with 'ambiguous' 5-membered aromatics
Dear All, It's been a couple of weeks since Greg first helped me with this, and after some further help I agreed that I would do my best to summarise things for the benefit of the Group. The attached file 'sanifix3.py' was provided to me by Greg, and essentially does exactly what I (thought I) wanted - ie if required, 'cleans' up an input molecule by modifying aromatic nitrogen-containing ring systems until a 'sanitizable' form is generated. However, having tested this a bit further, I found that N-containing heteroaromatics (which I originally posted the question about) are only one of many possible issues when dealing with automated atom- and bond-typing from PDB files! So taking this approach would require a significantly larger set of 'rules' to cover all possible problems (I'm sure many people more experienced than me will have been aware of this for a long time!). As Greg said: Figuring out the correct chemistry for a pdb ligand is one of those challenges at I wouldn't dream of attempting. Between the various sources of ligand structures out there you can probably find omsething at least halfway acceptable. For in house stuff, I would assume that you can use the registry number to get a smiles or mol block, right? You could use that with the rdkit substructure matching code to test the pymol-assigned structures. And indeed, this is the way that I ended-up going for in-house structures - a script that extracts our corporate ID from the PDB file and searches our database to return the SMILES. Then (again, thanks to Greg for more help here, and steering me away from some clumsy usage of ConstrainedEmbed!) a substructure match is conducted between an RDKit mol from the SMILES (refered to as 'db_mol' in the function below), and the original ligand. The main point here is to convert the original ligand structure to a set of non-aromatic atoms joined by 'unspecified' bond-types. 
Below is the excerpt from what I am using with PyMOL. 'molfile3D' is a temporary molfile that has been created using the PyMOL 'save' command; it gets converted to the required 'connectivity substructure' that carries the 3D coordinates we will need later:

    from rdkit import Chem

    def make3DTemplate(molfile3D):
        # Read without sanitization so that doubtful atom/bond typing is tolerated
        mol = Chem.MolFromMolFile(molfile3D, False)
        # Strip aromaticity and bond orders, leaving connectivity only
        for atom in mol.GetAtoms():
            atom.SetIsAromatic(False)
        for bond in mol.GetBonds():
            bond.SetBondType(Chem.rdchem.BondType.UNSPECIFIED)
        return mol

Then once we have this '3D template', the substructure match can be conducted for the molecule built from the database SMILES string (db_mol). If the match is successful, the original 3D coordinates for the atoms in the 'template' are then applied back to a conformer of our new molecule. Finally, this new molecule + conformation is returned as the molblock, which I then read back in PyMOL to give a 'sanitized' version of the bound ligand for any in-house crystal structure:

    def outputMolBlock(db_mol, template_mol):
        matches = db_mol.GetSubstructMatches(template_mol)
        if not matches:
            raise ValueError('no substruct match')
        if len(matches) > 1:
            print('warning! more than one isomorphism found!')
        # db_mol must already carry a conformer whose coordinates get overwritten
        db_conf = db_mol.GetConformer()
        template_conf = template_mol.GetConformer()
        match = matches[0]
        # This sets the 3D coordinates for the matched atoms
        for i, mIdx in enumerate(match):
            db_conf.SetAtomPosition(mIdx, template_conf.GetAtomPosition(i))
        db_conf.Set3D(True)
        return Chem.MolToMolBlock(db_mol)

It wouldn't now be too much of a leap(?) to extend the same methodology to public PDB structures - using the LigandExpo SDF.
See this post from Noel on Blue Obelisk for background: http://blueobelisk.shapado.com/questions/how-to-get-an-experimental-ligand-structure-from-the-pdb

Also, just for interest - I am using cx_Oracle to connect to our corporate database from Python, which is now allowing me to add a few extra bits - like flagging up to people if the in-house structure they have just opened has been previously crystallised in any other targets, etc, etc. If anybody is trying to do similar, but has not used cx_Oracle, then give me a shout and I will see if I can help (although SQL is definitely also on the list of things I know only barely enough about!).

Kind regards
James
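For anyone who wants to try the template idea without PyMOL or a database, here is a minimal sketch of the same two steps. It is only an illustration under stated assumptions: the 'ligand' is an embedded 3-methylpyridine standing in for the molfile saved from PyMOL, the 'database' molecule is the same compound written with a different atom order, and the helper names make_template and transfer_coords are made up for this example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def make_template(mol_block):
    """Reduce a mol block to bare connectivity: no aromatic flags, no bond orders."""
    template = Chem.MolFromMolBlock(mol_block, sanitize=False)
    for atom in template.GetAtoms():
        atom.SetIsAromatic(False)
    for bond in template.GetBonds():
        bond.SetIsAromatic(False)
        bond.SetBondType(Chem.rdchem.BondType.UNSPECIFIED)
    return template


def transfer_coords(db_mol, template):
    """Copy the template coordinates onto the matching atoms of db_mol."""
    match = db_mol.GetSubstructMatch(template)
    if not match:
        raise ValueError('no substruct match')
    db_conf = db_mol.GetConformer()
    template_conf = template.GetConformer()
    # match[i] is the db_mol atom index that corresponds to template atom i
    for template_idx, db_idx in enumerate(match):
        db_conf.SetAtomPosition(db_idx, template_conf.GetAtomPosition(template_idx))
    db_conf.Set3D(True)
    return match


# Assumption: stand-in for the ligand exported from PyMOL (3D coordinates)
ligand = Chem.MolFromSmiles('Cc1cccnc1')
AllChem.EmbedMolecule(ligand, randomSeed=42)
template = make_template(Chem.MolToMolBlock(ligand))

# Assumption: stand-in for the database structure, with a different atom order
db_mol = Chem.MolFromSmiles('n1cccc(C)c1')
AllChem.Compute2DCoords(db_mol)  # gives db_mol a conformer to overwrite
match = transfer_coords(db_mol, template)
```

After the call, db_mol keeps its own (sanitized) bond orders and aromaticity but carries the ligand's 3D coordinates, which is the property the approach relies on.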
[Rdkit-discuss] Interacting with molecules in PyMOL
Dear All,

I am trying to work out the best way to accomplish some tasks involving RDKit, using PyMOL as an interface, and would appreciate some help. I would like to be able to start from a PDB file of a ligand-bound crystal structure loaded in PyMOL and be able to 'virtually' build some analogues - initially just simple substitutions - and visually inspect the results.

(1) So my first question is - having started PyMOL with the -R option, is there an easy or recommended way of transferring molecules from PyMOL to RDKit? I can accomplish this by writing molfiles to a temporary file, but wondered if I am creating work, if eg RDKit could automatically create Mol objects from non-biopolymer atoms in PyMol(?). ie it would be nice if:

    from rdkit import Chem
    from rdkit.Chem import PyMol
    v = PyMol.MolViewer()
    # Invented function to create an RDKit mol object
    mol = v.GetAtomsAsMol(selection)

(2) Once the ligand is converted to an RDKit mol object (by whatever means) I want to enumerate some libraries of virtual products - eg choosing an atom in PyMol as the attachment point, then running a set of reactions to get products with a set of substituents added. In principle I think this is quite straightforward; however, what I am struggling with is a mechanism to 'freeze' the 3D coordinates of the original ligand atoms, but still be able to use RDKit to generate sensible 3D positions for the newly added atoms so that the products can be passed back to PyMOL and minimised in situ (if required) using 'mengine.exe'. Am I looking at this the wrong way, and should I actually try aligning the virtual products back on the starting ligand conformation?

(3) Apologies that this last point is maybe a bit off-topic, but I wondered if anyone has an opinion as to whether MMTK is the way to go for 'simple' minimisations of modified ligands bound to proteins? (I don't have any real experience with MMTK...)

Kind regards
James
Re: [Rdkit-discuss] Interacting with molecules in PyMOL
Dear Greg,

Thanks for your very rapid response - 'AllChem.ConstrainedEmbed' was just what I was looking for!

Kind regards
James
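As a rough illustration of what AllChem.ConstrainedEmbed does for the 'freeze the core, build the rest' problem, here is a small PyMOL-free sketch. The molecules are assumptions chosen only for the example: an embedded phenol plays the role of the bound ligand core, and 2-phenoxyethylamine is the virtual product to be built on top of it.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Assumption: a 3D phenol stands in for the ligand core taken from the crystal structure
core_h = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1O'))
AllChem.EmbedMolecule(core_h, randomSeed=42)
core = Chem.RemoveHs(core_h)  # heavy-atom core; the conformer is kept

# Assumption: a hypothetical analogue that contains the core as a substructure
analogue = Chem.AddHs(Chem.MolFromSmiles('NCCOc1ccccc1'))

# Embed the analogue with the core atoms tethered to the core coordinates
embedded = AllChem.ConstrainedEmbed(analogue, core, True)

# ConstrainedEmbed stores the RMSD between the core and the matching product atoms
rms = float(embedded.GetProp('EmbedRMS'))
core_atoms = embedded.GetSubstructMatch(core)
print('core RMSD after embedding:', rms)
print('product atoms matching the core:', core_atoms)
```

The third argument (useTethers) keeps the core atoms close to the reference coordinates during the force-field clean-up rather than fixing them exactly, so the reported RMSD is small but not necessarily zero.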
Re: [Rdkit-discuss] Interacting with molecules in PyMOL
I just wanted to quickly update the List on how I've got on with this, in case it is of use / interest to others. I followed Greg's advice and did the following:

1. Exported molfile from PyMOL
2. Read into RDKit
3. Read in an SDF of already-constructed molecules based on the core (could have built the products in RDKit, but the SDF was already available!)
4. Iterated over the objects in the supplier to do the AllChem.ConstrainedEmbed as discussed, then loaded the results into PyMOL

NOTE - Because the molecules weren't built in RDKit, I couldn't rely on the atom numbering when read into PyMOL (maybe this can't be relied on anyway?). So I ran mol.GetSubstructMatch(core) for each molecule to get the aligned product atom IDs that matched the core. I then flagged these in PyMOL with flag 3 [Fixed Atoms (no movement allowed)] (flag 2 - harmonically constrained - may be better..?) so that I could subsequently run the mengine 'clean' command in PyMOL to tidy up the UFF output without allowing the template to move:

    from rdkit import Chem
    from rdkit.Chem import AllChem
    from rdkit.Chem import PyMol

    v = PyMol.MolViewer()
    core = Chem.MolFromMolFile(mol_filepath)
    supplier = Chem.SDMolSupplier(sdf_filepath)
    for n, mol in enumerate(supplier):
        mol = Chem.AddHs(mol)
        # Embed the product with the core atoms tethered to the core coordinates
        newMol = AllChem.ConstrainedEmbed(mol, core, True)
        name = 'mol' + str(n)
        # Product atoms that match the core, as a comma-separated id list
        fix = ','.join([str(idx) for idx in mol.GetSubstructMatch(core)])
        v.ShowMol(newMol, name=name, showOnly=False)
        selection = '(' + name + ' and (id ' + fix + '))'
        v.server.do('flag 3, ' + selection + ', set')

Kind regards
James
Re: [Rdkit-discuss] [Rdkit-announce] Q2 2010 Release
Congratulations on the release, Greg! I am really a very recent adopter of RDKit, but even in the short time I have been using it I have been amazed at the quality and depth of functionality! Please keep up the good work, and I hope I can continue to help a tiny amount in the only way I know how - by selfishly requesting new features :)

Kind regards
James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Wed 30/06/2010 21:16
To: RDKit Discuss; RDKit Developers List; rdkit-annou...@lists.sourceforge.net
Subject: [Rdkit-announce] Q2 2010 Release

Dear all,

I'm very happy to announce that the next version of the RDKit -- Q22010_1 -- is released. The release notes are below.

The source release and windows binaries (python 2.6 only this time, please let me know if anyone needs a python 2.5 release) will be on the sourceforge downloads page: http://sourceforge.net/projects/rdkit/files/rdkit/Q2_2010/

The files can also be downloaded from the google project page: http://code.google.com/p/rdkit/downloads/list

I have also updated the online documentation.

Thanks to everyone who submitted bug reports and suggestions for this release! Please let me know if you find any problems with the release or have suggestions for the next one.

-greg

** Release_Q22010_1 ***
(Changes relative to Release_Q12010_1)

!! IMPORTANT !!
 - There are a couple of refactoring changes that affect people using the RDKit from C++. Please look in the Other section below for a list.
 - If you are building the RDKit yourself, changes made in this release require that you use a reasonably up-to-date version of flex to build it. Please look in the Other section below for more information.

Acknowledgements:
 - Andrew Dalke, James Davidson, Kirk DeLisle, Thomas Heller, Peter Gedeck, Greg Magoon, Noel O'Boyle, Nik Stiefl,

Bug Fixes:
 - The depictor no longer generates NaNs for some molecules on windows (issue 2995724)
 - [X] query features work correctly with chiral atoms. (issue 3000399)
 - mols will no longer be deleted by python when atoms/bonds returned from mol.Get{Atom,Bond}WithIdx() are still active. (issue 3007178)
 - a problem with force-field construction for five-coordinate atoms was fixed. (issue 3009337)
 - double bonds to terminal atoms are no longer marked as 'any' bonds when writing mol blocks. (issue 3009756)
 - a problem with stereochemistry of double bonds linking rings was fixed. (issue 3009836)
 - a problem with R/S assignment was fixed. (issue 3009911)
 - error and warning messages are now properly displayed when cmake builds are used on windows.
 - a canonicalization problem with double bonds incident onto aromatic rings was fixed. (issue 3018558)
 - a problem with embedding fused small ring systems was fixed. (issue 3019283)

New Features:
 - RXN files can now be written. (issue 3011399)
 - reaction smarts can now be written.
 - v3000 RXN files can now be read. (issue 3009807)
 - better support for query information in mol blocks is present. (issue 2942501)
 - Depictions of reactions can now be generated.
 - Morgan fingerprints can now be calculated as bit vectors (as opposed to count vectors).
 - the method GetFeatureDefs() has been added to MolChemicalFeatureFactory
 - repeated recursive SMARTS queries in a single SMARTS will now be recognized and matched much faster.
 - the SMILES and SMARTS parsers can now be run safely in multi-threaded code.

Deprecated modules (to be removed in next release):
 - rdkit/qtGui
 - Projects/SDView

Removed modules:
 - SVD code:
     External/svdlibc
     External/svdpackc
     rdkit/PySVD
 - rdkit/Chem/CDXMLWriter.py

Other:
 - Large-scale changes in the handling of stereochemistry were made for this release. These should make the code more robust.
 - If you are building the RDKit yourself, changes made in this release require that you use a reasonably up-to-date version of flex to build it. This is likely to be a problem on Redhat, and redhat-derived systems. Specifically: if your version of flex is something like 2.5.4 (as opposed to something like 2.5.33, 2.5.34, etc.), you will need to get a newer version from http://flex.sourceforge.net in order to build the RDKit.
 - Changes only affecting C++ programmers:
   - The code for calculating topological-torsion and atom-pair fingerprints has been moved from $RDBASE/Code/GraphMol/Descriptors to $RDBASE/Code/GraphMol/Fingerprints.
   - The naming convention for methods of ExplicitBitVect and SparseBitVect has been changed to make it more consistent with the rest of the RDKit.
 - the bjam-based build system should be considered deprecated. This is the last release it will be actively maintained.
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Thanks Greg - this is great! I must confess, I was eager to try this out asap - but have not built rdkit before. I did start having a go over the weekend on my home PC (Windows MCE2005) but ran into a couple of unexpected issues with the software installs that made me think I would wait and retry on my work PC.

[not really relevant, but for interest - I think the problems may have been related to the Visual Studio 2010 Express installation. The result was an infuriating clicking in the audio when streaming live or recorded TV to an extender!! Not an issue that I felt was easy to troubleshoot... I reinstalled my system from a drive image backup and the problem was gone... That's when I decided to leave well alone, as my family may not have seen the benefit of up-to-the-minute builds at home at the expense of TV enjoyment : ) ]

I will get my PC at work set up to build from SVN snapshots - but I was very pleased to see your post (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01097.html) saying that Q2 binaries should be available next week - great news!

Kind regards
James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 18 June 2010 06:08
To: rdkit-discuss@lists.sourceforge.net
Cc: James Davidson
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

Dear all,

A followup/update on a request from a couple weeks ago:

On Fri, Jun 4, 2010 at 6:13 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> On Thu, Jun 3, 2010 at 7:51 PM, James Davidson <j.david...@vernalis.com> wrote:
>> (1) I see that the reaction objects can be created from MDL Reaction Files/Blocks - is there a way to do the reverse, and save a reaction object in MDL .rxn format? I tried investigating the rxn.ToBinary() attribute, but didn't get very far... The reason I wanted to do this, is that I was trying to figure out how to generate a form of the reaction object (generated from reaction SMARTS) that was suitable for converting into a 2D depiction of the transformation.
>
> At the moment the reactions are essentially input-only. There's really no way to get them out in any format that could be used elsewhere. This is a sadly missing feature: it would be really nice to be able to generate either .rxn files (or at least reaction smarts) from reactions. I will add a feature request for this, but it may take a while to happen.[1]

I've added a partial solution to this that at least provides some help with visualizing reactions. Here's my reaction:

    [12]>>> rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])-[O;-,H].[N;!$(N-C=[O,N,S]);!$(N=*):3]>>[C:1](=[O:2])-[N:3]')

You can now output reaction smarts:

    [13]>>> AllChem.ReactionToSmarts(rxn)
    Out[13] '[C:1](=[O:2])-[O;-,H1].[N;!$(N-C=[O,N,S]);!$(N=*):3]>>[C:1](=[O:2])-[N:3]'

You can also generate coordinates for a reaction and then create an rxn file:

    [14]>>> AllChem.Compute2DCoordsForReaction(rxn)
    [15]>>> print AllChem.ReactionToRxnBlock(rxn)
    ------> print(AllChem.ReactionToRxnBlock(rxn))
    $RXN

          RDKit

      2  1
    $MOL

         RDKit          2D

      3  2  0  0  0  0  0  0  0  0999 V2000
       -0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0
       -0.0000   -1.5000    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0
       -0.0000    1.5000    0.0000 *   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  2  0
      1  3  1  0
    V    3 [O;-,H1]
    M  END
    $MOL

         RDKit          2D

      1  0  0  0  0  0  0  0  0  0999 V2000
        0.5000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  3  0  0
    V    1 [N;!$(N-C=[O,N,S]);!$(N=*):3]
    M  END
    $MOL

         RDKit          2D

      3  2  0  0  0  0  0  0  0  0999 V2000
        1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0
        1.5000   -1.5000    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0
        1.5000    1.5000    0.0000 N   0  0  0  0  0  0  0  0  0  3  0  0
      1  2  2  0
      1  3  1  0
    M  END

Notice that query features on atoms in the rxn blocks are not output as proper ctab query features. Instead I use the atom-value feature of ctabs and output the SMARTS query for the atoms. This has the marked disadvantage that it won't actually generate reactions that do sensible things in other tools, but at least you can do some debugging of reactions. At some point in the future it would be nice to have ctab queries handled correctly, but this is at least something.

These changes are checked into subversion and will be in the next release.

Best Regards,
-greg
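The export calls described above can be tried together in a short script. This is only a sketch with a simplified amide-coupling SMARTS (an assumption for the example, not the reaction from the message), written as a plain script rather than an interactive session.

```python
from rdkit.Chem import AllChem

# Assumption: a simplified acid + amine -> amide reaction for illustration
rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH].[NH2:3]>>[C:1](=[O:2])[N:3]')

# Write the reaction back out as reaction SMARTS
smarts = AllChem.ReactionToSmarts(rxn)
print(smarts)

# Generate 2D coordinates for the templates, then write an MDL rxn block
AllChem.Compute2DCoordsForReaction(rxn)
rxn_block = AllChem.ReactionToRxnBlock(rxn)
print(rxn_block)
```

How the query features appear inside the rxn block may differ between RDKit versions, so the block is best treated as a debugging and depiction aid, as the message suggests.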
Re: [Rdkit-discuss] Number of Aromatic Rings
Hi Greg

Well, I managed to have a go at this earlier than I expected. So first some apologies, provisos, and caveats to warn you, and other readers, that your eyes will soon experience things inelegant and unpythonic, but it's the best I could come up with, with my limited faculties and experience!

On the plus side - I think it is doing what I wanted - ie giving a count of the number of aromatic systems (if you always want to count a fused aromatic as 1 aromatic system). The downside is that the way I have done this now makes your script eg output (6,1) for anthracene - where the 1 is the count of aromatic systems (fused or otherwise). It would be most generic if it maybe returned (6,3,1) as (all unique aromatic substructures, unique mono-cyclic substructures, aromatic systems). I'm sure this is fairly straightforward, but for another day!

So what I added was:

    def GetOuterSet(rings):
        # Initialise a counter for parent aromatic 'super' rings
        result = 0
        # Set up a dictionary so that items can be referenced and deleted
        ring_set = {k: set(v) for k, v in enumerate(rings)}
        # While there is something to process
        while ring_set:
            # Set the ring to be checked as the last in the list - should be the biggest
            reference = sorted(ring_set)[-1]
            ref_atoms = ring_set[reference]
            for k, v in sorted(ring_set.items()):
                # If current item is contained in last item - remove current from dictionary
                # (the subset test <= also removes the reference itself, so the loop ends)
                if v <= ref_atoms:
                    ring_set.pop(k)
                # If we are at the reference, then we have found our 'super' ring
                if k == reference:
                    result += 1
                    break
        return result

and I passed in the aromaticRings list from your script, then returned both the length of the aromaticRings list (as before) plus the output of GetOuterSet(). ie:

    superRings = GetOuterSet(aromaticRings)
    return len(aromaticRings), superRings

So once again, thanks for the help, and I would welcome any pointers from anyone on tidying up and improving this modification!
(or corrections if anyone spots them - I have only briefly tested this)

Kind regards
James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 11 June 2010 06:02
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Number of Aromatic Rings

Dear James,

On Thu, Jun 10, 2010 at 2:35 PM, James Davidson <j.david...@vernalis.com> wrote:
> I have been trying to figure out how to return the count of aromatic rings for molecules (in Python), and am going to have to admit defeat! I saw in an earlier message (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg00153.html) a similar query, but I'm afraid it didn't help me very much. I also read the section on Aromaticity in the rdkit book, and realised that maybe this isn't a trivial exercise!

Correct. Counting the number of non-fused rings that are aromatic, like the post you reference does, is pretty easy; including the fused rings that are aromatic is more challenging.

> I would like the count to count aromatic ring-systems such that bicyclic (eg indole or naphthalene) would only count as 1. For reference, this appears to be the behaviour of the OpenEye OEDetermineAromaticRingSystems function - where the molecule derived from the smiles C(O)(=O)c1cccc2c1[nH]c(C3CCCc4ccccc34)c2 (which contains an indole and a tetrahydronaphthalene) gives a count of 2. Any help would be greatly appreciated.

I've attached a script that's not quite what you want, but it gets you almost there: it finds all aromatic ring systems, including fused ones. Anthracene, for example, gives 6 rings. The modifications to this to get what you're looking for aren't a straightforward post-processing step, but shouldn't be too bad. If there's not enough here, let me know and I will take a look at adding the extra code. This code isn't perfectly polished and could certainly be faster, but it does seem mostly functional.

-greg
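A different way to get the same 'fused systems count as one' number, sketched here as an alternative and not as the method used in either script above, is to merge rings that share atoms and count the groups that remain. The function below is plain Python and assumes each ring is given as a collection of atom indices; in RDKit such a list could plausibly be built from mol.GetRingInfo().AtomRings() after keeping only the rings whose atoms are all aromatic. The name count_ring_systems and the index lists are invented for the example.

```python
def count_ring_systems(rings):
    """Count groups of rings that are connected through shared atoms."""
    systems = []  # each entry is the set of atom indices in one fused system
    for ring in rings:
        merged = set(ring)
        untouched = []
        for system in systems:
            if system & merged:
                # The new ring shares atoms with this system: absorb it
                merged |= system
            else:
                untouched.append(system)
        systems = untouched + [merged]
    return len(systems)


# Hypothetical atom indices for the three six-membered rings of anthracene
anthracene = [(0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8, 9), (8, 9, 10, 11, 12, 13)]
# Hypothetical atom indices for the two separate rings of biphenyl
biphenyl = [(0, 1, 2, 3, 4, 5), (6, 7, 8, 9, 10, 11)]

print(count_ring_systems(anthracene))
print(count_ring_systems(biphenyl))
```

Because the merge is driven by shared atoms, it gives the same count whether the input holds only the smallest rings or also the larger fused envelopes, and it does not depend on the largest ring coming last in the list.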
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Thanks for the help, Greg - my reaction SMARTS are now behaving themselves! I must confess, I had not actually realised that the documentation from install (ie the 'Book') was different to the 'Getting Started' one that I had linked from the website.

Kind regards,
James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Fri 04/06/2010 05:13
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

Dear James,

On Thu, Jun 3, 2010 at 7:51 PM, James Davidson <j.david...@vernalis.com> wrote:
> First of all, I'd like to start by saying how much I've been enjoying exploring the functionality of RDKit - great job, Greg!

Thanks!

> I have a couple of questions regarding 'rdkit.Chem.AllChem.ReactionFromSmarts':
>
> (1) I see that the reaction objects can be created from MDL Reaction Files/Blocks - is there a way to do the reverse, and save a reaction object in MDL .rxn format? I tried investigating the rxn.ToBinary() attribute, but didn't get very far... The reason I wanted to do this, is that I was trying to figure out how to generate a form of the reaction object (generated from reaction SMARTS) that was suitable for converting into a 2D depiction of the transformation.

At the moment the reactions are essentially input-only. There's really no way to get them out in any format that could be used elsewhere. This is a sadly missing feature: it would be really nice to be able to generate either .rxn files (or at least reaction smarts) from reactions. I will add a feature request for this, but it may take a while to happen.[1]

A workaround that kind of works is to paste the reaction smarts into something like Marvin Sketch. It will normally display something that at least gives some idea of what the reaction is.

> (2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some behaviour that I would not expect - however, this could be down to my SMARTS-naivety; my SMIRKS-naivety; or both!

Anytime reactions behave in ways you don't expect, it's probably best to just blame me for coming up with yet another way of expressing them that is slightly incompatible with the existing ones. :-)

> I initially tried the following:
>
>     from rdkit import Chem
>     from rdkit.Chem import AllChem
>     rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'
>     sm = Chem.MolFromSmiles('CC(=O)NC')
>     rxn = AllChem.ReactionFromSmarts(rxn_smarts)
>     prods = rxn.RunReactants((sm,))
>     prod = Chem.MolToSmiles(prods[0][0])
>
> This gives me prod = '[H]C(=O)NC'

There's a discussion of this kind of case in the RDKit Book ($RDBASE/Docs/Book/RDKit_Book.pdf) starting on page 3. The short answer is that if you have a query feature (atom list, recursive smarts, etc.) in the reactants and you would like the matching atom to be copied into the products you should include a dummy for that atom in the products. A working form of your example is then:

    [11]>>> rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]'
    [12]>>> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
    [13]>>> prods = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1C(=O)NCC1CC1'),))
    [14]>>> Chem.MolToSmiles(prods[0][0])
    Out[14] 'O=C(CC1CC1)Nc1ccccc1'

As an aside, in SMARTS it's shorter (and I think clearer) to write [C,c] as [#6]. It also produces a query that runs a bit quicker, but you probably won't notice that difference in most cases.

> If I replace with rxn_smarts = '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]', I get the behaviour I want - with prod = 'CNC(=O)C'. So I think I can get the behaviour I want, but was curious if I am using the SMARTS ! operator incorrectly in conjunction with atomic numbers, or whether this may be a bug?

Not really a bug. The behavior when you have queries in the products is undocumented: depending on the details of the query it will sometimes do the right thing, sometimes not. It's much safer to just use *. What I probably should do is add a warning message if the reaction contains a query in the products. I will think about this.

Best Regards,
-greg

[1] The underlying problem isn't actually generating the rxn files themselves, they are just a collection of mol blocks with a bit of extra verbiage sprinkled around. The problem is generating reasonable mol blocks for molecules with query features. I already have a feature request in for that one (http://sourceforge.net/tracker/?group_id=160139&atid=814653), but it turns out to not be quite as easy as it sounds.
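The advice about using dummy atoms in the products can be checked with a short script. This is a sketch along the lines of the working example above, with two choices that are assumptions of this example only: the reactant is N-(cyclopropylmethyl)acetamide, and [#6] replaces [C,c] as suggested in the aside. Products from RunReactants are not sanitized, so the sketch sanitizes them before writing SMILES, and it collects the SMILES in a set in case a reactant matches more than once.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Query atoms in the reactant template, dummies ([*:n]) for the same atoms in the product
rxn = AllChem.ReactionFromSmarts(
    '[!#1:1]-[NH:2]-[C:3](=[O:4])-[#6:5]>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]')

# Assumption: a small amide with two different sides, so the reversal is visible
reactant = Chem.MolFromSmiles('CC(=O)NCC1CC1')

unique_products = set()
for product_set in rxn.RunReactants((reactant,)):
    product = product_set[0]
    Chem.SanitizeMol(product)  # recompute valences, rings and aromaticity
    unique_products.add(Chem.MolToSmiles(product))

for smiles in sorted(unique_products):
    print(smiles)
```

The product should be the 'reversed' amide, with the cyclopropylmethyl group now on the carbonyl carbon and the methyl group on nitrogen.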
[Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Hi,

First of all, I'd like to start by saying how much I've been enjoying exploring the functionality of RDKit - great job, Greg!

I have a couple of questions regarding 'rdkit.Chem.AllChem.ReactionFromSmarts':

(1) I see that the reaction objects can be created from MDL Reaction Files/Blocks - is there a way to do the reverse, and save a reaction object in MDL .rxn format? I tried investigating the rxn.ToBinary() attribute, but didn't get very far... The reason I wanted to do this, is that I was trying to figure out how to generate a form of the reaction object (generated from reaction SMARTS) that was suitable for converting into a 2D depiction of the transformation.

(2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some behaviour that I would not expect - however, this could be down to my SMARTS-naivety; my SMIRKS-naivety; or both! I initially tried the following:

    from rdkit import Chem
    from rdkit.Chem import AllChem
    rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'
    sm = Chem.MolFromSmiles('CC(=O)NC')
    rxn = AllChem.ReactionFromSmarts(rxn_smarts)
    prods = rxn.RunReactants((sm,))
    prod = Chem.MolToSmiles(prods[0][0])

This gives me prod = '[H]C(=O)NC'

If I replace with rxn_smarts = '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]', I get the behaviour I want - with prod = 'CNC(=O)C'. So I think I can get the behaviour I want, but was curious if I am using the SMARTS ! operator incorrectly in conjunction with atomic numbers, or whether this may be a bug?

Kind regards
James