Re: [Rdkit-discuss] Kekulizing thiazoles
On Tue, 17 Jan 2017 16:52:36 + Chris Arthur wrote:
> ValueError: Sanitization error: Can't kekulize mol

In most cases I get the 'Can't kekulize mol' error for heterocycles that have a hydrogen on nitrogen when the SMILES does not give that hydrogen explicitly. For example:

    >>> Chem.MolFromSmiles('c1ccnc1')
    [10:23:21] Can't kekulize mol
    >>> Chem.MolFromSmiles('c1cc[nH]c1')

> I can generate a smiles string from it (I had thought of doing a smiles to
> molecule conversion)

So if this is the issue, you can convert your Mol object to SMILES, add the missing H, and build a Mol from the new SMILES.

Regards,
Rafał

--
Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
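A minimal sketch of the behaviour described above, assuming a standard RDKit install. The variable names are only illustrative. It parses the five-membered ring with and without the explicit N-H, and also parses thiazole, whose ring nitrogen carries no hydrogen and so needs no `[nH]`.

```python
from rdkit import Chem, RDLogger

# Silence RDKit's parser messages so only our own output is shown.
RDLogger.DisableLog('rdApp.*')

# Aromatic nitrogen written without its hydrogen: the ring cannot be
# kekulized, so MolFromSmiles returns None instead of a Mol.
bad = Chem.MolFromSmiles('c1ccnc1')

# The same ring with the hydrogen on nitrogen written explicitly (pyrrole).
good = Chem.MolFromSmiles('c1cc[nH]c1')

# Thiazole: the nitrogen has no hydrogen, so plain 'n' is correct here.
thiazole = Chem.MolFromSmiles('n1ccsc1')

print('c1ccnc1 parsed:', bad is not None)
print('c1cc[nH]c1 parsed:', good is not None)
print('n1ccsc1 parsed:', thiazole is not None)
```

If a reaction product fails in the same way as the first SMILES, the hydrogen count on an aromatic nitrogen is one of the first things worth checking.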
Re: [Rdkit-discuss] Kekulizing thiazoles
I don't have anything to add to this other than to agree with Curt: I think that the existing code should work fine with thiazoles.

@Curt: thanks for providing this detailed and thought-through answer!

-greg

On Tue, Jan 17, 2017 at 7:01 PM, Curt Fischer wrote:
> To troubleshoot your sanitization problems, I think it would be helpful if
> you could share your SMARTS reaction string and the rdkit version you are
> using.
> [...]
Re: [Rdkit-discuss] Kekulizing thiazoles
To troubleshoot your sanitization problems, I think it would be helpful if you could share your SMARTS reaction string and the rdkit version you are using.

I just simulated the Hantzsch thiazole synthesis shown on Wikipedia, and everything worked normally for me. Admittedly, my reaction definition is overly tailored toward these two reactants, but I think it shows that rdkit can *Sanitize()* thiazoles correctly.

    # Hantzsch thiazole synthesis
    from rdkit import Chem
    from rdkit.Chem import AllChem

    thiourea = Chem.MolFromSmiles('CN(C)C(=S)N')
    haloketone = Chem.MolFromSmiles('c1ccccc1C(=O)C(C)Cl')
    # Atom maps carry the thiourea N/C/S and the ketone carbons into the new ring
    rxn_smarts = '[NH2:1][C:2](=[S:3])[NH0:4].[C:5](=[O:6])[C:7][Cl:8]>>[N:4][c:2]1[s:3][c:5][c:7][n:1]1'
    rxn = AllChem.ReactionFromSmarts(rxn_smarts)
    product = rxn.RunReactants((thiourea, haloketone))[0][0]
    Chem.SanitizeMol(product)
    Chem.MolToSmiles(product)

    Out[33]: 'Cc1nc(N(C)C)sc1-c1ccccc1'

On Tue, Jan 17, 2017 at 9:29 AM, Curt Fischer wrote:
> I can't answer your root question, but if you want to go to SMILES and
> then back, I think you want *Chem.MolFromSmiles()*, not
> *Chem.MolToSmiles()*.
>
> Curt
>
> On Tue, Jan 17, 2017 at 8:52 AM, Chris Arthur wrote:
>> Dear all
>>
>> I have a molecule containing a thiazole ring which has been generated by
>> a reaction in Rdkit.
>>
>> Sanitising the molecule gives kekulization error...
>> [...]
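For the troubleshooting step, one option is the `catchErrors` argument of `Chem.SanitizeMol`, which returns the first failing sanitization step instead of raising. The sketch below is illustrative only: `first_failed_step` is a hypothetical helper name, and the unsanitized SMILES inputs are stand-ins for a reaction product, which is not available here.

```python
from rdkit import Chem, RDLogger

# Silence RDKit's log output so only the returned flags matter.
RDLogger.DisableLog('rdApp.*')


def first_failed_step(mol):
    """Sanitize mol in place; return the first failing step, or None on success.

    Hypothetical helper for illustration, not part of RDKit.
    """
    failed = Chem.SanitizeMol(mol, catchErrors=True)
    if failed == Chem.SanitizeFlags.SANITIZE_NONE:
        return None
    return failed


# Stand-ins for unsanitized reaction products (assumption: a real product
# from RunReactants would be passed to the helper in the same way).
thiazole = Chem.MolFromSmiles('n1ccsc1', sanitize=False)
missing_h = Chem.MolFromSmiles('c1ccnc1', sanitize=False)

print('thiazole:', first_failed_step(thiazole))
print('ring missing N-H:', first_failed_step(missing_h))
```

A kekulization flag returned for a product usually points at the hydrogen counts or charges of the aromatic atoms written by the reaction template, so the reaction SMARTS is the natural next thing to inspect.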
[Rdkit-discuss] Kekulizing thiazoles
Dear all

I have a molecule containing a thiazole ring which has been generated by a reaction in Rdkit.

Sanitising the molecule gives a kekulization error...

    Chem.SanitizeMol(forwardProduct_)
    Traceback (most recent call last):

      File "", line 1, in
        Chem.SanitizeMol(forwardProduct_)

    ValueError: Sanitization error: Can't kekulize mol

I can generate a SMILES string from it (I had thought of doing a SMILES to molecule conversion)

    # Rdkit-generated SMILES that started us down this rabbit-hole
    temp = Chem.MolToSmiles('CC(=O)c1sc(C2CCOCC2)nc1C')

But this fails

    ArgumentError: Python argument types in
        rdkit.Chem.rdmolfiles.MolToSmiles(str)
    did not match C++ signature:
        MolToSmiles(class RDKit::ROMol mol, bool isomericSmiles=False, bool kekuleSmiles=False, int rootedAtAtom=-1, bool canonical=True, bool allBondsExplicit=False, bool allHsExplicit=False)

So I thought I would try with simpler thiazoles

    # ChemDraw's SMILES representation
    temp = Chem.MolToSmiles('C1=CN=CS1')

    # From Wikipedia's SMILES for thiazole
    temp = Chem.MolToSmiles('n1ccsc1')

These however also fail.

Can anyone suggest how I can proceed in order to sanitize such molecules?

Thanks

Chris

--
Dr Christopher J. Arthur
School of Chemistry
University of Bristol
BRISTOL, BS8 1TS, UK
E-mail: chris.art...@bristol.ac.uk
Office: (+44 117) 331 7192
Mass Spectrometry Lab: (+44 117) 331 7358
FAX: (+44 117) 927 7985
WWW URL: http://www.chm.bris.ac.uk/staff/carthur.htm
LinkedIn Profile: https://www.linkedin.com/in/drchrisarthur
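The `ArgumentError` above arises because `Chem.MolToSmiles` expects a Mol object, not a string. A minimal sketch of the round trip in the intended direction, using the three SMILES from the message (the list names are only illustrative):

```python
from rdkit import Chem

smiles_in = [
    'CC(=O)c1sc(C2CCOCC2)nc1C',  # RDKit-generated SMILES from the message
    'C1=CN=CS1',                 # Kekule form of thiazole
    'n1ccsc1',                   # aromatic form of thiazole
]

# str -> Mol: MolFromSmiles parses and sanitizes each SMILES.
mols = [Chem.MolFromSmiles(s) for s in smiles_in]

# Mol -> str: MolToSmiles writes a canonical SMILES for each Mol.
canonical = [Chem.MolToSmiles(m) for m in mols]

for original, canon in zip(smiles_in, canonical):
    print(original, '->', canon)

# Both thiazole inputs describe the same molecule, so they should
# canonicalize to the same string.
print(canonical[1] == canonical[2])
```

Since all three strings parse and sanitize on their own, a failure on the reaction product itself would suggest that the Mol object differs from what its SMILES describes.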