Re: [Rdkit-discuss] Kekulizing thiazoles

2017-01-18 Thread Rafal Roszak
On Tue, 17 Jan 2017 16:52:36 +
Chris Arthur  wrote:

> ValueError: Sanitization error: Can't kekulize mol

In most cases I get the 'Can't kekulize mol' error for a heterocycle with a hydrogen on
nitrogen when the SMILES has no explicit hydrogen on that N.
For example:

>>> Chem.MolFromSmiles('c1ccnc1')
[10:23:21] Can't kekulize mol 
>>> Chem.MolFromSmiles('c1cc[nH]c1')



> I can generate a smiles string from it (I had thought of doing a smiles to
> molecule conversion)

So if this is the issue, you can convert your Mol object to SMILES, add the missing
H, and build a new Mol from the corrected SMILES.
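
A minimal sketch of that case, assuming the failure really is a missing H on an
aromatic nitrogen. Pyrrole is used as a stand-in, and picking the first aromatic N
is an illustrative choice, not something taken from the original molecule. Instead
of editing the SMILES text, the second half parses without sanitization and sets
the H on the atom directly, which should amount to the same fix.

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog('rdApp.*')  # silence the "Can't kekulize mol" log line

# Without the H on nitrogen the five-membered ring cannot be kekulized,
# so MolFromSmiles returns None instead of a Mol.
broken = Chem.MolFromSmiles('c1ccnc1')

# With the H written explicitly the same ring parses as pyrrole.
pyrrole = Chem.MolFromSmiles('c1cc[nH]c1')

# Alternative to editing the SMILES string: parse unsanitized, put the
# H on the aromatic nitrogen (assumed to be the culprit), then sanitize.
fixed = Chem.MolFromSmiles('c1ccnc1', sanitize=False)
for atom in fixed.GetAtoms():
    if atom.GetSymbol() == 'N' and atom.GetIsAromatic():
        atom.SetNumExplicitHs(1)
        break
Chem.SanitizeMol(fixed)

print(broken is None)
print(Chem.MolToSmiles(fixed) == Chem.MolToSmiles(pyrrole))
```

In a real molecule the nitrogen that needs the H has to be chosen by hand; the
loop above only works because pyrrole has a single candidate.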

Regards,

Rafał

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Kekulizing thiazoles

2017-01-17 Thread Greg Landrum
I don't have anything to add to this other than to agree with Curt: I think
that the existing code should work fine with thiazoles.

@Curt: thanks for providing this detailed and thought-through answer!

-greg




Re: [Rdkit-discuss] Kekulizing thiazoles

2017-01-17 Thread Curt Fischer
To troubleshoot your sanitization problems, I think it would be helpful if
you could share your SMARTS reaction string and the rdkit version you are
using.

I just simulated the Hantzsch thiazole synthesis shown on Wikipedia, and
everything worked normally for me.  Admittedly, my reaction definition is
overly tailored toward these two reactants, but I think it shows that rdkit
can *Sanitize()* thiazoles correctly.

# Hantzsch thiazole synthesis
from rdkit import Chem
from rdkit.Chem import AllChem

thiourea = Chem.MolFromSmiles('CN(C)C(=S)N')
# the aryl ring is garbled in the archived post ('c1c1'); phenyl is assumed here
haloketone = Chem.MolFromSmiles('c1ccccc1C(=O)C(C)Cl')
rxn_smarts = ('[NH2:1][C:2](=[S:3])[NH0:4].[C:5](=[O:6])[C:7][Cl:8]'
              '>>[N:4][c:2]1[s:3][c:5][c:7][n:1]1')
rxn = AllChem.ReactionFromSmarts(rxn_smarts)
product = rxn.RunReactants((thiourea, haloketone))[0][0]
Chem.SanitizeMol(product)
Chem.MolToSmiles(product)

Out[33]: 'Cc1nc(N(C)C)sc1-c1ccccc1'
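
For narrowing down which sanitization step fails, something like the sketch below
may help. It assumes a reasonably recent RDKit, and the two input SMILES are
stand-ins chosen for illustration, not the reaction product from the original
question. With catchErrors=True, SanitizeMol returns the flag of the failing
operation instead of raising.

```python
from rdkit import Chem, RDLogger, rdBase

RDLogger.DisableLog('rdApp.*')  # keep the error log lines out of the output
print(rdBase.rdkitVersion)      # version string worth quoting when reporting

def first_failed_op(smiles):
    """Parse without sanitizing, then report the first failing sanitize step."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    return Chem.SanitizeMol(mol, catchErrors=True)

thiazole_result = first_failed_op('c1cscn1')   # plain thiazole
bad_ring_result = first_failed_op('c1ccnc1')   # aromatic N missing its H

print(thiazole_result == Chem.SanitizeFlags.SANITIZE_NONE)
print(bad_ring_result == Chem.SanitizeFlags.SANITIZE_KEKULIZE)
```

The helper name first_failed_op is made up for this example.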


On Tue, Jan 17, 2017 at 9:29 AM, Curt Fischer 
wrote:

> I can't answer your root question, but if you want to go to SMILES and
> then back, I think you want *Chem.MolFromSmiles()*, not
> *Chem.MolToSmiles()*.
>
> Curt
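
A short sketch of the SMILES round trip with the two functions used in the
intended direction. The SMILES strings are the ones quoted in the original
question; the variable names are only illustrative.

```python
from rdkit import Chem

# MolFromSmiles: text -> Mol.  MolToSmiles: Mol -> text.
aromatic = Chem.MolFromSmiles('n1ccsc1')    # Wikipedia form from the thread
kekule = Chem.MolFromSmiles('C1=CN=CS1')    # ChemDraw form from the thread
product = Chem.MolFromSmiles('CC(=O)c1sc(C2CCOCC2)nc1C')

# Both spellings of thiazole should canonicalize to the same string.
same = Chem.MolToSmiles(aromatic) == Chem.MolToSmiles(kekule)
print(same)

# Round trip for the larger molecule: Mol -> SMILES -> Mol.
round_trip = Chem.MolFromSmiles(Chem.MolToSmiles(product))
print(round_trip.GetNumAtoms())
```

Passing a string to MolToSmiles, as in the original question, raises the
ArgumentError shown there because that function expects a Mol.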


[Rdkit-discuss] Kekulizing thiazoles

2017-01-17 Thread Chris Arthur
Dear all


I have a molecule containing a thiazole ring which has been generated by a
reaction in Rdkit.

Sanitising the molecule gives a kekulization error:

Chem.SanitizeMol(forwardProduct_)
Traceback (most recent call last):

  File "", line 1, in 
Chem.SanitizeMol(forwardProduct_)

ValueError: Sanitization error: Can't kekulize mol

I can generate a smiles string from it (I had thought of doing a smiles to
molecule conversion)

#Rdkit generated smiles that started us down this rabbit-hole
temp = Chem.MolToSmiles('CC(=O)c1sc(C2CCOCC2)nc1C')

But this fails

ArgumentError: Python argument types in
rdkit.Chem.rdmolfiles.MolToSmiles(str)
did not match C++ signature:
MolToSmiles(class RDKit::ROMol mol, bool isomericSmiles=False, bool
kekuleSmiles=False, int rootedAtAtom=-1, bool canonical=True, bool
allBondsExplicit=False, bool allHsExplicit=False)


So I thought I would try with simpler thiazoles

#ChemDraws smiles representation
temp = Chem.MolToSmiles('C1=CN=CS1')

#From wikipedias smile for thiazole
temp = Chem.MolToSmiles('n1ccsc1')

These however also fail.

 Can anyone suggest how I can proceed in order to sanitize such molecules?

 Thanks

 Chris



-- 
Dr Christopher J. Arthur
School of Chemistry
University of Bristol
BRISTOL, BS8 1TS,  UK
E-mail:  chris.art...@bristol.ac.uk

Office: (+44 117) 331 7192
Mass Spectrometry Lab: (+44 117) 331 7358.
FAX: (+44 117) 927 7985

WWW URL: http://www.chm.bris.ac.uk/staff/carthur.htm
LinkedIn  Profile: https://www.linkedin.com/in/drchrisarthur