Re: [Rdkit-discuss] HasSubStructureMatch error

2016-08-03 Thread Greg Landrum
Hi,

The problem is connected with the lack of explicit Hs in the aromatic
version of the SMILES. The general rule is that if an aromatic heteroatom
needs to have an H on it in order for the valence to make sense, then you
need to include that in the SMILES. This does not hold when you express the
SMILES in Kekule form (your second example), where the H counts are clear.

Here's an example:

In [2]: m = Chem.MolFromSmiles('[nH]1c(=O)[nH]c(=O)cc1')

In [3]: Chem.MolToSmiles(m)
Out[3]: 'O=c1cc[nH]c(=O)[nH]1'

In [4]: m = Chem.MolFromSmiles('N1C(=O)NC(=O)C=C1')

In [5]: Chem.MolToSmiles(m)
Out[5]: 'O=c1cc[nH]c(=O)[nH]1'
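
As a minimal sketch of the same point (not part of the original session; it
assumes an RDKit build where RDLogger.DisableLog is available, and the
variable names are only illustrative), the aromatic SMILES without explicit
Hs fails to parse, so MolFromSmiles returns None rather than raising, while
the two valid forms give the same canonical SMILES:

```python
from rdkit import Chem, RDLogger

# Keep the expected parse error message off the console for this demo.
RDLogger.DisableLog("rdApp.*")

no_h = Chem.MolFromSmiles("n1c(=O)nc(=O)cc1")          # aromatic N, no H given
with_h = Chem.MolFromSmiles("[nH]1c(=O)[nH]c(=O)cc1")  # aromatic N, explicit H
kekule = Chem.MolFromSmiles("N1C(=O)NC(=O)C=C1")       # Kekule form

# MolFromSmiles signals failure by returning None, so check before use.
for name, mol in [("no_h", no_h), ("with_h", with_h), ("kekule", kekule)]:
    print(name, "parsed" if mol is not None else "failed to parse")

# Both valid inputs describe the same molecule.
same_canonical = Chem.MolToSmiles(with_h) == Chem.MolToSmiles(kekule)
print("same canonical SMILES:", same_canonical)
```

Checking for None right after parsing avoids the later AttributeError on
'NoneType' seen in the original question.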

I hope this helps,
-greg





On Thu, Aug 4, 2016 at 5:13 AM, macbook wrote:

> Dear all,
>
> I have a couple of questions I would like help with.
>
> 1. When I read a molecule with MolFromSmiles and with MolFromSmarts, an
> exception is thrown. As shown below, the molecule “n1c(=O)nc(=O)cc1” can’t
> be read by MolFromSmiles, while MolFromSmarts works fine. So what is the
> difference between MolFromSmiles and MolFromSmarts when reading a
> molecule's SMILES? (I had expected MolFromSmarts to be compatible with
> MolFromSmiles, and more advanced.)
>
> 2. The SMILES “n1c(=O)nc(=O)cc1” and “N1C(=O)NC(=O)C=C1” are two ways of
> writing the same structure. When I test them with HasSubstructMatch, I
> expect each one to be contained within the other, but the result is
> inconsistent with my expectation. Why is that? Did I misunderstand this
> function?
>
> >>> mfsmi = AllChem.MolFromSmiles
> >>> mfsma = AllChem.MolFromSmarts
> >>> asub = mfsma("n1c(=O)nc(=O)cc1")
> >>> a = mfsmi("n1c(=O)nc(=O)cc1")
> >>> a1 = mfsma("n1c(=O)nc(=O)cc1")
> >>> a.HasSubstructureMatch(asub)
> Traceback (most recent call last):
>   File "", line 1, in 
> a.HasSubstructureMatch(asub)
> AttributeError: 'NoneType' object has no attribute 'HasSubstructureMatch'
> >>> a1.HasSubstructMatch(asub)
> True
> >>> bsub = mfsma("N1C(=O)NC(=O)C=C1")
> >>> b = mfsmi("N1C(=O)NC(=O)C=C1")
> >>> b1 = mfsma("N1C(=O)NC(=O)C=C1")
> >>> b.HasSubstructMatch(bsub)
> False
> >>> b1.HasSubstructMatch(bsub)
> True
> >>> a1.HasSubstructMatch(bsub)
> False
> >>> b.HasSubstructMatch(asub)
> True
> >>> b1.HasSubstructMatch(asub)
> False
>
>
>
> --
>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>


[Rdkit-discuss] HasSubStructureMatch error

2016-08-03 Thread macbook
Dear all,

I have a couple of questions I would like help with.

1. When I read a molecule with MolFromSmiles and with MolFromSmarts, an
exception is thrown. As shown below, the molecule “n1c(=O)nc(=O)cc1” can’t be
read by MolFromSmiles, while MolFromSmarts works fine. So what is the
difference between MolFromSmiles and MolFromSmarts when reading a molecule's
SMILES? (I had expected MolFromSmarts to be compatible with MolFromSmiles, and
more advanced.)

2. The SMILES “n1c(=O)nc(=O)cc1” and “N1C(=O)NC(=O)C=C1” are two ways of
writing the same structure. When I test them with HasSubstructMatch, I expect
each one to be contained within the other, but the result is inconsistent with
my expectation. Why is that? Did I misunderstand this function?

>>> mfsmi = AllChem.MolFromSmiles
>>> mfsma = AllChem.MolFromSmarts
>>> asub = mfsma("n1c(=O)nc(=O)cc1")
>>> a = mfsmi("n1c(=O)nc(=O)cc1")
>>> a1 = mfsma("n1c(=O)nc(=O)cc1")
>>> a.HasSubstructureMatch(asub)
Traceback (most recent call last):
  File "", line 1, in 
a.HasSubstructureMatch(asub)
AttributeError: 'NoneType' object has no attribute 'HasSubstructureMatch'
>>> a1.HasSubstructMatch(asub)
True
>>> bsub = mfsma("N1C(=O)NC(=O)C=C1")
>>> b = mfsmi("N1C(=O)NC(=O)C=C1")
>>> b1 = mfsma("N1C(=O)NC(=O)C=C1")
>>> b.HasSubstructMatch(bsub)
False
>>> b1.HasSubstructMatch(bsub)
True
>>> a1.HasSubstructMatch(bsub)
False
>>> b.HasSubstructMatch(asub)
True
>>> b1.HasSubstructMatch(asub)
False
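
A minimal sketch of one likely reason the results above differ, assuming the
usual RDKit behavior: MolFromSmiles sanitizes the molecule and perceives the
ring as aromatic, while MolFromSmarts builds a query in which `N` and `C`
mean aliphatic atoms and `n` and `c` mean aromatic ones. The variable names
are illustrative only:

```python
from rdkit import Chem

# Molecule parsed from the Kekule SMILES; sanitization marks the ring aromatic.
mol = Chem.MolFromSmiles("N1C(=O)NC(=O)C=C1")
if mol is None:
    raise ValueError("SMILES could not be parsed")

# Queries built from SMARTS: lowercase = aromatic atoms, uppercase = aliphatic.
aromatic_query = Chem.MolFromSmarts("n1c(=O)nc(=O)cc1")
aliphatic_query = Chem.MolFromSmarts("N1C(=O)NC(=O)C=C1")

matches_aromatic = mol.HasSubstructMatch(aromatic_query)
matches_aliphatic = mol.HasSubstructMatch(aliphatic_query)

# A sanitized molecule used as its own pattern should match itself.
matches_self = mol.HasSubstructMatch(mol)

print(matches_aromatic, matches_aliphatic, matches_self)
```

Under these assumptions, molecules are best read with MolFromSmiles and
checked for None, and MolFromSmarts is kept for query patterns only.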



Re: [Rdkit-discuss] Strange behavior with MMFFHasAllMoleculeParams()

2016-08-03 Thread Paolo Tosco
Dear Sereina,

I'll have a look. I suspect it might have something to do with the MMFF94 
aromaticity model, but it could also be a bug.
I'll get back to you later.

Cheers,
p.

> On 3 Aug 2016, at 08:00, Sereina wrote:
> 
> Dear all,
> 
> I stumbled upon a - to me - rather strange behavior with 
> MMFFHasAllMoleculeParams().
> 
> I want to generate a molecule from SMILES, check if all MMFF parameters are 
> present, add hydrogens and generate conformers. However, the outcome (error 
> or not error) depends on the order of checking of the MMFF parameters and 
> adding hydrogens. 
> 
> Everything is fine if I first add the hydrogens:
> In [1]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')
> 
> In [1]: m = AllChem.AddHs(m)
> 
> In [2]: AllChem.MMFFHasAllMoleculeParams(m)
> Out[2]: True
> 
> In [3]: AllChem.EmbedMultipleConfs(m, numConfs=100)
> Out[3]: 
> 
> But here’s what happens when I first check the MMFF parameters:
> In [4]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')
> 
> In [5]: AllChem.MMFFHasAllMoleculeParams(m)
> Out[5]: True
> 
> In [6]: m = AllChem.AddHs(m)
> 
> In [7]: AllChem.EmbedMultipleConfs(m, numConfs=100)
> RDKit ERROR: [08:41:02] Explicit valence for atom # 11 N, 4, is greater than 
> permitted
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
>  in ()
> ----> 1 AllChem.EmbedMultipleConfs(m, numConfs=100)
> 
> ValueError: Sanitization error: Explicit valence for atom # 11 N, 4, is 
> greater than permitted
> 
> Interestingly, if I do the check first, but then remove the hydrogens before 
> adding hydrogens, things work again:
> In [8]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')
> 
> In [9]: AllChem.MMFFHasAllMoleculeParams(m)
> Out[9]: True
> 
> In [10]: m = AllChem.RemoveHs(m)
> 
> In [11]: m = AllChem.AddHs(m)
> 
> In [12]: AllChem.EmbedMultipleConfs(m, numConfs=100)
> Out[12]: 
> 
> I cannot really explain the behavior. It only happens for some molecules. Is 
> MMFFHasAllMoleculeParams() modifying the molecule, i.e. already adding 
> hydrogens?
> 
> Best,
> Sereina


[Rdkit-discuss] Strange behavior with MMFFHasAllMoleculeParams()

2016-08-03 Thread Sereina
Dear all,

I stumbled upon a - to me - rather strange behavior with 
MMFFHasAllMoleculeParams().

I want to generate a molecule from SMILES, check if all MMFF parameters are 
present, add hydrogens and generate conformers. However, the outcome (error or 
not error) depends on the order of checking of the MMFF parameters and adding 
hydrogens. 

Everything is fine if I first add the hydrogens:
In [1]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')

In [1]: m = AllChem.AddHs(m)

In [2]: AllChem.MMFFHasAllMoleculeParams(m)
Out[2]: True

In [3]: AllChem.EmbedMultipleConfs(m, numConfs=100)
Out[3]: 

But here’s what happens when I first check the MMFF parameters:
In [4]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')

In [5]: AllChem.MMFFHasAllMoleculeParams(m)
Out[5]: True

In [6]: m = AllChem.AddHs(m)

In [7]: AllChem.EmbedMultipleConfs(m, numConfs=100)
RDKit ERROR: [08:41:02] Explicit valence for atom # 11 N, 4, is greater than 
permitted
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 AllChem.EmbedMultipleConfs(m, numConfs=100)

ValueError: Sanitization error: Explicit valence for atom # 11 N, 4, is greater 
than permitted

Interestingly, if I do the check first, but then remove the hydrogens before 
adding hydrogens, things work again:
In [8]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')

In [9]: AllChem.MMFFHasAllMoleculeParams(m)
Out[9]: True

In [10]: m = AllChem.RemoveHs(m)

In [11]: m = AllChem.AddHs(m)

In [12]: AllChem.EmbedMultipleConfs(m, numConfs=100)
Out[12]: 

I cannot really explain the behavior. It only happens for some molecules. Is 
MMFFHasAllMoleculeParams() modifying the molecule, i.e. already adding 
hydrogens?
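
For reference, a minimal sketch of the ordering reported to work above
(hydrogens first, then the parameter check, then embedding). It uses ethanol
as a stand-in, not the molecule from this message, and runs the check on a
copy purely as a precaution; whether the check alters its argument is not
established here:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in molecule (ethanol); NOT the molecule discussed in this thread.
mol = Chem.MolFromSmiles("CCO")
if mol is None:
    raise ValueError("SMILES could not be parsed")

# Add hydrogens before anything else, as in the sequence that worked.
mol_h = Chem.AddHs(mol)

# Defensive assumption: run the check on a copy in case it touches the input.
has_params = AllChem.MMFFHasAllMoleculeParams(Chem.Mol(mol_h))

# Embed a handful of conformers; randomSeed makes the run repeatable.
conf_ids = list(AllChem.EmbedMultipleConfs(mol_h, numConfs=5, randomSeed=42))

print("MMFF parameters available:", has_params)
print("conformers generated:", len(conf_ids))
```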

Best,
Sereina