Re: [Rdkit-discuss] HasSubStructureMatch error
Hi,

The problem comes from the missing explicit Hs in the aromatic version of the SMILES. The general rule is that if an aromatic heteroatom needs an H on it for the valence to make sense, then you need to include that H in the SMILES. This does not apply when you write the SMILES in Kekule form (your second example), where the H counts are unambiguous.

Here's an example:

    In [2]: m = Chem.MolFromSmiles('[nH]1c(=O)[nH]c(=O)cc1')
    In [3]: Chem.MolToSmiles(m)
    Out[3]: 'O=c1cc[nH]c(=O)[nH]1'
    In [4]: m = Chem.MolFromSmiles('N1C(=O)NC(=O)C=C1')
    In [5]: Chem.MolToSmiles(m)
    Out[5]: 'O=c1cc[nH]c(=O)[nH]1'

I hope this helps,
-greg

On Thu, Aug 4, 2016 at 5:13 AM, macbook wrote:
> [...]

--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
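The rule in the reply above can be checked with a short sketch, assuming a working RDKit install. The variable names are illustrative only, and the call to RDLogger.DisableLog is there just to keep the parser's error message out of the output.

```python
from rdkit import Chem, RDLogger

# Silence the parser's error message for the SMILES that fails on purpose.
RDLogger.DisableLog("rdApp.*")

# Aromatic nitrogens written without their H: sanitization cannot assign
# a valid Kekule structure, so MolFromSmiles returns None.
no_h = Chem.MolFromSmiles("n1c(=O)nc(=O)cc1")

# The same ring with the H counts spelled out, and the Kekule form.
with_h = Chem.MolFromSmiles("[nH]1c(=O)[nH]c(=O)cc1")
kekule = Chem.MolFromSmiles("N1C(=O)NC(=O)C=C1")

print(no_h is None)
print(Chem.MolToSmiles(with_h) == Chem.MolToSmiles(kekule))
```

Comparing canonical SMILES rather than hard-coding the output string keeps the check independent of the RDKit version's canonicalization.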
[Rdkit-discuss] HasSubStructureMatch error
Dear all,

I have a couple of questions I would like help with.

1. When I read a molecule with MolFromSmiles and with MolFromSmarts, the two behave differently. As shown below, the molecule "n1c(=O)nc(=O)cc1" cannot be read by MolFromSmiles (using the result throws an exception), while MolFromSmarts works fine. So what is the difference between MolFromSmiles and MolFromSmarts when reading a molecule's SMILES? (I had expected MolFromSmarts to be compatible with MolFromSmiles and more advanced.)

2. The SMILES "n1c(=O)nc(=O)cc1" and "N1C(=O)NC(=O)C=C1" are two ways of writing the same structure. When I test them with HasSubstructMatch, I expect each one to be contained within the other, but the results do not match my expectation. Why is that? Did I misunderstand this function?

    >>> mfsmi = AllChem.MolFromSmiles
    >>> mfsma = AllChem.MolFromSmarts
    >>> asub = mfsma("n1c(=O)nc(=O)cc1")
    >>> a = mfsmi("n1c(=O)nc(=O)cc1")
    >>> a1 = mfsma("n1c(=O)nc(=O)cc1")
    >>> a.HasSubstructureMatch(asub)
    Traceback (most recent call last):
      File "", line 1, in
        a.HasSubstructureMatch(asub)
    AttributeError: 'NoneType' object has no attribute 'HasSubstructureMatch'
    >>> a1.HasSubstructMatch(asub)
    True
    >>> bsub = mfsma("N1C(=O)NC(=O)C=C1")
    >>> b = mfsmi("N1C(=O)NC(=O)C=C1")
    >>> b1 = mfsma("N1C(=O)NC(=O)C=C1")
    >>> b.HasSubstructMatch(bsub)
    False
    >>> b1.HasSubstructMatch(bsub)
    True
    >>> a1.HasSubstructMatch(bsub)
    False
    >>> b.HasSubstructMatch(asub)
    True
    >>> b1.HasSubstructMatch(asub)
    False
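For the second question, a minimal sketch (not from the thread, assuming a working RDKit install) of how a SMILES-parsed molecule behaves against the two SMARTS patterns. It relies on two facts: MolFromSmarts builds a query without any valence or aromaticity checks, and uppercase N and C in SMARTS mean aliphatic atoms specifically. The variable names are illustrative.

```python
from rdkit import Chem

# Target molecule: parsed and sanitized from SMILES, so the ring is
# perceived as aromatic even though it was written in Kekule form.
mol = Chem.MolFromSmiles("N1C(=O)NC(=O)C=C1")

# MolFromSmarts builds a query and performs no valence checks,
# which is why the H-less aromatic pattern parses here.
aromatic_query = Chem.MolFromSmarts("n1c(=O)nc(=O)cc1")

# In SMARTS, uppercase N and C match aliphatic atoms only.
aliphatic_query = Chem.MolFromSmarts("N1C(=O)NC(=O)C=C1")

# A sanitized molecule built from SMILES can also serve as the query.
smiles_query = Chem.MolFromSmiles("N1C(=O)NC(=O)C=C1")

print(mol.HasSubstructMatch(aromatic_query))
print(mol.HasSubstructMatch(aliphatic_query))
print(mol.HasSubstructMatch(smiles_query))
```

The first two calls correspond to b.HasSubstructMatch(asub) and b.HasSubstructMatch(bsub) in the session above, which reported True and False. The third shows one way to get the expected match: parse the Kekule string as SMILES, so that the query goes through the same aromaticity perception as the target.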
Re: [Rdkit-discuss] Strange behavior with MMFFHasAllMoleculeParams()
Dear Sereina,

I'll have a look. I suspect it might have something to do with the MMFF94 aromaticity model, but it could also be a bug. I'll get back to you later.

Cheers,
p.

On 3 Aug 2016, at 08:00, Sereina wrote:
> [...]
[Rdkit-discuss] Strange behavior with MMFFHasAllMoleculeParams()
Dear all,

I stumbled upon a (to me) rather strange behavior with MMFFHasAllMoleculeParams().

I want to generate a molecule from SMILES, check whether all MMFF parameters are present, add hydrogens, and generate conformers. However, the outcome (error or no error) depends on the order in which I check the MMFF parameters and add the hydrogens.

Everything is fine if I add the hydrogens first:

    In [1]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')
    In [2]: m = AllChem.AddHs(m)
    In [3]: AllChem.MMFFHasAllMoleculeParams(m)
    Out[3]: True
    In [4]: AllChem.EmbedMultipleConfs(m, numConfs=100)
    Out[4]:

But here's what happens when I check the MMFF parameters first:

    In [5]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')
    In [6]: AllChem.MMFFHasAllMoleculeParams(m)
    Out[6]: True
    In [7]: m = AllChem.AddHs(m)
    In [8]: AllChem.EmbedMultipleConfs(m, numConfs=100)
    RDKit ERROR: [08:41:02] Explicit valence for atom # 11 N, 4, is greater than permitted
    ---
    ValueError                                Traceback (most recent call last)
    in ()
    ----> 1 AllChem.EmbedMultipleConfs(m, numConfs=100)

    ValueError: Sanitization error: Explicit valence for atom # 11 N, 4, is greater than permitted

Interestingly, if I do the check first but then remove the hydrogens before adding them, things work again:

    In [9]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3c23)[nH]1')
    In [10]: AllChem.MMFFHasAllMoleculeParams(m)
    Out[10]: True
    In [11]: m = AllChem.RemoveHs(m)
    In [12]: m = AllChem.AddHs(m)
    In [13]: AllChem.EmbedMultipleConfs(m, numConfs=100)
    Out[13]:

I cannot really explain this behavior. It only happens for some molecules. Is MMFFHasAllMoleculeParams() modifying the molecule, i.e. already adding hydrogens?

Best,
Sereina
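A cautious way to script the workflow described above, offered only as a sketch: add the hydrogens first (the order reported to work) and run the parameter check on a copy of the molecule. Whether the check really alters the molecule is not confirmed anywhere in this thread, so the copy is a precaution, not a diagnosis. The molecule is a stand-in (uracil), not the one from the report, and the conformer count and random seed are arbitrary choices.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in molecule (uracil), not the one from the report.
mol = Chem.MolFromSmiles("O=c1cc[nH]c(=O)[nH]1")

# Add hydrogens first, the order reported to work.
mol_h = Chem.AddHs(mol)

# Run the parameter check on a copy, so the molecule used for embedding
# is untouched whatever the check does internally.
has_params = AllChem.MMFFHasAllMoleculeParams(Chem.Mol(mol_h))

conf_ids = []
if has_params:
    # randomSeed makes the embedding reproducible.
    conf_ids = list(AllChem.EmbedMultipleConfs(mol_h, numConfs=5, randomSeed=42))

print(has_params, len(conf_ids))
```

Chem.Mol(mol_h) uses the copy constructor, so the check and the embedding never share the same object.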