This doesn't help immediately, but it's worth mentioning that the upcoming
2019.09 release has functionality that should help here:

In [18]: m = Chem.MolFromSmiles('CN(C)(C)C',sanitize=False)

In [19]: problems = Chem.DetectChemistryProblems(m)
[06:47:43] Explicit valence for atom # 1 N, 4, is greater than permitted

In [20]: len(problems)
Out[20]: 1

In [21]: problems[0].GetType()
Out[21]: 'AtomValenceException'

In [22]: problems[0].GetAtomIdx()
Out[22]: 1

In [23]: problems[0].Message()
Out[23]: 'Explicit valence for atom # 1 N, 4, is greater than permitted'

In [24]: m2 = Chem.MolFromSmiles('c1cncc1',sanitize=False)

In [25]: problems = Chem.DetectChemistryProblems(m2)
[06:48:19] Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4

In [26]: len(problems)
Out[26]: 1

In [27]: problems[0].GetType()
Out[27]: 'KekulizeException'

In [28]: problems[0].GetAtomIndices()
Out[28]: (0, 1, 2, 3, 4)

In [29]: problems[0].Message()
Out[29]: "Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4\n"


For your case, since you have Hs and bonds, I would suggest directly
setting the charge on any 4-valent neutral nitrogen to +1.
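
A minimal sketch of that suggestion, working directly on the symbols and
connectivity lists described in the original message, before any RDKit
molecule is built. The function name guess_nitrogen_charges is hypothetical,
and the sketch assumes that all hydrogens are listed and that bond orders are
the integers 1, 2, or 3:

```python
from collections import defaultdict


def guess_nitrogen_charges(symbols, connectivity):
    """Return {atom_idx: 1} for every nitrogen whose bond orders sum to 4.

    symbols:      list of element symbols, e.g. ["N", "H", ...]
    connectivity: list of [atom_1_idx, atom_2_idx, bond_order] entries
    """
    # Sum the bond orders on each atom; with explicit hydrogens this is
    # the atom's total valence.
    valence = defaultdict(int)
    for atom_1, atom_2, order in connectivity:
        valence[atom_1] += order
        valence[atom_2] += order

    # Only nitrogens with a total valence of exactly 4 are assigned +1.
    return {
        idx: 1
        for idx, symbol in enumerate(symbols)
        if symbol == "N" and valence[idx] == 4
    }


# Ammonium: one nitrogen bonded to four hydrogens, total charge +1.
symbols = ["N", "H", "H", "H", "H"]
connectivity = [[0, 1, 1], [0, 2, 1], [0, 3, 1], [0, 4, 1]]
total_charge = 1

charges = guess_nitrogen_charges(symbols, connectivity)

# Sanity check against the known total charge of the molecule.
if sum(charges.values()) != total_charge:
    raise ValueError("assigned charges do not match the total charge")
```

The resulting charges could then be applied with
em.GetAtomWithIdx(idx).SetFormalCharge(charge) before calling
Chem.SanitizeMol. The total-charge check will fail for molecules that also
carry charges on other atoms, so treat it as a sanity check rather than a
complete charge assignment.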

Also check how the representation you are using encodes nitro groups: if it
writes them with two N=O double bonds, the nitrogen is 5-valent and the rule
above will not catch it.

-greg



On Fri, Oct 4, 2019 at 6:54 PM Chaya Stern <chaya.st...@choderalab.org>
wrote:

> Hello all,
>
> I am trying to create a molecule from a geometry (a numpy array of shape
> n_atoms x 3), symbols (a list of atom symbols), and a connectivity map (a
> list of lists, where each inner list is [atom_1_idx, atom_2_idx,
> bond_type]). The data also includes all hydrogens. The following code
> works most of the time:
>
> from rdkit import Chem
> from rdkit.Geometry.rdGeometry import Point3D
>
> _BO_DISPATCH_TABLE = {1: Chem.BondType.SINGLE, 2: Chem.BondType.DOUBLE, 3:
> Chem.BondType.TRIPLE}
>
> conformer = Chem.Conformer(len(symbols))
>
> molecule = Chem.Mol()
> em = Chem.RWMol(molecule)
> for i, s in enumerate(symbols):
>     atom = em.AddAtom(Chem.Atom(cmiles.utils._symbols[s]))
>     atom_position = Point3D(geometry[i][0], geometry[i][1], geometry[i][2])
>     conformer.SetAtomPosition(atom, atom_position)
>
> # Add connectivity
> for bond in connectivity:
>     bond_type = _BO_DISPATCH_TABLE[bond[-1]]
>     em.AddBond(bond[0], bond[1], bond_type)
>
> molecule = em.GetMol()
> Chem.SanitizeMol(molecule)
>
> However, the data that I have does not include explicit formal charges, so
> if a molecule has a tetravalent nitrogen I get the following error:
>
> ValueError: Sanitization error: Explicit valence for atom # 0 N, 4, is 
> greater than permitted
>
>
> Given that I have all the hydrogens and the total charge of the molecule, I
> can go in and add the charge to the problematic nitrogen and check that the
> total charge is still the same. But I am not sure how to capture the
> offending atom instance. I can get the information by parsing the error
> message (which is the hack I use now), but I was wondering if there is a
> better way to do it.
>
>
> Thank you,
>
> Chaya
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>