Given that those molecules are not chemically reasonable, I would suggest
either fixing them by hand or removing them.
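
A minimal sketch of both options, assuming the molecules arrive unsanitized (here they are parsed from SMILES with sanitize=False as a stand-in for the graph-built mol objects). The helper name try_sanitize is hypothetical, and the two repaired SMILES are only one plausible guess at the intended structures: an explicit hydrogen on the ring nitrogen (pyrrole) and a +1 charge on the four-valent nitrogen.

```python
from rdkit import Chem


def try_sanitize(mol):
    """Return (sanitized copy, None) on success, or (None, message) on failure."""
    candidate = Chem.Mol(mol)  # copy, so the caller's molecule is not modified
    try:
        Chem.SanitizeMol(candidate)
    except Exception as err:  # kekulization and valence failures both land here
        return None, str(err)
    return candidate, None


# Stand-ins for graph-built molecules: problem SMILES -> hand-repaired SMILES.
# The repairs are assumptions about what the dataset meant, not known answers.
examples = {
    "c1ccnc1": "c1cc[nH]c1",      # aromatic N given an explicit H (pyrrole)
    "C#CC#CN#N": "C#CC#C[N+]#N",  # four-valent N given a +1 charge
}

for original, repaired in examples.items():
    for smiles in (original, repaired):
        mol, error = try_sanitize(Chem.MolFromSmiles(smiles, sanitize=False))
        if mol is None:
            print(f"{smiles}: dropped ({error})")
        else:
            hybridizations = [str(a.GetHybridization()) for a in mol.GetAtoms()]
            print(f"{smiles}: kept, hybridization = {hybridizations}")
```

Filtering this way keeps the failure message for each rejected molecule, which makes it easier to decide later whether a hand repair is worth the effort.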

On Wed, 23 Oct 2019 at 16:46, Navid Shervani-Tabar <nshe...@gmail.com>
wrote:

> Thanks Dan,
>
> There are two more issues after sanitizing:
>
> 1. For some molecules (e.g., c1ccnc1), I get the following error: Can't
> kekulize mol.  Unkekulized atoms: 0 1 2 3 4
>
> 2. For some molecules (e.g., C#CC#CN#N), I get the following error:
> ValueError: Sanitization error: Explicit valence for atom # 4 N, 4, is
> greater than permitted
>
> I asked a similar question a few weeks ago, when I got a similar error
> with SMILES as my input, but none of the suggestions helped. Should
> I just get rid of these molecules?
>
> Thanks,
> Navid
>
> On Tue, Oct 22, 2019 at 11:57 PM Dan Nealschneider <
> dan.nealschnei...@schrodinger.com> wrote:
>
>> Navid-
>> You probably need to "sanitize" the mol:
>>
>> rdkit.Chem.rdmolops.SanitizeMol(mol)
>>
>> *dan nealschneider* | senior developer
>>
>> On Tue, Oct 22, 2019 at 6:31 PM Navid Shervani-Tabar <nshe...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I am trying to load a dataset using a vector of atoms (e.g.,
>>> [6,6,7,6,6,8]) and the corresponding adjacency matrix. I am using the
>>> following script to transform these into a mol object:
>>>
>>> from rdkit import Chem
>>>
>>> def MolFromGraphs(node_list, adjacency_matrix):
>>>
>>>     # create empty editable mol object
>>>     mol = Chem.RWMol()
>>>
>>>     # add atoms to mol and keep track of index
>>>     node_to_idx = {}
>>>     for i in range(len(node_list)):
>>>         a = Chem.Atom(node_list[i].item())
>>>         molIdx = mol.AddAtom(a)
>>>         node_to_idx[i] = molIdx
>>>
>>>     # add bonds between adjacent atoms
>>>     for ix, row in enumerate(adjacency_matrix):
>>>         for iy, bond in enumerate(row):
>>>
>>>             # only traverse half the matrix
>>>             if iy <= ix:
>>>                 continue
>>>
>>>             # add relevant bond type (there are many more of these)
>>>             if bond == 0:
>>>                 continue
>>>             elif bond == 1:
>>>                 bond_type = Chem.rdchem.BondType.SINGLE
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>             elif bond == 2:
>>>                 bond_type = Chem.rdchem.BondType.DOUBLE
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>             elif bond == 3:
>>>                 bond_type = Chem.rdchem.BondType.TRIPLE
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>             elif bond == 1.5:
>>>                 bond_type = Chem.rdchem.BondType.AROMATIC
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>
>>>     # Convert RWMol to Mol object
>>>     mol = mol.GetMol()
>>>
>>>     return mol
>>>
>>>
>>> When I try to get the hybridization of atoms using the mol object
>>> generated from the function above, I get *UNSPECIFIED.*
>>>
>>> To make sure that this function works, I used *MolToSmiles* to generate
>>> a SMILES string from the generated mol object and it matched the actual
>>> SMILES from the dataset. Interestingly, when I regenerate the mol object
>>> from the SMILES that I already generated from the above function, I can get
>>> the hybridization from the new mol object with no problem. I was wondering
>>> if there is a flag or variable that I should set in the above function to
>>> be able to get hybridization from the generated mol object.
>>>
>>> Thanks!
>>> Navid
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>