Thanks Dan,

There are two more issues after sanitizing:

1. For some molecules (e.g., c1ccnc1), I get the following error: Can't
kekulize mol.  Unkekulized atoms: 0 1 2 3 4

2. For some molecules (e.g., C#CC#CN#N), I get the following error:
ValueError: Sanitization error: Explicit valence for atom # 4 N, 4, is
greater than permitted

I asked a similar question a few weeks ago, when I got a similar error with
SMILES as my input, but none of the suggestions helped. Should I just
get rid of these molecules?
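
For reference, the second error can be reproduced without RDKit by summing
the bond orders around each atom in the adjacency matrix. The sketch below is
only a rough pre-screen: the function name overvalent_atoms and the table
MAX_NEUTRAL_VALENCE are hypothetical, and the valence limits are a simplified
assumption for neutral atoms, not RDKit's actual valence model.

```python
# Hypothetical pre-screen. The limits below are an assumption for neutral
# atoms (H, C, N, O, F) and are much simpler than RDKit's valence model.
MAX_NEUTRAL_VALENCE = {1: 1, 6: 4, 7: 3, 8: 2, 9: 1}


def overvalent_atoms(atomic_numbers, adjacency_matrix):
    """Return (atom index, atomic number, bond-order sum) for every atom
    whose summed bond orders exceed the assumed neutral valence."""
    flagged = []
    for i, z in enumerate(atomic_numbers):
        z = int(z)
        # Row i holds the bond orders from atom i to every other atom.
        total = sum(float(order) for order in adjacency_matrix[i])
        limit = MAX_NEUTRAL_VALENCE.get(z)
        # Elements missing from the table are skipped, not flagged.
        if limit is not None and total > limit:
            flagged.append((i, z, total))
    return flagged


# C#CC#CN#N as an atom vector plus a symmetric bond-order matrix
atoms = [6, 6, 6, 6, 7, 7]
bonds = [
    [0, 3, 0, 0, 0, 0],
    [3, 0, 1, 0, 0, 0],
    [0, 1, 0, 3, 0, 0],
    [0, 0, 3, 0, 1, 0],
    [0, 0, 0, 1, 0, 3],
    [0, 0, 0, 0, 3, 0],
]
print(overvalent_atoms(atoms, bonds))  # [(4, 7, 4.0)]
```

Atom 4 is the nitrogen with one single and one triple bond, which matches the
atom named in the sanitization message. A neutral nitrogen cannot carry four
bonds, so the graph encoding may have dropped a formal charge (for example a
diazonium-like [N+]), though that is a guess about the dataset. This check
does not catch the first error: c1ccnc1 passes it, because the kekulization
failure is about the aromatic ring having no valid single/double assignment,
which usually points to a missing hydrogen on the ring nitrogen (as in
c1cc[nH]c1).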

Thanks,
Navid

On Tue, Oct 22, 2019 at 11:57 PM Dan Nealschneider <
dan.nealschnei...@schrodinger.com> wrote:

> Navid-
> You probably need to "sanitize" the mol:
>
> rdkit.Chem.rdmolops.SanitizeMol(mol)
>
> *dan nealschneider* | senior developer
>
> On Tue, Oct 22, 2019 at 6:31 PM Navid Shervani-Tabar <nshe...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I am trying to load a dataset using a vector of atoms (e.g., [6,6,7,6,6,8])
>> and the corresponding adjacency matrix. I am using the following script to
>> transform these into a mol object:
>>
>> from rdkit import Chem
>>
>> def MolFromGraphs(node_list, adjacency_matrix):
>>
>>     # create empty editable mol object
>>     mol = Chem.RWMol()
>>
>>     # add atoms to mol and keep track of index
>>     node_to_idx = {}
>>     for i in range(len(node_list)):
>>         # int() accepts plain Python ints as well as NumPy/tensor scalars
>>         a = Chem.Atom(int(node_list[i]))
>>         molIdx = mol.AddAtom(a)
>>         node_to_idx[i] = molIdx
>>
>>     # add bonds between adjacent atoms
>>     for ix, row in enumerate(adjacency_matrix):
>>         for iy, bond in enumerate(row):
>>
>>             # only traverse half the matrix
>>             if iy <= ix:
>>                 continue
>>
>>             # add relevant bond type (there are many more of these)
>>             if bond == 0:
>>                 continue
>>             elif bond == 1:
>>                 bond_type = Chem.rdchem.BondType.SINGLE
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>             elif bond == 2:
>>                 bond_type = Chem.rdchem.BondType.DOUBLE
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>             elif bond == 3:
>>                 bond_type = Chem.rdchem.BondType.TRIPLE
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>             elif bond == 1.5:
>>                 bond_type = Chem.rdchem.BondType.AROMATIC
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>
>>     # Convert RWMol to Mol object
>>     mol = mol.GetMol()
>>
>>     return mol
>>
>>
>> When I try to get the hybridization of atoms using the mol object
>> generated from the function above, I get *UNSPECIFIED.*
>>
>> To make sure that this function works, I used *MolToSmiles* to generate
>> a SMILES string from the generated mol object and it matched the actual
>> SMILES from the dataset. Interestingly, when I regenerate the mol object
>> from the SMILES that I already generated from the above function, I can get
>> the hybridization from the new mol object with no problem. I was wondering
>> if there is a flag or variable that I should set in the above function to
>> be able to get hybridization from the generated mol object.
>>
>> Thanks!
>> Navid
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>