Thanks Dan,

There are two more issues after sanitizing:
1. For some molecules (e.g., c1ccnc1), I get the following error:

   Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4

2. For some molecules (e.g., C#CC#CN#N), I get the following error:

   ValueError: Sanitization error: Explicit valence for atom # 4 N, 4, is greater than permitted

I asked a similar question a few weeks ago, when I got a similar error with SMILES as my input, but none of the suggestions helped. Should I just get rid of these molecules?

Thanks,
Navid

On Tue, Oct 22, 2019 at 11:57 PM Dan Nealschneider <dan.nealschnei...@schrodinger.com> wrote:

> Navid-
> You probably need to "sanitize" the mol:
>
> rdkit.Chem.rdmolops.SanitizeMol(mol)
>
> *dan nealschneider* | senior developer
>
>
> On Tue, Oct 22, 2019 at 6:31 PM Navid Shervani-Tabar <nshe...@gmail.com> wrote:
>
>> Hello,
>>
>> I am trying to load a dataset using a vector of atoms (e.g., [6,6,7,6,6,8])
>> and the corresponding adjacency matrix. I am using the following script to
>> transform these into a mol object:
>>
>> def MolFromGraphs(node_list, adjacency_matrix):
>>
>>     # create empty editable mol object
>>     mol = Chem.RWMol()
>>
>>     # add atoms to mol and keep track of index
>>     node_to_idx = {}
>>     for i in range(len(node_list)):
>>         a = Chem.Atom(node_list[i].item())
>>         molIdx = mol.AddAtom(a)
>>         node_to_idx[i] = molIdx
>>
>>     # add bonds between adjacent atoms
>>     for ix, row in enumerate(adjacency_matrix):
>>         for iy, bond in enumerate(row):
>>
>>             # only traverse half the matrix
>>             if iy <= ix:
>>                 continue
>>
>>             # add relevant bond type (there are many more of these)
>>             if bond == 0:
>>                 continue
>>             elif bond == 1:
>>                 bond_type = Chem.rdchem.BondType.SINGLE
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>             elif bond == 2:
>>                 bond_type = Chem.rdchem.BondType.DOUBLE
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>             elif bond == 3:
>>                 bond_type = Chem.rdchem.BondType.TRIPLE
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>             elif bond == 1.5:
>>                 bond_type = Chem.rdchem.BondType.AROMATIC
>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>
>>     # Convert RWMol to Mol object
>>     mol = mol.GetMol()
>>
>>     return mol
>>
>>
>> When I try to get the hybridization of atoms using the mol object
>> generated from the function above, I get *UNSPECIFIED*.
>>
>> To make sure that this function works, I used *MolToSmiles* to generate
>> a SMILES string from the generated mol object, and it matched the actual
>> SMILES from the dataset. Interestingly, when I regenerate the mol object
>> from the SMILES that I generated with the above function, I can get
>> the hybridization from the new mol object with no problem. I was wondering
>> if there is a flag or variable that I should set in the above function to
>> be able to get hybridization from the generated mol object.
>>
>> Thanks!
>> Navid
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
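One possible way to look into the second error before building the mol: an atom vector plus adjacency matrix carries no formal charges, so an atom whose bond orders add up to more than its usual neutral valence will probably fail sanitization. The sketch below is plain Python and does not call RDKit. The helper name find_overvalent_atoms and the MAX_NEUTRAL_VALENCE table are made up for illustration, and the table is a simplification that covers only C, N, O and F. It also assumes plain Python lists rather than tensors.

```python
# Simplified neutral-atom valence limits keyed by atomic number.
# Assumption: only C, N, O and F occur; extend the table for other elements.
MAX_NEUTRAL_VALENCE = {6: 4, 7: 3, 8: 2, 9: 1}


def find_overvalent_atoms(node_list, adjacency_matrix):
    """Return (index, atomic_number, bond_order_sum) for atoms over the limit."""
    flagged = []
    for idx, atomic_num in enumerate(node_list):
        # Sum of bond orders to heavy-atom neighbours (hydrogens are not counted).
        bond_order_sum = sum(adjacency_matrix[idx])
        limit = MAX_NEUTRAL_VALENCE.get(int(atomic_num))
        if limit is not None and bond_order_sum > limit:
            flagged.append((idx, int(atomic_num), bond_order_sum))
    return flagged


# C#CC#CN#N written as an atom vector plus a symmetric bond-order matrix.
atoms = [6, 6, 6, 6, 7, 7]
bonds = [
    [0, 3, 0, 0, 0, 0],
    [3, 0, 1, 0, 0, 0],
    [0, 1, 0, 3, 0, 0],
    [0, 0, 3, 0, 1, 0],
    [0, 0, 0, 1, 0, 3],
    [0, 0, 0, 0, 3, 0],
]

print(find_overvalent_atoms(atoms, bonds))  # [(4, 7, 4)]
```

The flagged atom is the same nitrogen (index 4, valence 4) that the sanitization error names. If the dataset intends a charged nitrogen there, calling SetFormalCharge(1) on that atom before SanitizeMol may be enough, but that is a guess about the data rather than a general fix. This check does not cover the kekulization error: in c1ccnc1 the ring nitrogen would normally need an explicit hydrogen (as in c1cc[nH]c1), and the atom vector does not record hydrogen counts either.
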