Given that those molecules are not chemically reasonable, I would suggest either fixing them by hand or removing them.
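In case it helps, here is a rough sketch of both options. It is only an illustration under a few assumptions: the two SMILES from your message are parsed with sanitize=False as stand-ins for your graph-built molecules, `try_sanitize` is a hypothetical helper name, and the two hand fixes (an H on the ring nitrogen, a +1 charge on the four-valent nitrogen) are my guesses at the intended chemistry, which may not match what your dataset means.

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog("rdApp.*")  # keep RDKit's error log quiet for this demo


def try_sanitize(mol):
    """Return a sanitized copy of `mol`, or None if sanitization fails."""
    candidate = Chem.Mol(mol)  # copy, so the input is left untouched
    try:
        Chem.SanitizeMol(candidate)
    except ValueError:  # sanitization failures surface as ValueError (or a subclass)
        return None
    return candidate


# Stand-ins for the graph-built molecules: parsed without sanitization.
pyrrole_like = Chem.MolFromSmiles("c1ccnc1", sanitize=False)
diazo_like = Chem.MolFromSmiles("C#CC#CN#N", sanitize=False)

# Option 1: drop whatever does not sanitize.
candidates = (try_sanitize(pyrrole_like), try_sanitize(diazo_like))
kept = [m for m in candidates if m is not None]

# Option 2: fix by hand (the chemistry here is a guess).
# Assume the ring nitrogen (atom 3) should carry a hydrogen, as in pyrrole.
fixed_ring = Chem.Mol(pyrrole_like)
ring_n = fixed_ring.GetAtomWithIdx(3)
ring_n.SetNumExplicitHs(1)
ring_n.SetNoImplicit(True)
fixed_ring = try_sanitize(fixed_ring)

# Assume the four-valent nitrogen (atom 4) should be cationic, as in a diazonium.
fixed_chain = Chem.Mol(diazo_like)
fixed_chain.GetAtomWithIdx(4).SetFormalCharge(1)
fixed_chain = try_sanitize(fixed_chain)
```

Once a molecule sanitizes, GetHybridization() on its atoms should no longer return UNSPECIFIED, since hybridization is assigned as part of sanitization.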

On Wed, 23 Oct 2019 at 16:46, Navid Shervani-Tabar <nshe...@gmail.com> wrote:

> Thanks Dan,
>
> There are two more issues after sanitizing:
>
> 1. For some molecules (e.g. c1ccnc1), I get the following error:
>    Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4
>
> 2. For some molecules (e.g. C#CC#CN#N), I get the following error:
>    ValueError: Sanitization error: Explicit valence for atom # 4 N, 4, is
>    greater than permitted
>
> I asked a similar question a few weeks ago, where I got a similar error
> while having SMILES as my input, but none of the suggestions helped. Should
> I just get rid of these molecules?
>
> Thanks,
> Navid
>
> On Tue, Oct 22, 2019 at 11:57 PM Dan Nealschneider <
> dan.nealschnei...@schrodinger.com> wrote:
>
>> Navid-
>> You probably need to "sanitize" the mol:
>>
>> rdkit.Chem.rdmolops.SanitizeMol(mol)
>>
>> *dan nealschneider* | senior developer
>>
>> On Tue, Oct 22, 2019 at 6:31 PM Navid Shervani-Tabar <nshe...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I am trying to load a dataset using a vector of atoms (e.g.
>>> [6,6,7,6,6,8]) and the corresponding adjacency matrix. I am using the
>>> following script to transform these into a mol object:
>>>
>>> def MolFromGraphs(node_list, adjacency_matrix):
>>>
>>>     # create empty editable mol object
>>>     mol = Chem.RWMol()
>>>
>>>     # add atoms to mol and keep track of index
>>>     node_to_idx = {}
>>>     for i in range(len(node_list)):
>>>         a = Chem.Atom(node_list[i].item())
>>>         molIdx = mol.AddAtom(a)
>>>         node_to_idx[i] = molIdx
>>>
>>>     # add bonds between adjacent atoms
>>>     for ix, row in enumerate(adjacency_matrix):
>>>         for iy, bond in enumerate(row):
>>>
>>>             # only traverse half the matrix
>>>             if iy <= ix:
>>>                 continue
>>>
>>>             # add relevant bond type (there are many more of these)
>>>             if bond == 0:
>>>                 continue
>>>             elif bond == 1:
>>>                 bond_type = Chem.rdchem.BondType.SINGLE
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>             elif bond == 2:
>>>                 bond_type = Chem.rdchem.BondType.DOUBLE
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>             elif bond == 3:
>>>                 bond_type = Chem.rdchem.BondType.TRIPLE
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>             elif bond == 1.5:
>>>                 bond_type = Chem.rdchem.BondType.AROMATIC
>>>                 mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
>>>
>>>     # Convert RWMol to Mol object
>>>     mol = mol.GetMol()
>>>
>>>     return mol
>>>
>>> When I try to get the hybridization of atoms using the mol object
>>> generated from the function above, I get *UNSPECIFIED*.
>>>
>>> To make sure that this function works, I used *MolToSmiles* to generate
>>> a SMILES string from the generated mol object, and it matched the actual
>>> SMILES from the dataset. Interestingly, when I regenerate the mol object
>>> from the SMILES that I already generated from the above function, I can get
>>> the hybridization from the new mol object with no problem. I was wondering
>>> if there is a flag or variable that I should set in the above function to
>>> be able to get hybridization from the generated mol object.
>>>
>>> Thanks!
>>> Navid
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss