Makes sense, apologies for the lack of details -- it was a bit of a convoluted process to get to that molecule. Attached is a python script that hopefully reproduces it.
Essentially I'm taking the result of a Gaussian optimization (for a radical); constructing an SDF file with OpenBabel (via cclib), and then trying to read the result in RDKit. I have the SMILES string of the radical, but the connectivity is lost in the gaussian output file. So the SDF that gets created by OpenBabel has to assume bond orders based on distances that it sometimes gets wrong. I also had to edit the AssignBondOrdersFromTemplate function in AllChem to handle the radical atoms. If you had another recommendation on going from a gaussian output file to an RDKit mol though, I'd certainly like to hear it. Thanks! -- Peter On Wed, Nov 14, 2018 at 10:53 PM Greg Landrum <greg.land...@gmail.com> wrote: > Hi Peter, > > Without seeing how you're building the molecule this one is a bit tricky > to help with. > > If I start with a standard molecule and just adjust the valence count > things are fine: > > In [22]: m = Chem.MolFromSmiles('CNC(C)C') > > In [23]: m.GetAtomWithIdx(0).SetNumRadicalElectrons(1) > > In [24]: mh = Chem.AddHs(m) > > In [25]: print(Chem.MolToMolBlock(mh)) > > RDKit 2D > > 16 15 0 0 0 0 0 0 0 0999 V2000 > 0.0000 0.0000 0.0000 C 0 0 0 0 0 4 0 0 0 0 0 0 > 1.5000 -0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 > 2.2500 -1.2990 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 > 0.9510 -2.0490 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 > 3.5490 -0.5490 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 > -1.5000 0.0000 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 0.0000 1.5000 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > -0.0972 -0.7912 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 2.0861 1.3808 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 3.0000 -2.5981 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > -0.3481 -2.7990 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 0.3314 -1.5474 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 1.7010 -3.3481 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 4.8481 0.2010 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 4.2990 -1.8481 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 2.9630 0.8317 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0 > 1 2 1 0 > 2 3 1 0 > 3 4 1 0 > 3 5 1 0 > 1 6 1 0 > 1 7 1 0 > 1 8 1 0 > 2 9 1 0 > 3 10 1 0 > 4 11 1 0 > 4 12 1 0 > 4 13 1 0 > 5 14 1 0 > 5 15 1 0 > 5 16 1 0 > M RAD 1 1 2 > M END > > > In [26]: Chem.SanitizeMol(mh) > Out[26]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE > > In [27]: Chem.SanitizeMol(m) > Out[27]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE > > > How are you constructing the molecule with the radical? > > Best, > -greg > > > On Wed, Nov 14, 2018 at 7:36 PM Peter St. John <peterc.stj...@gmail.com> > wrote: > >> I have a molecule with radicals for which I'm trying to correct the bond >> orders. >> The mol block I have currently is shown below. >> >> Ultimately it thinks the first carbon (which is supposed to have 2 >> explicit hydrogens, 1 C-C bond, and 1 radical electron) has a valence of 5. >> So when I try to call `SanitizeMol`, it errors out with too high a valence. >> >> for the problematic atom 'a', >> >> >>> a.GetNumImplicitHs() >> >> RuntimeError: Pre-condition Violation >> getNumImplicitHs() called without preceding call to >> calcImplicitValence() >> >> >> >>> a.GetTotalValence() >> >> 3 (odd, since this is what I want) >> >> >> >>> a.UpdatePropertyCache() >> >> ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is >> greater than permitted >> >> >> And when I print the mol block, it clearly thinks that first carbon as a >> valence of 5. >> >> Any suggestions how to fix this? >> >> >> >>> print(Chem.MolToMolBlock(mol)) >> >> 9572 >> RDKit 3D >> >> 15 14 0 0 0 0 0 0 0 0999 V2000 >> 2.0411 -0.0455 -0.1061 C 0 0 0 0 0 *5* 0 0 0 0 0 0 >> 0.8127 -0.5644 0.2519 N 0 0 0 0 0 0 0 0 0 0 0 0 >> -0.3953 0.0049 -0.3294 C 0 0 0 0 0 0 0 0 0 0 0 0 >> -0.6511 1.4326 0.1487 C 0 0 0 0 0 0 0 0 0 0 0 0 >> -1.5741 -0.9060 -0.0263 C 0 0 0 0 0 0 0 0 0 0 0 0 >> 2.1578 0.2387 -1.1430 H 0 0 0 0 0 0 0 0 0 0 0 0 >> 2.9032 -0.4021 0.4366 H 0 0 0 0 0 0 0 0 0 0 0 0 >> 0.7154 -0.7889 1.2330 H 0 0 0 0 0 0 0 0 0 0 0 0 >> -0.2282 0.0219 -1.4109 H 0 0 0 0 0 0 0 0 0 0 0 0 >> -0.8463 1.4378 1.2242 H 0 0 0 0 0 0 0 0 0 0 0 0 >> 0.2197 2.0597 -0.0426 H 0 0 0 0 0 0 0 0 0 0 0 0 >> -1.5161 1.8651 -0.3565 H 0 0 0 0 0 0 0 0 0 0 0 0 >> -1.7375 -0.9640 1.0535 H 0 0 0 0 0 0 0 0 0 0 0 0 >> -1.3932 -1.9131 -0.4005 H 0 0 0 0 0 0 0 0 0 0 0 0 >> -2.4874 -0.5194 -0.4787 H 0 0 0 0 0 0 0 0 0 0 0 0 >> 1 2 1 0 >> 1 7 1 0 >> 2 8 1 0 >> 3 5 1 0 >> 3 4 1 0 >> 3 2 1 0 >> 4 10 1 0 >> 5 13 1 0 >> 6 1 1 0 >> 9 3 1 0 >> 11 4 1 0 >> 12 4 1 0 >> 14 5 1 0 >> 15 5 1 0 >> M RAD 1 1 2 >> M END >> >> >> Thanks! >> >> -- Peter St. John >> >> >> _______________________________________________ >> Rdkit-discuss mailing list >> Rdkit-discuss@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> >
from rdkit import Chem from rdkit.Chem import rdchem, SanitizeMol, MolFromSmiles from rdkit.Chem import BondType def AssignBondOrdersFromSMILES(mol, SMILES): """ assigns bond orders to a molecule based on the bond orders in a template molecule Arguments - refmol: the template molecule - mol: the molecule to assign bond orders to An example, start by generating a template from a SMILES and read in the PDB structure of the molecule """ refmol = MolFromSmiles(SMILES) refmol2 = rdchem.Mol(refmol) mol2 = rdchem.Mol(mol) # do the molecules match already? matching = mol2.GetSubstructMatch(refmol2) if not matching: # no, they don't match # check if bonds of mol are SINGLE for b in mol2.GetBonds(): if b.GetBondType() != BondType.SINGLE: b.SetBondType(BondType.SINGLE) b.SetIsAromatic(False) # set the bonds of mol to SINGLE for b in refmol2.GetBonds(): b.SetBondType(BondType.SINGLE) b.SetIsAromatic(False) # set atom charges to zero; for a in refmol2.GetAtoms(): a.SetFormalCharge(0) a.SetNumRadicalElectrons(0) for a in mol2.GetAtoms(): a.SetFormalCharge(0) a.SetNumRadicalElectrons(0) matching = mol2.GetSubstructMatches(refmol2, uniquify=False) # do the molecules match now? if not matching: raise RuntimeWarning("No Matches") matching = matching[0] # apply matching: set bond properties for b in refmol.GetBonds(): atom1 = matching[b.GetBeginAtomIdx()] atom2 = matching[b.GetEndAtomIdx()] b2 = mol2.GetBondBetweenAtoms(atom1, atom2) b2.SetBondType(b.GetBondType()) b2.SetIsAromatic(b.GetIsAromatic()) # apply matching: set atom properties for a in refmol.GetAtoms(): a2 = mol2.GetAtomWithIdx(matching[a.GetIdx()]) a2.SetHybridization(a.GetHybridization()) a2.SetIsAromatic(a.GetIsAromatic()) a2.SetNumExplicitHs(a.GetNumExplicitHs()) a2.SetFormalCharge(a.GetFormalCharge()) a2.SetNumRadicalElectrons(a.GetNumRadicalElectrons()) a2.SetNoImplicit(True) # SanitizeMol(mol2) return mol2 mol = Chem.MolFromMolBlock("""9572 RDKit 3D 15 14 0 0 0 0 0 0 0 0999 V2000 2.0411 -0.0455 -0.1061 C 0 0 0 0 0 0 0 0 0 0 0 0 0.8127 -0.5644 0.2519 N 0 0 0 0 0 0 0 0 0 0 0 0 -0.3953 0.0049 -0.3294 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.6511 1.4326 0.1487 C 0 0 0 0 0 0 0 0 0 0 0 0 -1.5741 -0.9060 -0.0263 C 0 0 0 0 0 0 0 0 0 0 0 0 2.1578 0.2387 -1.1430 H 0 0 0 0 0 0 0 0 0 0 0 0 2.9032 -0.4021 0.4366 H 0 0 0 0 0 0 0 0 0 0 0 0 0.7154 -0.7889 1.2330 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.2282 0.0219 -1.4109 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.8463 1.4378 1.2242 H 0 0 0 0 0 0 0 0 0 0 0 0 0.2197 2.0597 -0.0426 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.5161 1.8651 -0.3565 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.7375 -0.9640 1.0535 H 0 0 0 0 0 0 0 0 0 0 0 0 -1.3932 -1.9131 -0.4005 H 0 0 0 0 0 0 0 0 0 0 0 0 -2.4874 -0.5194 -0.4787 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 2 0 1 7 1 0 2 8 1 0 3 5 1 0 3 4 1 0 3 2 1 0 4 10 1 0 5 13 1 0 6 1 1 0 9 3 1 0 11 4 1 0 12 4 1 0 14 5 1 0 15 5 1 0 M END """, sanitize=False) # The resulting 'mol2' is the problem here mol2 = AssignBondOrdersFromSMILES(mol, '[CH2]NC(C)C') a = mol2.GetAtoms()[0] a.GetTotalValence() # 4 a.GetNumRadicalElectrons() # 1 print(Chem.MolToMolBlock(mol2)) # 9572 # RDKit 3D # # 15 14 0 0 0 0 0 0 0 0999 V2000 # 2.0411 -0.0455 -0.1061 C 0 0 0 0 0 5 0 0 0 0 0 0 # 0.8127 -0.5644 0.2519 N 0 0 0 0 0 0 0 0 0 0 0 0 # -0.3953 0.0049 -0.3294 C 0 0 0 0 0 0 0 0 0 0 0 0 # -0.6511 1.4326 0.1487 C 0 0 0 0 0 0 0 0 0 0 0 0 # -1.5741 -0.9060 -0.0263 C 0 0 0 0 0 0 0 0 0 0 0 0 # 2.1578 0.2387 -1.1430 H 0 0 0 0 0 0 0 0 0 0 0 0 # 2.9032 -0.4021 0.4366 H 0 0 0 0 0 0 0 0 0 0 0 0 # 0.7154 -0.7889 1.2330 H 0 0 0 0 0 0 0 0 0 0 0 0 # -0.2282 0.0219 -1.4109 H 0 0 0 0 0 0 0 0 0 0 0 0 # -0.8463 1.4378 1.2242 H 0 0 0 0 0 0 0 0 0 0 0 0 # 0.2197 2.0597 -0.0426 H 0 0 0 0 0 0 0 0 0 0 0 0 # -1.5161 1.8651 -0.3565 H 0 0 0 0 0 0 0 0 0 0 0 0 # -1.7375 -0.9640 1.0535 H 0 0 0 0 0 0 0 0 0 0 0 0 # -1.3932 -1.9131 -0.4005 H 0 0 0 0 0 0 0 0 0 0 0 0 # -2.4874 -0.5194 -0.4787 H 0 0 0 0 0 0 0 0 0 0 0 0 # 1 2 1 0 # 1 7 1 0 # 2 8 1 0 # 3 5 1 0 # 3 4 1 0 # 3 2 1 0 # 4 10 1 0 # 5 13 1 0 # 6 1 1 0 # 9 3 1 0 # 11 4 1 0 # 12 4 1 0 # 14 5 1 0 # 15 5 1 0 # M RAD 1 1 2 # M END for atom in mol2.GetAtoms(): atom.SetNoImplicit(True) print(Chem.MolToMolBlock(mol2)) # 9572 # RDKit 3D # # 15 14 0 0 0 0 0 0 0 0999 V2000 # 2.0411 -0.0455 -0.1061 C 0 0 0 0 0 4 0 0 0 0 0 0 # 0.8127 -0.5644 0.2519 N 0 0 0 0 0 0 0 0 0 0 0 0 # -0.3953 0.0049 -0.3294 C 0 0 0 0 0 0 0 0 0 0 0 0 # -0.6511 1.4326 0.1487 C 0 0 0 0 0 0 0 0 0 0 0 0 # -1.5741 -0.9060 -0.0263 C 0 0 0 0 0 0 0 0 0 0 0 0 # 2.1578 0.2387 -1.1430 H 0 0 0 0 0 0 0 0 0 0 0 0 # 2.9032 -0.4021 0.4366 H 0 0 0 0 0 0 0 0 0 0 0 0 # 0.7154 -0.7889 1.2330 H 0 0 0 0 0 0 0 0 0 0 0 0 # -0.2282 0.0219 -1.4109 H 0 0 0 0 0 0 0 0 0 0 0 0 # -0.8463 1.4378 1.2242 H 0 0 0 0 0 0 0 0 0 0 0 0 # 0.2197 2.0597 -0.0426 H 0 0 0 0 0 0 0 0 0 0 0 0 # -1.5161 1.8651 -0.3565 H 0 0 0 0 0 0 0 0 0 0 0 0 # -1.7375 -0.9640 1.0535 H 0 0 0 0 0 0 0 0 0 0 0 0 # -1.3932 -1.9131 -0.4005 H 0 0 0 0 0 0 0 0 0 0 0 0 # -2.4874 -0.5194 -0.4787 H 0 0 0 0 0 0 0 0 0 0 0 0 # 1 2 1 0 # 1 7 1 0 # 2 8 1 0 # 3 5 1 0 # 3 4 1 0 # 3 2 1 0 # 4 10 1 0 # 5 13 1 0 # 6 1 1 0 # 9 3 1 0 # 11 4 1 0 # 12 4 1 0 # 14 5 1 0 # 15 5 1 0 # M RAD 1 1 2 # M END
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss