Makes sense, apologies for the lack of details -- it was a bit of a
convoluted process to get to that molecule.
Attached is a python script that hopefully reproduces it.

Essentially I'm taking the result of a Gaussian optimization (for a
radical); constructing an SDF file with OpenBabel (via cclib), and then
trying to read the result in RDKit.
I have the SMILES string of the radical, but the connectivity is lost in
the gaussian output file. So the SDF that gets created by OpenBabel has to
assume bond orders based on distances that it sometimes gets wrong.
I also had to edit the AssignBondOrdersFromTemplate function in AllChem to
handle the radical atoms.

If you had another recommendation on going from a gaussian output file to
an RDKit mol though, I'd certainly like to hear it.

Thanks!
-- Peter

On Wed, Nov 14, 2018 at 10:53 PM Greg Landrum <greg.land...@gmail.com>
wrote:

> Hi Peter,
>
> Without seeing how you're building the molecule this one is a bit tricky
> to help with.
>
> If I start with a standard molecule and just adjust the valence count
> things are fine:
>
> In [22]: m = Chem.MolFromSmiles('CNC(C)C')
>
> In [23]: m.GetAtomWithIdx(0).SetNumRadicalElectrons(1)
>
> In [24]: mh = Chem.AddHs(m)
>
> In [25]: print(Chem.MolToMolBlock(mh))
>
>      RDKit          2D
>
>  16 15  0  0  0  0  0  0  0  0999 V2000
>     0.0000    0.0000    0.0000 C   0  0  0  0  0  4  0  0  0  0  0  0
>     1.5000   -0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
>     2.2500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.9510   -2.0490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     3.5490   -0.5490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.5000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.0000    1.5000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.0972   -0.7912    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.0861    1.3808    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     3.0000   -2.5981    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.3481   -2.7990    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.3314   -1.5474    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     1.7010   -3.3481    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     4.8481    0.2010    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     4.2990   -1.8481    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.9630    0.8317    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   2  3  1  0
>   3  4  1  0
>   3  5  1  0
>   1  6  1  0
>   1  7  1  0
>   1  8  1  0
>   2  9  1  0
>   3 10  1  0
>   4 11  1  0
>   4 12  1  0
>   4 13  1  0
>   5 14  1  0
>   5 15  1  0
>   5 16  1  0
> M  RAD  1   1   2
> M  END
>
>
> In [26]: Chem.SanitizeMol(mh)
> Out[26]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [27]: Chem.SanitizeMol(m)
> Out[27]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
>
> How are you constructing the molecule with the radical?
>
> Best,
> -greg
>
>
> On Wed, Nov 14, 2018 at 7:36 PM Peter St. John <peterc.stj...@gmail.com>
> wrote:
>
>> I have a molecule with radicals for which I'm trying to correct the bond
>> orders.
>> The mol block I have currently is shown below.
>>
>> Ultimately it thinks the first carbon (which is supposed to have 2
>> explicit hydrogens, 1 C-C bond, and 1 radical electron) has a valence of 5.
>> So when I try to call `SanitizeMol`, it errors out with too high a valence.
>>
>> for the problematic atom 'a',
>>
>> >>> a.GetNumImplicitHs()
>>
>> RuntimeError: Pre-condition Violation
>>      getNumImplicitHs() called without preceding call to 
>> calcImplicitValence()
>>
>>
>> >>> a.GetTotalValence()
>>
>> 3 (odd, since this is what I want)
>>
>>
>> >>> a.UpdatePropertyCache()
>>
>> ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is 
>> greater than permitted
>>
>>
>> And when I print the mol block, it clearly thinks that first carbon as a 
>> valence of 5.
>>
>> Any suggestions how to fix this?
>>
>>
>> >>> print(Chem.MolToMolBlock(mol))
>>
>> 9572
>>      RDKit          3D
>>
>>  15 14  0  0  0  0  0  0  0  0999 V2000
>>     2.0411   -0.0455   -0.1061 C   0  0  0  0  0  *5*  0  0  0  0  0  0
>>     0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
>>    -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
>>    -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
>>    -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
>>    -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
>>    -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
>>    -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
>>    -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
>>    -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
>>    -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
>>   1  2  1  0
>>   1  7  1  0
>>   2  8  1  0
>>   3  5  1  0
>>   3  4  1  0
>>   3  2  1  0
>>   4 10  1  0
>>   5 13  1  0
>>   6  1  1  0
>>   9  3  1  0
>>  11  4  1  0
>>  12  4  1  0
>>  14  5  1  0
>>  15  5  1  0
>> M  RAD  1   1   2
>> M  END
>>
>>
>> Thanks!
>>
>> -- Peter St. John
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
from rdkit import Chem
from rdkit.Chem import rdchem, SanitizeMol, MolFromSmiles
from rdkit.Chem import BondType

def AssignBondOrdersFromSMILES(mol, SMILES):
    """ assigns bond orders to a molecule based on the
    bond orders in a template molecule
    Arguments
    - refmol: the template molecule
    - mol: the molecule to assign bond orders to
    An example, start by generating a template from a SMILES
    and read in the PDB structure of the molecule
    """

    refmol = MolFromSmiles(SMILES)
    refmol2 = rdchem.Mol(refmol)
    mol2 = rdchem.Mol(mol)
    # do the molecules match already?
    matching = mol2.GetSubstructMatch(refmol2)
    if not matching:  # no, they don't match
        # check if bonds of mol are SINGLE
        for b in mol2.GetBonds():
            if b.GetBondType() != BondType.SINGLE:
                b.SetBondType(BondType.SINGLE)
                b.SetIsAromatic(False)
        # set the bonds of mol to SINGLE
        for b in refmol2.GetBonds():
            b.SetBondType(BondType.SINGLE)
            b.SetIsAromatic(False)
        # set atom charges to zero;
        for a in refmol2.GetAtoms():
            a.SetFormalCharge(0)
            a.SetNumRadicalElectrons(0)
        for a in mol2.GetAtoms():
            a.SetFormalCharge(0)
            a.SetNumRadicalElectrons(0)

    matching = mol2.GetSubstructMatches(refmol2, uniquify=False)
    # do the molecules match now?
    if not matching:
        raise RuntimeWarning("No Matches")

    matching = matching[0]
    # apply matching: set bond properties
    for b in refmol.GetBonds():
        atom1 = matching[b.GetBeginAtomIdx()]
        atom2 = matching[b.GetEndAtomIdx()]
        b2 = mol2.GetBondBetweenAtoms(atom1, atom2)
        b2.SetBondType(b.GetBondType())
        b2.SetIsAromatic(b.GetIsAromatic())
    # apply matching: set atom properties
    for a in refmol.GetAtoms():
        a2 = mol2.GetAtomWithIdx(matching[a.GetIdx()])
        a2.SetHybridization(a.GetHybridization())
        a2.SetIsAromatic(a.GetIsAromatic())
        a2.SetNumExplicitHs(a.GetNumExplicitHs())
        a2.SetFormalCharge(a.GetFormalCharge())
        a2.SetNumRadicalElectrons(a.GetNumRadicalElectrons())
        a2.SetNoImplicit(True)
    # SanitizeMol(mol2)
    return mol2


mol = Chem.MolFromMolBlock("""9572
     RDKit          3D

 15 14  0  0  0  0  0  0  0  0999 V2000
    2.0411   -0.0455   -0.1061 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0
  1  7  1  0
  2  8  1  0
  3  5  1  0
  3  4  1  0
  3  2  1  0
  4 10  1  0
  5 13  1  0
  6  1  1  0
  9  3  1  0
 11  4  1  0
 12  4  1  0
 14  5  1  0
 15  5  1  0
M  END
""", sanitize=False)


# The resulting 'mol2' is the problem here
mol2 = AssignBondOrdersFromSMILES(mol, '[CH2]NC(C)C')

a = mol2.GetAtoms()[0]
a.GetTotalValence()  # 4
a.GetNumRadicalElectrons()  # 1

print(Chem.MolToMolBlock(mol2))
# 9572
#      RDKit          3D
#
#  15 14  0  0  0  0  0  0  0  0999 V2000
#     2.0411   -0.0455   -0.1061 C   0  0  0  0  0  5  0  0  0  0  0  0
#     0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
#     2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
#     2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
#     0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
#     0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
#   1  2  1  0
#   1  7  1  0
#   2  8  1  0
#   3  5  1  0
#   3  4  1  0
#   3  2  1  0
#   4 10  1  0
#   5 13  1  0
#   6  1  1  0
#   9  3  1  0
#  11  4  1  0
#  12  4  1  0
#  14  5  1  0
#  15  5  1  0
# M  RAD  1   1   2
# M  END


for atom in mol2.GetAtoms():
    atom.SetNoImplicit(True)

print(Chem.MolToMolBlock(mol2))
# 9572
#      RDKit          3D
#
#  15 14  0  0  0  0  0  0  0  0999 V2000
#     2.0411   -0.0455   -0.1061 C   0  0  0  0  0  4  0  0  0  0  0  0
#     0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
#     2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
#     2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
#     0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
#     0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
#    -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
#   1  2  1  0
#   1  7  1  0
#   2  8  1  0
#   3  5  1  0
#   3  4  1  0
#   3  2  1  0
#   4 10  1  0
#   5 13  1  0
#   6  1  1  0
#   9  3  1  0
#  11  4  1  0
#  12  4  1  0
#  14  5  1  0
#  15  5  1  0
# M  RAD  1   1   2
# M  END
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to