Hi Patrick,

I don't know Avogadro in detail, but since it is based on OpenBabel, I imagine it uses an algorithm to guess bond orders from bond distances, angles, etc., whereas the RDKit does not. In fact, if I use OpenBabel to convert a benzene ring from SMILES to SDF via PDB format (which carries no bond information once the CONECT records are removed), I still get correct bond orders; this does not happen with the RDKit, which is expected:


echo 'c1ccccc1' | babel --gen3d -i smi -o pdb | grep -v CONECT | babel -i pdb -o sdf

yields


 OpenBabel11021718533D

 12 12  0  0  0  0  0  0  0  0999 V2000
   -0.7600    1.1690   -0.0010 C   0  0  0  0  0  0  0 0  0  0  0  0
    0.6330    1.2450   -0.0010 C   0  0  0  0  0  0  0 0  0  0  0  0
    1.3950    0.0770    0.0000 C   0  0  0  0  0  0  0 0  0  0  0  0
    0.7640   -1.1680    0.0030 C   0  0  0  0  0  0  0 0  0  0  0  0
   -0.6290   -1.2430    0.0000 C   0  0  0  0  0  0  0 0  0  0  0  0
   -1.3910   -0.0750   -0.0020 C   0  0  0  0  0  0  0 0  0  0  0  0
   -1.3540    2.0790    0.0010 H   0  0  0  0  0  0  0 0  0  0  0  0
    1.1240    2.2140   -0.0030 H   0  0  0  0  0  0  0 0  0  0  0  0
    2.4800    0.1350   -0.0000 H   0  0  0  0  0  0  0 0  0  0  0  0
    1.3580   -2.0780    0.0060 H   0  0  0  0  0  0  0 0  0  0  0  0
   -1.1200   -2.2130   -0.0000 H   0  0  0  0  0  0  0 0  0  0  0  0
   -2.4760   -0.1340   -0.0030 H   0  0  0  0  0  0  0 0  0  0  0  0
  1  2  1  0  0  0  0
  1  7  1  0  0  0  0
  2  3  2  0  0  0  0
  3  9  1  0  0  0  0
  3  4  1  0  0  0  0
  4 10  1  0  0  0  0
  5 11  1  0  0  0  0
  5  4  2  0  0  0  0
  6  1  2  0  0  0  0
  6  5  1  0  0  0  0
  8  2  1  0  0  0  0
 12  6  1  0  0  0  0
M  END

whereas:

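import re
from rdkit import Chem
from rdkit.Chem import rdDistGeom
mol = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1'))
rdDistGeom.EmbedMolecule(mol)
# drop the CONECT records so that the PDB block carries no bond information
molFromPDB = Chem.MolFromPDBBlock(re.sub(r'CONECT.*\n', '', Chem.MolToPDBBlock(mol)), removeHs=False)
print(Chem.MolToMolBlock(molFromPDB))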

yields


     RDKit          3D

 12 12  0  0  0  0  0  0  0  0999 V2000
   -0.0060   -1.3620    0.0590 C   0  0  0  0  0  0  0 0  0  0  0  0
   -1.2180   -0.7080    0.0690 C   0  0  0  0  0  0  0 0  0  0  0  0
   -1.2090    0.6740   -0.0030 C   0  0  0  0  0  0  0 0  0  0  0  0
   -0.0030    1.3620    0.0280 C   0  0  0  0  0  0  0 0  0  0  0  0
    1.2210    0.7090    0.0760 C   0  0  0  0  0  0  0 0  0  0  0  0
    1.1960   -0.6780    0.0000 C   0  0  0  0  0  0  0 0  0  0  0  0
   -0.0250   -2.4100   -0.2260 H   0  0  0  0  0  0  0 0  0  0  0  0
   -2.1060   -1.1760    0.4870 H   0  0  0  0  0  0  0 0  0  0  0  0
   -2.0910    1.1950   -0.3790 H   0  0  0  0  0  0  0 0  0  0  0  0
    0.0420    2.4080   -0.2780 H   0  0  0  0  0  0  0 0  0  0  0  0
    2.1030    1.1830    0.5000 H   0  0  0  0  0  0  0 0  0  0  0  0
    2.0960   -1.1980   -0.3330 H   0  0  0  0  0  0  0 0  0  0  0  0
  2  1  1  0
  3  2  1  0
  4  3  1  0
  5  4  1  0
  6  5  1  0
  6  1  1  0
  7  1  1  0
  8  2  1  0
  9  3  1  0
 10  4  1  0
 11  5  1  0
 12  6  1  0
M  END

Note that the RDKit does get the correct bond orders if the PDB file contains CONECT records with duplicate (or triplicate) entries for the relevant atom indices, which is the widely accepted (though unofficial) way to encode bond orders in PDB files.
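
To make the CONECT point concrete, here is a minimal sketch that round-trips a molecule through a PDB block with and without its CONECT records and reports the bond types the RDKit perceives. The helper name roundtrip_bond_types, the fixed random seed and the choice of acetaldehyde as a test molecule are my own assumptions for illustration; I am relying on MolToPDBBlock writing repeated CONECT entries for multiple bonds by default.

```python
from rdkit import Chem
from rdkit.Chem import rdDistGeom


def roundtrip_bond_types(smiles, keep_conect, seed=42):
    """Embed a molecule, write it to a PDB block, optionally drop the CONECT
    records, read it back, and return the perceived bond types.
    (Hypothetical helper, for illustration only.)"""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    if rdDistGeom.EmbedMolecule(mol, randomSeed=seed) < 0:
        raise RuntimeError("embedding failed")
    lines = Chem.MolToPDBBlock(mol).splitlines()
    if not keep_conect:
        # without CONECT records only the coordinates are left
        lines = [line for line in lines if not line.startswith("CONECT")]
    pdb_block = "\n".join(lines) + "\n"
    mol_from_pdb = Chem.MolFromPDBBlock(pdb_block, removeHs=False)
    return [bond.GetBondType() for bond in mol_from_pdb.GetBonds()]


# acetaldehyde: one C=O double bond, everything else single
print("with CONECT:   ", roundtrip_bond_types("CC=O", keep_conect=True))
print("without CONECT:", roundtrip_bond_types("CC=O", keep_conect=False))
```

With the CONECT records kept I would expect the C=O bond to come back as a double bond, and without them every bond to come back as single, since the bonds are then inferred from interatomic distances only.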

Cheers,
p.


On 11/02/17 18:45, Patrick Avery wrote:
Yes, that is probably correct. I am loading a pdb file for the initial conformer, so bond orders are not specified.

But I find it strange, still, that when I use MMFF94 in Avogadro to optimize it, it results in a planar shape even though all the bonds are still single.

On Thu, Nov 2, 2017 at 2:38 PM, Paolo Tosco <paolo.to...@unito.it> wrote:

    Dear Patrick,

    my guess is that you loaded the caffeine coordinates from PDB, or
    anyway from a format where bond orders were not specified. All
    atoms appear to be sp3-hybridized, which results in the wrong
    geometry being generated.

    Hope that helps,
    Paolo


    On 11/02/17 18:20, Patrick Avery wrote:
    Hey there RDKitters,

    I have been generating conformers in RDKit, optimizing them using
    MMFF94, and sorting them by their energies. For some tests, I
    have been using caffeine. But I seem to have some strange
    results, and I wonder if anyone knows why.

    I have attached two images of an MMFF94 optimized caffeine
    conformer. In them, the oxygen in the top right corner is the
    primary strange thing I see (although the rest of the molecule
    doesn't seem to be very planar either, which may also be strange).

    Note that the oxygen is sticking out of the plane of the
    molecule. This is the lowest energy conformer generated and
    optimized by RDKit (and other conformers low in energy are
    similar to it - with the oxygen sticking out).

    If I take that exact molecule and MMFF94 optimize it in Avogadro,
    all the atoms move to be in the same plane (including the oxygen
    that was sticking out). So the problem goes away if I MMFF94
    optimize it with Avogadro.

    So the question is: why are RDKit and Avogadro giving different
    results for the MMFF94 optimization? And why does RDKit's MMFF94
    push the oxygen out of the plane so much?

    In RDKit, I am using the C++ function
    RDKit::MMFF::MMFFOptimizeMoleculeConfs().

    My parameters are the mol (with all the conformers in it), an
    empty result vector, 1 thread, 1000 maximum optimization
    iterations, "MMFF94" for the mmffVariant, 100.0 for the nonbonded
    threshold, and true for ignoreInterfragInteractions.
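
For reference, the same call can be sketched through the Python wrappers, using the parameter values listed above. This is only an illustration under assumptions: the helper name optimize_and_rank, the caffeine SMILES, the number of conformers and the random seed are not from my actual C++ code, and here the molecule is built from SMILES, so the bond orders are known.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def optimize_and_rank(smiles, num_confs=10, seed=42):
    """Embed conformers, optimize them with MMFF94, and return the molecule
    together with (conformer id, energy) pairs sorted by increasing energy.
    (Hypothetical helper, for illustration only.)"""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=num_confs, randomSeed=seed))
    # one (not_converged, energy) tuple per conformer, in conformer order
    results = AllChem.MMFFOptimizeMoleculeConfs(
        mol,
        numThreads=1,
        maxIters=1000,
        mmffVariant="MMFF94",
        nonBondedThresh=100.0,
        ignoreInterfragInteractions=True,
    )
    energies = [energy for _, energy in results]
    ranked = sorted(zip(conf_ids, energies), key=lambda pair: pair[1])
    return mol, ranked


# caffeine, written as SMILES so that bond orders and aromaticity are defined
mol, ranked = optimize_and_rank("Cn1cnc2c1c(=O)n(C)c(=O)n2C", num_confs=5)
for conf_id, energy in ranked:
    print(conf_id, round(energy, 3))
```
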

    Let me know if anyone knows why.

    Thanks,
    Patrick


    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org! http://sdm.link/slashdot


    _______________________________________________
    Rdkit-discuss mailing list
    Rdkit-discuss@lists.sourceforge.net
    https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


    