I managed to make some time to look into this this weekend and I've found a
bug and something I don't understand. Hopefully the community can help out
here.
On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearns...@gmail.com>
wrote:
> 4) The big one! The returned results look very odd. They appear to relate
> more to the dimensions of the molecule than the moments of inertia. For a
> rod-like molecule (dimethylacetylene) I'd expect two large and one small
> PMI (e.g. PMI1: 6.61651 PMI2: 150.434 PMI3: 150.434 NPR1: 0.0439828
> NPR2: 0.999998) but actually get PMI1: 0.061647 PMI2: 0.061652 PMI3:
> 25.3699 NPR1: 0.002430 NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium
> (e.g. PMI1: 89.1448 PMI2: 89.1495 PMI3: 178.294 NPR1: 0.499987 NPR2:
> 0.500013) but get PMI1: 2.37457e-10 PMI2: 11.0844 PMI3: 11.0851 NPR1:
> 2.14213e-11 NPR2: 0.999933.
> Finally for a roughly spherical molecule (neopentane) the NPR values look
> reasonable (no great surprise) but the absolute PMI values may be too
> small: old program - PMI1: 114.795 PMI2: 114.797 PMI3: 114.799
> NPR1: 0.999966 NPR2: 0.999988, new program - PMI1: 6.59466 PMI2:
> 6.59488 PMI3: 6.59531 NPR1: 0.999902 NPR2: 0.999935
>
Your expectations are correct: the current RDKit implementation is wrong.
The corresponding github entry is here:
https://github.com/rdkit/rdkit/issues/1262
This is due to a mistake in the way the principal moments are calculated
(which is due to the fact that I don't spend a lot of time working
with/thinking about 3D descriptors). Instead of using the
eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the
RDKit is currently using the covariance matrix. There's some more on the
relationship between these two here:
http://number-none.com/blow/inertia/deriving_i.html
The problem is easy to fix (and I have something working here:
https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up
the values of the descriptors that are derived from here:
Todeschini and Consoni "Descriptors from Molecular Geometry" Handbook of
Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
These include the radius of gyration, inertial shape factor, etc.
Within that article they state that Ic = 0 for planar molecules. Ignoring
the inequality on page 1010, which says that Ic is the largest moment and
is contradicted by the rest of the text (particularly the inequalities on
page 1011), Ic corresponds to the smallest principal moment : PMI1.
So now I'm confused, but I'm hoping this is obvious to someone versed in
the field: I'd like to reproduce the descriptors described in the
Todeschini article, but I clearly can't do that using the actual moments of
inertia. I could keep using the eigenvalues of the covariance matrix there,
but that doesn't match what's described in the text.
Two things that would be extremely helpful:
1) an explanation of the disconnect here from someone who knows this stuff,
I would guess that it's pretty simple
2) The results of running the files github1262_1.mol, github1262_2.mol, and
github1262_3.mol from here:
https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data
through Dragon and calculating the radius of gyration, inertial shape
factor, eccentricity, molecular asphericity, and spherocity index.
Best,
-greg
>
------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss