Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Guillaume GODIN
Thanks Brian,


PBF = 0 <=> 2D and PBF > 0 <=> 3D.

I had forgotten that point.


BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R&D DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Brian Kelley <fustiga...@gmail.com>
Sent: Tuesday, 17 January 2017 14:06
To: Guillaume GODIN
Cc: cgearns...@gmail.com; Rdkit-discuss@lists.sourceforge.net; Greg Landrum
Subject: Re: [Rdkit-discuss] PMI API

In the inertial frame this is trivial.  With the current RDKit, however, 
can't you just use the plane of best fit (PBF) to separate planar from 3D?  
For a linear molecule, you can use the PMI descriptors.

See PBF in RDKit

http://pubs.acs.org/doi/abs/10.1021/ci300293f

Cheers,
 Brian


Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Brian Kelley
In the inertial frame this is trivial.  With the current RDKit, however,
can't you just use the plane of best fit (PBF) to separate planar from 3D?
For a linear molecule, you can use the PMI descriptors.

See PBF in RDKit

http://pubs.acs.org/doi/abs/10.1021/ci300293f
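
As a rough illustration of the linear / planar / 3D test being discussed, here is a dependency-free Python sketch. It does not call RDKit's PBF or PMI descriptors. It assumes plain (x, y, z) tuples and checks the same geometric idea directly: all atoms on one line means linear, and all atoms at zero distance from a common plane (which is what PBF = 0 expresses) means planar. The name `classify_shape` and the tolerance `tol` are hypothetical choices made for this example.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _norm(a):
    return _dot(a, a) ** 0.5


def classify_shape(coords, tol=1e-6):
    """Return 'point', 'linear', 'planar' or '3D' for a list of (x, y, z) tuples."""
    if not coords:
        raise ValueError("no coordinates")
    origin = coords[0]
    vecs = [_sub(c, origin) for c in coords[1:]]
    # Reference direction: the longest displacement from the first atom.
    u = max(vecs, key=_norm, default=(0.0, 0.0, 0.0))
    if _norm(u) <= tol:
        return "point"
    # |u x v| / |u| is the distance of an atom from the line along u.
    n = max((_cross(u, v) for v in vecs), key=_norm)
    if _norm(n) <= tol * _norm(u):
        return "linear"
    # |n . v| / |n| is the distance of an atom from the plane with normal n.
    n_len = _norm(n)
    if all(abs(_dot(n, v)) / n_len <= tol for v in vecs):
        return "planar"
    return "3D"


# Idealized, hand-built coordinates (not taken from the thread).
rod = [(0, 0, -1.66), (0, 0, -0.6), (0, 0, 0.6), (0, 0, 1.66)]
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
corner = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(classify_shape(rod), classify_shape(square), classify_shape(corner))
```

With real embedded conformers the distances are never exactly zero, so the tolerance would need to be loosened to something chemically meaningful.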

Cheers,
 Brian

On Tue, Jan 17, 2017 at 7:58 AM, Guillaume GODIN <
guillaume.go...@firmenich.com> wrote:

> Great! I also noticed confusing usage of "moment of inertia" in those
> descriptors.
>
> For example, in the WHIM case we need to know whether the molecule is
> linear, planar or 3D in order to compute the descriptors.
>
> I have not found an easy way to determine this yet.
>
> BR,

Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Guillaume GODIN
Great! I also noticed confusing usage of "moment of inertia" in those descriptors.

For example, in the WHIM case we need to know whether the molecule is linear, 
planar or 3D in order to compute the descriptors.

I have not found an easy way to determine this yet.

BR,



From: Brian Kelley <fustiga...@gmail.com>
Sent: Tuesday, 17 January 2017 13:44
To: Chris Earnshaw
Cc: Rdkit-discuss@lists.sourceforge.net; Greg Landrum
Subject: Re: [Rdkit-discuss] PMI API

I think we agree here.  I was talking about the raw moment (M1z), not the 
moment of inertia (MI1); I should have made the distinction more explicit.  
Moments are not necessarily moments of inertia.  The terminology gets confusing.

After a brief discussion with Greg: Moments.py does the correct calculation, 
which indirectly verifies MOE and the newer RDKit implementation.

Cheers,
 Brian

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss






Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Brian Kelley
I think we agree here.  I was talking about the raw moment (M1z), not the
moment of inertia (MI1); I should have made the distinction more explicit.
Moments are not necessarily moments of inertia.  The terminology gets
confusing.

After a brief discussion with Greg: Moments.py does the correct calculation,
which indirectly verifies MOE and the newer RDKit implementation.
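
To make the distinction concrete, here is a small Python sketch. The attached Moments.py is not shown in this thread, so this assumes that a "raw moment" such as M1z means the mass-weighted second moment along one axis, sum(m * z^2), for a molecule already in its inertial frame; that reading is an assumption, as are the function names below. The moment of inertia about the same axis is sum(m * (x^2 + y^2)), so a planar molecule has one raw moment equal to zero while all three moments of inertia stay non-zero.

```python
def center_of_mass(masses, coords):
    total = sum(masses)
    return tuple(sum(m * c[k] for m, c in zip(masses, coords)) / total
                 for k in range(3))


def raw_second_moments(masses, coords):
    """(sum m*x^2, sum m*y^2, sum m*z^2) about the center of mass."""
    com = center_of_mass(masses, coords)
    return tuple(sum(m * (c[k] - com[k]) ** 2 for m, c in zip(masses, coords))
                 for k in range(3))


def axis_moments_of_inertia(masses, coords):
    """Moments of inertia about the x, y and z axes through the center of mass."""
    mx, my, mz = raw_second_moments(masses, coords)
    # The moment about one axis collects the squared extents along the other two.
    return (my + mz, mx + mz, mx + my)


# Artificial planar system: four unit masses lying in the xy plane.
masses = [1.0, 1.0, 1.0, 1.0]
coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]

raw = raw_second_moments(masses, coords)            # (2.0, 8.0, 0.0)
inertia = axis_moments_of_inertia(masses, coords)   # (8.0, 2.0, 10.0)
print(raw, inertia)
```

The zero appears only in the raw moments; the largest moment of inertia is the one about the axis perpendicular to the plane.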

Cheers,
 Brian

On Tue, Jan 17, 2017 at 7:39 AM, Chris Earnshaw wrote:

> The dimensions along one of the axes of a planar molecule in its inertial
> frame will be zero, but the principal moments of inertia will all be
> non-zero. The moment of inertia about an axis can only be zero if all the
> atoms in the molecule are precisely aligned on that axis. That's only
> possible for linear molecules. There's no way to draw a straight line axis
> through all the atoms in a non-linear molecule, which would be a
> requirement for the corresponding moment of inertia to be zero.
>
> Chris


Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Chris Earnshaw
The dimensions along one of the axes of a planar molecule in its inertial
frame will be zero, but the principal moments of inertia will all be
non-zero. The moment of inertia about an axis can only be zero if all the
atoms in the molecule are precisely aligned on that axis. That's only
possible for linear molecules. There's no way to draw a straight line axis
through all the atoms in a non-linear molecule, which would be a
requirement for the corresponding moment of inertia to be zero.
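
The argument above can be checked numerically with a short Python sketch. It is only an illustration: the helper name `moment_about_axis` is hypothetical, and the HCN and water coordinates are rounded textbook-style geometries assumed for this example, not data from the thread. The moment about an axis is the sum of m times the squared perpendicular distance to that axis, so it vanishes only when every atom sits on the axis.

```python
import math


def moment_about_axis(masses, coords, axis):
    """Moment of inertia about the line through the center of mass along `axis`."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    length = math.sqrt(sum(a * a for a in axis))
    u = [a / length for a in axis]
    moment = 0.0
    for m, c in zip(masses, coords):
        r = [c[k] - com[k] for k in range(3)]
        along = sum(r[k] * u[k] for k in range(3))
        # Squared perpendicular distance, clamped against rounding noise.
        perp_sq = sum(x * x for x in r) - along * along
        moment += m * max(perp_sq, 0.0)
    return moment


# Linear molecule (approximate HCN) laid along z.
hcn_m = [1.008, 12.011, 14.007]
hcn_xyz = [(0.0, 0.0, -1.064), (0.0, 0.0, 0.0), (0.0, 0.0, 1.156)]

# Planar, non-linear molecule (approximate water) in the xy plane.
h2o_m = [15.999, 1.008, 1.008]
h2o_xyz = [(0.0, 0.0, 0.0), (0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]

hcn_along = moment_about_axis(hcn_m, hcn_xyz, (0, 0, 1))
hcn_across = moment_about_axis(hcn_m, hcn_xyz, (1, 0, 0))
ix, iy, iz = (moment_about_axis(h2o_m, h2o_xyz, ax)
              for ax in ((1, 0, 0), (0, 1, 0), (0, 0, 1)))
print(hcn_along, hcn_across, ix, iy, iz)
```

For the planar case the out-of-plane moment equals the sum of the two in-plane ones (perpendicular-axis theorem), which is why it cannot be zero.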

Chris

On 17 January 2017 at 12:29, Brian Kelley wrote:

> Looks like I'm late to the game.  I don't know about the PMI descriptors
> per se, but if a planar molecule is in its inertial frame, its extent
> along one of the axes should be zero (whether it is x, y or z), which means
> that one of M1x, M1y or M1z should be zero.
>
> We had some good experimentation with multipole expansion of moments
> (essentially based on the description of electrostatic multipoles) that
> might be nice to add to the PMI framework.
>
> Greg, I'm assuming that the Moments.py we open-sourced a while back is
> similarly broken?  I'm attaching it here for posterity, but it does appear
> to match the MOE PMIs.


Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Brian Kelley
Looks like I'm late to the game.  I don't know about the PMI descriptors
per se, but if a planar molecule is in its inertial frame, its extent
along one of the axes should be zero (whether it is x, y or z), which means
that one of M1x, M1y or M1z should be zero.

We had some good experimentation with multipole expansion of moments
(essentially based on the description of electrostatic multipoles) that
might be nice to add to the PMI framework.

Greg, I'm assuming that the Moments.py we open-sourced a while back is
similarly broken?  I'm attaching it here for posterity, but it does appear
to match the MOE PMIs.



On Tue, Jan 17, 2017 at 4:55 AM, Chris Earnshaw wrote:

> The new version looks good to me as far as I can test it. PMI and NPR are
> still fine, the radius of gyration is right (for an extremely artificial
> test system) and the asphericity index also seems right (despite my best
> efforts to confuse things further - sorry about that!). It also highlights
> even more confusion in the Todeschini article - the approximate asphericity
> values for prolate and oblate molecules are reversed.
>
> The only (very trivial) thing I've spotted is the comment in the
> inertialShapeFactor function. 'planar or no coordinates' should be 'linear
> or no coordinates' to avoid confusion.
>
> Chris


Moments.py
Description: Binary data


Re: [Rdkit-discuss] PMI API

2017-01-17 Thread Chris Earnshaw
The new version looks good to me as far as I can test it. PMI and NPR are
still fine, the radius of gyration is right (for an extremely artificial
test system) and the asphericity index also seems right (despite my best
efforts to confuse things further - sorry about that!). It also highlights
even more confusion in the Todeschini article - the approximate asphericity
values for prolate and oblate molecules are reversed.
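
For anyone wanting to repeat this kind of sanity check by hand, here is a minimal Python sketch of two of the quantities mentioned, computed from already-known principal moments. It is not the RDKit implementation. It assumes the usual definition of the normalized PMI ratios (NPR1 = I1/I3, NPR2 = I2/I3 with I1 <= I2 <= I3) and the physical, mass-weighted radius of gyration, using the identity I1 + I2 + I3 = 2 * sum(m * r^2). The function names are hypothetical, and the formula in the Todeschini text or in RDKit may differ.

```python
import math


def normalized_pmi_ratios(i1, i2, i3):
    """Return (NPR1, NPR2) = (I1/I3, I2/I3) after sorting so that I1 <= I2 <= I3."""
    i1, i2, i3 = sorted((i1, i2, i3))
    if i3 <= 0.0:
        raise ValueError("largest PMI is zero: single atom or no coordinates")
    return i1 / i3, i2 / i3


def radius_of_gyration(i1, i2, i3, total_mass):
    """Mass-weighted radius of gyration from the three PMIs and the total mass."""
    return math.sqrt((i1 + i2 + i3) / (2.0 * total_mass))


# Artificial rod: two unit masses at z = +1 and z = -1, so PMIs are (0, 2, 2).
rod_npr = normalized_pmi_ratios(0.0, 2.0, 2.0)      # (0.0, 1.0)
rod_rg = radius_of_gyration(0.0, 2.0, 2.0, 2.0)     # 1.0

# Symmetric planar case: in-plane PMIs equal, out-of-plane PMI is their sum.
disc_npr = normalized_pmi_ratios(1.0, 1.0, 2.0)     # (0.5, 0.5)
print(rod_npr, rod_rg, disc_npr)
```

The rod and disc land on two corners of the familiar NPR triangle, (0, 1) and (0.5, 0.5); a sphere would sit at (1, 1).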

The only (very trivial) thing I've spotted is the comment in the
inertialShapeFactor function. 'planar or no coordinates' should be 'linear
or no coordinates' to avoid confusion.

Chris

On 16 January 2017 at 09:30, Greg Landrum wrote:

> On Mon, Jan 16, 2017 at 10:22 AM, Chris Earnshaw wrote:
>
>>
>> Either way, it makes it rather hard to trust their derivations generally
>> - especially as there appear to be other errors (e.g. the denominator in
>> eq. 16 should be the square root of the given sum of squares, according to
>> their reference).
>>
>
> Indeed. Given the problems encountered, I went back and checked some
> additional references to find definitions of the descriptors. The results
> are in this PR, which I'd love feedback on if you have time to take a look:
> https://github.com/rdkit/rdkit/pull/1265
>
> I didn't manage to find any information about "inertial shape factor" and
> don't have access to the references cited in the Todeschini paper, but I
> think the others are now reasonably reliable.
>
> -greg
>
>
>


Re: [Rdkit-discuss] PMI API

2017-01-16 Thread Chris Earnshaw
Dear Guillaume

Thanks - looks like we agree about reality (good!) and that Todeschini et
al. are wrong in their discussion about planar molecules. Whether this is a
simple mistaken assertion, or if they've mixed up another quantity (e.g.
the eigenvalues of the covariance matrix) with the PMIs is impossible to
say.
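
For reference, one way the two quantities could be confused is sketched below. The notation is introduced here and is not from the article: S is the mass-weighted second-moment (covariance-like) matrix about the centre of mass, lambda_k are its eigenvalues, and E is the identity matrix. The inertia tensor shares the eigenvectors of S, but its eigenvalues are complementary sums.

```latex
\[
\begin{aligned}
S &= \sum_i m_i\, \mathbf{r}_i \mathbf{r}_i^{\mathsf{T}}, \qquad
S\,\mathbf{v}_k = \lambda_k \mathbf{v}_k, \\
\mathbf{I} &= \sum_i m_i \left( \lVert \mathbf{r}_i \rVert^2 \mathbf{E}
              - \mathbf{r}_i \mathbf{r}_i^{\mathsf{T}} \right)
            = \operatorname{tr}(S)\,\mathbf{E} - S, \\
\mathbf{I}\,\mathbf{v}_k &= \left( \lambda_1 + \lambda_2 + \lambda_3 - \lambda_k \right) \mathbf{v}_k .
\end{aligned}
\]
```

For a planar molecule one eigenvalue of S is zero, say lambda_3 = 0, so the PMIs are lambda_2, lambda_1 and lambda_1 + lambda_2: all non-zero. Only for a linear molecule, where two eigenvalues of S vanish, does one PMI become zero.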

Either way, it makes it rather hard to trust their derivations generally -
especially as there appear to be other errors (e.g. the denominator in eq.
16 should be the square root of the given sum of squares, according to
their reference).

Best regards,
Chris

Dr Chris Earnshaw
CGE Computational Chemistry
Phone: +44(0) 1223 426000
Mobile: 07944 707773
E-mail: ch...@cge-compchem.co.uk

On 16 January 2017 at 08:54, Guillaume GODIN <guillaume.go...@firmenich.com>
wrote:

> Dear Chris,
>
> No problem, let me explain.
>
> I agree that for monatomics the center of mass is the atom, so Ix = 0 for
> every axis x.
>
> Now I am considering only the mathematics, not the physics.
>
> I suggest that they (Todeschini) are not really computing the "real
> physical" PMI about the 3 axes, but arbitrarily state that for 2D
> molecules the PMI about the 3rd axis is zero.
>
> BR
>
>> --
>> *From:* Greg Landrum <greg.land...@gmail.com>
>> *Sent:* Sunday, 15 January 2017 17:42
>> *To:* Guillaume GODIN; RDKit Discuss
>> *Subject:* Re: [Rdkit-discuss] PMI API
>>
>> Thanks Guillaume!
>>
>> On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <
>> guillaume.go...@firmenich.com> wrote:
>>
>>> Here are the Dragon results for the 3 molecules. I've included both WHIM
>>> and 3D descriptors, but I don't have access to the PMIs!
>>>
>>> I found the second document to be in agreement with Peter's answer...
>>>
>>> BR,

Re: [Rdkit-discuss] PMI API

2017-01-16 Thread Greg Landrum
On Mon, Jan 16, 2017 at 9:36 AM, Chris Earnshaw 
wrote:

>
> Apologies - I appear to have opened a can of worms here...
>

No need whatsoever to apologize. You identified and pointed out a bug in
the implementation of the new 3D descriptors, which is something very much
appreciated.
The fact that I picked a seemingly unreliable source for the definitions of
those descriptors, and that it's turning out to be more difficult than I
might like to find reliable definitions for some of them, is just the way
things are.

I'll have an updated version checked in (hopefully) in the next couple of
hours. It would be great if you could take a look at it and let me know if
it looks right.

-greg


Re: [Rdkit-discuss] PMI API

2017-01-16 Thread Guillaume GODIN
Dear Chris,


No problem, let me explain.

I agree that for monatomics the center of mass is the atom, so Ix = 0 for 
every axis x.

Now I am considering only the mathematics, not the physics.

I suggest that they (Todeschini) are not really computing the "real physical" 
PMI about the 3 axes, but arbitrarily state that for 2D molecules the PMI about 
the 3rd axis is zero.

BR




From: Chris Earnshaw <cgearns...@gmail.com>
Sent: Monday, 16 January 2017 09:36
To: Guillaume GODIN
Cc: Greg Landrum; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API



On 16 January 2017 at 06:25, Guillaume GODIN
<guillaume.go...@firmenich.com> wrote:

Reading the Todeschini article carefully, they say that Ic, Ib, Ia are determined
as the max & min values of I over all 3D axes passing through the center of mass!

I don't quite understand this comment. The inequality Ia <= Ib <= Ic is one of 
the errors in the Todeschini article pointed out by Greg yesterday. By 
definition, the Principal Moment of Inertia axes pass through the centre of 
mass.


The "global PM" (sum of mi*ri*ri) is never zero; the same holds for the PMIs, even
for a planar molecule.

The global Moment of Inertia is only zero for monatomics.


But when you have a planar molecule, the matrix is no longer 3D but 2D, so it's
normal to consider that the 3rd PM is zero.

I really don't understand this - it's simply wrong. The molecule may be 2D but 
the three principal moments of inertia are most definitely non-zero for a 
planar structure. For a fully symmetrical molecule like benzene the largest PMI 
is around the axis perpendicular to the plane of the molecule and there are two 
equivalent, smaller, PMIs perpendicular to each other in the plane of the 
molecule. For a less symmetrical molecule like naphthalene, the largest PMI is 
again around the axis perpendicular to the plane, the intermediate PMI is along 
the fusion bond between the rings and the smallest PMI is around the long axis 
of the molecule. There's no way it can be correct to consider the 3rd PMI as 
zero in any planar molecule - it's never equal to zero and is only degenerate 
with the 2nd PMI for fully symmetric molecules. Only in the special case of a 
completely linear molecule (e.g. acetylene, HCN) is the 3rd PMI (along the axis 
of the molecule) equal to zero.

Apologies - I appear to have opened a can of worms here...

Chris


From: Greg Landrum <greg.land...@gmail.com>
Sent: Sunday, January 15, 2017 17:42
To: Guillaume GODIN; RDKit Discuss

Subject: Re: [Rdkit-discuss] PMI API

Thanks Guillaume!

On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN
<guillaume.go...@firmenich.com> wrote:

Here are the Dragon results for the 3 molecules: I've included both WHIM and 3D
descriptors, but I don't have access to the PMIs!


I found that the second document agrees with Peter's answer...


BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Peter Gedeck <peter.ged...@gmail.com>
Sent: Sunday, January 15, 2017 15:07
To: Greg Landrum; RDKit Discuss; Guillaume GODIN

Subject: Re: [Rdkit-discuss] PMI API

According to this:
https://en.wikipedia.org/wiki/List_of_moments_of_inertia
The moments of inertia of a disk (something like benzene) are:

Iz = mr^2/2
Ix = Iy = mr^2/4

None of them is zero. The smallest moment of inertia of a rod-like molecule 
(e.g. C#C) is zero.

Best,

Peter



On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum
<greg.land...@gmail.com> wrote:
Hi Guillaume,

I think in this case it's something else. According to the Todeschini article
the smallest moment of inertia of a planar molecule like benzene should be 
zero. The eigenvalues of the inertia matrix for benzene, however, are 
definitely not zero (and not close enough that it's likely to be round-off 
error).
It would be very nice if you could run the three files I mention through Dragon 
and let me know what it calculates for those descriptors.

-greg


_
From: Guillaume GODIN <guillaume.go...@firmenich.com>
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum <greg.land...@gmail.c

Re: [Rdkit-discuss] PMI API

2017-01-16 Thread Chris Earnshaw
On 16 January 2017 at 06:25, Guillaume GODIN <guillaume.go...@firmenich.com>
wrote:

> Reading the Todeschini article carefully, they say that Ic, Ib, Ia are
> determined as the max & min values of I over all 3D axes passing through the
> center of mass!
>
I don't quite understand this comment. The inequality Ia <= Ib <= Ic is one
of the errors in the Todeschini article pointed out by Greg yesterday. By
definition, the Principal Moment of Inertia axes pass through the centre of
mass.

> The "global PM" (sum of mi*ri*ri) is never zero; the same holds for the PMIs,
> even for a planar molecule.
>
The global Moment of Inertia is only zero for monatomics.


> But when you have a planar molecule, the matrix is no longer 3D but 2D, so
> it's normal to consider that the 3rd PM is zero.
>
I really don't understand this - it's simply wrong. The molecule may be 2D
but the three principal moments of inertia are most definitely non-zero for
a planar structure. For a fully symmetrical molecule like benzene the
largest PMI is around the axis perpendicular to the plane of the molecule
and there are two equivalent, smaller, PMIs perpendicular to each other in
the plane of the molecule. For a less symmetrical molecule like
naphthalene, the largest PMI is again around the axis perpendicular to the
plane, the intermediate PMI is along the fusion bond between the rings and
the smallest PMI is around the long axis of the molecule. There's no way it
can be correct to consider the 3rd PMI as zero in any planar molecule -
it's never equal to zero and is only degenerate with the 2nd PMI for fully
symmetric molecules. Only in the special case of a completely linear
molecule (e.g. acetylene, HCN) is the 3rd PMI (along the axis of the
molecule) equal to zero.

Apologies - I appear to have opened a can of worms here...

Chris

> --
> *From:* Greg Landrum <greg.land...@gmail.com>
> *Sent:* Sunday, January 15, 2017 17:42
> *To:* Guillaume GODIN; RDKit Discuss
>
> *Subject:* Re: [Rdkit-discuss] PMI API
>
> Thanks Guillaume!
>
> On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <
> guillaume.go...@firmenich.com> wrote:
>
>> Here are the Dragon results for the 3 molecules: I've included both WHIM and
>> 3D descriptors, but I don't have access to the PMIs!
>>
>>
>> I found that the second document agrees with Peter's answer...
>>
>>
>> BR,
>>
>> *Dr. Guillaume GODIN*
>> Principal Scientist
>> Chemoinformatic & Datamining
>> Innovation
>> CORPORATE R DIVISION
>> DIRECT LINE +41 (0)22 780 3645
>> MOBILE  +41 (0)79 536 1039
>> Firmenich SA
>> RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8
>>
>> --
>> *From:* Peter Gedeck <peter.ged...@gmail.com>
>> *Sent:* Sunday, January 15, 2017 15:07
>> *To:* Greg Landrum; RDKit Discuss; Guillaume GODIN
>>
>> *Subject:* Re: [Rdkit-discuss] PMI API
>>
>> According to this:
>> https://en.wikipedia.org/wiki/List_of_moments_of_inertia
>> The moments of inertia of a disk (something like benzene) are:
>>
>> Iz = mr^2/2
>> Ix = Iy = mr^2/4
>>
>> None of them is zero. The smallest moment of inertia of a rod-like
>> molecule (e.g. C#C) is zero.
>>
>> Best,
>>
>> Peter
>>
>>
>>
>> On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum <greg.land...@gmail.com>
>> wrote:
>>
>>> Hi Guillaume,
>>>
>>> I think in this case it's something else. According to the Todeschini
>>> article the smallest moment of inertia of a planar molecule like benzene
>>> should be zero. The eigenvalues of the inertia matrix for benzene, however,
>>> are definitely not zero (and not close enough that it's likely to be
>>> round-off error).
>>> It would be very nice if you could run the three files I mention through
>>> Dragon and let me know what it calculates for those descriptors.
>>>
>>> -greg
>>>
>>>
>>> _
>>> From: Guillaume GODIN <guillaume.go...@firmenich.com>
>>> Sent: Sunday, January 15, 2017 1:11 PM
>>> Subject: RE: [Rdkit-discuss] PMI API
>>> To: Greg Landrum <greg.land...@gmail.com>, RDKit Discuss <
>>> rdkit-discuss@lists.sourceforge.net>, Chris Earnshaw <
>>> cgearns...@gmail.com>
>>>
>>>
>>>
>>> Dear Greg,
>>>
>>>
>>> I suspect that it's a precision error or a difference in the eigen algorithm
>>> between RDKit C++ and Dragon.
>>>
>&

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Guillaume GODIN
No problem Greg,


Reading the Todeschini article carefully, they say that Ic, Ib, Ia are determined
as the max & min values of I over all 3D axes passing through the center of mass!


The "global PM" (sum of mi*ri*ri) is never zero; the same holds for the PMIs, even
for a planar molecule.


But when you have a planar molecule, the matrix is no longer 3D but 2D, so it's
normal to consider that the 3rd PM is zero.


BR,


Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Greg Landrum <greg.land...@gmail.com>
Sent: Sunday, January 15, 2017 17:42
To: Guillaume GODIN; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

Thanks Guillaume!

On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN
<guillaume.go...@firmenich.com> wrote:

Here are the Dragon results for the 3 molecules: I've included both WHIM and 3D
descriptors, but I don't have access to the PMIs!


I found that the second document agrees with Peter's answer...


BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Peter Gedeck <peter.ged...@gmail.com>
Sent: Sunday, January 15, 2017 15:07
To: Greg Landrum; RDKit Discuss; Guillaume GODIN

Subject: Re: [Rdkit-discuss] PMI API

According to this:
https://en.wikipedia.org/wiki/List_of_moments_of_inertia
The moments of inertia of a disk (something like benzene) are:

Iz = mr^2/2
Ix = Iy = mr^2/4

None of them is zero. The smallest moment of inertia of a rod-like molecule 
(e.g. C#C) is zero.

Best,

Peter



On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum
<greg.land...@gmail.com> wrote:
Hi Guillaume,

I think in this case it's something else. According to the Todeschini article
the smallest moment of inertia of a planar molecule like benzene should be 
zero. The eigenvalues of the inertia matrix for benzene, however, are 
definitely not zero (and not close enough that it's likely to be round-off 
error).
It would be very nice if you could run the three files I mention through Dragon 
and let me know what it calculates for those descriptors.

-greg


_
From: Guillaume GODIN <guillaume.go...@firmenich.com>
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum <greg.land...@gmail.com>, RDKit Discuss
<rdkit-discuss@lists.sourceforge.net>, Chris Earnshaw <cgearns...@gmail.com>




Dear Greg,


I suspect that it's a precision error or a difference in the eigen algorithm
between RDKit C++ and Dragon.


To obtain good values, I suggest implementing a tolerance test on the
eigenvalues, as I did in the gateway.cpp implementation.


#include <Eigen/Dense>
using namespace Eigen;

// Thin SVD of A.
JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
  JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
  return mysvd;
}

// get the A^-1 matrix (pseudo-inverse) using the SVD
MatrixXd GetPinv(const MatrixXd &A) {
  JacobiSVD<MatrixXd> svd = getSVD(A);
  double pinvtoler = 1.e-2;  // choose your tolerance wisely!
  VectorXd vs = svd.singularValues();
  VectorXd vsinv = svd.singularValues();

  // Invert singular values above the tolerance; zero out the rest.
  for (Index i = 0; i < vs.size(); ++i) {
    if (vs(i) > pinvtoler)
      vsinv(i) = 1.0 / vs(i);
    else
      vsinv(i) = 0.0;
  }

  MatrixXd S = vsinv.asDiagonal();
  MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
  return Ap;
}


If that doesn't solve the problem, I would like to test it in Matlab. Can you
provide me with the three 3D xyz matrices for your examples, please?


I also have Dragon 6


best regards,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Greg Landrum <greg.land...@gmail.com>
Sent: Sunday, January 15, 2017 11:50
To: Chris Earnshaw; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

I managed to make some time to look into this this weekend and I've found a bug 
and something I don't understand. Hopefully the community can help out here.
On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
<cgearns...@gmail.com<mailto:cgearns...@gma

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Greg Landrum
Thanks Guillaume!

On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <
guillaume.go...@firmenich.com> wrote:

> Here are the Dragon results for the 3 molecules: I've included both WHIM and
> 3D descriptors, but I don't have access to the PMIs!
>
>
> I found that the second document agrees with Peter's answer...
>
>
> BR,
>
> *Dr. Guillaume GODIN*
> Principal Scientist
> Chemoinformatic & Datamining
> Innovation
> CORPORATE R DIVISION
> DIRECT LINE +41 (0)22 780 3645
> MOBILE  +41 (0)79 536 1039
> Firmenich SA
> RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8
>
> --
> *From:* Peter Gedeck <peter.ged...@gmail.com>
> *Sent:* Sunday, January 15, 2017 15:07
> *To:* Greg Landrum; RDKit Discuss; Guillaume GODIN
>
> *Subject:* Re: [Rdkit-discuss] PMI API
>
> According to this:
> https://en.wikipedia.org/wiki/List_of_moments_of_inertia
> The moments of inertia of a disk (something like benzene) are:
>
> Iz = mr^2/2
> Ix = Iy = mr^2/4
>
> None of them is zero. The smallest moment of inertia of a rod-like
> molecule (e.g. C#C) is zero.
>
> Best,
>
> Peter
>
>
>
> On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>> Hi Guillaume,
>>
>> I think in this case it's something else. According to the Todeschini
>> article the smallest moment of inertia of a planar molecule like benzene
>> should be zero. The eigenvalues of the inertia matrix for benzene, however,
>> are definitely not zero (and not close enough that it's likely to be
>> round-off error).
>> It would be very nice if you could run the three files I mention through
>> Dragon and let me know what it calculates for those descriptors.
>>
>> -greg
>>
>>
>> _
>> From: Guillaume GODIN <guillaume.go...@firmenich.com>
>> Sent: Sunday, January 15, 2017 1:11 PM
>> Subject: RE: [Rdkit-discuss] PMI API
>> To: Greg Landrum <greg.land...@gmail.com>, RDKit Discuss <
>> rdkit-discuss@lists.sourceforge.net>, Chris Earnshaw <
>> cgearns...@gmail.com>
>>
>>
>>
>> Dear Greg,
>>
>>
>> I suspect that it's a precision error or a difference in the eigen algorithm
>> between RDKit C++ and Dragon.
>>
>>
>> To obtain good values, I suggest implementing a tolerance test on the
>> eigenvalues, as I did in the gateway.cpp implementation.
>>
>>
>> #include <Eigen/Dense>
>> using namespace Eigen;
>>
>> // Thin SVD of A.
>> JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
>>   JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
>>   return mysvd;
>> }
>>
>> // get the A^-1 matrix (pseudo-inverse) using the SVD
>> MatrixXd GetPinv(const MatrixXd &A) {
>>   JacobiSVD<MatrixXd> svd = getSVD(A);
>>   double pinvtoler = 1.e-2;  // choose your tolerance wisely!
>>   VectorXd vs = svd.singularValues();
>>   VectorXd vsinv = svd.singularValues();
>>
>>   // Invert singular values above the tolerance; zero out the rest.
>>   for (Index i = 0; i < vs.size(); ++i) {
>>     if (vs(i) > pinvtoler)
>>       vsinv(i) = 1.0 / vs(i);
>>     else
>>       vsinv(i) = 0.0;
>>   }
>>
>>   MatrixXd S = vsinv.asDiagonal();
>>   MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
>>   return Ap;
>> }
>>
>>
>> If that doesn't solve the problem, I would like to test it in Matlab. Can you
>> provide me with the three 3D xyz matrices for your examples, please?
>>
>>
>> I also have Dragon 6
>>
>>
>> best regards,
>>
>> *Dr. Guillaume GODIN*
>> Principal Scientist
>> Chemoinformatic & Datamining
>> Innovation
>> CORPORATE R DIVISION
>> DIRECT LINE +41 (0)22 780 3645
>> MOBILE  +41 (0)79 536 1039
>> Firmenich SA
>> RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8
>>
>> --
>> *From:* Greg Landrum <greg.land...@gmail.com>
>> *Sent:* Sunday, January 15, 2017 11:50
>> *To:* Chris Earnshaw; RDKit Discuss
>> *Subject:* Re: [Rdkit-discuss] PMI API
>>
>> I managed to make some time to look into this this weekend and I've found
>> a bug and something I don't understand. Hopefully the community can help
>> out here.
>> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearns...@gmail.com>
>> wrote:
>>
>> 4) The big one! The returned results look very odd. They appear 

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Greg Landrum
On Sun, Jan 15, 2017 at 5:15 PM, Chris Earnshaw 
wrote:

>
> I've built a version of RDKit with fixes from
> https://github.com/greglandrum/rdkit/tree/fix/github1262 and can confirm that it gives
> exactly the same values of PMI and NPR that I got with the RDKit fork by
> 'hahnda6'. I can't say for certain that the PMI values are correct in
> absolute terms, but the NPR values are certainly what would be expected for
> those test molecules.
>

Glad to hear it.



> I'm worried about the Todeschini paper - I think there are errors in some
> of the equations and inconsistencies in the discussion, some of which may
> involve mixing up PMIs with eigenvalues of the covariance matrix.
> Unfortunately I don't have access to the original references so can't check
> in detail, but I'd be disinclined to take any of the equations at face
> value.
>

Ok. I'm going to have to see if I can track down some additional references
and work from there.


-greg


Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Chris Earnshaw
Thanks Greg

I've built a version of RDKit with fixes from
https://github.com/greglandrum/rdkit/tree/fix/github1262 and can confirm that it gives exactly
the same values of PMI and NPR that I got with the RDKit fork by 'hahnda6'.
I can't say for certain that the PMI values are correct in absolute terms,
but the NPR values are certainly what would be expected for those test
molecules.

I'm worried about the Todeschini paper - I think there are errors in some
of the equations and inconsistencies in the discussion, some of which may
involve mixing up PMIs with eigenvalues of the covariance matrix.
Unfortunately I don't have access to the original references so can't check
in detail, but I'd be disinclined to take any of the equations at face
value.
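
To make the possible mix-up concrete, here is a small sketch (my own illustration, not code from RDKit or from the Todeschini chapter) of the two matrices for a set of point masses whose coordinates are already relative to the centre of mass. It assumes an unnormalized, mass-weighted second-moment matrix S = sum m r r^T as the "covariance" matrix; a real implementation may normalize differently. The inertia tensor is then I = trace(S) * identity - S, so the two share eigenvectors but not eigenvalues. The type and function names are hypothetical.

```cpp
#include <array>
#include <vector>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Point mass with coordinates relative to the centre of mass (hypothetical type).
struct Atom {
  double m;
  std::array<double, 3> r;
};

// Mass-weighted second-moment ("covariance"-like) matrix: S = sum_i m_i r_i r_i^T.
Mat3 secondMomentMatrix(const std::vector<Atom>& atoms) {
  Mat3 S{};  // zero-initialized
  for (const auto& a : atoms) {
    for (int i = 0; i < 3; ++i) {
      for (int j = 0; j < 3; ++j) {
        S[i][j] += a.m * a.r[i] * a.r[j];
      }
    }
  }
  return S;
}

// Inertia tensor: I = trace(S) * identity - S.
Mat3 inertiaTensor(const std::vector<Atom>& atoms) {
  const Mat3 S = secondMomentMatrix(atoms);
  const double tr = S[0][0] + S[1][1] + S[2][2];
  Mat3 I{};
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j) {
      I[i][j] = (i == j ? tr : 0.0) - S[i][j];
    }
  }
  return I;
}
```

For four unit masses at (+-1, 0, 0) and (0, +-1, 0), S is diag(2, 2, 0) while the inertia tensor is diag(2, 2, 4). The zero for a planar arrangement shows up in the second-moment matrix, not in the inertia tensor, which would be consistent with a text that says "Ic = 0 for planar molecules" while actually working with covariance eigenvalues.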

Chris

On 15 January 2017 at 10:50, Greg Landrum  wrote:

> I managed to make some time to look into this this weekend and I've found
> a bug and something I don't understand. Hopefully the community can help
> out here.
>
> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
> wrote:
>
>> 4) The big one! The returned results look very odd. They appear to relate
>> more to the dimensions of the molecule than the moments of inertia. For a
>> rod-like molecule (dimethylacetylene) I'd expect two large and one small
>> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
>> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
>> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
>> For disk-like (benzene) the result should be one large and two medium
>> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
>> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
>> 2.14213e-11  NPR2: 0.33.
>> Finally for a roughly spherical molecule (neopentane) the NPR values look
>> reasonable (no great surprise) but the absolute PMI values may be too
>> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
>> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
>> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>>
>
> Your expectations are correct: the current RDKit implementation is wrong.
> The corresponding github entry is here:
> https://github.com/rdkit/rdkit/issues/1262
> This is due to a mistake in the way the principal moments are calculated
> (which is due to the fact that I don't spend a lot of time working
> with/thinking about 3D descriptors). Instead of using the
> eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the
> RDKit is currently using the covariance matrix. There's some more on the
> relationship between these two here:
> http://number-none.com/blow/inertia/deriving_i.html
>
> The problem is easy to fix (and I have something working here:
> https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws
> up the values of the descriptors that are derived from here:
> Todeschini and Consonni "Descriptors from Molecular Geometry" Handbook of
> Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
> These include the radius of gyration, inertial shape factor, etc.
> Within that article they state that Ic = 0 for planar molecules. Ignoring
> the inequality on page 1010, which says that Ic is the largest moment and
> is contradicted by the rest of the text (particularly the inequalities on
> page 1011), Ic corresponds to the smallest principal moment : PMI1.
>
> So now I'm confused, but I'm hoping this is obvious to someone versed in
> the field: I'd like to reproduce the descriptors described in the
> Todeschini article, but I clearly can't do that using the actual moments of
> inertia. I could keep using the eigenvalues of the covariance matrix there,
> but that doesn't match what's described in the text.
>
> Two things that would be extremely helpful:
> 1) an explanation of the disconnect here from someone who knows this
> stuff, I would guess that it's pretty simple
> 2) The results of running the files github1262_1.mol, github1262_2.mol,
> and github1262_3.mol from here:
> https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data
> through Dragon and calculating the radius of
> gyration, inertial shape factor, eccentricity, molecular asphericity, and
> spherocity index.
>
> Best,
> -greg
>
>
>
>>
>


Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Guillaume GODIN
Here are the Dragon results for the 3 molecules: I've included both WHIM and 3D
descriptors, but I don't have access to the PMIs!


I found that the second document agrees with Peter's answer...


BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Peter Gedeck <peter.ged...@gmail.com>
Sent: Sunday, January 15, 2017 15:07
To: Greg Landrum; RDKit Discuss; Guillaume GODIN
Subject: Re: [Rdkit-discuss] PMI API

According to this:
https://en.wikipedia.org/wiki/List_of_moments_of_inertia
The moments of inertia of a disk (something like benzene) are:

Iz = mr^2/2
Ix = Iy = mr^2/4

None of them is zero. The smallest moment of inertia of a rod-like molecule 
(e.g. C#C) is zero.

Best,

Peter



On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum
<greg.land...@gmail.com> wrote:
Hi Guillaume,

I think in this case it's something else. According to the Todeschini article
the smallest moment of inertia of a planar molecule like benzene should be 
zero. The eigenvalues of the inertia matrix for benzene, however, are 
definitely not zero (and not close enough that it's likely to be round-off 
error).
It would be very nice if you could run the three files I mention through Dragon 
and let me know what it calculates for those descriptors.

-greg


_
From: Guillaume GODIN <guillaume.go...@firmenich.com>
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum <greg.land...@gmail.com>, RDKit Discuss
<rdkit-discuss@lists.sourceforge.net>, Chris Earnshaw <cgearns...@gmail.com>




Dear Greg,


I suspect that it's a precision error or a difference in the eigen algorithm
between RDKit C++ and Dragon.


To obtain good values, I suggest implementing a tolerance test on the
eigenvalues, as I did in the gateway.cpp implementation.


#include <Eigen/Dense>
using namespace Eigen;

// Thin SVD of A.
JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
  JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
  return mysvd;
}

// get the A^-1 matrix (pseudo-inverse) using the SVD
MatrixXd GetPinv(const MatrixXd &A) {
  JacobiSVD<MatrixXd> svd = getSVD(A);
  double pinvtoler = 1.e-2;  // choose your tolerance wisely!
  VectorXd vs = svd.singularValues();
  VectorXd vsinv = svd.singularValues();

  // Invert singular values above the tolerance; zero out the rest.
  for (Index i = 0; i < vs.size(); ++i) {
    if (vs(i) > pinvtoler)
      vsinv(i) = 1.0 / vs(i);
    else
      vsinv(i) = 0.0;
  }

  MatrixXd S = vsinv.asDiagonal();
  MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
  return Ap;
}


If that doesn't solve the problem, I would like to test it in Matlab. Can you
provide me with the three 3D xyz matrices for your examples, please?


I also have Dragon 6


best regards,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Greg Landrum <greg.land...@gmail.com>
Sent: Sunday, January 15, 2017 11:50
To: Chris Earnshaw; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

I managed to make some time to look into this this weekend and I've found a bug 
and something I don't understand. Hopefully the community can help out here.
On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw
<cgearns...@gmail.com> wrote:
4) The big one! The returned results look very odd. They appear to relate more 
to the dimensions of the molecule than the moments of inertia. For a rod-like 
molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 
6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828  NPR2: 0.98) but 
actually get PMI1: 0.061647  PMI2: 0.061652  PMI3: 25.3699  NPR1: 0.002430  
NPR2: 0.002430.
For disk-like (benzene) the result should be one large and two medium (e.g. 
PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2: 0.500013) 
but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1: 2.14213e-11  
NPR2: 0.33.
Finally for a roughly spherical molecule (neopentane) the NPR values look 
reasonable (no great surprise) but the absolute PMI values may be too small: 
old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2: 6.59488  
PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35

Your expectations are correct: the current RDKit implementation is wrong. The 
corresponding github entry is here: ht

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Peter Gedeck
According to this:
https://en.wikipedia.org/wiki/List_of_moments_of_inertia
The moments of inertia of a disk (something like benzene) are:

Iz = mr^2/2
Ix = Iy = mr^2/4

None of them is zero. The smallest moment of inertia of a rod-like molecule
(e.g. C#C) is zero.
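
As a rough check of this with point masses instead of a continuous disk, the sketch below sums m*d^2 about each coordinate axis for a flat hexagonal ring of six equal masses and for a two-atom rod. It is only an illustration under simple assumptions (equal masses, idealized geometry, coordinates already centred on the centre of mass and aligned with the principal axes); the names `PointMass`, `axisMoments`, `hexagonRing` and `twoAtomRod` are made up here and are not RDKit functions.

```cpp
#include <array>
#include <cmath>
#include <vector>

struct PointMass {
  double m, x, y, z;
};

// Moment of inertia about the x, y and z axes:
// sum over atoms of mass times squared distance to that axis.
std::array<double, 3> axisMoments(const std::vector<PointMass>& pts) {
  std::array<double, 3> I{0.0, 0.0, 0.0};
  for (const auto& p : pts) {
    I[0] += p.m * (p.y * p.y + p.z * p.z);
    I[1] += p.m * (p.x * p.x + p.z * p.z);
    I[2] += p.m * (p.x * p.x + p.y * p.y);
  }
  return I;
}

// Six equal masses on a circle of radius r in the xy plane (benzene-like ring).
std::vector<PointMass> hexagonRing(double m, double r) {
  const double pi = std::acos(-1.0);
  std::vector<PointMass> pts;
  for (int k = 0; k < 6; ++k) {
    const double a = k * pi / 3.0;
    pts.push_back({m, r * std::cos(a), r * std::sin(a), 0.0});
  }
  return pts;
}

// Two equal masses a distance d apart on the x axis (rod-like, e.g. C#C without H).
std::vector<PointMass> twoAtomRod(double m, double d) {
  return {{m, -d / 2.0, 0.0, 0.0}, {m, d / 2.0, 0.0, 0.0}};
}
```

For the ring, `axisMoments(hexagonRing(m, r))` gives Ix = Iy = 3*m*r^2 and Iz = 6*m*r^2, so none of the three is zero and the in-plane moments are half the out-of-plane one, the same 1:1:2 pattern as the disk formulas. For the rod, the moment about the molecular axis is exactly zero and the other two are equal.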

Best,

Peter



On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Guillaume,
>
> I think in this case it's something else. According to the Todeschini
> article the smallest moment of inertia of a planar molecule like benzene
> should be zero. The eigenvalues of the inertia matrix for benzene, however,
> are definitely not zero (and not close enough that it's likely to be
> round-off error).
> It would be very nice if you could run the three files I mention through
> Dragon and let me know what it calculates for those descriptors.
>
> -greg
>
>
> _
> From: Guillaume GODIN <guillaume.go...@firmenich.com>
> Sent: Sunday, January 15, 2017 1:11 PM
> Subject: RE: [Rdkit-discuss] PMI API
> To: Greg Landrum <greg.land...@gmail.com>, RDKit Discuss <
> rdkit-discuss@lists.sourceforge.net>, Chris Earnshaw <cgearns...@gmail.com
> >
>
>
>
> Dear Greg,
>
>
> I suspect that it's a precision error or a difference in the eigen algorithm
> between RDKit C++ and Dragon.
>
>
> To obtain good values, I suggest implementing a tolerance test on the
> eigenvalues, as I did in the gateway.cpp implementation.
>
>
> #include <Eigen/Dense>
> using namespace Eigen;
>
> // Thin SVD of A.
> JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
>   JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
>   return mysvd;
> }
>
> // get the A^-1 matrix (pseudo-inverse) using the SVD
> MatrixXd GetPinv(const MatrixXd &A) {
>   JacobiSVD<MatrixXd> svd = getSVD(A);
>   double pinvtoler = 1.e-2;  // choose your tolerance wisely!
>   VectorXd vs = svd.singularValues();
>   VectorXd vsinv = svd.singularValues();
>
>   // Invert singular values above the tolerance; zero out the rest.
>   for (Index i = 0; i < vs.size(); ++i) {
>     if (vs(i) > pinvtoler)
>       vsinv(i) = 1.0 / vs(i);
>     else
>       vsinv(i) = 0.0;
>   }
>
>   MatrixXd S = vsinv.asDiagonal();
>   MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
>   return Ap;
> }
>
>
> If that doesn't solve the problem, I would like to test it in Matlab. Can you
> provide me with the three 3D xyz matrices for your examples, please?
>
>
> I also have Dragon 6
>
>
> best regards,
>
> *Dr. Guillaume GODIN*
> Principal Scientist
> Chemoinformatic & Datamining
> Innovation
> CORPORATE R DIVISION
> DIRECT LINE +41 (0)22 780 3645
> MOBILE      +41 (0)79 536 1039
> Firmenich SA
> RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8
>
> --
> *From:* Greg Landrum <greg.land...@gmail.com>
> *Sent:* Sunday, January 15, 2017 11:50
> *To:* Chris Earnshaw; RDKit Discuss
> *Subject:* Re: [Rdkit-discuss] PMI API
>
> I managed to make some time to look into this this weekend and I've found
> a bug and something I don't understand. Hopefully the community can help
> out here.
> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearns...@gmail.com>
> wrote:
>
> 4) The big one! The returned results look very odd. They appear to relate
> more to the dimensions of the molecule than the moments of inertia. For a
> rod-like molecule (dimethylacetylene) I'd expect two large and one small
> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium
> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
> 2.14213e-11  NPR2: 0.33.
> Finally for a roughly spherical molecule (neopentane) the NPR values look
> reasonable (no great surprise) but the absolute PMI values may be too
> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>
>
> Your expectations are correct: the current RDKit implementation is wrong.
> The corresponding github entry is here:
> https://github.com/rdkit/rdkit/issues/1262
> This is due to a mistake in the way the principal moments are calculated
> (which is due to the fact that I don't spend a lot of time working
> with/thinking about 3D descriptors). Instead of using the
> eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the
> RDKit is currently usi

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Greg Landrum
Hi Guillaume,
I think in this case it's something else. According to the Todeschini article 
the smallest moment of inertia of a planar molecule like benzene should be 
zero. The eigenvalues of the inertia matrix for benzene, however, are 
definitely not zero (and not close enough to zero for it to be round-off 
error). It would be very nice if you could run the three files I mention through 
Dragon and let me know what it calculates for those descriptors.
-greg


_
From: Guillaume GODIN <guillaume.go...@firmenich.com>
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum <greg.land...@gmail.com>, RDKit Discuss 
<rdkit-discuss@lists.sourceforge.net>, Chris Earnshaw <cgearns...@gmail.com>




Dear Greg,

I suspect that it's a precision error or a difference in the eigen algorithm 
between the RDKit C++ code and Dragon.

To obtain good values, I suggest implementing a test on the eigenvalues, 
like I did in the gateway.cpp implementation.

#include <Eigen/Dense>
using namespace Eigen;

// Thin SVD of A.
JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
    JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
    return mysvd;
}

// get the A-1 matrix (pseudo-inverse) using the SVD
MatrixXd GetPinv(const MatrixXd &A) {
    JacobiSVD<MatrixXd> svd = getSVD(A);
    double pinvtoler = 1.e-2;  // choose your tolerance wisely!
    VectorXd vs = svd.singularValues();
    VectorXd vsinv = svd.singularValues();

    // Invert only the singular values above the tolerance; zero the rest.
    for (int i = 0; i < vs.size(); ++i) {
        if (vs(i) > pinvtoler)
            vsinv(i) = 1.0 / vs(i);
        else
            vsinv(i) = 0.0;
    }

    MatrixXd S = vsinv.asDiagonal();
    MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
    return Ap;
}




If that doesn't solve the problem, I would like to test it in Matlab. Can you 
provide me with the three 3D xyz coordinate matrices for your examples, please?

I also have Dragon 6.

best regards,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8

From: Greg Landrum <greg.land...@gmail.com>
Sent: Sunday, 15 January 2017 11:50
To: Chris Earnshaw; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

I managed to make some time to look into this this weekend and I've found a bug 
and something I don't understand. Hopefully the community can help out here.

On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearns...@gmail.com> wrote:
4) The big one! The returned results look very odd. They appear to relate more 
to the dimensions of the molecule than the moments of inertia. For a rod-like 
molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 
6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828  NPR2: 0.98) but 
actually get PMI1: 0.061647  PMI2: 0.061652  PMI3: 25.3699  NPR1: 0.002430  
NPR2: 0.002430.
For disk-like (benzene) the result should be one large and two medium (e.g. 
PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2: 0.500013) 
but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1: 2.14213e-11  
NPR2: 0.33.
Finally for a roughly spherical molecule (neopentane) the NPR values look 
reasonable (no great surprise) but the absolute PMI values may be too small: 
old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2: 6.59488  
PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35

Your expectations are correct: the current RDKit implementation is wrong. The 
corresponding github entry is here: https://github.com/rdkit/rdkit/issues/1262
This is due to a mistake in the way the principal moments are calculated (which 
is due to the fact that I don't spend a lot of time working with/thinking about 
3D descriptors). Instead of using the eigenvectors/eigenvalues of the inertia 
matrix (the tensor of inertia) the RDKit is currently using the covariance 
matrix. There's some more on the relationship between these two here: 
http://number-none.com/blow/inertia/deriving_i.html

The problem is easy to fix (and I have something working here: 
https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up the 
values of the descriptors that are derived from here:
Todeschini and Consonni "Descriptors from Molecular Geometry" Handbook of 
Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
These include the radius of gyration, inertial shape factor, etc.
Within that article they state that Ic = 0 for planar molecules. Ignoring the 
inequality on page 1010, which says that Ic is the largest moment and is 
contradicted by the rest of the text (particularly the inequalities on page 
1011), Ic corresponds to the smallest principal moment: PMI1.

So now I'm confused, but I'm hoping this is obvious to someone versed in the 
field: I'd like to reproduce the descriptors described in the Todeschini 
article, but I clearly

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Guillaume GODIN
Dear Greg,


I suspect that it's a precision error or a difference in the eigen algorithm 
between the RDKit C++ code and Dragon.


To obtain good values, I suggest implementing a test on the eigenvalues, 
like I did in the gateway.cpp implementation.



#include <Eigen/Dense>
using namespace Eigen;

// Thin SVD of A.
JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
    JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
    return mysvd;
}

// get the A-1 matrix (pseudo-inverse) using the SVD
MatrixXd GetPinv(const MatrixXd &A) {
    JacobiSVD<MatrixXd> svd = getSVD(A);
    double pinvtoler = 1.e-2;  // choose your tolerance wisely!
    VectorXd vs = svd.singularValues();
    VectorXd vsinv = svd.singularValues();

    // Invert only the singular values above the tolerance; zero the rest.
    for (int i = 0; i < vs.size(); ++i) {
        if (vs(i) > pinvtoler)
            vsinv(i) = 1.0 / vs(i);
        else
            vsinv(i) = 0.0;
    }

    MatrixXd S = vsinv.asDiagonal();
    MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
    return Ap;
}
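
The "test on the eigenvalues" suggested above can be sketched without Eigen. This is only an illustration under stated assumptions: the helper name zeroSmallEigenvalues and the relative tolerance of 1e-8 are hypothetical and are not taken from RDKit or from the code above, which uses an absolute tolerance of 1.e-2. The idea is to treat eigenvalues that are negligible compared with the largest one as exactly zero, so that round-off noise does not leak into ratios such as NPR1.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not an RDKit function): set every eigenvalue whose
// magnitude is negligible relative to the largest magnitude to exactly zero.
std::vector<double> zeroSmallEigenvalues(std::vector<double> values,
                                         double relTol = 1e-8) {
  assert(relTol >= 0.0);
  // Find the largest magnitude to scale the tolerance.
  double largest = 0.0;
  for (double v : values) largest = std::max(largest, std::fabs(v));
  // Clamp anything at or below relTol * largest.
  for (double &v : values) {
    if (std::fabs(v) <= relTol * largest) v = 0.0;
  }
  return values;
}
```

Applied to the three benzene values quoted in this thread (2.37457e-10, 11.0844, 11.0851), only the first falls under the threshold and is clamped. A relative tolerance is used here so that the cut-off does not depend on the units or size of the molecule; whether that is the right choice for these descriptors is a judgment call.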


If that doesn't solve the problem, I would like to test it in Matlab. Can you 
provide me with the three 3D xyz coordinate matrices for your examples, please?


I also have Dragon 6.


best regards,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Greg Landrum <greg.land...@gmail.com>
Sent: Sunday, 15 January 2017 11:50
To: Chris Earnshaw; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

I managed to make some time to look into this this weekend and I've found a bug 
and something I don't understand. Hopefully the community can help out here.

On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
<cgearns...@gmail.com<mailto:cgearns...@gmail.com>> wrote:
4) The big one! The returned results look very odd. They appear to relate more 
to the dimensions of the molecule than the moments of inertia. For a rod-like 
molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 
6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828  NPR2: 0.98) but 
actually get PMI1: 0.061647  PMI2: 0.061652  PMI3: 25.3699  NPR1: 0.002430  
NPR2: 0.002430.
For disk-like (benzene) the result should be one large and two medium (e.g. 
PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2: 0.500013) 
but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1: 2.14213e-11  
NPR2: 0.33.
Finally for a roughly spherical molecule (neopentane) the NPR values look 
reasonable (no great surprise) but the absolute PMI values may be too small: 
old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2: 6.59488  
PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35

Your expectations are correct: the current RDKit implementation is wrong. The 
corresponding github entry is here: https://github.com/rdkit/rdkit/issues/1262
This is due to a mistake in the way the principal moments are calculated (which 
is due to the fact that I don't spend a lot of time working with/thinking about 
3D descriptors). Instead of using the eigenvectors/eigenvalues of the inertia 
matrix (the tensor of inertia) the RDKit is currently using the covariance 
matrix. There's some more on the relationship between these two here: 
http://number-none.com/blow/inertia/deriving_i.html

The problem is easy to fix (and I have something working here: 
https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up the 
values of the descriptors that are derived from here:
Todeschini and Consonni "Descriptors from Molecular Geometry" Handbook of 
Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
These include the radius of gyration, inertial shape factor, etc.
Within that article they state that Ic = 0 for planar molecules. Ignoring the 
inequality on page 1010, which says that Ic is the largest moment and is 
contradicted by the rest of the text (particularly the inequalities on page 
1011), Ic corresponds to the smallest principal moment : PMI1.

So now I'm confused, but I'm hoping this is obvious to someone versed in the 
field: I'd like to reproduce the descriptors described in the Todeschini 
article, but I clearly can't do that using the actual moments of inertia. I 
could keep using the eigenvalues of the covariance matrix there, but that 
doesn't match what's described in the text.

Two things that would be extremely helpful:
1) an explanation of the disconnect here from someone who knows this stuff, I 
would guess that it's pretty simple
2) The results of running the files github1262_1.mol, github1262_2.mol, and 
github1262_3.mol from here: 
https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data
 through Dragon and calculating the radius of gyration, inertial shape factor, 
eccentricity, molecular asphericity, and spherocity index.

Best,
-greg

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Greg Landrum
I managed to make some time to look into this this weekend and I've found a
bug and something I don't understand. Hopefully the community can help out
here.

On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
wrote:

> 4) The big one! The returned results look very odd. They appear to relate
> more to the dimensions of the molecule than the moments of inertia. For a
> rod-like molecule (dimethylacetylene) I'd expect two large and one small
> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium
> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
> 2.14213e-11  NPR2: 0.33.
> Finally for a roughly spherical molecule (neopentane) the NPR values look
> reasonable (no great surprise) but the absolute PMI values may be too
> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>

Your expectations are correct: the current RDKit implementation is wrong.
The corresponding github entry is here:
https://github.com/rdkit/rdkit/issues/1262
This is due to a mistake in the way the principal moments are calculated
(which is due to the fact that I don't spend a lot of time working
with/thinking about 3D descriptors). Instead of using the
eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the
RDKit is currently using the covariance matrix. There's some more on the
relationship between these two here:
http://number-none.com/blow/inertia/deriving_i.html
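
The relationship between the two matrices can be made concrete with a small, library-free sketch. Everything named here (Atom, Mat3, secondMomentMatrix, inertiaTensor, planarHexagon) is hypothetical and is not RDKit code. It relies only on the standard rigid-body identity I = trace(S) * Identity - S, where S = sum of m * r r^T is the mass-weighted second-moment (covariance-like) matrix and the coordinates are taken relative to the centre of mass.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper types for this sketch (not RDKit classes).
struct Atom {
  double mass, x, y, z;
};
using Mat3 = std::array<std::array<double, 3>, 3>;

// Mass-weighted second-moment matrix: S = sum m * r r^T.
// Coordinates are assumed to be relative to the centre of mass already.
Mat3 secondMomentMatrix(const std::vector<Atom> &atoms) {
  Mat3 S{};  // zero-initialised
  for (const Atom &a : atoms) {
    const double r[3] = {a.x, a.y, a.z};
    for (int i = 0; i < 3; ++i)
      for (int j = 0; j < 3; ++j) S[i][j] += a.mass * r[i] * r[j];
  }
  return S;
}

// Inertia tensor: I = sum m * (|r|^2 * Identity - r r^T)
//                   = trace(S) * Identity - S.
Mat3 inertiaTensor(const std::vector<Atom> &atoms) {
  const Mat3 S = secondMomentMatrix(atoms);
  const double tr = S[0][0] + S[1][1] + S[2][2];
  Mat3 I{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j) I[i][j] = (i == j ? tr : 0.0) - S[i][j];
  return I;
}

// Six unit masses on a regular hexagon of radius r in the xy plane,
// a crude stand-in for a planar ring.
std::vector<Atom> planarHexagon(double r) {
  assert(r > 0.0);
  const double kPi = std::acos(-1.0);
  std::vector<Atom> atoms;
  for (int k = 0; k < 6; ++k) {
    const double t = k * kPi / 3.0;
    atoms.push_back({1.0, r * std::cos(t), r * std::sin(t), 0.0});
  }
  return atoms;
}
```

For the planar hexagon both matrices come out diagonal. The out-of-plane entry of S is zero, while the out-of-plane entry of the inertia tensor equals the sum of the two in-plane entries, so the inertia-based ratios are both 0.5. That is the same pattern as the "expected" benzene values quoted above (one large and two medium moments), whereas the zero-valued entry belongs to the second-moment matrix. Whether this is what the Todeschini text means by Ic = 0 is not established in this thread.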

The problem is easy to fix (and I have something working here:
https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up
the values of the descriptors that are derived from here:
Todeschini and Consonni "Descriptors from Molecular Geometry" Handbook of
Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
These include the radius of gyration, inertial shape factor, etc.
Within that article they state that Ic = 0 for planar molecules. Ignoring
the inequality on page 1010, which says that Ic is the largest moment and
is contradicted by the rest of the text (particularly the inequalities on
page 1011), Ic corresponds to the smallest principal moment : PMI1.

So now I'm confused, but I'm hoping this is obvious to someone versed in
the field: I'd like to reproduce the descriptors described in the
Todeschini article, but I clearly can't do that using the actual moments of
inertia. I could keep using the eigenvalues of the covariance matrix there,
but that doesn't match what's described in the text.

Two things that would be extremely helpful:
1) an explanation of the disconnect here from someone who knows this stuff,
I would guess that it's pretty simple
2) The results of running the files github1262_1.mol, github1262_2.mol, and
github1262_3.mol from here:
https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data
through Dragon and calculating the radius of gyration, inertial shape
factor, eccentricity, molecular asphericity, and spherocity index.

Best,
-greg



>
--
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] PMI API

2017-01-08 Thread Greg Landrum
A more straightforward solution to this one, and what I probably should
have done in the first place, would be to not include the conditional
compilation directives in the PMI.h header file. It should be fine to have
the declarations in the header even if there is no corresponding
definition, and then client code wouldn't need to know about the extra
options.

PR coming.
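
For readers wondering why the flag mattered at all, here is a minimal sketch of the mechanism. It is not the contents of PMI.h: demoPMI1 and clientCall are hypothetical stand-ins, and only the macro name RDK_BUILD_DESCRIPTORS3D comes from this thread. When a declaration sits inside an #ifdef block, client code sees it only if the macro is defined, which is what -DRDK_BUILD_DESCRIPTORS3D on the command line achieves.

```cpp
#include <cassert>

// Has the same effect as passing -DRDK_BUILD_DESCRIPTORS3D to the compiler.
// Remove this line and clientCall() below no longer compiles.
#define RDK_BUILD_DESCRIPTORS3D

// Stand-in for a header with a guarded declaration (hypothetical, not PMI.h).
#ifdef RDK_BUILD_DESCRIPTORS3D
double demoPMI1(double a, double b, double c);
#endif

// Stand-in for client code: it can call demoPMI1 only because the guarded
// declaration above is visible at this point.
double clientCall() {
  double v = demoPMI1(150.434, 6.61651, 150.434);
  assert(v > 0.0);
  return v;
}

// Stand-in for the definition that lives in the compiled library:
// returns the smallest of the three values.
double demoPMI1(double a, double b, double c) {
  double smallest = a < b ? a : b;
  return smallest < c ? smallest : c;
}
```

Dropping the guard from the header, as proposed above, leaves the declaration always visible. A build without the 3D descriptors would then fail at link time rather than at compile time, which is arguably the clearer failure for client code.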

On Sun, Jan 8, 2017 at 7:17 PM, Brian Kelley  wrote:

> I think the relevant issue is that if you are using an existing build, we
> don't yet have the capability for you to know what was built and what was
> not.  I.e. You need to add the compiler flag to indicate that the 3D stuff
> was actually built.
>
> I had a PR to fix this a while ago that was postponed that we should
> probably resurrect.  Basically it is an rdkit.h header file that has these
> flags built in so you won't have to include them yourself.
>
> 
> Brian Kelley
>
> On Jan 8, 2017, at 11:31 AM, Greg Landrum  wrote:
>
> Hi Chris,
>
> The RDKit should automatically build with the new descriptors enabled if
> eigen3 can be found when cmake is run. When you run cmake you should see a
> message if/when the build is disabled.
>
> If you want to call the functions, the best documentation available is the
> standard C++ API documentation, but something seems to have gone wrong when
> I ran doxygen. I'll look into this. That documentation is generated from
> the header file, so you can just look there:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/PMI.h
> not that there's a huge amount of documentation available.
>
> W.r.t. efficiency: you do need to call the functions individually, but the
> expensive calculation of the moments will only be done once, so it doesn't
> end up doing repeated work.
>
> And, finally, on the values themselves: I will have to take a look at
> that.
> -greg
>
>
>
>
> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
> wrote:
>
>> Hi
>>
>> A while ago I had a project which needed PMI descriptors (specifically
>> NPR1 and NPR2) which were not available in the main branch of RDKit at the
>> time. At the time I used the fork by 'hahnda6' which provided the
>> calcPMIDescriptors() function, and this worked well. Now that PMI
>> descriptors are available in the main RDKit distribution I thought I'd
>> rewrite my code to use the official version.
>>
>> Building the new RDKit was no problem, but things went downhill shortly
>> after that. There's every chance that I've missed the relevant
>> documentation (I hope someone can point me in the right direction if so)
>> and done something stupid!
>>
>> The issues are -
>> 1) I can't find any documentation of the C++ API - the only reference to
>> PMI in the online RDKit documentation appears to be to the PMI.h file
>> 2) Having written a program using the PMI[123] and/or NPR[12] functions,
>> I couldn't get it to compile until I added the  -DRDK_BUILD_DESCRIPTORS3D
>> directive -
>> g++ -o sdf_pmi_blob sdf_pmi.cpp -I/packages/rdkit/include/rdkit
>> -L/packages/rdkit/lib -lDescriptors -lGraphMol -lFileParsers
>> -Wno-deprecated -O2 -DRDK_BUILD_DESCRIPTORS3D
>> This seems a bit odd...
>> 3) Is it necessary to make separate calls to the individual PMI() and/or
>> NPR() functions? Surely this results in duplication of some of the heavier
>> calculations? I can't find any equivalent of calcPMIDescriptors() which
>> returned a 'Moments' struct containing all the PMI and NPR values in one go.
>> 4) The big one! The returned results look very odd. They appear to relate
>> more to the dimensions of the molecule than the moments of inertia. For a
>> rod-like molecule (dimethylacetylene) I'd expect two large and one small
>> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
>> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
>> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
>> For disk-like (benzene) the result should be one large and two medium
>> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
>> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
>> 2.14213e-11  NPR2: 0.33.
>> Finally for a roughly spherical molecule (neopentane) the NPR values look
>> reasonable (no great surprise) but the absolute PMI values may be too
>> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
>> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
>> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>>
>> As I say, it's entirely likely that I'm doing something stupid here so
>> any pointers will be gratefully received. FWIW, the core of my program is -
>> mol = MolBlockToMol(ctab, true, false);
>> double pmi1 = RDKit::Descriptors::PMI1(*mol);
>> double pmi2 = RDKit::Descriptors::PMI2(*mol);
>> double pmi3 = RDKit::Descriptors::PMI3(*mol);
>> double npr1 = RDKit::Descriptors::NPR1(*mol);
>> double npr2 = 

Re: [Rdkit-discuss] PMI API

2017-01-08 Thread Chris Earnshaw
Hi Brian & Greg

Many thanks for the replies. I built RDKit with Descriptors3D enabled
without any problems, it was working out how to tell the compiler to
process my source code using the new functions which was troublesome. It
would be very helpful if the need for the -DRDK_BUILD_DESCRIPTORS3D
compiler directive was documented, e.g. with a comment near the top of
PMI.h, at least until a better solution is in place.

Good to know that the expensive calculation is only done once. Hope it
won't be difficult to sort out the strange PMI & NPR values - please let me
know if you need any more information from me.

Chris Earnshaw


On 8 Jan 2017 18:17, "Brian Kelley"  wrote:

I think the relevant issue is that if you are using an existing build, we
don't yet have the capability for you to know what was built and what was
not.  I.e. You need to add the compiler flag to indicate that the 3D stuff
was actually built.

I had a PR to fix this a while ago that was postponed that we should
probably resurrect.  Basically it is an rdkit.h header file that has these
flags built in so you won't have to include them yourself.


Brian Kelley

On Jan 8, 2017, at 11:31 AM, Greg Landrum  wrote:

Hi Chris,

The RDKit should automatically build with the new descriptors enabled if
eigen3 can be found when cmake is run. When you run cmake you should see a
message if/when the build is disabled.

If you want to call the functions, the best documentation available is the
standard C++ API documentation, but something seems to have gone wrong when
I ran doxygen. I'll look into this. That documentation is generated from
the header file, so you can just look there:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/PMI.h
not that there's a huge amount of documentation available.

W.r.t. efficiency: you do need to call the functions individually, but the
expensive calculation of the moments will only be done once, so it doesn't
end up doing repeated work.

And, finally, on the values themselves: I will have to take a look at that.
-greg




On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
wrote:

> Hi
>
> A while ago I had a project which needed PMI descriptors (specifically
> NPR1 and NPR2) which were not available in the main branch of RDKit at the
> time. At the time I used the fork by 'hahnda6' which provided the
> calcPMIDescriptors() function, and this worked well. Now that PMI
> descriptors are available in the main RDKit distribution I thought I'd
> rewrite my code to use the official version.
>
> Building the new RDKit was no problem, but things went downhill shortly
> after that. There's every chance that I've missed the relevant
> documentation (I hope someone can point me in the right direction if so)
> and done something stupid!
>
> The issues are -
> 1) I can't find any documentation of the C++ API - the only reference to
> PMI in the online RDKit documentation appears to be to the PMI.h file
> 2) Having written a program using the PMI[123] and/or NPR[12] functions, I
> couldn't get it to compile until I added the  -DRDK_BUILD_DESCRIPTORS3D
> directive -
> g++ -o sdf_pmi_blob sdf_pmi.cpp -I/packages/rdkit/include/rdkit
> -L/packages/rdkit/lib -lDescriptors -lGraphMol -lFileParsers
> -Wno-deprecated -O2 -DRDK_BUILD_DESCRIPTORS3D
> This seems a bit odd...
> 3) Is it necessary to make separate calls to the individual PMI() and/or
> NPR() functions? Surely this results in duplication of some of the heavier
> calculations? I can't find any equivalent of calcPMIDescriptors() which
> returned a 'Moments' struct containing all the PMI and NPR values in one go.
> 4) The big one! The returned results look very odd. They appear to relate
> more to the dimensions of the molecule than the moments of inertia. For a
> rod-like molecule (dimethylacetylene) I'd expect two large and one small
> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium
> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
> 2.14213e-11  NPR2: 0.33.
> Finally for a roughly spherical molecule (neopentane) the NPR values look
> reasonable (no great surprise) but the absolute PMI values may be too
> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>
> As I say, it's entirely likely that I'm doing something stupid here so any
> pointers will be gratefully received. FWIW, the core of my program is -
> mol = MolBlockToMol(ctab, true, false);
> double pmi1 = RDKit::Descriptors::PMI1(*mol);
> double pmi2 = RDKit::Descriptors::PMI2(*mol);
> 

Re: [Rdkit-discuss] PMI API

2017-01-08 Thread Brian Kelley
I think the relevant issue is that if you are using an existing build, we don't 
yet have the capability for you to know what was built and what was not.  I.e. 
You need to add the compiler flag to indicate that the 3D stuff was actually 
built.  

I had a PR to fix this a while ago that was postponed that we should probably 
resurrect.  Basically it is an rdkit.h header file that has these flags built 
in so you won't have to include them yourself.


Brian Kelley

> On Jan 8, 2017, at 11:31 AM, Greg Landrum  wrote:
> 
> Hi Chris,
> 
> The RDKit should automatically build with the new descriptors enabled if 
> eigen3 can be found when cmake is run. When you run cmake you should see a 
> message if/when the build is disabled.
> 
> If you want to call the functions, the best documentation available is the 
> standard C++ API documentation, but something seems to have gone wrong when I 
> ran doxygen. I'll look into this. That documentation is generated from the 
> header file, so you can just look there:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/PMI.h
> not that there's a huge amount of documentation available.
> 
> W.r.t. efficiency: you do need to call the functions individually, but the 
> expensive calculation of the moments will only be done once, so it doesn't 
> end up doing repeated work.
> 
> And, finally, on the values themselves: I will have to take a look at that. 
> -greg
> 
> 
> 
> 
>> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw  wrote:
>> Hi
>> 
>> A while ago I had a project which needed PMI descriptors (specifically NPR1 
>> and NPR2) which were not available in the main branch of RDKit at the time. 
>> At the time I used the fork by 'hahnda6' which provided the 
>> calcPMIDescriptors() function, and this worked well. Now that PMI 
>> descriptors are available in the main RDKit distribution I thought I'd 
>> rewrite my code to use the official version.
>> 
>> Building the new RDKit was no problem, but things went downhill shortly 
>> after that. There's every chance that I've missed the relevant documentation 
>> (I hope someone can point me in the right direction if so) and done 
>> something stupid!
>> 
>> The issues are -
>> 1) I can't find any documentation of the C++ API - the only reference to PMI 
>> in the online RDKit documentation appears to be to the PMI.h file
>> 2) Having written a program using the PMI[123] and/or NPR[12] functions, I 
>> couldn't get it to compile until I added the  -DRDK_BUILD_DESCRIPTORS3D 
>> directive -
>> g++ -o sdf_pmi_blob sdf_pmi.cpp -I/packages/rdkit/include/rdkit 
>> -L/packages/rdkit/lib -lDescriptors -lGraphMol -lFileParsers -Wno-deprecated 
>> -O2 -DRDK_BUILD_DESCRIPTORS3D
>> This seems a bit odd...
>> 3) Is it necessary to make separate calls to the individual PMI() and/or 
>> NPR() functions? Surely this results in duplication of some of the heavier 
>> calculations? I can't find any equivalent of calcPMIDescriptors() which 
>> returned a 'Moments' struct containing all the PMI and NPR values in one go.
>> 4) The big one! The returned results look very odd. They appear to relate 
>> more to the dimensions of the molecule than the moments of inertia. For a 
>> rod-like molecule (dimethylacetylene) I'd expect two large and one small PMI 
>> (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828  NPR2: 
>> 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:  25.3699  
>> NPR1: 0.002430  NPR2: 0.002430.
>> For disk-like (benzene) the result should be one large and two medium (e.g. 
>> PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2: 0.500013) 
>> but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1: 2.14213e-11  
>> NPR2: 0.33.
>> Finally for a roughly spherical molecule (neopentane) the NPR values look 
>> reasonable (no great surprise) but the absolute PMI values may be too small: 
>> old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
>> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2: 6.59488  
>> PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>> 
>> As I say, it's entirely likely that I'm doing something stupid here so any 
>> pointers will be gratefully received. FWIW, the core of my program is -
>> mol = MolBlockToMol(ctab, true, false);
>> double pmi1 = RDKit::Descriptors::PMI1(*mol);
>> double pmi2 = RDKit::Descriptors::PMI2(*mol);
>> double pmi3 = RDKit::Descriptors::PMI3(*mol);
>> double npr1 = RDKit::Descriptors::NPR1(*mol);
>> double npr2 = RDKit::Descriptors::NPR2(*mol);
>> 
>> Thanks for any help!
>> Chris
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> 

Re: [Rdkit-discuss] PMI API

2017-01-08 Thread Greg Landrum
Hi Chris,

The RDKit should automatically build with the new descriptors enabled if
eigen3 can be found when cmake is run. When you run cmake you should see a
message if/when the build is disabled.

If you want to call the functions, the best documentation available is the
standard C++ API documentation, but something seems to have gone wrong when
I ran doxygen. I'll look into this. That documentation is generated from
the header file, so you can just look there:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/PMI.h
not that there's a huge amount of documentation available.

W.r.t. efficiency: you do need to call the functions individually, but the
expensive calculation of the moments will only be done once, so it doesn't
end up doing repeated work.
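
If a single result object is still wanted, in the spirit of the 'Moments' struct mentioned in the question, a thin wrapper is one option. The sketch below is an assumption-laden illustration: Moments and makeMoments are hypothetical names, not part of the RDKit API, and the three principal moments are assumed to have been obtained already (for example from the individual PMI calls). It uses the usual definitions NPR1 = PMI1/PMI3 and NPR2 = PMI2/PMI3 with PMI1 <= PMI2 <= PMI3.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical bundle of all five values (not an RDKit type).
struct Moments {
  double pmi1, pmi2, pmi3;  // principal moments, ascending
  double npr1, npr2;        // normalised ratios pmi1/pmi3 and pmi2/pmi3
};

// Build the bundle from three principal moments given in any order.
Moments makeMoments(std::array<double, 3> pmi) {
  std::sort(pmi.begin(), pmi.end());  // enforce pmi1 <= pmi2 <= pmi3
  assert(pmi[2] > 0.0);               // avoid dividing by zero
  return Moments{pmi[0], pmi[1], pmi[2], pmi[0] / pmi[2], pmi[1] / pmi[2]};
}
```

For example, makeMoments({150.434, 6.61651, 150.434}) sorts the rod-like values quoted in the question and gives an npr1 that agrees with the 0.0439828 reported there.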

And, finally, on the values themselves: I will have to take a look at that.
-greg




On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw 
wrote:

> Hi
>
> A while ago I had a project which needed PMI descriptors (specifically
> NPR1 and NPR2) which were not available in the main branch of RDKit at the
> time. At the time I used the fork by 'hahnda6' which provided the
> calcPMIDescriptors() function, and this worked well. Now that PMI
> descriptors are available in the main RDKit distribution I thought I'd
> rewrite my code to use the official version.
>
> Building the new RDKit was no problem, but things went downhill shortly
> after that. There's every chance that I've missed the relevant
> documentation (I hope someone can point me in the right direction if so)
> and done something stupid!
>
> The issues are -
> 1) I can't find any documentation of the C++ API - the only reference to
> PMI in the online RDKit documentation appears to be to the PMI.h file
> 2) Having written a program using the PMI[123] and/or NPR[12] functions, I
> couldn't get it to compile until I added the  -DRDK_BUILD_DESCRIPTORS3D
> directive -
> g++ -o sdf_pmi_blob sdf_pmi.cpp -I/packages/rdkit/include/rdkit
> -L/packages/rdkit/lib -lDescriptors -lGraphMol -lFileParsers
> -Wno-deprecated -O2 -DRDK_BUILD_DESCRIPTORS3D
> This seems a bit odd...
> 3) Is it necessary to make separate calls to the individual PMI() and/or
> NPR() functions? Surely this results in duplication of some of the heavier
> calculations? I can't find any equivalent of calcPMIDescriptors() which
> returned a 'Moments' struct containing all the PMI and NPR values in one go.
> 4) The big one! The returned results look very odd. They appear to relate
> more to the dimensions of the molecule than the moments of inertia. For a
> rod-like molecule (dimethylacetylene) I'd expect two large and one small
> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium
> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
> 2.14213e-11  NPR2: 0.33.
> Finally for a roughly spherical molecule (neopentane) the NPR values look
> reasonable (no great surprise) but the absolute PMI values may be too
> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>
> As I say, it's entirely likely that I'm doing something stupid here so any
> pointers will be gratefully received. FWIW, the core of my program is -
> mol = MolBlockToMol(ctab, true, false);
> double pmi1 = RDKit::Descriptors::PMI1(*mol);
> double pmi2 = RDKit::Descriptors::PMI2(*mol);
> double pmi3 = RDKit::Descriptors::PMI3(*mol);
> double npr1 = RDKit::Descriptors::NPR1(*mol);
> double npr2 = RDKit::Descriptors::NPR2(*mol);
>
> Thanks for any help!
> Chris
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>


Re: [Rdkit-discuss] PMI API

2017-01-08 Thread Chris Earnshaw
Hi David

Thanks for the rapid reply! Looks like a very useful document for people
getting started with the RDKit C++ API.

As you suspected, I'm slightly beyond that stage having been an RDKit user
for a number of years. My queries are specifically to do with using the PMI
functionality; most particularly why the numbers produced by the current
implementation don't appear to match expected values for particular shapes
of molecule, but also the lack of information about the PMI-related
functions in the main RDKit C++ API documentation and the apparently odd
requirement for the -DRDK_BUILD_DESCRIPTORS3D flag (looks more like a cmake
directive) when compiling a program which uses GraphMol/Descriptors3D
functions.
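
To make "expected values" concrete, here is a minimal standalone sketch (no RDKit calls at all) of the textbook calculation I have in mind: build the mass-weighted inertia tensor about the centre of mass, take its three eigenvalues as PMI1 <= PMI2 <= PMI3, and form NPR1 = PMI1/PMI3 and NPR2 = PMI2/PMI3. The names Atom, Moments and principalMoments are made up for this illustration and are not RDKit API, and the closed-form eigenvalue step is just one convenient choice; it loses some precision when two moments coincide.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical types for this sketch only; they are NOT part of RDKit.
struct Atom {
  double mass;     // atomic mass (amu)
  double x, y, z;  // Cartesian coordinates (Angstrom)
};

struct Moments {
  double pmi1, pmi2, pmi3;  // principal moments, pmi1 <= pmi2 <= pmi3
  double npr1, npr2;        // pmi1 / pmi3 and pmi2 / pmi3
};

// Mass-weighted principal moments of inertia about the centre of mass.
Moments principalMoments(const std::vector<Atom> &atoms) {
  assert(!atoms.empty());

  // Centre of mass.
  double m = 0.0, cx = 0.0, cy = 0.0, cz = 0.0;
  for (const Atom &a : atoms) {
    m += a.mass;
    cx += a.mass * a.x;
    cy += a.mass * a.y;
    cz += a.mass * a.z;
  }
  cx /= m;
  cy /= m;
  cz /= m;

  // Symmetric inertia tensor (six independent elements).
  double ixx = 0, iyy = 0, izz = 0, ixy = 0, ixz = 0, iyz = 0;
  for (const Atom &a : atoms) {
    const double dx = a.x - cx, dy = a.y - cy, dz = a.z - cz;
    ixx += a.mass * (dy * dy + dz * dz);
    iyy += a.mass * (dx * dx + dz * dz);
    izz += a.mass * (dx * dx + dy * dy);
    ixy -= a.mass * dx * dy;
    ixz -= a.mass * dx * dz;
    iyz -= a.mass * dy * dz;
  }

  // Eigenvalues of a symmetric 3x3 matrix (closed-form trigonometric method).
  // If the tensor is already diagonal, the diagonal holds the eigenvalues.
  std::array<double, 3> e = {ixx, iyy, izz};
  const double p1 = ixy * ixy + ixz * ixz + iyz * iyz;
  if (p1 > 0.0) {
    const double q = (ixx + iyy + izz) / 3.0;
    const double p2 = (ixx - q) * (ixx - q) + (iyy - q) * (iyy - q) +
                      (izz - q) * (izz - q) + 2.0 * p1;
    const double p = std::sqrt(p2 / 6.0);
    const double b11 = (ixx - q) / p, b22 = (iyy - q) / p, b33 = (izz - q) / p;
    const double b12 = ixy / p, b13 = ixz / p, b23 = iyz / p;
    const double det = b11 * (b22 * b33 - b23 * b23) -
                       b12 * (b12 * b33 - b23 * b13) +
                       b13 * (b12 * b23 - b22 * b13);
    // Clamp against rounding error before taking the arc cosine.
    const double r = std::max(-1.0, std::min(1.0, det / 2.0));
    const double phi = std::acos(r) / 3.0;
    const double pi = std::acos(-1.0);
    e[2] = q + 2.0 * p * std::cos(phi);                   // largest
    e[0] = q + 2.0 * p * std::cos(phi + 2.0 * pi / 3.0);  // smallest
    e[1] = 3.0 * q - e[0] - e[2];                         // from the trace
  }
  std::sort(e.begin(), e.end());

  Moments out{e[0], e[1], e[2], 0.0, 0.0};
  if (e[2] > 0.0) {  // a single atom has no moments; leave the ratios at 0
    out.npr1 = e[0] / e[2];
    out.npr2 = e[1] / e[2];
  }
  return out;
}
```

With two unit masses at z = +1 and z = -1 this gives PMIs of 0, 2 and 2 (one small, two large), and with four unit masses at (1,0,0), (-1,0,0), (0,1,0) and (0,-1,0) it gives 2, 2 and 4, i.e. NPR1 = NPR2 = 0.5. That is the pattern I'd expect for rod-like and disk-like molecules respectively.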

Cheers,
Chris

On 8 January 2017 at 12:13, David Cosgrove wrote:

> Hi Chris,
> I can help a bit with the first point - I am currently 'porting' the
> getting started in Python bit of the documentation to c++. There's a long
> way to go, but if you go to my fork of RDKit at
> https://github.com/DavidACosgrove and check out the GetStartedC++ branch,
> you can at least use what I've managed so far
> (https://github.com/DavidACosgrove/rdkit/blob/GetStartedC%2B%2B/Docs/Book/GettingStartedInC%2B%2B.md).
> It's pretty basic stuff that you may already be beyond, but there are some
> examples and a CMakeLists.txt file that builds them which might be helpful.
>
>
> It's probably time I tidied it up (having just looked at it to get the
> link above, I see there's a typo on the first sentence, for example!) and
> sent in an interim Pull Request as for people starting out it might already
> be of value.
>
> Cheers,
> Dave

Re: [Rdkit-discuss] PMI API

2017-01-08 Thread David Cosgrove
Hi Chris,
I can help a bit with the first point - I am currently 'porting' the
getting started in Python bit of the documentation to c++. There's a long
way to go, but if you go to my fork of RDKit at
https://github.com/DavidACosgrove and check out the GetStartedC++ branch,
you can at least use what I've managed so far
(https://github.com/DavidACosgrove/rdkit/blob/GetStartedC%2B%2B/Docs/Book/GettingStartedInC%2B%2B.md).
It's pretty basic stuff that you may already be beyond, but there are some
examples and a CMakeLists.txt file that builds them which might be helpful.


It's probably time I tidied it up (having just looked at it to get the link
above, I see there's a typo in the first sentence, for example!) and sent
in an interim pull request, since it might already be of value to people
starting out.

Cheers,
Dave

On Sun, 8 Jan 2017 at 10:19, Chris Earnshaw  wrote:

> Hi
>
> A while ago I had a project which needed PMI descriptors (specifically
> NPR1 and NPR2) which were not available in the main branch of RDKit at the
> time. At the time I used the fork by 'hahnda6' which provided the
> calcPMIDescriptors() function, and this worked well. Now that PMI
> descriptors are available in the main RDKit distribution I thought I'd
> rewrite my code to use the official version.
>
> Building the new RDKit was no problem, but things went downhill shortly
> after that. There's every chance that I've missed the relevant
> documentation (I hope someone can point me in the right direction if so)
> and done something stupid!
>
> The issues are -
> 1) I can't find any documentation of the C++ API - the only reference to
> PMI in the online RDKit documentation appears to be to the PMI.h file
> 2) Having written a program using the PMI[123] and/or NPR[12] functions, I
> couldn't get it to compile until I added the -DRDK_BUILD_DESCRIPTORS3D
> directive -
> g++ -o sdf_pmi_blob sdf_pmi.cpp -I/packages/rdkit/include/rdkit
> -L/packages/rdkit/lib -lDescriptors -lGraphMol -lFileParsers
> -Wno-deprecated -O2 -DRDK_BUILD_DESCRIPTORS3D
> This seems a bit odd...
> 3) Is it necessary to make separate calls to the individual PMI() and/or
> NPR() functions? Surely this results in duplication of some of the heavier
> calculations? I can't find any equivalent of calcPMIDescriptors() which
> returned a 'Moments' struct containing all the PMI and NPR values in one go.
> 4) The big one! The returned results look very odd. They appear to relate
> more to the dimensions of the molecule than the moments of inertia. For a
> rod-like molecule (dimethylacetylene) I'd expect two large and one small
> PMI (e.g. PMI1: 6.61651   PMI2: 150.434   PMI3: 150.434  NPR1: 0.0439828
> NPR2: 0.98) but actually get PMI1: 0.061647  PMI2: 0.061652  PMI3:
> 25.3699  NPR1: 0.002430  NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium
> (e.g. PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2:
> 0.500013) but get PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1:
> 2.14213e-11  NPR2: 0.33.
> Finally for a roughly spherical molecule (neopentane) the NPR values look
> reasonable (no great surprise) but the absolute PMI values may be too
> small: old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799
> NPR1: 0.66  NPR2: 0.88, new program - PMI1: 6.59466  PMI2:
> 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>
> As I say, it's entirely likely that I'm doing something stupid here so any
> pointers will be gratefully received. FWIW, the core of my program is -
> mol = MolBlockToMol(ctab, true, false);
> double pmi1 = RDKit::Descriptors::PMI1(*mol);
> double pmi2 = RDKit::Descriptors::PMI2(*mol);
> double pmi3 = RDKit::Descriptors::PMI3(*mol);
> double npr1 = RDKit::Descriptors::NPR1(*mol);
> double npr2 = RDKit::Descriptors::NPR2(*mol);
>
> Thanks for any help!
> Chris