Hi Bruce,


On Mon, May 15, 2017 at 3:46 PM, Bruce Milne <bfmi...@gmail.com> wrote:

> Hi,
>
> I've noticed that if I define a molecule as neutral, positive and negative
> (depending on protonation) the EState indices calculated by RDKit reflect
> the changes as expected:
>
> m = Chem.MolFromSmiles('CCO')
> m_neg = Chem.MolFromSmiles('CC[O-]')
> m_pos = Chem.MolFromSmiles('CC[OH2+]')
>
>
> EState indices:
>
> m          :  [ 1.68055556  0.25        7.56944444]
> m_neg  :  [ 1.56944444  0.          8.93055556]
> m_pos  :  [ 1.79166667  0.5         6.20833333]
>
> However, when I calculate the EState fingerprints there seems to be a
> problem with the (de)protonated oxygen: its value (and even the presence
> of its atom type) fails to show up. If EState.EStateIndices() can
> calculate the value for oxygen in all three states, shouldn't
> Fingerprinter.FingerprintMol() also be able to handle these
> notations?
>

The EState fingerprinter uses a set of atom types defined in Table 1 of
this publication:
http://pubs.acs.org/doi/abs/10.1021/ci00028a014?journalCode=jcics1

Atoms that don't match any of those types don't contribute to the
fingerprint. There are no charged O atoms in the table, so those atoms
don't get typed.
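
A quick way to see which atoms are being skipped is to ask the atom typer
directly. This is only a minimal sketch: it assumes AtomTypes.TypeAtoms()
from rdkit.Chem.EState returns one tuple of type names per atom, with an
empty tuple for atoms that match none of the types; untyped_atoms() is a
helper name made up for this example.

```python
from rdkit import Chem
from rdkit.Chem.EState import AtomTypes, Fingerprinter


def untyped_atoms(smiles):
    """Return (index, symbol, formal charge) for atoms that get no EState type.

    Hypothetical helper for illustration; not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    # Assumption: one tuple of type names per atom, () when nothing matches.
    types = AtomTypes.TypeAtoms(mol)
    return [(atom.GetIdx(), atom.GetSymbol(), atom.GetFormalCharge())
            for atom, names in zip(mol.GetAtoms(), types)
            if not names]


for smi in ('CCO', 'CC[O-]', 'CC[OH2+]'):
    mol = Chem.MolFromSmiles(smi)
    # FingerprintMol returns two arrays: per-type match counts and
    # per-type sums of the EState indices.
    counts, sums = Fingerprinter.FingerprintMol(mol)
    print(smi, 'type matches:', int(counts.sum()),
          'untyped atoms:', untyped_atoms(smi))
```

Any atom reported as untyped here contributes nothing to either array
returned by FingerprintMol(), even though EStateIndices() still computes
a value for it.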

The current implementation is quite literal and only uses the atom types
that are explicitly defined in that table. It might be worth exploring a
different scheme for defining types that handles all (or at least most)
atoms, but that would result in a different fingerprint.

-greg