Hi,

After my presentation at the RDKit UGM, I had a brief discussion with
Greg about the behavior of setting the fromAtoms flag when generating
fingerprints. My test script below illustrates the behavior.
from __future__ import print_function
from rdkit import Chem
from rdkit.Chem.AtomPairs import Torsions
from rdkit import rdBase
print(rdBase.rdkitVersion)
mol = Chem.MolFromSmiles('OCCS')
fp = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol)
print(fp.GetNonzeroElements())
fp1 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[0])
fp2 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[3])
print(fp1 == fp2)
print(fp1.GetNonzeroElements())
print(fp2.GetNonzeroElements())
print("With targetSize=[3]")
fp = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, targetSize=3)
print(fp.GetNonzeroElements())
fp1 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[0],
                                                         targetSize=3)
fp2 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[3],
                                                         targetSize=3)
print(fp1 == fp2)
print(fp1.GetNonzeroElements())
print(fp2.GetNonzeroElements())
The output is
2016.09.1.dev1
{30073176160: 1}
True
{30073176160: 1}
{30073176160: 1}
With targetSize=[3]
{25182241: 1, 58736673: 1}
False
{25182241: 1}
{58736673: 1}
So for targetSize=4 (the default), the fingerprint turns out the same whether
I start from O or from S. For molecular similarity comparisons it seems
sensible not to count every path twice, but as I'm trying to model pKa, it
would probably be easier to tell OH from SH if OCCS and SCCO set different
bits.
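
To make the distinction concrete, here is a minimal pure-Python sketch of the two behaviors. It does not use RDKit, and it is only my assumption that the identical fingerprints come from the path being treated as direction-independent; path_key and its start_first flag are hypothetical names made up for this illustration.

```python
def path_key(symbols, start_first=False):
    """Return a hashable key for a linear path of atom symbols.

    start_first=False: direction-independent, so a path and its
    reverse share one key (assumed to mirror what I observe above).
    start_first=True: direction-aware, the path is read from its
    first atom (the behavior I would like for pKa modelling).
    """
    forward = tuple(symbols)
    if start_first:
        return forward
    backward = forward[::-1]
    # Pick one of the two directions deterministically.
    return min(forward, backward)


path_from_O = ['O', 'C', 'C', 'S']
path_from_S = path_from_O[::-1]

# Direction-independent: OCCS and SCCO collapse to the same key.
same_key = path_key(path_from_O) == path_key(path_from_S)

# Direction-aware: the starting atom changes the key.
aware_same = (path_key(path_from_O, start_first=True)
              == path_key(path_from_S, start_first=True))

# Three-atom paths differ either way, since OCC and SCC
# contain different atoms.
short_same = path_key(['O', 'C', 'C']) == path_key(['S', 'C', 'C'])

print(same_key, aware_same, short_same)
```

With a direction-aware key, the bit set when starting from O would differ from the one set when starting from S, which is what I am after.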
For targetSize=3, there is a clear difference between OCC and SCC, which
suggests that my syntax is correct.

Esben Jannik Bjerrum
cand.pharm, Ph.D
/Sent from my Ubuntu Touch Phone
Phone +45 2823 8009
http://dk.linkedin.com/in/esbenbjerrum
http://www.wildcardconsulting.dk