Hi,

After my presentation at the RDKit UGM, I had a brief discussion with
Greg about the behavior of the fromAtoms argument when generating
fingerprints. My test script below illustrates the behavior.

from __future__ import print_function
from rdkit import Chem
from rdkit.Chem.AtomPairs import Torsions
from rdkit import rdBase

print(rdBase.rdkitVersion)


mol = Chem.MolFromSmiles('OCCS')

fp = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol)
print(fp.GetNonzeroElements())

fp1 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[0])
fp2 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[3])
print(fp1 == fp2)

print(fp1.GetNonzeroElements())
print(fp2.GetNonzeroElements())


print("With targetSize=[3]")

fp = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, targetSize=3)
print(fp.GetNonzeroElements())

fp1 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[0],
                                                         targetSize=3)
fp2 = Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, fromAtoms=[3],
                                                         targetSize=3)
print(fp1 == fp2)

print(fp1.GetNonzeroElements())
print(fp2.GetNonzeroElements())

The output is
2016.09.1.dev1
{30073176160: 1}
True
{30073176160: 1}
{30073176160: 1}
With targetSize=[3]
{25182241: 1, 58736673: 1}
False
{25182241: 1}
{58736673: 1}
So for targetSize=4 (the default), the fingerprint turns out the same whether
I root it at O or at S. For molecular similarity comparisons it seems smart
not to count every path twice, but as I'm trying to model pKa, it would
probably be easier to tell OH from SH if OCCS and SCCO set different bits.
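
To make the point concrete, here is a toy sketch in plain Python (no RDKit involved). It only assumes that a path and its reverse are mapped to one shared key, which is consistent with the output above but is not taken from the RDKit source. The helper names canonical_path_key and rooted_path_key are hypothetical and exist only for this illustration.

```python
def canonical_path_key(symbols):
    """Direction-independent key: a path and its reverse give the same key.

    Assumption for illustration only: the smaller of the two orderings
    is picked as the representative.
    """
    forward = tuple(symbols)
    backward = forward[::-1]
    return min(forward, backward)


def rooted_path_key(symbols):
    """Direction-preserving key: the first element is treated as the root atom."""
    return tuple(symbols)


# The single four-atom path in OCCS, walked from either end.
from_o = ["O", "C", "C", "S"]
from_s = from_o[::-1]

# Same key from both ends, like the identical bit for fromAtoms=[0] and [3].
print(canonical_path_key(from_o) == canonical_path_key(from_s))

# Distinct keys once the direction is kept, which is what I would like for pKa.
print(rooted_path_key(from_o) == rooted_path_key(from_s))
```

With a direction-preserving key, a path rooted at O and one rooted at S would no longer collide, at the price of counting each path twice when no root is given.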

For targetSize=3, there is a clear difference between OCC and SCC, showing that
the syntax should be correct.

Esben Jannik Bjerrum
cand.pharm, Ph.D
/Sent from my Ubuntu Touch Phone

Phone +45 2823 8009
http://dk.linkedin.com/in/esbenbjerrum
http://www.wildcardconsulting.dk