This is my third attempt to get an explanation of how invariants work in
the ECFP fingerprint, because I can't find one anywhere in the
documentation.
I tried generateAtomInvariant() [see below], and the resulting ECFP
bit-vectors for the same molecules had drastically reduced variance: 2360
variant bits without invariants versus 795 with them.
Surprisingly, on this dataset the ECFP with invariants performed better in
terms of affinity ranking. Can someone please explain what happens when I
pass invariants to the AllChem.GetMorganFingerprint() function? I hope I
will get an answer this time.


>> def generateAtomInvariant(mol):
>>     """
>>     >>> generateAtomInvariant(Chem.MolFromSmiles("Cc1ncccc1"))
>>     [341294046, 3184205312, 522345510, 1545984525, 1545984525, 1545984525, 1545984525]
>>     """
>>     num_atoms = mol.GetNumAtoms()
>>     invariants = [0]*num_atoms
>>     for i,a in enumerate(mol.GetAtoms()):
>>         descriptors=[]
>>         descriptors.append(a.GetAtomicNum())
>>         descriptors.append(a.GetTotalDegree())
>>         descriptors.append(a.GetTotalNumHs())
>>         descriptors.append(a.IsInRing())
>>         descriptors.append(a.GetIsAromatic())
>>         invariants[i]=hash(tuple(descriptors))& 0xffffffff
>>     return invariants
>>
>>
>> And then generate the fingerprint like this:
>>
>>
>> fp = AllChem.GetMorganFingerprint(mol, radius=3,
>>                                   invariants=generateAtomInvariant(mol))
>>
>>
>>
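
To make the question concrete, here is a rough sketch of how I picture the
invariants entering the algorithm. It is an assumption on my part, not
RDKit's actual implementation: the list passed as invariants is taken to
replace the default per-atom starting identifiers, and every larger-radius
environment is then hashed from those starting values. The names
stable_hash and morgan_identifiers and the hand-written adjacency list are
hypothetical, and bond orders, chirality and duplicate-environment removal
are left out.

```python
# Simplified Morgan-style iteration for illustration only; not RDKit's code.
# Bond orders, chirality and duplicate-environment removal are left out.
import zlib
from typing import List, Sequence, Set


def stable_hash(values: Sequence[int]) -> int:
    """Hypothetical helper: deterministic 32-bit hash of a sequence of ints."""
    return zlib.crc32(repr(tuple(values)).encode("ascii")) & 0xFFFFFFFF


def morgan_identifiers(neighbors: List[List[int]],
                       invariants: Sequence[int],
                       radius: int) -> Set[int]:
    """Collect every identifier seen from radius 0 up to `radius`."""
    current = list(invariants)          # radius 0: the supplied invariants
    seen = set(current)
    for _ in range(radius):
        updated = []
        for atom, nbrs in enumerate(neighbors):
            # New identifier = hash of own identifier plus sorted neighbour ones.
            environment = [current[atom]] + sorted(current[n] for n in nbrs)
            updated.append(stable_hash(environment))
        current = updated
        seen.update(current)
    return seen


# Heavy-atom graph of 2-methylpyridine, "Cc1ncccc1" (atom index -> neighbours).
neighbors = [[1], [0, 2, 6], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]]

# "Fine" starting invariants: (atomic number, total degree, total Hs).
fine = [stable_hash(d) for d in
        [(6, 4, 3), (6, 3, 0), (7, 2, 0),
         (6, 3, 1), (6, 3, 1), (6, 3, 1), (6, 3, 1)]]
# "Coarse" starting invariants: aromatic flag only.
coarse = [stable_hash((flag,)) for flag in [0, 1, 1, 1, 1, 1, 1]]

fine_ids = morgan_identifiers(neighbors, fine, radius=3)
coarse_ids = morgan_identifiers(neighbors, coarse, radius=3)
print(len(fine_ids), len(coarse_ids))
```

In this toy version, starting invariants that separate fewer atom types
yield fewer distinct identifiers at every radius. If RDKit behaves
similarly, that could be consistent with the drop in variant bits reported
above, but the real numbers depend on details this sketch ignores.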

-- 

======================================================================

Dr. Thomas Evangelidis

Research Scientist

IOCB - Institute of Organic Chemistry and Biochemistry of the Czech Academy
of Sciences <https://www.uochb.cz/web/structure/31.html?lang=en>, Prague,
Czech Republic
  &
CEITEC - Central European Institute of Technology
<https://www.ceitec.eu/>, Brno,
Czech Republic

email: teva...@gmail.com, Twitter: tevangelidis
<https://twitter.com/tevangelidis>, LinkedIn: Thomas Evangelidis
<https://www.linkedin.com/in/thomas-evangelidis-495b45125/>

website: https://sites.google.com/site/thomasevangelidishomepage/