Hi Gareth,

Thank you. That is exactly what I am already doing, so it is not the issue. Please note that all the keys in elements fall below 2**32, so the main hash function used is definitely 32-bit.
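To make the concern concrete, here is a minimal sketch of the check and of why the hash width matters at this scale. It uses only a few of the keys from the GetNonzeroElements() output quoted in this thread. The helper expected_colliding_pairs and the item count n are hypothetical and introduced only for illustration. The estimate assumes a uniformly distributed hash, and n would really be the number of distinct atom environments, which is not the same as the number of molecules.

```python
# A few of the keys from the GetNonzeroElements() output quoted in this thread.
keys = [10565946, 348155210, 2092489639, 3217380708, 3796970912]

# Every key fits in an unsigned 32-bit integer.
assert all(0 <= k < 2**32 for k in keys)


def expected_colliding_pairs(n_items: int, hash_bits: int) -> float:
    """Birthday-problem approximation (hypothetical helper).

    There are n(n-1)/2 pairs of distinct items, and under a uniform hash
    each pair collides with probability 2**-hash_bits.
    """
    return n_items * (n_items - 1) / 2 / 2**hash_bits


# Assumption: 100 million distinct hashed items, for illustration only.
n = 100_000_000
print("32-bit:", expected_colliding_pairs(n, 32))
print("64-bit:", expected_colliding_pairs(n, 64))
```

Under these assumptions the 32-bit estimate is on the order of a million colliding pairs, while the 64-bit estimate is far below one, which is why access to the 64-bit version matters here.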
According to https://www.rdkit.org/docs/source/rdkit.Chem.rdFingerprintGenerator.html, both rdkit.Chem.rdFingerprintGenerator.FingerprintGenerator32 and rdkit.Chem.rdFingerprintGenerator.FingerprintGenerator64 exist. However, with my limited knowledge I don't know how to access the 64-bit version, and that is my problem.

Kindest regards,
Wojtek

Wojtek Plonka
+48885756652
wojtekplonka.com <http://www.wojtekplonka.com>
fb.com/wojtek.plonka

On Thu, Apr 22, 2021 at 1:27 AM Gareth Jones <java.jo...@gmail.com> wrote:

> Wojtek,
>
> You can use GetNonzeroElements() to convert the sparse fingerprint to a
> Python dict of hash to count.
>
> Cheers,
> Gareth
>
> In [7]: mol = Chem.MolFromSmiles('Cn1cnc2n(C)c(=O)n(C)c(=O)c12')
>
> In [8]: fp = AllChem.GetMorganFingerprint(mol, 2)
>
> In [9]: elements = fp.GetNonzeroElements();
>
> In [10]: elements
> Out[10]:
> {10565946: 2,
>  348155210: 1,
>  476388586: 1,
>  540046244: 1,
>  553412256: 1,
>  864942730: 2,
>  909857231: 1,
>  1100037548: 1,
>  1333761024: 1,
>  1512818157: 1,
>  1981181107: 1,
>  2030573601: 1,
>  2041434490: 1,
>  2092489639: 3,
>  2246728737: 3,
>  2370996728: 1,
>  2877515035: 1,
>  2971716993: 1,
>  2975126068: 2,
>  3140581776: 1,
>  3217380708: 4,
>  3218693969: 1,
>  3462333187: 1,
>  3657471097: 3,
>  3796970912: 1}
>
> In [11]:
>
> On 4/21/2021 5:44 AM, Wojtek Plonka wrote:
>
> Dear All
>
> Do any of you have a working example of getting Morgan fingerprints as a
> sparse bit vector (non-hashed) in the 64-bit version using Python?
> I'm looking into the issue of collisions on the "main hash" on large (100+
> million molecules) data.
> Thank you very much!
> Kindest regards,
>
> Wojtek Plonka
> +48885756652
> wojtekplonka.com <http://www.wojtekplonka.com>
> fb.com/wojtek.plonka
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss