Wojtek,

You can use GetNonzeroElements() to convert the sparse fingerprint to a Python dict mapping each hash to its count.

Cheers,
Gareth


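(This assumes `from rdkit import Chem` and `from rdkit.Chem import AllChem` were run earlier in the session.)

In [7]: mol = Chem.MolFromSmiles('Cn1cnc2n(C)c(=O)n(C)c(=O)c12')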

In [8]: fp = AllChem.GetMorganFingerprint(mol, 2)

In [9]: elements = fp.GetNonzeroElements()

In [10]: elements
Out[10]:
{10565946: 2,
 348155210: 1,
 476388586: 1,
 540046244: 1,
 553412256: 1,
 864942730: 2,
 909857231: 1,
 1100037548: 1,
 1333761024: 1,
 1512818157: 1,
 1981181107: 1,
 2030573601: 1,
 2041434490: 1,
 2092489639: 3,
 2246728737: 3,
 2370996728: 1,
 2877515035: 1,
 2971716993: 1,
 2975126068: 2,
 3140581776: 1,
 3217380708: 4,
 3218693969: 1,
 3462333187: 1,
 3657471097: 3,
 3796970912: 1}

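Since the question was about collisions, here is a minimal sketch of how the dict above could be inspected without RDKit installed. It copies the Out[10] values and folds the feature IDs with a plain modulo, which is an assumption made for illustration and may not match how a particular RDKit function folds its hashed fingerprints. The helper names `fold` and `collisions` are hypothetical, not RDKit API.

```python
from collections import defaultdict

# Feature ID -> count pairs copied from the Out[10] listing above.
elements = {
    10565946: 2, 348155210: 1, 476388586: 1, 540046244: 1, 553412256: 1,
    864942730: 2, 909857231: 1, 1100037548: 1, 1333761024: 1, 1512818157: 1,
    1981181107: 1, 2030573601: 1, 2041434490: 1, 2092489639: 3, 2246728737: 3,
    2370996728: 1, 2877515035: 1, 2971716993: 1, 2975126068: 2, 3140581776: 1,
    3217380708: 4, 3218693969: 1, 3462333187: 1, 3657471097: 3, 3796970912: 1,
}


def fold(elements, n_bits):
    """Group sparse feature IDs by the bit they land on after a modulo fold.

    Assumption: bit = feature_id % n_bits. This is only an illustration.
    """
    buckets = defaultdict(list)
    for feature_id in elements:
        buckets[feature_id % n_bits].append(feature_id)
    return buckets


def collisions(elements, n_bits):
    """Return only the bits that are shared by more than one feature ID."""
    return {bit: ids for bit, ids in fold(elements, n_bits).items() if len(ids) > 1}


if __name__ == "__main__":
    for n_bits in (64, 1024, 2048):
        clashes = collisions(elements, n_bits)
        print(f"n_bits={n_bits}: {len(elements)} features, {len(clashes)} shared bits")
```

Running the same check over the dicts from many molecules would give a rough picture of how often distinct features end up on the same bit at a given length.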

On 4/21/2021 5:44 AM, Wojtek Plonka wrote:
Dear All

Do any of you have a working example of getting Morgan fingerprints as a sparse bit vector (non-hashed) in the 64-bit version using Python? I'm looking into the issue of collisions on the "main hash" on large (100+ million molecules) data sets.
Thank you very much!
Kindest regards,

Wojtek Plonka
+48885756652
wojtekplonka.com <http://www.wojtekplonka.com>
fb.com/wojtek.plonka <https://fb.com/wojtek.plonka>



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss