Dear Eric,
Sure, if fingerprints are not stable over time, some people who check
things very carefully (as you did) will have some surprises.
This being said, you should probably be using InChI keys, if you want a
hash for each molecule.
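For example, here is a minimal sketch, assuming your RDKit build includes
InChI support. The helper name inchi_key is only for illustration. The key is
computed from the molecule's InChI rather than from a fingerprint, so it does
not depend on how the Morgan code serializes its output.

```python
from rdkit import Chem


def inchi_key(smiles):
    """Return the InChIKey for a SMILES string, or None if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToInchiKey(mol)


# Two different SMILES spellings of the same molecule should give the same key.
key_a = inchi_key('Oc1cc(O)c(O)c(O)c1')
key_b = inchi_key('c1(O)cc(O)c(O)c(O)c1')
print(key_a, key_a == key_b)
```

Keep in mind that an InChIKey is itself a hash of the InChI string, so
collisions are unlikely but not impossible.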
Regards,
F.
On 13/01/2023 06:37, Eric Jonas wrote:
Hello! I use the CRC32 of Morgan fingerprints as a quick-and-dirty way
to keep track of different molecules, but now I realize it might have
been too quick and dirty! In particular, there appears to have been a
change in the Morgan code sometime between 2021.09.2 and 2022.03.5.
The following code produces different output under these versions:

import rdkit.Chem
import pickle
from rdkit import Chem
import rdkit.Chem.rdMolDescriptors
import zlib

def get_morgan4_crc32(m):
    mf = Chem.rdMolDescriptors.GetHashedMorganFingerprint(m, 4)
    morgan4_crc32 = zlib.crc32(mf.ToBinary())
    return morgan4_crc32

mol = Chem.AddHs(Chem.MolFromSmiles('Oc1cc(O)c(O)c(O)c1'))
print(get_morgan4_crc32(mol))

2021.09.2 : 1567135676
2022.03.5 : 204854560
I tried looking at the release notes but I didn't seem to see any
breaking changes (I might have missed them!) and I tried looking at
"blame" for the relevant source but didn't see any
seemingly-substantive changes within the relevant timeframe.
So am I doing something crazy here, or did something change
deliberately, or is it possible this is a bug?
...E
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss