Hi all, I'm getting ready to make a new chemfp release, which means checking it works with the latest versions of RDKit.

I noticed there was a change in RDKit 2024.09.4 which affects rdMHFPFingerprint. I use that module in chemfp to generate SEC fingerprints (SECFP = SMILES extended connectivity fingerprint), a fingerprint type described in https://link.springer.com/article/10.1186/s13321-018-0321-8 .

In my test set of 19 structures, 2 of them have different SECFPs. One of them is PubChem id 9425030, with SMILES "CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSC[NH2+]3", from https://pubchem.ncbi.nlm.nih.gov/compound/9425030 .

Using 2024.09.3 or earlier I get a set of 74 on-bits. Using 2024.09.4 or later I get a slightly different set of 73 on-bits: 3 bits are missing and 2 new ones are set.

Here is a short program which generates those bits.

    from rdkit import Chem
    from rdkit.Chem import rdMHFPFingerprint

    smiles = "CC(C)NC(=O)N1CCN([C@H](C1)C(=O)N[C@H]2CCCNC2=O)C(=O)[C@@H]3CSC[NH2+]3"
    mol = Chem.MolFromSmiles(smiles)
    encoder = rdMHFPFingerprint.MHFPEncoder(0, 0)
    fp = encoder.EncodeSECFPMol(mol)
    print(list(fp.GetOnBits()))

I've put a more complete reproducible example at https://paste.sr.ht/~dalke/d9593867e2cb68b95a6a166dd5c5b0fa30362b74 .

I don't see anything in the release notes for 2024.09.4 which would cause this change. Can anyone here provide insight?

Best regards,

Andrew
[email protected]

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
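
P.S. For anyone who wants to compare the output of the two RDKit versions, the "bits missing / bits added" counts can be worked out with a plain set difference on the two on-bit lists. This is only a minimal sketch: the helper name diff_on_bits is hypothetical, and the bit values below are made-up placeholders, not the real SECFP bits for this structure. In practice you would paste in the lists printed by the program under each RDKit version.

```python
def diff_on_bits(old_bits, new_bits):
    """Return (missing, added): bits only in old_bits, and bits only in new_bits."""
    old_set, new_set = set(old_bits), set(new_bits)
    missing = sorted(old_set - new_set)  # set by the older version only
    added = sorted(new_set - old_set)    # set by the newer version only
    return missing, added


# Placeholder values for illustration only (NOT real fingerprint bits).
old_bits = [3, 17, 42, 99, 256]
new_bits = [3, 17, 100, 256]

missing, added = diff_on_bits(old_bits, new_bits)
print("missing in new version:", missing)
print("added in new version:", added)
```

Running the program once per RDKit install and feeding both lists to this helper should show which bits differ, which may help narrow down which substructures are encoded differently.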

