Hi, I've come across a discrepancy between the pubchem fingerprint obtained through CDK (calculated from SMILES) and the pubchem fingerprint extracted directly from the pubchem website. For example, the Canonical SMILES of compound Ampicillin (pubchem CID 6249) is CC1(C(N2C(S1)C(C2=O)NC(=O)C(C3=CC=CC=C3)N)C(=O)O)C. The calculation of pubchem fingerprint based on this SMILES by CDK is 00000000011110110011100000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010110000000000101100000000000000000000000000000001100000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000001111000000100000100000000100000000000000000000000110000101000110001011101100000000100101100100000100010000011110000000000001000001000100010000000001000100001110100100001100000000000100000100000000000000000011000000000000000010000000010001000100010000001100010000000010010001000000010100110000000111010101000001001010100110001100101000110000000000000011001001001011000000000101110001000100000000111000110001000100010000000100011100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 The pubchem fingerprint extracted from pubchem website for this compound is 
11100000011110110011100000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010110000000000101100000000000000000000000000000001100000000000000000000000000000000010110000000000000000000000000000000000000010000000000000000000000000001111000000100000100000000100000000000000000000000110000101000110001011101100000000100101100100000100010000011110000000000001000001000100010000000001000100001110100100001100000000000100000100000000000000000011000000000000000010000000010001000100010000001100010000000010010001000000010100110000000111010101000001001010100110001100101000110000000000000011001001001011000000000101110001000100000000111000110001000100010000000100011100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 The two fingerprints differ on positions 0, 1, 2, 213, 215, and 216. Have any of you encountered a similar issue or could anyone identify what mistake I may have made? Any assistance provided would be greatly appreciated!
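For anyone who wants to reproduce the comparison, here is a minimal sketch of how the differing positions can be listed. It assumes both fingerprints are available as plain strings of `0` and `1` characters of equal length, with positions counted from 0. The function name `differing_positions` and the short toy strings are illustrative only; they stand in for the full bit strings quoted above.

```python
from typing import List


def differing_positions(fp_a: str, fp_b: str) -> List[int]:
    """Return the 0-based indices at which two bit strings differ."""
    if len(fp_a) != len(fp_b):
        # Fingerprints of different lengths cannot be compared bit by bit.
        raise ValueError(
            f"length mismatch: {len(fp_a)} bits vs {len(fp_b)} bits"
        )
    return [i for i, (a, b) in enumerate(zip(fp_a, fp_b)) if a != b]


# Toy 12-bit strings (hypothetical), mimicking the first bits of each fingerprint.
cdk_bits = "000000000111"
pubchem_bits = "111000000111"

print(differing_positions(cdk_bits, pubchem_bits))  # [0, 1, 2]
```

Pasting the two full fingerprints in place of the toy strings should list every position where they disagree, and the length check catches a string that was truncated during copying.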
Thank you, Yihan
_______________________________________________ Cdk-user mailing list Cdk-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/cdk-user