Dear RDKitters,
I just calculated RDKit "Daylight-like" fingerprints for a number of public
compound databases and found quite a number of examples where the resulting
fingerprints have *all* bits set to 1. This happens both in KNIME 3.2.1
(1024/1/7) and via the command line (2048/1/7/4) with RDKit 2016.03.
Examples include (from SureChEMBL):
SCHEMBL5141968
SCHEMBL13916889
SCHEMBL16257315
SCHEMBL16257310
SCHEMBL16257297
SCHEMBL16257215
SCHEMBL16257169
SCHEMBL8232906
SCHEMBL16257312
SCHEMBL13011081
SCHEMBL12570100
SCHEMBL14524878
SCHEMBL6370886
SCHEMBL15305169
SCHEMBL16912871
SCHEMBL13290179
Now, these are obviously some very large and complex molecules, so I would
expect that they contain many features and thus set many bits - but all of
them?
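
As a rough sanity check on that expectation, here is a minimal sketch of how
quickly a hashed fingerprint should saturate. It assumes an idealised model in
which every distinct path sets its bits uniformly at random and independently,
which is not necessarily how RDKit behaves. The function names and the example
values (2048 bits, 4 bits per feature) are purely illustrative and not part of
any RDKit API.

```python
import math


def expected_fraction_set(n_features: int, fp_size: int, bits_per_feature: int) -> float:
    """Expected fraction of bits set, assuming idealised uniform hashing."""
    # Every feature triggers bits_per_feature independent bit-setting events;
    # each event misses a given bit with probability 1 - 1/fp_size.
    draws = n_features * bits_per_feature
    return 1.0 - (1.0 - 1.0 / fp_size) ** draws


def features_for_saturation(fp_size: int, bits_per_feature: int,
                            max_unset_bits: float = 1.0) -> int:
    """Smallest number of distinct features at which the expected number of
    unset bits drops to max_unset_bits or below (same idealised model)."""
    # Solve fp_size * (1 - 1/fp_size) ** draws <= max_unset_bits for draws.
    draws = math.log(max_unset_bits / fp_size) / math.log(1.0 - 1.0 / fp_size)
    return math.ceil(draws / bits_per_feature)


if __name__ == "__main__":
    # Illustrative values only: a 2048-bit fingerprint, 4 bits per feature.
    for n in (100, 1000, 5000):
        print(n, round(expected_fraction_set(n, 2048, 4), 4))
    print("features for ~full saturation:", features_for_saturation(2048, 4))
```

Under this model a 2048-bit fingerprint would need on the order of a few
thousand distinct paths before fewer than one bit is expected to remain unset,
so the question becomes whether these molecules really generate that many
distinct paths.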
So, in short: Are these compounds so ugly that it is normal for the
fingerprints to have all bits set or are they so ugly that they trigger
some rare bug in RDKit?
Any ideas / suggestions / comments?
Thanks a lot,
Nils
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss