Hi Nils,
> On Jan 27, 2026, at 22:10, Nils Weskamp <[email protected]> wrote:
> there is some discussion of the parameter available at
>
> https://docs.chemaxon.com/display/docs/fingerprints_chemical-hashed-fingerprint.md#src-1806332-chemicalhashedfingerprint-references
It says: "Again, the situation is somewhat different in similarity searching,
yet values higher than 2 rarely increase the amount of information represented
by the fingerprint significantly"
I'd be curious to see any research showing that 2 is more useful than 1 at a
given bit length, since in my own experiments using 2 for similarity performed
worse than using 1.
> I would also assume that it is mainly relevant for database pre-screens in
> substructure searches and provides a simple way to tune the "darkness" of the
> fingerprint.
That's my conclusion too. The mathematics for optimizing screenout in a
substructure pre-screen is different from the mathematics for optimizing
similarity.
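
To make the "darkness" point concrete, here is a minimal sketch using a toy
hashed fingerprint. It is not RDKit's or ChemAxon's actual implementation: the
function names, the SHA-256 hashing scheme, the 64-bit length, and the feature
strings for the two made-up molecules are all assumptions chosen for
illustration. It only shows that setting more bits per feature can raise the
bit density and shift the Tanimoto score; it says nothing about which setting
is better on real data.

```python
import hashlib


def toy_fingerprint(features, n_bits=64, bits_per_feature=1):
    """Hash each feature string to `bits_per_feature` positions (toy scheme)."""
    on_bits = set()
    for feature in features:
        for seed in range(bits_per_feature):
            # A different seed gives a different bit position for the same feature.
            digest = hashlib.sha256(f"{seed}:{feature}".encode()).digest()
            on_bits.add(int.from_bytes(digest[:8], "big") % n_bits)
    return on_bits


def density(on_bits, n_bits=64):
    """Fraction of bits that are set, i.e. the "darkness" of the fingerprint."""
    return len(on_bits) / n_bits


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


# Hypothetical path-like features for two made-up molecules.
mol_a = ["C", "CC", "CCO", "CO", "O", "CCN", "CN", "N"]
mol_b = ["C", "CC", "CCO", "CO", "O", "CCCl", "CCl", "Cl"]

for k in (1, 2):
    fp_a = toy_fingerprint(mol_a, bits_per_feature=k)
    fp_b = toy_fingerprint(mol_b, bits_per_feature=k)
    print(f"bits per feature={k}: density(A)={density(fp_a):.3f}, "
          f"Tanimoto(A, B)={tanimoto(fp_a, fp_b):.3f}")
```

In this toy scheme the bits set with 2 bits per feature are always a superset
of those set with 1, so the density can only go up. For a pre-screen that
changes how many candidates are screened out; for similarity it also changes
how often unrelated features collide on the same bit.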
Andrew
[email protected]
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss