I would just call BulkTanimotoSimilarity() once for each molecule in your
seed set, querying against the other set, and keep the top N results for
each seed molecule. You can then merge those lists, keeping the best score
for any molecule that appears more than once, to get the top N overall.
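
A rough sketch of that approach, in case it helps. The function name
top_n_similar and the bulk_similarity parameter are made up for this
example; only DataStructs.BulkTanimotoSimilarity is an actual RDKit call.
It assumes pool_fps is a plain list of fingerprints and that "top N overall"
means ranking each candidate by its best similarity to any seed.

```python
import heapq


def top_n_similar(seed_fps, pool_fps, n, bulk_similarity=None):
    """Return up to n pool entries most similar to any seed fingerprint.

    The result is a list of (similarity, pool_index, seed_index) tuples,
    best first. The name and signature are hypothetical, not RDKit API.
    """
    if bulk_similarity is None:
        # Default to RDKit's bulk Tanimoto. Imported lazily so that a
        # different similarity function can be passed in instead.
        from rdkit import DataStructs
        bulk_similarity = DataStructs.BulkTanimotoSimilarity

    best = {}  # pool_index -> (similarity, seed_index)
    for seed_idx, seed_fp in enumerate(seed_fps):
        # One bulk call per seed, querying against the whole other set.
        sims = bulk_similarity(seed_fp, pool_fps)
        # Keep only the top n hits for this seed.
        top = heapq.nlargest(n, enumerate(sims), key=lambda pair: pair[1])
        for pool_idx, sim in top:
            # Merge step: remember the best score seen for each candidate.
            if pool_idx not in best or sim > best[pool_idx][0]:
                best[pool_idx] = (sim, seed_idx)

    merged = [(sim, p, s) for p, (sim, s) in best.items()]
    merged.sort(key=lambda t: (-t[0], t[1]))  # highest similarity first
    return merged[:n]


# Possible usage with RDKit (seed_mols and pool_mols are assumed to be
# lists of RDKit molecules; the Morgan settings are arbitrary):
#
#   from rdkit.Chem import rdFingerprintGenerator
#   gen = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
#   seed_fps = [gen.GetFingerprint(m) for m in seed_mols]
#   pool_fps = [gen.GetFingerprint(m) for m in pool_mols]
#   hits = top_n_similar(seed_fps, pool_fps, 10)
```

Keeping only the top N per seed before merging is enough here: any
molecule in the overall top N must also be in the top N of the seed it is
closest to (ties aside), so nothing is lost by discarding the rest early.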

I think this may also be something that you can easily (and quickly) do
with chemfp.

-greg


On Thu, Feb 4, 2021 at 11:46 AM Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> I want to pick the molecules in one set that are most similar to
> those in a seed set.
> Seems like I should be able to use the MaxMinPicker to do this by using a
> function of (1 - tanimoto).
> Would this work, or is there a better approach?
> Thanks
> Tim
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>