I would just call BulkTanimotoSimilarity() for each of the molecules in your seed set, querying against the other set, and then keep the top N results for each molecule in the seed set. You can then merge those lists to get the top N overall.
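A minimal sketch of that per-seed top-N and merge, under some assumptions: to keep it dependency-free, fingerprints here are plain Python sets of on-bit indices, and `bulk_tanimoto` is only a stand-in with the same call shape as RDKit's `DataStructs.BulkTanimotoSimilarity(fp, fps)`. The function name `top_n_similar`, its parameters, and the rule "keep each molecule's best score across seeds" when merging are illustrative choices, not something specified in this thread.

```python
import heapq


def bulk_tanimoto(query, targets):
    """Stand-in for a bulk Tanimoto call: one query against many targets.

    Fingerprints are modelled as sets of on-bit indices (an assumption
    made for this sketch only).
    """
    sims = []
    for target in targets:
        union = len(query | target)
        sims.append(len(query & target) / union if union else 0.0)
    return sims


def top_n_similar(seed_fps, pool_fps, n, bulk_sim=bulk_tanimoto):
    """Return up to n (pool_index, similarity) pairs, most similar first."""
    best = {}  # pool index -> best similarity to any seed seen so far
    for seed in seed_fps:
        sims = bulk_sim(seed, pool_fps)
        # Keep only the top n hits for this seed.
        top_hits = heapq.nlargest(n, enumerate(sims), key=lambda hit: hit[1])
        # Merge: a pool molecule hit by several seeds keeps its best score.
        for idx, sim in top_hits:
            if sim > best.get(idx, -1.0):
                best[idx] = sim
    # Top n overall across all the per-seed lists.
    return heapq.nlargest(n, best.items(), key=lambda hit: hit[1])


seeds = [{1, 2, 3, 4}, {10, 11, 12}]
pool = [{1, 2, 3, 4}, {1, 2, 3}, {10, 11, 12, 13, 14}, {20, 21}, {10, 11}]
picked = top_n_similar(seeds, pool, n=3)
print(picked)
```

With real RDKit bit vectors, passing `bulk_sim=DataStructs.BulkTanimotoSimilarity` should slot in unchanged, since that function also takes one fingerprint plus a list and returns a list of floats.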
I think this may also be something that you can easily (and quickly) do with chemfp.

-greg

On Thu, Feb 4, 2021 at 11:46 AM Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> I'm wanting to pick the molecules in one set that are most similar to
> those in a seed set.
> Seems like I should be able to use the MaxMinPicker to do this by using a
> function of (1 - tanimoto).
> Would this work, or is there a better approach?
> Thanks
> Tim
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss