I would just call BulkTanimotoSimilarity() for each of the molecules in your seed set, querying against the other set, and then keep the top N results for each molecule in the seed set. You can then merge those lists to get the top N overall.
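As a rough sketch of the approach described above (not a tested recipe): the helper names `rdkit_bulk_tanimoto`, `top_n_per_seed`, `merge_top_n` and `demo` are made up for illustration, and the RDKit fingerprint type in `demo` is an arbitrary choice. The only RDKit calls assumed are `DataStructs.BulkTanimotoSimilarity`, `Chem.MolFromSmiles` and `Chem.RDKFingerprint`. "Top N overall" is read here as ranking each candidate by its best similarity to any seed.

```python
import heapq


def rdkit_bulk_tanimoto(query_fp, candidate_fps):
    """Thin wrapper around RDKit's bulk call; RDKit is imported lazily."""
    from rdkit import DataStructs
    return DataStructs.BulkTanimotoSimilarity(query_fp, candidate_fps)


def top_n_per_seed(seed_fps, candidate_fps, n, bulk_similarity=rdkit_bulk_tanimoto):
    """For each seed, return its n best hits as (similarity, candidate_index)."""
    per_seed = []
    for seed_fp in seed_fps:
        sims = bulk_similarity(seed_fp, candidate_fps)
        hits = heapq.nlargest(n, ((s, i) for i, s in enumerate(sims)))
        per_seed.append(hits)
    return per_seed


def merge_top_n(per_seed, n):
    """Merge the per-seed lists into the top n overall.

    A candidate hit by several seeds is kept once, with its best score.
    """
    best = {}
    for hits in per_seed:
        for sim, idx in hits:
            if sim > best.get(idx, -1.0):
                best[idx] = sim
    return heapq.nlargest(n, ((s, i) for i, s in best.items()))


def demo():
    """Illustrative use with RDKit; the SMILES are placeholder molecules."""
    from rdkit import Chem
    seeds = [Chem.MolFromSmiles(s) for s in ("c1ccccc1O", "CCO")]
    pool = [Chem.MolFromSmiles(s) for s in ("c1ccccc1N", "CCN", "CCCO", "c1ccccc1")]
    seed_fps = [Chem.RDKFingerprint(m) for m in seeds]
    pool_fps = [Chem.RDKFingerprint(m) for m in pool]
    per_seed = top_n_per_seed(seed_fps, pool_fps, n=2)
    return merge_top_n(per_seed, n=2)
```

Keeping only the top N per seed before merging keeps memory bounded when the second set is large, and the similarity function is passed in so the same merge logic could be reused with another backend.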
I think this may also be something that you can easily (and quickly) do with chemfp.

-greg

On Thu, Feb 4, 2021 at 11:46 AM Tim Dudgeon <[email protected]> wrote:

> I'm wanting to pick the molecules in one set that are most similar to
> those in a seed set.
> Seems like I should be able to use the MaxMinPicker to do this by using a
> function of (1 - tanimoto).
> Would this work, or is there a better approach?
> Thanks
> Tim
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

