Andrew,

Wow, thank you for the detailed reply.
I am happy with the current processing time of 5 secs to compare 400,000+ fingerprints, but I will look at the Stack Overflow discussion. I am pretty well versed in MongoDB and hadn't thought about calculating it fully in MongoDB. I will also talk to the staff about the various fingerprints and determine where to go next.

The initial code was written in late 2017. This first migration step was to upgrade the various libraries and packages (oBabel 2.4 to 3.1.1). I hadn't really thought about looking at updated features or a different approach.

--------------------------------------------------------------------
......
>> With this knowledge you may be able to do the Tanimoto calculation all in
>> MongoDB, for example, by converting the "1" bit positions to an array field
>> and using the aggregation framework.
>>
>> (These are magic words found by reading
>> https://stackoverflow.com/questions/27805634/can-i-calculate-the-similarity-of-document-fields-using-mapreduce
>> ; note that "Tanimoto similarity" is the domain-specific term for what the
>> IR field generally refers to as "Jaccard similarity").
......
There are many different fingerprint types. The ECFP fingerprints in more recent releases of Open Babel may be more relevant than the Daylight-like FP2 fingerprints. This will require talking to your user base.

Chris Wolcott
chris.wolc...@nih.gov
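For reference, the approach quoted above (store the "1" bit positions as an array field, then compute Tanimoto in the aggregation framework) might look roughly like the sketch below. It is only an illustration under assumptions: the field name `on_bits`, the default threshold of 0.7, and the helper names are hypothetical, and it assumes the fingerprints are available as strings of "0" and "1" characters and that every document carries the array field. The pure-Python `tanimoto` function is there only to cross-check what the pipeline is meant to compute.

```python
def on_bits(bitstring):
    """Return the positions of the '1' bits in a fingerprint bit string."""
    return [i for i, c in enumerate(bitstring) if c == "1"]


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two lists of on-bit positions."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def tanimoto_pipeline(query_bits, threshold=0.7, field="on_bits"):
    """Build an aggregation pipeline that scores every document against
    query_bits and keeps those at or above the threshold.

    `field` is a hypothetical array field holding each document's on-bit
    positions; the pipeline assumes it is present on every document.
    """
    query = {"$literal": sorted(set(query_bits))}
    inter = {"$size": {"$setIntersection": ["$" + field, query]}}
    union = {"$size": {"$setUnion": ["$" + field, query]}}
    score = {
        # Guard against dividing by zero when both fingerprints are empty.
        "$cond": [{"$eq": [union, 0]}, 0, {"$divide": [inter, union]}]
    }
    return [
        {"$addFields": {"tanimoto": score}},
        {"$match": {"tanimoto": {"$gte": threshold}}},
        {"$sort": {"tanimoto": -1}},
    ]


# With pymongo, the pipeline would be passed to Collection.aggregate, e.g.
#   hits = collection.aggregate(tanimoto_pipeline(on_bits(query_fp)))
# where `collection` and `query_fp` are placeholders for the real objects.
```

Because the pipeline scores every document, it may not beat the current 5 seconds without some pre-filtering, so it would need to be timed against the existing code.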