Hi everyone, I've just released chemfp 4.1. To install the pre-compiled package for Linux-based OSes, do:
  python -m pip install chemfp -i https://chemfp.com/packages/

For a detailed description of what's new, see:

  https://chemfp.readthedocs.io/en/latest/whats_new_in_41.html

As a summary, the new features in this release include:

- Supports RDKit 2023.03.1 and Python 3.8 through 3.11

- Interprets input SMILES as CXSMILES by default, with an option to turn that off

- Can save/load similarity search results to a NumPy file in a form compatible with SciPy compressed sparse matrices

- Implements Butina clustering, with several variations. While building the similarity matrix may take an hour, the result can be saved to an npz file that the Butina implementation can use as input. This is useful when tuning the Butina parameters: the NxN matrix can be constructed once, at the lowest reasonable threshold, and each Butina clustering run can then use a higher threshold. It takes only a few seconds to cluster ChEMBL at a threshold of 0.6.

- Sphere exclusion ("spherex") has been parallelized, with new options for specifying directed sphere exclusion ranking and a new output format compatible with the Butina output

- The new "chemfp csv2fps" tool for generating fingerprints from CSV files containing identifiers and molecules

- The new "chemfp translate" tool for structure file format conversion

These are available at no cost under the Chemfp Base License Agreement at https://chemfp.com/BaseLicense.txt . For other licensing options, including a no-cost license key for academic use, see https://chemfp.com/license/ .

Best regards,

Andrew
da...@dalkescientific.com
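
A note on the Butina item above: the "build the matrix once, cluster at several thresholds" idea can be illustrated with a small pure-Python sketch. This is not chemfp's API or its implementation. The function name `butina_from_csr` and the toy data are hypothetical, and the matrix is held in plain lists laid out like the three arrays of a compressed sparse row matrix (`indptr`, `indices`, `data`). It shows only the simplest variant, where fingerprints are ranked once by neighbor count.

```python
def butina_from_csr(indptr, indices, data, threshold):
    """Simple Taylor-Butina clustering over a CSR-style similarity matrix.

    Row i's neighbors are indices[indptr[i]:indptr[i+1]], with scores in the
    matching slice of data. Entries scoring below `threshold` are ignored.
    Returns a list of (centroid, members) tuples; members starts with the
    centroid itself.
    """
    n = len(indptr) - 1

    # Neighbor sets at the requested threshold (self-matches excluded).
    neighbors = []
    for i in range(n):
        start, end = indptr[i], indptr[i + 1]
        neighbors.append({
            j for j, score in zip(indices[start:end], data[start:end])
            if score >= threshold and j != i
        })

    # Rank once by neighbor count (largest first), ties broken by index.
    order = sorted(range(n), key=lambda i: (-len(neighbors[i]), i))

    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue
        # The centroid claims every neighbor not already in a cluster.
        members = [i] + sorted(neighbors[i] - assigned)
        assigned.update(members)
        clusters.append((i, members))
    return clusters


# Hypothetical toy matrix for 5 fingerprints, "built" once at threshold 0.4.
# Pairs: 0-1 (0.9), 0-2 (0.7), 1-2 (0.5), 2-3 (0.65), 3-4 (0.8), stored
# symmetrically.
indptr = [0, 2, 4, 7, 9, 10]
indices = [1, 2, 0, 2, 0, 1, 3, 2, 4, 3]
data = [0.9, 0.7, 0.9, 0.5, 0.7, 0.5, 0.65, 0.65, 0.8, 0.8]

# Re-cluster the same matrix at several thresholds without rebuilding it.
for threshold in (0.4, 0.6, 0.85):
    print(threshold, butina_from_csr(indptr, indices, data, threshold))
```

Because the stored matrix keeps every pair at or above the lowest threshold, any higher threshold only needs to filter existing entries, so the expensive similarity search is not repeated between tuning runs.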