Dear RDKitters,

To those of you who are interested in Matched Pairs analysis: mmpdb, a library for doing large-scale MMP analysis, has been updated to version 2.2. This update introduces three new options that help reduce the size of the database created. These are:

--min-heavies-per-const-frag: During fragmentation, double and triple cuts where one of the constant parts has fewer heavy atoms than this number will be discarded. This removes a lot of pseudo-multiple cuts where one of the constant parts is very small, such as only a halogen or a methoxy.

--smallest-transformation-only: If set during indexing, only the smallest transformation per pair will be indexed. If, for example, a transformation is p-F-phenyl >> p-Cl-phenyl, only the F>>Cl transformation will be indexed. Note that the phenyl environment is still encoded as part of the environment fingerprint.

--max-radius: The maximum radius up to which environments are enumerated can now be set on the command line during indexing.

In my tests, using --min-heavies-per-const-frag 3 together with --smallest-transformation-only reduces the size of the database by ~70%. A smaller --max-radius can reduce the database size substantially further.

The latest version of mmpdb is now available for download at https://github.com/rdkit/mmpdb. If you are interested in it, please check it out and let me know if you find it useful or if you have problems with it.

Best regards,
Christian

*Dr. Christian Kramer*
Computer-Aided Drug Design (CADD)
F. Hoffmann-La Roche Ltd
Pharma Research and Early Development
Bldg. 092/8.56 C
CH-4070 Basel
Phone +41 61 682 2471
mailto: christian.kra...@roche.com
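As a rough illustration of how the three options might be combined, the sketch below only assembles and prints the two command lines; it does not run mmpdb. The file names and the --max-radius value of 3 are hypothetical, and the `fragment` / `index` subcommands with `-o` for the output file are assumed from the usual mmpdb workflow rather than stated in this announcement, so please check `mmpdb --help` before relying on them.

```python
import shlex

# Hypothetical file names, used only for illustration.
SMILES_FILE = "compounds.smi"
FRAGMENTS_FILE = "compounds.fragments"
DATABASE_FILE = "compounds.mmpdb"


def build_fragment_command(min_heavies=3):
    """Argument list for the fragmentation step.

    --min-heavies-per-const-frag is applied during fragmentation.
    The 'fragment' subcommand and '-o' are assumptions (see lead-in).
    """
    return [
        "mmpdb", "fragment", SMILES_FILE,
        "--min-heavies-per-const-frag", str(min_heavies),
        "-o", FRAGMENTS_FILE,
    ]


def build_index_command(max_radius=None):
    """Argument list for the indexing step.

    --smallest-transformation-only and --max-radius are applied during
    indexing. --max-radius is only added when a value is given.
    """
    cmd = ["mmpdb", "index", FRAGMENTS_FILE, "--smallest-transformation-only"]
    if max_radius is not None:
        cmd += ["--max-radius", str(max_radius)]
    cmd += ["-o", DATABASE_FILE]
    return cmd


fragment_cmd = build_fragment_command(3)
index_cmd = build_index_command(max_radius=3)  # 3 is an arbitrary example value

# Print shell-quoted command lines for inspection or copy/paste.
print(shlex.join(fragment_cmd))
print(shlex.join(index_cmd))
```

The printed lines can be pasted into a shell once the subcommand names and file paths have been checked against the installed mmpdb version.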
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss