Re: [Rdkit-discuss] MFP question about similar substructures and feature reduction

2021-09-29 Thread Natasha Gupta
Thank you so much! This is so clear and very helpful.
On Wednesday, September 29, 2021, Rafael L wrote:
> Hello, your question prompted me to write a small notebook, which I hope
> you may find useful:
> https://github.com/rflameiro/projects/blob/main/comparing_fingerprint_bits.ipynb
>
> In

Re: [Rdkit-discuss] MFP question about similar substructures and feature reduction

2021-09-29 Thread Rajarshi Guha
I'd be wary of using PCA on binary fingerprints, based on Martin and Cao (2015).
On Wed, Sep 29, 2021 at 3:34 PM Rafael L via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:
> Hello, your question prompted me to write a small notebook,

Re: [Rdkit-discuss] MFP question about similar substructures and feature reduction

2021-09-29 Thread Rafael L via Rdkit-discuss
Hello, your question prompted me to write a small notebook, which I hope you may find useful: https://github.com/rflameiro/projects/blob/main/comparing_fingerprint_bits.ipynb In summary, bits that are active in both fingerprints usually correspond to the same substructure, unless bit collision
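To illustrate the point about shared bits and bit collisions, here is a minimal, library-free sketch. It is not taken from the linked notebook. It assumes that a hashed fingerprint folds each integer feature identifier into a fixed-length vector by taking the identifier modulo the vector length; the feature identifiers, substructure labels, and function names below are all hypothetical.

```python
from typing import Dict, Set


def fold(features: Dict[int, str], n_bits: int) -> Dict[int, Set[int]]:
    """Map each folded bit index to the set of feature IDs that set it.

    Assumption: folding is a simple modulo of the feature ID by n_bits.
    """
    bits: Dict[int, Set[int]] = {}
    for feature_id in features:
        bits.setdefault(feature_id % n_bits, set()).add(feature_id)
    return bits


def compare_shared_bits(
    feats_a: Dict[int, str], feats_b: Dict[int, str], n_bits: int
) -> Dict[int, str]:
    """Classify every bit that is on in both folded fingerprints."""
    bits_a = fold(feats_a, n_bits)
    bits_b = fold(feats_b, n_bits)
    report: Dict[int, str] = {}
    for bit in sorted(bits_a.keys() & bits_b.keys()):
        # A shared bit reflects the same substructure only if both
        # molecules contributed at least one identical feature ID to it.
        if bits_a[bit] & bits_b[bit]:
            report[bit] = "same substructure"
        else:
            report[bit] = "collision"
    return report


# Hypothetical feature IDs and labels for two molecules.
mol_a = {1001: "carbonyl", 2050: "aromatic carbon", 3077: "amine"}
mol_b = {1001: "carbonyl", 4098: "ether oxygen", 5000: "chloro"}

report_2048 = compare_shared_bits(mol_a, mol_b, 2048)
report_8192 = compare_shared_bits(mol_a, mol_b, 8192)
print(report_2048)
print(report_8192)
```

In this toy data, IDs 2050 and 4098 both fold onto bit 2 at length 2048, so that shared bit is a collision, while bit 1001 comes from the same feature in both molecules. With a longer vector (8192) the collision disappears. With real RDKit Morgan fingerprints, the `bitInfo` argument of `GetMorganFingerprintAsBitVect` records which atom and radius set each bit, which allows the same kind of check on actual molecules.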