Re: [Rdkit-discuss] Difference between ECFP and MorganFingerprint
On Wed, Sep 30, 2015 at 8:05 AM, Guillaume GODIN <guillaume.go...@firmenich.com> wrote:

> When you say "as closely as I could", do you mean that all the parameters
> are the same in ECFP and Morgan, and that the only divergence between them
> is in the way RDKit and Pipeline Pilot handle aromaticity + hashing?

I don't think that there are any parameters. I followed the algorithm description in the paper, but since the fingerprints include information about chemistry (specifically about bond types, which include aromaticity), differences could arise. The hashing algorithm is not described in the paper, so that will definitely be different.

The Morgan fingerprints in the RDKit will not produce the same fingerprint as PP's ECFP implementation, but they should produce very similar similarity values (as the presentation I referenced earlier demonstrates).

Best,
-greg

--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
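The point about hashing can be illustrated with a toy sketch. It assumes nothing about how either the RDKit or Pipeline Pilot actually hashes atom environments: the environment labels, the `hashed_ids` helper and the choice of SHA-1 versus SHA-256 are all hypothetical stand-ins. It only shows that two implementations which agree on the underlying features but hash them differently produce different identifiers and yet the same Tanimoto value, as long as no hash collisions occur.

```python
import hashlib


def hashed_ids(features, algo):
    """Hash each feature label to a 32-bit integer identifier."""
    ids = set()
    for label in features:
        digest = hashlib.new(algo, label.encode("utf-8")).digest()
        ids.add(int.from_bytes(digest[:4], "big"))
    return ids


def tanimoto(a, b):
    """Set-based Tanimoto: shared identifiers over all identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical environment labels, NOT real ECFP/Morgan atom invariants.
mol_a = {"C-arom", "C-arom|N", "N-amide", "O=C", "C-methyl"}
mol_b = {"C-arom", "C-arom|N", "O=C", "Cl-aryl"}

results = {}
for algo in ("sha1", "sha256"):
    ids_a = hashed_ids(mol_a, algo)
    ids_b = hashed_ids(mol_b, algo)
    results[algo] = (ids_a, tanimoto(ids_a, ids_b))

# The identifier sets differ between the two hash functions,
# while the similarity computed within each scheme is the same.
print(results["sha1"][0] == results["sha256"][0])
print(results["sha1"][1], results["sha256"][1])
```

In practice the agreement is only approximate rather than exact, because the two toolkits can also disagree on the chemistry itself (aromaticity perception, for example), which changes the feature sets before any hashing takes place.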
Re: [Rdkit-discuss] Difference between ECFP and MorganFingerprint
Dear Greg,

When you say "as closely as I could", do you mean that all the parameters are the same in ECFP and Morgan, and that the only divergence between them is in the way RDKit and Pipeline Pilot handle aromaticity + hashing?

Thanks,
Guillaume

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Wednesday, 30 September 2015 08:01
To: Jing Lu
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] Difference between ECFP and MorganFingerprint

On Wed, Sep 30, 2015 at 6:47 AM, Greg Landrum <greg.land...@gmail.com> wrote:

> On Tue, Sep 29, 2015 at 8:22 PM, Jing Lu <ajin...@gmail.com> wrote:
>
>> I was treating AllChem.GetMorganFingerprint(m1,2) the same as ECFP4. I am
>> writing a paper for an open-source tool, so I need to be very accurate. I
>> have seen one open-source implementation of ECFP, which is from the CDK.
>> Most researchers use Pipeline Pilot to calculate ECFP, but Pipeline Pilot
>> is not open source.
>
> To be very clear: the only implementation of ECFP is the one in Pipeline
> Pilot. The other implementations, like the ones in the CDK and the RDKit,
> may have followed the algorithm description that was published, but due to
> differences in aromaticity perception and hashing algorithms the results
> will not be exactly the same.

Sorry, should have been more explicit here: when I did the Morgan fingerprint implementation for the RDKit, I followed the published algorithm description as closely as I could.

-greg
Re: [Rdkit-discuss] Difference between ECFP and MorganFingerprint
On Wed, Sep 30, 2015 at 6:47 AM, Greg Landrum wrote:

> On Tue, Sep 29, 2015 at 8:22 PM, Jing Lu wrote:
>
>> I was treating AllChem.GetMorganFingerprint(m1,2) the same as ECFP4. I am
>> writing a paper for an open-source tool, so I need to be very accurate. I
>> have seen one open-source implementation of ECFP, which is from the CDK.
>> Most researchers use Pipeline Pilot to calculate ECFP, but Pipeline Pilot
>> is not open source.
>
> To be very clear: the only implementation of ECFP is the one in Pipeline
> Pilot. The other implementations, like the ones in the CDK and the RDKit,
> may have followed the algorithm description that was published, but due to
> differences in aromaticity perception and hashing algorithms the results
> will not be exactly the same.

Sorry, should have been more explicit here: when I did the Morgan fingerprint implementation for the RDKit, I followed the published algorithm description as closely as I could.

-greg
Re: [Rdkit-discuss] Difference between ECFP and MorganFingerprint
On Tue, Sep 29, 2015 at 8:22 PM, Jing Lu wrote:

> I was treating AllChem.GetMorganFingerprint(m1,2) the same as ECFP4. I am
> writing a paper for an open-source tool, so I need to be very accurate. I
> have seen one open-source implementation of ECFP, which is from the CDK.
> Most researchers use Pipeline Pilot to calculate ECFP, but Pipeline Pilot
> is not open source.

To be very clear: the only implementation of ECFP is the one in Pipeline Pilot. The other implementations, like the ones in the CDK and the RDKit, may have followed the algorithm description that was published, but due to differences in aromaticity perception and hashing algorithms the results will not be exactly the same.

> I calculate Tanimoto similarity based on the Morgan fingerprint. The
> similarity matrix is the input for my tool. I am wondering how different
> the Morgan fingerprint and ECFP are. Will they give different answers in
> some situations? Can we use MorganFingerprint instead of ECFP most of the
> time?

I have, in the past, done a comparison between similarity values calculated with the RDKit and those calculated with PP's ECFP implementation. I've presented those results in a couple of different places; here's one of them:
http://rdkit.org/UGM/2012/Landrum_RDKit_UGM.Fingerprints.Final.pptx.pdf

Best,
-greg
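For readers who want to see what "a Tanimoto similarity matrix built from count fingerprints" looks like in practice, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the RDKit's own code: each fingerprint is represented as a plain dict mapping a feature identifier to its count (a stand-in for a sparse count fingerprint such as the one returned by AllChem.GetMorganFingerprint(m1,2)), the identifiers and counts are made up, and `count_tanimoto` and `similarity_matrix` are hypothetical helper names.

```python
def count_tanimoto(fp_a, fp_b):
    """Count-based Tanimoto: sum of per-feature minima over sum of maxima."""
    keys = set(fp_a) | set(fp_b)
    shared = sum(min(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    total = sum(max(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    return shared / total if total else 1.0


def similarity_matrix(fps):
    """Symmetric matrix of pairwise similarities, with 1.0 on the diagonal."""
    n = len(fps)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = count_tanimoto(fps[i], fps[j])
            matrix[i][j] = matrix[j][i] = sim
    return matrix


# Made-up sparse count fingerprints: {feature identifier: count}.
fps = [
    {101: 2, 202: 1, 303: 1},
    {101: 1, 202: 1, 404: 2},
    {505: 3},
]

matrix = similarity_matrix(fps)
for row in matrix:
    print(["%.3f" % value for value in row])
```

Because the similarity depends only on which features two molecules share and how often, not on the numeric value of the identifiers, a matrix built this way from Morgan fingerprints can be close to one built from ECFP even though the fingerprints themselves do not match.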