On 25/02/2011 13:30, Santi Villalba wrote:
> A molecule created from a SMILES string lacks explicit hydrogen
> information, while hydrogens are there when reading from the PubChem
> file. Explicit hydrogens are used when computing the fingerprints, and
> so the substructures found differ in both cases, even if we are talking
> about the same molecule. Try this:
> molcan = pybel.readstring('can',
>     'C1=CC2=C(C=C1C3=CC=C(O3)CO)C(=NC=N2)NCC4=NC=CN4')
> molpub = next(pybel.readfile('sdf', '44968247.sdf'))
> # Tanimoto != 1.0, weirdly
> print(molcan.calcfp('MACCS') | molpub.calcfp('MACCS'))
> molcan.addh()
> # Tanimoto == 1.0, as expected
> print(molcan.calcfp('MACCS') | molpub.calcfp('MACCS'))
>
> I guess this is the intended behavior in the fingerprint computation.
> However it can be confusing at first and lacks logical consistency, as
> we get different fingerprints for the very same molecule. Would it make
> sense to either explicitly add or explicitly remove the hydrogens inside
> the fingerprint computation?

It would have been better if the SMARTS patterns used to generate this
kind of fingerprint did not depend on whether hydrogens are explicit or
implicit. However, some of those in the MACCS set (from RDKit) and in
FP4 do. (This is not an issue for FP2.) The development code has now
been modified so that explicit hydrogens are removed before calculating
any of the fingerprints derived from PatternFP: FP3, FP4 and MACCS. I'll
write a test to ensure we don't get caught out again.
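
With a version that does not yet include this change, the same effect can
probably be had from Pybel by stripping the hydrogens yourself first. The
following is only a sketch of that workaround, not the code that went into
the library: the helper name fp_without_explicit_h is made up for
illustration, and it assumes the argument is a pybel.Molecule (it only
relies on the removeh() and calcfp() methods).

```python
def fp_without_explicit_h(mol, fptype="MACCS"):
    """Return a fingerprint computed after removing explicit hydrogens.

    Hypothetical helper: `mol` is assumed to be a pybel.Molecule, but any
    object with removeh() and calcfp(fptype) methods will do.
    Note that the molecule is modified in place.
    """
    mol.removeh()              # make all hydrogens implicit
    return mol.calcfp(fptype)  # pattern-based types: FP3, FP4, MACCS


# Possible usage with the two molecules from the quoted example:
#   fp_can = fp_without_explicit_h(molcan)
#   fp_pub = fp_without_explicit_h(molpub)
#   print(fp_can | fp_pub)     # '|' gives the Tanimoto coefficient
```

Since the helper changes the molecule it is given, pass in a copy if the
explicit hydrogens are still needed afterwards.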

Chris

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
