Hi,

On Mon, Jul 23, 2012 at 4:07 PM, Chris Morley <c.mor...@gaseq.co.uk> wrote:
> On 22/07/2012 23:35, Tim Vandermeersch wrote:
>> Hi,
>>
>> The problem seems to be in src/fingerprints/finger3.cpp:
>>
>>    //Each bit represents a single substructure; no need for confirmation when substructure searching
>>    virtual unsigned int Flags() { return FPT_UNIQUEBITS;};
>>
>> This confuses me. Just because the substructures are present in
>> the queried molecule doesn't mean that the queried molecule is a
>> superstructure of the query. So an isomorphism search is still needed
>> to confirm the hit.
>
> This code comment (and the implementation of it in fastsearchformat) is
> clearly wrong and has been for a long time. The flag means that the bit
> represents only one substructure feature and is not a hash as in FP2. I
> have corrected the code in fastsearchformat and the comment in
> finger3.cpp in trunk.

The change looks good.
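
For the archive, here is a minimal sketch of why a fingerprint hit still needs confirming. It does not use the Open Babel API: the 8-bit width and the names Fingerprint and passesScreen are made up for illustration. The screen only checks that every bit set in the query is also set in the target, so a miss rules the target out, but a pass is just a candidate.

```cpp
#include <bitset>

// Hypothetical 8-bit fingerprint for illustration; real ones are much wider.
using Fingerprint = std::bitset<8>;

// Screening test: true when every bit set in `query` is also set in `target`.
// A false result safely excludes the target. A true result only means the
// target is a candidate, which still has to be confirmed by a substructure
// (isomorphism) search, whether or not each bit maps to a unique feature.
bool passesScreen(const Fingerprint& query, const Fingerprint& target) {
    return (query & target) == query;
}
```

With a query of 0x05 and a target of 0x07 the screen passes. With a target of 0x06 it fails, because one query bit is missing.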

>> When I change the Flags() function to return 0, I still don't get the
>> expected results though. With my query there should be 46 hits but FP3
>> gives 26, FP4 25 and MACCS only 12. Is there something I'm missing
>> here? If the bits simply represent a substructure, the fingerprint
>> screening should return all possible molecules containing the query.
>
> I think that this is because a structure as a pattern is not being
> distinguished sufficiently from a structure as a molecule. A SMILES
> input of OC will match any ether when it is used as SMARTS or in a FP2
> substructure search. With FP4 or MACCS, it is seen as methanol and a bit
> corresponding to an alcohol is set. This prevents a match to an ordinary
> ether.
>
> obabel -:"OC" -ofpt -xfFP4 -xs
>  >
> Alcohol C_ONS_bond
> 1 molecule converted
>
> obabel -:"COC" -ofpt -xfFP4 -xs
>  >
> Dialkylether    C_ONS_bond
> 1 molecule converted
>
> obabel -:"OC" -ofpt -xfMACCS -xs
>  >
> 93: QCH3        139: OH 157: C-O        160: CH3        164: O
> 1 molecule converted
>
> obabel -:"COC" -ofpt -xfMACCS -xs
>  >
> 74: CH3ACH3     86: CH2QCH2     93: QCH3        126: A!O!A      149: CH3 > 1*2
> 157: C-O        160: CH3        164: O
> 1 molecule converted
>
> I guess structure-key fingerprints should not be used for substructure
> searches, at least until we have a way round this. But they may be
> better for similarity comparisons. For example, in the above, the
> presence of an alcohol is more chemically significant than any old O
> bonded to C.

This makes sense.
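
To spell out the OC example as a rough sketch: the key names below are copied from the FP4 output quoted above, but holding them in a std::set of strings is purely an illustration, not how Open Babel stores fingerprints. The names KeySet and queryKeysSubsetOfTarget are hypothetical.

```cpp
#include <algorithm>
#include <set>
#include <string>

using KeySet = std::set<std::string>;

// True when every key set for the query is also set for the target.
// std::includes needs sorted ranges, which std::set guarantees.
bool queryKeysSubsetOfTarget(const KeySet& query, const KeySet& target) {
    return std::includes(target.begin(), target.end(),
                         query.begin(), query.end());
}

// FP4 keys reported above for "OC" read as a molecule (methanol)
// and for "COC" (dimethyl ether).
const KeySet kMethanolFP4 = {"Alcohol", "C_ONS_bond"};
const KeySet kDimethylEtherFP4 = {"Dialkylether", "C_ONS_bond"};
```

Screening kMethanolFP4 against kDimethylEtherFP4 fails because the ether has no Alcohol key, even though OC read as a pattern would match it. A query carrying only C_ONS_bond would pass.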

Thanks,
Tim

> Chris
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> OpenBabel-Devel mailing list
> OpenBabel-Devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
