Okay now I’ve actually tracked it down - the issue is to do with aromaticity
(kind of) and the SSSR providing a container for the ring atoms/bonds.
With implicit hydrogens the substructure from the SSSRFinder looks like this…
[CH]1CCCC1
Note that C1(O)CCCC1 is really [CH]1([OH])[CH2][CH2][CH2][CH2]1. In the CDK
removing atoms doesn’t update neighbour hydrogen counts hence the first carbon
keeps an implicit hydrogen count of 1.
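
To make the stale count concrete, here is a minimal sketch using a toy model. ToyAtom, cyclopentanol() and ringOnly() are made-up names for illustration, not CDK classes or methods; the only assumption carried over from the CDK is that the implicit hydrogen count is a plain stored value that nothing recomputes when a neighbour is dropped.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not CDK): implicit hydrogen counts are stored, never recomputed. */
public class StaleHydrogenSketch {

    /** Hypothetical stand-in for an atom with a stored implicit H count. */
    static final class ToyAtom {
        final String symbol;
        int implicitH;

        ToyAtom(String symbol, int implicitH) {
            this.symbol = symbol;
            this.implicitH = implicitH;
        }
    }

    /** Writes an atom as a bracket atom, e.g. [CH2], [CH] or [C]. */
    static String bracket(ToyAtom a) {
        if (a.implicitH == 0) return "[" + a.symbol + "]";
        if (a.implicitH == 1) return "[" + a.symbol + "H]";
        return "[" + a.symbol + "H" + a.implicitH + "]";
    }

    /** C1(O)CCCC1: the first carbon carries the hydroxyl, so it has one implicit H. */
    static List<ToyAtom> cyclopentanol() {
        List<ToyAtom> atoms = new ArrayList<>();
        atoms.add(new ToyAtom("C", 1));
        atoms.add(new ToyAtom("O", 1));
        for (int i = 0; i < 4; i++) {
            atoms.add(new ToyAtom("C", 2));
        }
        return atoms;
    }

    /** Keeps only the ring carbons; the hydrogen counts are left untouched. */
    static List<ToyAtom> ringOnly(List<ToyAtom> mol) {
        List<ToyAtom> ring = new ArrayList<>(mol);
        ring.remove(1); // drop the oxygen (index 1)
        return ring;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (ToyAtom a : ringOnly(cyclopentanol())) {
            sb.append(bracket(a));
        }
        // The first carbon still reports one implicit hydrogen, not two.
        System.out.println(sb);
    }
}
```

The first ring carbon comes out as [CH] rather than [CH2], which is the same shape as the [CH]1CCCC1 substructure above.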
When all hydrogens are explicit we get
[C]1[C][C][C][C]1
For some reason the aromaticity algorithm finds it to be aromatic. I can fix
that but for now you can update the valences (i.e. AtomType/AddHydrogens) - but
consider this. The atoms in the IRing are the same objects as those in the
molecule - so adjusting the hydrogen count for the ring atoms would also affect
the parent molecule. You can even run it and you’ll get:
[CH2]1([CH2](O[H])([CH2]([CH2]([CH2]1([H])[H])([H])[H])([H])[H])[H])([H])[H]
It would be even worse when there are multiple rings. I’ve never liked IRing
anyway - much better to refer to rings by index without creating a new
container.
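
A rough sketch of the difference, again with a toy model rather than the CDK types. ToyAtom and ringContainer() are hypothetical names; the one assumption is that a ring container holds references to the parent's atom objects rather than copies.

```java
/** Toy model (not CDK): a ring "container" that shares atom objects with its parent. */
public class RingByIndexSketch {

    /** Hypothetical stand-in for an atom with a stored implicit H count. */
    static final class ToyAtom {
        int implicitH;

        ToyAtom(int implicitH) {
            this.implicitH = implicitH;
        }
    }

    /** Builds a second container holding the same atom objects as the molecule. */
    static ToyAtom[] ringContainer(ToyAtom[] mol, int[] ringIdx) {
        ToyAtom[] ring = new ToyAtom[ringIdx.length];
        for (int i = 0; i < ringIdx.length; i++) {
            ring[i] = mol[ringIdx[i]]; // same reference, not a copy
        }
        return ring;
    }

    public static void main(String[] args) {
        // C1(O)CCCC1: index 0 is the carbinol carbon, 1 the oxygen, 2-5 the CH2 carbons.
        ToyAtom[] mol = {
            new ToyAtom(1), new ToyAtom(1),
            new ToyAtom(2), new ToyAtom(2), new ToyAtom(2), new ToyAtom(2)
        };

        // Index-based view of the ring: just the positions of the ring atoms.
        int[] ringIdx = {0, 2, 3, 4, 5};

        // Container-based view: "correcting" the ring atom silently edits the parent.
        ToyAtom[] ring = ringContainer(mol, ringIdx);
        ring[0].implicitH = 2;
        System.out.println(mol[0].implicitH);
    }
}
```

With only the int[] there is no second container whose hydrogen counts could be "corrected" by mistake; any change has to be made on the molecule itself, in plain view.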
Cheers,
J
On 7 Feb 2014, at 17:12, John May <john...@ebi.ac.uk> wrote:
> Doh - of course. So the SMARTS has a quirk that ‘C1’ matches the
> ‘CDKConstants.ISINRING’ flag.
>
> We can fix this without a patch - just add this before you match. The
> SMARTSQueryTool should be doing it already - not sure why it isn’t though….
> (that’s the bug)
>
>> SmartsMatchers.prepare(ring, true);
>
>
> On 7 Feb 2014, at 17:03, John May <john...@ebi.ac.uk> wrote:
>
>> No problem,
>>
>> master, but nothing should have changed…
>>
>> J
>>
>> On 7 Feb 2014, at 16:47, Nick Vandewiele <nick.vandewi...@ugent.be> wrote:
>>
>>> John,
>>>
>>> Thanks for the fast response!
>>>
>>> However: adding or removing dashes in the SMARTS string doesn’t change the
>>> outcome when I try it.
>>>
>>> Also, using your proposed alternative, e.g.:
>>> Pattern pattern = Ullmann.findSubstructure(SMARTSParser.parse(smarts, blr));
>>> for (IAtomContainer ring : ringSet.atomContainers()) {
>>>     System.out.println(pattern.matches(ring));
>>> }
>>>
>>> does not change the outcome (i.e. false) for me either.
>>> Are you using the 1.5.4 or master branch?
>>>
>>> Regards,
>>> Nick
>>>
>>> From: John May [mailto:john...@ebi.ac.uk]
>>> Sent: Friday, February 07, 2014 5:28 PM
>>> To: Nick Vandewiele
>>> Cc: cdk-user@lists.sourceforge.net
>>> Subject: Re: [Cdk-user] SMARTS matching after implicit to explict hydrogen
>>> conversion and SSSRing finder
>>>
>>> Okay it’s the bond matching… C-1-C-C-C-C1 works but C1-C-C-C-C1 doesn’t.
>>>
>>> Should be an easy fix.
>>>
>>> J
>>>
>>> On 7 Feb 2014, at 16:03, Nick Vandewiele <nick.vandewi...@ugent.be> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I am using CDK 1.5.4 and detected some behavior of the SMARTS matcher that
>>> I didn’t quite understand.
>>> When I search for a SMARTS pattern in one of the rings detected using the
>>> SSSRFinder algorithm, the success of finding the pattern in the ring
>>> depends on whether implicit hydrogens were converted to explicit ones, or
>>> not.
>>> If explicit hydrogens are present, the pattern is not found. If only
>>> implicit hydrogens are present, the pattern IS found.
>>>
>>> This code was used:
>>>
>>> String smiles = "C1C(O)CCC1";
>>> IChemObjectBuilder blr = SilentChemObjectBuilder.getInstance();
>>> SmilesParser smipar = new SmilesParser(blr);
>>> IAtomContainer m = smipar.parseSmiles(smiles);
>>> String smarts = "C1-C-C-C-C1";
>>> SMARTSQueryTool sqt = new SMARTSQueryTool(smarts, blr);
>>>
>>> AtomContainerManipulator.convertImplicitToExplicitHydrogens(m);
>>> IRingSet ringSet = new SSSRFinder(m).findSSSR(); // find SSSR rings
>>>
>>> for (IAtomContainer ring : ringSet.atomContainers()) {
>>>     boolean found = sqt.matches(ring); // false (should be true)
>>> }
>>>
>>> Although the release notes of 1.5.4 are very informative, I couldn’t find
>>> an answer explaining this behavior.
>>>
>>> So my question is two-fold:
>>> 1) how do I ensure that the pattern is found, even when explicit
>>> hydrogens are used in the atomcontainer?
>>> 2) What is happening underneath the hood here? Is this behavior normal?
>>>
>>> Regards,
>>> Nick
>>>
>>> ------------------------------------------------------------------------------
>>> Managing the Performance of Cloud-Based Applications
>>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
>>> Read the Whitepaper.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Cdk-user mailing list
>>> Cdk-user@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/cdk-user