Okay, now I’ve actually tracked it down. The issue is to do with aromaticity 
(kind of) and with the SSSR providing a new container for the ring atoms/bonds. 

With implicit hydrogens the substructure from the SSSRFinder looks like this…

[CH]1CCCC1 

Note that C1(O)CCCC1 is really [CH]1([OH])[CH2][CH2][CH2][CH2]1. In the CDK, 
removing atoms doesn’t update neighbour hydrogen counts, hence the first carbon 
keeps an implicit hydrogen count of 1. 

When all hydrogens are explicit we get

[C]1[C][C][C][C]1
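
To make the two cases concrete, here is a toy sketch in plain Java. It does 
not use the CDK: ToyAtom and StaleHydrogenDemo are made-up names, and the only 
assumption is that the implicit hydrogen count is a stored field that nothing 
recalculates when a neighbour is left out of the ring.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy atom (not a CDK class): the implicit hydrogen count is stored, never derived. */
final class ToyAtom {
    final String symbol;
    int implicitH;

    ToyAtom(String symbol, int implicitH) {
        this.symbol = symbol;
        this.implicitH = implicitH;
    }
}

final class StaleHydrogenDemo {

    /** Bracket-atom label, e.g. [C], [CH], [CH2]. */
    static String label(ToyAtom atom) {
        String h = atom.implicitH == 0 ? "" : atom.implicitH == 1 ? "H" : "H" + atom.implicitH;
        return "[" + atom.symbol + h + "]";
    }

    /** Writes a single ring as a SMILES-like string using ring closure 1. */
    static String ringSmiles(List<ToyAtom> ring) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ring.size(); i++) {
            sb.append(label(ring.get(i)));
            if (i == 0) sb.append('1');
        }
        return sb.append('1').toString();
    }

    /** C1(O)CCCC1 as [CH]1([OH])[CH2][CH2][CH2][CH2]1: atom 0 is CH, atom 1 is OH. */
    static List<ToyAtom> cyclopentanol() {
        List<ToyAtom> mol = new ArrayList<>();
        mol.add(new ToyAtom("C", 1));
        mol.add(new ToyAtom("O", 1));
        for (int i = 0; i < 4; i++) mol.add(new ToyAtom("C", 2));
        return mol;
    }

    /** Drops the oxygen to get the ring; the stored counts are left exactly as they were. */
    static String ringAfter(boolean explicitHydrogens) {
        List<ToyAtom> mol = cyclopentanol();
        if (explicitHydrogens) {
            // Hydrogens become separate atoms (omitted here), so every stored count drops to 0.
            for (ToyAtom atom : mol) atom.implicitH = 0;
        }
        List<ToyAtom> ring = new ArrayList<>(mol);
        ring.remove(1); // the oxygen is not a ring atom
        return ringSmiles(ring);
    }

    public static void main(String[] args) {
        System.out.println(ringAfter(false)); // [CH]1[CH2][CH2][CH2][CH2]1
        System.out.println(ringAfter(true));  // [C]1[C][C][C][C]1
    }
}
```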

For some reason the aromaticity algorithm finds this ring to be aromatic. I can 
fix that, but for now you can update the valences (i.e. AtomType/AddHydrogens). 
Consider this, though: the atoms in the IRing are the same objects as those in 
the molecule, so adjusting the hydrogen count for the ring atoms would also 
affect the parent molecule. You can even run it and you’ll get:

[CH2]1([CH2](O[H])([CH2]([CH2]([CH2]1([H])[H])([H])[H])([H])[H])[H])([H])[H]
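
The same effect as a toy sketch (plain Java, no CDK; SharedAtom and 
SharedRingDemo are invented names). The ring list holds the very same objects 
as the molecule list, so "fixing" the count on the ring atoms shows up in the 
parent as well:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy atom (not a CDK class) with a mutable implicit hydrogen count. */
final class SharedAtom {
    int implicitH;

    SharedAtom(int implicitH) {
        this.implicitH = implicitH;
    }
}

final class SharedRingDemo {

    /** Sum of the stored implicit hydrogen counts. */
    static int totalImplicitH(List<SharedAtom> atoms) {
        int total = 0;
        for (SharedAtom atom : atoms) total += atom.implicitH;
        return total;
    }

    /** Returns the parent's implicit hydrogen total before and after the ring is adjusted. */
    static int[] parentBeforeAndAfter() {
        // Heavy atoms of cyclopentanol (C, O, C, C, C, C) with all hydrogens explicit: counts are 0.
        List<SharedAtom> molecule = new ArrayList<>();
        for (int i = 0; i < 6; i++) molecule.add(new SharedAtom(0));

        // The ring is a new list, but it references the same atom objects.
        List<SharedAtom> ring = new ArrayList<>(molecule);
        ring.remove(1); // drop the oxygen

        int before = totalImplicitH(molecule);
        // Seen in isolation each ring carbon has two bonds, so a valence fix-up adds two hydrogens.
        for (SharedAtom atom : ring) atom.implicitH = 2;
        int after = totalImplicitH(molecule);
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = parentBeforeAndAfter();
        System.out.println(counts[0] + " -> " + counts[1]); // 0 -> 10
    }
}
```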

It would be even worse when there are multiple rings. I’ve never liked IRing 
anyway - much better to refer to rings by index without creating a new 
container.
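
A rough sketch of what I mean by referring to rings by index, assuming a ring 
is nothing more than an int[] of atom indices into the parent molecule. This 
is not an existing CDK API and IndexedRingsDemo is a hypothetical name; the 
point is only that no second container exists whose atoms could be modified 
by accident, even when rings share atoms.

```java
import java.util.Arrays;

/** Sketch only: rings as index arrays into the parent molecule (not a CDK API). */
final class IndexedRingsDemo {

    /** Counts how many rings each atom belongs to; no atoms are copied or modified. */
    static int[] ringMembership(int atomCount, int[][] rings) {
        int[] membership = new int[atomCount];
        for (int[] ring : rings) {
            for (int atomIndex : ring) {
                membership[atomIndex]++;
            }
        }
        return membership;
    }

    public static void main(String[] args) {
        // Hypothetical fused bicyclic skeleton with 8 atoms; atoms 0 and 4 sit in both rings.
        int[][] rings = { {0, 1, 2, 3, 4}, {0, 4, 5, 6, 7} };
        System.out.println(Arrays.toString(ringMembership(8, rings)));
        // [2, 1, 1, 1, 2, 1, 1, 1]
    }
}
```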

Cheers,
J

On 7 Feb 2014, at 17:12, John May <john...@ebi.ac.uk> wrote:

> Doh - of course. So the SMARTS has a quirk that ‘C1’ matches the 
> ‘CDKConstants.ISINRING’ flag. 
> 
> We can fix this without a patch - just add this before you match. The 
> SMARTSQueryTool should be doing it already - not sure why it isn’t though…. 
> (that’s the bug)
> 
>> SmartsMatchers.prepare(ring, true);
> 
> 
> On 7 Feb 2014, at 17:03, John May <john...@ebi.ac.uk> wrote:
> 
>> No problem,
>> 
>> master, but nothing should have changed… 
>> 
>> J
>> 
>> On 7 Feb 2014, at 16:47, Nick Vandewiele <nick.vandewi...@ugent.be> wrote:
>> 
>>> John,
>>>  
>>> Thanks for the fast response!
>>>  
>>> However: adding or removing dashes in the SMARTS string doesn’t change the 
>>> outcome when I try it.
>>>  
>>> Also, using your proposed alternative, e.g.:
>>> Pattern pattern = Ullmann.findSubstructure(SMARTSParser.parse(smarts, blr));
>>> for (IAtomContainer ring : ringSet.atomContainers()) {
>>>     System.out.println(pattern.matches(ring));
>>> }
>>>  
>>> does not change the outcome (i.e. false) for me either.
>>> Are you using the 1.5.4 or master branch?
>>>  
>>> Regards,
>>> Nick
>>>  
>>> From: John May [mailto:john...@ebi.ac.uk] 
>>> Sent: Friday, February 07, 2014 5:28 PM
>>> To: Nick Vandewiele
>>> Cc: cdk-user@lists.sourceforge.net
>>> Subject: Re: [Cdk-user] SMARTS matching after implicit to explict hydrogen 
>>> conversion and SSSRing finder
>>>  
>>> Okay it’s the bond matching… C-1-C-C-C-C1 works but C1-C-C-C-C1 doesn’t.
>>>  
>>> Should be an easy fix.
>>>  
>>> J
>>>  
>>> On 7 Feb 2014, at 16:03, Nick Vandewiele <nick.vandewi...@ugent.be> wrote:
>>> 
>>> 
>>> Hi,
>>>  
>>> I am using CDK 1.5.4 and detected some behavior of the SMARTS matcher that 
>>> I didn’t quite understand.
>>> When I search for a SMARTS pattern in one of the rings detected using the 
>>> SSSRFinder algorithm, the success of finding the pattern in the ring 
>>> depends on whether implicit hydrogens were converted to explicit ones, or 
>>> not.
>>> If explicit hydrogens are present, the pattern is not found. If only 
>>> implicit hydrogens are present, the pattern IS found.
>>>  
>>> This code was used:
>>>  
>>>     String             smiles = "C1C(O)CCC1";
>>>     IChemObjectBuilder blr    = SilentChemObjectBuilder.getInstance();
>>>     SmilesParser       smipar = new SmilesParser(blr);
>>>     IAtomContainer     m      = smipar.parseSmiles(smiles);
>>>     String             smarts = "C1-C-C-C-C1";
>>>     SMARTSQueryTool    sqt    = new SMARTSQueryTool(smarts, blr);
>>>
>>>     AtomContainerManipulator.convertImplicitToExplicitHydrogens(m);
>>>     IRingSet ringSet = new SSSRFinder(m).findSSSR(); // find SSSR rings
>>>
>>>     for (IAtomContainer ring : ringSet.atomContainers()) {
>>>         boolean found = sqt.matches(ring); // false (should be true)
>>>     }
>>>  
>>> Although the release notes of 1.5.4 are very informative, I couldn’t find 
>>> an answer explaining this behavior.
>>>  
>>> So my question is two-fold:
>>> 1)      How do I ensure that the pattern is found, even when explicit 
>>> hydrogens are used in the atom container?
>>> 2)      What is happening under the hood here? Is this behavior normal?
>>>  
>>> Regards,
>>> Nick
>>>  
>>> ------------------------------------------------------------------------------
>>> Managing the Performance of Cloud-Based Applications
>>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
>>> Read the Whitepaper.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Cdk-user mailing list
>>> Cdk-user@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>> 

