Hi John!

Thanks for the research on this. It would have taken me a lot of time to 
find this out...

So [C]1[C][C][C][C]1 is perceived as aromatic... This is consistent with the 
different behavior I see when I run the same code with six-membered rings 
instead of five-membered rings. For six-membered rings there's no problem, 
presumably because they are not perceived as aromatic.

So what I do is first clone the original atom container (so that the implicit 
H-counts of the original are not updated), and then run the atom typing and 
hydrogen adding on each of the IRings.

IRingSet ringSet = new SSSRFinder(m.clone()).findSSSR(); // find SSSR rings
for (IAtomContainer ring : ringSet.atomContainers()) {
    AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(ring);
    CDKHydrogenAdder.getInstance(blr).addImplicitHydrogens(ring);
    boolean found = sqt.matches(ring); // true
}

regards,
Nick

________________________________________
From: John May [john...@ebi.ac.uk]
Sent: Friday, February 07, 2014 6:48 PM
To: Nick Vandewiele
Cc: cdk-user@lists.sourceforge.net
Subject: Re: [Cdk-user] SMARTS matching after implicit to explict hydrogen 
conversion and SSSRing finder

Okay now I’ve actually tracked it down - the issue is to do with aromaticity 
(kind of) and the SSSR providing a container for the ring atoms/bonds.

With implicit hydrogens the substructure from the SSSRFinder looks like this…

[CH]1CCCC1

Note that C1(O)CCCC1 is really [CH]1([OH])[CH2][CH2][CH2][CH2]1. In the CDK 
removing atoms doesn’t update neighbour hydrogen counts hence the first carbon 
keeps an implicit hydrogen count of 1.

When all hydrogens are explicit we get

[C]1[C][C][C][C]1

For some reason the aromaticity algorithm finds it to be aromatic. I can fix 
that but for now you can update the valences (i.e. AtomType/AddHydrogens) - but 
consider this. The atoms in the IRing are the same as the molecule - so 
adjusting the hydrogen count for the ring atoms would also affect the parent 
molecule. You can even run it and you’ll get..

[CH2]1([CH2](O[H])([CH2]([CH2]([CH2]1([H])[H])([H])[H])([H])[H])[H])([H])[H]

It would be even worse when there are multiple rings. I’ve never liked IRing 
anyway - much better to refer to rings by index without creating a new 
container.
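
The side effect described above can be illustrated without the CDK. What 
follows is only a minimal sketch in plain Java: the Atom class and all method 
names are hypothetical stand-ins, not CDK API. It assumes a ring "container" 
that holds the same atom objects as its parent, and compares that with a ring 
built from copies. The ring itself is referred to by atom index.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedAtomSketch {

    /** Hypothetical stand-in for an atom; NOT the CDK IAtom interface. */
    static final class Atom {
        final String symbol;
        int implicitH;

        Atom(String symbol, int implicitH) {
            this.symbol = symbol;
            this.implicitH = implicitH;
        }

        Atom copy() {
            return new Atom(symbol, implicitH);
        }
    }

    /** Cyclopentanol, C1(O)CCCC1: atom 0 is the CH carbon, atom 1 the OH oxygen, atoms 2-5 are CH2. */
    static List<Atom> cyclopentanol() {
        List<Atom> mol = new ArrayList<>();
        mol.add(new Atom("C", 1));
        mol.add(new Atom("O", 1));
        for (int i = 0; i < 4; i++) {
            mol.add(new Atom("C", 2));
        }
        return mol;
    }

    /** Ring that shares its atom objects with the parent molecule. */
    static List<Atom> sharedRing(List<Atom> mol, int[] ringIdx) {
        List<Atom> ring = new ArrayList<>();
        for (int i : ringIdx) {
            ring.add(mol.get(i));
        }
        return ring;
    }

    /** Ring built from copies of the parent's atoms. */
    static List<Atom> copiedRing(List<Atom> mol, int[] ringIdx) {
        List<Atom> ring = new ArrayList<>();
        for (int i : ringIdx) {
            ring.add(mol.get(i).copy());
        }
        return ring;
    }

    /** Treats the ring as an isolated carbocycle: two ring bonds per carbon, so 4 - 2 = 2 hydrogens each. */
    static void recomputeAsIsolatedRing(List<Atom> ring) {
        for (Atom a : ring) {
            a.implicitH = 2;
        }
    }

    public static void main(String[] args) {
        int[] ringIdx = {0, 2, 3, 4, 5}; // the ring, referred to by atom index

        List<Atom> m1 = cyclopentanol();
        recomputeAsIsolatedRing(sharedRing(m1, ringIdx));
        System.out.println("shared: parent C0 H-count = " + m1.get(0).implicitH); // 2, parent was changed

        List<Atom> m2 = cyclopentanol();
        recomputeAsIsolatedRing(copiedRing(m2, ringIdx));
        System.out.println("copied: parent C0 H-count = " + m2.get(0).implicitH); // 1, parent untouched
    }
}
```

In the shared case the hydroxyl-bearing carbon of the parent ends up with two 
hydrogens, which corresponds to the over-saturated structure shown above. 
Working on copies leaves the parent alone.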

Cheers,
J

On 7 Feb 2014, at 17:12, John May <john...@ebi.ac.uk> wrote:

Doh - of course. So the SMARTS has a quirk that ‘C1’ matches the 
‘CDKConstants.ISINRING’ flag.

We can fix this without a patch - just add this before you match. The 
SMARTSQueryTool should be doing it already - not sure why it isn’t though…. 
(that’s the bug)

SmartsMatchers.prepare(ring, true);

On 7 Feb 2014, at 17:03, John May <john...@ebi.ac.uk> wrote:

No problem,

master, but nothing should have changed…

J

On 7 Feb 2014, at 16:47, Nick Vandewiele <nick.vandewi...@ugent.be> wrote:

John,

Thanks for the fast response!

However: adding or removing dashes in the SMARTS string doesn’t change the 
outcome when I try it.

Also, using your proposed alternative, e.g.:
Pattern pattern = Ullmann.findSubstructure(SMARTSParser.parse(smarts, blr));
for (IAtomContainer ring : ringSet.atomContainers()) {
    System.out.println(pattern.matches(ring));
}

does not change the outcome (i.e. false) for me either.
Are you using the 1.5.4 or master branch?

Regards,
Nick

From: John May [mailto:john...@ebi.ac.uk]
Sent: Friday, February 07, 2014 5:28 PM
To: Nick Vandewiele
Cc: cdk-user@lists.sourceforge.net
Subject: Re: [Cdk-user] SMARTS matching after implicit to explict hydrogen 
conversion and SSSRing finder

Okay it’s the bond matching… C-1-C-C-C-C1 works but C1-C-C-C-C1 doesn’t.

Should be an easy fix.

J

On 7 Feb 2014, at 16:03, Nick Vandewiele <nick.vandewi...@ugent.be> wrote:


Hi,

I am using CDK 1.5.4 and detected some behavior of the SMARTS matcher that I 
didn’t quite understand.
When I search for a SMARTS pattern in one of the rings detected using the 
SSSRFinder algorithm, the success of finding the pattern in the ring depends on 
whether implicit hydrogens were converted to explicit ones, or not.
If explicit hydrogens are present, the pattern is not found. If only implicit 
hydrogens are present, the pattern IS found.

This code was used:

    String smiles = "C1C(O)CCC1";
    IChemObjectBuilder blr = SilentChemObjectBuilder.getInstance();
    SmilesParser smipar = new SmilesParser(blr);
    IAtomContainer m = smipar.parseSmiles(smiles);
    String smarts = "C1-C-C-C-C1";
    SMARTSQueryTool sqt = new SMARTSQueryTool(smarts, blr);

    AtomContainerManipulator.convertImplicitToExplicitHydrogens(m);
    IRingSet ringSet = new SSSRFinder(m).findSSSR(); // find SSSR rings

    for (IAtomContainer ring : ringSet.atomContainers()) {
        boolean found = sqt.matches(ring); // false (should be true)
    }

Although the release notes of 1.5.4 are very informative, I couldn’t find an 
answer explaining this behavior.

So my question is two-fold:
1) How do I ensure that the pattern is found, even when explicit hydrogens 
are used in the atom container?
2) What is happening under the hood here? Is this behavior normal?

Regards,
Nick

------------------------------------------------------------------------------
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user
