Just to clarify: you can write SMILES in CDK, so you’re writing SMILES and then 
interpreting it as SMARTS. CDK doesn’t have the ability to write SMARTS. As 
well as hydrogens, you may also have trouble with aromaticity, charges, and 
isotopes.

>> c(c[cH])c[cH]

Is probably better as [#6]([#6][#6])[#6][#6].
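
As a rough illustration of that rewrite, here is a minimal string-level sketch 
in plain Java. It is not a CDK API, and the class and method names are made up 
for this example. It assumes a small element table and replaces every atom, 
bracketed or not, with its atomic-number form, so hydrogen counts, charges, 
isotopes and the aromatic/aliphatic distinction are all dropped. It does not 
validate the input SMILES.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: rewrites a SMILES string into atomic-number SMARTS.
final class SmilesToAtomicNumberSmarts {
    // Assumption: only these elements are supported; anything else is rejected.
    private static final Map<String, Integer> ATOMIC_NUMBER = new HashMap<>();
    static {
        String[] symbols = {"B", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"};
        int[] numbers = {5, 6, 7, 8, 9, 15, 16, 17, 35, 53};
        for (int i = 0; i < symbols.length; i++) ATOMIC_NUMBER.put(symbols[i], numbers[i]);
    }

    static String convert(String smiles) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < smiles.length()) {
            char ch = smiles.charAt(i);
            if (ch == '[') {
                int close = smiles.indexOf(']', i);
                if (close < 0) throw new IllegalArgumentException("unclosed bracket atom");
                int j = i + 1;
                while (j < close && Character.isDigit(smiles.charAt(j))) j++; // skip isotope
                // keep only the element; H count, charge, chirality etc. are dropped
                out.append(smarts(bracketSymbol(smiles, j, close)));
                i = close + 1;
            } else if (Character.isLetter(ch)) {
                int len = organicSymbolLength(smiles, i);
                out.append(smarts(smiles.substring(i, i + len)));
                i += len;
            } else {
                out.append(ch); // bonds, branches and ring closures pass through
                i++;
            }
        }
        return out.toString();
    }

    // Inside brackets: one letter followed by any lowercase letters (c, Cl, se).
    private static String bracketSymbol(String s, int start, int close) {
        int end = start;
        if (end < close && Character.isLetter(s.charAt(end))) {
            end++;
            while (end < close && Character.isLowerCase(s.charAt(end))) end++;
        }
        return s.substring(start, end);
    }

    // Outside brackets only Cl and Br are two-letter symbols.
    private static int organicSymbolLength(String s, int start) {
        if (start + 1 < s.length()) {
            String two = s.substring(start, start + 2);
            if (two.equals("Cl") || two.equals("Br")) return 2;
        }
        return 1;
    }

    private static String smarts(String symbol) {
        if (symbol.isEmpty()) throw new IllegalArgumentException("missing element symbol");
        // aromatic (lowercase) symbols share the element's atomic number
        String key = Character.toUpperCase(symbol.charAt(0)) + symbol.substring(1);
        Integer z = ATOMIC_NUMBER.get(key);
        if (z == null) throw new IllegalArgumentException("unsupported atom: " + symbol);
        return "[#" + z + "]";
    }

    public static void main(String[] args) {
        System.out.println(convert("c(c[cH])c[cH]")); // prints [#6]([#6][#6])[#6][#6]
    }
}
```

Note that [#6] matches aromatic and aliphatic carbon alike, so the pattern is 
looser than the SMILES it came from.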

The reason you’re having trouble in CDK 1.5 is that SMILES IO now handles 
valence correctly.

Anyway, there are a few solutions:

1) reset the hydrogen counts to their defaults (i.e. atom typing). This will 
work for your examples but also means you would lose the aromaticity flags 
(i.e. the example above isn’t a ring), and it wouldn’t fix nitrogens, which 
also have their H displayed when aromatic. I would not recommend this.
2) set all hydrogen counts to 0 (not null!) before generating the SMILES; you 
may also want to reset charge and mass. Simply loop over the atoms of the MCS 
and set the implicit hydrogen count to 0. removeHydrogens has no effect 
because the hydrogens are not explicit, see 
http://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
3) after parsing the SMILES as SMARTS, traverse the expression tree of each 
atom and replace each And(<OtherSmartsAtom>, HydrogenCount) with 
<OtherSmartsAtom>.
4) load the SMILES as SMILES and do a normal subgraph match as opposed to a 
SMARTS match.
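
The traversal in option 3 can be sketched with a toy expression tree. The 
Expr, Primitive, HydrogenCount and And classes below are hypothetical 
stand-ins, not CDK’s SMARTS query classes, so treat this as an outline of the 
rewrite rather than working CDK code.

```java
// Hypothetical, simplified atom-expression tree (not the CDK classes).
abstract class Expr {}

// Any primitive other than a hydrogen count, e.g. "aromatic carbon".
final class Primitive extends Expr {
    final String label;
    Primitive(String label) { this.label = label; }
    @Override public String toString() { return label; }
}

// The hydrogen-count term we want to get rid of.
final class HydrogenCount extends Expr {
    final int count;
    HydrogenCount(int count) { this.count = count; }
    @Override public String toString() { return "H" + count; }
}

// Conjunction of two sub-expressions.
final class And extends Expr {
    final Expr left;
    final Expr right;
    And(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override public String toString() { return "And(" + left + ", " + right + ")"; }
}

final class StripHydrogenCount {
    // Returns the expression without hydrogen-count terms,
    // or null when nothing but a hydrogen count was there.
    static Expr strip(Expr expr) {
        if (expr instanceof HydrogenCount) return null;
        if (expr instanceof And) {
            And and = (And) expr;
            Expr left = strip(and.left);
            Expr right = strip(and.right);
            if (left == null) return right;   // And(H, x) becomes x
            if (right == null) return left;   // And(x, H) becomes x
            return new And(left, right);
        }
        return expr; // other primitives are kept as they are
    }

    public static void main(String[] args) {
        Expr atom = new And(new Primitive("c"), new HydrogenCount(1));
        System.out.println(atom + " -> " + strip(atom)); // prints And(c, H1) -> c
    }
}
```

A real implementation would also have to decide what to do with an atom that 
carries only a hydrogen count (the null case here), and with any Or or Not 
nodes the query classes use.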

Also:
        - make sure you use the new SMSD (not part of CDK); the CDK packages 
are quite old
        - avoid the DefaultChemObjectBuilder and use SilentChemObjectBuilder 
(the naming is the wrong way round, but Silent is actually better as it 
doesn’t fire off events)
        - you’re generating canonical SMILES when this isn’t needed; use 
SmilesGenerator.generic().aromatic() when creating the SmilesGenerator
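
Putting option 2 together with the builder and generator suggestions, a 
minimal sketch could look like the following. It assumes a CDK 1.5-era API 
(setImplicitHydrogenCount, setFormalCharge and setMassNumber on IAtom, and 
SmilesGenerator.generic().aromatic()). The class name, the helper method and 
the stand-in input molecule are made up for illustration; in your code you 
would pass the MCS fragment instead.

```java
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smiles.SmilesGenerator;
import org.openscience.cdk.smiles.SmilesParser;

// Hypothetical helper class, assuming CDK 1.5 is on the classpath.
final class HydrogenFreeSmiles {

    // Resets per-atom properties in place, then writes a non-canonical aromatic SMILES.
    static String create(IAtomContainer fragment) throws CDKException {
        for (IAtom atom : fragment.atoms()) {
            atom.setImplicitHydrogenCount(0); // 0, not null
            atom.setFormalCharge(0);          // drop charges
            atom.setMassNumber(null);         // drop isotope labels (null = unspecified)
        }
        return SmilesGenerator.generic().aromatic().create(fragment);
    }

    public static void main(String[] args) throws CDKException {
        // Silent builder: does not fire change events
        SmilesParser parser = new SmilesParser(SilentChemObjectBuilder.getInstance());
        // Stand-in molecule; replace with the MCS fragment
        IAtomContainer fragment = parser.parseSmiles("c1ccccc1NC");
        System.out.println("mcs: " + create(fragment));
    }
}
```

Atoms whose hydrogen count no longer matches the default valence may be 
written in brackets, so the output will not necessarily look like "ccccc" 
character for character, but it should not carry any H counts.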

J

On Dec 2, 2014, at 11:04 AM, Martin Gütlein <guetl...@posteo.de> wrote:

> Hi,
> 
> any help with this issue would be very much appreciated,
> 
> Kind regards,
> Martin
> 
> -------- Original message --------
> Subject: Re: how to print SMARTS pattern without hydrogens
> Date: 02.12.2014 12:00
> On 30 September 2014 at 09:30, Martin Guetlein
> <martin.guetl...@googlemail.com> wrote:
>> Hi,
>> 
>> I am currently migrating from CDK 1.4 to 1.5. I am mining the maximum
>> common subgraph of two compounds and then printing the resulting fragment
>> as SMARTS. This works in 1.4; however, in 1.5 the SmilesGenerator
>> adds unwanted hydrogens. How can I get rid of the hydrogens?
>> See the example below.
>> See also 
>> https://www.mail-archive.com/cdk-user@lists.sourceforge.net/msg02597.html
>> 
>> Thanks and kind regards,
>> Martin
>> 
>> The following code prints "mcs: c(c[cH])c[cH]" instead of "mcs: ccccc"
>> [[
>> SmilesParser sp = new 
>> SmilesParser(DefaultChemObjectBuilder.getInstance());
>> IAtomContainer mol1 = sp.parseSmiles("c1ccccc1NC");
>> IAtomContainer mol2 = sp.parseSmiles("c1cccnc1");
>> org.openscience.cdk.smsd.Isomorphism mcsFinder = new
>> org.openscience.cdk.smsd.Isomorphism(
>> org.openscience.cdk.smsd.interfaces.Algorithm.DEFAULT, true);
>> mcsFinder.init(mol1, mol2, true, true);
>> mcsFinder.setChemFilters(true, true, true);
>> 
>> mol1 = mcsFinder.getReactantMolecule();
>> IAtomContainer mcsmolecule =
>> DefaultChemObjectBuilder.getInstance().newInstance(IAtomContainer.class,
>> mol1);
>> List<IAtom> atomsToBeRemoved = new ArrayList<IAtom>();
>> for (IAtom atom : mcsmolecule.atoms())
>> {
>> int index = mcsmolecule.getAtomNumber(atom);
>> if (!mcsFinder.getFirstMapping().containsKey(index))
>> atomsToBeRemoved.add(atom);
>> }
>> for (IAtom atom : atomsToBeRemoved)
>> mcsmolecule.removeAtomAndConnectedElectronContainers(atom);
>> 
>> // has no effect
>> // mcsmolecule = AtomContainerManipulator.removeHydrogens(mcsmolecule);
>> 
>> SmilesGenerator g = new SmilesGenerator().aromatic();
>> System.out.println("mcs: " + g.create(mcsmolecule));
>> ]]
>> 
>> --
>> Dipl-Inf. Martin Gütlein
>> Phone:
>> +49 (0)761 203 8442 (office)
>> +49 (0)177 623 9499 (mobile)
>> Email:
>> guetl...@informatik.uni-freiburg.de
> 
> ------------------------------------------------------------------------------
> Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
> from Actuate! Instantly Supercharge Your Business Reports and Dashboards
> with Interactivity, Sharing, Native Excel Exports, App Integration & more
> Get technology previously reserved for billion-dollar corporations, FREE
> http://pubads.g.doubleclick.net/gampad/clk?id=157005751&iu=/4140/ostg.clktrk
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
