Hi Martin,
As you know, the CDK does not have an aromatic bond type but rather an auxiliary
flag, ‘ISAROMATIC’. The default ‘normalized’ representation is therefore one with
bond orders assigned and, optionally, aromaticity perceived. Some
algorithms depend on bond orders being assigned and do not reject or check
compounds that are delocalized with unassigned bond orders. Using a flag
instead of an ‘aromatic’ bond type also means that if you delocalize bonds with
the CDK, no bond order information is lost.
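As a rough illustration of the flag idea, here is a toy model in plain Java. It is not the CDK classes: FlagDemo, Bond, Order and the helper names are all made up for this sketch. It only shows that marking bonds as aromatic via a flag leaves the stored bond orders untouched.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical, not CDK): a bond keeps its order and carries a separate aromatic flag. */
public class FlagDemo {
    enum Order { SINGLE, DOUBLE }

    static final class Bond {
        final Order order;   // the assigned (Kekulé) bond order
        boolean aromatic;    // auxiliary flag, independent of the order
        Bond(Order order) { this.order = order; }
    }

    /** Kekulé benzene: alternating double/single bonds around a six-membered ring. */
    static List<Bond> benzene() {
        List<Bond> ring = new ArrayList<>();
        for (int i = 0; i < 6; i++)
            ring.add(new Bond(i % 2 == 0 ? Order.DOUBLE : Order.SINGLE));
        return ring;
    }

    /** "Delocalize" by setting the flag only; bond orders are not modified. */
    static void markAromatic(List<Bond> ring) {
        for (Bond b : ring) b.aromatic = true;
    }

    /** Count the bonds whose stored order is DOUBLE. */
    static long countDouble(List<Bond> ring) {
        return ring.stream().filter(b -> b.order == Order.DOUBLE).count();
    }

    public static void main(String[] args) {
        List<Bond> ring = benzene();
        markAromatic(ring);
        System.out.println(countDouble(ring)); // prints 3: the orders survive the flagging
    }
}
```

With an ‘aromatic’ bond type in place of the flag, the order field itself would be overwritten and the three double bonds could not be recovered without a Kekulisation step.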
Input from formats that have an aromatic bond type (most do) should be Kekulised.
Since aromatic SMILES are ubiquitous this is done automatically for SMILES, but
you can turn this off. Whether each format needs an explicit Kekulisation step:
SMILES: No - the default option is to Kekulise automatically
InChI: No - InChI is delocalized but, as with SMILES, bond orders are assigned
by JNI internally
Molfile: No/Yes - only for query structures, but I would recommend rejecting any
molfile input with bond order 4; compounds will need atom typing + implicit H
addition
CML: Yes - “A” bond type, but may be uncommon; compounds will need atom typing
+ implicit H addition
Mol2: Yes - “ar” bond type; compounds will need atom typing + implicit H
addition
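The per-format notes above could be encoded as a small lookup, for example to decide what to do with each input file. This is only a sketch of one reading of those notes: InputChecklist, Format, Action and actionFor are hypothetical names, not CDK API, and the "type, add H, then Kekulise" ordering is an assumption.

```java
/** Hypothetical checklist encoding the per-format notes; not part of the CDK. */
public class InputChecklist {
    enum Format { SMILES, INCHI, MOLFILE, CML, MOL2 }
    enum Action { NONE, REJECT, TYPE_ADD_H_THEN_KEKULISE }

    /**
     * @param format               the input format
     * @param hasAromaticBondType  true if the input actually uses the format's
     *                             aromatic bond type (bond order 4, "A", "ar", ...)
     */
    static Action actionFor(Format format, boolean hasAromaticBondType) {
        if (!hasAromaticBondType) return Action.NONE; // bond orders already assigned
        switch (format) {
            case SMILES:
            case INCHI:
                return Action.NONE;                   // bond orders assigned on input
            case MOLFILE:
                return Action.REJECT;                 // bond order 4 is meant for queries
            default:
                return Action.TYPE_ADD_H_THEN_KEKULISE; // CML "A", Mol2 "ar"
        }
    }

    public static void main(String[] args) {
        System.out.println(actionFor(Format.MOL2, true));    // TYPE_ADD_H_THEN_KEKULISE
        System.out.println(actionFor(Format.MOLFILE, true)); // REJECT
    }
}
```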
> I also noted that CDKHueckelAromaticityDetector is deprecated.
> However, if I skip this, my compounds are recognized as aromatic, so
> what do I use instead?
>
> In general, when changing from 1.4 to 1.5, some methods or classes
> became deprecated, however, it is not documented what to use instead.
> It would be much easier if this would be documented.
The CDKHueckelAromaticityDetector has been replaced by Aromaticity - this is
listed at the top of the documentation for that class.
The FixBondOrdersTool is not yet deprecated but will be documented to point to
Kekulization when needed.
> The first. It did not fail, just the bond orders in the aromatic rings
> were different.
Was this for identical structures from the same input type, for the same
structures but with different atom orders (e.g. different input types), or for
the same part of different structures (e.g. two structures that both contain a
naphthalene part)? For the second case you will still need to call the mykekule
example on all structures after bond orders have been initially assigned. The
third case is trickier.
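To illustrate the second case, here is a toy sketch in plain Java (not CDK code) of why a deterministic assignment can still depend on atom order, and why ordering the atoms by a canonical label first makes it uniform. Everything in it is an assumption for illustration: the six-membered ring, the greedy matching, and the use of atom names as stand-ins for canonical labels. Kekulization in the CDK uses its own algorithm.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy demo (hypothetical names): a deterministic matching that depends on atom order. */
public class AtomOrderDemo {

    /**
     * @param ring  atom names in ring-connectivity order, e.g. "abcdef"
     * @param order the same atom names in storage (index) order
     * @return the double bonds chosen, written as "x=y" with x < y
     */
    static SortedSet<String> kekulize(String ring, String order) {
        int n = ring.length();
        boolean[] matched = new boolean[n]; // indexed by storage position
        SortedSet<String> doubles = new TreeSet<>();
        for (int i = 0; i < n; i++) {
            if (matched[i]) continue;
            char atom = order.charAt(i);
            int pos = ring.indexOf(atom);
            // storage indices of the two ring neighbours
            int a = order.indexOf(ring.charAt((pos + n - 1) % n));
            int b = order.indexOf(ring.charAt((pos + 1) % n));
            int lo = Math.min(a, b), hi = Math.max(a, b);
            // greedily pair with the lowest-index neighbour that is still free
            int pick = !matched[lo] ? lo : !matched[hi] ? hi : -1;
            if (pick < 0) throw new IllegalStateException("toy greedy matching failed");
            matched[i] = matched[pick] = true;
            char other = order.charAt(pick);
            doubles.add(atom < other ? atom + "=" + other : other + "=" + atom);
        }
        return doubles;
    }

    /** Stand-in for canonical ordering: sort atoms by their (toy) canonical label. */
    static String canonical(String order) {
        char[] atoms = order.toCharArray();
        Arrays.sort(atoms);
        return new String(atoms);
    }

    public static void main(String[] args) {
        String ring = "abcdef";
        System.out.println(kekulize(ring, "abcdef"));            // [a=b, c=d, e=f]
        System.out.println(kekulize(ring, "bcdefa"));            // [a=f, b=c, d=e]
        System.out.println(kekulize(ring, canonical("bcdefa"))); // [a=b, c=d, e=f]
    }
}
```

Each call is repeatable, but the two atom orders give different Kekulé forms until the order is canonicalized.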
Hope that helps.
Kind regards,
John
On Sep 30, 2014, at 1:28 PM, Martin Guetlein <martin.guetl...@googlemail.com>
wrote:
> Hi John,
>
>> The Kekulization class replaces the FixBondOrdersTool. You should invoke this
>> on read rather than on write - note again only certain formats will need it.
> Why should I use it on read instead of on write? And it is not needed
> for SMILES anymore (correct?), so which formats do require this
> processing step?
>
> I also noted that CDKHueckelAromaticityDetector is deprecated.
> However, if I skip this, my compounds are recognized as aromatic, so
> what do I use instead?
>
> In general, when changing from 1.4 to 1.5, some methods or classes
> became deprecated, however, it is not documented what to use instead.
> It would be much easier if this would be documented.
>
>> In what regard was the FixBondOrdersTool non-deterministic? Did it give
>> different Kekulé assignments for the same input or did it randomly fail?
> The first. It did not fail, just the bond orders in the aromatic rings
> were different.
>
> Kind regards,
> Martin
>
>
>
> On 30 September 2014 12:38, John May <john.wilkinson...@gmail.com> wrote:
>> Hi Martin,
>>
>> The Kekulization class replaces the FixBondOrdersTool. You should invoke this
>> on read rather than on write - note again only certain formats will need it.
>>
>>> writing SDFs like that is not deterministic
>>
>>
>> In what regard was the FixBondOrdersTool non-deterministic? Did it give
>> different Kekulé assignments for the same input or did it randomly fail?
>>
>> Thanks,
>> J
>>
>> On Sep 30, 2014, at 10:42 AM, Martin Guetlein
>> <martin.guetl...@googlemail.com> wrote:
>>
>> Hi John,
>>
>> my example is in the bug that I reported some time ago:
>> https://sourceforge.net/p/cdk/bugs/1307
>> I added the FixBondOrdersTool to this example to make sure that, after
>> writing compounds to SDF and reading them again, aromaticity can be
>> detected correctly (thanks to kekulization). But then I found out that
>> writing SDFs like that is not deterministic. This is why I am looking
>> for a method that ensures a proper Kekule representation and, at the
>> same time, is deterministic.
>>
>>> Are you reading from SMILES like in your example?
>>
>> My input data can be any format CDK is supporting (if you're interested,
>> see http://ches-mapper.org)
>>
>> Thanks for your help,
>> Martin
>>
>>
>>
>>
>>
>>
>>
>> On 30 September 2014 11:15, John May <john.wilkinson...@gmail.com> wrote:
>>
>> Hi Martin,
>>
>>> I tried your mykekule-Method in this example (instead of
>>> FixBondOrdersTool) and it was able to detect aromaticity correctly
>>
>>
>> I’m not sure what you mean here - it is not detecting aromaticity at all.
>> The input to this method needs a structure with bond orders already
>> assigned; aromatic flags may be set but they are not utilised. Do you mean
>> aromaticity can be detected from the assigned structure?
>>
>>> I do not fully understand. Is there a method to assign bond orders and
>>> to make the mykekule a replacement for FixBondOrdersTool?
>>
>>
>> Yes - but I think we’re on different pages. Are you reading from SMILES like
>> in your example? If so, the bond orders are automatically assigned on input
>> and no extra effort is needed to store the output in SDF.
>>
>> My understanding was you needed a way to assign a consistent Kekule
>> structure when there are multiple Kekule forms (i.e. in naphthalene)? This
>> is what the mykekule method does - it is less work to assign a Kekulé form.
>> It might be easier if you have an example of the unexpected behaviour.
>>
>> J
>>
>> On Sep 29, 2014, at 2:51 PM, Martin Guetlein
>> <martin.guetl...@googlemail.com> wrote:
>>
>> Hi John,
>>
>> Hmm, I was only using the FixBondOrdersTool to make sure that
>> aromaticity is not lost when exporting compounds to molfiles/SDF (see
>> https://sourceforge.net/p/cdk/bugs/1307).
>> I tried your mykekule-Method in this example (instead of
>> FixBondOrdersTool) and it was able to detect aromaticity correctly.
>>
>>> The input to SMILES must already have bond orders assigned so it’s not a
>>> drop-in replacement for FixBondOrdersTool if you have delocalized bonds.
>>
>> I do not fully understand. Is there a method to assign bond orders and
>> to make the mykekule a replacement for FixBondOrdersTool?
>>
>>
>> Thanks for your help,
>> Martin
>>
>>
>> On 29 September 2014 15:35, John May <john.wilkinson...@gmail.com> wrote:
>>
>> Hi Martin,
>>
>> The mykekule() example as it is will preserve 3D coordinates. All properties
>> (except bond order) remain unchanged from the input. Using the SMILES output
>> just simplifies the code a little. The input to SMILES must already have
>> bond orders assigned so it’s not a drop-in replacement for FixBondOrdersTool
>> if you have delocalized bonds. Usual warning about aromatic bonds in
>> molfiles applies.
>>
>> J
>>
>> On Sep 29, 2014, at 1:12 PM, Martin Guetlein
>> <martin.guetl...@googlemail.com> wrote:
>>
>> Hi John,
>>
>> thanks for your email. Unfortunately, I cannot use the SMILES round
>> trip like that because my input molecules might include 3D
>> information, that will be lost. Could this be added to the mykekule
>> method? I would use this mykekule method instead of the
>> FixBondOrdersTool, correct?
>>
>> Kind regards,
>> Martin
>>
>>
>> On 29 September 2014 12:54, John May <john.wilkinson...@gmail.com> wrote:
>>
>> Hi Martin,
>>
>> I can’t speak for the FixBondOrdersTool but Kekulization is deterministic
>> and will generate a Kekulé assignment that depends on atom order. To generate
>> a uniform assignment, first canonicalize the molecule by sorting the atoms
>> and bonds by a canonical label (e.g. from Canon or InChINumbersTools).
>>
>> For something out of the box you could round trip through SMILES.
>>
>> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>> SmilesParser smipar = new SmilesParser(bldr);
>> SmilesGenerator smigen = SmilesGenerator.unique();
>>
>> System.out.println(smigen.create(smipar.parseSmiles("c1cc2c(cc1)cccc2")));
>> System.out.println(smigen.create(smipar.parseSmiles("C1=CC2=C(C=C1)C=CC=C2")));
>> System.out.println(smigen.create(smipar.parseSmiles("C=1C=C2C(=CC=1)C=CC=C2")));
>> System.out.println(smigen.create(smipar.parseSmiles("C1=CC=2C(C=C1)=CC=CC=2")));
>>
>>
>> Gives the output
>>
>> C1=CC=C2C=CC=CC2=C1
>>
>> C1=CC=C2C=CC=CC2=C1
>>
>> C1=CC=C2C=CC=CC2=C1
>>
>> C1=CC=C2C=CC=CC2=C1
>>
>>
>> It is possible to do it without reordering the molecule permanently but it
>> is quite tricky to explain so I’ll just provide the code.
>>
>> import java.util.HashMap;
>> import java.util.Map;
>> import org.openscience.cdk.graph.GraphUtil;
>> import org.openscience.cdk.graph.GraphUtil.EdgeToBondMap;
>> import org.openscience.cdk.interfaces.IAtom;
>> import org.openscience.cdk.interfaces.IAtomContainer;
>> import org.openscience.cdk.interfaces.IBond;
>> import org.openscience.cdk.interfaces.IChemObjectBuilder;
>> import org.openscience.cdk.silent.SilentChemObjectBuilder;
>> import org.openscience.cdk.smiles.SmilesGenerator;
>> import org.openscience.cdk.smiles.SmilesParser;
>>
>> static IAtomContainer mykekule(IAtomContainer org) throws Exception {
>>
>>     final IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>>     final SmilesParser smipar = new SmilesParser(bldr);
>>     final SmilesGenerator smigen = SmilesGenerator.unique();
>>
>>     final int n = org.getAtomCount();
>>     int[] ordering = new int[n];
>>
>>     // generate a kekule assignment via SMILES and store the output order
>>     // (a permutation of atom indices)
>>     IAtomContainer cpy = smipar.parseSmiles(smigen.create(org, ordering));
>>
>>     // index atoms for lookup
>>     final Map<IAtom,Integer> atomIndexMap = new HashMap<>();
>>     for (IAtom atom : org.atoms())
>>         atomIndexMap.put(atom, atomIndexMap.size());
>>
>>     // util to get atom index -> bond map
>>     EdgeToBondMap bondMap = EdgeToBondMap.withSpaceFor(cpy);
>>     GraphUtil.toAdjList(cpy, bondMap);
>>
>>     for (IBond bond : org.bonds()) {
>>
>>         // atom indices in 'org'
>>         int u = atomIndexMap.get(bond.getAtom(0));
>>         int v = atomIndexMap.get(bond.getAtom(1));
>>
>>         // atom indices in 'cpy'
>>         int uCpy = ordering[u];
>>         int vCpy = ordering[v];
>>
>>         // propagate the assigned bond order
>>         bond.setOrder(bondMap.get(uCpy, vCpy).getOrder());
>>
>>         // note the following would also work to get the cpy bond
>>         // cpy.getBond(cpy.getAtom(uCpy), cpy.getAtom(vCpy));
>>     }
>>
>>     // 'org' is modified in place; only the bond orders change
>>     return org;
>> }
>>
>>
>> https://gist.github.com/johnmay/cf1d3767d04eb424557f
>>
>> You can use that method to consistently kekulise molecules and maintain the
>> atom order. I use SMILES again but it would work the same with molfiles.
>>
>> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>> SmilesParser smipar = new SmilesParser(bldr);
>> SmilesGenerator smigen = SmilesGenerator.generic(); // note: not unique SMILES now!
>>
>> System.out.println(smigen.create(mykekule(smipar.parseSmiles("c1cc2c(cc1)cccc2"))));
>> System.out.println(smigen.create(mykekule(smipar.parseSmiles("C1=CC2=C(C=C1)C=CC=C2"))));
>> System.out.println(smigen.create(mykekule(smipar.parseSmiles("C=1C=C2C(=CC=1)C=CC=C2"))));
>> System.out.println(smigen.create(mykekule(smipar.parseSmiles("C1=CC=2C(C=C1)=CC=CC=2"))));
>>
>>
>> C=1C=C2C(=CC1)C=CC=C2
>> C=1C=C2C(=CC1)C=CC=C2
>> C=1C=C2C(=CC1)C=CC=C2
>> C=1C=C2C(=CC1)C=CC=C2
>>
>>
>> Kind regards,
>> J
>>
>> On Sep 26, 2014, at 2:49 PM, Martin Guetlein
>> <martin.guetl...@googlemail.com> wrote:
>>
>> Hi,
>>
>> At some point, FixBondOrdersTool apparently makes random choices and
>> this causes my SDF export to produce different results with the same
>> input on different runs.
>> Can this be circumvented somehow?
>>
>> Thanks and kind regards,
>> Martin
>>
>>
>> --
>> Dipl-Inf. Martin Gütlein
>> Phone:
>> +49 (0)761 203 8442 (office)
>> +49 (0)177 623 9499 (mobile)
>> Email:
>> guetl...@informatik.uni-freiburg.de
>>
>> _______________________________________________
>> Cdk-user mailing list
>> Cdk-user@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>
>>
>
>
>
> --
> Dipl-Inf. Martin Gütlein
> Phone:
> +49 (0)761 203 8442 (office)
> +49 (0)177 623 9499 (mobile)
> Email:
> guetl...@informatik.uni-freiburg.de