Hi all,

  Given a molecule, how do I generate a SMILES which reflects the internal 
aromaticity used?

I'm cross-comparing some work using RDKit with CDK. The differences appear to 
be due to differences in aromaticity perception, as expected.

I'm trying to figure out how to verify these differences. Consider the 
following input SMILES:

OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1
 CHEMBL2369103

and SMARTS:

C=CC=N

While the SMARTS seems like it would match the "C=CC(=N2)" in the SMILES, 
toolkits of course can perceive their own aromaticity. 
Testing with CDK Depict shows CDK perceives all four nitrogens as aromatic.

A SMARTS which does match is C=C-c:n and using "a" for the SMARTS verifies that 
all nitrogens are aromatic.

I wanted to verify this by visual inspection of the SMILES. When I generate the 
SMILES with the default flavor I get, as I should have expected, a Kekule form:

C1=CC=C(C=C1)/C/2=C/3\C=CC(=N3)C(=C4C=CC5=C(C6=CC=CC=C6)C7=NC(=C(C8=CC=CC=C8)C9=CC=C2N9[P+](N45)(OCCO)OCCO)C=C7)C%10=CC=CC=C%10

When I remembered to add UseAromaticSymbols to the flavor I get:

c1ccc(cc1)/C/2=C/3\C=CC(=N3)C(=c4ccc5=C(c6ccccc6)C7=NC(=C(c8ccccc8)c9ccc2n9[P+](n45)(OCCO)OCCO)C=C7)c%10ccccc%10

This shows two aromatic nitrogens and two aliphatic nitrogens, whereas I 
expected four "n" terms.
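
To make the "count the n's" check less error-prone than reading the string by 
eye, here is a minimal sketch that tallies aromatic ("n") and aliphatic ("N") 
nitrogens in a SMILES string. It uses only plain Java string handling, not CDK, 
so it reports what the SMILES writer emitted rather than what the toolkit 
perceives internally. The names NitrogenCounter and countNitrogens are 
hypothetical, made up for this illustration.

```java
final class NitrogenCounter {

    /**
     * Counts nitrogen atoms in a SMILES string.
     * Returns a two-element array: {aromatic count, aliphatic count}.
     * Assumption: the input is syntactically valid SMILES.
     */
    static int[] countNitrogens(String smiles) {
        int aromatic = 0;
        int aliphatic = 0;
        int i = 0;
        while (i < smiles.length()) {
            char ch = smiles.charAt(i);
            if (ch == '[') {
                // Bracket atom: skip an optional isotope, then read the symbol.
                int close = smiles.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("unclosed bracket atom");
                }
                int j = i + 1;
                while (j < close && Character.isDigit(smiles.charAt(j))) {
                    j++;
                }
                if (j < close) {
                    char first = smiles.charAt(j);
                    // "N" followed by a lowercase letter is another element
                    // (Na, Ni, Nb, ...), so only a bare "N" counts as nitrogen.
                    boolean twoLetter = j + 1 < close
                            && Character.isLowerCase(smiles.charAt(j + 1));
                    if (first == 'N' && !twoLetter) {
                        aliphatic++;
                    } else if (first == 'n') {
                        aromatic++;
                    }
                }
                i = close + 1;
            } else {
                // Outside brackets only organic-subset symbols occur, so
                // 'N' and 'n' can only be nitrogen.
                if (ch == 'N') {
                    aliphatic++;
                } else if (ch == 'n') {
                    aromatic++;
                }
                i++;
            }
        }
        return new int[] {aromatic, aliphatic};
    }

    public static void main(String[] args) {
        // The UseAromaticSymbols output quoted above.
        String smi = "c1ccc(cc1)/C/2=C/3\\C=CC(=N3)C(=c4ccc5=C(c6ccccc6)C7=NC"
                + "(=C(c8ccccc8)c9ccc2n9[P+](n45)(OCCO)OCCO)C=C7)c%10ccccc%10";
        int[] counts = countNitrogens(smi);
        System.out.println("aromatic N: " + counts[0]
                + ", aliphatic N: " + counts[1]);
    }
}
```

Run on the UseAromaticSymbols output above, this gives the same two-and-two 
split I read off by hand, so the mismatch is in what the SMILES writer emits, 
not in my counting.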

This SMILES contains "C=CC(=N3)" which I would expect to match the SMARTS 
"C=CC=N", so I can't use this approach for manual verification.

I didn't see any other relevant flavors to add. Is there something else I 
should do?

Cheers,

                                Andrew
                                da...@dalkescientific.com




