Hi all,

Given a molecule, how do I generate a SMILES which reflects the internal aromaticity used?
I'm cross-comparing some work using RDKit with CDK. The differences appear to be due to differences in aromaticity perception, as expected, and I'm trying to figure out how to verify them.

Consider the following input SMILES (CHEMBL2369103):

  OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1

and SMARTS:

  C=CC=N

While the SMARTS looks like it should match the "C=CC(=N2)" in the SMILES, toolkits of course can perceive their own aromaticity. Testing with CDK Depict shows that CDK perceives all four nitrogens as aromatic. A SMARTS which does match is C=C-c:n, and using "a" for the SMARTS verifies that all nitrogens are aromatic.

I wanted to verify this by visual inspection of the SMILES. When I generate the SMILES with the default flavor I get, as I should have expected, a Kekule form:

  C1=CC=C(C=C1)/C/2=C/3\\C=CC(=N3)C(=C4C=CC5=C(C6=CC=CC=C6)C7=NC(=C(C8=CC=CC=C8)C9=CC=C2N9[P+](N45)(OCCO)OCCO)C=C7)C%10=CC=CC=C%10

When I remembered to add UseAromaticSymbols to the flavor I got:

  c1ccc(cc1)/C/2=C/3\C=CC(=N3)C(=c4ccc5=C(c6ccccc6)C7=NC(=C(c8ccccc8)c9ccc2n9[P+](n45)(OCCO)OCCO)C=C7)c%10ccccc%10

This shows two aromatic nitrogens and two aliphatic nitrogens, whereas I expected four "n" terms. This SMILES also contains "C=CC(=N3)", which I would expect to match the SMARTS "C=CC=N", so I can't use this approach for manual verification.

I didn't see any other relevant flavors to add. Is there something else I should do?

Cheers,

Andrew
da...@dalkescientific.com
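For the manual check described above, counting the "n" and "N" symbols by eye is error-prone on a string this long. The following is only a rough sketch using the Python standard library: it tokenizes the SMILES text and tallies nitrogen symbols. It is not RDKit or CDK code and does no aromaticity perception of its own; the helper name nitrogen_counts and the simplified atom regex are illustrative assumptions, and the count reflects only what is written in the string.

```python
import re
from collections import Counter

# Simplified atom tokenizer (an assumption, not a full SMILES grammar):
# bracket atoms first, then two-letter organic-subset symbols, then one-letter ones.
ATOM = re.compile(r"\[[^\]]*\]|Cl|Br|[BCNOPSFIbcnops]")

# Element symbol at the start of a bracket atom, after any isotope digits.
BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def nitrogen_counts(smiles):
    """Return (aromatic, aliphatic) nitrogen counts as written in the SMILES text."""
    counts = Counter()
    for token in ATOM.findall(smiles):
        if token.startswith("["):
            m = BRACKET_SYMBOL.match(token)
            token = m.group(1) if m else ""
        counts[token] += 1
    return counts["n"], counts["N"]


# The UseAromaticSymbols output quoted in the message (raw string keeps the backslash).
aromatic_out = (
    r"c1ccc(cc1)/C/2=C/3\C=CC(=N3)C(=c4ccc5=C(c6ccccc6)C7=NC(=C(c8ccccc8)"
    r"c9ccc2n9[P+](n45)(OCCO)OCCO)C=C7)c%10ccccc%10"
)

print(nitrogen_counts(aromatic_out))  # (2, 2): two "n" and two "N"
```

This confirms the two-aromatic, two-aliphatic split in the written SMILES, but it cannot say whether the toolkit's internal aromaticity flags agree with what the SMILES writer emitted.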