Yes, I've seen the same phenomenon in multiple SMILES generators, even Daylight's (when they had it up on a public web site).

From a chemical perspective, it isn't sensible that the pyridone-like ring in molecule 1 should not be seen as aromatic in the canonical SMILES, especially since the same ring is seen as aromatic in molecule 2. The counter-argument has often been that "only exocyclic substituents are considered". But of course that =N is indeed exocyclic to the ring in question.

In a famous quote, Dave Weininger said:

    It is important to remember that the purpose of the SMILES aromaticity detection algorithm is for the purposes of chemical information representation only! To this end, rigorous rules are provided for determining the "aromaticity" of charged, heterocyclic, and electron deficient ring systems. The "aromaticity" designation as used here is not intended to imply anything about the reactivity, magnetic resonance spectra, heat of formation, or odor of substances.

As an example of the utility of this definition, consider o-xylene. You don't want to see the VB structure with a double bond connecting the methyl-attached carbons as different from the form with a single bond in that position. Hence, aromaticity enables SMILES to avoid that issue, since the (canonical) SMILES does not contain any double bonds, but only aromatic bonds within the ring.

And the fact is that there is no ambiguity in any of the structures I've seen (including the one shown as SMILES1) that exhibit the problem. There's only one way to draw the resonance structure, anyway, so you could argue that you don't need to make it aromatic at all. Of course, if you had the courage of that particular conviction, you wouldn't bother making pyrrole aromatic, either, because there's only one resonance structure you can draw. But SMILES does define pyrrole as aromatic.

When I've discussed this with developers who have worked on SMILES systems, they say that looking for cases like exocyclic aromaticity-producing substituents in adjacent non-aromatic rings would slow the SMILES generator down.
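The o-xylene point and the molecule 1 / molecule 2 discrepancy can be checked with a minimal RDKit sketch, assuming RDKit is installed. The helper names `canonical` and `aromatic_atom_count` are made up for illustration and are not part of RDKit. The first part checks that both Kekule drawings of o-xylene collapse to the same canonical SMILES. The second part only prints what RDKit perceives for the two molecules from the thread. The outputs quoted further down in this thread were 'C1=CC2=NCCN2C=C1' and 'CN=c1ccccn1C', but other RDKit versions may behave differently, so nothing is asserted about them here.

```python
from rdkit import Chem


def canonical(smiles: str) -> str:
    """Return RDKit's canonical SMILES (illustrative helper, not an RDKit function)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)


def aromatic_atom_count(smiles: str) -> int:
    """Count atoms flagged aromatic after default sanitization (illustrative helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())


# Two Kekule drawings of o-xylene.
XYLENE_DOUBLE = "CC1=C(C)C=CC=C1"  # double bond between the methyl-bearing carbons
XYLENE_SINGLE = "CC1=CC=CC=C1C"    # single bond between the methyl-bearing carbons

# The two molecules discussed in this thread.
SMILES1 = "N12C=CC=CC1=NCC2"
SMILES2 = "CN=C1C=CC=CN1C"

# Both drawings should give one canonical string with aromatic ring bonds only.
print(canonical(XYLENE_DOUBLE), canonical(XYLENE_SINGLE))

# Print what this RDKit version perceives for the thread's molecules.
for name, smi in (("smiles1", SMILES1), ("smiles2", SMILES2)):
    print(name, canonical(smi), "aromatic atoms:", aromatic_atom_count(smi))
```

Counting aromatic atoms rather than comparing strings makes it easy to see whether the 6-membered ring was perceived as aromatic, independent of how the canonical SMILES happens to be written.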
But the problem is that when you are using a SMARTS to look for one of these pyridone-like rings that you see in the first structure, you're not going to find it, even though it's there. Chemists do expect an aromatic SMARTS to find an aromatic ring, which is no doubt the secret reason for making pyrrole aromatic.

I've never liked this situation, but it boils down to the fact that Daylight, which produced the original reference SMILES implementation, "done it that-a-way". It has the advantage of *stare decisis*.

-P.

P.S. By the way, if any of you have ever seen a SMILES generator that displays the 6-membered ring as aromatic in the first example, could you please tell us which one that is?

On Fri, Nov 27, 2020 at 1:55 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> (Now with link - you can tell it's Friday night)
>
> Hi Mark, Alexis,
>
> Yes, I was too fast in composing my previous reply and I did not pay
> enough attention to the molecules.
> After reading Alexis' reply, I looked more carefully at his original
> question and at that point I remembered having seen a similar behaviour
> before from RDKit on condensed ring systems featuring exocyclic bonds and
> relative mailing list discussions.
> So I did a bit of searching and I fished out the (long) thread that deals
> with exactly this behaviour.
>
> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAAsqebGxOwJtH32T5jC%3DoBZN6G1JE_NwsEqKUO8%2BmUCqmABCzQ%40mail.gmail.com/#msg36448625
>
> I hope that helps, cheers
> p.
>
> On Fri, Nov 27, 2020 at 7:31 PM Mark Mackey <m...@cresset-group.com> wrote:
>
>> Hi Paolo,
>>
>> Hmmm, I think this is displaying a bug (or at the very least unexpected
>> behaviour) in the aromaticity code. The issue isn’t the aromaticity of the
>> imidazole/dihydroimidazole, but the aromaticity of the pyridyl.
>> Alexis’ second molecule is identical to the first except that one bond in the
>> 5-membered ring was broken, and that (to my eyes at least) should not
>> affect whether the 6-membered ring is seen as aromatic.
>>
>> Regards,
>>
>> Mark.
>>
>> From: Paolo Tosco <paolo.tosco.m...@gmail.com>
>> Sent: 27 November 2020 17:04
>> To: Alexis Parenty <alexis.parenty.h...@gmail.com>
>> Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
>> Subject: Re: [Rdkit-discuss] canonicalization of two aromatic
>> molecules returning two different forms (kekule and aromatic)
>>
>> Hi Alexis,
>>
>> The second molecule (smiles2) is indeed aromatic, but the first (smiles1)
>> is not, as the imidazole ring condensed to the pyridine is partially
>> saturated.
>>
>> The smiles1a analogue where I have added a double bond is aromatic, and
>> upon canonicalization it yields an aromatic SMILES as expected.
>>
>> Cheers,
>>
>> p.
>>
>> from rdkit import Chem
>>
>> In [2]: mol1 = Chem.MolFromSmiles("N12C=CC=CC1=NCC2")
>>
>> In [3]: mol1
>>
>> Out[3]: [molecule image]
>>
>> In [4]: smiles1 = Chem.MolToSmiles(mol1)
>>
>> In [5]: smiles1
>>
>> Out[5]: 'C1=CC2=NCCN2C=C1'
>>
>> In [6]: mol2 = Chem.MolFromSmiles("CN=C1C=CC=CN1C")
>>
>> In [7]: mol2
>>
>> Out[7]: [molecule image]
>>
>> In [8]: smiles2 = Chem.MolToSmiles(mol2)
>>
>> In [9]: smiles2
>>
>> Out[9]: 'CN=c1ccccn1C'
>>
>> In [10]: mol1a = Chem.MolFromSmiles("N12C=CC=CC1=NC=C2")
>>
>> In [11]: mol1a
>>
>> Out[11]: [molecule image]
>>
>> In [12]: smiles1a = Chem.MolToSmiles(mol1a)
>>
>> In [13]: smiles1a
>>
>> Out[13]: 'c1ccn2ccnc2c1'
>>
>> On Fri, Nov 27, 2020 at 5:09 PM Alexis Parenty <alexis.parenty.h...@gmail.com> wrote:
>>
>> Hi everyone,
>>
>> Why is it that when I canonicalize the following smiles_1 I get its
>> unexpected kekule form, whereas when I
>> canonicalize a similar smiles_2, I get its expected aromatic form?
>>
>> from rdkit import Chem
>>
>> smiles1 = Chem.CanonSmiles("N12C=CC=CC1=NCC2")
>> smiles1
>>
>> ==> 'C1=CC2=NCCN2C=C1'
>>
>> smiles2 = Chem.CanonSmiles("CN=C1C=CC=CN1C")
>> smiles2
>>
>> ==> 'CN=c1ccccn1C'
>>
>> I would like to get the aromatic form in both cases... Is there a way to
>> force the aromatic form?
>>
>> Best,
>>
>> Alexis
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss