Yes, I've seen the same phenomenon in multiple SMILES generators.

Even Daylight's (when they had it up on a public web site).

From a chemical perspective, it isn't sensible that the pyridone-like ring
in molecule 1 should not be seen as aromatic in the canonical
SMILES, especially since the same ring is seen as aromatic in molecule 2.
The counter-argument has often been that "only exocyclic substituents are
considered". But of course that =N is indeed exocyclic to the ring in
question.

In a famous quote, Dave Weininger said:

It is important to remember that the purpose of the SMILES aromaticity
detection algorithm is for the purposes of chemical information
representation only! To this end, rigorous rules are provided for
determining the "aromaticity" of charged, heterocyclic, and electron
deficient ring systems. The "aromaticity" designation as used here is not
intended to imply anything about the reactivity, magnetic resonance
spectra, heat of formation, or odor of substances.

As an example of the utility of this definition, consider o-xylene. You
don't want to see the VB structure with a double bond connecting the
methyl-attached carbons as different from the form with a single bond in
that position. Hence, aromaticity enables SMILES to avoid that issue, since
the (canonical) SMILES does not contain any double bonds, but only aromatic
bonds within the ring.
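
A minimal RDKit sketch of the o-xylene point, assuming RDKit's default
aromaticity model. The two input SMILES are my own renderings of the two
Kekulé forms; they are not taken from the thread.

```python
from rdkit import Chem

# Two Kekule drawings of o-xylene (inputs written for this illustration).
kekule_double = "CC1=C(C)C=CC=C1"  # double bond between the methyl-bearing carbons
kekule_single = "CC1=CC=CC=C1C"    # single bond between the methyl-bearing carbons

# Parsing perceives aromaticity; writing then emits aromatic ring bonds.
canon_double = Chem.MolToSmiles(Chem.MolFromSmiles(kekule_double))
canon_single = Chem.MolToSmiles(Chem.MolFromSmiles(kekule_single))

print(canon_double)
print(canon_single)
print(canon_double == canon_single)  # both drawings collapse to one string
```

Both inputs should come out as the same canonical string with no "=" in
it, which is the whole point of the aromatic representation.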

And the fact is that there is no ambiguity in any of the structures I've
seen (including the one shown as smiles1) that exhibit the problem. There's
only one way to draw the resonance structure, anyway, so you could argue
that you don't need to make it aromatic at all.

Of course, if you had the courage of that particular conviction, you
wouldn't bother making pyrrole aromatic, either, because there's only one
resonance structure you can draw. But SMILES does define pyrrole as
aromatic.
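
For what it's worth, this is easy to check in RDKit. A small sketch,
assuming the default aromaticity model; the Kekulé input string is mine:

```python
from rdkit import Chem

# Pyrrole written in its single Kekule form.
pyrrole = Chem.MolFromSmiles("C1=CNC=C1")

# Collect the per-atom aromatic flags set during sanitization.
aromatic_flags = [atom.GetIsAromatic() for atom in pyrrole.GetAtoms()]

print(Chem.MolToSmiles(pyrrole))
print(all(aromatic_flags))
```

All five ring atoms should be flagged aromatic, even though only one
resonance structure exists.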

When I've discussed this with developers who have worked on SMILES systems,
they say that looking for cases like exocyclic aromaticity-producing
substituents in adjacent non-aromatic rings would slow the SMILES generator
down.

But the problem is that when you are using a SMARTS to look for one of
these pyridone-like rings that you see in the first structure, you're not
going to find it, even though it's there. Chemists do expect an aromatic
SMARTS to find an aromatic ring, which is no doubt the secret reason for
making pyrrole aromatic.
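
To make the SMARTS problem concrete, here is a hedged sketch using the two
SMILES from Alexis' question. The query pattern "c1ccccn1" is my own choice
of an aromatic six-membered-ring SMARTS, not something from the thread.

```python
from rdkit import Chem

# The two molecules from the original question.
mol1 = Chem.MolFromSmiles("N12C=CC=CC1=NCC2")
mol2 = Chem.MolFromSmiles("CN=C1C=CC=CN1C")

# Lowercase atoms in SMARTS match only atoms flagged as aromatic.
aromatic_ring = Chem.MolFromSmarts("c1ccccn1")

found_in_mol1 = mol1.HasSubstructMatch(aromatic_ring)
found_in_mol2 = mol2.HasSubstructMatch(aromatic_ring)

print(found_in_mol1, found_in_mol2)
```

Consistent with the canonical SMILES Paolo posted, I would expect no hit
for mol1 and a hit for mol2, although the six-membered ring is the same in
both.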

I've never liked this situation, but it boils down to the fact that
Daylight, which produced the original reference SMILES implementation,
"done it that-a-way". It has the advantage of *stare decisis*.

-P.

P. S. By the way, if any of you have ever seen a SMILES generator that
displays the 6-membered ring as aromatic in the first example, could you
please tell us which one that is?

On Fri, Nov 27, 2020 at 1:55 PM Paolo Tosco <paolo.tosco.m...@gmail.com>
wrote:

> (Now with link - you can tell it's Friday night)
>
> Hi Mark, Alexis,
>
> Yes, I was too fast in composing my previous reply and I did not pay
> enough attention to the molecules.
> After reading Alexis' reply, I looked more carefully at his original
> question and at that point I remembered having seen a similar behaviour
> before from RDKit on condensed ring systems featuring exocyclic bonds and
> relative mailing list discussions.
> So I did a bit of searching and I fished out the (long) thread that deals
> with exactly this behaviour.
>
>
> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAAsqebGxOwJtH32T5jC%3DoBZN6G1JE_NwsEqKUO8%2BmUCqmABCzQ%40mail.gmail.com/#msg36448625
>
> I hope that helps, cheers
> p.
>
> On Fri, Nov 27, 2020 at 7:31 PM Mark Mackey <m...@cresset-group.com>
> wrote:
>
>> Hi Paolo,
>>
>>
>>
>> Hmmm, I think this is displaying a bug (or at the very least unexpected
>> behaviour) in the aromaticity code. The issue isn’t the aromaticity of the
>> imidazole/dihydroimidazole, but the aromaticity of the pyridyl. Alexis’
>> second molecule is identical to the first except that one bond in the
>> 5-membered ring was broken, and that (to my eyes at least) should not
>> affect whether the 6-membered ring is seen as aromatic.
>>
>>
>>
>> Regards,
>>
>> Mark.
>>
>>
>>
>> *From:* Paolo Tosco <paolo.tosco.m...@gmail.com>
>> *Sent:* 27 November 2020 17:04
>> *To:* Alexis Parenty <alexis.parenty.h...@gmail.com>
>> *Cc:* RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
>> *Subject:* Re: [Rdkit-discuss] canonicalization of two aromatic
>> molecules returning two different forms (kekule and aromatic)
>>
>>
>>
>> Hi Alexis,
>>
>>
>>
>> The second molecule (smiles2) is indeed aromatic, but the first (smiles1)
>> is not, as the imidazole ring condensed to the pyridine is partially
>> saturated.
>>
>> The smiles1a analogue where I have added a double bond is aromatic, and
>> upon canonicalization it yields an aromatic SMILES as expected.
>>
>>
>>
>> Cheers,
>>
>> p.
>>
>>
>>
>> In [1]: from rdkit import Chem
>>
>> In [2]: mol1 = Chem.MolFromSmiles("N12C=CC=CC1=NCC2")
>>
>> In [4]: smiles1 = Chem.MolToSmiles(mol1)
>>
>> In [5]: smiles1
>> Out[5]: 'C1=CC2=NCCN2C=C1'
>>
>> In [6]: mol2 = Chem.MolFromSmiles("CN=C1C=CC=CN1C")
>>
>> In [8]: smiles2 = Chem.MolToSmiles(mol2)
>>
>> In [9]: smiles2
>> Out[9]: 'CN=c1ccccn1C'
>>
>> In [10]: mol1a = Chem.MolFromSmiles("N12C=CC=CC1=NC=C2")
>>
>> In [12]: smiles1a = Chem.MolToSmiles(mol1a)
>>
>> In [13]: smiles1a
>> Out[13]: 'c1ccn2ccnc2c1'
>>
>>
>>
>> On Fri, Nov 27, 2020 at 5:09 PM Alexis Parenty <
>> alexis.parenty.h...@gmail.com> wrote:
>>
>> Hi everyone,
>>
>>
>>
>> Why is it that when I canonicalize the following smiles1 I get its
>> unexpected Kekulé form, whereas when I canonicalize a similar smiles2, I
>> get its expected aromatic form?
>>
>>
>>
>> from rdkit import Chem
>>
>> smiles1 = Chem.CanonSmiles("N12C=CC=CC1=NCC2")
>> smiles1
>>
>> ==> 'C1=CC2=NCCN2C=C1'
>>
>>
>>
>> smiles2 = Chem.CanonSmiles("CN=C1C=CC=CN1C")
>> smiles2
>>
>> ==> 'CN=c1ccccn1C'
>>
>>
>>
>> I would like to get the aromatic form in both cases... Is there a way to
>> force the aromatic form?
>>
>>
>>
>> Best,
>>
>> Alexis
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
