Dear Peter and Paolo,
Wahoo! Many thanks to both of you for researching this issue so
thoroughly. No worries, I can live with a Kekulé form of my smiles1! I only
noticed the strange behaviour when I fragmented a dihydroimidazopyridine
derivative of smiles1 and saw that some of its fragments had lost
aromaticity and were no longer a substructure match for their parent...
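
In case it is useful to anyone else, here is a minimal sketch of writing both
molecules in Kekulé form so that they are at least treated consistently. It
assumes RDKit's Chem.Kekulize with clearAromaticFlags=True; the helper name
kekule_smiles is made up for illustration.

```python
from rdkit import Chem


def kekule_smiles(smiles):
    """Return a canonical SMILES written in Kekule form (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    # Convert aromatic bonds to alternating single/double bonds and
    # clear the aromatic flags so the writer emits upper-case atoms.
    Chem.Kekulize(mol, clearAromaticFlags=True)
    return Chem.MolToSmiles(mol)


k1 = kekule_smiles("N12C=CC=CC1=NCC2")  # smiles1 from this thread
k2 = kekule_smiles("CN=C1C=CC=CN1C")    # smiles2 from this thread
print(k1)
print(k2)
```

Parsing the Kekulé strings again re-perceives aromaticity, so this only changes
how the SMILES is written, not how RDKit classifies the rings.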

Best,
Alexis

On Sat, 28 Nov 2020 at 03:49, Peter S. Shenkin <shen...@gmail.com> wrote:

> Yes, I've seen the same phenomenon in multiple SMILES generators.
>
> Even Daylight's (when they had it up on a public web site).
>
> From a chemical perspective, it isn't sensible that the pyridone-like ring
> in molecule 1 should not be seen as aromatic in the canonical
> SMILES, especially since the same ring is seen as aromatic in molecule2.
> The counter-argument has often been that "only exocyclic substituents are
> considered". But of course that =N is indeed exocyclic to the ring in
> question.
>
> In a famous quote, Dave Weininger said:
>
> It is important to remember that the purpose of the SMILES aromaticity
> detection algorithm is for the purposes of chemical information
> representation only! To this end, rigorous rules are provided for
> determining the "aromaticity" of charged, heterocyclic, and electron
> deficient ring systems. The "aromaticity" designation as used here is not
> intended to imply anything about the reactivity, magnetic resonance
> spectra, heat of formation, or odor of substances.
>
> As an example of the utility of this definition, consider o-xylene. You
> don't want to see the VB structure with a double bond connecting the
> methyl-attached carbons as different from the form with a single bond in
> that position. Hence, aromaticity enables SMILES to avoid that issue, since
> the (canonical) SMILES does not contain any double bonds, but only aromatic
> bonds within the ring.
>
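The o-xylene point can be checked quickly. This is a minimal sketch assuming
RDKit's default aromaticity perception; the two Kekulé input strings are
written by hand for illustration and do not come from the thread.

```python
from rdkit import Chem

# Two hand-written Kekule forms of o-xylene (illustrative inputs).
double_between_methyls = "CC1=C(C)C=CC=C1"  # C=C joins the methyl-bearing carbons
single_between_methyls = "CC1=CC=CC=C1C"    # C-C joins the methyl-bearing carbons

canon_a = Chem.CanonSmiles(double_between_methyls)
canon_b = Chem.CanonSmiles(single_between_methyls)

print(canon_a, canon_b)
print("same canonical SMILES:", canon_a == canon_b)
```

Both inputs should collapse to one aromatic string with no explicit double
bonds, which is the bookkeeping benefit described above.
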
> And the fact is that there is no ambiguity in any of the structures I've
> seen (including the one shown as SMILES1) that exhibit the problem. There's
> only one way to draw the resonance structure, anyway, so you could argue
> that you don't need to make it aromatic at all.
>
> Of course, if you had the courage of that particular conviction, you
> wouldn't bother making pyrrole aromatic, either, because there's only one
> resonance structure you can draw. But SMILES does define pyrrole as
> aromatic.
>
> When I've discussed this with developers who have worked on SMILES
> systems, they say that looking for cases like exocyclic
> aromaticity-producing substituents in adjacent non-aromatic rings would
> slow the SMILES generator down.
>
> But the problem is that when you are using a SMARTS to look for one of
> these pyridone-like rings that you see in the first structure, you're not
> going to find it, even though it's there. Chemists do expect an aromatic
> SMARTS to find an aromatic ring, which is no doubt the secret reason for
> making pyrrole aromatic.
>
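The SMARTS problem can be reproduced with the two molecules from this thread.
This is a minimal sketch assuming RDKit's default aromaticity model; the two
query strings are illustrative choices, not something proposed in the thread.

```python
from rdkit import Chem

mol1 = Chem.MolFromSmiles("N12C=CC=CC1=NCC2")  # smiles1: written back in Kekule form
mol2 = Chem.MolFromSmiles("CN=C1C=CC=CN1C")    # smiles2: written back as aromatic

# Aromatic query: lower-case atoms only match atoms flagged as aromatic.
aromatic_ring = Chem.MolFromSmarts("c1ccccn1")
# Aromaticity-agnostic query: atomic numbers only, any bond type (~).
any_ring = Chem.MolFromSmarts("[#6]1~[#6]~[#6]~[#6]~[#7]~[#6]~1")

for name, mol in (("mol1", mol1), ("mol2", mol2)):
    print(
        name,
        "aromatic query:", mol.HasSubstructMatch(aromatic_ring),
        "agnostic query:", mol.HasSubstructMatch(any_ring),
    )
```

Given the canonical SMILES reported in this thread, the aromatic query should
hit only mol2, while the agnostic query hits both. Writing queries with atomic
numbers and ~ bonds is one way to sidestep the perception difference, at the
cost of also matching saturated rings with the same atoms.
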
> I've never liked this situation, but it boils down to the fact that
> Daylight, which produced the original reference SMILES implementation,
> "done it that-a-way". It has the advantage of *stare decisis*.
>
> -P.
>
> P. S. By the way, if any of you have ever seen a SMILES generator that
> displays the 6-membered ring as aromatic in the first example, could you
> please tell us which one that is?
>
> On Fri, Nov 27, 2020 at 1:55 PM Paolo Tosco <paolo.tosco.m...@gmail.com>
> wrote:
>
>> (Now with link - you can tell it's Friday night)
>>
>> Hi Mark, Alexis,
>>
>> Yes, I was too fast in composing my previous reply and I did not pay
>> enough attention to the molecules.
>> After reading Alexis' reply, I looked more carefully at his original
>> question, and at that point I remembered having seen similar behaviour
>> from RDKit before on condensed ring systems featuring exocyclic bonds,
>> along with related mailing list discussions.
>> So I did a bit of searching and I fished out the (long) thread that deals
>> with exactly this behaviour.
>>
>>
>> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAAsqebGxOwJtH32T5jC%3DoBZN6G1JE_NwsEqKUO8%2BmUCqmABCzQ%40mail.gmail.com/#msg36448625
>>
>> I hope that helps, cheers
>> p.
>>
>> On Fri, Nov 27, 2020 at 7:31 PM Mark Mackey <m...@cresset-group.com>
>> wrote:
>>
>>> Hi Paolo,
>>>
>>>
>>>
>>> Hmmm, I think this is displaying a bug (or at the very least unexpected
>>> behaviour) in the aromaticity code. The issue isn’t the aromaticity of the
>>> imidazole/dihydroimidazole, but the aromaticity of the pyridyl. Alexis’
>>> second molecule is identical to the first except that one bond in the
>>> 5-membered ring was broken, and that (to my eyes at least) should not
>>> affect whether the 6-membered ring is seen as aromatic.
>>>
>>>
>>>
>>> Regards,
>>>
>>> Mark.
>>>
>>>
>>>
>>> *From:* Paolo Tosco <paolo.tosco.m...@gmail.com>
>>> *Sent:* 27 November 2020 17:04
>>> *To:* Alexis Parenty <alexis.parenty.h...@gmail.com>
>>> *Cc:* RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
>>> *Subject:* Re: [Rdkit-discuss] canonicalization of two aromatic
>>> molecules returning two different forms (kekule and aromatic)
>>>
>>>
>>>
>>> Hi Alexis,
>>>
>>>
>>>
>>> The second molecule (smiles2) is indeed aromatic, but the first (smiles1)
>>> is not, as the imidazole ring condensed to the pyridine is partially
>>> saturated.
>>>
>>> The smiles1a analogue where I have added a double bond is aromatic, and
>>> upon canonicalization it yields an aromatic SMILES as expected.
>>>
>>>
>>>
>>> Cheers,
>>>
>>> p.
>>>
>>>
>>>
>>> In [1]: from rdkit import Chem
>>>
>>> In [2]: mol1 = Chem.MolFromSmiles("N12C=CC=CC1=NCC2")
>>>
>>> In [3]: mol1
>>> Out[3]: [structure depiction not preserved]
>>>
>>> In [4]: smiles1 = Chem.MolToSmiles(mol1)
>>>
>>> In [5]: smiles1
>>> Out[5]: 'C1=CC2=NCCN2C=C1'
>>>
>>> In [6]: mol2 = Chem.MolFromSmiles("CN=C1C=CC=CN1C")
>>>
>>> In [7]: mol2
>>> Out[7]: [structure depiction not preserved]
>>>
>>> In [8]: smiles2 = Chem.MolToSmiles(mol2)
>>>
>>> In [9]: smiles2
>>> Out[9]: 'CN=c1ccccn1C'
>>>
>>> In [10]: mol1a = Chem.MolFromSmiles("N12C=CC=CC1=NC=C2")
>>>
>>> In [11]: mol1a
>>> Out[11]: [structure depiction not preserved]
>>>
>>> In [12]: smiles1a = Chem.MolToSmiles(mol1a)
>>>
>>> In [13]: smiles1a
>>> Out[13]: 'c1ccn2ccnc2c1'
>>>
>>>
>>>
>>> On Fri, Nov 27, 2020 at 5:09 PM Alexis Parenty <
>>> alexis.parenty.h...@gmail.com> wrote:
>>>
>>> Hi everyone,
>>>
>>>
>>>
>>> Why is it that when I canonicalize smiles1 below I get an unexpected
>>> Kekulé form, whereas when I canonicalize the similar smiles2 I get the
>>> expected aromatic form?
>>>
>>>
>>>
>>> from rdkit import Chem
>>>
>>> smiles1 = Chem.CanonSmiles("N12C=CC=CC1=NCC2")
>>> smiles1
>>>
>>> ==> 'C1=CC2=NCCN2C=C1'
>>>
>>>
>>>
>>> smiles2 = Chem.CanonSmiles("CN=C1C=CC=CN1C")
>>> smiles2
>>>
>>> ==> 'CN=c1ccccn1C'
>>>
>>>
>>>
>>> I would like to get the aromatic form in both cases... Is there a way to
>>> force the aromatic form?
>>>
>>>
>>>
>>> Best,
>>>
>>> Alexis
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>