Hi Jim,

You can indeed enumerate all Kekulé structures for a molecule within the RDKit using Chem.ResonanceMolSupplier():

from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc1')

suppl = Chem.ResonanceMolSupplier(mol, Chem.KEKULE_ALL)

len(suppl)

2

for i in range(len(suppl)):
    print(Chem.MolToSmiles(suppl[i], kekuleSmiles=True))

C1C=CC=CC=1
C1=CC=CC=C1
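
If it helps, the same calls can be wrapped in a small function. This is only a sketch: the name kekule_smiles is made up for illustration and is not part of the RDKit; it simply repeats the ResonanceMolSupplier / KEKULE_ALL steps shown above for any input SMILES.

```python
from rdkit import Chem


def kekule_smiles(smiles):
    """Return one Kekule SMILES per structure found by ResonanceMolSupplier.

    Hypothetical helper (not an RDKit function); it reuses the calls
    from the example above.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    suppl = Chem.ResonanceMolSupplier(mol, Chem.KEKULE_ALL)
    # Index access mirrors the loop above.
    return [Chem.MolToSmiles(suppl[i], kekuleSmiles=True)
            for i in range(len(suppl))]


benzene_forms = kekule_smiles('c1ccccc1')
print(benzene_forms)
```

For benzene this should give the two structures listed above; for other molecules the number of structures depends on the conjugated system.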

Best,
Paolo

On 09/11/2017 05:22 PM, James T. Metz via Rdkit-discuss wrote:
Greg,

Thanks! Yes, very helpful. I will need to digest the detailed information
you have provided.  I am somewhat familiar with recursive SMARTS.  Thanks
again.

    Regards,
    Jim Metz




-----Original Message-----
From: Greg Landrum <greg.land...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Mon, Sep 11, 2017 11:15 am
Subject: Re: [Rdkit-discuss] how to output multiple Kekule structures


On Mon, Sep 11, 2017 at 5:55 PM, James T. Metz <jamestm...@aol.com> wrote:

    Greg,

        I need to be able to use SMARTS patterns to identify substructures
    in molecules that can be aromatic, and I need to be able to handle
    cases where there can be differences in the way that the molecule
    was entered or drawn by a user.


That particular problem is a big part of the reason that we tend to use the aromatic representation of things.

        For example, consider the following alkenyl-substituted
    pyridine; there are two possible Kekule structures:

        m1 = 'C=CC1=NC=CC=C1'
        m2 = 'C=CC1N=CC=CC1'


Fixing what I assume is a typo for m2, I can do the following:

In [11]: m1 = Chem.MolFromSmiles('C=CC1=NC=CC=C1')

In [12]: m2 = Chem.MolFromSmiles('C=CC1N=CC=CC=1')

In [13]: q1 = Chem.MolFromSmarts('cccc')

In [14]: q2 = Chem.MolFromSmarts('cccn')

In [15]: list(m1.GetSubstructMatch(q1))
Out[15]: [2, 7, 6, 5]

In [16]: list(m1.GetSubstructMatch(q2))
Out[16]: [6, 5, 4, 3]

In [17]: list(m2.GetSubstructMatch(q1))
Out[17]: [2, 7, 6, 5]

In [18]: list(m2.GetSubstructMatch(q2))
Out[18]: [6, 5, 4, 3]
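
A quick way to see why m1 and m2 give identical matches: once parsed, both Kekule inputs are perceived as the same aromatic molecule, so they produce the same canonical SMILES. A minimal sketch, assuming the corrected m2 SMILES used above (the variable names here are just for illustration):

```python
from rdkit import Chem

# The two Kekule drawings of the same vinylpyridine, as parsed above
# (second SMILES includes the ring-closure double bond fix).
kekule_inputs = ['C=CC1=NC=CC=C1', 'C=CC1N=CC=CC=1']

# MolFromSmiles perceives aromaticity during sanitization, so the
# canonical SMILES no longer depends on which Kekule form was drawn.
canonical = [Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in kekule_inputs]
same_molecule = len(set(canonical)) == 1

print(canonical)
print(same_molecule)
```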

Those particular queries were going for the aromatic species and will only match inside the ring, but if you want to be more generic you could tune your queries like this:

In [28]: q3 = Chem.MolFromSmarts('[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]')

In [29]: q4 = Chem.MolFromSmarts('[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#7;$([#7]=,:[*])]')

In [30]: list(m1.GetSubstructMatch(q3))
Out[30]: [0, 1, 2, 7]

In [31]: list(m1.GetSubstructMatch(q4))
Out[31]: [0, 1, 2, 3]

In [32]: list(m2.GetSubstructMatch(q3))
Out[32]: [0, 1, 2, 7]

In [33]: list(m2.GetSubstructMatch(q4))
Out[33]: [0, 1, 2, 3]

If you aren't familiar with recursive SMARTS, this construct: "[#6;$([#6]=,:[*])]" means "a carbon that has either a double bond or an aromatic bond to another atom". So you can interpret q3 as "four carbons that each have either a double or aromatic bond and that are connected to each other by single, double, or aromatic bonds".
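
To make the recursive piece concrete, here is a small sketch that uses only the single-atom query "[#6;$([#6]=,:[*])]" and lists which atoms it hits. The helper name matching_atoms is hypothetical, and ethane is included only as a saturated control:

```python
from rdkit import Chem

# Single-atom recursive SMARTS from the explanation above: a carbon that
# has a double or aromatic bond to some neighbouring atom.
unsat_carbon = Chem.MolFromSmarts('[#6;$([#6]=,:[*])]')


def matching_atoms(smiles):
    """Return sorted atom indices matching the query (illustrative helper)."""
    mol = Chem.MolFromSmiles(smiles)
    # Each match of a one-atom query is a 1-tuple of atom indices.
    return sorted(idx for (idx,) in mol.GetSubstructMatches(unsat_carbon))


hits_m1 = matching_atoms('C=CC1=NC=CC=C1')
hits_m2 = matching_atoms('C=CC1N=CC=CC=1')
hits_ethane = matching_atoms('CC')  # no double or aromatic bonds

print(hits_m1, hits_m2, hits_ethane)
```

Every carbon in the vinylpyridine matches regardless of which Kekule form was drawn, while the ring nitrogen (atom 3) is excluded by the #6 requirement and ethane gives no hits.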

Is this starting to approximate what you're looking for?
-greg




        Now consider two SMARTS

        pattern1 = '[C]=[C]-[C]=[C]'
        pattern2 = '[C]=[C]-[C]=[N]'

        I need to be able to detect the existence of each pattern in the
    molecule.

        If m1 is the only available generated Kekule structure, then
    pattern2 will be recognized.
        If m2 is the only available generated Kekule structure, then
    pattern1 will be recognized.

        Hence, I am getting different answers for the same input molecule
    just because it was drawn in different Kekule structures.

        Regards,
        Jim Metz





    -----Original Message-----
    From: Greg Landrum <greg.land...@gmail.com>
    To: James T. Metz <jamestm...@aol.com>
    Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
    Sent: Mon, Sep 11, 2017 10:31 am
    Subject: Re: [Rdkit-discuss] how to output multiple Kekule structures

    Hi Jim,

    The code currently has no way to enumerate Kekule structures. I
    don't recall this coming up in the past and, to be honest, it
    doesn't seem all that generally useful.

    Perhaps there's an alternate way to solve the problem; what are
    you trying to do?

    -greg


    On Mon, Sep 11, 2017 at 5:04 PM, James T. Metz via Rdkit-discuss
    <rdkit-discuss@lists.sourceforge.net> wrote:

        Hello,

            Suppose I read in an aromatic SMILES e.g., for benzene

            c1ccccc1

            I would like to generate the major canonical resonance forms
        and save the results as two separate molecules.  Essentially
        I am trying to generate

            m1 = 'C1=CC=CC-C1'
            m2 = 'C1C=CC=CC1'

            Can this be done in RDKit?  I have found a KEKULE_ALL
        option in the detailed documentation which seems to be what I
        am trying to do, but I don't understand how this option is to
        be used, or the proper syntax.

          If it is necessary to somehow renumber the atoms and re-generate
        Kekule structures, that is OK.  Thank you.

          Regards,
          Jim Metz







        
------------------------------------------------------------------------------
        Check out the vibrant tech community on one of the world's most
        engaging tech sites, Slashdot.org! http://sdm.link/slashdot
        _______________________________________________
        Rdkit-discuss mailing list
        Rdkit-discuss@lists.sourceforge.net
        https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




