Hi Christos, thank you so much!

Your approach is much simpler and quicker than what I had, and it now works
with polycyclic compounds. I did try your approach at first, but ChemDraw
could not draw the SMARTS I was creating with the "a" labels, so I assumed
I was doing something wrong and that the only way was to use the more
complicated “:[*]” notation. Your script produces valid SMARTS even if
ChemDraw does not recognize them. You saved me a lot of time.
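
For anyone finding this thread in the archive: the gist itself is not
reproduced here, but the basic idea of swapping aromatic atoms for "a" can be
sketched with plain string handling. The snippet below is only an illustration
under my own assumptions, not Christos's script: the function name
generalise_aromatic_atoms is made up, it uses a regular expression rather than
RDKit, it only knows the aromatic symbols c, n, o, s, p, se and as, and it
collapses simple bracket atoms such as [nH] to "a" while leaving every other
bracket atom (for example [*] or [Sc]) untouched.

```python
import re

# Bracket atoms are matched first so that symbols such as [Sc] or [Co] are
# never split apart; bare aromatic atoms are matched second.
_TOKEN = re.compile(r"\[[^\]]*\]|[cnosp]")

# Simple aromatic bracket atoms such as [nH] or [cH]. Anything richer
# (charges, recursive SMARTS, atom maps, ...) is deliberately left alone.
_AROMATIC_BRACKET = re.compile(r"\[(?:c|n|o|s|p|se|as)(?:H\d?)?\]")


def generalise_aromatic_atoms(smarts):
    """Replace aromatic atoms in a SMILES-like SMARTS string with 'a'.

    Hypothetical helper for illustration only; it does not parse SMARTS
    and does not check that the chemistry is correct.
    """
    def swap(match):
        token = match.group(0)
        if token.startswith("["):
            # Only collapse bracket atoms that are plainly aromatic.
            return "a" if _AROMATIC_BRACKET.fullmatch(token) else token
        return "a"

    return _TOKEN.sub(swap, smarts)


print(generalise_aromatic_atoms("[*]c1coc(Cl)n1"))
print(generalise_aromatic_atoms("[*]c1coc(Cl)c1Cc2ccccc2"))
```

Since an omitted bond in SMARTS means "single or aromatic", the "a" atoms can
sit next to each other and next to the ring-closure digits without any ":"
bookkeeping. The output should still be checked with a real SMARTS parser
(for example RDKit's Chem.MolFromSmarts) before it is used.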

Thanks again,

Alexis

On 19 May 2017 at 14:38, Christos Kannas <chriskan...@gmail.com> wrote:

> Hi Alexis,
>
> In SMARTS you can define an aromatic atom with "a".
> So I'm thinking that something like the following might produce more
> correct generalised SMARTS patterns.
>
> https://gist.github.com/CKannas/7a9e2768461260461155257fd30c2152
>
> *Note: Please check if the chemistry is correct.*
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Researcher
> Ph.D Student
>
>
> On 19 May 2017 at 12:52, Alexis Parenty <alexis.parenty.h...@gmail.com>
> wrote:
>
>> Hi everyone,
>>
>>
>> I need a function that can generalize all the aromatic rings in a SMARTS:
>>
>> [image: Inline images 1]
>>
>>
>> I have noticed that it is possible to rearrange most SMARTS strings
>> into a general aromatic SMARTS string by following these simple rules:
>>
>> 1. Replace every lowercase (aromatic) atom symbol in the SMARTS string
>>    with “:[*]”.
>>
>> 2. Fix the two ring-closure junctions of each cycle:
>>
>>    a. Where a digit (1-9) appears for the first time in the string, insert
>>       a colon after the digit (for example, “[*]1” becomes “[*]1:”).
>>
>>    b. Where the same digit appears a second time, move the colon
>>       before the digit (for example, “[*]1:” becomes “[*]:1”).
>>
>>
>> I have written a function (see below) that works fine with any SMARTS
>> containing a single aromatic ring. But it gets buggy when I have a
>> SMARTS with more than one aromatic ring:
>>
>>
>>
>> [image: Inline images 2]
>>
>>
>>
>> def get_aromatic_generalised_smarts(smarts):
>>    for arom_atom in ("c", "o", "n", "s"):
>>       smarts = smarts.replace(arom_atom, "x")
>>    # take care of explicit hydrogen atoms
>>    smarts = smarts.replace("[xH]", "x")
>>
>>    for char in smarts:
>>       if char == 'x':
>>          smarts = smarts.replace(char, ":[*]")
>>
>>    for char in smarts:
>>       if char.isdigit():
>>          if ("[*]"+char) in smarts:
>>             # this makes the second cycle junction OK but introduces an
>>             # error in the first cycle junction, corrected just below
>>             for cycle_junction in ("[*]1", "[*]2", "[*]3", "[*]4", "[*]5",
>>                                    "[*]6", "[*]7", "[*]8", "[*]9"):
>>                smarts = smarts.replace(cycle_junction,
>>                                        "[*]:" + cycle_junction[-1])
>>             # correct the first cycle junction
>>             smarts = smarts.replace(":[*]:"+char, "[*]"+char, 1)
>>             break
>>    return smarts
>>
>>
>> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)n1"))
>> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1"))
>>
>> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1Cc2ccccc2"))
>>
>>
>> Am I heading in the right direction? I can't get my head around SMARTS
>> with more than one aromatic ring...
>>
>> Maybe regular expressions would be more appropriate? Maybe there is an
>> RDKit function that does the trick from a mol object?
>>
>>
>> Thanks,
>>
>>
>> Alexis
>>
>>
>>
>>
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>