Hi everyone,

I need a function that generalises every aromatic ring in a SMARTS string:

[image: Inline images 1]


I have noticed that most SMARTS strings can be rearranged into a general
aromatic SMARTS string by following these simple rules:

1    Replace every lowercase (aromatic) atom symbol in the SMARTS string with “:[*]”.

2    Fix the two ring junctions of the SMARTS:

a.   Where a digit (1-9) appears for the first time in the string, insert a
colon after the digit (for example “[*]1” becomes “[*]1:”).

b.   Where the same digit appears a second time, move the colon
before the digit (for example “[*]1:” becomes “[*]:1”).


I have written a function (see below) that works fine with any SMARTS
containing a single aromatic ring, but it breaks when the SMARTS has more
than one aromatic ring:



[image: Inline images 2]



def get_aromatic_generalised_smarts(smarts):
   # Mark every aromatic atom with a placeholder.
   for arom_atom in ("c", "o", "n", "s"):
      smarts = smarts.replace(arom_atom, "x")
   smarts = smarts.replace("[xH]", "x")  # take care of explicit hydrogen atoms

   # Rule 1: every aromatic atom becomes ":[*]".
   smarts = smarts.replace("x", ":[*]")

   # Rule 2: fix the ring junctions.
   for char in smarts:
      if char.isdigit():
         if ("[*]" + char) in smarts:
            for cycle_junction in ("[*]1", "[*]2", "[*]3", "[*]4", "[*]5",
                                   "[*]6", "[*]7", "[*]8", "[*]9"):
               # This makes the second ring junction correct but breaks the
               # first one, which is repaired just below.
               smarts = smarts.replace(cycle_junction,
                                       "[*]:" + cycle_junction[-1])
            # Repair the first ring junction (first occurrence only).
            smarts = smarts.replace(":[*]:" + char, "[*]" + char, 1)
            break
   return smarts


print(get_aromatic_generalised_smarts("[*]c1coc(Cl)n1"))
print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1"))

print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1Cc2ccccc2"))


Am I heading in the right direction? I can't get my head around SMARTS
with more than one aromatic ring...

Maybe regular expressions would be more appropriate? Maybe there is an
RDKit function that does the trick from a mol object?
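
One way the regular-expression idea could look is sketched below. It is only
a sketch under stated assumptions, not an RDKit feature: the names TOKEN and
generalise_aromatic_smarts are made up for illustration, the tokenizer only
copes with simple SMARTS (no recursive $(...), no nested square brackets), and
an unwritten bond between two aromatic atoms is always treated as aromatic,
which would be wrong for a biaryl written without an explicit "-". Instead of
patching the string after the fact, it walks the tokens once and remembers,
for each ring label, whether the ring was opened on an aromatic atom.

```python
import re

# Hypothetical tokenizer for simple SMARTS only: no recursive $(...) and no
# nested square brackets.
TOKEN = re.compile(
    r"(?P<bracket>\[[^\]]*\])"       # bracket atom such as [*], [nH], [C@@H]
    r"|(?P<aromatic>[cnos])"          # bare aromatic atom
    r"|(?P<aliphatic>Cl|Br|[A-Z*])"   # bare aliphatic atom or wildcard
    r"|(?P<ring>%\d\d|\d)"            # ring-closure label
    r"|(?P<open>\()|(?P<close>\))"    # branch delimiters
    r"|(?P<bond>[-=#:~/\\@!])"        # explicit bond symbol
    r"|(?P<other>.)"                  # anything else is copied through
)


def generalise_aromatic_smarts(smarts):
    out = []
    prev_aromatic = False  # is the atom the next token attaches to aromatic?
    explicit_bond = False  # was a bond symbol written since the last atom/label?
    branch_stack = []      # prev_aromatic saved at each "("
    ring_opened = {}       # ring label -> was the opening atom aromatic?

    for match in TOKEN.finditer(smarts):
        kind, text = match.lastgroup, match.group()
        if kind in ("bracket", "aromatic", "aliphatic"):
            # A bracket atom counts as aromatic if it starts with c, n, o or s.
            aromatic = kind == "aromatic" or (
                kind == "bracket" and text[1] in "cnos")
            if aromatic:
                if prev_aromatic and not explicit_bond:
                    out.append(":")  # aromatic bond to the previous ring atom
                text = "[*]"
            out.append(text)
            prev_aromatic, explicit_bond = aromatic, False
        elif kind == "ring":
            if text in ring_opened:
                # Second occurrence: close the ring with an aromatic bond
                # when both junction atoms are aromatic.
                opened_on_aromatic = ring_opened.pop(text)
                if opened_on_aromatic and prev_aromatic and not explicit_bond:
                    out.append(":")
            else:
                ring_opened[text] = prev_aromatic  # first occurrence
            out.append(text)
            explicit_bond = False
        elif kind == "open":
            branch_stack.append(prev_aromatic)
            out.append(text)
        elif kind == "close":
            prev_aromatic = branch_stack.pop()  # back to the branch-point atom
            out.append(text)
        else:
            if kind == "bond":
                explicit_bond = True
            elif text == ".":
                prev_aromatic = False  # disconnected component starts fresh
            out.append(text)
    return "".join(out)


print(generalise_aromatic_smarts("[*]c1coc(Cl)n1"))
# [*][*]1:[*]:[*]:[*](Cl):[*]:1
print(generalise_aromatic_smarts("[*]c1coc(Cl)c1Cc2ccccc2"))
# [*][*]1:[*]:[*]:[*](Cl):[*]:1C[*]2:[*]:[*]:[*]:[*]:[*]:2
```

Because each ring label is tracked separately, the second ring no longer
inherits the fix-up meant for the first one. Whether this is preferable to
working from a mol object is still an open question for me.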


Thanks,


Alexis
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
