Hi Susan,

that looks like a bug in the way the MDL query is parsed; I have filed it
here:
https://github.com/rdkit/rdkit/issues/4785

If you can afford to do some Python massaging of your CTAB queries,
converting them to SMARTS before submitting them to PostgreSQL when they
fail sanitization, the following should work:

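import re

from rdkit import Chem
from rdkit.Chem import rdqueries

# sm1, ctab_og and conn (the database connection) are assumed to be
# defined as in your notebook
mol_og = Chem.MolFromMolBlock(ctab_og, sanitize=False)
try:
    Chem.SanitizeMol(mol_og)
    # the CTAB sanitizes cleanly, so use it directly as the query
    cur = conn.cursor()
    cur.execute(f"""select mol_from_smiles('{sm1}') @>
                    qmol_from_ctab('{ctab_og}')""")
    rows = cur.fetchall()
    print(rows)
except Chem.AtomKekulizeException as e:
    if re.match(r"non-ring atom \d+ marked aromatic", str(e)):
        # ring information must be available before calling IsInRing()
        Chem.FastFindRings(mol_og)
        rwmol_og = Chem.RWMol(mol_og)
        # swap each aromatic non-ring atom for an "is aromatic" query atom
        for a in mol_og.GetAtoms():
            if a.GetIsAromatic() and not a.IsInRing():
                rwmol_og.ReplaceAtom(a.GetIdx(),
                                     rdqueries.IsAromaticQueryAtom())
        try:
            Chem.SanitizeMol(rwmol_og)
            # submit the patched query as SMARTS instead of CTAB
            smarts_og = Chem.MolToSmarts(rwmol_og)
            cur = conn.cursor()
            cur.execute(f"""select mol_from_smiles('{sm1}') @>
                            mol_from_smarts('{smarts_og}')""")
            rows = cur.fetchall()
            print(rows)
        except Exception:
            ...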
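
To decide up front which CTABs need the SMARTS route, the condition behind
the "non-ring atom N marked aromatic" error can also be checked without
RDKit. The following is only a minimal sketch using the standard library: it
assumes a well-formed V2000 block (fixed-width counts and bond lines, three
header lines), and the function names are hypothetical, not part of RDKit
or the cartridge. It lists the 0-based atoms that carry an aromatic
(type 4) bond but are not in any ring.

```python
from collections import deque

AROMATIC = 4  # V2000 bond type code for an aromatic bond


def parse_v2000_bonds(ctab):
    """Return (n_atoms, [(a, b, bond_type), ...]) with 0-based atom indices."""
    lines = ctab.splitlines()
    counts = lines[3]  # three header lines precede the counts line
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    first_bond = 4 + n_atoms
    bonds = []
    for line in lines[first_bond:first_bond + n_bonds]:
        a, b, btype = int(line[0:3]), int(line[3:6]), int(line[6:9])
        bonds.append((a - 1, b - 1, btype))
    return n_atoms, bonds


def is_ring_bond(n_atoms, bonds, skip):
    """A bond is in a ring if its ends stay connected once it is removed."""
    start, goal, _ = bonds[skip]
    neighbours = [[] for _ in range(n_atoms)]
    for i, (a, b, _) in enumerate(bonds):
        if i != skip:
            neighbours[a].append(b)
            neighbours[b].append(a)
    seen, queue = {start}, deque([start])
    while queue:
        atom = queue.popleft()
        if atom == goal:
            return True
        for nbr in neighbours[atom]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False


def aromatic_non_ring_atoms(ctab):
    """0-based indices of atoms with an aromatic bond that are in no ring."""
    n_atoms, bonds = parse_v2000_bonds(ctab)
    ring_atoms, aromatic_atoms = set(), set()
    for i, (a, b, btype) in enumerate(bonds):
        if is_ring_bond(n_atoms, bonds, i):
            ring_atoms.update((a, b))
        if btype == AROMATIC:
            aromatic_atoms.update((a, b))
    return sorted(aromatic_atoms - ring_atoms)


# The query CTAB from the original message, rebuilt line by line.
EXAMPLE_CTAB = "\n".join([
    "Boronate acid/ester(aryl)",
    "  SciTegic12012112112D",
    "",
    "  5  4  0  0  0  0            999 V2000",
    "    1.7243   -2.7324    0.0000 A   0  0",
    "    2.7559   -2.1456    0.0000 C   0  0",
    "    3.7808   -2.7324    0.0000 B   0  0",
    "    4.8057   -2.1456    0.0000 O   0  0",
    "    3.7808   -3.9190    0.0000 O   0  0",
    "  1  2  4  0  0  1  0",
    "  2  3  1  0",
    "  3  4  1  0",
    "  3  5  1  0",
    "M  END",
])

print(aromatic_non_ring_atoms(EXAMPLE_CTAB))  # prints [0, 1]
```

A non-empty result means the CTAB would hit the same sanitization error, so
it can be sent straight to the SMARTS conversion above.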

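Since the snippet above formats the SMILES and the CTAB straight into the
SQL text, a quote character anywhere in a CTAB (for example in its name
line) would break the statement. A possible alternative, sketched here under
the assumption that psycopg2-style %s placeholders are used and that the
cartridge functions accept the substituted literals (not tested against a
live database), is to pass the values as parameters. The helper name and
the example SMARTS are hypothetical.

```python
# Cartridge functions named in this thread; the mapping itself is illustrative.
QUERY_FUNCTIONS = {"ctab": "qmol_from_ctab", "smarts": "qmol_from_smarts"}


def build_match_query(smiles, query_text, query_kind="ctab"):
    """Return (sql, params) suitable for cursor.execute(sql, params)."""
    try:
        func = QUERY_FUNCTIONS[query_kind]
    except KeyError:
        raise ValueError(f"unknown query kind: {query_kind!r}") from None
    # Only the function name (taken from the fixed mapping above) is formatted
    # into the SQL; the SMILES and the query text travel as parameters.
    sql = f"select mol_from_smiles(%s) @> {func}(%s)"
    return sql, (smiles, query_text)


# "cB(O)O" is a made-up SMARTS used purely for illustration.
sql, params = build_match_query("OB(O)c1ccccc1", "cB(O)O", query_kind="smarts")
print(sql)  # prints: select mol_from_smiles(%s) @> qmol_from_smarts(%s)
# With a live connection (assumed): cur.execute(sql, params); cur.fetchall()
```
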
HTH, cheers
p.


On Thu, Dec 9, 2021 at 5:32 PM Susan Leung <susanhle...@gmail.com> wrote:

> Hi all,
>
>
>
> I am trying to do some substructure queries using the RDKit PostgreSQL
> cartridge. Specifically, my substructure query inputs are CTABs (not
> SMARTS), so I would like to use qmol_from_ctab. However, I have some
> problems making valid query molecules from a few CTABs.
>
>
>
> In this query, I try to use a CTAB to search for aryl boronate
> acid/ester. I can make an equivalent query using SMARTS, but the
> CTAB is not valid.
>
>
> As far as I am aware, there is no warning message when using the SQL
> functions, so I used MolFromMolBlock from Python and got "non-ring atom 0
> marked aromatic". If I change the aromatic bond type to a double bond,
> the CTAB can be read in (but that is not the query I want). I am guessing
> that there may be additional validity checks / sanitization steps when
> executing qmol_from_ctab vs qmol_from_smarts? As far as I can see, there is
> no flag for this in qmol_from_ctab.
>
>
>
> I describe the general problem below and also attach the ipynb (in case
> it is useful), which uses psycopg2 to run the SQL, leaving out the
> database connection credentials.
>
>
>
> Many thanks,
>
>
>
> Susan
>
> __________________________________
>
>
>
> For example, I want to match an aromatic boronic acid:
>
> sm1 = 'OB(O)c1ccccc1'
>
>
>
> But the following CTAB isn’t valid. MolFromMolBlock returns a "non-ring
> atom marked aromatic" error, so I suspect it has to do with that. Also,
> changing the bond marked aromatic ‘4’ to a double bond ‘2’ makes the CTAB
> valid.
>
> ctab_og = """Boronate acid/ester(aryl)
>   SciTegic12012112112D
>
>   5  4  0  0  0  0            999 V2000
>     1.7243   -2.7324    0.0000 A   0  0
>     2.7559   -2.1456    0.0000 C   0  0
>     3.7808   -2.7324    0.0000 B   0  0
>     4.8057   -2.1456    0.0000 O   0  0
>     3.7808   -3.9190    0.0000 O   0  0
>   1  2  4  0  0  1  0
>   2  3  1  0
>   3  4  1  0
>   3  5  1  0
> M  END
> > <Name>
> Boronate acid/ester(aryl)
>
> """
>
> ctab_fixed = """Boronate acid/ester(aryl)
>   SciTegic12012112112D
>
>   5  4  0  0  0  0            999 V2000
>     1.7243   -2.7324    0.0000 A   0  0
>     2.7559   -2.1456    0.0000 C   0  0
>     3.7808   -2.7324    0.0000 B   0  0
>     4.8057   -2.1456    0.0000 O   0  0
>     3.7808   -3.9190    0.0000 O   0  0
>   1  2  2  0  0  1  0
>   2  3  1  0
>   3  4  1  0
>   3  5  1  0
> M  END
> > <Name>
> Boronate acid/ester(aryl)
>
> """
>
> select is_valid_ctab('{ctab_og}')
>
>
>
> Returns False
>
> select is_valid_ctab('{ctab_fixed}')
>
> Returns True
>
>
>
> However, I can make a qmol from SMARTS that matches sm1. Is there a way
> of making the query CTAB valid so we don’t have to use SMARTS?
>
> select mol_from_smiles('{sm1}') @> qmol_from_ctab('{ctab_fixed}')
>
>
>
> Returns False
>
>
>
> select mol_from_smiles('{sm1}') @> qmol_from_smarts('{alt_smarts}')
>
>
>
> Returns True
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>