Hi Greg,

Thank you for your feedback.

The tests you mentioned worked fine for me, and both molecules are matched using the specified SMILES. The matching problem turned out to be really silly: I was expecting to match both molecules in the ChEMBL database I downloaded (i.e. CHEMBL1517804 and CHEMBL2442053), which are accessible through a search using the ChEMBL web interface. However, for some reason compound CHEMBL2442053 is not present in the downloadable database (and so obviously cannot be matched).
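
For anyone hitting the same thing, a quick way to confirm whether a compound is in the local copy at all is to look it up by ID before running any substructure search. This is only a minimal sketch, assuming the standard ChEMBL schema in which the molecule_dictionary table holds a chembl_id column; the table and column names may differ depending on how the dump was loaded:

```sql
-- Assumption: standard ChEMBL schema (molecule_dictionary.chembl_id).
-- Lists which of the two expected compounds exist in the local database.
select chembl_id
  from molecule_dictionary
 where chembl_id in ('CHEMBL1517804', 'CHEMBL2442053');
```

An ID that is missing from the result is simply not in the local database, so no substructure query can return it.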

Best regards,

Alfredo


On 31/05/2018 at 1:09, Greg Landrum wrote:
Hi Alfredo,

I can't think of any reason this would be true based on the molecules you provide.
Certainly each of the molecules has a substructure match:
chembl_23=# select 'CCc1ccc(-n2nc3ccc(NC(=O)c4ccc5c(c4)OCO5)cc3n2)cc1'::mol@>'c1c[nH]nn1'::mol;
 ?column?
----------
 t
(1 row)

chembl_23=# select 'COc1ncc(-c2ccc(N(Cc3ccsc3)C(=O)Cn3nnc4ccccc43)cc2)cn1'::mol@>'c1c[nH]nn1'::mol;
 ?column?
----------
 t
(1 row)

And if I put them in a small table, add an index, and search, I also get the expected results:
chembl_23=# create temporary table twomols (smiles text,m  mol);
CREATE TABLE
chembl_23=# insert into twomols values ('CCc1ccc(-n2nc3ccc(NC(=O)c4ccc5c(c4)OCO5)cc3n2)cc1', 'CCc1ccc(-n2nc3ccc(NC(=O)c4ccc5c(c4)OCO5)cc3n2)cc1'::mol);
INSERT 0 1
chembl_23=# insert into twomols values ('COc1ncc(-c2ccc(N(Cc3ccsc3)C(=O)Cn3nnc4ccccc43)cc2)cn1', 'COc1ncc(-c2ccc(N(Cc3ccsc3)C(=O)Cn3nnc4ccccc43)cc2)cn1'::mol);
INSERT 0 1
chembl_23=# select smiles from twomols where m@>'c1c[nH]nn1'::mol;
                        smiles
-------------------------------------------------------
 CCc1ccc(-n2nc3ccc(NC(=O)c4ccc5c(c4)OCO5)cc3n2)cc1
 COc1ncc(-c2ccc(N(Cc3ccsc3)C(=O)Cn3nnc4ccccc43)cc2)cn1
(2 rows)

chembl_23=# create index tidx on twomols using gist(m);
CREATE INDEX
chembl_23=# select smiles from twomols where m@>'c1c[nH]nn1'::mol;
                        smiles
-------------------------------------------------------
 CCc1ccc(-n2nc3ccc(NC(=O)c4ccc5c(c4)OCO5)cc3n2)cc1
 COc1ncc(-c2ccc(N(Cc3ccsc3)C(=O)Cn3nnc4ccccc43)cc2)cn1
(2 rows)

Can you please check to see if this simple test works for you?
To do more detailed troubleshooting, I will need to know which version of the cartridge you are using and on which operating system.
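
If it helps, both pieces of information can usually be read from the same psql session. This is a sketch rather than a definitive recipe: version() is a standard PostgreSQL function whose output normally includes the platform the server was built for, while rdkit_version() is assumed to be available in the installed cartridge:

```sql
-- PostgreSQL server version string (normally also names the build platform)
select version();

-- RDKit cartridge version; assumes the cartridge provides rdkit_version()
select rdkit_version();
```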

Best,
-greg



On Tue, May 29, 2018 at 8:00 PM Alfredo Quevedo <[email protected]> wrote:

    Dear user,

    I am trying to perform a substructure search using SMILES notation
    on the ChEMBL database I have already loaded into my PostgreSQL
    database. Here are two sample molecules in SMILES format, as read by
    the RDKit cartridge into the database:

    Molecule 1: CCc1ccc(-n2nc3ccc(NC(=O)c4ccc5c(c4)OCO5)cc3n2)cc1

    Molecule 2: COc1ncc(-c2ccc(N(Cc3ccsc3)C(=O)Cn3nnc4ccccc43)cc2)cn1


    Both molecules contain a triazole scaffold, and I am trying to
    select both compounds from the whole database using the following
    SMILES generated by RDKit for a triazole: 'c1c[nH]nn1'

    My problem is that the search matches molecule 1 but not
    molecule 2. What might be the problem? Since I am searching a
    database of compounds previously processed with the RDKit cartridge,
    shouldn't the substructure match?

    Thanks in advance for the help.

    regards

    Alfredo


    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org! http://sdm.link/slashdot
    _______________________________________________
    Rdkit-discuss mailing list
    [email protected]
    <mailto:[email protected]>
    https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

