We are currently using Mol files, but they’re large, especially when URL-encoded. We 
would prefer SMILES and SMARTS for queries, but we would need to be able to 
normalize them.
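
To put a rough number on the size difference, here is a minimal Python sketch that compares the URL-encoded length of the molfile used later in this thread with a short query string. The SMARTS is hand-written and only assumed to describe roughly the same pattern as the molfile, and `encoded_length` is a hypothetical helper name.

```python
from urllib.parse import quote

# Molfile copied from the query shown later in this thread.
MOLFILE = """
  Mrv1810 11021905152D

  9  9  0  0  0  0            999 V2000
   -2.2782   -0.0547    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9927   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9927   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2782   -1.7047    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5637   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5637   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2782    0.7703    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8493   -0.0547    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8493   -1.7047    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  3  4  1  0  0  0  0
  4  5  2  0  0  0  0
  5  6  1  0  0  0  0
  1  6  2  0  0  0  0
  1  7  1  0  0  0  0
  6  8  1  0  0  0  0
  5  9  1  0  0  0  0
M  END
"""

# Assumption: a hand-written aromatic SMARTS for a six-membered carbon ring
# with three adjacent any-atom substituents; not generated by any sketcher.
SMARTS = "c1(*)cccc(*)c1*"


def encoded_length(text: str) -> int:
    """Return the length of text after percent-encoding every reserved character."""
    return len(quote(text, safe=""))


if __name__ == "__main__":
    print("molfile raw:", len(MOLFILE), "encoded:", encoded_length(MOLFILE))
    print("SMARTS  raw:", len(SMARTS), "encoded:", encoded_length(SMARTS))
```

Every space and newline in the molfile becomes a three-character escape, which is why the encoded form grows so much more than the compact line notation.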

As to sketchers, Marvin JS seems to do a pretty good job with SMARTS.

Webster

From: Greg Landrum <greg.land...@gmail.com>
Sent: Friday, November 01, 2019 11:21 PM
To: Webster Homer <webster.ho...@milliporesigma.com>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] SMARTS Query Normalization?

Hi Webster,

That's a really good question.
At the moment there isn't any way to do SMARTS normalization. The assumption 
throughout the code is that if you've gone to the trouble to create a SMARTS 
then you captured the aromaticity that you intend to search for there. I think 
your use case makes sense though, so this would be an interesting thing for us 
to take a look at for a future release.

What you might be able to do in the meantime, and what I usually suggest when 
coming from a chemical sketcher, is to get an MDL molfile from the sketcher and 
then use that to do your queries. You can use mol_from_ctab() in the cartridge 
along with mol_adjust_query_properties:

chembl_25=# select * from rdk.mols where 
m@>mol_adjust_query_properties(mol_from_ctab('
  Mrv1810 11021905152D

  9  9  0  0  0  0            999 V2000
   -2.2782   -0.0547    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9927   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9927   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2782   -1.7047    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5637   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5637   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2782    0.7703    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8493   -0.0547    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8493   -1.7047    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  3  4  1  0  0  0  0
  4  5  2  0  0  0  0
  5  6  1  0  0  0  0
  1  6  2  0  0  0  0
  1  7  1  0  0  0  0
  6  8  1  0  0  0  0
  5  9  1  0  0  0  0
M  END
')) limit 5;

The chemical sketchers that I have tried tend to do a better job of generating 
queries in Mol files, and the RDKit deals with converting from kekule->aromatic 
form for you.
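
From application code, the same query could be issued with the molfile passed as a bind parameter instead of being pasted into the SQL text. This is a minimal Python sketch, assuming a psycopg2-style driver that uses `%s` placeholders. `build_substructure_query` is a hypothetical helper name, and the `::cstring` cast on the parameter is an assumption about what the cartridge function needs when it is given a bound value rather than a literal.

```python
from typing import Tuple


def build_substructure_query(molfile: str, limit: int = 5) -> Tuple[str, tuple]:
    """Build the SQL text and parameters for a molfile substructure search.

    The table (rdk.mols) and column (m) follow the example in this thread.
    """
    sql = (
        "select * from rdk.mols where "
        # Assumption: cast the bound text value to cstring for mol_from_ctab().
        "m@>mol_adjust_query_properties(mol_from_ctab(%s::cstring)) "
        "limit %s"
    )
    # The molfile travels as data, so its quotes and newlines need no escaping.
    return sql, (molfile, int(limit))


# Hypothetical usage with an open DB-API cursor `cur`:
#   sql, params = build_substructure_query(molfile_from_sketcher)
#   cur.execute(sql, params)
#   rows = cur.fetchall()
```

Keeping the molfile out of the SQL string also avoids problems with quote characters in sketcher output.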

Does that help?
-greg


On Thu, Oct 31, 2019 at 5:42 PM Webster Homer 
<webster.ho...@milliporesigma.com> wrote:
I am evaluating the RDKit PostgreSQL data cartridge for use as the 
back end of a web application. The app will use a JavaScript sketcher to allow 
the user to input a SMILES or SMARTS that will be sent to the RDKit cartridge. 
In evaluating RDKit I found that it doesn’t support aromatic normalization on 
SMARTS. As a test case I used Marvin JS to generate a SMARTS:
C(=CN=C1)C(=C1N2)N=C2

Used it as a query:
select structure_id from rdk.mols where 
m@>mol_adjust_query_properties(mol_from_smarts('C(=CN=C1)C(=C1N2)N=C2'));
structure_id
--------------
(0 rows)
Not surprisingly, it had no hits. I then looked at the output of the 
mol_adjust_query_properties function:
select mol_adjust_query_properties(mol_from_smarts('C(=CN=C1)C(=C1N2)N=C2'));
mol_adjust_query_properties
-----------------------------
c1cc2ncnc2cn1

That looked good.
select structure_id from rdk.mols where 
m@>mol_adjust_query_properties(mol_from_smarts('c1cc2ncnc2cn1'));
structure_id
--------------
     30183725
(1 row)
But wait, there should be more hits!
select count(*) from rdk.mols where m@>'c1cc2ncnc2cn1'::qmol;
count
-------
    27
Then I tried this:
select structure_id from rdk.mols where 
m@>mol_adjust_query_properties(mol_from_smarts('c1cc2ncnc2cn1'),'{"adjustDegree":false}');
(27 rows)
OK, but what I really need to have work is this:
select structure_id from rdk.mols where 
m@>mol_adjust_query_properties(mol_from_smarts('C(=CN=C1)C(=C1N2)N=C2'),'{"adjustDegree":false}');
structure_id
--------------
(0 rows)
Which it does not. Is mol_adjust_query_properties misnamed? It doesn’t really 
seem to want a query. Am I missing an option? Unless I can make this work I 
don’t see how I can use RDKit in my application.

What am I missing? Or does RDKit just not allow for normalizing SMARTS?

Thanks
Webster Homer
This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, you 
must not copy this message or attachment or disclose the contents to any other 
person. If you have received this transmission in error, please notify the 
sender immediately and delete the message and any attachment from your system. 
Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not accept 
liability for any omissions or errors in this message which may arise as a 
result of E-Mail-transmission or for damages resulting from any unauthorized 
changes of the content of this message and any attachment thereto. Merck KGaA, 
Darmstadt, Germany and any of its subsidiaries do not guarantee that this 
message is free of viruses and does not accept liability for any damages caused 
by any virus transmitted therewith. Click http://www.merckgroup.com/disclaimer 
to access the German, French, Spanish and Portuguese versions of this 
disclaimer.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss