Yeah, this is exactly the case where using qmol_from_ctab() should help.

Below is a short example demonstrating this by querying my local ChEMBL
instance. Notice that the first form of the query, which uses
mol_from_ctab(), matches what you describe: the results include amides,
esters, etc. The second query, which uses qmol_from_ctab(), only returns
molecules which contain an aldehyde.

I hope this helps,
-greg

chembl_28=# select * from rdk.mols where m@>mol_from_ctab('aldehyde query
  MJ192500

  4  3  0  0  0  0  0  0  0  0999 V2000
   -2.8123    1.5508    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5267    1.1383    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2412    1.5508    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5267    0.3133    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  0  0  0  0
  2  4  2  0  0  0  0
  2  3  1  0  0  0  0
M  END
') limit 5;
 molregno |                               m
----------+----------------------------------------------------------------
   310993 | O=C(NO)c1cc(CS(=O)(=O)c2ccc(Cl)cc2)on1
   310992 | O=C(NO)c1cc(CS(=O)(=O)c2cccc(Cl)c2)on1
   318822 | CCC(NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C#N)c1ccccc1
   310016 | O=C(CCNC(=O)c1ccccc1)NC1CCN(Cc2ccc(Cl)cc2)C1
   319381 | CCOC(=O)/C=C/c1ccc(CN(C(=O)C2CCCCC2)c2cccc(/C=C/C(=O)OC)c2)cc1
(5 rows)

chembl_28=# select * from rdk.mols where m@>qmol_from_ctab('aldehyde query
  MJ192500

  4  3  0  0  0  0  0  0  0  0999 V2000
   -2.8123    1.5508    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5267    1.1383    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2412    1.5508    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5267    0.3133    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  0  0  0  0
  2  4  2  0  0  0  0
  2  3  1  0  0  0  0
M  END
') limit 5;
 molregno |                               m
----------+----------------------------------------------------------------
   284772 | COC(=O)NC1[C@H](C)O[C@@H](O[C@H]2C/C=C(\C)[C@@H]3C=C[C@@H]4[C@@H](O)[C@@H](C)C[C@H](C)[C@H]4[C@]3(C)/C(O)=C3\C(=O)O[C@]4(CC(C=O)=C[C@H](OC(C)=O)[C@H]4/C=C\2C)C3=O)CC1(C)[N+](=O)[O-]
   284633 | COC(=O)NC1[C@H](C)O[C@@H](O[C@H]2C/C=C(\C)[C@@H]3C=C[C@@H]4[C@@H](O[C@H]5CCCCO5)[C@@H](C)C[C@H](C)[C@H]4[C@]3(C)/C(O)=C3\C(=O)O[C@]4(CC(C=O)=C[C@H](OC(C)=O)[C@H]4/C=C\2C)C3=O)CC1(C)[N+](=O)[O-]
   284865 | COC(=O)NC1[C@H](C)O[C@@H](O[C@H]2C/C=C(\C)[C@@H]3C=C[C@@H]4[C@@H](OCc5ccc(OC)cc5)[C@@H](C)C[C@H](C)[C@H]4[C@]3(C)/C(O)=C3\C(=O)O[C@]4(CC(C=O)=C[C@H](OC(C)=O)[C@H]4/C=C\2C)C3=O)CC1(C)[N+](=O)[O-]
   299586 | CC1(C)C2CC[C@]3(C)C(CC=C4C5CC(C)(C)[C@@H](OC(=O)c6ccccc6)[C@H](OC(=O)/C=C/c6ccccc6)[C@]5(C=O)[C@H](O)C[C@]43C)[C@@]2(C)CC[C@@H]1O
   317613 | Cn1cncc1C=O
(5 rows)
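
To make the difference between the two queries a bit more concrete, here is
a rough sketch in plain Python. It is not RDKit code and not how the
cartridge is implemented; the helper names parse_v2000() and
explicit_h_counts() are made up for this illustration. My understanding is
that ordinary molecule parsing drops the explicit H by default, which leaves
a generic C-C=O pattern that amides, esters and ketones all contain, while a
query molecule can keep the information that the carbonyl carbon carries an
H. The sketch just reads the mol block above and shows both views.

```python
# Illustration only: a tiny V2000 reader using the standard library.
# It splits on whitespace, which is enough for this small block; real
# V2000 files use fixed-width columns and need a proper parser.

MOLBLOCK = """aldehyde query
  MJ192500

  4  3  0  0  0  0  0  0  0  0999 V2000
   -2.8123    1.5508    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5267    1.1383    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.2412    1.5508    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -3.5267    0.3133    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  0  0  0  0
  2  4  2  0  0  0  0
  2  3  1  0  0  0  0
M  END
"""


def parse_v2000(molblock):
    """Return (symbols, bonds); bonds are (begin, end, order), 1-based."""
    lines = molblock.splitlines()
    # Locate the counts line instead of assuming it is always line 4.
    start = next(i for i, line in enumerate(lines)
                 if line.rstrip().endswith("V2000"))
    fields = lines[start].split()
    n_atoms, n_bonds = int(fields[0]), int(fields[1])
    atom_lines = lines[start + 1:start + 1 + n_atoms]
    bond_lines = lines[start + 1 + n_atoms:start + 1 + n_atoms + n_bonds]
    symbols = [line.split()[3] for line in atom_lines]
    bonds = [tuple(int(x) for x in line.split()[:3]) for line in bond_lines]
    return symbols, bonds


def explicit_h_counts(symbols, bonds):
    """Count explicit H neighbours of every non-H atom (1-based index)."""
    counts = {i: 0 for i, s in enumerate(symbols, start=1) if s != "H"}
    for a, b, _order in bonds:
        for heavy, other in ((a, b), (b, a)):
            if symbols[heavy - 1] != "H" and symbols[other - 1] == "H":
                counts[heavy] += 1
    return counts


symbols, bonds = parse_v2000(MOLBLOCK)

# What is left if the explicit H is simply dropped: C-C=O and nothing else.
heavy_bonds = [(a, b, o) for a, b, o in bonds
               if "H" not in (symbols[a - 1], symbols[b - 1])]

# What a query can use instead: atom 2 (the carbonyl C) has one explicit H.
h_counts = explicit_h_counts(symbols, bonds)

print("atoms:", symbols)
print("bonds without H:", heavy_bonds)
print("explicit H per heavy atom:", h_counts)
```

With the H dropped, nothing distinguishes the aldehyde carbon from any other
carbonyl carbon, which fits the amide and ester hits in the first result
set. Keeping the H count on atom 2 is what narrows the search to aldehydes.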



On Tue, Jul 20, 2021 at 11:55 PM Webster Homer <
webster.ho...@milliporesigma.com> wrote:

> I should have included the query. It looks like RDKit is ignoring the H
> atom.
>
> The user put in an explicit H.
>
> ===========MOL file after this
> aldehyde query
>   MJ192500
>
>   4  3  0  0  0  0  0  0  0  0999 V2000
>    -2.8123    1.5508    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -3.5267    1.1383    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -4.2412    1.5508    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -3.5267    0.3133    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
>   2  1  1  0  0  0  0
>   2  4  2  0  0  0  0
>   2  3  1  0  0  0  0
> M  END
> =================MOL file above this
>
>
>
>
>
> *From:* Greg Landrum <greg.land...@gmail.com>
> *Sent:* Friday, July 16, 2021 11:38 PM
> *To:* Webster Homer <webster.ho...@milliporesigma.com>
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Substructure search for an aldehyde
> returns ketones and acids
>
>
>
>
>
> Hi Webster,
>
>
>
> Without seeing an actual query I am inclined to believe that it’s not a
> bug. The problem is more likely a query which has not been drawn explicitly
> or an easily made mistake in the way the cartridge is being used.
>
>
>
> Assuming that the aldehyde queries have been drawn with an explicit H atom
> connected to the C (apologies for not showing this, I’m on my phone and
> don’t have a sketcher available), you should be calling the cartridge
> function qmol_from_ctab(), not mol_from_ctab(), before doing the query.
> qmol_from_ctab() will use the H to help define the query.
>
>
>
> If you’re doing this and still seeing incorrect search results, please
> share a query and the way you’re doing the search and we can try to help
> (or diagnose the bug if there is one)
>
>
>
> Best,
>
> -greg
>
>
>
>
>
> On Fri, 16 Jul 2021 at 17:53, Webster Homer <
> webster.ho...@milliporesigma.com> wrote:
>
> We use the RDKit PostgreSQL cartridge as our substructure searcher. When a
> user sketches an aldehyde and submits the mol file as the query, RDKit
> returns aldehydes, but also returns ketones and acids. Is this a bug?
>
>
>
> This message and any attachment are confidential and may be privileged or
> otherwise protected from disclosure. If you are not the intended recipient,
> you must not copy this message or attachment or disclose the contents to
> any other person. If you have received this transmission in error, please
> notify the sender immediately and delete the message and any attachment
> from your system. Merck KGaA, Darmstadt, Germany and any of its
> subsidiaries do not accept liability for any omissions or errors in this
> message which may arise as a result of E-Mail-transmission or for damages
> resulting from any unauthorized changes of the content of this message and
> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
> subsidiaries do not guarantee that this message is free of viruses and does
> not accept liability for any damages caused by any virus transmitted
> therewith.
>
>
>
> Click merckgroup.com/disclaimer
> <https://www.merckgroup.com/en/legal-disclaimer/mail-disclaimer.html> to
> access the German, French, Spanish, Portuguese, Turkish, Polish and Slovak
> versions of this disclaimer.
>
>
>
> Please find our Privacy Statement information by clicking here
> merckgroup.com/en/privacy-statement.html
> <https://www.merckgroup.com/en/privacy-statement.html>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss