Dear all,

We are using the latest RDKit release, 2018.09.1.

I have an MSc student who is working on some targets in DUDE.

If we take some specific molecules from there (e.g.
"C1=NN(C2=NC=NC(=O)[C@@H]21)[C@H]3[C@@H]([C@@H]([C@H](O3)CO)O)O"),
standardize them using rdMolStandardize.StandardizeSmiles(smiles), and then
generate conformers (with EmbedMultipleConfs and ETKDGv2), the conformer
generation step hangs.  If we omit the standardization step, conformer
generation works as expected.
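
For anyone who wants to try this without blocking their session, here is a
minimal sketch of how the two cases could be compared under a time limit.
It is not taken from our notebook: the names EMBED_SCRIPT, run_script and
compare_embedding are hypothetical, and the 10 conformers and 60-second
limit are arbitrary choices.  Each embedding runs in a child Python process
so that a hang shows up as a "timeout" status instead of freezing the caller.

```python
import subprocess
import sys

# Child script (assumes RDKit is installed in the same interpreter):
# optionally run StandardizeSmiles, add hydrogens, embed, print the count.
EMBED_SCRIPT = """
import sys
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.MolStandardize import rdMolStandardize

smiles, standardize = sys.argv[1], sys.argv[2] == "1"
if standardize:
    smiles = rdMolStandardize.StandardizeSmiles(smiles)
mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
conf_ids = AllChem.EmbedMultipleConfs(mol, 10, AllChem.ETKDGv2())
print(len(conf_ids))
"""


def run_script(script, args=(), timeout=60.0):
    """Run a Python snippet in a child process with a time limit.

    Returns (status, stdout), where status is "ok", "error" or "timeout".
    """
    cmd = [sys.executable, "-c", script, *args]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "error"
    return status, proc.stdout.strip()


def compare_embedding(smiles, timeout=60.0):
    """Embed the raw and the standardized SMILES, each under the time limit."""
    return {
        "raw": run_script(EMBED_SCRIPT, (smiles, "0"), timeout),
        "standardized": run_script(EMBED_SCRIPT, (smiles, "1"), timeout),
    }
```

Calling compare_embedding(smiles) on one of the problem molecules should then
report a status for each variant.  This only detects the hang; it says
nothing about its cause.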

Any clues as to what may be causing this?  My bet in the above example is
something to do with chirality, i.e. [C@@H].  Any hint on a possible

I'd also like to thank whoever worked on integrating the cleaning code
(MolVS) into RDKit.  This is such a critical, common task that it is
great to have something out of the box to do it.

We have an example jupyter notebook which highlights the problem here:

Also, a list of other molecules which exhibit this same behaviour (just the
ones we came across, as we only looked at a small subset of DUDE targets):

Adenosine A2a receptor (GPCR)/ 28499(
C1=CC2=c3nn/c(=N\N=C\[C@@H]4C=CC=N4)[nH]c3=N[C@@H]2C=C1 )
Adenosine A2a receptor (GPCR)/ 9903(
Adenosine A2a receptor (GPCR)/ 23728(
Progesterone Receptor/ 14194(
Cc1ccc(S(=O)(=O)C(Sc2ccccc2)=S=NC23CC4CC(CC(C4)C2)C3)cc1 )
Progesterone Receptor/ 14821(
Cc1ccc(N2C(=O)[C@@H]3[C@@H]4C[C@H]5[C@H](O[C@@]2(C(C)C)[C@@H]53)[C@@H]4O)c(C)c1 )
Adenosine A2a receptor (GPCR)/ 4014(
CCc1ccc2c(c1)=C1N=[NH+]C(N/N=C/c3cc(OC)ccc3OC)=N[C@@H]1[NH+]=2 )
Progesterone Receptor/ 61(
CC(C)=C/C=C1\Oc2ccc(F)cc2-c2ccc3c(c21)C(C)=CC(C)(C)N3 )
Progesterone Receptor/ 67(
CC1=CC(C)(C)Nc2ccc3c(c21)C(=C1SCCCS1)Oc1ccc(F)cc1-3 )
Adenosine A2a receptor (GPCR)/ 29753(
Adenosine A2a receptor (GPCR)/ 14471(
CCc1nn2c(c1-c1ccc(Cl)cc1)NC=C1C(=O)N(c3ncn[nH]3)C=C[C@@H]12 )
Adenosine A2a receptor (GPCR)/ 2411(
Cc1ccc2c(c1)=C1N=NC(N/N=C/c3cc(O)c(O)c(Br)c3)=[NH+][C@H]1[NH+]=2 )
HIVPR/ 21585( 
Adenosine A2a receptor (GPCR)/ 13221(
Cc1nn2c(c1-c1ccc(F)cc1)NC=C1C(=O)N(c3ncn[nH]3)C=C[C@H]12 )
Leukotriene A4 hydrolase (Protease)/ 8094(
C[C@@H]([NH3+])[C@@H]1[C@H]2CC[C@H]3C[C@H](C2)C[C@@H]31 )
Leukotriene A4 hydrolase (Protease)/ 4803(
CC1=N[C@@H]2C=C(OC[C@@H]3CCN(C(=O)OC(C)(C)C)C3)C=C[C@H]2S1 )
Adenosine A2a receptor (GPCR)/ 8106(
O=CC1=CN=C2C=CC(c3cccc([N+](=O)[O-])c3)=C[C@H]12 )
Thymidine kinase/ 2696(
CC(=O)O[C@H]1CC[C@@]2(COS(C)(=O)=O)[C@@H]3CC[C@]4(C)C5=C(O)C(=O)CO[C@@]4(CC5)[C@@H]3CC[C@]2(O)C1 )
Adenosine A2a receptor (GPCR)/ 5075(
Cc1ccc2c(c1)=C1N=[NH+]C(N/N=C/c3cc(Br)c(O)c(O)c3Br)=N[C@@H]1[NH+]=2 )
Adenosine A2a receptor (GPCR)/ 22643(
COc1cc([N+](=O)[O-])ccc1NC(=O)[C@@H]1[C@@H]2C[C@@H]3OC(=O)[C@@H]1[C@@H]3C2 )
Progesterone Receptor/ 182(
CC1=CC(C)(C)Nc2ccc3c(c21)/C(=C/C1CCCCC1)Oc1ccc(F)cc1-3 )

Many thanks for your attention.  We look forward to hearing any insights
about this issue.

Rdkit-discuss mailing list
