Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-20 Thread Gianmarco Ghiandoni
Hi all,

Thanks for all your replies. I wonder what kind of logic could be put in
place to fix these mistakes algorithmically. Is there any way to set the
correct stereochemistry before embedding the structure?

Thanks,


Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-19 Thread Giovanni Tricarico
Indeed, the only two chemically valid configurations for this molecule seem to
be:

[inline image image001.png; not preserved in the archive]

Impressive that RDKit can detect this kind of contradiction.

G
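
One way to check this observation programmatically is to enumerate every stereo assignment and keep only those that RDKit can embed. The following is a minimal sketch and was not posted in the thread: it assumes the `tryEmbedding` option of `StereoEnumerationOptions` behaves as documented, and it starts from the thread's SMILES with the `@`/`@@` tags stripped (`C1NC2COC1C2`) so that both bridgehead centers are enumerated.

```python
from rdkit import Chem
from rdkit.Chem.EnumerateStereoisomers import (
    EnumerateStereoisomers,
    StereoEnumerationOptions,
)

# The thread's SMILES with the stereo tags removed (an assumption made here
# so that both bridgehead centers are treated as unassigned).
mol = Chem.MolFromSmiles("C1NC2COC1C2")

# tryEmbedding=True asks the enumerator to attempt a 3D embedding for each
# candidate assignment and to skip the candidates that fail.
opts = StereoEnumerationOptions(tryEmbedding=True, unique=True)
embeddable = sorted(
    Chem.MolToSmiles(iso) for iso in EnumerateStereoisomers(mol, options=opts)
)
print(embeddable)
```

If the observation above holds, only the two embeddable configurations should be printed. Note that this tells you which assignments are geometrically possible, not which one was intended in the original data.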

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-19 Thread Kangway Chuang (CHUANGK4) via Rdkit-discuss
Agree with Hao above. For the molecules provided, the second example is
geometrically inaccessible given the stereochemical constraints. In this
case, the expected behavior should be an unsuccessful embedding.

Kangway
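
If a failed embedding is the expected outcome for such inputs, a pipeline can treat the `-1` return value as a data-quality flag rather than an error. The sketch below is illustrative only: `partition_by_embedding` is a hypothetical helper name, and the seed of 11 is taken from the original post.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def partition_by_embedding(smiles_list, seed=11):
    """Split SMILES into (embedded, failed) using EmbedMolecule's return value."""
    embedded, failed = [], []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:  # unparsable SMILES is also counted as a failure
            failed.append(smi)
            continue
        mh = Chem.AddHs(mol)
        # EmbedMolecule returns the new conformer ID, or -1 on failure.
        if AllChem.EmbedMolecule(mh, randomSeed=seed) == -1:
            failed.append(smi)
        else:
            embedded.append(smi)
    return embedded, failed


embedded, failed = partition_by_embedding(
    ["C1N[C@@H]2CO[C@H]1C2", "C1N[C@@H]2CO[C@@H]1C2"]
)
print("embedded:", embedded)
print("failed:  ", failed)
```

The entries in `failed` can then be reviewed or sent to a repair step instead of silently dropped.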



Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-19 Thread Ling Chan
Keep trying with more random seeds?
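
A retry loop over several seeds could look like the sketch below. The helper name `embed_with_seeds` and the particular seed values are assumptions chosen for illustration; only seed 11 comes from the original post.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_with_seeds(smiles, seeds=(11, 42, 123, 2023)):
    """Return (seed, conformer_id) for the first seed that embeds, else None."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    for seed in seeds:
        conf_id = AllChem.EmbedMolecule(mol, randomSeed=seed)
        if conf_id != -1:
            return seed, conf_id
    return None


print(embed_with_seeds("C1N[C@@H]2CO[C@H]1C2"))
print(embed_with_seeds("C1N[C@@H]2CO[C@@H]1C2"))
```

Retrying helps when a valid structure is merely hard to embed. If the stereo assignment is geometrically impossible, as other replies in this thread suggest for the second molecule, changing the seed is not expected to help.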



Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-19 Thread Hao
Hi Gianmarco,

In my experience, this just means that you have an impossible molecule. I
haven't found any way around it besides trying to embed. If that fails, try
swapping to the other stereoisomer. I find this particularly prevalent in
large-scale datasets where data quality is not very good.

Best,
Hao
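
The "try to embed, then swap" approach can be sketched as follows. This is one possible reading of the suggestion, not code from the thread: `embed_or_flip` is a hypothetical helper that tries the input as given and then each variant with a single stereocenter inverted, using `Atom.InvertChirality()`.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_or_flip(smiles, seed=11):
    """Try the input as given, then each single-center inversion.

    Returns (smiles_that_embedded, mol_with_Hs_and_conformer) or None.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # Atoms that carry an explicit chiral tag in the input.
    centers = [
        atom.GetIdx()
        for atom in mol.GetAtoms()
        if atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED
    ]
    candidates = [mol]
    for idx in centers:
        flipped = Chem.Mol(mol)  # copy so the original is left untouched
        flipped.GetAtomWithIdx(idx).InvertChirality()
        candidates.append(flipped)
    for cand in candidates:
        mh = Chem.AddHs(cand)
        if AllChem.EmbedMolecule(mh, randomSeed=seed) != -1:
            return Chem.MolToSmiles(cand), mh
    return None


result = embed_or_flip("C1N[C@@H]2CO[C@@H]1C2")
print(result[0] if result else "no embeddable variant found")
```

Which center gets flipped first is arbitrary here, so the returned structure is simply an embeddable neighbor of the input. Whether it is the stereoisomer the data source intended still has to be checked against other evidence.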



Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-19 Thread Gianmarco Ghiandoni
Hi all,

Can anyone help with this matter?

Thanks,

On Tue, 17 Jan 2023 at 13:03, Gianmarco Ghiandoni wrote:

> Hi all,
>
> I have come across an issue while embedding structures with
> stereochemistry configurations that presumably lead to clashes between
> atoms:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> # Stereo assignment that embeds successfully
> smiles = "C1N[C@@H]2CO[C@H]1C2"
> m = Chem.MolFromSmiles(smiles)
> mh = Chem.AddHs(m)
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
> # Same skeleton with the second stereocenter inverted
> smiles = "C1N[C@@H]2CO[C@@H]1C2"
> m = Chem.MolFromSmiles(smiles)
> mh = Chem.AddHs(m)
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
>
> Produces:
> 0 (successful embedding)
> -1  (unsuccessful embedding)
>
> What, in your opinion, is the best way to deal with this in order to avoid
> failures?
>
> Thanks,
> --
> *Gianmarco*
>


-- 
*Gianmarco*