Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment
Hi all,

Thanks for all your replies. I wonder what kind of logic could be put in place to fix these mistakes algorithmically. Is there any way to configure the correct stereochemistry before embedding the structure?

Thanks,

On Thu, 19 Jan 2023 at 17:49, Giovanni Tricarico <giovanni.tricar...@glpg.com> wrote:
> Indeed, the only two chemically valid configurations for this molecule
> seem to be:
>
> Impressive that RDKit can detect this kind of contradiction.
>
> G
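One possible way to approach the question above, sketched under assumptions: RDKit's stereoisomer enumerator can be asked to keep only the isomers it manages to embed, which gives a list of geometrically accessible configurations before any production embedding is attempted. The helper name `embeddable_stereoisomers` is hypothetical, and the sketch cannot tell which of the surviving isomers was the intended one.

```python
from rdkit import Chem
from rdkit.Chem.EnumerateStereoisomers import (
    EnumerateStereoisomers,
    StereoEnumerationOptions,
)


def embeddable_stereoisomers(smiles):
    """Return canonical SMILES of the stereoisomers that RDKit can embed.

    Hypothetical helper: it enumerates every assignment of the stereocenters,
    including those already labelled in the input, and keeps only the ones
    for which a trial embedding succeeds.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    opts = StereoEnumerationOptions(
        onlyUnassigned=False,  # also flip centers that already carry a label
        tryEmbedding=True,     # drop isomers whose trial embedding fails
        unique=True,           # one entry per distinct isomeric SMILES
    )
    isomers = EnumerateStereoisomers(mol, options=opts)
    return sorted(Chem.MolToSmiles(iso) for iso in isomers)


if __name__ == "__main__":
    # The second SMILES from the thread, which fails to embed as written.
    for smi in embeddable_stereoisomers("C1N[C@@H]2CO[C@@H]1C2"):
        print(smi)
```

If the input SMILES is not in the returned list, its stereo labels are probably inconsistent with the ring system, and the record is worth flagging rather than silently replacing.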
Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment
Indeed, the only two chemically valid configurations for this molecule seem to be:

[inline image: image001.png]

Impressive that RDKit can detect this kind of contradiction.

G

From: Kangway Chuang (CHUANGK4) via Rdkit-discuss
Sent: 19 January 2023 17:54
To: Ling Chan; Gianmarco Ghiandoni
Cc: RDKit
Subject: Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

Agree with Hao above. For the molecules provided, the second example is geometrically inaccessible given the stereochemical constraints. In this case, the expected behavior should be an unsuccessful embedding.

Kangway
Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment
Agree with Hao above. For the molecules provided, the second example is geometrically inaccessible given the stereochemical constraints. In this case, the expected behavior should be an unsuccessful embedding.

Kangway

On Thu, Jan 19, 2023 at 8:49 AM Ling Chan wrote:
> Keep trying with more random seeds?
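Since a failed embedding is the expected outcome for an inaccessible configuration, calling code can treat the -1 return value as a signal instead of an error. A minimal sketch, with the hypothetical helper name `embed_or_none`:

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_or_none(smiles, seed=11):
    """Return an H-added Mol with one conformer, or None if embedding fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    mh = Chem.AddHs(mol)
    conf_id = AllChem.EmbedMolecule(mh, randomSeed=seed)
    # EmbedMolecule returns the new conformer ID, or -1 on failure.
    return mh if conf_id >= 0 else None


if __name__ == "__main__":
    for smi in ("C1N[C@@H]2CO[C@H]1C2", "C1N[C@@H]2CO[C@@H]1C2"):
        status = "embedded" if embed_or_none(smi) is not None else "failed"
        print(smi, status)
```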
Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment
Keep trying with more random seeds?

On Thu., Jan. 19, 2023, 07:38 Hao, wrote:
> Hi Gianmarco,
>
> In my experience, this just means that you have an impossible molecule. I
> haven't found any way around it besides trying to embed. If it fails, try
> swapping to the other stereoisomer. I find this particularly prevalent in
> large-scale datasets where data quality is not very good.
>
> Best,
> Hao
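The seed-retry idea can be sketched as a small loop; the helper name and the seed range are arbitrary choices for illustration. Retrying helps when embedding fails by chance on a flexible or crowded molecule, but it is not expected to rescue a configuration that is geometrically impossible, such as the second SMILES in this thread.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def first_working_seed(smiles, seeds=range(1, 6)):
    """Return the first random seed that gives a conformer, or None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    mh = Chem.AddHs(mol)
    for seed in seeds:
        # Each call already makes several internal attempts before giving up.
        if AllChem.EmbedMolecule(mh, randomSeed=seed) >= 0:
            return seed
    return None


if __name__ == "__main__":
    print(first_working_seed("C1N[C@@H]2CO[C@H]1C2"))
    print(first_working_seed("C1N[C@@H]2CO[C@@H]1C2"))
```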
Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment
Hi Gianmarco,

In my experience, this just means that you have an impossible molecule. I haven't found any way around it besides trying to embed. If it fails, try swapping to the other stereoisomer. I find this particularly prevalent in large-scale datasets where data quality is not very good.

Best,
Hao

On Thu, Jan 19, 2023 at 7:09 AM Gianmarco Ghiandoni wrote:
> Hi all,
>
> Can anyone help with this matter?
>
> Thanks,
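The try-then-swap approach could look roughly like the sketch below: embed the molecule as given and, if that fails, invert one stereocenter at a time until a candidate embeds. The function name is hypothetical, and inverting single centers is only one possible reading of "swapping". The fallback changes the molecule, so the SMILES actually used is returned for logging.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_with_stereo_fallback(smiles, seed=11):
    """Embed as given; on failure retry with single stereocenters inverted.

    Hypothetical helper. Returns (smiles_used, mol_with_conformer), or
    (None, None) if neither the input nor any single inversion embeds.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    mh = Chem.AddHs(mol)
    if AllChem.EmbedMolecule(mh, randomSeed=seed) >= 0:
        return Chem.MolToSmiles(mol), mh

    # Indices of the assigned tetrahedral stereocenters.
    centers = [idx for idx, _ in Chem.FindMolChiralCenters(mol)]
    for idx in centers:
        candidate = Chem.Mol(mol)  # work on a copy of the parsed molecule
        candidate.GetAtomWithIdx(idx).InvertChirality()
        # Recompute stereo labels so they match the flipped chiral tag.
        Chem.AssignStereochemistry(candidate, cleanIt=True, force=True)
        ch = Chem.AddHs(candidate)
        if AllChem.EmbedMolecule(ch, randomSeed=seed) >= 0:
            return Chem.MolToSmiles(candidate), ch
    return None, None


if __name__ == "__main__":
    used, mol3d = embed_with_stereo_fallback("C1N[C@@H]2CO[C@@H]1C2")
    print(used)
```

For a large dataset, comparing the returned SMILES with the canonical input SMILES gives a simple way to list the records whose stereochemistry was changed.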
Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment
Hi all,

Can anyone help with this matter?

Thanks,

On Tue, 17 Jan 2023 at 13:03, Gianmarco Ghiandoni wrote:
> Hi all,
>
> I have come across an issue while embedding structures with
> stereochemistry configurations that presumably lead to clashes between
> atoms:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> # First stereoisomer: embedding succeeds (returns conformer ID 0)
> smiles = "C1N[C@@H]2CO[C@H]1C2"
> m = Chem.MolFromSmiles(smiles)
> mh = Chem.AddHs(m)
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
> # Second stereoisomer: embedding fails (returns -1)
> smiles = "C1N[C@@H]2CO[C@@H]1C2"
> m = Chem.MolFromSmiles(smiles)
> mh = Chem.AddHs(m)
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
> Produces:
>
> 0 (successful embedding)
> -1 (unsuccessful embedding)
>
> What, in your opinion, is the best way to deal with this in order to avoid
> failures?
>
> Thanks,
> --
> *Gianmarco*

--
*Gianmarco*