Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
That's a really nice presentation.

-P.

On Fri, Nov 18, 2016 at 3:16 AM, Greg Landrum <greg.land...@gmail.com> wrote:

> This is a very big topic, and one where I would very much like to improve
> the RDKit. John Mayfield gave a great talk on the issues (and some ideas
> about fixing them based on his work with the CDK) at the UGM that some of
> you may find interesting:
> https://github.com/rdkit/UGM_2016/blob/master/Presentations/JohnMayfield_Depiction.pdf
>
> Fixing the larger problems is a *lot* of work and not something that is
> likely to happen quickly, but there is some low-hanging fruit (like cutting
> crossed bonds) that I ought to be able to do something about.[1]
>
> -greg
>
> [1] the trick is to avoid, as much as possible, creating drawings that
> look like Möbius strips.
>
> From: Peter S. Shenkin <shen...@gmail.com>
> Sent: Thursday, November 17, 2016 11:23 PM
> Subject: Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a
> bit off-topic)
> To: <rdkit-discuss@lists.sourceforge.net>
>
> On 17 Nov 2016, at 4:12 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
>
> > Philosophically speaking, there must exist molecules for which a legible
> > 2D projection is simply not possible.
>
> Hi,
>
> I don't think that 2D projection of a 3D structure is an appropriate
> paradigm for 2D depiction, in general. I think of it as being more about 2D
> construction. I don't think camphor is a particularly difficult example,
> though, and I think that the hidden-line elimination (for lack of a better
> term) that Marvin does gives it a leg up on RDKit's representation.
>
> By the way, I do not think that Marvin is the best there is out there;
> it's just what I happen to have available for comparison.
>
> Stereochemistry adds complications, because 3D information has to be
> encoded in some way. Camphor (your suggestion) has a little of this. I gave
> Marvin a non-stereo SMILES and it picked an enantiomer. I drew the same
> enantiomer. I did not specify stereochemistry to RDKit, so, despite the
> visual confusion of the bond crossings, I suppose it's good that it didn't
> depict an explicit enantiomer.
>
> And labels add further complications. The two approaches I've seen for
> labels are using them as the atomic vertices, as RDKit does, and adding
> them adjacent to the vertices. I personally prefer the latter, because to
> my eye, it's easier to see the connectivity without being distracted by the
> labels.
>
> But my philosophical point was that different forms of 2D depiction work
> better for different purposes. Stéphane wants to see sugars drawn as
> carbohydrate chemists are used to seeing them. I would like to see the 2D
> connectivity as clearly as possible and would sacrifice some conventions
> for that purpose. And so on.
>
> -P.

--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
On 18/11/2016 at 09:16, Greg Landrum wrote:

> Fixing the larger problems is a *lot* of work and not something that
> is likely to happen quickly, but there is some low-hanging fruit (like
> cutting crossed bonds) that I ought to be able to do something about.[1]
>
> -greg
>
> [1] the trick is to avoid, as much as possible, creating drawings that
> look like Möbius strips.

Dear Greg,

I appreciate your tremendous work, and I have always wanted to give back to the community when possible. Do you have any guidelines for getting to something better (a draft, or links to ideas or papers)? I have the possibility to mentor some students, and this kind of subject would be a perfect C++ / Python project for them ...

Just my two cents :-)

Stéphane

--
Team Protein Design In Silico
UFIP, UMR 6286 CNRS, UFR Sciences et Techniques,
2, rue de la Houssinière, Bât. 25, Nantes cedex 03, France
Tél : +33 251 125 636 - Fax : +33 251 125 632
http://www.ufip.univ-nantes.fr/ - http://www.steletch.org
Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
As Greg says, this is a large area and somewhat of a diversion from my original intention. All I was asking for was a set of test cases so I can ensure that my port of the original Python code in AllChem.py to C++ behaves correctly. That seems like a sensible first step before embarking on something more ambitious.

Dave

On Fri, 18 Nov 2016 at 08:17, Greg Landrum <greg.land...@gmail.com> wrote:

> This is a very big topic, and one where I would very much like to improve
> the RDKit. John Mayfield gave a great talk on the issues (and some ideas
> about fixing them based on his work with the CDK) at the UGM that some of
> you may find interesting:
>
> https://github.com/rdkit/UGM_2016/blob/master/Presentations/JohnMayfield_Depiction.pdf
>
> Fixing the larger problems is a *lot* of work and not something that is
> likely to happen quickly, but there is some low-hanging fruit (like cutting
> crossed bonds) that I ought to be able to do something about.[1]
>
> -greg
>
> [1] the trick is to avoid, as much as possible, creating drawings that
> look like Möbius strips.
Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
This is a very big topic, and one where I would very much like to improve the RDKit. John Mayfield gave a great talk on the issues (and some ideas about fixing them based on his work with the CDK) at the UGM that some of you may find interesting:
https://github.com/rdkit/UGM_2016/blob/master/Presentations/JohnMayfield_Depiction.pdf

Fixing the larger problems is a *lot* of work and not something that is likely to happen quickly, but there is some low-hanging fruit (like cutting crossed bonds) that I ought to be able to do something about.[1]

-greg

[1] the trick is to avoid, as much as possible, creating drawings that look like Möbius strips.

From: Peter S. Shenkin <shen...@gmail.com>
Sent: Thursday, November 17, 2016 11:23 PM
Subject: Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
To: <rdkit-discuss@lists.sourceforge.net>
Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
> On 17 Nov 2016, at 4:12 PM, Dimitri Maziuk wrote:
>
> Philosophically speaking, there must exist molecules for which a legible
> 2D projection is simply not possible.

Hi,

I don't think that 2D projection of a 3D structure is an appropriate paradigm for 2D depiction, in general. I think of it as being more about 2D construction. I don't think camphor is a particularly difficult example, though, and I think that the hidden-line elimination (for lack of a better term) that Marvin does gives it a leg up on RDKit's representation.

By the way, I do not think that Marvin is the best there is out there; it's just what I happen to have available for comparison.

Stereochemistry adds complications, because 3D information has to be encoded in some way. Camphor (your suggestion) has a little of this. I gave Marvin a non-stereo SMILES and it picked an enantiomer. I drew the same enantiomer. I did not specify stereochemistry to RDKit, so, despite the visual confusion of the bond crossings, I suppose it's good that it didn't depict an explicit enantiomer.

And labels add further complications. The two approaches I've seen for labels are using them as the atomic vertices, as RDKit does, and adding them adjacent to the vertices. I personally prefer the latter, because to my eye, it's easier to see the connectivity without being distracted by the labels.

But my philosophical point was that different forms of 2D depiction work better for different purposes. Stéphane wants to see sugars drawn as carbohydrate chemists are used to seeing them. I would like to see the 2D connectivity as clearly as possible and would sacrifice some conventions for that purpose. And so on.

-P.
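The "bond crossings" discussed in this message can be counted for any 2D layout, which may be handy when comparing depictions from different tools. What follows is only a minimal sketch in plain Python, not RDKit's or Marvin's method: the function names `segments_cross` and `crossed_bonds`, the coordinate dictionary, and the toy layout are all assumptions made for illustration. It uses a standard orientation test to find pairs of bonds that properly intersect.

```python
from itertools import combinations


def _orient(p, q, r):
    """Twice the signed area of triangle pqr (>0 counter-clockwise, <0 clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """Return True if segments ab and cd intersect at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def crossed_bonds(coords, bonds):
    """List the pairs of bonds that cross in a 2D layout.

    coords: dict mapping atom index -> (x, y)   (hypothetical input format)
    bonds:  list of (i, j) atom-index pairs
    """
    hits = []
    for (i, j), (k, l) in combinations(bonds, 2):
        if len({i, j, k, l}) < 4:
            continue  # bonds that share an atom meet there; not a crossing
        if segments_cross(coords[i], coords[j], coords[k], coords[l]):
            hits.append(((i, j), (k, l)))
    return hits


# Toy layout (not a real molecule): two diagonals of a unit square plus one edge.
coords = {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (0.0, 1.0), 3: (1.0, 0.0)}
bonds = [(0, 1), (2, 3), (0, 2)]
print(crossed_bonds(coords, bonds))
```

In the toy layout only the two diagonals cross; the edge shares an atom with each of them and is skipped. The check is quadratic in the number of bonds, which is fine for a small molecule such as camphor, and it only detects crossings; deciding which bond to "cut" or how to re-lay out the atoms is a separate problem.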
Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
On 11/17/2016 02:41 PM, Peter S. Shenkin wrote:
...
> I have to say that Marvin displays the connectivity of the structures much
> more clearly than RDKit.

Philosophically speaking, there must exist molecules for which a legible 2D projection is simply not possible. PubChem CID 2537 comes close. Marvin doesn't do much better on this one even if you don't turn on all the labels.

--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)
On 17/11/2016 at 18:01, David Cosgrove wrote:

> Hi All,
>
> I'm currently working on transferring the 2 Python functions
> GenerateDepictionMatching2DStructure and the 3D equivalent into the C++
> core so they will be available to all users of the toolkit. Can anyone
> supply example test cases for me? In particular, I would appreciate
> examples of the 2D version using the optional referencePattern argument,
> and, for the 3D version, examples where it works well and less well.
> SMILES and/or SDFs would be enough, but if you have pictures of the
> output and/or comments on them, that would be a bonus.
>
> Many thanks,
> Dave

Dear Dave,

Thanks a lot for the move. One thing that is not handled very well in RDKit (and elsewhere) is carbohydrates. Since you are moving things, would you like test scenarios and examples to adjust it?

For instance, the "classical" representations of carbohydrates:
https://en.wikibooks.org/wiki/File:Chairenvelopeboat*.png
or, better:
http://oregonstate.edu/instruct/bb450/450material/stryer7/11/figure_11_07.jpg

At the moment they look pretty flat in RDKit, unfortunately. This is also true in PubChem, for instance:
https://pubchem.ncbi.nlm.nih.gov/compound/D-glucose

Sorry for this off-topic request, but if there is room for improvement, I would be interested :-)

Best,
Stéphane

--
Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein Design In Silico
UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322 Nantes cedex 03, France
Tél : +33 251 125 636 / Fax : +33 251 125 632
http://www.ufip.univ-nantes.fr/ - http://www.steletch.org