2D drawing code is tough. The 90/10 rule applies: the last 10% of
correctness takes 90% of the effort.

I like Dimitris Agrafiotis's method, but IIRC it's patented; also, though
it's good for rough work, it doesn't produce "beautiful" structural
diagrams.

Some of the 2D drawing methods that do produce "pretty" pictures have a
large number of templates built in that match the most common (and even
somewhat uncommon) motifs, and they fall down when they hit something they
can't get a close enough match for. On top of that, IUPAC has a whole list
of "desirable" features in 2D diagrams (as in, "Don't show it this way, but
rather show it that way."). So even if you produce what appears to be an
acceptable drawing, it might not conform to IUPAC's list of desirables.

I think for the present purposes what we need is something correct, robust
and legible, and of course the example shown does not exhibit that. (But I
don't know what the starting SMILES is, so I don't know whether the
7-bonded C is due to a bad SMILES, in which case all bets are off.)

In addition, I think some discussion earlier indicated that the RDKit 2D
structures look much worse when H's are included.

I once wrote a program (while at Schrödinger) to give a "badness" score
to 2D structures. When our 2D depiction development was in progress, we
created 2D SD files for many thousands of structures. I could put these
through the program and sort with the worst on top. That allowed the most
severe problems to be identified more quickly than, say, looking at
thousands of 2D diagrams. The program looked at three things: the number
of bonds that crossed, the number of atoms that were too close together,
and large disparities in bond length within the same molecule. (The
checking code didn't deal with labels.)
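
For anyone who wants to try the idea, here is a minimal Python sketch of
such a checker. It is not the Schrödinger program: the function names, the
input layout (a list of (x, y) atom coordinates plus a list of atom-index
pairs for the bonds), the 0.5 close-contact fraction and the sort priority
are all assumptions made for illustration.

```python
import math
from itertools import combinations
from statistics import median


def _orient(p, q, r):
    """Sign (-1, 0, +1) of the cross product (q - p) x (r - p)."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _segments_cross(a, b, c, d):
    """True if segments ab and cd cross at an interior point of both."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def badness(coords, bonds, close_frac=0.5):
    """Score one 2D layout on the three criteria described above.

    coords     -- list of (x, y) tuples, one per atom (assumed format)
    bonds      -- list of (i, j) atom-index pairs (assumed format)
    close_frac -- non-bonded atoms closer than this fraction of the
                  median bond length count as "too close" (assumed value)
    """
    if not bonds:
        return {"crossings": 0, "close_contacts": 0, "length_ratio": 1.0}

    lengths = [math.dist(coords[i], coords[j]) for i, j in bonds]
    ref = median(lengths)

    # Bonds that share an atom touch at that atom, so only pairs with
    # four distinct atoms are tested for crossing.
    crossings = sum(
        1 for (i, j), (k, l) in combinations(bonds, 2)
        if len({i, j, k, l}) == 4
        and _segments_cross(coords[i], coords[j], coords[k], coords[l]))

    bonded = {frozenset(b) for b in bonds}
    close_contacts = sum(
        1 for i, j in combinations(range(len(coords)), 2)
        if frozenset((i, j)) not in bonded
        and math.dist(coords[i], coords[j]) < close_frac * ref)

    # Longest bond over shortest bond; infinite if two bonded atoms coincide.
    shortest = min(lengths)
    ratio = max(lengths) / shortest if shortest > 0 else math.inf

    return {"crossings": crossings,
            "close_contacts": close_contacts,
            "length_ratio": ratio}


def worst_first(layouts):
    """Sort (name, coords, bonds) records with the worst layouts on top."""
    def key(rec):
        s = badness(rec[1], rec[2])
        return (s["crossings"], s["close_contacts"], s["length_ratio"])
    return sorted(layouts, key=key, reverse=True)
```

Feeding worst_first() a list of (name, coords, bonds) records gives the
"worst on top" ordering. The same score could presumably also be used to
choose among several candidate layouts of one molecule, along the lines
Dimitri suggests below.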

Writing the checker was a fun project, but I'm glad I didn't have to write
the 2D depiction code. As Mark Twain said, "Improving oneself is good.
Improving others is better – and easier."

-P.

On Mon, Sep 26, 2016 at 5:54 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu>
wrote:

> On 09/26/2016 04:42 PM, Peter S. Shenkin wrote:
> > Also, the C attached to H44 has an extra H (its own or someone else's?)
> > superimposed upon it.
>
> I wonder if 2D drawing code should really work the same way as the 3D
> conformer generation: generate a bunch of candidate layouts and pick the
> one(s) with least clashes/overlaps.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>