Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Peter S. Shenkin
Yes, of course, storing the images is an alternative.

-P.

On Thu, Dec 15, 2016 at 5:46 PM, Dimitri Maziuk wrote:

> On 12/15/2016 04:23 PM, Peter S. Shenkin wrote:
>
> > Obviously, it doesn't matter if you're rendering just a few structures, but
> > in a scenario where you might be downloading a hundred SMILES from a DB and
> > displaying them on a grid in a browser, computing the 2D depictions on the
> > fly, waiting 5 sec for a page refresh wouldn't be great.
>
> Maybe not, but depending on how the browser lays out the grid, it may take
> 5 seconds anyway.
>
> My recommendation for that use case would be to pre-generate the images
> and store the URLs in that database. Which is what we do here.
>
> ;)
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>
--
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Dimitri Maziuk
On 12/15/2016 04:23 PM, Peter S. Shenkin wrote:

> Obviously, it doesn't matter if you're rendering just a few structures, but
> in a scenario where you might be downloading a hundred SMILES from a DB and
> displaying them on a grid in a browser, computing the 2D depictions on the
> fly, waiting 5 sec for a page refresh wouldn't be great.

Maybe not, but depending on how the browser lays out the grid, it may take
5 seconds anyway.

My recommendation for that use case would be to pre-generate the images
and store the URLs in that database. Which is what we do here.

;)
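
For what it's worth, a minimal sketch of that pre-generation approach might look like the following. It is not the BMRB setup, just an illustration: the `depiction` table, the `image_url_for` URL scheme, the example.org base URL and the `render` callback are all hypothetical, and the renderer is stubbed out so the sketch runs with the standard library only.

```python
import hashlib
import sqlite3


def image_url_for(smiles, base_url="https://example.org/depictions"):
    """Hypothetical URL scheme: the file name is a hash of the SMILES string."""
    digest = hashlib.sha1(smiles.encode("utf-8")).hexdigest()
    return f"{base_url}/{digest}.svg"


def pregenerate(conn, smiles_list, render):
    """Render each not-yet-known SMILES once and store its image URL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS depiction "
        "(smiles TEXT PRIMARY KEY, url TEXT NOT NULL)"
    )
    for smiles in smiles_list:
        known = conn.execute(
            "SELECT 1 FROM depiction WHERE smiles = ?", (smiles,)
        ).fetchone()
        if known is None:
            url = image_url_for(smiles)
            render(smiles, url)  # the slow 2D layout step, done offline
            conn.execute(
                "INSERT INTO depiction (smiles, url) VALUES (?, ?)",
                (smiles, url),
            )
    conn.commit()


def lookup_url(conn, smiles):
    """At page-refresh time only this cheap lookup is needed."""
    row = conn.execute(
        "SELECT url FROM depiction WHERE smiles = ?", (smiles,)
    ).fetchone()
    return row[0] if row else None


# Stand-in renderer that only records what it was asked to draw.
rendered = []


def fake_render(smiles, url):
    rendered.append(smiles)


conn = sqlite3.connect(":memory:")
pregenerate(conn, ["CCO", "c1ccccc1"], fake_render)
pregenerate(conn, ["CCO", "CC(=O)O"], fake_render)  # CCO is not rendered again
print(rendered)
print(lookup_url(conn, "CCO"))
```

With this layout the page refresh costs one database lookup per structure, and the depiction cost is paid once per distinct SMILES.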
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Peter S. Shenkin
Well, Figure 10 shows that a molecule with about 25 heavy atoms takes about
50 ms to optimize.

In John Mayfield's UGM talk, it looks like CDK is taking an average of 1 ms
for "easy" structures and 56 ms for the hard ones, some of which are
depicted and have far more than 25 heavy atoms.

We don't know the details of the two data sets, so a head-to-head
comparison is tough, but intuitively, 20 structures/sec sounds slow.

Having said that, it's reasonable to pay a price in speed for additional
quality and robustness.

Obviously, it doesn't matter if you're rendering just a few structures, but
in a scenario where you might be downloading a hundred SMILES from a DB and
displaying them on a grid in a browser, computing the 2D depictions on the
fly, waiting 5 sec for a page refresh wouldn't be great.
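
The arithmetic behind those numbers, as a quick sketch. The 50 ms, 1 ms and 56 ms figures are the ones quoted above; the function names are made up for illustration, and the sketch assumes the depictions are computed one after another with no caching or parallelism.

```python
def structures_per_second(ms_per_structure):
    """Throughput implied by a per-structure layout time in milliseconds."""
    return 1000.0 / ms_per_structure


def total_seconds(n_structures, ms_per_structure):
    """Wall-clock time to lay out n structures sequentially."""
    return n_structures * ms_per_structure / 1000.0


# ~50 ms per structure (about 25 heavy atoms, as read off Figure 10)
print(structures_per_second(50))   # 20.0 structures/sec
print(total_seconds(100, 50))      # 5.0 sec for a grid of 100 SMILES

# The CDK timings quoted from the UGM talk, for the same grid of 100
print(total_seconds(100, 1))       # "easy" structures
print(total_seconds(100, 56))      # "hard" structures
```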

-P.

On Thu, Dec 15, 2016 at 4:22 PM, Dimitri Maziuk wrote:

> On 12/15/2016 02:53 PM, Peter S. Shenkin wrote:
> > Looks good, but maybe too slow for production use... (?)
>
> I wonder what kind of production use would require sub-second wall clock
> time for this.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu


Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Dimitri Maziuk
On 12/15/2016 02:53 PM, Peter S. Shenkin wrote:
> Looks good, but maybe too slow for production use... (?)

I wonder what kind of production use would require sub-second wall clock
time for this.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Peter S. Shenkin
Looks good, but maybe too slow for production use... (?)

-P.

On Thu, Dec 15, 2016 at 3:38 PM, Chris Swain wrote:

> At first glance this looks an interesting approach.
>
> Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram
> Generation of Complex Molecules and Ligand–Protein Interactions
> DOI: http://dx.doi.org/10.1021/acs.jcim.6b00391
>
> On 27 Sep 2016, at 05:38, rdkit-discuss-requ...@lists.sourceforge.net
> wrote:
>
> 2D drawing code is tough. The 90/10 rule applies: the last 10% of
> correctness takes 90% of the effort.
>
> I like Dmitri Agrafiotis's method, but IIRC it's patented; also, though
> it's good for rough work, it doesn't produce "beautiful" structural
> diagrams.
>
> Some of the 2D drawing methods that do produce "pretty" pictures have a
> large number of templates built in that match the most common (and even
> somewhat uncommon) motifs, and they fall down when they hit something they
> can't get a close enough match for. And then, the IUPAC has a whole list of
> "desirable" features in 2D diagrams (as in, "Don't show it this way, but
> rather show it that way."). So even if you produce what might appear to be
> an acceptable drawing, it might not match the IUPAC list of desirables.
>
> I think for the present purposes what we need is something correct, robust
> and legible, and of course the example shown does not exhibit that. (But I
> don't know what the starting SMILES is, so I don't know whether the
> 7-bonded C is due to a bad SMILES, in which case all bets are off.)
>
> In addition, I think some discussion earlier indicated that the RDKit 2D
> structures look much worse when H's are included.
>
> I actually wrote a code one time (while at Schrödinger) to give a "badness"
> score to 2D structures. When our 2D depiction development was in progress,
> we created 2D SD files for many thousands of structures. I could put these
> through the program and sort with the worst on top. That allowed the most
> severe problems to be identified more quickly than, say, looking at
> thousands of 2D diagrams. The program looked at three things: Number of
> bonds that crossed, Number of atoms that were too close together, and Large
> disparity of bond lengths within the same molecule. (The checking code
> didn't deal with labels.)
>
> Writing the checker was a fun project, but I'm glad I didn't have to write
> the 2D depiction code. As Mark Twain said, "Improving oneself is good.
> Improving others is better - and easier."
>
> -P.


Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Chris Swain
At first glance this looks an interesting approach.

Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram 
Generation of Complex Molecules and Ligand–Protein Interactions
DOI: http://dx.doi.org/10.1021/acs.jcim.6b00391

> On 27 Sep 2016, at 05:38, rdkit-discuss-requ...@lists.sourceforge.net wrote:
> 
> 2D drawing code is tough. The 90/10 rule applies: the last 10% of
> correctness takes 90% of the effort.
> 
> I like Dmitri Agrafiotis's method, but IIRC it's patented; also, though
> it's good for rough work, it doesn't produce "beautiful" structural
> diagrams.
> 
> Some of the 2D drawing methods that do produce "pretty" pictures have a
> large number of templates built in that match the most common (and even
> somewhat uncommon) motifs, and they fall down when they hit something they
> can't get a close enough match for. And then, the IUPAC has a whole list of
> "desirable" features in 2D diagrams (as in, "Don't show it this way, but
> rather show it that way."). So even if you produce what might appear to be
> an acceptable drawing, it might not match the IUPAC list of desirables.
> 
> I think for the present purposes what we need is something correct, robust
> and legible, and of course the example shown does not exhibit that. (But I
> don't know what the starting SMILES is, so I don't know whether the
> 7-bonded C is due to a bad SMILES, in which case all bets are off.)
> 
> In addition, I think some discussion earlier indicated that the RDKit 2D
> structures look much worse when H's are included.
> 
> I actually wrote a code one time (while at Schrödinger) to give a "badness"
> score to 2D structures. When our 2D depiction development was in progress,
> we created 2D SD files for many thousands of structures. I could put these
> through the program and sort with the worst on top. That allowed the most
> severe problems to be identified more quickly than, say, looking at
> thousands of 2D diagrams. The program looked at three things: Number of
> bonds that crossed, Number of atoms that were too close together, and Large
> disparity of bond lengths within the same molecule. (The checking code
> didn't deal with labels.)
> 
> Writing the checker was a fun project, but I'm glad I didn't have to write
> the 2D depiction code. As Mark Twain said, "Improving oneself is good.
> Improving others is better - and easier."
> 
> -P.
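
As a rough illustration of the kind of checker described in the quoted message, here is a minimal sketch. It is not the Schrödinger code: the function names, the `min_dist` threshold and the weights are assumptions chosen for illustration. Only the three criteria (crossing bonds, atoms too close together, bond-length disparity) come from the description above, and labels are ignored here as well.

```python
import math
from itertools import combinations


def _orient(p, q, r):
    """Signed area of triangle pqr; the sign gives the turn direction."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (_orient(c, d, a) * _orient(c, d, b) < 0
            and _orient(a, b, c) * _orient(a, b, d) < 0)


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def badness(coords, bonds, min_dist=0.5, weights=(10.0, 5.0, 1.0)):
    """Score a 2D layout; higher is worse.

    coords: list of (x, y) atom positions; bonds: list of (i, j) index pairs.
    min_dist and weights are illustrative assumptions, not published values.
    """
    # 1. Pairs of bonds that cross (bonds sharing an atom are skipped).
    crossings = sum(
        1 for (i, j), (k, l) in combinations(bonds, 2)
        if len({i, j, k, l}) == 4
        and _segments_cross(coords[i], coords[j], coords[k], coords[l])
    )
    # 2. Non-bonded atom pairs closer together than min_dist.
    bonded = {frozenset(b) for b in bonds}
    too_close = sum(
        1 for i, j in combinations(range(len(coords)), 2)
        if frozenset((i, j)) not in bonded
        and _dist(coords[i], coords[j]) < min_dist
    )
    # 3. Bond-length disparity: longest/shortest ratio minus 1 (0 if uniform).
    lengths = [_dist(coords[i], coords[j]) for i, j in bonds]
    if not lengths:
        disparity = 0.0
    elif min(lengths) == 0.0:
        disparity = float("inf")
    else:
        disparity = max(lengths) / min(lengths) - 1.0
    w_cross, w_close, w_len = weights
    score = w_cross * crossings + w_close * too_close + w_len * disparity
    return {"crossings": crossings, "too_close": too_close,
            "disparity": disparity, "score": score}


# Two layouts of a four-membered ring on the same four coordinates.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
layouts = {
    "clean": (square, [(0, 1), (1, 2), (2, 3), (3, 0)]),
    "twisted": (square, [(0, 2), (2, 1), (1, 3), (3, 0)]),
}
# Sort with the worst on top, as in the workflow described above.
ranked = sorted(layouts, key=lambda name: badness(*layouts[name])["score"],
                reverse=True)
print(ranked)
```

The pairwise loops are quadratic in the number of atoms and bonds, which should be acceptable for a batch job over drug-sized molecules but is not tuned for speed.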
