Eric Firing wrote:
> Mike, John,
>
> Because path simplification does not work with anything but a
> continuous line, it is turned off if there are any nans in the path.
> The result is that if one does this:
>
> import numpy as np
> xx = np.arange(200000)
> yy = np.random.rand(200000)
> #plot(xx, yy)
> yy[1000] = np.nan
> plot(xx, yy)
>
> the plot fails with an incomplete rendering and general
> unresponsiveness; apparently some mysterious agg limit is quietly
> exceeded.

The limit in question is "cell_block_limit" in
agg_rasterizer_cells_aa.h.  I suspect the relationship between the
number of vertices and the number of rasterization cells depends on the
nature of the values.

However, if we want to increase the limit, each "cell_block" is 4096
cells, each with 16 bytes, and currently it maxes out at 1024 cell
blocks, for a total of 67,108,864 bytes.  So, the question is, how much
memory should be devoted to rasterization, when the data set is large
like this?  I think we could safely quadruple this number for a lot of
modern machines, and this maximum won't affect people plotting smaller
data sets, since the memory is dynamically allocated anyway.  It works
for me, but I have 4GB RAM here at work.

> With or without the nan, this test case also shows the bizarre
> slowness of add_line that I asked about in a message yesterday, and
> that has me completely baffled.

lsprofcalltree is my friend!

> Both of these are major problems for real-world use.
>
> Do you have any thoughts on timing and strategy for solving this
> problem?  A few weeks ago, when the problem with nans and path
> simplification turned up, I tried to figure out what was going on and
> how to fix it, but I did not get very far.  I could try again, but as
> you know I don't get along well with C++.

That simplification code is pretty hairy, particularly because it tries
to avoid a copy by doing everything in an iterator/generator way.  I
think even just supporting MOVETOs there would be tricky, but probably
the easiest first thing.

> I am also wondering whether more than straightforward path
> simplification with nan/moveto might be needed.  Suppose there is a
> nightmarish time series with every third point being bad, so it is
> essentially a sequence of 2-point line segments.  The simplest form of
> path simplification fix might be to reset the calculation whenever a
> moveto is encountered, but this would yield no simplification in this
> case.  I assume Agg would still choke.  Is there a need for some sort
> of automatic chunking of the rendering operation in addition to path
> simplification?

Chunking is probably something worth looking into (for lines, at
least), as it might also reduce memory usage vs. the "increase the
cell_block_limit" scenario.

I also think for the special case of high-resolution time series data,
where x is uniform, there is an opportunity to do something completely
different that should be far faster.  Audio editors (such as Audacity)
draw each column of pixels based on the min/max and/or mean and/or RMS
of the values within that column.  This makes the rendering extremely
fast and simple.  See:

http://audacity.sourceforge.net/about/images/audacity-macosx.png

Of course, that would mean writing a bunch of new code, but it shouldn't
be incredibly tricky new code.  It could convert the time series data to
an image and plot that, or to a filled polygon whose vertices are
downsampled from the original data.  The latter may be nicer for Ps/Pdf
output.

Cheers,
Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel
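
For illustration only, here is a rough sketch of the per-column min/max
idea described above for uniformly sampled time series.  It uses NumPy
only.  The function name minmax_envelope and its n_columns parameter are
hypothetical and not part of matplotlib or Agg, and the way NaNs are
handled (ignored within a column) is an assumption made for this sketch,
not something settled in the thread.

```python
import numpy as np


def minmax_envelope(y, n_columns):
    """Reduce a uniformly sampled series to per-column (min, max) pairs.

    Hypothetical helper, not matplotlib API.  NaNs are ignored within a
    column; a column holding only NaNs yields NaN for both bounds.
    """
    y = np.asarray(y, dtype=float)
    # Ceiling division: samples per column, so every sample is covered.
    bin_size = -(-y.size // n_columns)
    # Pad the tail with NaN so the data reshapes into equal-sized columns.
    padded = np.full(bin_size * n_columns, np.nan)
    padded[:y.size] = y
    bins = padded.reshape(n_columns, bin_size)

    # Mask NaNs with +/-inf so they never win the min/max comparison.
    missing = np.isnan(bins)
    lo = np.where(missing, np.inf, bins).min(axis=1)
    hi = np.where(missing, -np.inf, bins).max(axis=1)

    # Columns with no valid samples have no envelope at all.
    empty = missing.all(axis=1)
    lo[empty] = np.nan
    hi[empty] = np.nan
    return lo, hi


# Same shape of data as the test case in the thread, reduced to 800
# columns (a stand-in for the pixel width of the axes).
rng = np.random.default_rng(0)
yy = rng.random(200000)
yy[1000] = np.nan
lo, hi = minmax_envelope(yy, 800)
```

The two returned arrays have one entry per column, so the amount of
geometry handed to the renderer depends on the width of the plot rather
than on the number of samples.  They could then be drawn as a filled
band or turned into an image, along the lines suggested above.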