On Thu, Feb 17, 2011 at 1:36 PM, Benjamin Root <ben.r...@ou.edu> wrote:
> On Tue, Feb 15, 2011 at 2:05 PM, Benjamin Root <ben.r...@ou.edu> wrote: > >> On Tue, Feb 15, 2011 at 1:41 PM, Benjamin Root <ben.r...@ou.edu> wrote: >> >>> >>> >>> On Tue, Feb 15, 2011 at 1:17 PM, Eric Firing <efir...@hawaii.edu> wrote: >>> >>>> On 02/15/2011 08:50 AM, Benjamin Root wrote: >>>> >>>>> On Tue, Feb 15, 2011 at 12:19 PM, Benjamin Root <ben.r...@ou.edu >>>>> <mailto:ben.r...@ou.edu>> wrote: >>>>> >>>>> On Tue, Feb 15, 2011 at 11:54 AM, Eric Firing <efir...@hawaii.edu >>>>> <mailto:efir...@hawaii.edu>> wrote: >>>>> >>>>> On 02/15/2011 07:40 AM, Benjamin Root wrote: >>>>> > I have come across a little inconsistency that was unexpected >>>>> in the >>>>> > matplotlib API. The following is perfectly valid: >>>>> > >>>>> > import matplotlib.pyplot as plt >>>>> > plt.plot([], []) >>>>> > plt.show() >>>>> > >>>>> > >>>>> > However, this is not valid: >>>>> > >>>>> > import matplotlib.pyplot as plt >>>>> > plt.scatter([], []) >>>>> > plt.show() >>>>> > >>>>> > >>>>> > The immediate issue that comes up in scatter is that it >>>>> attempts to find >>>>> > min/max of the input data for the purpose of autoscaling >>>>> (this can >>>>> > probably be done better by just using set_xmargin(0.05) and >>>>> > set_ymargin(0.05)). This can easily be bypassed with an if >>>>> statement. >>>>> > However, then we discover that polygon collection do not like >>>>> having >>>>> > empty offsets, which leads to a failure in the affine >>>>> transformation. >>>>> > >>>>> > So, the question is, is this a bug or a feature? I >>>>> personally believe >>>>> > that empty data is a perfectly valid scenario and given that >>>>> other >>>>> > matplotlib functions handle it gracefully, we should make the >>>>> > collections object more friendly to empty data. >>>>> >>>>> I agree; a call with empty data should simply not plot anything. >>>>> >>>>> Eric >>>>> >>>>> >>>>> Digging further, it appears that the problem is in _path.cpp for >>>>> _path_module::affine_transform() which explicitly checks for an >>>>> empty vertices array and throws an exception if it is empty. >>>>> >>>>> So, do we want to make _path.cpp empty-friendly or should we just >>>>> make empty collections objects just avoid doing anything that >>>>> requires doing an affine transform? >>>>> >>>>> Ben Root >>>>> >>>>> >>>>> >>>>> Ok, some more digging deepens the mystery. While an empty-friendly >>>>> _path.cpp would be nice, it appears that the collections and axes >>>>> objects are already doing all it can to avoid doing transforms for >>>>> empty >>>>> collections. >>>>> >>>>> However, it appears that the supposedly empty collection object from >>>>> scatter([], []) is not really empty. Its self._paths member contains a >>>>> list of unit_circle() from Path. This is also the case for >>>>> EllipseCollection. Meanwhile, LineCollection and PatchCollection >>>>> initializes their self._paths in accordance to their given data. >>>>> >>>> >>>> One way to solve the problem would be to start each draw() method with a >>>> short-circuit return in case there is nothing to draw. It would be needed >>>> only for classes for which empty self._paths is not a valid test. So for >>>> CircleCollection it would be: >>>> >>>> @allow_rasterization >>>> def draw(self, renderer): >>>> # sizes is the area of the circle circumscribing the polygon >>>> # in points^2 >>>> if len(self._sizes) == 0: >>>> return >>>> self._transforms = [ >>>> transforms.Affine2D().scale( >>>> (np.sqrt(x) * self.figure.dpi / 72.0) / np.sqrt(np.pi)) >>>> for x in self._sizes] >>>> return Collection.draw(self, renderer) >>>> >>>> (Collection.draw returns nothing, so there is no inconsistency in the >>>> return value.) >>>> >>>> Alternatively, it looks like maybe an empty self._transforms could be >>>> used in a short-circuit test at the start of Collection.draw. >>>> >>>> Eric >>>> >>>> >>>> >>>>> Ben Root >>>>> >>>>> >>>> >>> That wouldn't completely solve the problem. Affine transforms are being >>> done for get_datalim() as well. The other issue (and I see that I mixed >>> this up myself) is the assumption elsewhere that a non-empty self._path >>> attribute means that there is something to plot. This is an assumption that >>> is made in axes.py on line 1413 and it is an invalid assumption. >>> >>> As for your proposed solution in draw(), I prefer short-circuiting in >>> Collections.draw(). This makes for less work for new Collection subclasses. >>> >>> Ben Root >>> >> >> >> Working from an assumption that we can't possibly find all bad/broken >> assumptions in mpl, I decided to go the route of Eric and make the >> Collections object as robust as possible. I have attached a patch that >> should enable empty scatter plots in matplotlib and enable empty Collections >> as well. >> >> This has not been fully tested yet. Please be brutal! >> >> Ben Root >> >> >> > Heh, looks like that patch breaks the plotting of errorbars (the line > connecting the center point to the error caps is missing). > > I will see about fixing that. > > Ben Root > > Nailed it! After much examination and analysis, I realized that the transforms and the backend renderers are actually quite robust to empty arrays of points, so long as the arrays are shaped Nx2. It appears that in collections.py, an attempt is made to force that shape, but this is done inconsistently and wasn't technically correct. Consider the following snippet: offsets = np.asarray(offsets) if len(offsets.shape) == 1: offsets = offsets[np.newaxis,:] # Make it Nx2. If offsets was [[1, 4], [2, 5], [3, 6]], everything is fine. If offsets was [1, 4], everything works fine as well. However, if offsets was [1, 2, 3], this would go through, producing a 1x3 array (which isn't what we want and can cause errors later). If offsets was [], then a (1, 0) array is produced, and this causes problems later in the code when it checks for empty arrays incorrectly. We can make this code better and clearer by shaping the array appropriately like so: offsets = np.asarray(offsets) offsets.shape = (-1, 2) There are multiple benefits to this. First, it explicitly makes it clear our intent and requirement (that offsets needs to be Nx2). Second, it guarantees no copying of the array. Third, any array that can't fit this shape will throw an error. Fourth, we eliminate an if-statement! So, with this cleaner enforcement, I applied it to other places in collections.py where it wasn't being done at all. In addition, checks for empty arrays were being done with a check for len(offsets) > 0, which isn't quite right (a 1x0 array would pass this check, which is incorrect), so I changed it to the clearer and more correct offsets.size > 0. The attached patch will not only fix the issues in collections.py, but it also slightly changes the implementation of scatter() in axes.py to avoid doing min/max checks (which would fail for empty arrays) and instead will set data margins if they haven't been set already (and if there is data). I have tested the patch with the test suite and it passes. I also double-checked that all the mplot3d examples (which are not in the test suite) and they all rendered correctly. My only question is: Is it too late to commit to svn? I hope this enlightens others. Ben Root
emptycollections.patch
Description: Binary data
------------------------------------------------------------------------------ The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE: Pinpoint memory and threading errors before they happen. Find and fix more than 250 security defects in the development cycle. Locate bottlenecks in serial and parallel code that limit performance. http://p.sf.net/sfu/intel-dev2devfeb
_______________________________________________ Matplotlib-devel mailing list Matplotlib-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/matplotlib-devel