On 2/4/2011 2:14 PM, Eric Firing wrote:
> On 02/04/2011 11:33 AM, Eric Firing wrote:
>> On 02/04/2011 10:28 AM, Christoph Gohlke wrote:
>>>
>>> On 2/4/2011 11:54 AM, Eric Firing wrote:
>>>> On 02/03/2011 05:35 PM, Christoph Gohlke wrote:
>>>>>
>>>>> On 2/3/2011 6:50 PM, Eric Firing wrote:
>>>>>> On 02/03/2011 03:04 PM, Benjamin Root wrote:
>>>>>>
>>>>>>> Also, not to sound too annoying, but has anyone considered the idea of
>>>>>>> using compressed arrays for holding those rgba values?
>>>>>>
>>>>>> I don't see how that really helps; as far as I know, a full rgba array
>>>>>> has to be passed into agg. What *does* help is using uint8 from start
>>>>>> to finish. It might also be possible to use some smart downsampling
>>>>>> before generating the rgba array, but the uint8 route seems to me the
>>>>>> first thing to attack.
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>>
>>>>>>> Ben Root
>>>>>>
>>>>>
>>>>> Please review the attached patch. It avoids generating and storing
>>>>> float64 rgba arrays and uses uint8 rgba instead. That's a huge memory
>>>>> saving and also faster. I can't see any side effects as
>>>>> _image.fromarray() converts the float64 input to uint8 anyway.
>>>>
>>>> Christoph,
>>>>
>>>> Thank you! I haven't found anything wrong with that delightfully simple
>>>> patch, so I have committed it to the trunk. Back in 2007 I added the
>>>> ability of colormapping to generate uint8 directly, precisely to enable
>>>> this sort of optimization. Why it was not already being used in imshow,
>>>> I don't know--maybe I was going to do it, got sidetracked, and never
>>>> finished.
>>>>
>>>> I suspect it won't be as simple as for the plain image, but there may be
>>>> opportunities for optimizing with uint8 in other image-like operations.
>>>>
>>>>>
>>>>> So far other attempts to optimize memory usage were thwarted by
>>>>> matplotlib's internal use of masked arrays.
>>>>> As mentioned before, users can provide their own normalized rgba
>>>>> arrays to avoid all this processing.
>>>>>
>>>>
>>>> Did you see other potential low-hanging fruit that might be harvested
>>>> with some changes to the code associated with masked arrays?
>>>>
>>>> Eric
>>>>
>>>
>>> The norm function currently converts the data to double precision
>>> floating point and also creates temporary arrays that can be avoided.
>>> For float32 and low precision integer images this seems overkill and one
>>> could use float32. It might be possible to replace the norm function
>>> with numpy.digitize if that works with masked arrays. Last, the
>>> _image.frombyte function does a copy of 'strided arrays' (only relevant
>>> when zooming/panning large images). I try to provide a patch for each.
>>
>> masked arrays can be filled to create an ndarray before passing to
>> digitize; whether that will be faster, remains to be seen. I've never
>> used digitize.
>
> I didn't say that ("can be filled...") right. I think one would need to
> use the mask to put in the i_bad index where appropriate. np.ma does
> not have a digitize function. I suspect it won't help much if at all in
> Normalize, but it would be a natural for use in BoundaryNorm.
>
> It looks easy to allow Normalize.__call__ to use float32 if that is what
> it receives.
>
> I don't see any unnecessary temporary array creation apart from the
> conversion to float64, except for the generation of a masked array
> regardless of input. I don't think this costs much; if it gets an
> ndarray it does not copy it, and it does not generate a full mask array.
> Still, the function probably could be sped up a bit by handling
> masking more explicitly instead of letting ma do the work.
>
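
To make the float32 and explicit-masking points concrete, here is a minimal sketch of a linear normalization that keeps float32 input as float32, works on the plain data array, and reattaches the mask at the end. It is an illustration only, not the code in matplotlib's Normalize; the name normalize_keep_dtype and the rule "float32 stays float32, everything else becomes float64" are assumptions made for this example.

```python
import numpy as np

def normalize_keep_dtype(val, vmin, vmax):
    """Map val linearly onto [0, 1] (hypothetical helper, not matplotlib code)."""
    mask = np.ma.getmask(val)   # np.ma.nomask for a plain ndarray
    data = np.ma.getdata(val)   # underlying ndarray, no copy
    # Assumed policy: keep float32, promote everything else to float64.
    dtype = np.float32 if data.dtype == np.float32 else np.float64
    if vmin == vmax:
        # Degenerate range: return zeros, as the 0.0 * val branch would.
        return np.ma.array(np.zeros(data.shape, dtype=dtype), mask=mask)
    # One working copy, which also converts integer input to float ...
    result = np.array(data, dtype=dtype)
    # ... then in-place updates, so (val - vmin) needs no extra temporary.
    result -= vmin
    result /= (vmax - vmin)
    # Reattach the original mask without copying the data again.
    return np.ma.array(result, mask=mask, copy=False)

img = np.arange(5, dtype=np.float32)
print(normalize_keep_dtype(img, 0.0, 4.0).dtype)
```

The input array is left untouched; the only full-size allocation is the single working copy.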
The temporary arrays I had in mind are created in class Normalize:
result = 0.0 * val and result = (val-vmin) / (vmax-vmin)

> Eric
>
>>
>> Regarding frombyte, I suspect you can't avoid the copy; the data
>> structure being passed to agg is just a string of bytes, as far as I can
>> see, so everything is based on having a simple contiguous array.
>>

The PyArray_ContiguousFromObject call will return a copy if the input
array is not already contiguous.

Christoph

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
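
The contiguity behaviour mentioned for frombyte can be observed from Python. The sketch below uses np.ascontiguousarray as a Python-level stand-in for the C call (an assumption made for illustration; it does not exercise _image.frombyte itself), and the array shape is arbitrary.

```python
import numpy as np

image = np.zeros((400, 600, 4), dtype=np.uint8)  # contiguous uint8 rgba buffer
view = image[::2, ::2]                           # strided view of the same memory

same = np.ascontiguousarray(image)  # already contiguous: no copy is needed
copy = np.ascontiguousarray(view)   # strided input: a new contiguous array

print(image.flags['C_CONTIGUOUS'], np.shares_memory(image, same))
print(view.flags['C_CONTIGUOUS'], np.shares_memory(view, copy))
```

The first line prints True True and the second False False: only the strided view forces a copy.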