On 02/04/2011 12:33 PM, Christoph Gohlke wrote:
>
>
> On 2/4/2011 2:14 PM, Eric Firing wrote:
>> On 02/04/2011 11:33 AM, Eric Firing wrote:
>>> On 02/04/2011 10:28 AM, Christoph Gohlke wrote:
>>>>
>>>>
>>>> On 2/4/2011 11:54 AM, Eric Firing wrote:
>>>>> On 02/03/2011 05:35 PM, Christoph Gohlke wrote:
>>>>>>
>>>>>>
>>>>>> On 2/3/2011 6:50 PM, Eric Firing wrote:
>>>>>>> On 02/03/2011 03:04 PM, Benjamin Root wrote:
>>>>>>>
>>>>>>>> Also, not to sound too annoying, but has anyone considered the idea of
>>>>>>>> using compressed arrays for holding those rgba values?
>>>>>>>
>>>>>>> I don't see how that really helps; as far as I know, a full rgba array
>>>>>>> has to be passed into agg. What *does* help is using uint8 from start
>>>>>>> to finish. It might also be possible to use some smart downsampling
>>>>>>> before generating the rgba array, but the uint8 route seems to me the
>>>>>>> first thing to attack.
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>>>
>>>>>>>> Ben Root
>>>>>>>
>>>>>>
>>>>>> Please review the attached patch. It avoids generating and storing
>>>>>> float64 rgba arrays and uses uint8 rgba instead. That's a huge memory
>>>>>> saving and also faster. I can't see any side effects as
>>>>>> _image.fromarray() converts the float64 input to uint8 anyway.
>>>>>
>>>>> Christoph,
>>>>>
>>>>> Thank you! I haven't found anything wrong with that delightfully simple
>>>>> patch, so I have committed it to the trunk. Back in 2007 I added the
>>>>> ability of colormapping to generate uint8 directly, precisely to enable
>>>>> this sort of optimization. Why it was not already being used in imshow,
>>>>> I don't know--maybe I was going to do it, got sidetracked, and never
>>>>> finished.
>>>>>
>>>>> I suspect it won't be as simple as for the plain image, but there may be
>>>>> opportunities for optimizing with uint8 in other image-like operations.
>>>>>
>>>>>>
>>>>>> So far other attempts to optimize memory usage were thwarted by
>>>>>> matplotlib's internal use of masked arrays.
>>>>>> As mentioned before, users
>>>>>> can provide their own normalized rgba arrays to avoid all this
>>>>>> processing.
>>>>>>
>>>>>
>>>>> Did you see other potential low-hanging fruit that might be harvested
>>>>> with some changes to the code associated with masked arrays?
>>>>>
>>>>> Eric
>>>>>
>>>>
>>>> The norm function currently converts the data to double precision
>>>> floating point and also creates temporary arrays that can be avoided.
>>>> For float32 and low precision integer images this seems overkill and one
>>>> could use float32. It might be possible to replace the norm function
>>>> with numpy.digitize if that works with masked arrays. Last, the
>>>> _image.frombyte function does a copy of 'strided arrays' (only relevant
>>>> when zooming/panning large images). I try to provide a patch for each.
>>>
>>> masked arrays can be filled to create an ndarray before passing to
>>> digitize; whether that will be faster, remains to be seen. I've never
>>> used digitize.
>>
>> I didn't say that ("can be filled...") right. I think one would need to
>> use the mask to put in the i_bad index where appropriate. np.ma does
>> not have a digitize function. I suspect it won't help much if at all in
>> Normalize, but it would be a natural for use in BoundaryNorm.
>>
>> It looks easy to allow Normalize.__call__ to use float32 if that is what
>> it receives.
>>
>> I don't see any unnecessary temporary array creation apart from the
>> conversion to float64, except for the generation of a masked array
>> regardless of input. I don't think this costs much; if it gets an
>> ndarray it does not copy it, and it does not generate a full mask array.
>> Still, the function probably could be sped up a bit by handling
>> masking more explicitly instead of letting ma do the work.
>>
>
> In class Normalize:
>     result = 0.0 * val
> and
>     result = (val-vmin) / (vmax-vmin)
>
>> Eric
>>
>>>
>>> Regarding frombyte, I suspect you can't avoid the copy; the data
>>> structure being passed to agg is just a string of bytes, as far as I can
>>> see, so everything is based on having a simple contiguous array.
>>>
>
> The PyArray_ContiguousFromObject call will return a copy if the input
> array is not already contiguous.
Exactly. I thought you were suggesting that this was not needed, but
maybe I misunderstood.

Eric

>
> Christoph
>

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
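
For anyone following the frombyte/contiguity point, here is a rough sketch
of the behaviour from the Python side. It is not the C code in _image:
np.ascontiguousarray is used only as a Python-level analog of
PyArray_ContiguousFromObject, and the array shape and slice bounds are
made up for the illustration.

```python
import numpy as np

# Stand-in for a large uint8 RGBA image with shape (rows, cols, 4).
rgba = np.zeros((512, 512, 4), dtype=np.uint8)

# Already C-contiguous: the conversion hands back the same buffer.
same = np.ascontiguousarray(rgba)
whole_is_copy = not np.shares_memory(same, rgba)

# A strided view, such as a zoomed or panned sub-region, is not contiguous,
# so the conversion has to allocate and fill a new buffer.
view = rgba[100:300, 50:250]
packed = np.ascontiguousarray(view)
view_is_copy = not np.shares_memory(packed, view)

print(whole_is_copy, view_is_copy)  # False True
```

So the copy only shows up when a sub-region of the image is passed in,
which matches the "only relevant when zooming/panning" remark quoted
above.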