On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith <n...@pobox.com> wrote:
> On 7 Sep 2012 14:38, "Benjamin Root" <ben.r...@ou.edu> wrote:
> >
> > An issue just reported on the matplotlib-users list involved a user who
> > ran out of memory while attempting to do an imshow() on a large array.
> > While this wouldn't be totally unexpected, the user's traceback shows that
> > they ran out of memory before any actual building of the image occurred.
> > Memory usage sky-rocketed when imshow() attempted to determine the min and
> > max of the image. The input data was a masked array, and it appears that
> > the implementation of min() for masked arrays goes something like this
> > (paraphrasing here):
> >
> >     obj.filled(inf).min()
> >
> > The idea is that any masked element is set to the largest possible value
> > for its dtype in a copy of the array, and then a min() is performed
> > on that copy. I am assuming that max() does the same thing.
> >
> > Can this be done differently/more efficiently? If the "filled" approach
> > has to be done, maybe it would be a good idea to make the copy in chunks
> > instead of all at once? Ideally, it would be nice to avoid the copying
> > altogether and utilize some of the special iterators that Mark Wiebe
> > created last year.
>
> I think what you're looking for is where= support for ufunc.reduce. This
> isn't implemented yet but at least it's straightforward in principle...
> otherwise I don't know anything better than reimplementing .min() by hand.
>
> -n

Yes, it was the where= support that I was thinking of. I take it that it
was pulled out of the 1.7 branch with the rest of the NA stuff?

Ben Root
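
For illustration, the two ideas discussed above (processing the data in chunks, and a where= argument to ufunc.reduce) might look roughly like the sketch below. This is not NumPy's implementation: chunked_masked_min and where_masked_min are hypothetical helper names, the code assumes floating-point data, and the where=/initial= arguments to ufunc.reduce only exist in NumPy releases much newer than this thread (1.17 and later, as far as I know).

```python
import numpy as np


def chunked_masked_min(marr, chunk_size=1_000_000):
    """Minimum of the unmasked values, computed block by block.

    Hypothetical helper: avoids the full-size filled() copy by only
    materialising temporaries of at most chunk_size elements.
    """
    # reshape(-1) is a view for contiguous data; it copies otherwise.
    data = np.ma.getdata(marr).reshape(-1)
    mask = np.ma.getmask(marr)
    if mask is np.ma.nomask:
        # Nothing is masked, so a plain reduction needs no copy at all.
        return data.min()
    mask = mask.reshape(-1)

    best = None
    for start in range(0, data.size, chunk_size):
        stop = start + chunk_size
        # Boolean indexing copies only the valid values of this block.
        valid = data[start:stop][~mask[start:stop]]
        if valid.size:
            block_min = valid.min()
            if best is None or block_min < best:
                best = block_min
    # Every element masked: mirror numpy.ma and return the masked constant.
    return np.ma.masked if best is None else best


def where_masked_min(marr):
    """Same result via ufunc.reduce(where=...); assumes float data.

    Needs a NumPy version in which reduce accepts where= and initial=.
    The inverted mask is a boolean temporary (one byte per element).
    """
    data = np.ma.getdata(marr)
    keep = ~np.ma.getmaskarray(marr)
    return np.minimum.reduce(data, axis=None, where=keep, initial=np.inf)


a = np.ma.masked_array([[5.0, 1.0, 7.0], [3.0, 9.0, 2.0]],
                       mask=[[0, 1, 0], [0, 0, 0]])
print(chunked_masked_min(a, chunk_size=2), where_masked_min(a))
```

Note that where_masked_min returns the initial value (inf) rather than a masked result when every element is masked, so a caller would have to handle that case separately.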
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion