All,
I just committed (r6994) some modifications to numpy.ma.getdata (Eric
Firing's patch) and to the ufunc wrappers that were too slow with
large arrays. We're roughly 3 times faster than we used to be, but
still slower than the equivalent classic ufuncs (no surprise here).

Here's the catch: it's basically cheating. I got rid of the
pre-processing step (where a mask was computed from the domain and the
input was set to a filling value according to that mask, before the
actual computation). Instead, I call
np.seterr(divide='ignore', invalid='ignore') before applying the ufunc
to the .data part, then mask the invalid values (if any) and reset the
corresponding entries in .data to the input. Finally, I restore the
error status. All in all, we're still data-friendly, meaning that the
value below a masked entry is the same as the input, but we can no
longer say that values initially masked are discarded (they're used in
the computation, then reset to their initial value)...
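
For concreteness, here's a rough sketch of the idea for a divide. This
is not the committed code, just an illustration, and it assumes
floating-point inputs (the function name and details are made up):

import numpy as np
import numpy.ma as ma

def sketch_divide(a, b):
    a, b = ma.asarray(a), ma.asarray(b)
    saved = np.seterr(divide='ignore', invalid='ignore')
    try:
        result = a.data / b.data            # plain ufunc, no pre-processing
    finally:
        np.seterr(**saved)                  # restore the previous error state
    invalid = ~np.isfinite(result)          # entries outside the domain
    mask = ma.mask_or(ma.getmaskarray(a), ma.getmaskarray(b)) | invalid
    np.putmask(result, invalid, a.data)     # keep the input value under new masks
    return ma.array(result, mask=mask)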

This playing around with the error status may (or may not, I don't
know) cause some problems down the road.
It's still far faster than computing the domain (especially
_DomainSafeDivide) when the inputs are large...
I'd be happy if you could give it a try and send some feedback.

Cheers
P.

On May 9, 2009, at 8:17 PM, Eric Firing wrote:

> Eric Firing wrote:
>
> Pierre,
>
> ... I pressed "send" too soon.  There are test failures with the  
> patch I attached to my last message.  I think the basic ideas are  
> correct, but evidently there are wrinkles to be worked out.  Maybe  
> putmask() has to be used instead of where() (putmask is much faster)  
> to maintain the ability to do *= and similar, and maybe there are  
> other adjustments. Somehow, though, it should be possible to get  
> decent speed for simple multiplication and division; a 10x penalty  
> relative to ndarray operations is just too much.
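>
> (Untested sketch of what I mean, with made-up inputs -- putmask
> patches the result in place, while where() builds a whole new
> temporary:)
>
> import numpy as np
> import numpy.ma as ma
>
> a = ma.array([1., 2., 4.], mask=[0, 1, 0])
> b = ma.array([2., 0., 8.], mask=[0, 0, 0])
>
> d = a.data * b.data                          # ufunc on the raw data
> m = ma.mask_or(ma.getmaskarray(a), ma.getmaskarray(b))
> np.putmask(d, m, a.data)                     # in place: restore inputs under the mask
> # the where() route allocates an extra full-size temporary instead:
> # d = np.where(m, a.data, a.data * b.data)
> result = ma.array(d, mask=m)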
>
> Eric
>
>
>> Eli Bressert wrote:
>>> Hi,
>>>
>>> I'm using masked arrays to compute large-scale standard deviations,
>>> multiplications, Gaussians, and weighted averages. At first I thought
>>> using masked arrays would be a great way to sidestep looping
>>> (which it is), but it's still slower than expected. Here's a snippet
>>> of the code I'm using.
>> [...]
>>> # Like the spatial_weight section, this takes about 20 seconds
>>> W = spatial_weight / Rho2
>>>
>>> # Takes less than one second.
>>> Ave = np.average(av_good,axis=1,weights=W)
>>>
>>> Any ideas on why it would take such a long time for processing?
>> Part of the slowdown is what looks to me like unnecessary copying
>> in _MaskedBinaryOperation.__call__.  It uses getdata, which
>> applies numpy.array to its input, forcing a copy.  I think the copy
>> is unintentional in at least one sense, and possibly two:
>> first, because the default argument of getattr is always evaluated,
>> even when it is not needed; and second, because np.array is called
>> where np.asarray or an equivalent would suffice.
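>> (A quick illustration of both points -- not the patch itself, and the
>> getattr line is only indicative of the pattern, not the exact code:)
>>
>> import numpy as np
>>
>> x = np.empty((100000, 50))
>> print(np.array(x) is x)     # False -- np.array copies its input by default
>> print(np.asarray(x) is x)   # True  -- asarray returns the same object, no copy
>>
>> # getattr evaluates its default argument even when the attribute exists,
>> # so something along the lines of
>> #     data = getattr(a, '_data', np.array(a, subok=True))
>> # builds (and copies into) the default array on every call.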
>> The first file attached below shows the kernprof in the case of  
>> multiplying two masked arrays, shape (100000,50), with no masked  
>> elements; 2/3 of the time is taken copying the data.
>> Now, if there are actually masked elements in the arrays, it gets
>> much worse: see the second attachment.  The total time has
>> increased by more than a factor of 3, and the culprit is
>> numpy.where(), a very slow function.  It looks to me like it is
>> doing nothing useful at all; the numpy binary operation is still
>> executed for all elements, regardless of mask, contrary to
>> the intention implied by the comment in the code.
>> The third attached file has a patch that fixes the getdata problem
>> and eliminates the where() call.
>> With this patch applied we get the profile in the 4th file, to be  
>> compared to the second profile.  Much better.  I am pretty sure it  
>> could still be sped up quite a bit, though.  It looks like the  
>> masks are essentially being calculated twice for no good reason,  
>> but I don't completely understand all the mask considerations, so  
>> at this point I am not trying to fix that problem.
>> Eric
>>> Especially the spatial_weight and W variables? Would there be a
>>> faster way to do this? Or is there a way to make numpy.std ignore
>>> NaNs when processing?
>>>
>>> Thanks,
>>>
>>> Eli Bressert
>
