Re: [Numpy-discussion] dot() performance depends on data?

David Cournapeau Fri, 10 Sep 2010 18:05:07 -0700

On Sat, Sep 11, 2010 at 9:47 AM, Charles R Harris
<[email protected]> wrote:
>
>
> On Fri, Sep 10, 2010 at 6:41 PM, David Cournapeau <[email protected]>
> wrote:
>>
>> On Sat, Sep 11, 2010 at 2:57 AM, Charles R Harris
>> <[email protected]> wrote:
>> >
>> >
>> > On Fri, Sep 10, 2010 at 11:36 AM, Hagen Fürstenau <[email protected]>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I'm multiplying two 1000x1000 arrays with numpy.dot() and seeing
>> >> significant performance differences depending on the data. It seems to
>> >> take much longer on matrices with many zeros than on random ones. I
>> >> don't know much about optimized MM implementations, but is this normal
>> >> behavior for some reason?
>> >>
>> >
>> > Multiplication by zero used to be faster than multiplication by random
>> > numbers. However, modern hardware and compilers may have changed that to
>> > pretty much a wash. More likely you are seeing cache issues due to data
>> > locality, or even variations in the time given to the thread running the
>> > multiplication.
>>
>> That's actually most likely a denormal issue. The a and b matrices (from
>> mm.py) have many very small numbers, which could cause values to be
>> denormal. Maybe a has more denormals than b. Denormals cause
>> significant performance issues, on Intel hardware at least.
>>
>> Unfortunately, we don't have a way in numpy to check for denormals that
>> I know of.
>>
>
> The matrices could be scaled up to check that.


Indeed - and I misread the script anyway; I should not investigate
this kind of thing right after waking up :)

Anyway, it seems it is indeed a denormal issue: adding a small constant
(1e-10) gives the same speed for both timings.
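
A minimal sketch of how such a check could look, assuming float input.
It is not the original mm.py: the matrices, the helper names
count_denormals and time_dot, and the 1e-310 scale factor are all made
up for illustration. A value is counted as denormal when it is nonzero
but smaller in magnitude than np.finfo(dtype).tiny, the smallest
positive normal number. Any timing difference will depend on the
hardware and the BLAS in use.

```python
import timeit
import numpy as np

def count_denormals(x):
    """Count nonzero entries smaller in magnitude than the smallest normal float.

    Assumes a floating-point array; np.finfo rejects integer dtypes.
    """
    x = np.asarray(x)
    tiny = np.finfo(x.dtype).tiny  # smallest positive *normal* number
    mag = np.abs(x)
    return int(np.count_nonzero((mag > 0) & (mag < tiny)))

def time_dot(m, other, repeat=3):
    """Best-of-`repeat` wall time for a single np.dot(m, other)."""
    return min(timeit.repeat(lambda: np.dot(m, other), number=1, repeat=repeat))

# Hypothetical test data (not the matrices from the original script).
n = 200
rng = np.random.default_rng(0)
normal = rng.random((n, n))           # ordinary values in [0, 1)
small = rng.random((n, n)) * 1e-310   # scaled into the float64 denormal range
shifted = small + 1e-10               # the small-constant trick from this thread

print("denormals in small:  ", count_denormals(small))
print("denormals in shifted:", count_denormals(shifted))
print("dot time, small:   %.4f s" % time_dot(small, normal))
print("dot time, shifted: %.4f s" % time_dot(shifted, normal))
```

Adding the constant moves every entry back into the normal range, so
the second count should be zero and the two timings can be compared
directly.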

cheers,

David
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
