I wrote a comment in the gist.

2014-12-11 17:08 GMT-05:00 Robert Gates <[email protected]>:

> In any case, this does make me wonder what is going on under the hood... I
> would not call the vectorized code "vectorized". IMHO, this should just
> pass to BLAS without overhead. Something appears to be creating a bunch of
> temporaries.
>
> On Thursday, December 11, 2014 5:47:01 PM UTC+1, Petr Krysl wrote:
>
>> Acting upon the advice that replacing matrix-matrix multiplications in
>> vectorized form with loops would help with performance, I chopped out a
>> piece of code from my finite element solver
>> (https://gist.github.com/anonymous/4ec426096c02faa4354d) and ran some
>> tests with the following results:
>>
>> Vectorized code:
>> elapsed time: 0.326802682 seconds (134490340 bytes allocated, 17.06% gc time)
>>
>> Loops code:
>> elapsed time: 4.681451441 seconds (997454276 bytes allocated, 9.05% gc time)
>>
>> SLOWER and using MORE memory?!
>>
>> I must be doing something terribly wrong.
>>
>> Petr
>>
>>
