Yeah, I think I figured it out on my own, hence the message deletion. 
Nonetheless, I don't see your comment.

On Thursday, December 11, 2014 11:29:15 PM UTC+1, Andreas Noack wrote:
>
> I wrote a comment in the gist.
>
> 2014-12-11 17:08 GMT-05:00 Robert Gates:
>
>> In any case, this does make me wonder what is going on under the hood... 
>> I would not call the vectorized code "vectorized". IMHO, this should just 
>> pass to BLAS without overhead. Something appears to be creating a bunch of 
>> temporaries.
>>
>> On Thursday, December 11, 2014 5:47:01 PM UTC+1, Petr Krysl wrote:
>>
>>> Acting upon the advice that replacing matrix-matrix multiplications in 
>>> vectorized form with loops would help with performance, I chopped out a 
>>> piece of code from my finite element solver 
>>> (https://gist.github.com/anonymous/4ec426096c02faa4354d) and ran some 
>>> tests with the following results:
>>>
>>> Vectorized code:
>>> elapsed time: 0.326802682 seconds (134490340 bytes allocated, 17.06% gc time)
>>>
>>> Loops code:
>>> elapsed time: 4.681451441 seconds (997454276 bytes allocated, 9.05% gc time)
>>>
>>> SLOWER and using MORE memory?!
>>>
>>> I must be doing something terribly wrong.
>>>
>>> Petr
>>>
>>>
>
