On 01/10/2012 10:22 AM, Gilles Louppe wrote:
>> @both: This might be a stupid question, but is there really so much
>> difference between indexing contiguously and indexing with a stride
>> over a C pointer?
>>
>> I didn't do much CPU optimization in the past so sorry for asking stupid
>> stuff ;)
>>      
> Yes, the speedup can be quite significant.
>
> To sum up, when the CPU reads from main memory, it fetches a whole
> contiguous block of bytes (a cache line) containing the address the
> pointer refers to. That block is placed in the CPU cache(s), which
> allows much faster access to the subsequent bytes within that block
> (i.e., the next elements in the array). If the array were C-ordered,
> we could not benefit from the CPU cache in our case: the next value
> in some column j would most likely not be within the block fetched
> into the cache, because the next values in that block would be the
> values on the same row, not in the same column.
>
>    
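To make the quoted argument concrete, here is a minimal sketch in C that
counts how many cache lines are touched when walking down one column.
The 64-byte line size, the 1000 x 1000 array of doubles, and the helper
name cache_lines_touched are assumptions chosen for illustration; they
do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the distinct cache lines touched by `count` accesses that start
 * at byte offset `start` and advance by `stride` bytes each time.
 * Assumes the array base is aligned to `line_size` (hypothetical setup).
 * Offsets only grow, so a new line is detected whenever the line index
 * differs from the previous one.
 */
size_t cache_lines_touched(size_t start, size_t stride,
                           size_t count, size_t line_size)
{
    assert(line_size > 0);

    size_t lines = 0;
    size_t prev_line = 0;
    int have_prev = 0;

    for (size_t k = 0; k < count; k++) {
        size_t line = (start + k * stride) / line_size;
        if (!have_prev || line != prev_line) {
            lines++;
            prev_line = line;
            have_prev = 1;
        }
    }
    return lines;
}
```

Walking down a column of 1000 doubles in a C-ordered array with 1000
columns uses a stride of 1000 * sizeof(double) bytes, so every access
lands on a different line. In a Fortran-ordered array the stride is
sizeof(double) bytes, so several consecutive values share one line.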
I thought that modern compilers, coupled with modern hardware
that uses all kinds of branch prediction and prefetchers,
would no longer depend so much on such simple mechanisms.
(see for example 
http://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers/)

But if you say there is still a difference, then I believe you ;)

Still, I would be interested in a C benchmark program.
Maybe I'll whip one up right now...
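
One possible shape for such a benchmark, as a rough sketch rather than
a definitive measurement: sum the same row-major array once along rows
(stride 1) and once down columns (stride n_cols), timing each pass with
clock(). The function names and the array sizes are assumptions for
illustration. Timings depend heavily on the compiler, optimization
flags, and hardware, and an optimizing compiler may transform the
loops.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sum a row-major array by walking along each row (stride 1). */
double sum_row_wise(const double *a, size_t n_rows, size_t n_cols)
{
    double total = 0.0;
    for (size_t i = 0; i < n_rows; i++)
        for (size_t j = 0; j < n_cols; j++)
            total += a[i * n_cols + j];
    return total;
}

/* Sum the same array by walking down each column (stride n_cols). */
double sum_column_wise(const double *a, size_t n_rows, size_t n_cols)
{
    double total = 0.0;
    for (size_t j = 0; j < n_cols; j++)
        for (size_t i = 0; i < n_rows; i++)
            total += a[i * n_cols + j];
    return total;
}

/* Run one traversal, store its sum in *result, return CPU seconds. */
double time_traversal(double (*f)(const double *, size_t, size_t),
                      const double *a, size_t n_rows, size_t n_cols,
                      double *result)
{
    clock_t t0 = clock();
    *result = f(a, n_rows, n_cols);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

/*
 * Allocate an n_rows x n_cols array filled with 1.0, time both
 * traversals, and print the timings.
 * Returns 0 if both sums agree, 1 if they differ, -1 if allocation fails.
 */
int run_benchmark(size_t n_rows, size_t n_cols)
{
    assert(n_rows > 0 && n_cols > 0);

    double *a = malloc(n_rows * n_cols * sizeof *a);
    if (a == NULL)
        return -1;
    for (size_t k = 0; k < n_rows * n_cols; k++)
        a[k] = 1.0;

    double s_row, s_col;
    double t_row = time_traversal(sum_row_wise, a, n_rows, n_cols, &s_row);
    double t_col = time_traversal(sum_column_wise, a, n_rows, n_cols, &s_col);

    printf("row-wise:    %.3f s (sum %.0f)\n", t_row, s_row);
    printf("column-wise: %.3f s (sum %.0f)\n", t_col, s_col);

    free(a);
    return s_row == s_col ? 0 : 1;
}
```

Calling run_benchmark(10000, 10000) from main would use about 800 MB,
assuming 8-byte doubles; smaller sizes that still exceed the cache
should show the same trend. Printing the sums keeps the compiler from
discarding the loops as dead code.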

Cheers,
Andy

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
