On 01/10/2012 10:22 AM, Gilles Louppe wrote:
>> @both: This might be a stupid question but is there really so much
>> difference in indexing continuously or with stride over a C pointer?
>>
>> I didn't do much CPU optimization in the past so sorry for asking stupid
>> stuff ;)
>>
> Yes, the speedup can be quite significant.
>
> To sum up, when the CPU accesses the central memory, it prefetches a
> whole contiguous block of bytes from the location pointed by the
> pointer. That block is put into the CPU cache(s), which allow much
> faster accesses to the subsequent bytes within that block (i.e., the
> next elements in the array). If the array was C-ordered, then one
> couldn't benefit from the CPU cache in our case (because the next
> value in some column j would most likely not be within the block
> fetched into the cache (the next values in the block would be the
> values on the same line, not on the same column)).
>

I thought that modern compilers coupled with modern hardware that uses
all kinds of branch prediction and prefetchers wouldn't really rely on
such simple mechanisms any more.
(see for example
http://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers/)
But if you say there is still a difference then I believe you ;)
Still, I would be interested in a C program benchmark. Maybe I'll whip
one up right now...

Cheers,
Andy

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
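
To make the cache argument quoted in the message above concrete, here is a minimal sketch in Python that counts how many distinct cache lines are touched when walking down one column of a 2-D array, once for a C-ordered (row-major) layout and once for a Fortran-ordered (column-major) one. It is not the C benchmark mentioned in the thread and measures no timings; it only does the address arithmetic. The 64-byte cache line, the 8-byte element size, a cache-line-aligned base address, and the helper names `column_offsets` and `cache_lines_touched` are all assumptions made for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: a common cache line size on x86 CPUs
ITEMSIZE = 8           # assumption: 8-byte (float64) elements


def column_offsets(n_rows, n_cols, j, order):
    """Byte offsets (from the array base) of the elements of column j."""
    if order == "C":
        # Row-major: element (i, j) sits at i * n_cols + j, so moving down
        # a column jumps a whole row (n_cols elements) each step.
        return [(i * n_cols + j) * ITEMSIZE for i in range(n_rows)]
    if order == "F":
        # Column-major: element (i, j) sits at j * n_rows + i, so moving
        # down a column visits adjacent elements.
        return [(j * n_rows + i) * ITEMSIZE for i in range(n_rows)]
    raise ValueError("order must be 'C' or 'F'")


def cache_lines_touched(offsets):
    """Number of distinct cache lines covering the given byte offsets,
    assuming the array base is aligned to a cache line boundary."""
    return len({off // CACHE_LINE_BYTES for off in offsets})


if __name__ == "__main__":
    n_rows, n_cols, j = 1024, 1024, 3
    for order in ("C", "F"):
        n_lines = cache_lines_touched(column_offsets(n_rows, n_cols, j, order))
        print(order, n_lines)
    # prints "C 1024" then "F 128"
```

Under these assumptions every element of the column lands on its own cache line in the C-ordered case, while the Fortran-ordered case packs eight elements per line. How much of that gap shows up as wall-clock time depends on the hardware prefetchers raised in the reply, which this sketch does not model; only an actual benchmark would settle that.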
