The Intel manual is indeed pretty terse in explaining it, but a "simple" check with the proper tools could be a good experiment to validate it: https://joemario.github.io/blog/2016/09/01/c2c-blog/
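To make such a check more targeted, one option is to toggle only the adjacent-line (spatial) prefetcher between runs instead of all prefetchers at once. Below is a minimal sketch, assuming the MSR 0x1a4 bit layout described in Intel's prefetcher-control disclosure (a set bit disables that prefetcher; this may not hold on every CPU generation). The helper name `prefetcher_mask` is hypothetical and not part of msr-tools, and the script only prints the wrmsr command rather than running it:

```shell
#!/bin/sh
# Build a value for MSR 0x1a4 from the names of the prefetchers to DISABLE.
# Assumed bit layout (from Intel's prefetcher-control disclosure):
#   bit 0: L2 hardware prefetcher
#   bit 1: L2 adjacent cache line (spatial) prefetcher
#   bit 2: DCU (L1 data) prefetcher
#   bit 3: DCU IP prefetcher
# prefetcher_mask is a hypothetical helper, not an msr-tools command.
prefetcher_mask() {
    mask=0
    for name in "$@"; do
        case "$name" in
            l2)       mask=$((mask | 1)) ;;
            adjacent) mask=$((mask | 2)) ;;
            dcu)      mask=$((mask | 4)) ;;
            dcu-ip)   mask=$((mask | 8)) ;;
            *) echo "unknown prefetcher: $name" >&2; return 1 ;;
        esac
    done
    # No arguments leaves mask at 0, i.e. all prefetchers enabled.
    printf '0x%x\n' "$mask"
}

# Print (do not run) the command that disables only the spatial prefetcher.
echo "wrmsr -a 0x1a4 $(prefetcher_mask adjacent)"
```

Running the same false-sharing test once with that mask applied and once with the mask reset to 0x0, and comparing the results, would show how much of the effect comes from the spatial prefetcher alone.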
I asked an Intel engineer about this some time ago (on 14th Feb), and he answered:

> Hi Francesco,
>
> About your questions on prefetchers:
>
> - Prefetchers normally kick in only after multiple cache lines in a specific pattern have been accessed. So I wouldn't worry too much for a single cache line.
> - Prefetchers tend to only read lines, so they by themselves cannot cause additional classic false sharing (but may cause additional aborts on TSX).
> - The same is true for speculative execution. You have more to fight than just prefetching; speculative execution tends to pull in lots of data early. You can assume the CPU runs 150+ instructions ahead speculatively, if not more.
> - There shouldn't be an automatic "get the next line" as much as there are pattern recognizers, and if there's a sequential pattern, the next lines will be prefetched. It's not unconditional.
>
> You can always test by enabling/disabling the prefetchers:
>
> wrmsr -a 0x1a4 0xf // to disable
> wrmsr -a 0x1a4 0x0 // to enable
>
> See https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors for more info.
>
> The wrmsr tool is available at: https://01.org/msr-tools/overview

On Monday, 29 May 2017 at 18:06:02 UTC+2, Benedict Elliott Smith wrote:

> It's approximately where you'd expect, in the Intel 64 and IA32 Architecture Optimization Reference Manual, under "Data Prefetching" on page 2-29, and referred to as the "Spatial prefetcher".
>
> It is pretty easy to miss, given it's only afforded a single sentence.
>
> It's possible to disable it on a per-core basis:
> https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
>
> On Mon, 29 May 2017 at 16:54, Martin Thompson <[email protected]> wrote:
>
>>> Switching topics slightly, prefetch extending the effective cache line size was causing us some consternation, since we were never able to find where it was documented. Do you have a reference to it? When did it start happening?
>>>
>>> It seems like it invalidates all software that was carefully written to honor 64 byte cache lines.
>>>
>>> IIRC Pentium 4 had 128 byte "sectors", but it was never fully explained what these were, and the word died with the P4.
>>
>> I've seen adjacent cacheline prefetching on Intel processors since the Netburst days (well over a decade). Until Sandy Bridge it was generally recommended to disable them because memory bandwidth often became an issue. These days it works on the L2 cache sitting alongside a prefetcher that looks for patterns of cache line accesses. L1 has different prefetchers. It does have quite a noticeable effect on false sharing but not as much as when within the same 64 byte cache line.

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
