On 10-08-17 04:27 PM, Mathieu Desnoyers wrote:
* Alexandre Montplaisir ([email protected]) wrote:
On 10-08-17 03:54 PM, Mathieu Desnoyers wrote:
  [...]
Oh, and by the way, given that these are arrays made of one variable per
cpu, the extra space allocated will not consume extra cache lines in any
of the CPUs. We're just wasting a bit of memory here, not adding to cache
pressure.

Mathieu
Sorry to chime in, but wouldn't padding to 128 bytes on architectures
with 64-byte cache lines "waste" an extra line every time, thus
indirectly adding to cache pressure?
A cache line is only used if the data located in that cache line is
touched. If we only have padding in the second half of the 128 bytes,
then the associated 64 bytes cache line is never fetched by the cpu.

Reality can be a bit different for sequential accesses with prefetching,
but this reasoning applies well to randomly-accessed memory.

Does that make sense?

Yes, thanks for the explanation!

I see now why we *clearly* don't want the align attribute to be smaller than the actual cache-line size.

David's concerns and mine were more about the other architectures (core2, atom, "generic", etc., which do seem to be more common than P4 & NUMA), where the lib still defines a 128-byte cache line size even though they use 64-byte lines. Would it be worth having a per-architecture definition?

Alexandre

Thanks,

Mathieu

(relatively newbie here, please be gentle :) )

Alexandre



_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
