On 10-08-17 03:54 PM, Mathieu Desnoyers wrote:
* Mathieu Desnoyers ([email protected]) wrote:
* David Goulet ([email protected]) wrote:
On 10-08-17 02:51 PM, Mathieu Desnoyers wrote:
* David Goulet ([email protected]) wrote:
Hi,
I have some doubts about the value of #define CACHE_LINE_SIZE
(urcu/arch_x86.h), which is set to 128.
After some research and a look at my own computer, x86 processors seem
to use 64-byte cache lines most of the time. On my i7 920, here is what
I get:
# getconf LEVEL1_DCACHE_LINESIZE
64
# cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
64
Since the Intel NetBurst microarchitecture, the Intel manual also says
64 bytes, and apparently this has not changed for Nehalem.
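For reference, the same check can be made from C. This is only a minimal sketch, assuming glibc on Linux: _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension to sysconf(), not part of POSIX, and the helper name l1d_line_size() is made up for illustration. It returns -1 when the value is not available (some kernels and virtual machines report 0).

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the L1 data cache line size in bytes,
 * or -1 if the C library or the kernel cannot report it.
 */
long l1d_line_size(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
	long sz = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);

	/* sysconf() returns -1 on error; 0 means "unknown" in practice. */
	return sz > 0 ? sz : -1;
#else
	/* Not glibc (or too old): the value is not exposed this way. */
	return -1;
#endif
}
```

A runtime value like this is only informative here: CACHE_LINE_SIZE has to be a compile-time constant because it is used for alignment.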
So, Mathieu, why 128 bytes? UST uses that value; if it is wrong for
x86, it could affect cache pressure, since two lines are required for
a structure smaller than 64 bytes.
See Linux kernel source:
arch/x86/Kconfig.cpu
#
# Define implied options from the CPU selection here
config X86_INTERNODE_CACHE_SHIFT
    int
    default "12" if X86_VSMP
    default "7" if NUMA
    default X86_L1_CACHE_SHIFT
and
config X86_L1_CACHE_SHIFT
    int
    default "7" if MPENTIUM4 || MPSC
    default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
    default "4" if X86_ELAN || M486 || M386 || MGEODEGX1
    default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
So the Pentium 4 seems to have 128-byte cache lines.
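To make the arithmetic explicit: the Kconfig values are shifts, so the size in bytes is 1 << shift. A small sketch, where the helper name cache_shift_to_bytes() is hypothetical and not part of the kernel or of userspace RCU:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert a Kconfig cache shift into a size in bytes. */
static inline size_t cache_shift_to_bytes(unsigned int shift)
{
	return (size_t)1 << shift;
}
```

With the defaults quoted above, shift 7 (MPENTIUM4, MPSC, NUMA) gives 128 bytes, shift 6 gives 64 bytes, and shift 12 (X86_VSMP) gives 4096 bytes.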
Yep, I saw that, and this is why I'm asking: only NUMA, P4 and vSMP
machines are bigger than 64 bytes. The rest are 64 bytes (x86 generic,
Core 2 (Nehalem), Atom).
So you are saying that you prefer to use 128 bytes, knowing that most
x86 CPUs use 64 bytes or less?
Yes. The performance degradation caused by cache-line bouncing is _way_
worse than extra cache pressure.
Oh, and by the way, given that these are arrays made of one variable per
cpu, the extra space allocated will not consume extra cache lines in any
of the CPUs. We're just wasting a bit of memory here, not adding to cache
pressure.
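As an illustration of the layout described here, this is a rough sketch of a per-cpu array padded to CACHE_LINE_SIZE. It is not the actual urcu or UST code: struct percpu_counter, NR_CPUS_EXAMPLE and percpu_inc() are made-up names, and the aligned attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE	128	/* value discussed in this thread */
#define NR_CPUS_EXAMPLE	4	/* hypothetical, for illustration only */

/*
 * One slot per cpu. The aligned attribute also rounds sizeof() up to
 * CACHE_LINE_SIZE, so two slots never share a 128-byte block.
 */
struct percpu_counter {
	unsigned long count;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct percpu_counter counters[NR_CPUS_EXAMPLE];

/* Each cpu writes only to its own slot, so no line bounces between cpus. */
static inline void percpu_inc(unsigned int cpu)
{
	counters[cpu].count++;
}

_Static_assert(sizeof(struct percpu_counter) == CACHE_LINE_SIZE,
	       "each slot must span exactly CACHE_LINE_SIZE bytes");
```

On a CPU with 64-byte lines, a given cpu only ever loads the first 64 bytes of its own slot; the second half of each 128-byte slot is never accessed, so it costs memory but should not occupy a cache line.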
Mathieu
Sorry to chime in, but wouldn't padding to 128 bytes on architectures
with 64-byte cache lines "waste" an extra line every time, thus
indirectly adding to cache pressure?
(relatively newbie here, please be gentle :) )
Alexandre
Mathieu
Hopefully the ScaleMP vSMP machines are rare enough (they would require
4k alignment).
NUMA is not that rare, and requires 128-byte cache lines too.
Can you send a patch for userspace RCU that documents this briefly in
urcu/arch_x86.h ? (just a summary of the info I pasted here would be
fine)
Thanks,
Mathieu
Thanks!
--
David Goulet
LTTng project, DORSAL Lab.
PGP/GPG : 1024D/16BD8563
BE3C 672B 9331 9796 291A 14C6 4AF7 C14B 16BD 8563
_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com