Re: The "5-level Page Table" Default in Linux May Impact Throughput

Mark Dawson Sat, 18 Jun 2022 11:58:02 -0700

Thanks, Gil, for the compliment!

Regarding the VM tricks used by certain GCs out there, wouldn't that be
better served instead by Intel's upcoming Linear Address Masking (LAM) in
Sapphire Rapids, which would allow you to apply all types of tricks to the
upper 15 unused bits on 4-level Page Table systems (only the upper
6 bits on 5-level Page Table systems)? With LAM, user applications no
longer need to worry about masking off updates to those bits when accessing
memory via those pointers since the system will ignore them (i.e., the CPU
no longer checks for canonicality).
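
To make the masking burden concrete, here is a minimal sketch of software
pointer tagging, with plain Python integers standing in for 64-bit pointers.
The bit layout (tag in bits 48-62 above a 48-bit address) and the helper names
tag_pointer, untag_pointer, and get_tag are illustrative assumptions of mine,
not taken from any particular GC. Without LAM, every access has to go through
something like untag_pointer first; with LAM, the hardware would ignore the
tag bits and that step could be dropped.

```python
# Assumed layout for illustration: 48-bit address, 15 tag bits in bits 48..62.
ADDR_BITS = 48
TAG_BITS = 15
TAG_SHIFT = ADDR_BITS
TAG_MASK = ((1 << TAG_BITS) - 1) << TAG_SHIFT
ADDR_MASK = (1 << ADDR_BITS) - 1


def tag_pointer(addr: int, tag: int) -> int:
    """Store `tag` in the unused upper bits of a user-space address."""
    if addr & ~ADDR_MASK:
        raise ValueError("address does not fit in the assumed 48 bits")
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in the assumed 15 bits")
    return addr | (tag << TAG_SHIFT)


def untag_pointer(ptr: int) -> int:
    """Clear the tag bits; needed before every access when LAM is absent."""
    return ptr & ~TAG_MASK


def get_tag(ptr: int) -> int:
    """Read back the metadata stored in the upper bits."""
    return (ptr & TAG_MASK) >> TAG_SHIFT


if __name__ == "__main__":
    p = tag_pointer(0x7F00DEADB000, 0b101)
    print(hex(p), hex(untag_pointer(p)), get_tag(p))
```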


In that case, 4-level Page Tables maintain their advantage over 5-level
Page Tables, provided the end user does not need the 64TB or more of RAM
that only 5-level Page Tables enable.
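
For reference, the address-space arithmetic behind the two modes can be
sketched as follows. It assumes the standard x86-64 layout of 4KB pages
(12 offset bits) with 9 index bits per page-table level, and that user space
gets half of the virtual range. The function names are my own, purely for
illustration.

```python
PAGE_OFFSET_BITS = 12      # 4KB pages
INDEX_BITS_PER_LEVEL = 9   # 512 entries per page-table page


def virtual_address_bits(levels: int) -> int:
    """Virtual address width implied by the number of page-table levels."""
    return levels * INDEX_BITS_PER_LEVEL + PAGE_OFFSET_BITS


def user_space_bytes(levels: int) -> int:
    """User space is assumed to be the lower half of the virtual range."""
    return 1 << (virtual_address_bits(levels) - 1)


if __name__ == "__main__":
    for levels in (4, 5):
        bits = virtual_address_bits(levels)
        tib = user_space_bytes(levels) >> 40
        print(f"{levels}-level: {bits}-bit virtual, {tib} TiB of user space")
    print("growth factor:", user_space_bytes(5) // user_space_bytes(4))
```

The extra level adds 9 bits, which is where the 512x figure in the quoted
message below comes from.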


On Sat, Jun 18, 2022, 10:30 AM 'Gil Tene' via mechanical-sympathy <
[email protected]> wrote:

> A good read, and very well written.
>
> I'm with you in general on most/all of it except for one thing: That 64TB
> line.
>
> I draw that same line at 256GB. Or at least at a 256GB
> guest/pod/container/process. There are things out there that use virtual
> memory multi (and "many") mapping tricks for speed, and eat up a bunch of
> that seemingly plentiful 47 bit virtual user space in the process. The ones
> I know the most about (because I am mostly to blame for them) are high
> throughput concurrent GC mechanisms. Those have a multitude of
> implementation variants, all of which encode
> phases/generations/spaces/colors in higher-order virtual bits and use
> multi-mapping to efficiently recycle physical memory.
>
> For a concrete example, when running on a vanilla Linux kernel with
> 4-level page tables, the current C4 collector in the Prime JVM (formerly
> known as "Zing") uses different phase encodings, and different LVB barrier
> instruction encodings, depending on whether the heap size is above or below
> 256GB. Below 256GB, C4 gets to use sparse phase and generation encodings
> (using up 6 bits of virtual space) and a faster LVB test (test&jmp). Above
> 256GB, it uses denser encodings (using up only 3 bits) with slightly
> more expensive LVBs (test a bit in a mask & jmp). With a 5-level page table
> (on hardware that supports it), we can move that line out by 512x, which
> means that even many-TB Java heaps can use the same cheap LVB tests that the
> smaller ones do.
>
> I expect the (exact) same considerations will be true for ZGC in OpenJDK
> (once it adds a generational mode to be able to keep up with high
> throughput allocations and large live sets), as ZGC's virtual space
> encoding needs and resulting LVB test instruction encodings are identical
> to C4's.
>
> So I'd say you can safely turn off 5-level tables on machines that
> physically have less than 256GB of memory, or on machines that are known
> not to run Java (now or in the future) or some other in-memory application
> technology that uses virtual memory tricks at scale. But above 256GB, I'd
> keep it on, especially if the thing is e.g. a Kubernetes node that may
> want to run some cool Java workload tomorrow with the best speed and
> efficiency.
>
> On Friday, June 17, 2022 at 7:06:42 PM UTC+2 Mark E. Dawson, Jr. wrote:
>
>> In the article below, I address just *one* of the areas where this new
>> default Linux build option in most recent distros can adversely impact
>> multithreaded performance - page fault handling. But any workload which
>> requires the kernel to mimic MMU page table walking to accomplish a task
>> could be impacted adversely, as well (e.g., pipe communication). You'd do
>> well to do your own testing:
>>
>> https://www.jabperf.com/5-level-vs-4-level-page-tables-does-it-matter/
>>
>> *NOTE*: 5-level Page Table can be disabled with "no5lvl" on the kernel
>> boot command line.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web, visit
> https://groups.google.com/d/msgid/mechanical-sympathy/c7d49c5d-4872-4f2d-8010-3035ccf5d7d8n%40googlegroups.com
> <https://groups.google.com/d/msgid/mechanical-sympathy/c7d49c5d-4872-4f2d-8010-3035ccf5d7d8n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

