Hi Roberto,
Excited to hear the news about improving the G1 barriers! I have a few questions
about this proposal:
a) ZGC uses late expansion because it has a clear fast path and a medium/slow
path. The fast path contains only one or two simple instructions, so it does not
need optimization from C2. The G1 post barrier has several branch checks and
does not have clear boundaries between fast and slow paths. There could also be
optimization opportunities such as JDK-8225776. Permanently avoiding C2
optimization might lose performance.
b) G1 (like other card-table remembered-set GCs) uses imprecise card marking,
which marks the card of the object address instead of the field address. If we
use late expansion, we only have the field address there and therefore have to
recompute the object address, which needs additional instructions or registers.
BTW, I didn't see the details of this in the prototype implementation. We can
always use precise card marking in G1 anyway. Imprecise card marking has the
advantage of eliminating redundant card marks when writing into different
fields of an object, because the card mark addresses are the same. Parallel GC
can perform this optimization. Late expansion could benefit from dominance
analysis to remove redundant barriers, whereas traditional ideal optimizations
can barely help the G1 barrier.
Looking forward to your reply and progress!
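To make point b) concrete, here is a minimal sketch of the difference between
precise and imprecise card marking. It is only a toy model: the 512-byte card
size, the addresses, and all function names are assumptions chosen for
illustration and are not taken from the prototype or from HotSpot sources.

```python
CARD_SHIFT = 9  # assumption: 512-byte cards; illustrative only


def card_index(addr: int) -> int:
    """Index of the card that covers a heap address."""
    return addr >> CARD_SHIFT


def cards_marked(obj_addr: int, field_offsets: list[int], precise: bool) -> list[int]:
    """Card indices dirtied by stores to the given fields, in store order.

    precise=True  marks the card of each field address.
    precise=False marks the card of the object (header) address.
    """
    marks = []
    for offset in field_offsets:
        target = obj_addr + offset if precise else obj_addr
        marks.append(card_index(target))
    return marks


def field_to_object(field_addr: int, field_offset: int) -> int:
    """Recover the object address from a field address.

    This models the extra work mentioned above: if only the field address is
    available, the object address costs one more subtraction (and the offset
    must still be known at that point).
    """
    return field_addr - field_offset


# Hypothetical object that straddles a card boundary.
obj = 0x1000F0
offsets = [0x10, 0x120]

precise_marks = cards_marked(obj, offsets, precise=True)
imprecise_marks = cards_marked(obj, offsets, precise=False)
```

In this model the imprecise variant dirties the same card for both stores, so
the second mark is redundant and could be removed, while the precise variant
touches two different cards.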
Thanks,
Liang
________________________________________
From: porters-dev <porters-dev-r...@openjdk.org> on behalf of Roberto Castaneda Lozano
Sent: February 2, 2024 22:37
To: Andrew Dinn <ad...@redhat.com>; porters-dev@openjdk.org
Subject: Re: [External] : Re: Heads-up: Late G1 Barrier Expansion (Draft JEP)
Hi Andrew,
Thanks for your interest! I am unfortunately not very familiar with Shenandoah
and its barrier model, but in principle late barrier expansion should be
applicable to any collector where barriers are tightly coupled to individual
memory access operations and performance does not depend too much on exposing
barrier operation details to the JIT compiler.
If it helps, our prototype is available here:
https://github.com/robcasloz/jdk/tree/g1-late-barrier-expansion. Please note
that this is early, experimental work and might change significantly as the JEP
evolves.
Thanks,
Roberto
________________________________________
From: Andrew Dinn <ad...@redhat.com>
Sent: Friday, February 2, 2024 2:33 PM
To: Roberto Castaneda Lozano; porters-dev@openjdk.org
Subject: [External] : Re: Heads-up: Late G1 Barrier Expansion (Draft JEP)
Hi Roberto,
On 02/02/2024 13:18, Roberto Castaneda Lozano wrote:
> I have written (together with Erik Österlund) a draft JEP for
> simplifying C2's handling of G1 barriers, see
> https://bugs.openjdk.org/browse/JDK-8322295. This is a heads-up that
> the implementation of this JEP requires platform-specific support from
> all OpenJDK ports. While interpreter G1 barrier implementations are
> available for all ports and can be largely reused, the JEP
> additionally requires 1) defining G1-specific ADL instructions and 2)
> implementing platform-specific logic to support runtime calls from the
> barrier code. For ports that already support ZGC, the effort should be
> smaller, as the logic for 2) can be shared between ZGC and G1.
>
> To give a rough estimation of the required effort, the x86-64 changes
> in our prototype involve approximately 900 line insertions and 300
> line deletions over 9 files, among which approximately 300 deleted and
> inserted lines correspond to logic factored out from ZGC.
I looked at the proposal and was interested in the approach, not least because
ZGC appears to have traversed the path that this JEP recommends G1 to follow.
Have you considered whether this same approach might be taken with the
Shenandoah GC? Alternatively, can you state any basic assumptions about how G1
operates that are needed to enable this change, and which might therefore need
to be met by Shenandoah?
Of course, access to the prototype code might help answer those questions (at
least it would help someone better versed in Shenandoah than me) but a
high-level summary of what in the design of G1 and ZGC makes this approach work
or, conversely, might make it fail would, if available, be a great help.
regards,
Andrew Dinn
-----------