Kim, you are right, to play it safe we need to -fno-tree-pre all relevant bool occurrances. I'll get back with an updated webrev when testing is finished.
-Aleksei On 19/11/2019 01:51, Kim Barrett wrote: >> On Nov 18, 2019, at 8:10 AM, Aleksei Voitylov <aleksei.voity...@bell-sw.com> >> wrote: >> >> Hi, >> >> please review this build-only workaround for a GCC bug [1]. The problem >> surfaced on ARM32 when new code [2] got inlined with old code. In GCC at >> RTL level there is no good distinction between pointer and int, which >> makes it hard for the aliasing code to correctly recognize base >> addresses for objects, which leads to incorrect aliasing. As a >> consequence one of the loads in this code was completely eliminated on >> ARM32 by DSE. All GCC versions I could reach are affected. >> >> Thanks to the GCC community, the fix will appear in GCC 10 but before it >> becomes mainstream we need a workaround. The workaround is to disable >> -ftree-pre optimization on ARM32 which triggers the bug. The problem >> does not surface on x86 in this code. >> >> issue: https://bugs.openjdk.java.net/browse/JDK-8231612 >> webrev: http://cr.openjdk.java.net/~avoitylov/webrev.8231612.02/ >> >> Testing on ARM32: the problem described in 8231612 (100% CPU utilization >> in Service Thread) no longer appears, the load is not eliminated as >> observed by disassembly. >> Testing on x86_64: linux-x86_64-server-release builds OK after this change. >> >> Thanks, >> >> -Aleksei >> >> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462 >> [2] https://bugs.openjdk.java.net/browse/JDK-8226366 > This change to limit the optimization of OopStorage.cpp doesn't seem > sufficient to me, based on my reading the gcc bug thread. From that > thread it sounds like any use of our CmpxchgByteUsingInt helper may be > miscompiled. (And who knows what else...) > > There are certainly other uses of that helper. On that platform I > think it gets used for any Atomic::cmpxchg of a bool value; at least, > that's where I *think* OopStorage is hitting it. (I'm guessing the > problem is the cmpxchg in has_cleanup_work_and_reset.) > > Just egrepping for "cmpxchg\((true|false)" finds > > ./share/gc/g1/g1RemSet.cpp: bool marked_as_dirty = Atomic::cmpxchg(true, > &_contains[region], false) == false; > ./share/gc/g1/g1RemSet.cpp: return !Atomic::cmpxchg(true, > &_collection_set_iter_state[region], false); > ./share/gc/g1/g1RemSet.cpp: !Atomic::cmpxchg(true, > &_fast_reclaim_handled, false)) { > ./share/gc/z/zRootsIterator.cpp: if (!_claimed && Atomic::cmpxchg(true, > &_claimed, false) == false) { > ./share/gc/z/zRootsIterator.cpp: if (!_claimed && Atomic::cmpxchg(true, > &_claimed, false) == false) { > ./share/gc/shared/oopStorage.cpp: return Atomic::cmpxchg(false, > &needs_cleanup_requested, true); > ./share/gc/shared/ptrQueue.cpp: Atomic::cmpxchg(true, &_transfer_lock, > false)) { > ./share/services/memTracker.cpp: if (Atomic::cmpxchg(true, > &g_final_report_did_run, false) == false) { > > There may be others with non-literal new-value arguments. And there > may be non-bool byte-sized cmpxchg's, though I don't offhand know of any. > > The two in zRootsIterator don't matter for arm32 (ZGC isn't supported > on 32bit platforms), but is this problem limited to arm32? The gcc > bug thread suggests it isn't affecting x86_32/64, and might be limited > to arm(32?). > > Also, the latest comment in the gcc thread says the proposed fix > doesn't work, so maybe not (yet) fixed in gcc 10. > >