https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123966
Bug ID: 123966
Summary: Eliminate atomic relaxed memory order load with unused
result
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: law at gcc dot gnu.org
Target Milestone: ---
So Misha and I were looking at 520.omnet a while back and realized that there
was a load where we never used the result. This was buried inside jemalloc's
free() routine. So it's not crazy hot, but it is important. And because it is a
load, removing the instruction is worth more than removing a simple ALU op. The
sequence we saw for rv64 was:
ld a3,0(a3)
andi a3,a4,1
Note how the load writes a3, then the andi immediately overwrites a3 without
ever reading the loaded value.
This was ultimately tracked down to an atomic load where we never used the
result. It's kind of inherent in the jemalloc code paths. It's a relaxed
memory order so I believe it is safe to elide the load if its result is not
used.
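unsigned atomic_load_u_a;  /* declaration assumed; not shown in the report */
void idalloctm() {
  unsigned result;
  /* 0 is the value of __ATOMIC_RELAXED in GCC */
  __atomic_load(&atomic_load_u_a, &result, 0);
}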
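For comparison, here is a minimal self-contained sketch that puts the discarded load next to the same load with its result used. The names shared_counter, load_and_discard and load_and_use are hypothetical and do not come from jemalloc. It only illustrates the pattern described above and is not a reduced testcase from the benchmark.

```c
/* Hypothetical shared variable standing in for the jemalloc field. */
unsigned shared_counter = 7;

/* Relaxed atomic load whose result is never used: the pattern in question. */
void load_and_discard(void) {
    unsigned result;
    __atomic_load(&shared_counter, &result, __ATOMIC_RELAXED);
    (void)result;  /* result is intentionally unused */
}

/* Same relaxed load, but the result is returned, so the load must stay. */
unsigned load_and_use(void) {
    unsigned result;
    __atomic_load(&shared_counter, &result, __ATOMIC_RELAXED);
    return result;
}
```

Compiling this with something like gcc -O2 -S and comparing the two functions should show whether the load in load_and_discard survives. Per the report above, it currently does.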