Richard Henderson <richard.hender...@linaro.org> writes: > Version 1 regressed the expansion of atomics, which means the addition > of CC clobber to all conditional branches is flawed. Version 2 goes > the other way: remove CC clobber from all conditional branches. > > This requires the out-of-range TBZ->TST+B.cond expansion be removed, > falling back to the far_branch expansion TBNZ+B.
Thanks for the update. I suppose this change (patch 9) is going to be the most controversial part. But I agree it's a necessary consequence of the approach that the series is taking. > I have fixed target/121388, the condition inversion problem, by > requiring the immediate ranges of CB be symmetric. This omits 64 > as a valid input to GE and -1 as a valid input to LE. > > I have dropped the removal of cbranch<SHORT>4 expanding to CBB/CBH. > I still think that's the correct way to go, but given the change to > mov<ALLI>cc, it is no longer required to fix an ICE. > > TODO list: > - Drop cbranch<SHORT>4, and direct expansion of CMPBR from cbranch<GPI>4, > so that we always get CCmode comparisons out of expand. Rely on combine > to re-assemble CBB/CBH when there's no other use of the compare. > Consider e.g. cmpbr-3.c, introduced by the final patch, where > CMP+CSEL is a better choice than CBHLT+MOV. > > - Re-add branch patterns containing cc clobbers, and maybe accept the > full range of constants. These can expand to CMP+Bcond for ranges > between 1KiB and 1MiB, and when condition inversion produces an > invalid immediate for CB. > > - Reorg the post-reload splitters that emit conditional branches. > There's a fair bit of by-hand generation of CMP+Bcond which > could be turned into CBcond, but even without CMPBR could benefit > from a well-designed helper function. > > I don't know if I'll get around to doing any of these, since I'm > supposed to be working on 128-bit page tables for qemu. :-) :-) I've replied to some of the individual patches, but the series is ok from my POV with those changes. Please wait till Monday to see if there are any other comments though. Richard > r~ > > > Richard Henderson (13): > aarch64: Fix spelling of BRANCH_LEN_N_1KiB > aarch64: Remove an indentation level from aarch64_if_then_else_costs > aarch64: Reorg aarch64_if_the_else_costs, conditional branch > aarch64: Use aarch64_gen_compare_zero_and_branch in aarch64_restore_za > aarch64: Fix gcs save/restore_stack_nonlocal > aarch64: Rename and improve aarch64_split_imm24 > aarch64: Fix aarch64_split_imm24 patterns > aarch64: Disable TARGET_CMPBR with aarch64_track_speculation > aarch64: Remove cc clobber from *aarch64_tbz<LTGE><ALLI>1 > aarch64: Fix gcc.target/aarch64/cmpbr.c > aarch64: Consider TARGET_CMPBR in rtx costs > aarch64: CMPBR branches must be invertable > aarch64: Fix condition accepted by mov<ALLI>cc > > gcc/config/aarch64/aarch64-protos.h | 4 + > gcc/config/aarch64/aarch64-sme.md | 3 +- > gcc/config/aarch64/aarch64.cc | 134 ++++++++---- > gcc/config/aarch64/aarch64.h | 5 +- > gcc/config/aarch64/aarch64.md | 201 +++++++----------- > gcc/config/aarch64/constraints.md | 10 +- > gcc/config/aarch64/iterators.md | 34 +-- > gcc/config/aarch64/predicates.md | 23 +- > gcc/testsuite/gcc.target/aarch64/cmpbr-1.c | 25 +++ > gcc/testsuite/gcc.target/aarch64/cmpbr-2.c | 110 ++++++++++ > gcc/testsuite/gcc.target/aarch64/cmpbr-3.c | 15 ++ > gcc/testsuite/gcc.target/aarch64/cmpbr.c | 10 +- > .../gcc.target/aarch64/gcs-nonlocal-3.c | 2 +- > gcc/testsuite/gcc.target/aarch64/pr109078.c | 2 +- > 14 files changed, 378 insertions(+), 200 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/cmpbr-1.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/cmpbr-2.c > create mode 100644 gcc/testsuite/gcc.target/aarch64/cmpbr-3.c