On Tue, Nov 12, 2024 at 4:43 PM Jeff Law <jeffreya...@gmail.com> wrote: > > > > On 11/12/24 11:31 AM, Alexey Merzlyakov wrote: > > Hi, Jeff, > > > > Oops... It looks like I've made a reply to Alex at the same time when > > you sent an e-mail update. > > > > So, moving the testcase to the gcc.dg/torture directory. > > > > On Tue, Nov 12, 2024 at 10:56:13AM -0700, Jeff Law wrote: > >>> I think it might be better to put the test in gcc.c-torture/execute, > >>> then it will automatically get run with the various torture options > >>> and you should be able to drop both the dejagnu directives. > >> Actually, gcc.dg/torture is better. > >> > >> jeff > > > > Next version of patch is below: > > > > -- >8 -- > > > > gcc/ChangeLog: > > > > * simplify-rtx.cc (simplify_context::simplify_unary_operation_1): > > Fix subreg mode check during zero_extend(not) -> xor optimization. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/torture/pr117476.c: New test. From Zhendong Su. > I think we've still got a problem in this code. If we take this test > from Zdenek, we completely optimize away foo() during combine. > > unsigned __int128 g; > > void > foo () > { > g += __builtin_add_overflow_p (~g, 0, 0ul); > } > > int > main () > { > foo(); > if (!g) > __builtin_abort(); > } > $ x86_64-pc-linux-gnu-gcc -O1 testcase.c -Wall -W && ./a.out > Aborted > > I think the problem is the mask we generate when the optimization > applies -- and you're about to be exposed to a bit of GCC quirkyness. > > We've got a TImode ZERO_EXTEND of this object: > > (gdb) p debug_rtx (op) > (subreg:DI (not:TI (reg:TI 100 [ g.0_2 ])) 0) > > > So precisely the kind of case you're trying to handle. This is > equivalent to a (xor:TI (reg:TI 100) (const_int 0xffffffffffffffff)) > > So we end up calling simplify_gen_binary, and our constant is: > > (gdb) p debug_rtx (op1) > (const_int -1 [0xffffffffffffffff]) > > > Note constants in GCC do not have a mode. So that (const_int -1) when > interpreted in TImode is 0xffffffffffffffffffffffffffffffff, clearly not > what we want. We ultimately return: > > (gdb) finish > Run till exit from #0 simplify_context::simplify_unary_operation_1 > (this=0x7fffffffd928, code=ZERO_EXTEND, mode=E_TImode, > op=0x7ffff77e5b40) at /home/jlaw/test/gcc/gcc/simplify-rtx.cc:1861 > simplify_context::simplify_unary_operation (this=0x7fffffffd928, > code=ZERO_EXTEND, mode=E_TImode, op=0x7ffff77e5b40, op_mode=E_DImode) at > /home/jlaw/test/gcc/gcc/simplify-rtx.cc:893 > 893 return simplify_unary_operation_1 (code, mode, op); > Value returned is $20 = (rtx_def *) 0x7ffff77c1e20 > (gdb) p debug_rtx ($20) > (not:TI (reg:TI 100 [ g.0_2 ])) > > > Which is not the same as: > > (zero_extend:TI (subreg:DI (not:TI (reg:TI 100 [ g.0_2 ])) 0)) > > Anyway, that the beginning of that function into: > > (insn 8 6 15 2 (set (reg:TI 107) > (not:TI (reg:TI 100 [ g.0_2 ]))) "j.c":6:8 1029 > {*one_cmplti2_doubleword} > (nil)) > > (insn 15 8 35 2 (set (reg:TI 108) > (not:TI (reg:TI 100 [ g.0_2 ]))) "j.c":6:8 174 {zero_extendditi2} > (nil)) > > (insn 35 15 36 2 (set (reg:CCZ 17 flags) > (compare:CCZ (reg:TI 107) > (reg:TI 108))) "j.c":6:8 34 {*cmpti_doubleword} > (expr_list:REG_DEAD (reg:TI 108) > (expr_list:REG_DEAD (reg:TI 107) > (nil)))) > > Naturally things go downhill after that since the condition has a known > result, eventually leading to the whole function optimizing away and > returning random garbage. > > > So the natural question is how to fix. First I would recommend changing > the type of "mask" to a HOST_WIDE_INT. In an of itself that won't fix > this problem, but it's the right thing to do.
I see nonzero_bits returns an unsigned HOST_WIDE_INT; though I wonder if it should use wi::wide_int; maybe it less important for RTX than the gimple level to support wider integers. The places which uses nonzero_bits either use HWI_COMPUTABLE_MODE_P (or ` GET_MODE_PRECISION (int_mode) <= HOST_BITS_PER_WIDE_INT`) or checks against 1 or a had a CONST_INT (which is HWI only). Though I wonder if the ones which check nonzero_bits against 1 are ok. Thanks, Andrew > > I think what you want is to test HWI_COMPUTABLE_MODE_P. That will > verify whether or not the given scalar integer mode can fit into a > HOST_WIDE_INT, which should return false for TI or larger modes. > > Jeff > > > > > > >