Hi, On 07/27/16 23:31, Jeff Law wrote: > So you're stumbling into another really interesting area. >
Absolutely, I am just too curious what's going on here ;-) > I can hazard a guess that the reason we create the paradoxical SUBREG > rather than a zero or sign extension is because various optimizers know > that the upper bits in a paradoxical are don't care bits and may make > transformations with that knowledge. > > Once you turn that into a zero/sign extend, then the rest of the > optimizers must preserve the zero/sign extension property -- they have > no way of knowing the bits were don't cares. > > So it's not necessarily clear that your change results in generally > better code or not. > > More importantly, it really feels like you're papering over a real bug > elsewhere. AFAICT the use of a paradoxical subreg here is safe. So the > real bug ought to be downstream of this point. Several folks have > pointed at reload, which would certainly be a point ripe for a > paradoxical subreg problem. I have looked again at the test case of PR 71779 and discovered something that I wanted to discuss here. I have tried to figure out what made combine change this: (insn 1048 1047 1049 101 (set (zero_extract:DI (reg/v:DI 191 [ obj1D.17943 ]) (const_int 32 [0x20]) (const_int 0 [0])) (reg/f:DI 481)) isl_input.c:2496 691 {*insv_regdi} (nil)) (insn 1049 1048 1050 101 (set (zero_extract:DI (reg/v:DI 191 [ obj1D.17943 ]) (const_int 32 [0x20]) (const_int 32 [0x20])) (reg/v/f:DI 145 [ SR.327D.17957 ])) isl_input.c:2496 691 {*insv_regdi} (expr_list:REG_DEAD (reg/v/f:DI 145 [ SR.327D.17957 ]) (nil))) (insn 1050 1049 1051 101 (set (reg:DI 1 x1) (reg/v:DI 191 [ obj1D.17943 ])) isl_input.c:2496 50 {*movdi_aarch64} (expr_list:REG_DEAD (reg/v:DI 191 [ obj1D.17943 ]) (nil))) into that (which is wrong because reg 481 has undefined bits in insn 1050): (insn 1048 1047 1049 101 (set (zero_extract:DI (reg/v:DI 191 [ obj1D.17943 ]) (const_int 32 [0x20]) (const_int 0 [0])) (reg/f:DI 481)) isl_input.c:2496 691 {*insv_regdi} (nil)) (insn 1049 1048 1050 101 (set (zero_extract:DI (reg/v:DI 191 [ obj1D.17943 ]) (const_int 32 [0x20]) (const_int 32 [0x20])) (reg/v/f:DI 145 [ SR.327D.17957 ])) isl_input.c:2496 691 {*insv_regdi} (expr_list:REG_DEAD (reg/v/f:DI 145 [ SR.327D.17957 ]) (nil))) (insn 1050 1049 1051 101 (set (reg:DI 1 x1) (ior:DI (ashift:DI (reg/v/f:DI 145 [ SR.327D.17957 ]) (const_int 32 [0x20])) (reg/f:DI 481))) isl_input.c:2496 50 {*movdi_aarch64} (nil)) The interesting fact is that combine did that while it was only considering 1048, 1049 -> 1050, and not the instruction with the paradoxical subreg: (insn 1047 1044 1048 101 (set (reg/f:DI 481) (subreg:DI (reg/f:SI 545) 0)) isl_input.c:2496 50 {*movdi_aarch64} (nil)) I found by debugging that expand_field_assignment was trying to build: (and:DI (reg/f:DI 481) (const_int -4294967296 [0xffffffff00000000])) with this statement: masked = simplify_gen_binary (ASHIFT, compute_mode, simplify_gen_binary ( AND, compute_mode, gen_lowpart (compute_mode, SET_SRC (x)), mask), pos); but the result is just (reg/f:DI 481) ! I debugged and found the first wrong result in rtlanal.c (nonzero_bits1) where the following if-statement tells us that the upper 32 bits of reg 481 must be zero: switch (code) { case REG: #if defined(POINTERS_EXTEND_UNSIGNED) /* If pointers extend unsigned and this is a pointer in Pmode, say that all the bits above ptr_mode are known to be zero. */ /* As we do not know which address space the pointer is referring to, we can do this only if the target does not support different pointer or address modes depending on the address space. */ if (target_default_pointer_address_modes_p () && POINTERS_EXTEND_UNSIGNED && GET_MODE (x) == Pmode && REG_POINTER (x) && !targetm.have_ptr_extend ()) nonzero &= GET_MODE_MASK (ptr_mode); #endif Pmode = DImode, ptr_mode=SImode, and REG_POINTER(x) is true. Yes, the value is actually a pointer! Which means, that it is not safe to extend a pointer from SI to DI with anything but a zero-extend. And if I remove this statement the wrong code is still not fixed. So there must be more places where the similar assumptions are made. However, PR 70903 is not about pointers, and uses a mode where Pmode=ptr_mode=DImode, so there must be a different as hard to track down reason why this did not work out right. So the question is, if the paradoxical subreg is really such a good idea, when it triggers so many different bugs all over the place? Bernd.