On Mon, Dec 15, 2025 at 12:55 PM Konstantinos Eleftheriou <[email protected]> wrote:
>
> On Mon, Dec 15, 2025 at 5:54 AM Jeff Law <[email protected]> wrote:
>
>>
>> On 12/9/25 10:01 AM, Konstantinos Eleftheriou wrote:
>> > The call to `gen_lowpart` in `store_bit_field_1` might copy the destination
>> > register into a new one, which may lead to wrong code generation, as the bit
>> > insertions update the new register instead of updating `str_rtx`.
>> >
>> > This patch copies back the new destination register into `str_rtx` when needed.
>> >
>> > gcc/ChangeLog:
>> >
>> >         * expmed.cc (store_bit_field_1): Copy back the new destination
>> >         register into `str_rtx` when needed.
>> Ugh. gen_lowpart would not be a routine I'd want to see in this call
>> chain. Largely because it's hooked and thus the underlying
>> implementation changes depending on where in the pipeline we are there's
>> also various helpers that tend to get used. Sigh.
>>
>> Regardless something doesn't make sense with your patch. The existing
>> code passes the return value from gen_lowpart (potentially the new
>> pseudo) as the destination for store_integral_bit_field.
>
> Based on the RTL that I provided in the previous thread, the code seems to rely
> on the fact that `gen_lowpart` will generate a subreg to convert the vector mode
> into an integral one, preparing the call to 'store_integral_bit_field'.
> That's why I chose to copy the value back after the call to
> `store_integral_bit_field`.
>
>>
>> It seems like the problem would be inside store_integral_bit_field or
>> its children. store_integral_bit_field doesn't have the problematic
>> semantics (it returns a bool, not an RTX, so by definition it doesn't
>> have those problem semantics).
>>
>>
>> I'm thinking I really should throw this under the debugger to understand
>> the dataflow here. How can I trigger the underlying issue?
>>
>
> I'm using 'gcc.target/aarch64/vldN_lane_1.c', compiled with `-O3 -fno-inline`
> (as in the testcase).
> The RTL comes from the `test_vld4q_lane_u16` function.
>

Sorry, I made a mistake here. The sequence is from the `test_vld4_lane_bf16`
function in 'gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_1.c',
compiled with
`-march=armv8.2-a+fp16 -march=armv8.2-a+bf16 -O3 -g -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions`.

Konstantinos

> Thanks,
> Konstantinos
>
>>
>> jeff
>>
>
