On Mon, Dec 15, 2025 at 12:55 PM Konstantinos Eleftheriou <
[email protected]> wrote:

>
> On Mon, Dec 15, 2025 at 5:54 AM Jeff Law <[email protected]> wrote:
>
>>
>>
>> On 12/9/25 10:01 AM, Konstantinos Eleftheriou wrote:
>> > The call to `gen_lowpart` in `store_bit_field_1` might copy the
>> destination
>> > register into a new one, which may lead to wrong code generation, as
>> the bit
>> > insertions update the new register instead of updating `str_rtx`.
>> >
>> > This patch copies back the new destination register into `str_rtx` when
>> needed.
>> >
>> > gcc/ChangeLog:
>> >
>> >          * expmed.cc (store_bit_field_1): Copy back the new destination
>> >       register into `str_rtx` when needed.
>> Ugh.  gen_lowpart would not be a routine I'd want to see in this call
>> chain.  Largely because it's hooked and thus the underlying
>> implementation changes depending on where in the pipeline we are there's
>> also various helpers that tend to get used.  Sigh.
>>
>> Regardless something doesn't make sense with your patch.  The existing
>> code passes the return value from gen_lowpart (potentially the new
>> pseudo) as the destination for store_integral_bit_field.
>
>
> Based on the RTL that I provided in the previous thread, the code seems to
> rely
> on the fact that `gen_lowpart` will generate a subreg to convert the
> vector mode
> into an integral one, preparing the call to 'store_integral_bit_field'.
> That's why I chose
> to copy the value back after the call to `store_integral_bit_field`.
>
>>
>>
> It seems like the problem would be inside store_integral_bit_field or
>> its children.  store_integral_bit_field doesn't have the problematic
>> semantics (it returns a bool, not an RTX, so by definition it doesn't
>> have those problem semantics).
>>
>>
>> I'm thinking I really should throw this under the debugger to understand
>> the dataflow here.  How can I trigger the underlying issue?
>>
>
> I'm using 'gcc.target/aarch64/vldN_lane_1.c', compiled with `-O3
> -fno-inline` (as in the testcase).
> The RTL comes from the `test_vld4q_lane_u16` function.
>
> Sorry, I made a mistake here. The sequence is from the
`test_vld4_lane_bf16` function in
'gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_1.c',
compiled with `-march=armv8.2-a+fp16 -march=armv8.2-a+bf16 -O3 -g
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions`.

Konstantinos

> Thanks,
> Konstantinos
>
>>
>> jeff
>>
>

Reply via email to