On Tue, Jun 28, 2016 at 4:19 PM, Jakub Jelinek <[email protected]> wrote:
> On Tue, Jun 28, 2016 at 12:26:55PM +0200, Jakub Jelinek wrote:
>> On Tue, Jun 28, 2016 at 12:09:51PM +0200, Uros Bizjak wrote:
>> > > So, this patch instead changes ix86_expand_vector_move, so that
>> > > for SUBREGs it forces the SUBREG_REG into memory (or register if
>> > > that fails, though I don't have a testcase for when that would happen),
>> > > and just re-creates a SUBREG on the forced MEM (for whole size
>> > > SUBREGs that is in the end just using different mode for the MEM).
>> >
>> > There can be issue with paradoxical subregs, since subreg mode can be
>> > wider than original mode.
>>
>> For paradoxical subregs, the extra bits are undefined, whether it is SUBREG
>> of a constant, REG or MEM, isn't that the case? Though I guess with MEM
>> there is a risk that reading the undefined bits from mem will be beyond end
>> of the data segment. It would really surprise me if something created a
>> paradoxical SUBREG of a CONSTANT_P.
>> Anyway, I can just always force_reg in that case, like:
>>
>> 2016-06-28 Jakub Jelinek <[email protected]>
>>
>> PR middle-end/71626
>> * config/i386/i386.c (ix86_expand_vector_move): For SUBREG of
>> a constant, force its SUBREG_REG into memory or register instead
>> of whole op1.
>>
>> * gcc.c-torture/execute/pr71626-1.c: New test.
>> * gcc.c-torture/execute/pr71626-2.c: New test.
>
> This version passed bootstrap/regtest on x86_64-linux and i686-linux, is
> this one ok, or should I test the previous one?
Uh, just missed this mail... Previous patch is OK, no need for beeing
too paranoid here.
Uros.
>> --- gcc/config/i386/i386.c.jj 2016-06-27 14:50:51.000000000 +0200
>> +++ gcc/config/i386/i386.c 2016-06-28 10:51:18.444624190 +0200
>> @@ -19610,12 +19610,30 @@ ix86_expand_vector_move (machine_mode mo
>> of the register, once we have that information we may be able
>> to handle some of them more efficiently. */
>> if (can_create_pseudo_p ()
>> - && register_operand (op0, mode)
>> && (CONSTANT_P (op1)
>> || (SUBREG_P (op1)
>> && CONSTANT_P (SUBREG_REG (op1))))
>> - && !standard_sse_constant_p (op1, mode))
>> - op1 = validize_mem (force_const_mem (mode, op1));
>> + && ((register_operand (op0, mode)
>> + && !standard_sse_constant_p (op1, mode))
>> + /* ix86_expand_vector_move_misalign() does not like constants. */
>> + || (SSE_REG_MODE_P (mode)
>> + && MEM_P (op0)
>> + && MEM_ALIGN (op0) < align)))
>> + {
>> + if (SUBREG_P (op1))
>> + {
>> + machine_mode imode = GET_MODE (SUBREG_REG (op1));
>> + rtx r = (paradoxical_subreg_p (op1)
>> + ? NULL_RTX : force_const_mem (imode, SUBREG_REG (op1)));
>> + if (r)
>> + r = validize_mem (r);
>> + else
>> + r = force_reg (imode, SUBREG_REG (op1));
>> + op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1));
>> + }
>> + else
>> + op1 = validize_mem (force_const_mem (mode, op1));
>> + }
>>
>> /* We need to check memory alignment for SSE mode since attribute
>> can make operands unaligned. */
>> @@ -19626,13 +19643,8 @@ ix86_expand_vector_move (machine_mode mo
>> {
>> rtx tmp[2];
>>
>> - /* ix86_expand_vector_move_misalign() does not like constants ... */
>> - if (CONSTANT_P (op1)
>> - || (SUBREG_P (op1)
>> - && CONSTANT_P (SUBREG_REG (op1))))
>> - op1 = validize_mem (force_const_mem (mode, op1));
>> -
>> - /* ... nor both arguments in memory. */
>> + /* ix86_expand_vector_move_misalign() does not like both
>> + arguments in memory. */
>> if (!register_operand (op0, mode)
>> && !register_operand (op1, mode))
>> op1 = force_reg (mode, op1);
>
> Jakub