[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-19 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

Richard Earnshaw  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |6.5

--- Comment #9 from Richard Earnshaw  ---
Fixed on trunk gcc-6 & gcc-7

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-19 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

--- Comment #7 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu Oct 19 13:14:55 2017
New Revision: 253891

URL: https://gcc.gnu.org/viewcvs?rev=253891=gcc=rev
Log:
[ARM] PR 82445 - suppress 32-bit aligned ldrd/strd peepholing with
-mno-unaligned-access

Peephole patterns exist in the arm backend to spot load/store
operations to adjacent memory operations in order to convert them into
ldrd/strd instructions.  However, when we have strict alignment
enforced, then we can only do this if the accesses are known to be
64-bit aligned; this is unlikely to be the case for most loads.  The
patch adds some alignment checking to the code that validates the
addresses for use in the peephole patterns.  This should also fix
incorrect generation of ldrd/strd with unaligned accesses that could
previously have occurred on ARMv5e where all such operations must be
64-bit aligned.

I've added some new tests as well.  In doing so I discovered that the
ldrd/strd peephole tests could never fail since they would match the
source file name in the scanned assembly as well as any instructions
of the intended type.  I've fixed those by tightening the scan results
slightly.

gcc:

* config/arm/arm.c (align_ok_ldrd_strd): New function.
(mem_ok_for_ldrd_strd): New parameter align.  Extract the alignment of the
mem into it.
(gen_operands_ldrd_strd): Validate the alignment of the accesses.

testsuite:

* gcc.target/arm/peep-ldrd-1.c: Tighten test scan pattern.
* gcc.target/arm/peep-strd-1.c: Likewise.
* gcc.target/arm/peep-ldrd-2.c: New test.
* gcc.target/arm/peep-strd-2.c: New test.


Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/peep-ldrd-2.c
branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/peep-strd-2.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/arm/arm.c
branches/gcc-7-branch/gcc/testsuite/ChangeLog
branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c
branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/peep-strd-1.c

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-19 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

--- Comment #8 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu Oct 19 13:16:42 2017
New Revision: 253892

URL: https://gcc.gnu.org/viewcvs?rev=253892=gcc=rev
Log:
[ARM] PR 82445 - suppress 32-bit aligned ldrd/strd peepholing with
-mno-unaligned-access

Peephole patterns exist in the arm backend to spot load/store
operations to adjacent memory operations in order to convert them into
ldrd/strd instructions.  However, when we have strict alignment
enforced, then we can only do this if the accesses are known to be
64-bit aligned; this is unlikely to be the case for most loads.  The
patch adds some alignment checking to the code that validates the
addresses for use in the peephole patterns.  This should also fix
incorrect generation of ldrd/strd with unaligned accesses that could
previously have occurred on ARMv5e where all such operations must be
64-bit aligned.

I've added some new tests as well.  In doing so I discovered that the
ldrd/strd peephole tests could never fail since they would match the
source file name in the scanned assembly as well as any instructions
of the intended type.  I've fixed those by tightening the scan results
slightly.

gcc:

* config/arm/arm.c (align_ok_ldrd_strd): New function.
(mem_ok_for_ldrd_strd): New parameter align.  Extract the alignment of the
mem into it.
(gen_operands_ldrd_strd): Validate the alignment of the accesses.

testsuite:

* gcc.target/arm/peep-ldrd-1.c: Tighten test scan pattern.
* gcc.target/arm/peep-strd-1.c: Likewise.
* gcc.target/arm/peep-ldrd-2.c: New test.
* gcc.target/arm/peep-strd-2.c: New test.


Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/peep-ldrd-2.c
branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/peep-strd-2.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/arm/arm.c
branches/gcc-6-branch/gcc/testsuite/ChangeLog
branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c
branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/peep-strd-1.c

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-19 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

--- Comment #6 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu Oct 19 13:10:42 2017
New Revision: 253890

URL: https://gcc.gnu.org/viewcvs?rev=253890=gcc=rev
Log:
[ARM] PR 82445 - suppress 32-bit aligned ldrd/strd peepholing with
-mno-unaligned-access

Peephole patterns exist in the arm backend to spot load/store
operations to adjacent memory operations in order to convert them into
ldrd/strd instructions.  However, when we have strict alignment
enforced, then we can only do this if the accesses are known to be
64-bit aligned; this is unlikely to be the case for most loads.  The
patch adds some alignment checking to the code that validates the
addresses for use in the peephole patterns.  This should also fix
incorrect generation of ldrd/strd with unaligned accesses that could
previously have occurred on ARMv5e where all such operations must be
64-bit aligned.

I've added some new tests as well.  In doing so I discovered that the
ldrd/strd peephole tests could never fail since they would match the
source file name in the scanned assembly as well as any instructions
of the intended type.  I've fixed those by tightening the scan results
slightly.

gcc:

* config/arm/arm.c (align_ok_ldrd_strd): New function.
(mem_ok_for_ldrd_strd): New parameter align.  Extract the alignment of the
mem into it.
(gen_operands_ldrd_strd): Validate the alignment of the accesses.

testsuite:

* gcc.target/arm/peep-ldrd-1.c: Tighten test scan pattern.
* gcc.target/arm/peep-strd-1.c: Likewise.
* gcc.target/arm/peep-ldrd-2.c: New test.
* gcc.target/arm/peep-strd-2.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/arm/peep-ldrd-2.c
trunk/gcc/testsuite/gcc.target/arm/peep-strd-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c
trunk/gcc/testsuite/gcc.target/arm/peep-strd-1.c

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-18 Thread petrcvekcz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

Petr Cvek  changed:

   What|Removed |Added

 CC||petrcvekcz at gmail dot com

--- Comment #5 from Petr Cvek  ---
Get the same bug with: gcc version 7.1.0 (crosstool-NG
crosstool-ng-1.23.0-88-gfae66ae3) 

Bug free version: gcc version 6.3.0 (crosstool-NG
crosstool-ng-1.22.0-452-gc7b1e295) 

Testing program: just main() with "helloworld" calling the example code
(strd.c) compiled as object file.

march/mtune combinations which are OK:

-march=armv5te -mtune=xscale
-march=armv5te -mtune=iwmmxt
-march=iwmmxt
-march=armv5t (actually less opcodes than variants above)
-march=armv5

march/mtune combinations which generate an unaligned access in STRD (even with
-mno-unaligned-access):

-march=armv5te (tested on a real HW, generating alignment exceptions in
/proc/cpu/alignment)
-march=armv5tej (not existing on gcc 6.3.0)
-march=armv5te -mtune=arm1020e
-march=armv5te -mtune=arm926ej-s

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rearnsha at gcc dot 
gnu.org

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

--- Comment #4 from Richard Earnshaw  ---
looks like gen_operands_ldrd_strd should be checking for this and failing if
the alignment is not suitable for the target architecture.

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-06 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-10-06
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
To answer myself, yes, -fno-store-merging fixes it.  RTL expansion is ok:

;; MEM[(void *)unknown_glyph.0_1 + 4B] = 1048584;

(insn 7 6 8 (set (reg:SI 112)
(const_int 1048584 [0x18])) "strd.c":50 -1
 (nil))

(insn 8 7 0 (set (mem:SI (plus:SI (reg/f:SI 110 [ unknown_glyph.0_1 ])
(const_int 4 [0x4])) [0 MEM[(void *)unknown_glyph.0_1 + 4B]+0
S4 A32])
(reg:SI 112)) "strd.c":50 -1
 (nil))

;; MEM[(void *)unknown_glyph.0_1 + 8B] = 4294770688;

(insn 9 8 10 (set (reg:SI 113)
(const_int -196608 [0xfffd])) "strd.c":51 -1
 (nil))

(insn 10 9 0 (set (mem:SI (plus:SI (reg/f:SI 110 [ unknown_glyph.0_1 ])
(const_int 8 [0x8])) [0 MEM[(void *)unknown_glyph.0_1 + 8B]+0
S4 A32])
(reg:SI 113)) "strd.c":51 -1
 (nil))

;; MEM[(void *)unknown_glyph.0_1 + 12B] = 8;

(insn 11 10 12 (set (reg:SI 115)
(const_int 8 [0x8])) "strd.c":53 -1
 (nil))

(insn 12 11 13 (set (reg:HI 114)
(subreg:HI (reg:SI 115) 0)) "strd.c":53 -1
 (expr_list:REG_EQUAL (const_int 8 [0x8])
(nil)))

(insn 13 12 0 (set (mem:HI (plus:SI (reg/f:SI 110 [ unknown_glyph.0_1 ])
(const_int 12 [0xc])) [0 MEM[(void *)unknown_glyph.0_1 + 12B]+0
S2 A32])
(reg:HI 114)) "strd.c":53 -1
 (nil))


The DI move is introduced by peephole2:

Splitting with gen_peephole2_12

which is ldrdstrd.md:

(define_peephole2 ; strd
  [(set (match_operand:SI 2 "memory_operand" "")
(match_operand:SI 0 "arm_general_register_operand" ""))
   (set (match_operand:SI 3 "memory_operand" "")
(match_operand:SI 1 "arm_general_register_operand" ""))]
  "TARGET_LDRD"
  [(const_int 0)]
{
  if (!gen_operands_ldrd_strd (operands, false, false, false))
FAIL;
  else if (TARGET_ARM)
  {
/* In ARM state, the destination registers of LDRD/STRD must be
   consecutive. We emit DImode access.  */
operands[0] = gen_rtx_REG (DImode, REGNO (operands[0]));
operands[2] = adjust_address (operands[2], DImode, 0);
/* Emit [(set (match_dup 2) (match_dup 0))]  */
emit_insn (gen_rtx_SET (operands[2], operands[0]));
DONE;
  }
  else if (TARGET_THUMB2)
  {
/* Emit the pattern:
   [(parallel [(set (match_dup 2) (match_dup 0))
   (set (match_dup 3) (match_dup 1))])]  */
rtx t1 = gen_rtx_SET (operands[2], operands[0]);
rtx t2 = gen_rtx_SET (operands[3], operands[1]);
emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, t1, t2)));
DONE;
  }
})

but this doesn't at all consider alignment of the MEMs it merges.

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-06 Thread agraf at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

--- Comment #2 from Alexander Graf  ---
(In reply to Richard Biener from comment #1)
> Does -fno-store-merging fix it?

Yes, but it generates worse code than -march=armv5 (which does not support
STRD) does:

-march=armv6 -fno-store-merging:
>  :
>0: e59f3028ldr r3, [pc, #40]   ; 30 
>4: e3a02008mov r2, #8
>8: e3a0c010mov ip, #16
>c: e3a0mov r0, #0
>   10: e3e01002mvn r1, #2
>   14: e5933000ldr r3, [r3]
>   18: e1c3c0b6strhip, [r3, #6]
>   1c: e1c300b8strhr0, [r3, #8]
>   20: e1c310bastrhr1, [r3, #10]
>   24: e1c320b4strhr2, [r3, #4]
>   28: e1c320bcstrhr2, [r3, #12]
>   2c: e12fff1ebx  lr
>   30: .word   0x

-march=armv5:
>  :
>0: e59f3018ldr r3, [pc, #24]   ; 20 
>4: e3a02008mov r2, #8
>8: e59f0014ldr r0, [pc, #20]   ; 24 
>c: e59f1014ldr r1, [pc, #20]   ; 28 
>   10: e5933000ldr r3, [r3]
>   14: e9830003stmib   r3, {r0, r1}
>   18: e1c320bcstrhr2, [r3, #12]
>   1c: e12fff1ebx  lr
>   20: .word   0x
>   24: 0018.word   0x0018
>   28: fffd.word   0xfffd

[Bug target/82445] ARM target generates unaligned STRD instruction

2017-10-06 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82445

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target|ARM |arm

--- Comment #1 from Richard Biener  ---
Does -fno-store-merging fix it?

store-merging currently does

  unsigned HOST_WIDE_INT align = split_store->align;
  tree offset_type = get_alias_type_for_stmts (split_store->orig_stmts);
  location_t loc = get_location_for_stmts (split_store->orig_stmts);

  tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED);
  int_type = build_aligned_type (int_type, align);
  tree dest = fold_build2 (MEM_REF, int_type, addr,
   build_int_cst (offset_type, try_pos));

and expects the target to cope with this.  But IIRC RTL expansion only
ever "honors" this if either it has a movmisalign optab for the mode
or it is SLOW_UNALIGNED_ACCESS (ok, arm seems to be SLOW_UNALIGNED_ACCESS).
So this means we go the bitfield store way (in the end this means store-merging
lacks proper costing for strict-align targets).  That may end up using
all sorts of patterns, like insv and friends.  Not sure what happens in the
end.