[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

Wilco changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Wilco ---
Fixed in GCC 8.
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

Wilco changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #9 from Wilco ---
I presume this can be closed now?
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

--- Comment #8 from Steve Ellcey ---
Author: sje
Date: Tue Nov 21 00:18:14 2017
New Revision: 254977

URL: https://gcc.gnu.org/viewcvs?rev=254977&root=gcc&view=rev
Log:
2017-11-20  Steve Ellcey

        PR target/81356
        * gfortran.dg/pr45636.f90 (aarch64*-*-*): Remove from xfail list.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gfortran.dg/pr45636.f90
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

--- Comment #7 from Steve Ellcey ---
Author: sje
Date: Fri Nov 17 22:44:32 2017
New Revision: 254901

URL: https://gcc.gnu.org/viewcvs?rev=254901&root=gcc&view=rev
Log:
2017-11-17  Steve Ellcey

        PR target/81356
        * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p):
        Remove.
        (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/aarch64/aarch64.c
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

--- Comment #6 from Qing Zhao ---
I just found that a similar fix was submitted to gcc-patches 2 weeks ago:

https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg173652.html
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

--- Comment #5 from Qing Zhao ---
The following code in config/aarch64/aarch64.c causes this behavior:

14143 static bool
14144 aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
14145                                         unsigned int align,
14146                                         enum by_pieces_operation op,
14147                                         bool speed_p)
14148 {
14149   /* STORE_BY_PIECES can be used when copying a constant string, but
14150      in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
14151      For now we always fail this and let the move_by_pieces code copy
14152      the string from read-only memory.  */
14153   if (op == STORE_BY_PIECES)
14154     return false;

After deleting lines 14153 and 14154 and building the test case with the
resulting compiler, I got:

f:
        mov     w1, 26952
        movk    w1, 0x21, lsl 16
        str     w1, [x0]
        ret

This looks like exactly what we want.
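
The two immediates in that output can be checked by hand. The sketch below is only an illustration and not GCC code: the helper name pack_le32 is hypothetical, and it assumes a little-endian target. It packs the four bytes of "Hi!" (including the terminating NUL) into one 32-bit word, which should match both the x86_64 immediate 2189640 and the aarch64 mov/movk pair (26952 in the low half, 0x21 in the high half).

```c
#include <stdint.h>

/* Hypothetical helper: pack the first four bytes of s into a 32-bit
   word the way a little-endian target would load them from memory.  */
uint32_t pack_le32(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    return (uint32_t)p[0]            /* 'H'  = 0x48, bits 0-7   */
         | (uint32_t)p[1] << 8       /* 'i'  = 0x69, bits 8-15  */
         | (uint32_t)p[2] << 16      /* '!'  = 0x21, bits 16-23 */
         | (uint32_t)p[3] << 24;     /* '\0' = 0x00, bits 24-31 */
}
```

For "Hi!" the low 16 bits are 0x6948 (26952) and the high 16 bits are 0x0021, so a single 32-bit store of this value writes the whole string, terminator included.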
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

Qing Zhao changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |qing.zhao at oracle dot com

--- Comment #4 from Qing Zhao ---
The issue is confirmed on aarch64.

In addition to x86, I tested the same test case on SPARC and saw no such
issue:

***SPARC (SPARC uses multiple stores when compiled with -O):

f:
        mov     72, %g1
        stb     %g1, [%o0]
        mov     105, %g1
        stb     %g1, [%o0+1]
        mov     33, %g1
        stb     %g1, [%o0+2]
        jmp     %o7+8
         stb    %g0, [%o0+3]
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356 --- Comment #3 from Richard Biener --- A GIMPLE level optimization to *a = '\0'; would still be ok but I agree with Andrew.
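
For the empty-string case in the bug title, the transformation mentioned here amounts to replacing the call with a one-byte store. A minimal sketch of the equivalence follows. The function names are hypothetical, and this shows only the source-level effect, not how a GIMPLE pass would implement it.

```c
#include <string.h>

/* Copy an empty string with the library call.  Only the terminating
   NUL is copied, so exactly one byte of the destination changes.  */
void copy_empty_strcpy(char *a)
{
    strcpy(a, "");
}

/* The same effect written as the single byte store *a = '\0'.  */
void copy_empty_store(char *a)
{
    *a = '\0';
}
```

Both functions leave the destination buffer in the same state, so the store form is a valid replacement whenever the source is known to be the empty string.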
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |target

--- Comment #2 from Andrew Pinski ---
This is already done during builtin expansion; just aarch64 backend has the
wrong choices.
Take:
void f(char *a)
{
  __builtin_strcpy (a, "Hi!");
}

On x86_64 (even with 4.4) produces:
        movl    $2189640, (%rdi)
        ret

While on aarch64 produces:
f:
        adrp    x1, .LC0
        add     x1, x1, :lo12:.LC0
        ldr     w1, [x1]
        str     w1, [x0]
        ret

Why not just (for little-endian):
        mov     w1, #0x21 lsl #16
        movz    w1, #0x6948
        str     w1, [x0]
        ret

STORE_BY_PIECES (MOVE_BY_PIECES and MOVE_RATIO are related) controls this.
Except it is disabled on aarch64:
/* MOVE_RATIO dictates when we will use the move_by_pieces infrastructure.
   move_by_pieces will continually copy the largest safe chunks.  So a
   7-byte copy is a 4-byte + 2-byte + byte copy.  This proves inefficient
   for both size and speed of copy, so we will instead use the "movmem"
   standard name to implement the copy.  This logic does not apply when
   targeting -mstrict-align, so keep a sensible default in that case.  */
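
The "largest safe chunks" rule in the quoted comment can be pictured with a small sketch. This is not GCC's move_by_pieces implementation. The function split_by_pieces and its parameters are hypothetical, and it assumes max_chunk is a power of two and that alignment never forces a smaller chunk.

```c
#include <stddef.h>

/* Hypothetical helper: split a copy of `size` bytes into power-of-two
   chunks, largest first, none larger than `max_chunk`.  Chunk sizes
   are written to `out` (at most `out_cap` entries); the number of
   chunks written is returned.  */
size_t split_by_pieces(size_t size, size_t max_chunk,
                       size_t *out, size_t out_cap)
{
    size_t n = 0;
    size_t chunk = max_chunk;

    while (size > 0 && chunk > 0) {
        if (chunk <= size) {
            if (n == out_cap)
                break;          /* Output buffer full: stop early.  */
            out[n++] = chunk;   /* Emit one chunk of this width.  */
            size -= chunk;
        } else {
            chunk /= 2;         /* Too wide: try the next width down.  */
        }
    }
    return n;
}
```

With size 7 and max_chunk 8 this yields the 4 + 2 + 1 sequence that the comment describes, which is three load/store pairs where a single overlapping or wider access might have done.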