When I previously added the use of unaligned vsx stores to the inline expansion
of memset, I didn't do a good job of managing boundary conditions. The intention
was to use unaligned vsx only if the block being cleared was at least 32 bytes.
Because the test was made inside the loop against the remaining byte count, what
it actually did was prevent the use of unaligned vsx once fewer than 32 bytes
remained, i.e. for the tail of every block being cleared. This change does the
test once, up front, so it is not affected by the decrement of bytes.
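
To illustrate the difference, here is a minimal stand-alone sketch. It is not
the rs6000 code: the function names and the efficient_unaligned_vsx constant
are hypothetical stand-ins, the block is assumed to be unaligned, and the
"bytes >= 16" guard on each 16-byte store is an assumption of the model. It
only counts how many 16-byte vector stores each placement of the size test
would allow.

```c
#include <stdbool.h>

/* Hypothetical stand-in for TARGET_EFFICIENT_UNALIGNED_VSX.  */
static const bool efficient_unaligned_vsx = true;

/* Size test re-evaluated on every iteration against the remaining
   byte count: vector stores stop once fewer than 32 bytes remain.  */
static int
vector_stores_test_in_loop (long bytes)
{
  int stores = 0;
  while (bytes >= 16)
    {
      if (!(bytes >= 32 && efficient_unaligned_vsx))
        break;  /* The tail would fall back to narrower stores.  */
      stores++;
      bytes -= 16;
    }
  return stores;
}

/* Size test evaluated once, before the loop, against the original
   block size: the decision is not affected by the decrement.  */
static int
vector_stores_test_hoisted (long bytes)
{
  bool unaligned_vsx_ok = (bytes >= 32 && efficient_unaligned_vsx);
  int stores = 0;
  /* Assumption of this sketch: a 16-byte store needs 16 bytes left.  */
  while (bytes >= 16 && unaligned_vsx_ok)
    {
      stores++;
      bytes -= 16;
    }
  return stores;
}
```

In this model a 64-byte block gets 3 vector stores with the test inside the
loop and 4 with the test up front. A 24-byte block gets none either way.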

OK for trunk if regstrap passes?

Thanks!
   Aaron



2018-11-26  Aaron Sawdey  <acsaw...@linux.ibm.com>

        * config/rs6000/rs6000-string.c (expand_block_clear): Change how
        we determine if unaligned vsx is ok.


Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c   (revision 266219)
+++ gcc/config/rs6000/rs6000-string.c   (working copy)
@@ -85,14 +85,14 @@
   if (! optimize_size && bytes > 8 * clear_step)
     return 0;

+  bool unaligned_vsx_ok = (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX);
+
   for (offset = 0; bytes > 0; offset += clear_bytes, bytes -= clear_bytes)
     {
       machine_mode mode = BLKmode;
       rtx dest;

-      if (TARGET_ALTIVEC
-         && ((bytes >= 16 && align >= 128)
-             || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))
+      if (TARGET_ALTIVEC && ((bytes >= 16 && align >= 128) || unaligned_vsx_ok))
        {
          clear_bytes = 16;
          mode = V4SImode;

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
