Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS

Richard Earnshaw (lists) Wed, 16 Oct 2019 09:16:42 -0700

On 11/10/2019 00:08, Ramana Radhakrishnan wrote:

On Thu, Oct 10, 2019 at 7:06 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:


Wilco Dijkstra <wilco.dijks...@arm.com> writes:

ping

Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better codegeneration on 
practically
any target.


The name is confusing, but the documentation looks accurate to me:

     Define this macro as a C expression which is nonzero if accessing less
     than a word of memory (i.e.@: a @code{char} or a @code{short}) is no
     faster than accessing a word of memory, i.e., if such access
     require more than one instruction or if there is no difference in cost
     between byte and (aligned) word loads.

     When this macro is not defined, the compiler will access a field by
     finding the smallest containing object; when it is defined, a fullword
     load will be used if alignment permits.  Unless bytes accesses are
     faster than word accesses, using word accesses is preferable since it
     may eliminate subsequent memory access if subsequent accesses occur to
     other fields in the same word of the structure, but to different bytes.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.


I disagree.  Some targets can optimise single-bit operations when the
container is a byte, for example.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra  <wdijk...@arm.com>

     gcc/
         * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
     if given data not on the nominal alignment.  */
  #define STRICT_ALIGNMENT                TARGET_STRICT_ALIGN

-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS               0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS               1

  #define NO_FUNCTION_CSE 1


I agree this makes sense from a performance point of view, and I think
the existing comment is admitting that AArch64 has the properties that
would normally cause us to set SLOW_BYTE_ACCESS to 1.  But the comment
is claiming that there's a power-saving benefit to leaving it off.

It seems like a weak argument though.  Bitfields are used when several
values are packed into the same integer, so there's a high likelihood
we'll need the whole integer anyway.  Avoiding the redundancies described
in the documention should if anything help with power usage.

Maybe the main concern was using a 64-bit access when a 32-bit one
would do, since 32-bit bitfield containers are the most common.  But the:

          && GET_MODE_ALIGNMENT (mode) <= align

condition in get_best_mode should avoid that unless the 64-bit
access is naturally aligned.  (See the big comment above for the
pros and cons of this.)

So I think we should change the macro value unless anyone can back up the
power-saving claim.  Let's wait a week (more) to see if anyone objects.


IIRC, that power saving comment comes from the original port and
probably from when
the port was first written which is probably more than 10 years now.

Yes. Don't forget that at that time the INSV and EXTV expanders onlyoperated on a single mode, which on AArch64 was 64 bits. IIRC, at thetime this was written the compiler would widen everything to that sizeif there was a bitfield op and that led to worse code.


So it's probably not as relevant now as it once was.

R.

regards
Ramana

Ramana


The comment change isn't OK though.  Please keep the first paragraph
and just reword the second to say that's why we set the value to 1.

Thanks,
Richard

Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS

Reply via email to