Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On 11/10/2019 00:08, Ramana Radhakrishnan wrote:
> On Thu, Oct 10, 2019 at 7:06 PM Richard Sandiford wrote:
>> Wilco Dijkstra writes:
>>> ping
>>>
>>> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
>>> bitfields by their declared type, which results in better code generation
>>> on practically any target.
>>
>> The name is confusing, but the documentation looks accurate to me:
>>
>>      Define this macro as a C expression which is nonzero if accessing less
>>      than a word of memory (i.e.@: a @code{char} or a @code{short}) is no
>>      faster than accessing a word of memory, i.e., if such accesses
>>      require more than one instruction or if there is no difference in cost
>>      between byte and (aligned) word loads.
>>
>>      When this macro is not defined, the compiler will access a field by
>>      finding the smallest containing object; when it is defined, a fullword
>>      load will be used if alignment permits.  Unless byte accesses are
>>      faster than word accesses, using word accesses is preferable since it
>>      may eliminate subsequent memory accesses if subsequent accesses occur
>>      to other fields in the same word of the structure, but to different
>>      bytes.
>>
>>> I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
>>> from GCC as it's confusing and useless.
>>
>> I disagree.  Some targets can optimise single-bit operations when the
>> container is a byte, for example.
>>
>>> OK for commit until we get rid of it?
>>>
>>> ChangeLog:
>>> 2017-11-17  Wilco Dijkstra
>>>
>>> gcc/
>>> * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
>>> --
>>> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
>>> index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
>>> --- a/gcc/config/aarch64/aarch64.h
>>> +++ b/gcc/config/aarch64/aarch64.h
>>> @@ -769,14 +769,9 @@ typedef struct
>>>     if given data not on the nominal alignment.  */
>>>  #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
>>>
>>> -/* Define this macro to be non-zero if accessing less than a word of
>>> -   memory is no faster than accessing a word of memory, i.e., if such
>>> -   accesses require more than one instruction or if there is no
>>> -   difference in cost.
>>> -   Although there's no difference in instruction count or cycles,
>>> -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
>>> -   if we don't have to, for power-saving reasons.  */
>>> -#define SLOW_BYTE_ACCESS 0
>>> +/* Contrary to all documentation, this enables wide bitfield accesses,
>>> +   which results in better code when accessing multiple bitfields.  */
>>> +#define SLOW_BYTE_ACCESS 1
>>>
>>>  #define NO_FUNCTION_CSE 1
>>
>> I agree this makes sense from a performance point of view, and I think
>> the existing comment is admitting that AArch64 has the properties that
>> would normally cause us to set SLOW_BYTE_ACCESS to 1.  But the comment
>> is claiming that there's a power-saving benefit to leaving it off.
>>
>> It seems like a weak argument though.  Bitfields are used when several
>> values are packed into the same integer, so there's a high likelihood
>> we'll need the whole integer anyway.  Avoiding the redundancies described
>> in the documentation should if anything help with power usage.
>>
>> Maybe the main concern was using a 64-bit access when a 32-bit one
>> would do, since 32-bit bitfield containers are the most common.  But the:
>>
>>   && GET_MODE_ALIGNMENT (mode) <= align
>>
>> condition in get_best_mode should avoid that unless the 64-bit
>> access is naturally aligned.  (See the big comment above for the
>> pros and cons of this.)
>>
>> So I think we should change the macro value unless anyone can back up the
>> power-saving claim.  Let's wait a week (more) to see if anyone objects.
>
> IIRC, that power saving comment comes from the original port, and probably
> from when the port was first written, which is probably more than 10 years
> now.

Yes.  Don't forget that at that time the INSV and EXTV expanders only
operated on a single mode, which on AArch64 was 64 bits.  IIRC, at the time
this was written the compiler would widen everything to that size if there
was a bitfield op, and that led to worse code.  So it's probably not as
relevant now as it once was.

R.

> regards
> Ramana
>
>> The comment change isn't OK though.  Please keep the first paragraph
>> and just reword the second to say that's why we set the value to 1.
>>
>> Thanks,
>> Richard
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On Thu, Oct 10, 2019 at 8:06 PM Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
> > ping
> >
> > Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> > bitfields by their declared type, which results in better code generation
> > on practically any target.
>
> The name is confusing, but the documentation looks accurate to me:
>
>      Define this macro as a C expression which is nonzero if accessing less
>      than a word of memory (i.e.@: a @code{char} or a @code{short}) is no
>      faster than accessing a word of memory, i.e., if such accesses
>      require more than one instruction or if there is no difference in cost
>      between byte and (aligned) word loads.
>
>      When this macro is not defined, the compiler will access a field by
>      finding the smallest containing object; when it is defined, a fullword
>      load will be used if alignment permits.  Unless byte accesses are
>      faster than word accesses, using word accesses is preferable since it
>      may eliminate subsequent memory accesses if subsequent accesses occur
>      to other fields in the same word of the structure, but to different
>      bytes.
>
> > I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> > from GCC as it's confusing and useless.
>
> I disagree.  Some targets can optimise single-bit operations when the
> container is a byte, for example.

There's also less chance of store-to-load forwarding issues when _not_
forcing word accesses, because I think we do not use the same macro to
change code generation for writes (where a larger access requires a
read-modify-write cycle).  But this also means that a decision with no
information on context is going to be flawed.  Still, generally I'd do
smaller reads if they are not significantly more expensive on the target,
but larger writes (with the same constraint), just for the STLF issue.
Unfortunately the macro docs don't say whether it is applicable to reads,
writes, or both...

> > OK for commit until we get rid of it?
> >
> > ChangeLog:
> > 2017-11-17  Wilco Dijkstra
> >
> > gcc/
> > * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
> > --
> > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> > index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
> > --- a/gcc/config/aarch64/aarch64.h
> > +++ b/gcc/config/aarch64/aarch64.h
> > @@ -769,14 +769,9 @@ typedef struct
> >     if given data not on the nominal alignment.  */
> >  #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
> >
> > -/* Define this macro to be non-zero if accessing less than a word of
> > -   memory is no faster than accessing a word of memory, i.e., if such
> > -   accesses require more than one instruction or if there is no
> > -   difference in cost.
> > -   Although there's no difference in instruction count or cycles,
> > -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> > -   if we don't have to, for power-saving reasons.  */
> > -#define SLOW_BYTE_ACCESS 0
> > +/* Contrary to all documentation, this enables wide bitfield accesses,
> > +   which results in better code when accessing multiple bitfields.  */
> > +#define SLOW_BYTE_ACCESS 1
> >
> >  #define NO_FUNCTION_CSE 1
>
> I agree this makes sense from a performance point of view, and I think
> the existing comment is admitting that AArch64 has the properties that
> would normally cause us to set SLOW_BYTE_ACCESS to 1.  But the comment
> is claiming that there's a power-saving benefit to leaving it off.
>
> It seems like a weak argument though.  Bitfields are used when several
> values are packed into the same integer, so there's a high likelihood
> we'll need the whole integer anyway.  Avoiding the redundancies described
> in the documentation should if anything help with power usage.
>
> Maybe the main concern was using a 64-bit access when a 32-bit one
> would do, since 32-bit bitfield containers are the most common.  But the:
>
>   && GET_MODE_ALIGNMENT (mode) <= align
>
> condition in get_best_mode should avoid that unless the 64-bit
> access is naturally aligned.  (See the big comment above for the
> pros and cons of this.)
>
> So I think we should change the macro value unless anyone can back up the
> power-saving claim.  Let's wait a week (more) to see if anyone objects.
>
> The comment change isn't OK though.  Please keep the first paragraph
> and just reword the second to say that's why we set the value to 1.
>
> Thanks,
> Richard
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On Thu, Oct 10, 2019 at 7:06 PM Richard Sandiford wrote:
>
> Wilco Dijkstra writes:
> > ping
> >
> > Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> > bitfields by their declared type, which results in better code generation
> > on practically any target.
>
> The name is confusing, but the documentation looks accurate to me:
>
>      Define this macro as a C expression which is nonzero if accessing less
>      than a word of memory (i.e.@: a @code{char} or a @code{short}) is no
>      faster than accessing a word of memory, i.e., if such accesses
>      require more than one instruction or if there is no difference in cost
>      between byte and (aligned) word loads.
>
>      When this macro is not defined, the compiler will access a field by
>      finding the smallest containing object; when it is defined, a fullword
>      load will be used if alignment permits.  Unless byte accesses are
>      faster than word accesses, using word accesses is preferable since it
>      may eliminate subsequent memory accesses if subsequent accesses occur
>      to other fields in the same word of the structure, but to different
>      bytes.
>
> > I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> > from GCC as it's confusing and useless.
>
> I disagree.  Some targets can optimise single-bit operations when the
> container is a byte, for example.
>
> > OK for commit until we get rid of it?
> >
> > ChangeLog:
> > 2017-11-17  Wilco Dijkstra
> >
> > gcc/
> > * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
> > --
> > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> > index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
> > --- a/gcc/config/aarch64/aarch64.h
> > +++ b/gcc/config/aarch64/aarch64.h
> > @@ -769,14 +769,9 @@ typedef struct
> >     if given data not on the nominal alignment.  */
> >  #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
> >
> > -/* Define this macro to be non-zero if accessing less than a word of
> > -   memory is no faster than accessing a word of memory, i.e., if such
> > -   accesses require more than one instruction or if there is no
> > -   difference in cost.
> > -   Although there's no difference in instruction count or cycles,
> > -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> > -   if we don't have to, for power-saving reasons.  */
> > -#define SLOW_BYTE_ACCESS 0
> > +/* Contrary to all documentation, this enables wide bitfield accesses,
> > +   which results in better code when accessing multiple bitfields.  */
> > +#define SLOW_BYTE_ACCESS 1
> >
> >  #define NO_FUNCTION_CSE 1
>
> I agree this makes sense from a performance point of view, and I think
> the existing comment is admitting that AArch64 has the properties that
> would normally cause us to set SLOW_BYTE_ACCESS to 1.  But the comment
> is claiming that there's a power-saving benefit to leaving it off.
>
> It seems like a weak argument though.  Bitfields are used when several
> values are packed into the same integer, so there's a high likelihood
> we'll need the whole integer anyway.  Avoiding the redundancies described
> in the documentation should if anything help with power usage.
>
> Maybe the main concern was using a 64-bit access when a 32-bit one
> would do, since 32-bit bitfield containers are the most common.  But the:
>
>   && GET_MODE_ALIGNMENT (mode) <= align
>
> condition in get_best_mode should avoid that unless the 64-bit
> access is naturally aligned.  (See the big comment above for the
> pros and cons of this.)
>
> So I think we should change the macro value unless anyone can back up the
> power-saving claim.  Let's wait a week (more) to see if anyone objects.

IIRC, that power saving comment comes from the original port, and probably
from when the port was first written, which is probably more than 10 years
now.

regards
Ramana

> The comment change isn't OK though.  Please keep the first paragraph
> and just reword the second to say that's why we set the value to 1.
>
> Thanks,
> Richard
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
Wilco Dijkstra writes:
> ping
>
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation
> on practically any target.

The name is confusing, but the documentation looks accurate to me:

     Define this macro as a C expression which is nonzero if accessing less
     than a word of memory (i.e.@: a @code{char} or a @code{short}) is no
     faster than accessing a word of memory, i.e., if such accesses
     require more than one instruction or if there is no difference in cost
     between byte and (aligned) word loads.

     When this macro is not defined, the compiler will access a field by
     finding the smallest containing object; when it is defined, a fullword
     load will be used if alignment permits.  Unless byte accesses are
     faster than word accesses, using word accesses is preferable since it
     may eliminate subsequent memory accesses if subsequent accesses occur
     to other fields in the same word of the structure, but to different
     bytes.

> I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> from GCC as it's confusing and useless.

I disagree.  Some targets can optimise single-bit operations when the
container is a byte, for example.

> OK for commit until we get rid of it?
>
> ChangeLog:
> 2017-11-17  Wilco Dijkstra
>
> gcc/
> * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
> --
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -769,14 +769,9 @@ typedef struct
>     if given data not on the nominal alignment.  */
>  #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
>
> -/* Define this macro to be non-zero if accessing less than a word of
> -   memory is no faster than accessing a word of memory, i.e., if such
> -   accesses require more than one instruction or if there is no
> -   difference in cost.
> -   Although there's no difference in instruction count or cycles,
> -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> -   if we don't have to, for power-saving reasons.  */
> -#define SLOW_BYTE_ACCESS 0
> +/* Contrary to all documentation, this enables wide bitfield accesses,
> +   which results in better code when accessing multiple bitfields.  */
> +#define SLOW_BYTE_ACCESS 1
>
>  #define NO_FUNCTION_CSE 1

I agree this makes sense from a performance point of view, and I think
the existing comment is admitting that AArch64 has the properties that
would normally cause us to set SLOW_BYTE_ACCESS to 1.  But the comment
is claiming that there's a power-saving benefit to leaving it off.

It seems like a weak argument though.  Bitfields are used when several
values are packed into the same integer, so there's a high likelihood
we'll need the whole integer anyway.  Avoiding the redundancies described
in the documentation should if anything help with power usage.

Maybe the main concern was using a 64-bit access when a 32-bit one
would do, since 32-bit bitfield containers are the most common.  But the:

  && GET_MODE_ALIGNMENT (mode) <= align

condition in get_best_mode should avoid that unless the 64-bit
access is naturally aligned.  (See the big comment above for the
pros and cons of this.)

So I think we should change the macro value unless anyone can back up the
power-saving claim.  Let's wait a week (more) to see if anyone objects.

The comment change isn't OK though.  Please keep the first paragraph
and just reword the second to say that's why we set the value to 1.

Thanks,
Richard
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
ping

Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation
on practically any target.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra

gcc/
* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
   if given data not on the nominal alignment.  */
 #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
 
-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS 0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS 1
 
 #define NO_FUNCTION_CSE 1
[PATCH][AArch64] Set SLOW_BYTE_ACCESS
Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation
on practically any target.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra

gcc/
* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
   if given data not on the nominal alignment.  */
 #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
 
-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS 0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS 1
 
 #define NO_FUNCTION_CSE 1
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On 16/05/18 14:56, Wilco Dijkstra wrote:
> Richard Earnshaw wrote:
>>>> Which doesn't appear to have been approved.  Did you follow up with Jeff?
>>>
>>> I'll get back to that one at some point - it'll take some time to agree
>>> on a way forward with the callback.
>>>
>>> Wilco
>>
>> So it seems to me that this should then be queued until that is resolved.
>
> Why?  The patch as is doesn't at all depend on the resolution of how to
> improve the callback.  If we stopped all patches until GCC is 100% perfect
> we'd never make any progress.
>
> Wilco

Because we don't want to build up technical debt for things that should
and can be fixed properly.

R.
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
Richard Earnshaw wrote:
>>> Which doesn't appear to have been approved.  Did you follow up with Jeff?
>>
>> I'll get back to that one at some point - it'll take some time to agree
>> on a way forward with the callback.
>>
>> Wilco
>
> So it seems to me that this should then be queued until that is resolved.

Why?  The patch as is doesn't at all depend on the resolution of how to
improve the callback.  If we stopped all patches until GCC is 100% perfect
we'd never make any progress.

Wilco
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On 15/05/18 17:58, Wilco Dijkstra wrote:
> Hi,
>
>> Which doesn't appear to have been approved.  Did you follow up with Jeff?
>
> I'll get back to that one at some point - it'll take some time to agree
> on a way forward with the callback.
>
> Wilco

So it seems to me that this should then be queued until that is resolved.

R.
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
Hi,

> Which doesn't appear to have been approved.  Did you follow up with Jeff?

I'll get back to that one at some point - it'll take some time to agree
on a way forward with the callback.

Wilco
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On 15/05/18 17:01, Wilco Dijkstra wrote:
> Hi,
>
>> I see nothing about you addressing James' comment from 17th November...
>
> I addressed that in a separate patch, see
> https://patchwork.ozlabs.org/patch/839126/
>
> Wilco

Which doesn't appear to have been approved.  Did you follow up with Jeff?

R.
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
Hi,

> I see nothing about you addressing James' comment from 17th November...

I addressed that in a separate patch, see
https://patchwork.ozlabs.org/patch/839126/

Wilco
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On 15/05/18 14:24, Wilco Dijkstra wrote:
>
> ping
>

I see nothing about you addressing James' comment from 17th November...

> From: Wilco Dijkstra
> Sent: 17 November 2017 15:21
> To: GCC Patches
> Cc: nd
> Subject: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
>
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation
> on practically any target.
>
> I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> from GCC as it's confusing and useless.
>
> OK for commit until we get rid of it?
>
> ChangeLog:
> 2017-11-17  Wilco Dijkstra
>
> gcc/
> * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
> --
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -769,14 +769,9 @@ typedef struct
>     if given data not on the nominal alignment.  */
>  #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
>
> -/* Define this macro to be non-zero if accessing less than a word of
> -   memory is no faster than accessing a word of memory, i.e., if such
> -   accesses require more than one instruction or if there is no
> -   difference in cost.
> -   Although there's no difference in instruction count or cycles,
> -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> -   if we don't have to, for power-saving reasons.  */
> -#define SLOW_BYTE_ACCESS 0
> +/* Contrary to all documentation, this enables wide bitfield accesses,
> +   which results in better code when accessing multiple bitfields.  */
> +#define SLOW_BYTE_ACCESS 1
>
>  #define NO_FUNCTION_CSE 1
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
ping

From: Wilco Dijkstra
Sent: 17 November 2017 15:21
To: GCC Patches
Cc: nd
Subject: [PATCH][AArch64] Set SLOW_BYTE_ACCESS

Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation
on practically any target.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra

gcc/
* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
   if given data not on the nominal alignment.  */
 #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
 
-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS 0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS 1
 
 #define NO_FUNCTION_CSE 1
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On Fri, Nov 17, 2017 at 03:21:31PM +, Wilco Dijkstra wrote:
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation
> on practically any target.
>
> I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> from GCC as it's confusing and useless.

> -/* Define this macro to be non-zero if accessing less than a word of
> -   memory is no faster than accessing a word of memory, i.e., if such
> -   accesses require more than one instruction or if there is no
> -   difference in cost.
> -   Although there's no difference in instruction count or cycles,
> -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> -   if we don't have to, for power-saving reasons.  */
> -#define SLOW_BYTE_ACCESS 0
> +/* Contrary to all documentation, this enables wide bitfield accesses,
> +   which results in better code when accessing multiple bitfields.  */
> +#define SLOW_BYTE_ACCESS 1

Why don't we fix the documentation and the name of this macro so we don't
need the comment?  I'd rather we fix the macro to behave in a sensible way
than record the opposite of what we mean just because the documentation is
flawed.

Thanks,
James
[PATCH][AArch64] Set SLOW_BYTE_ACCESS
Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation
on practically any target.

I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.

OK for commit until we get rid of it?

ChangeLog:
2017-11-17  Wilco Dijkstra

gcc/
* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -769,14 +769,9 @@ typedef struct
   if given data not on the nominal alignment.  */
 #define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
 
-/* Define this macro to be non-zero if accessing less than a word of
-   memory is no faster than accessing a word of memory, i.e., if such
-   accesses require more than one instruction or if there is no
-   difference in cost.
-   Although there's no difference in instruction count or cycles,
-   in AArch64 we don't want to expand to a sub-word to a 64-bit access
-   if we don't have to, for power-saving reasons.  */
-#define SLOW_BYTE_ACCESS 0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS 1
 
 #define NO_FUNCTION_CSE 1