Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-15 Thread James Greenhalgh
On Mon, May 14, 2018 at 03:41:34PM -0500, Luis Machado wrote:
> On 05/11/2018 06:46 AM, Kyrill Tkachov wrote:
> > Hi Luis,
> > 
> > On 10/05/18 11:31, Luis Machado wrote:
> >>
> >> On 05/09/2018 10:44 AM, Kyrill Tkachov wrote:
> >>>
> >>> On 09/05/18 13:30, Luis Machado wrote:
>  Hi Kyrill,
> 
>  On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:
> > Hi Luis,
> >
> > On 07/05/18 15:28, Luis Machado wrote:
> >> Hi,
> >>
> >> On 02/08/2018 10:45 AM, Luis Machado wrote:
> >>> Hi Kyrill,
> >>>
> >>> On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:
>  Hi Luis,
> 
>  On 06/02/18 15:04, Luis Machado wrote:
> > Thanks for the feedback Kyrill. I've adjusted the v2 patch 
> > based on your
> > suggestions and re-tested the changes. Everything is still sane.
> 
>  Thanks! This looks pretty good to me.
> 
> > Since this is ARM-specific and fairly specific, i wonder if it 
> > would be
> > reasonable to consider it for inclusion at the current stage.
> 
>  It is true that the target maintainers can choose to take
>  such patches at any stage. However, any patch at this stage 
>  increases
>  the risk of regressions being introduced and these regressions
>  can come bite us in ways that are very hard to anticipate.
> 
>  Have a look at some of the bugs in bugzilla (or a quick scan of 
>  the gcc-bugs list)
>  for examples of the ways that things can go wrong with any of 
>  the myriad of GCC components
>  and the unexpected ways in which they can interact.
> 
>  For example, I am now working on what I initially thought was a 
>  one-liner fix for
>  PR 84164 but it has expanded into a 3-patch series with a midend 
>  component and
>  target-specific changes for 2 ports.
> 
>  These issues are very hard to catch during review and normal 
>  testing, and can sometimes take months of deep testing by
>  fuzzing and massive codebase rebuilds to expose, so the closer 
>  the commit is to a release
>  the higher the risk is that an obscure edge case will be 
>  unnoticed and unfixed in the release.
> 
>  So the priority at this stage is to minimise the risk of 
>  destabilising the codebase,
>  as opposed to taking in new features and desirable performance 
>  improvements (like your patch!)
> 
>  That is the rationale for delaying committing such changes until 
>  the start
>  of GCC 9 development. But again, this is up to the aarch64 
>  maintainers.
>  I'm sure the patch will be a perfectly fine and desirable commit 
>  for GCC 9.
>  This is just my perspective as maintainer of the arm port.
> >>>
> >>> Thanks. Your explanation makes the situation pretty clear and it 
> >>> sounds very reasonable. I'll put the patch on hold until 
> >>> development is open again.
> >>>
> >>> Regards,
> >>> Luis
> >>
> >> With GCC 9 development open, i take it this patch is worth 
> >> considering again?
> >>
> >
> > Yes, I believe the latest version is at:
> > https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?
> >
> > +(define_insn "*ashift_extv_bfiz"
> > +  [(set (match_operand:GPI 0 "register_operand" "=r")
> > +    (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 
> > "register_operand" "r")
> > +  (match_operand 2 
> > "aarch64_simd_shift_imm_offset_" "n")
> > +  (match_operand 3 
> > "aarch64_simd_shift_imm_" "n"))
> > + (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
> > +  ""
> > +  "sbfiz\\t%0, %1, %4, %2"
> > +  [(set_attr "type" "bfx")]
> > +)
> > +
> 
>  Indeed.
> 
> >
> > Can you give a bit more information about what are the values for 
> > operands 2,3 and 4 in your example testcases?
> 
>  For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 
>  0 and 38.
> 
> > I'm trying to understand why the value of operand 3 (the bit 
> > position the sign-extract starts from) doesn't get validated
> > in any way and doesn't play any role in the output...
> 
>  This may be an oversight. It seems operand 3 will always be 0 in 
>  this particular case i'm covering. It starts from 0, gets shifted x 
>  bits to the left and then y < x bits to the right). The operation is 
>  essentially an ashift of the bitfield followed by a sign-extension 
>  of the msb of the bitfield being extracted.
> 
>  Having a non-zero operand 3 from RTL means the shift amount won't 
>  translate directly to operand 3 of sbfiz (the position). Then we'd 

Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-15 Thread Kyrill Tkachov

Hi Luis,

On 14/05/18 21:41, Luis Machado wrote:

On 05/11/2018 06:46 AM, Kyrill Tkachov wrote:

Hi Luis,

On 10/05/18 11:31, Luis Machado wrote:


On 05/09/2018 10:44 AM, Kyrill Tkachov wrote:


On 09/05/18 13:30, Luis Machado wrote:

Hi Kyrill,

On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the gcc-bugs 
list)
for examples of the ways that things can go wrong with any of the myriad of GCC 
components
and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a one-liner fix 
for
PR 84164 but it has expanded into a 3-patch series with a midend component and
target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, and can 
sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the commit is to 
a release
the higher the risk is that an obscure edge case will be unnoticed and unfixed 
in the release.

So the priority at this stage is to minimise the risk of destabilising the 
codebase,
as opposed to taking in new features and desirable performance improvements 
(like your patch!)

That is the rationale for delaying committing such changes until the start
of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it sounds very 
reasonable. I'll put the patch on hold until development is open again.

Regards,
Luis


With GCC 9 development open, i take it this patch is worth considering again?



Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+  (match_operand 2 "aarch64_simd_shift_imm_offset_" 
"n")
+  (match_operand 3 "aarch64_simd_shift_imm_" "n"))
+ (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+


Indeed.



Can you give a bit more information about what are the values for operands 2,3 
and 4 in your example testcases?


For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 0 and 38.


I'm trying to understand why the value of operand 3 (the bit position the 
sign-extract starts from) doesn't get validated
in any way and doesn't play any role in the output...


This may be an oversight. It seems operand 3 will always be 0 in this particular 
case i'm covering. It starts from 0, gets shifted x bits to the left and then y 
< x bits to the right). The operation is essentially an ashift of the bitfield 
followed by a sign-extension of the msb of the bitfield being extracted.

Having a non-zero operand 3 from RTL means the shift amount won't translate 
directly to operand 3 of sbfiz (the position). Then we'd need to do a 
calculation where we take into account operand 3 from RTL.

I'm wondering when such a RTL pattern, with a non-zero operand 3, would be 
generated though.


I think it's best to enforce that operand 3 is a zero. Maybe just match 
const_int 0 here directly.
Better safe than sorry with these things.


Indeed. I've updated the original patch with that change now.

Bootstrapped and regtested on aarch64-linux.



I think you might also want to enforce that the sum of operands 2 and 3 doesn't 
exceed the width of the mode
(see other patterns in aarch64.md that generate bfx-style instructions for 
similar restrictions).


Updated now in the attached patch. Everything still sane with bootstrap and 
testsuite on aarch64-linux.



Thanks!
This looks good to me now.
You'll need a final approval from a maintainer.

Kyrill



Otherwise the patch looks good to me (it will still need approval from a 
maintainer).

For the future I think there's also an opportunity to match the SImode version 
of this pattern that zero-extends to DImode,
making use of the implicit 

Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-14 Thread Luis Machado

On 05/11/2018 06:46 AM, Kyrill Tkachov wrote:

Hi Luis,

On 10/05/18 11:31, Luis Machado wrote:


On 05/09/2018 10:44 AM, Kyrill Tkachov wrote:


On 09/05/18 13:30, Luis Machado wrote:

Hi Kyrill,

On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:
Thanks for the feedback Kyrill. I've adjusted the v2 patch 
based on your

suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.

Since this is ARM-specific and fairly specific, i wonder if it 
would be

reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage 
increases

the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of 
the gcc-bugs list)
for examples of the ways that things can go wrong with any of 
the myriad of GCC components

and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a 
one-liner fix for
PR 84164 but it has expanded into a 3-patch series with a midend 
component and

target-specific changes for 2 ports.

These issues are very hard to catch during review and normal 
testing, and can sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer 
the commit is to a release
the higher the risk is that an obscure edge case will be 
unnoticed and unfixed in the release.


So the priority at this stage is to minimise the risk of 
destabilising the codebase,
as opposed to taking in new features and desirable performance 
improvements (like your patch!)


That is the rationale for delaying committing such changes until 
the start
of GCC 9 development. But again, this is up to the aarch64 
maintainers.
I'm sure the patch will be a perfectly fine and desirable commit 
for GCC 9.

This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it 
sounds very reasonable. I'll put the patch on hold until 
development is open again.


Regards,
Luis


With GCC 9 development open, i take it this patch is worth 
considering again?




Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+    (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 
"register_operand" "r")
+  (match_operand 2 
"aarch64_simd_shift_imm_offset_" "n")
+  (match_operand 3 
"aarch64_simd_shift_imm_" "n"))

+ (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+


Indeed.



Can you give a bit more information about what are the values for 
operands 2,3 and 4 in your example testcases?


For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 
0 and 38.


I'm trying to understand why the value of operand 3 (the bit 
position the sign-extract starts from) doesn't get validated

in any way and doesn't play any role in the output...


This may be an oversight. It seems operand 3 will always be 0 in 
this particular case i'm covering. It starts from 0, gets shifted x 
bits to the left and then y < x bits to the right). The operation is 
essentially an ashift of the bitfield followed by a sign-extension 
of the msb of the bitfield being extracted.


Having a non-zero operand 3 from RTL means the shift amount won't 
translate directly to operand 3 of sbfiz (the position). Then we'd 
need to do a calculation where we take into account operand 3 from RTL.


I'm wondering when such a RTL pattern, with a non-zero operand 3, 
would be generated though.


I think it's best to enforce that operand 3 is a zero. Maybe just 
match const_int 0 here directly.

Better safe than sorry with these things.


Indeed. I've updated the original patch with that change now.

Bootstrapped and regtested on aarch64-linux.



I think you might also want to enforce that the sum of operands 2 and 3 
doesn't exceed the width of the mode
(see other patterns in aarch64.md that generate bfx-style instructions 
for similar restrictions).


Updated now in the attached patch. Everything still sane with bootstrap 
and testsuite on aarch64-linux.




Otherwise the patch looks good to me (it will still need approval from a 
maintainer).


For the future I think there's also an opportunity to match the SImode 
version of this pattern that zero-extends to DImode,
making use of the implicit zero-extension that writing to a W-dreg 
provides. But that would be a separate patch.


Indeed. I'll have a TODO for 

Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-11 Thread Kyrill Tkachov

Hi Luis,

On 10/05/18 11:31, Luis Machado wrote:


On 05/09/2018 10:44 AM, Kyrill Tkachov wrote:


On 09/05/18 13:30, Luis Machado wrote:

Hi Kyrill,

On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the gcc-bugs 
list)
for examples of the ways that things can go wrong with any of the myriad of GCC 
components
and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a one-liner fix 
for
PR 84164 but it has expanded into a 3-patch series with a midend component and
target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, and can 
sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the commit is to 
a release
the higher the risk is that an obscure edge case will be unnoticed and unfixed 
in the release.

So the priority at this stage is to minimise the risk of destabilising the 
codebase,
as opposed to taking in new features and desirable performance improvements 
(like your patch!)

That is the rationale for delaying committing such changes until the start
of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it sounds very 
reasonable. I'll put the patch on hold until development is open again.

Regards,
Luis


With GCC 9 development open, i take it this patch is worth considering again?



Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+  (match_operand 2 "aarch64_simd_shift_imm_offset_" 
"n")
+  (match_operand 3 "aarch64_simd_shift_imm_" "n"))
+ (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+


Indeed.



Can you give a bit more information about what are the values for operands 2,3 
and 4 in your example testcases?


For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 0 and 38.


I'm trying to understand why the value of operand 3 (the bit position the 
sign-extract starts from) doesn't get validated
in any way and doesn't play any role in the output...


This may be an oversight. It seems operand 3 will always be 0 in this particular 
case i'm covering. It starts from 0, gets shifted x bits to the left and then y 
< x bits to the right). The operation is essentially an ashift of the bitfield 
followed by a sign-extension of the msb of the bitfield being extracted.

Having a non-zero operand 3 from RTL means the shift amount won't translate 
directly to operand 3 of sbfiz (the position). Then we'd need to do a 
calculation where we take into account operand 3 from RTL.

I'm wondering when such a RTL pattern, with a non-zero operand 3, would be 
generated though.


I think it's best to enforce that operand 3 is a zero. Maybe just match 
const_int 0 here directly.
Better safe than sorry with these things.


Indeed. I've updated the original patch with that change now.

Bootstrapped and regtested on aarch64-linux.



I think you might also want to enforce that the sum of operands 2 and 3 doesn't 
exceed the width of the mode
(see other patterns in aarch64.md that generate bfx-style instructions for 
similar restrictions).

Otherwise the patch looks good to me (it will still need approval from a 
maintainer).

For the future I think there's also an opportunity to match the SImode version 
of this pattern that zero-extends to DImode,
making use of the implicit zero-extension that writing to a W-dreg provides. 
But that would be a separate patch.

Thanks, and sorry for the respins,
Kyrill


Thanks,
Luis




Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-10 Thread Luis Machado


On 05/09/2018 10:44 AM, Kyrill Tkachov wrote:


On 09/05/18 13:30, Luis Machado wrote:

Hi Kyrill,

On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:
Thanks for the feedback Kyrill. I've adjusted the v2 patch based 
on your

suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.

Since this is ARM-specific and fairly specific, i wonder if it 
would be

reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of 
the gcc-bugs list)
for examples of the ways that things can go wrong with any of the 
myriad of GCC components

and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a 
one-liner fix for
PR 84164 but it has expanded into a 3-patch series with a midend 
component and

target-specific changes for 2 ports.

These issues are very hard to catch during review and normal 
testing, and can sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the 
commit is to a release
the higher the risk is that an obscure edge case will be unnoticed 
and unfixed in the release.


So the priority at this stage is to minimise the risk of 
destabilising the codebase,
as opposed to taking in new features and desirable performance 
improvements (like your patch!)


That is the rationale for delaying committing such changes until 
the start
of GCC 9 development. But again, this is up to the aarch64 
maintainers.
I'm sure the patch will be a perfectly fine and desirable commit 
for GCC 9.

This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it 
sounds very reasonable. I'll put the patch on hold until 
development is open again.


Regards,
Luis


With GCC 9 development open, i take it this patch is worth 
considering again?




Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+    (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 
"register_operand" "r")
+  (match_operand 2 
"aarch64_simd_shift_imm_offset_" "n")
+  (match_operand 3 
"aarch64_simd_shift_imm_" "n"))

+ (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+


Indeed.



Can you give a bit more information about what are the values for 
operands 2,3 and 4 in your example testcases?


For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 0 
and 38.


I'm trying to understand why the value of operand 3 (the bit position 
the sign-extract starts from) doesn't get validated

in any way and doesn't play any role in the output...


This may be an oversight. It seems operand 3 will always be 0 in this 
particular case i'm covering. It starts from 0, gets shifted x bits to 
the left and then y < x bits to the right). The operation is 
essentially an ashift of the bitfield followed by a sign-extension of 
the msb of the bitfield being extracted.


Having a non-zero operand 3 from RTL means the shift amount won't 
translate directly to operand 3 of sbfiz (the position). Then we'd 
need to do a calculation where we take into account operand 3 from RTL.


I'm wondering when such a RTL pattern, with a non-zero operand 3, 
would be generated though.


I think it's best to enforce that operand 3 is a zero. Maybe just match 
const_int 0 here directly.

Better safe than sorry with these things.


Indeed. I've updated the original patch with that change now.

Bootstrapped and regtested on aarch64-linux.

Thanks,
Luis
2018-05-10  Luis Machado  

	gcc/
	* config/aarch64/aarch64.md (*ashift_extv_bfiz): New pattern.

	gcc/testsuite/
	* gcc.target/aarch64/lsl_asr_sbfiz.c: New test.
---
 gcc/config/aarch64/aarch64.md| 13 +
 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 32a0e1f..1f943e6 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4851,6 +4851,19 @@
   [(set_attr "type" "bfx")]
 )
 
+;; Match sbfiz pattern in a shift left + shift right operation.
+
+(define_insn 

Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-09 Thread Kyrill Tkachov


On 09/05/18 13:30, Luis Machado wrote:

Hi Kyrill,

On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the gcc-bugs 
list)
for examples of the ways that things can go wrong with any of the myriad of GCC 
components
and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a one-liner fix 
for
PR 84164 but it has expanded into a 3-patch series with a midend component and
target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, and can 
sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the commit is to 
a release
the higher the risk is that an obscure edge case will be unnoticed and unfixed 
in the release.

So the priority at this stage is to minimise the risk of destabilising the 
codebase,
as opposed to taking in new features and desirable performance improvements 
(like your patch!)

That is the rationale for delaying committing such changes until the start
of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it sounds very 
reasonable. I'll put the patch on hold until development is open again.

Regards,
Luis


With GCC 9 development open, i take it this patch is worth considering again?



Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+  (match_operand 2 "aarch64_simd_shift_imm_offset_" 
"n")
+  (match_operand 3 "aarch64_simd_shift_imm_" "n"))
+ (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+


Indeed.



Can you give a bit more information about what are the values for operands 2,3 
and 4 in your example testcases?


For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 0 and 38.


I'm trying to understand why the value of operand 3 (the bit position the 
sign-extract starts from) doesn't get validated
in any way and doesn't play any role in the output...


This may be an oversight. It seems operand 3 will always be 0 in this particular 
case i'm covering. It starts from 0, gets shifted x bits to the left and then y 
< x bits to the right). The operation is essentially an ashift of the bitfield 
followed by a sign-extension of the msb of the bitfield being extracted.

Having a non-zero operand 3 from RTL means the shift amount won't translate 
directly to operand 3 of sbfiz (the position). Then we'd need to do a 
calculation where we take into account operand 3 from RTL.

I'm wondering when such a RTL pattern, with a non-zero operand 3, would be 
generated though.


I think it's best to enforce that operand 3 is a zero. Maybe just match 
const_int 0 here directly.
Better safe than sorry with these things.

Thanks,
Kyrill


Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-09 Thread Luis Machado

Hi Kyrill,

On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:
Thanks for the feedback Kyrill. I've adjusted the v2 patch based on 
your

suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.

Since this is ARM-specific and fairly specific, i wonder if it 
would be

reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the 
gcc-bugs list)
for examples of the ways that things can go wrong with any of the 
myriad of GCC components

and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a 
one-liner fix for
PR 84164 but it has expanded into a 3-patch series with a midend 
component and

target-specific changes for 2 ports.

These issues are very hard to catch during review and normal 
testing, and can sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the 
commit is to a release
the higher the risk is that an obscure edge case will be unnoticed 
and unfixed in the release.


So the priority at this stage is to minimise the risk of 
destabilising the codebase,
as opposed to taking in new features and desirable performance 
improvements (like your patch!)


That is the rationale for delaying committing such changes until the 
start

of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for 
GCC 9.

This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it 
sounds very reasonable. I'll put the patch on hold until development 
is open again.


Regards,
Luis


With GCC 9 development open, i take it this patch is worth considering 
again?




Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+    (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 
"register_operand" "r")
+  (match_operand 2 
"aarch64_simd_shift_imm_offset_" "n")
+  (match_operand 3 "aarch64_simd_shift_imm_" 
"n"))

+ (match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+


Indeed.



Can you give a bit more information about what are the values for 
operands 2,3 and 4 in your example testcases?


For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 0 
and 38.


I'm trying to understand why the value of operand 3 (the bit position 
the sign-extract starts from) doesn't get validated

in any way and doesn't play any role in the output...


This may be an oversight. It seems operand 3 will always be 0 in this 
particular case i'm covering. It starts from 0, gets shifted x bits to 
the left and then y < x bits to the right). The operation is essentially 
an ashift of the bitfield followed by a sign-extension of the msb of the 
bitfield being extracted.


Having a non-zero operand 3 from RTL means the shift amount won't 
translate directly to operand 3 of sbfiz (the position). Then we'd need 
to do a calculation where we take into account operand 3 from RTL.


I'm wondering when such a RTL pattern, with a non-zero operand 3, would 
be generated though.


Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-08 Thread Kyrill Tkachov

Hi Luis,

On 07/05/18 15:28, Luis Machado wrote:

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the gcc-bugs 
list)
for examples of the ways that things can go wrong with any of the myriad of GCC 
components
and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a one-liner fix 
for
PR 84164 but it has expanded into a 3-patch series with a midend component and
target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, and can 
sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the commit is to 
a release
the higher the risk is that an obscure edge case will be unnoticed and unfixed 
in the release.

So the priority at this stage is to minimise the risk of destabilising the 
codebase,
as opposed to taking in new features and desirable performance improvements 
(like your patch!)

That is the rationale for delaying committing such changes until the start
of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it sounds very 
reasonable. I'll put the patch on hold until development is open again.

Regards,
Luis


With GCC 9 development open, i take it this patch is worth considering again?



Yes, I believe the latest version is at:
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?

+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+   (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" 
"r")
+ (match_operand 2 
"aarch64_simd_shift_imm_offset_" "n")
+ (match_operand 3 "aarch64_simd_shift_imm_" 
"n"))
+(match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+

Can you give a bit more information about what are the values for operands 2,3 
and 4 in your example testcases?
I'm trying to understand why the value of operand 3 (the bit position the 
sign-extract starts from) doesn't get validated
in any way and doesn't play any role in the output...

Thanks,
Kyrill


Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-05-07 Thread Luis Machado

Hi,

On 02/08/2018 10:45 AM, Luis Machado wrote:

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the 
gcc-bugs list)
for examples of the ways that things can go wrong with any of the 
myriad of GCC components

and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a 
one-liner fix for
PR 84164 but it has expanded into a 3-patch series with a midend 
component and

target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, 
and can sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the 
commit is to a release
the higher the risk is that an obscure edge case will be unnoticed and 
unfixed in the release.


So the priority at this stage is to minimise the risk of destabilising 
the codebase,
as opposed to taking in new features and desirable performance 
improvements (like your patch!)


That is the rationale for delaying committing such changes until the 
start

of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for 
GCC 9.

This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it sounds 
very reasonable. I'll put the patch on hold until development is open 
again.


Regards,
Luis


With GCC 9 development open, i take it this patch is worth considering 
again?


Thanks,
Luis


Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-02-08 Thread Luis Machado

Hi Kyrill,

On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the 
gcc-bugs list)
for examples of the ways that things can go wrong with any of the myriad 
of GCC components

and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a 
one-liner fix for
PR 84164 but it has expanded into a 3-patch series with a midend 
component and

target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, 
and can sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the 
commit is to a release
the higher the risk is that an obscure edge case will be unnoticed and 
unfixed in the release.


So the priority at this stage is to minimise the risk of destabilising 
the codebase,
as opposed to taking in new features and desirable performance 
improvements (like your patch!)


That is the rationale for delaying committing such changes until the start
of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
This is just my perspective as maintainer of the arm port.


Thanks. Your explanation makes the situation pretty clear and it sounds 
very reasonable. I'll put the patch on hold until development is open again.


Regards,
Luis


Re: [PATCH, v2] Recognize a missed usage of a sbfiz instruction

2018-02-08 Thread Kyrill Tkachov

Hi Luis,

On 06/02/18 15:04, Luis Machado wrote:

Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
suggestions and re-tested the changes. Everything is still sane.


Thanks! This looks pretty good to me.


Since this is ARM-specific and fairly specific, i wonder if it would be
reasonable to consider it for inclusion at the current stage.


It is true that the target maintainers can choose to take
such patches at any stage. However, any patch at this stage increases
the risk of regressions being introduced and these regressions
can come bite us in ways that are very hard to anticipate.

Have a look at some of the bugs in bugzilla (or a quick scan of the gcc-bugs 
list)
for examples of the ways that things can go wrong with any of the myriad of GCC 
components
and the unexpected ways in which they can interact.

For example, I am now working on what I initially thought was a one-liner fix 
for
PR 84164 but it has expanded into a 3-patch series with a midend component and
target-specific changes for 2 ports.

These issues are very hard to catch during review and normal testing, and can 
sometimes take months of deep testing by
fuzzing and massive codebase rebuilds to expose, so the closer the commit is to 
a release
the higher the risk is that an obscure edge case will be unnoticed and unfixed 
in the release.

So the priority at this stage is to minimise the risk of destabilising the 
codebase,
as opposed to taking in new features and desirable performance improvements 
(like your patch!)

That is the rationale for delaying committing such changes until the start
of GCC 9 development. But again, this is up to the aarch64 maintainers.
I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
This is just my perspective as maintainer of the arm port.

Thanks,
Kyrill


Regards,
Luis

Changes in v2:

- Added more restrictive predicates to operands 2, 3 and 4.
- Removed pattern conditional.
- Switched to long long for 64-bit signed integer for the testcase.

---

A customer reported the following missed opportunities to combine a couple
instructions into a sbfiz.

int sbfiz32 (int x)
{
   return x << 29 >> 10;
}

long long sbfiz64 (long long x)
{
   return x << 58 >> 20;
}

This gets converted to the following pattern:

(set (reg:SI 98)
 (ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
 (const_int 6 [0x6])))

Currently, gcc generates the following:

sbfiz32:
lsl x0, x0, 29
asr x0, x0, 10
ret

sbfiz64:
lsl x0, x0, 58
asr x0, x0, 20
ret

It could generate this instead:

sbfiz32:
sbfiz   w0, w0, 19, 3
ret

sbfiz64::
sbfiz   x0, x0, 38, 6
ret

The unsigned versions already generate ubfiz for the same code, so the lack of
a sbfiz pattern may have been an oversight.

This particular sbfiz pattern shows up in both CPU2006 (~ 80 hits) and
CPU2017 (~ 280 hits). It's not a lot, but seems beneficial in any case. No
significant performance differences, probably due to the small number of
occurrences.

2018-02-06  Luis Machado  

gcc/
* config/aarch64/aarch64.md (*ashift_extv_bfiz): New pattern.

2018-02-06  Luis Machado  

gcc/testsuite/
* gcc.target/aarch64/lsl_asr_sbfiz.c: New test.
---
  gcc/config/aarch64/aarch64.md| 13 +
  gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 
  2 files changed, 37 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a930..e8284ae 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4828,6 +4828,19 @@
[(set_attr "type" "bfx")]
  )
  
+;; Match sbfiz pattern in a shift left + shift right operation.

+
+(define_insn "*ashift_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+   (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" 
"r")
+ (match_operand 2 
"aarch64_simd_shift_imm_offset_" "n")
+ (match_operand 3 "aarch64_simd_shift_imm_" 
"n"))
+(match_operand 4 "aarch64_simd_shift_imm_" "n")))]
+  ""
+  "sbfiz\\t%0, %1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+
  ;; When the bit position and width of the equivalent extraction add up to 32
  ;; we can use a W-reg LSL instruction taking advantage of the implicit
  ;; zero-extension of the X-reg.
diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c 
b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
new file mode 100644
index 000..106433d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
+   instruction.  */
+
+/*