Re: [PATCH 4/4] rs6000: build constant via li/lis;rldic
On 2023-06-13 17:18, Jiufu Guo via Gcc-patches wrote:
> Hi David,
>
> Thanks for your valuable comments!
>
> David Edelsohn writes:
> ...
>> Do you have any measurement of how expensive it is to test all of these
>> additional methods to generate a constant?  How much does this affect the
>> compile time?
>
> Yeap, thanks for this very good question!
> This patch mostly uses bitwise operations and if-conditions, so it is not
> expected to be expensive.
>
> Testcases were checked.  For example:
> A case with ~1000 constants, most of which hit this feature.
> With this feature, the compile time is slightly faster:
> 0m1.985s (without patch) vs. 0m1.874s (with patch)
> (note: rs6000_emit_set_long_const does not occur in hot perf functions,
> so this slight time saving is probably not directly caused by this feature.)
>
> A case with ~1000 constants (most are not hit by this feature):
> 0m2.493s (without patch) vs. 0m2.558s (with patch).

Typo, this should be:
0m2.493s (with patch) vs. 0m2.558s (without patch).

It is also faster with the patch :)

BR,
Jeff (Jiufu Guo)

> For runtime, actually, with the patch there seems to be no visible
> improvement in SPEC2017.  Still, I feel this patch is doing the right
> thing: it uses fewer instructions to build the constant.
>
> BR,
> Jeff (Jiufu Guo)
>
>> Thanks, David
Re: [PATCH 4/4] rs6000: build constant via li/lis;rldic
Hi David,

Thanks for your valuable comments!

David Edelsohn writes:

>
> On Wed, Jun 7, 2023 at 9:56 PM Jiufu Guo wrote:
>
> Hi,
>
> This patch checks whether a constant can be built by "li;rldic".
> We only need to take care of "negative li"; other forms do not need to be
> checked.
> For example, "negative lis" is just a "negative li" with an additional shift.
>
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
>
> BR,
> Jeff (Jiufu)
>
> gcc/ChangeLog:
>
>         * config/rs6000/rs6000.cc (can_be_built_by_li_and_rldic): New function.
>         (rs6000_emit_set_long_const): Call can_be_built_by_li_and_rldic.
>
> This is okay.
>
> Do you have any measurement of how expensive it is to test all of these
> additional methods to generate a constant?  How much does this affect the
> compile time?

Yeap, thanks for this very good question!
This patch mostly uses bitwise operations and if-conditions, so it is not
expected to be expensive.

Testcases were checked.  For example:
A case with ~1000 constants, most of which hit this feature.
With this feature, the compile time is slightly faster:
0m1.985s (without patch) vs. 0m1.874s (with patch)
(note: rs6000_emit_set_long_const does not occur in hot perf functions,
so this slight time saving is probably not directly caused by this feature.)

A case with ~1000 constants (most are not hit by this feature):
0m2.493s (without patch) vs. 0m2.558s (with patch).

For runtime, actually, with the patch there seems to be no visible
improvement in SPEC2017.  Still, I feel this patch is doing the right
thing: it uses fewer instructions to build the constant.

BR,
Jeff (Jiufu Guo)

>
> Thanks, David
>
>
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/const-build.c: Add more tests.
> ---
>  gcc/config/rs6000/rs6000.cc                   | 61 ++-
>  .../gcc.target/powerpc/const-build.c          | 28 +
>  2 files changed, 88 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 2a3fa733b45..cd04b6b5c82 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10387,6 +10387,64 @@ can_be_built_by_li_lis_and_rldicr (HOST_WIDE_INT c, int *shift,
>    return false;
>  }
>
> +/* Check if value C can be built by 2 instructions: one is 'li', another is
> +   rldic.
> +
> +   If so, *SHIFT is set to the 'shift' operand of rldic; and *MASK is set
> +   to the mask value about the 'mb' operand of rldic; and return true.
> +   Return false otherwise.  */
> +
> +static bool
> +can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int *shift, HOST_WIDE_INT *mask)
> +{
> +  /* There are 49 successive ones in the negative value of 'li'.  */
> +  int ones = 49;
> +
> +  /* 1..1xx1..1: negative value of li --> 0..01..1xx0..0:
> +     right bits are shifted as 0's, and left 1's(and x's) are cleaned.  */
> +  int tz = ctz_hwi (c);
> +  int lz = clz_hwi (c);
> +  int middle_ones = clz_hwi (~(c << lz));
> +  if (tz + lz + middle_ones >= ones)
> +    {
> +      *mask = ((1LL << (HOST_BITS_PER_WIDE_INT - tz - lz)) - 1LL) << tz;
> +      *shift = tz;
> +      return true;
> +    }
> +
> +  /* 1..1xx1..1 --> 1..1xx0..01..1: some 1's(following x's) are cleaned.  */
> +  int leading_ones = clz_hwi (~c);
> +  int tailing_ones = ctz_hwi (~c);
> +  int middle_zeros = ctz_hwi (c >> tailing_ones);
> +  if (leading_ones + tailing_ones + middle_zeros >= ones)
> +    {
> +      *mask = ~(((1ULL << middle_zeros) - 1ULL) << tailing_ones);
> +      *shift = tailing_ones + middle_zeros;
> +      return true;
> +    }
> +
> +  /* xx1..1xx: --> xx0..01..1xx: some 1's(following x's) are cleaned.  */
> +  /* Get the position for the first bit of successive 1.
> +     The 24th bit would be in successive 0 or 1.  */
> +  HOST_WIDE_INT low_mask = (1LL << 24) - 1LL;
> +  int pos_first_1 = ((c & (low_mask + 1)) == 0)
> +                      ? clz_hwi (c & low_mask)
> +                      : HOST_BITS_PER_WIDE_INT - ctz_hwi (~(c | low_mask));
> +  middle_ones = clz_hwi (~c << pos_first_1);
> +  middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_first_1));
> +  if (pos_first_1 < HOST_BITS_PER_WIDE_INT
> +      && middle_ones + middle_zeros < HOST_BITS_PER_WIDE_INT
> +      && middle_ones + middle_zeros >= ones)
> +    {
> +      *mask = ~(((1ULL << middle_zeros) - 1LL)
> +                << (HOST_BITS_PER_WIDE_INT - pos_first_1));
> +      *shift = HOST_BITS_PER_WIDE_INT - pos_first_1 + middle_zeros;
> +      return true;
> +    }
> +
> +  return false;
> +}
> +
>  /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
>     Output insns to set DEST equal to the constant C as a series of
>     lis, ori and shl instructions.  */
> @@ -10435,7 +10493,8 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>      }
>    else if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
>
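For readers following the *SHIFT and *MASK description in the quoted comment, here is a minimal C sketch of what rldic computes: rotate left by SH, then AND with a mask running from bit MB to bit 63 - SH in IBM numbering (bit 0 is the most significant). The helper names rotl64, ppc_mask64 and rldic are hypothetical and exist only for this illustration; they are not part of the patch, and the constants used below are illustrative values rather than testsuite entries.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by n bits, 0 <= n < 64.  */
uint64_t rotl64(uint64_t x, unsigned n)
{
    assert(n < 64);
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}

/* Mask with ones from IBM bit mb through IBM bit me inclusive, where
   bit 0 is the most significant bit.  If mb > me the mask wraps around.
   (Hypothetical helper for this sketch.)  */
uint64_t ppc_mask64(unsigned mb, unsigned me)
{
    assert(mb < 64 && me < 64);
    uint64_t from_mb = ~0ULL >> mb;          /* ones in IBM bits mb..63 */
    uint64_t upto_me = ~0ULL << (63 - me);   /* ones in IBM bits 0..me  */
    return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Emulate "rldic ra,rs,sh,mb": rotate left by sh, then AND with
   MASK(mb, 63 - sh).  */
uint64_t rldic(uint64_t rs, unsigned sh, unsigned mb)
{
    assert(sh < 64);
    return rotl64(rs, sh) & ppc_mask64(mb, 63 - sh);
}
```

With the value that "li -31439" would load (0xffffffffffff8531), rldic(v, 32, 12) gives 0x000f853100000000 (a contiguous mask), and rldic(v, 32, 40) gives 0xffff853100ffffff (a wrap-around mask). These correspond to the first two bit patterns described in the patch comments.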
Re: [PATCH 4/4] rs6000: build constant via li/lis;rldic
On Wed, Jun 7, 2023 at 9:56 PM Jiufu Guo wrote:

> Hi,
>
> This patch checks whether a constant can be built by "li;rldic".
> We only need to take care of "negative li"; other forms do not need to be
> checked.
> For example, "negative lis" is just a "negative li" with an additional
> shift.
>
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
>
> BR,
> Jeff (Jiufu)
>
> gcc/ChangeLog:
>
>         * config/rs6000/rs6000.cc (can_be_built_by_li_and_rldic): New function.
>         (rs6000_emit_set_long_const): Call can_be_built_by_li_and_rldic.
>

This is okay.

Do you have any measurement of how expensive it is to test all of these
additional methods to generate a constant?  How much does this affect the
compile time?

Thanks, David

>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/const-build.c: Add more tests.
> ---
>  gcc/config/rs6000/rs6000.cc                   | 61 ++-
>  .../gcc.target/powerpc/const-build.c          | 28 +
>  2 files changed, 88 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 2a3fa733b45..cd04b6b5c82 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10387,6 +10387,64 @@ can_be_built_by_li_lis_and_rldicr (HOST_WIDE_INT c, int *shift,
>    return false;
>  }
>
> +/* Check if value C can be built by 2 instructions: one is 'li', another is
> +   rldic.
> +
> +   If so, *SHIFT is set to the 'shift' operand of rldic; and *MASK is set
> +   to the mask value about the 'mb' operand of rldic; and return true.
> +   Return false otherwise.  */
> +
> +static bool
> +can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int *shift, HOST_WIDE_INT *mask)
> +{
> +  /* There are 49 successive ones in the negative value of 'li'.  */
> +  int ones = 49;
> +
> +  /* 1..1xx1..1: negative value of li --> 0..01..1xx0..0:
> +     right bits are shifted as 0's, and left 1's(and x's) are cleaned.  */
> +  int tz = ctz_hwi (c);
> +  int lz = clz_hwi (c);
> +  int middle_ones = clz_hwi (~(c << lz));
> +  if (tz + lz + middle_ones >= ones)
> +    {
> +      *mask = ((1LL << (HOST_BITS_PER_WIDE_INT - tz - lz)) - 1LL) << tz;
> +      *shift = tz;
> +      return true;
> +    }
> +
> +  /* 1..1xx1..1 --> 1..1xx0..01..1: some 1's(following x's) are cleaned.  */
> +  int leading_ones = clz_hwi (~c);
> +  int tailing_ones = ctz_hwi (~c);
> +  int middle_zeros = ctz_hwi (c >> tailing_ones);
> +  if (leading_ones + tailing_ones + middle_zeros >= ones)
> +    {
> +      *mask = ~(((1ULL << middle_zeros) - 1ULL) << tailing_ones);
> +      *shift = tailing_ones + middle_zeros;
> +      return true;
> +    }
> +
> +  /* xx1..1xx: --> xx0..01..1xx: some 1's(following x's) are cleaned.  */
> +  /* Get the position for the first bit of successive 1.
> +     The 24th bit would be in successive 0 or 1.  */
> +  HOST_WIDE_INT low_mask = (1LL << 24) - 1LL;
> +  int pos_first_1 = ((c & (low_mask + 1)) == 0)
> +                      ? clz_hwi (c & low_mask)
> +                      : HOST_BITS_PER_WIDE_INT - ctz_hwi (~(c | low_mask));
> +  middle_ones = clz_hwi (~c << pos_first_1);
> +  middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_first_1));
> +  if (pos_first_1 < HOST_BITS_PER_WIDE_INT
> +      && middle_ones + middle_zeros < HOST_BITS_PER_WIDE_INT
> +      && middle_ones + middle_zeros >= ones)
> +    {
> +      *mask = ~(((1ULL << middle_zeros) - 1LL)
> +                << (HOST_BITS_PER_WIDE_INT - pos_first_1));
> +      *shift = HOST_BITS_PER_WIDE_INT - pos_first_1 + middle_zeros;
> +      return true;
> +    }
> +
> +  return false;
> +}
> +
>  /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
>     Output insns to set DEST equal to the constant C as a series of
>     lis, ori and shl instructions.  */
> @@ -10435,7 +10493,8 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>      }
>    else if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
>             || can_be_built_by_li_lis_and_rldicl (c, &shift, &mask)
> -           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask))
> +           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask)
> +           || can_be_built_by_li_and_rldic (c, &shift, &mask))
>      {
>        temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
>        unsigned HOST_WIDE_INT imm = (c | ~mask);
> diff --git a/gcc/testsuite/gcc.target/powerpc/const-build.c b/gcc/testsuite/gcc.target/powerpc/const-build.c
> index 8c209921d41..b503ee31c7c 100644
> --- a/gcc/testsuite/gcc.target/powerpc/const-build.c
> +++ b/gcc/testsuite/gcc.target/powerpc/const-build.c
> @@ -82,6 +82,29 @@ lis_rldicr_12 (void)
>    return 0x5310LL;
>  }
>
> +long long NOIPA
> +li_rldic_13 (void)
> +{
> +  return 0x000f8531LL;
> +}
> +long long NOIPA
> +li_rldic_14 (void)
> +{
> +  return 0x853100ffLL;
> +}
> +
> +long long NOIPA
> +li_rldic_15 (void)
> +{
> +  return 0x8031LL;
> +}
> +
> +long long NOIPA
> +li_rldic_16 (void)
> +{
>
[PATCH 4/4] rs6000: build constant via li/lis;rldic
Hi,

This patch checks whether a constant can be built by "li;rldic".
We only need to take care of "negative li"; other forms do not need to be
checked.
For example, "negative lis" is just a "negative li" with an additional shift.

Bootstrap and regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu)

gcc/ChangeLog:

        * config/rs6000/rs6000.cc (can_be_built_by_li_and_rldic): New function.
        (rs6000_emit_set_long_const): Call can_be_built_by_li_and_rldic.

gcc/testsuite/ChangeLog:

        * gcc.target/powerpc/const-build.c: Add more tests.
---
 gcc/config/rs6000/rs6000.cc                   | 61 ++-
 .../gcc.target/powerpc/const-build.c          | 28 +
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 2a3fa733b45..cd04b6b5c82 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10387,6 +10387,64 @@ can_be_built_by_li_lis_and_rldicr (HOST_WIDE_INT c, int *shift,
   return false;
 }
 
+/* Check if value C can be built by 2 instructions: one is 'li', another is
+   rldic.
+
+   If so, *SHIFT is set to the 'shift' operand of rldic; and *MASK is set
+   to the mask value about the 'mb' operand of rldic; and return true.
+   Return false otherwise.  */
+
+static bool
+can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int *shift, HOST_WIDE_INT *mask)
+{
+  /* There are 49 successive ones in the negative value of 'li'.  */
+  int ones = 49;
+
+  /* 1..1xx1..1: negative value of li --> 0..01..1xx0..0:
+     right bits are shifted as 0's, and left 1's(and x's) are cleaned.  */
+  int tz = ctz_hwi (c);
+  int lz = clz_hwi (c);
+  int middle_ones = clz_hwi (~(c << lz));
+  if (tz + lz + middle_ones >= ones)
+    {
+      *mask = ((1LL << (HOST_BITS_PER_WIDE_INT - tz - lz)) - 1LL) << tz;
+      *shift = tz;
+      return true;
+    }
+
+  /* 1..1xx1..1 --> 1..1xx0..01..1: some 1's(following x's) are cleaned.  */
+  int leading_ones = clz_hwi (~c);
+  int tailing_ones = ctz_hwi (~c);
+  int middle_zeros = ctz_hwi (c >> tailing_ones);
+  if (leading_ones + tailing_ones + middle_zeros >= ones)
+    {
+      *mask = ~(((1ULL << middle_zeros) - 1ULL) << tailing_ones);
+      *shift = tailing_ones + middle_zeros;
+      return true;
+    }
+
+  /* xx1..1xx: --> xx0..01..1xx: some 1's(following x's) are cleaned.  */
+  /* Get the position for the first bit of successive 1.
+     The 24th bit would be in successive 0 or 1.  */
+  HOST_WIDE_INT low_mask = (1LL << 24) - 1LL;
+  int pos_first_1 = ((c & (low_mask + 1)) == 0)
+                      ? clz_hwi (c & low_mask)
+                      : HOST_BITS_PER_WIDE_INT - ctz_hwi (~(c | low_mask));
+  middle_ones = clz_hwi (~c << pos_first_1);
+  middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_first_1));
+  if (pos_first_1 < HOST_BITS_PER_WIDE_INT
+      && middle_ones + middle_zeros < HOST_BITS_PER_WIDE_INT
+      && middle_ones + middle_zeros >= ones)
+    {
+      *mask = ~(((1ULL << middle_zeros) - 1LL)
+                << (HOST_BITS_PER_WIDE_INT - pos_first_1));
+      *shift = HOST_BITS_PER_WIDE_INT - pos_first_1 + middle_zeros;
+      return true;
+    }
+
+  return false;
+}
+
 /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
    Output insns to set DEST equal to the constant C as a series of
    lis, ori and shl instructions.  */
@@ -10435,7 +10493,8 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
     }
   else if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
            || can_be_built_by_li_lis_and_rldicl (c, &shift, &mask)
-           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask))
+           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask)
+           || can_be_built_by_li_and_rldic (c, &shift, &mask))
     {
       temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
       unsigned HOST_WIDE_INT imm = (c | ~mask);
diff --git a/gcc/testsuite/gcc.target/powerpc/const-build.c b/gcc/testsuite/gcc.target/powerpc/const-build.c
index 8c209921d41..b503ee31c7c 100644
--- a/gcc/testsuite/gcc.target/powerpc/const-build.c
+++ b/gcc/testsuite/gcc.target/powerpc/const-build.c
@@ -82,6 +82,29 @@ lis_rldicr_12 (void)
   return 0x5310LL;
 }
 
+long long NOIPA
+li_rldic_13 (void)
+{
+  return 0x000f8531LL;
+}
+long long NOIPA
+li_rldic_14 (void)
+{
+  return 0x853100ffLL;
+}
+
+long long NOIPA
+li_rldic_15 (void)
+{
+  return 0x8031LL;
+}
+
+long long NOIPA
+li_rldic_16 (void)
+{
+  return 0x8f31LL;
+}
+
 struct fun arr[] = {
   {li_rotldi_1, 0x75310LL},
   {li_rotldi_2, 0x2164LL},
@@ -95,11 +118,16 @@ struct fun arr[] = {
   {li_rldicr_10, 0x8531fff0LL},
   {li_rldicr_11, 0x21f0LL},
   {lis_rldicr_12, 0x5310LL},
+  {li_rldic_13, 0x000f8531LL},
+  {li_rldic_14, 0x853100ffLL},
+  {li_rldic_15, 0x8031LL},
+  {li_rldic_16, 0x8f31LL}
 };
 
 /* { dg-final { scan-assembler-times
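As a rough, standalone illustration of the first pattern in the patch (0..01..1xx0..0), the sketch below re-implements that check on uint64_t and then confirms that "li imm; rotate left by shift; AND mask" reproduces the constant. Several things here are assumptions made for the sketch and are not taken from the patch: the helper names (clz64, ctz64, rotl64, rotr64, fits_li, li_rldic_case1, check_case1), the extra guard for c == 0 or a full-width mask (presumably such values are handled by earlier checks in GCC), and the step that rotates c | ~mask right by shift to obtain the li immediate, since the diff context only shows imm = (c | ~mask).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count leading zeros; returns 64 for x == 0 (like GCC's clz_hwi).  */
int clz64(uint64_t x)
{
    int n = 0;
    for (uint64_t bit = 1ULL << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Count trailing zeros; returns 64 for x == 0 (like GCC's ctz_hwi).  */
int ctz64(uint64_t x)
{
    int n = 0;
    for (uint64_t bit = 1; bit != 0 && (x & bit) == 0; bit <<= 1)
        n++;
    return n;
}

uint64_t rotl64(uint64_t x, unsigned n)
{
    assert(n < 64);
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}

uint64_t rotr64(uint64_t x, unsigned n)
{
    assert(n < 64);
    return n == 0 ? x : (x >> n) | (x << (64 - n));
}

/* True if v is a sign-extended 16-bit value, i.e. loadable by one 'li'.  */
bool fits_li(uint64_t v)
{
    return v <= 0x7fffULL || v >= 0xffffffffffff8000ULL;
}

/* First pattern of the patch: 0..01..1xx0..0.  */
bool li_rldic_case1(uint64_t c, int *shift, uint64_t *mask)
{
    const int ones = 49;   /* 49 successive ones in a negative 'li' value */
    int tz = ctz64(c);
    int lz = clz64(c);
    /* Sketch-only guard: keeps every shift count below 64.  */
    if (c == 0 || tz + lz == 0)
        return false;
    int middle_ones = clz64(~(c << lz));
    if (tz + lz + middle_ones < ones)
        return false;
    *mask = ((1ULL << (64 - tz - lz)) - 1) << tz;
    *shift = tz;
    return true;
}

/* Check that li + rotate + mask really rebuilds c.  */
bool check_case1(uint64_t c)
{
    int shift;
    uint64_t mask;
    if (!li_rldic_case1(c, &shift, &mask))
        return false;
    /* Assumption: the li immediate is (c | ~mask) rotated right by shift.  */
    uint64_t imm = rotr64(c | ~mask, (unsigned)shift);
    return fits_li(imm) && (rotl64(imm, (unsigned)shift) & mask) == c;
}
```

For the illustrative constant 0x000f853100000000 this yields shift 32 and mask 0x000fffff00000000, and the li immediate works out to 0xffffffffffff8531 (-31439). The check itself is a handful of bit counts and comparisons, which fits the expectation in the thread that it adds little compile time.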
[PATCH 4/4] rs6000: build constant via li/lis;rldic
Hi,

This patch checks whether a constant can be built by "li;rldic".
We only need to take care of "negative li"; other forms do not need to be
checked.
For example, "negative lis" is just a "negative li" with an additional shift.

Bootstrap and regtest pass on ppc64{,le}.
Is this ok for trunk or next stage1?

BR,
Jeff (Jiufu)

gcc/ChangeLog:

        * config/rs6000/rs6000.cc (can_be_built_by_li_and_rldic): New function.
        (rs6000_emit_set_long_const): Call can_be_built_by_li_and_rldic.

gcc/testsuite/ChangeLog:

        * gcc.target/powerpc/const-build.c: Add more tests.
---
 gcc/config/rs6000/rs6000.cc                   | 60 ++-
 .../gcc.target/powerpc/const-build.c          | 28 +
 2 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 025abaa436e..59b4e422058 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10361,6 +10361,63 @@ can_be_built_by_li_lis_and_rldicr (HOST_WIDE_INT c, int *shift,
   return false;
 }
 
+/* Check if value C can be built by 2 instructions: one is 'li', another is
+   rldic.
+
+   If so, *SHIFT is set to the 'shift' operand of rldic; and *MASK is set
+   to the mask value about the 'mb' operand of rldic; and return true.
+   Return false otherwise.  */
+static bool
+can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int *shift, HOST_WIDE_INT *mask)
+{
+  /* There are 49 successive ones in the negative value of 'li'.  */
+  int ones = 49;
+
+  /* 1..1xx1..1: negative value of li --> 0..01..1xx0..0:
+     right bits are shifted as 0's, and left 1's(and x's) are cleaned.  */
+  int tz = ctz_hwi (c);
+  int lz = clz_hwi (c);
+  int middle_ones = clz_hwi (~(c << lz));
+  if (tz + lz + middle_ones >= ones)
+    {
+      *mask = ((1LL << (HOST_BITS_PER_WIDE_INT - tz - lz)) - 1LL) << tz;
+      *shift = tz;
+      return true;
+    }
+
+  /* 1..1xx1..1 --> 1..1xx0..01..1: some 1's(following x's) are cleaned.  */
+  int leading_ones = clz_hwi (~c);
+  int tailing_ones = ctz_hwi (~c);
+  int middle_zeros = ctz_hwi (c >> tailing_ones);
+  if (leading_ones + tailing_ones + middle_zeros >= ones)
+    {
+      *mask = ~(((1ULL << middle_zeros) - 1ULL) << tailing_ones);
+      *shift = tailing_ones + middle_zeros;
+      return true;
+    }
+
+  /* xx1..1xx: --> xx0..01..1xx: some 1's(following x's) are cleaned.  */
+  /* Get the position for the first bit of successive 1.
+     The 24th bit would be in successive 0 or 1.  */
+  HOST_WIDE_INT low_mask = (1LL << 24) - 1LL;
+  int pos_first_1 = ((c & (low_mask + 1)) == 0)
+                      ? clz_hwi (c & low_mask)
+                      : HOST_BITS_PER_WIDE_INT - ctz_hwi (~(c | low_mask));
+  middle_ones = clz_hwi (~c << pos_first_1);
+  middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_first_1));
+  if (pos_first_1 < HOST_BITS_PER_WIDE_INT
+      && middle_ones + middle_zeros < HOST_BITS_PER_WIDE_INT
+      && middle_ones + middle_zeros >= ones)
+    {
+      *mask = ~(((1ULL << middle_zeros) - 1LL)
+                << (HOST_BITS_PER_WIDE_INT - pos_first_1));
+      *shift = HOST_BITS_PER_WIDE_INT - pos_first_1 + middle_zeros;
+      return true;
+    }
+
+  return false;
+}
+
 /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
    Output insns to set DEST equal to the constant C as a series of
    lis, ori and shl instructions.  */
@@ -10402,7 +10459,8 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
     }
   else if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
            || can_be_built_by_li_lis_and_rldicl (c, &shift, &mask)
-           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask))
+           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask)
+           || can_be_built_by_li_and_rldic (c, &shift, &mask))
     {
       temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
       unsigned HOST_WIDE_INT imm = (c | ~mask);
diff --git a/gcc/testsuite/gcc.target/powerpc/const-build.c b/gcc/testsuite/gcc.target/powerpc/const-build.c
index 8c209921d41..b503ee31c7c 100644
--- a/gcc/testsuite/gcc.target/powerpc/const-build.c
+++ b/gcc/testsuite/gcc.target/powerpc/const-build.c
@@ -82,6 +82,29 @@ lis_rldicr_12 (void)
   return 0x5310LL;
 }
 
+long long NOIPA
+li_rldic_13 (void)
+{
+  return 0x000f8531LL;
+}
+long long NOIPA
+li_rldic_14 (void)
+{
+  return 0x853100ffLL;
+}
+
+long long NOIPA
+li_rldic_15 (void)
+{
+  return 0x8031LL;
+}
+
+long long NOIPA
+li_rldic_16 (void)
+{
+  return 0x8f31LL;
+}
+
 struct fun arr[] = {
   {li_rotldi_1, 0x75310LL},
   {li_rotldi_2, 0x2164LL},
@@ -95,11 +118,16 @@ struct fun arr[] = {
   {li_rldicr_10, 0x8531fff0LL},
   {li_rldicr_11, 0x21f0LL},
   {lis_rldicr_12, 0x5310LL},
+  {li_rldic_13, 0x000f8531LL},
+  {li_rldic_14, 0x853100ffLL},
+  {li_rldic_15, 0x8031LL},
+  {li_rldic_16, 0x8f31LL}
 };
 
 /* { dg-final {
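The second pattern in the patch (1..1xx0..01..1) uses a wrap-around mask: only a run of zeros in the middle is cleared. The sketch below is a hedged, standalone rendering of that one branch on uint64_t. The helper names (clz64, ctz64, rotl64, rotr64, fits_li, li_rldic_case2, check_case2) are hypothetical, the guard against an all-ones or zeros-and-trailing-ones value is added only to keep shift counts in range here, and the rotate-right used to derive the li immediate is an assumption, as the diff context only shows imm = (c | ~mask).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count leading zeros; returns 64 for x == 0.  */
int clz64(uint64_t x)
{
    int n = 0;
    for (uint64_t bit = 1ULL << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Count trailing zeros; returns 64 for x == 0.  */
int ctz64(uint64_t x)
{
    int n = 0;
    for (uint64_t bit = 1; bit != 0 && (x & bit) == 0; bit <<= 1)
        n++;
    return n;
}

uint64_t rotl64(uint64_t x, unsigned n)
{
    assert(n < 64);
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}

uint64_t rotr64(uint64_t x, unsigned n)
{
    assert(n < 64);
    return n == 0 ? x : (x >> n) | (x << (64 - n));
}

/* True if v is a sign-extended 16-bit value, i.e. loadable by one 'li'.  */
bool fits_li(uint64_t v)
{
    return v <= 0x7fffULL || v >= 0xffffffffffff8000ULL;
}

/* Second pattern of the patch: 1..1xx0..01..1.  */
bool li_rldic_case2(uint64_t c, int *shift, uint64_t *mask)
{
    const int ones = 49;
    int leading_ones = clz64(~c);
    int tailing_ones = ctz64(~c);
    /* Sketch-only guard: nothing left above the trailing ones.  */
    if (tailing_ones == 64 || (c >> tailing_ones) == 0)
        return false;
    /* The patch shifts a signed value; since the shifted value is non-zero
       here, its trailing-zero count is the same with a logical shift.  */
    int middle_zeros = ctz64(c >> tailing_ones);
    if (leading_ones + tailing_ones + middle_zeros < ones)
        return false;
    *mask = ~(((1ULL << middle_zeros) - 1ULL) << tailing_ones);
    *shift = tailing_ones + middle_zeros;
    return true;
}

/* Check that li + rotate + mask really rebuilds c.  */
bool check_case2(uint64_t c)
{
    int shift;
    uint64_t mask;
    if (!li_rldic_case2(c, &shift, &mask))
        return false;
    /* Assumption: the li immediate is (c | ~mask) rotated right by shift.  */
    uint64_t imm = rotr64(c | ~mask, (unsigned)shift);
    return fits_li(imm) && (rotl64(imm, (unsigned)shift) & mask) == c;
}
```

For the illustrative constant 0xffff853100ffffff the counts are 17 leading ones, 24 trailing ones and 8 middle zeros, which add up to exactly 49, so the sketch returns shift 32 and mask 0xffffffff00ffffff.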