Re: [PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-25 Thread Jiufu Guo


Hi,

"Kewen.Lin"  writes:

> on 2023/10/25 16:14, Jiufu Guo wrote:
>> 
>> Hi,
>> 
>> "Kewen.Lin"  writes:
>> 
>>> Hi,
>>>
>>> on 2023/10/25 10:00, Jiufu Guo wrote:
 Hi,

 Sometimes, a complicated constant is built via 3(or more)
 instructions to build. Generally speaking, it would not be
 as faster as loading it from the constant pool (as a few
 discussions in PR63281).
>>>
>>> I may miss some previous discussions, but I'm curious why we
>>> chose ">=3" here, as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c9
>>> which indicates that more than 3 (>3) should be considered
>>> with this change.
>> 
>> Thanks a lot for your great patience for reading the history!
>> Yes, there are some discussions about "> 3" vs. "> 2".
>> - In theory, "ld" is one instruction.  If consider "address/toc"
>>   adjust, we may count it as 2 instructions. "pld" may need less
>>   cycles.
>
> OK, even without prefixed insn support, the high part of address
> computation could be optimized as nop by linker further.  It would
> be good to say something on this in commit log, otherwise people
> may be confused as the PR comment mentioned above.
>
>> - As test, it seems "> 2" could get better/stable runtime result
>>   during testing SPEC2017.
>
> Ok, if you posted the conclusion previously, it would be good to
> mention it here with a link on the result comparisons.

Thanks a lot for your sugguestions.

>
>> 
>>>

 For the concern that I raised in:
 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599676.html
 The micro-cases would not be the major concern. Because as
 Segher explained in:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c18
 It would just be about the benchmark method.

 As tested on spec2017, for visible performance changes, we
 can find the runtime improvement on 500.perlbench_r about
 ~1.8% (-O2) when support loading complicates constant from
 constant pool. And no visible performance recession on
 other benchmarks.
>>>
>>> The improvement on 500.perlbench_r looks to match what PR63281
>>> mentioned, nice!  I'm curious that which options and which kinds
>>> of CPUs have you tested with?  Since this is a general change,
>>> I'd expect we can test with P8/P9/P10 at O2/O3 (or Ofast) at
>>> least.
>> 
>> Great advice! Thanks for pointing this!
>> A few months ago, P8/P9/P10 are tested. While this time, I rerun
>> SPEC2017 on P10 for -O2 and -O3.  More test on latest code would
>> be better.
>
> Was it tested previously with your recent commits on constant
> building together?  or just with the trunk at that time?  Anyway,

Just with the trunk at that time.

> I was curious how it's tested, thanks for replying, good to see
> those are covered.  :)  I'd leave the further review to Segher and
> David.

Thanks again.

BR,
Jeff (Jiufu Guo)

>
> BR,
> Kewen
>
>> 
>> 
>> BR,
>> Jeff (Jiufu Guo)
>> 
>>>
>>> BR,
>>> Kewen
>>>

 Boostrap & regtest pass on ppc64{,le}.
 Is this ok for trunk?

 BR,
 Jeff (Jiufu Guo)

PR target/63281

 gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split
complicate constant to memory.

 gcc/testsuite/ChangeLog:

* gcc.target/powerpc/const_anchors.c: Update to test final-rtl. 
* gcc.target/powerpc/parall_5insn_const.c: Update to keep original test
point.
* gcc.target/powerpc/pr106550.c: Likewise..
* gcc.target/powerpc/pr106550_1.c: Likewise.
* gcc.target/powerpc/pr87870.c: Update according to latest behavior.
* gcc.target/powerpc/pr93012.c: Likewise.

 ---
  gcc/config/rs6000/rs6000.cc | 16 
  .../gcc.target/powerpc/const_anchors.c  |  5 ++---
  .../gcc.target/powerpc/parall_5insn_const.c | 14 --
  gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++--
  gcc/testsuite/gcc.target/powerpc/pr106550_1.c   | 15 +--
  gcc/testsuite/gcc.target/powerpc/pr87870.c  |  5 -
  gcc/testsuite/gcc.target/powerpc/pr93012.c  |  4 +++-
  7 files changed, 65 insertions(+), 11 deletions(-)

 diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
 index 4690384cdbe..b9562f1ea0f 100644
 --- a/gcc/config/rs6000/rs6000.cc
 +++ b/gcc/config/rs6000/rs6000.cc
 @@ -10292,6 +10292,22 @@ rs6000_emit_set_const (rtx dest, rtx source)
  c = sext_hwi (c, 32);
  emit_move_insn (lo, GEN_INT (c));
}
 +
 +  /* If it can be stored to the constant pool and profitable.  */
 +  else if (base_reg_operand (dest, mode)
 + && num_insns_constant (source, mode) > 2)
 +  {
 +rtx sym = force_const_mem (mode, source);
 +if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
 +&& use_toc_relative_ref (XEXP (sym, 0), mode))
 +  {
 +

Re: [PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-25 Thread Kewen.Lin
on 2023/10/25 16:14, Jiufu Guo wrote:
> 
> Hi,
> 
> "Kewen.Lin"  writes:
> 
>> Hi,
>>
>> on 2023/10/25 10:00, Jiufu Guo wrote:
>>> Hi,
>>>
>>> Sometimes, a complicated constant is built via 3(or more)
>>> instructions to build. Generally speaking, it would not be
>>> as faster as loading it from the constant pool (as a few
>>> discussions in PR63281).
>>
>> I may miss some previous discussions, but I'm curious why we
>> chose ">=3" here, as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c9
>> which indicates that more than 3 (>3) should be considered
>> with this change.
> 
> Thanks a lot for your great patience for reading the history!
> Yes, there are some discussions about "> 3" vs. "> 2".
> - In theory, "ld" is one instruction.  If consider "address/toc"
>   adjust, we may count it as 2 instructions. "pld" may need less
>   cycles.

OK, even without prefixed insn support, the high part of address
computation could be optimized as nop by linker further.  It would
be good to say something on this in commit log, otherwise people
may be confused as the PR comment mentioned above.

> - As test, it seems "> 2" could get better/stable runtime result
>   during testing SPEC2017.

Ok, if you posted the conclusion previously, it would be good to
mention it here with a link on the result comparisons.

> 
>>
>>>
>>> For the concern that I raised in:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599676.html
>>> The micro-cases would not be the major concern. Because as
>>> Segher explained in:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c18
>>> It would just be about the benchmark method.
>>>
>>> As tested on spec2017, for visible performance changes, we
>>> can find the runtime improvement on 500.perlbench_r about
>>> ~1.8% (-O2) when support loading complicates constant from
>>> constant pool. And no visible performance recession on
>>> other benchmarks.
>>
>> The improvement on 500.perlbench_r looks to match what PR63281
>> mentioned, nice!  I'm curious that which options and which kinds
>> of CPUs have you tested with?  Since this is a general change,
>> I'd expect we can test with P8/P9/P10 at O2/O3 (or Ofast) at
>> least.
> 
> Great advice! Thanks for pointing this!
> A few months ago, P8/P9/P10 are tested. While this time, I rerun
> SPEC2017 on P10 for -O2 and -O3.  More test on latest code would
> be better.

Was it tested previously with your recent commits on constant
building together?  or just with the trunk at that time?  Anyway,
I was curious how it's tested, thanks for replying, good to see
those are covered.  :)  I'd leave the further review to Segher and
David.

BR,
Kewen

> 
> 
> BR,
> Jeff (Jiufu Guo)
> 
>>
>> BR,
>> Kewen
>>
>>>
>>> Boostrap & regtest pass on ppc64{,le}.
>>> Is this ok for trunk?
>>>
>>> BR,
>>> Jeff (Jiufu Guo)
>>>
>>> PR target/63281
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split
>>> complicate constant to memory.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/const_anchors.c: Update to test final-rtl. 
>>> * gcc.target/powerpc/parall_5insn_const.c: Update to keep original test
>>> point.
>>> * gcc.target/powerpc/pr106550.c: Likewise..
>>> * gcc.target/powerpc/pr106550_1.c: Likewise.
>>> * gcc.target/powerpc/pr87870.c: Update according to latest behavior.
>>> * gcc.target/powerpc/pr93012.c: Likewise.
>>>
>>> ---
>>>  gcc/config/rs6000/rs6000.cc | 16 
>>>  .../gcc.target/powerpc/const_anchors.c  |  5 ++---
>>>  .../gcc.target/powerpc/parall_5insn_const.c | 14 --
>>>  gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++--
>>>  gcc/testsuite/gcc.target/powerpc/pr106550_1.c   | 15 +--
>>>  gcc/testsuite/gcc.target/powerpc/pr87870.c  |  5 -
>>>  gcc/testsuite/gcc.target/powerpc/pr93012.c  |  4 +++-
>>>  7 files changed, 65 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>>> index 4690384cdbe..b9562f1ea0f 100644
>>> --- a/gcc/config/rs6000/rs6000.cc
>>> +++ b/gcc/config/rs6000/rs6000.cc
>>> @@ -10292,6 +10292,22 @@ rs6000_emit_set_const (rtx dest, rtx source)
>>>   c = sext_hwi (c, 32);
>>>   emit_move_insn (lo, GEN_INT (c));
>>> }
>>> +
>>> +  /* If it can be stored to the constant pool and profitable.  */
>>> +  else if (base_reg_operand (dest, mode)
>>> +  && num_insns_constant (source, mode) > 2)
>>> +   {
>>> + rtx sym = force_const_mem (mode, source);
>>> + if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
>>> + && use_toc_relative_ref (XEXP (sym, 0), mode))
>>> +   {
>>> + rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest));
>>> + sym = gen_const_mem (mode, toc);
>>> + set_mem_alias_set (sym, get_TOC_alias_set ());
>>> +   }
>>> +
>>> + emit_insn (gen_rtx_SET (dest, sym));
>>> +   }
>>>else

Re: [PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-25 Thread Jiufu Guo


Hi,

"Kewen.Lin"  writes:

> Hi,
>
> on 2023/10/25 10:00, Jiufu Guo wrote:
>> Hi,
>> 
>> Sometimes, a complicated constant is built via 3(or more)
>> instructions to build. Generally speaking, it would not be
>> as faster as loading it from the constant pool (as a few
>> discussions in PR63281).
>
> I may miss some previous discussions, but I'm curious why we
> chose ">=3" here, as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c9
> which indicates that more than 3 (>3) should be considered
> with this change.

Thanks a lot for your great patience for reading the history!
Yes, there are some discussions about "> 3" vs. "> 2".
- In theory, "ld" is one instruction.  If consider "address/toc"
  adjust, we may count it as 2 instructions. "pld" may need less
  cycles.
- As test, it seems "> 2" could get better/stable runtime result
  during testing SPEC2017.

>
>> 
>> For the concern that I raised in:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599676.html
>> The micro-cases would not be the major concern. Because as
>> Segher explained in:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c18
>> It would just be about the benchmark method.
>> 
>> As tested on spec2017, for visible performance changes, we
>> can find the runtime improvement on 500.perlbench_r about
>> ~1.8% (-O2) when support loading complicates constant from
>> constant pool. And no visible performance recession on
>> other benchmarks.
>
> The improvement on 500.perlbench_r looks to match what PR63281
> mentioned, nice!  I'm curious that which options and which kinds
> of CPUs have you tested with?  Since this is a general change,
> I'd expect we can test with P8/P9/P10 at O2/O3 (or Ofast) at
> least.

Great advice! Thanks for pointing this!
A few months ago, P8/P9/P10 are tested. While this time, I rerun
SPEC2017 on P10 for -O2 and -O3.  More test on latest code would
be better.


BR,
Jeff (Jiufu Guo)

>
> BR,
> Kewen
>
>> 
>> Boostrap & regtest pass on ppc64{,le}.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu Guo)
>> 
>>  PR target/63281
>> 
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split
>>  complicate constant to memory.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/powerpc/const_anchors.c: Update to test final-rtl. 
>>  * gcc.target/powerpc/parall_5insn_const.c: Update to keep original test
>>  point.
>>  * gcc.target/powerpc/pr106550.c: Likewise..
>>  * gcc.target/powerpc/pr106550_1.c: Likewise.
>>  * gcc.target/powerpc/pr87870.c: Update according to latest behavior.
>>  * gcc.target/powerpc/pr93012.c: Likewise.
>> 
>> ---
>>  gcc/config/rs6000/rs6000.cc | 16 
>>  .../gcc.target/powerpc/const_anchors.c  |  5 ++---
>>  .../gcc.target/powerpc/parall_5insn_const.c | 14 --
>>  gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++--
>>  gcc/testsuite/gcc.target/powerpc/pr106550_1.c   | 15 +--
>>  gcc/testsuite/gcc.target/powerpc/pr87870.c  |  5 -
>>  gcc/testsuite/gcc.target/powerpc/pr93012.c  |  4 +++-
>>  7 files changed, 65 insertions(+), 11 deletions(-)
>> 
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 4690384cdbe..b9562f1ea0f 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -10292,6 +10292,22 @@ rs6000_emit_set_const (rtx dest, rtx source)
>>c = sext_hwi (c, 32);
>>emit_move_insn (lo, GEN_INT (c));
>>  }
>> +
>> +  /* If it can be stored to the constant pool and profitable.  */
>> +  else if (base_reg_operand (dest, mode)
>> +   && num_insns_constant (source, mode) > 2)
>> +{
>> +  rtx sym = force_const_mem (mode, source);
>> +  if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
>> +  && use_toc_relative_ref (XEXP (sym, 0), mode))
>> +{
>> +  rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest));
>> +  sym = gen_const_mem (mode, toc);
>> +  set_mem_alias_set (sym, get_TOC_alias_set ());
>> +}
>> +
>> +  emit_insn (gen_rtx_SET (dest, sym));
>> +}
>>else
>>  rs6000_emit_set_long_const (dest, c);
>>break;
>> diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c 
>> b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
>> index 542e2674b12..188744165f2 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
>> @@ -1,5 +1,5 @@
>>  /* { dg-do compile { target has_arch_ppc64 } } */
>> -/* { dg-options "-O2" } */
>> +/* { dg-options "-O2 -fdump-rtl-final" } */
>>  
>>  #define C1 0x2351847027482577ULL
>>  #define C2 0x2351847027482578ULL
>> @@ -16,5 +16,4 @@ void __attribute__ ((noinline)) foo1 (long long *a, long 
>> long b)
>>if (b)
>>  *a++ = C2;
>>  }
>> -
>> -/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
>> +/* { dg-final { 

Re: [PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-24 Thread Kewen.Lin
Hi,

on 2023/10/25 10:00, Jiufu Guo wrote:
> Hi,
> 
> Sometimes, a complicated constant is built via 3(or more)
> instructions to build. Generally speaking, it would not be
> as faster as loading it from the constant pool (as a few
> discussions in PR63281).

I may miss some previous discussions, but I'm curious why we
chose ">=3" here, as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c9
which indicates that more than 3 (>3) should be considered
with this change.

> 
> For the concern that I raised in:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599676.html
> The micro-cases would not be the major concern. Because as
> Segher explained in:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c18
> It would just be about the benchmark method.
> 
> As tested on spec2017, for visible performance changes, we
> can find the runtime improvement on 500.perlbench_r about
> ~1.8% (-O2) when support loading complicates constant from
> constant pool. And no visible performance recession on
> other benchmarks.

The improvement on 500.perlbench_r looks to match what PR63281
mentioned, nice!  I'm curious that which options and which kinds
of CPUs have you tested with?  Since this is a general change,
I'd expect we can test with P8/P9/P10 at O2/O3 (or Ofast) at
least.

BR,
Kewen

> 
> Boostrap & regtest pass on ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu Guo)
> 
>   PR target/63281
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split
>   complicate constant to memory.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/const_anchors.c: Update to test final-rtl. 
>   * gcc.target/powerpc/parall_5insn_const.c: Update to keep original test
>   point.
>   * gcc.target/powerpc/pr106550.c: Likewise..
>   * gcc.target/powerpc/pr106550_1.c: Likewise.
>   * gcc.target/powerpc/pr87870.c: Update according to latest behavior.
>   * gcc.target/powerpc/pr93012.c: Likewise.
> 
> ---
>  gcc/config/rs6000/rs6000.cc | 16 
>  .../gcc.target/powerpc/const_anchors.c  |  5 ++---
>  .../gcc.target/powerpc/parall_5insn_const.c | 14 --
>  gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++--
>  gcc/testsuite/gcc.target/powerpc/pr106550_1.c   | 15 +--
>  gcc/testsuite/gcc.target/powerpc/pr87870.c  |  5 -
>  gcc/testsuite/gcc.target/powerpc/pr93012.c  |  4 +++-
>  7 files changed, 65 insertions(+), 11 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 4690384cdbe..b9562f1ea0f 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10292,6 +10292,22 @@ rs6000_emit_set_const (rtx dest, rtx source)
> c = sext_hwi (c, 32);
> emit_move_insn (lo, GEN_INT (c));
>   }
> +
> +  /* If it can be stored to the constant pool and profitable.  */
> +  else if (base_reg_operand (dest, mode)
> +&& num_insns_constant (source, mode) > 2)
> + {
> +   rtx sym = force_const_mem (mode, source);
> +   if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
> +   && use_toc_relative_ref (XEXP (sym, 0), mode))
> + {
> +   rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest));
> +   sym = gen_const_mem (mode, toc);
> +   set_mem_alias_set (sym, get_TOC_alias_set ());
> + }
> +
> +   emit_insn (gen_rtx_SET (dest, sym));
> + }
>else
>   rs6000_emit_set_long_const (dest, c);
>break;
> diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c 
> b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
> index 542e2674b12..188744165f2 100644
> --- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c
> +++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target has_arch_ppc64 } } */
> -/* { dg-options "-O2" } */
> +/* { dg-options "-O2 -fdump-rtl-final" } */
>  
>  #define C1 0x2351847027482577ULL
>  #define C2 0x2351847027482578ULL
> @@ -16,5 +16,4 @@ void __attribute__ ((noinline)) foo1 (long long *a, long 
> long b)
>if (b)
>  *a++ = C2;
>  }
> -
> -/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
> +/* { dg-final { scan-rtl-dump-times {\madddi3\M} 2 "final" } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c 
> b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> index e3a9a7264cf..df0690b90be 100644
> --- a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> @@ -9,8 +9,18 @@
>  void __attribute__ ((noinline)) foo (unsigned long long *a)
>  {
>/* 2 lis + 2 ori + 1 rldimi for each constant.  */
> -  *a++ = 0x800aabcdc167fa16ULL;
> -  *a++ = 0x7543a876867f616ULL;
> +  {
> +register long long d asm("r0") = 0x800aabcdc167fa16ULL;
> +long long n;
> +asm("mr %0, %1" : "=r"(n) : "r"(d));
> +*a++ = 

[PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-24 Thread Jiufu Guo
Hi,

Sometimes, a complicated constant is built via 3(or more)
instructions to build. Generally speaking, it would not be
as faster as loading it from the constant pool (as a few
discussions in PR63281).

For the concern that I raised in:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599676.html
The micro-cases would not be the major concern. Because as
Segher explained in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281#c18
It would just be about the benchmark method.

As tested on spec2017, for visible performance changes, we
can find the runtime improvement on 500.perlbench_r about
~1.8% (-O2) when support loading complicates constant from
constant pool. And no visible performance recession on
other benchmarks.

Boostrap & regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu Guo)

PR target/63281

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split
complicate constant to memory.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/const_anchors.c: Update to test final-rtl. 
* gcc.target/powerpc/parall_5insn_const.c: Update to keep original test
point.
* gcc.target/powerpc/pr106550.c: Likewise..
* gcc.target/powerpc/pr106550_1.c: Likewise.
* gcc.target/powerpc/pr87870.c: Update according to latest behavior.
* gcc.target/powerpc/pr93012.c: Likewise.

---
 gcc/config/rs6000/rs6000.cc | 16 
 .../gcc.target/powerpc/const_anchors.c  |  5 ++---
 .../gcc.target/powerpc/parall_5insn_const.c | 14 --
 gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++--
 gcc/testsuite/gcc.target/powerpc/pr106550_1.c   | 15 +--
 gcc/testsuite/gcc.target/powerpc/pr87870.c  |  5 -
 gcc/testsuite/gcc.target/powerpc/pr93012.c  |  4 +++-
 7 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 4690384cdbe..b9562f1ea0f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10292,6 +10292,22 @@ rs6000_emit_set_const (rtx dest, rtx source)
  c = sext_hwi (c, 32);
  emit_move_insn (lo, GEN_INT (c));
}
+
+  /* If it can be stored to the constant pool and profitable.  */
+  else if (base_reg_operand (dest, mode)
+  && num_insns_constant (source, mode) > 2)
+   {
+ rtx sym = force_const_mem (mode, source);
+ if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
+ && use_toc_relative_ref (XEXP (sym, 0), mode))
+   {
+ rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest));
+ sym = gen_const_mem (mode, toc);
+ set_mem_alias_set (sym, get_TOC_alias_set ());
+   }
+
+ emit_insn (gen_rtx_SET (dest, sym));
+   }
   else
rs6000_emit_set_long_const (dest, c);
   break;
diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c 
b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
index 542e2674b12..188744165f2 100644
--- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c
+++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target has_arch_ppc64 } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdump-rtl-final" } */
 
 #define C1 0x2351847027482577ULL
 #define C2 0x2351847027482578ULL
@@ -16,5 +16,4 @@ void __attribute__ ((noinline)) foo1 (long long *a, long long 
b)
   if (b)
 *a++ = C2;
 }
-
-/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
+/* { dg-final { scan-rtl-dump-times {\madddi3\M} 2 "final" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c 
b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
index e3a9a7264cf..df0690b90be 100644
--- a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
+++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
@@ -9,8 +9,18 @@
 void __attribute__ ((noinline)) foo (unsigned long long *a)
 {
   /* 2 lis + 2 ori + 1 rldimi for each constant.  */
-  *a++ = 0x800aabcdc167fa16ULL;
-  *a++ = 0x7543a876867f616ULL;
+  {
+register long long d asm("r0") = 0x800aabcdc167fa16ULL;
+long long n;
+asm("mr %0, %1" : "=r"(n) : "r"(d));
+*a++ = n;
+  }
+  {
+register long long d asm("r0") = 0x7543a876867f616ULL;
+long long n;
+asm("mr %0, %1" : "=r"(n) : "r"(d));
+*a++ = n;
+  }
 }
 
 long long A[] = {0x800aabcdc167fa16ULL, 0x7543a876867f616ULL};
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c 
b/gcc/testsuite/gcc.target/powerpc/pr106550.c
index 74e395331ab..5eca2b2f701 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106550.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c
@@ -1,12 +1,25 @@
 /* PR target/106550 */
 /* { dg-options "-O2 -mdejagnu-cpu=power10" } */
 /* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target has_arch_ppc64 } */
 
 void
 foo (unsigned long long *a)
 {
-  *a++ =