Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-08-25 Thread HAO CHEN GUI via Gcc-patches
oat va = vec_promote (a, PREF_F); +  vector float vb = vec_promote (b, PREF_F); +  return vec_extract (vec_min (va, vb), PREF_F); +} On 25/8/2021 8:34 PM, Bill Schmidt wrote: Hi Haochen, Thanks for the updates!  This looks good to me; please await Segher's response. Bill On 8/25/21 2:06 AM, HAO

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-08-25 Thread HAO CHEN GUI via Gcc-patches
Hi Kewen,   Thanks for your advice. On 25/8/2021 3:50 PM, Kewen.Lin wrote: Hi Haochen, on 2021/8/25 3:06 PM, HAO CHEN GUI via Gcc-patches wrote: Hi,     I refined the patch according to Bill's advice. I pasted the ChangeLog and diff file here. If it doesn't work, please let me know. Thanks

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-08-25 Thread HAO CHEN GUI via Gcc-patches
8/24/21 3:52 AM, HAO CHEN GUI wrote: Thanks for this patch!  In the future, if you can put your ChangeLog and patch inline in your post, it makes it easier to review.  (Otherwise we have to manually copy it into our response and manipulate it to look quoted, etc.) It is encoded even, making it

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-08-25 Thread HAO CHEN GUI via Gcc-patches
On 25/8/2021 4:17 PM, HAO CHEN GUI via Gcc-patches wrote: Hi Kewen,   Thanks for your advice. On 25/8/2021 3:50 PM, Kewen.Lin wrote: Hi Haochen, on 2021/8/25 3:06 PM, HAO CHEN GUI via Gcc-patches wrote: Hi, I refined the patch according to Bill's advice. I pasted the ChangeLog

[PATCH, rs6000] optimization for long long and double vec_reve [PR100868]

2021-09-06 Thread HAO CHEN GUI via Gcc-patches
Hi    The patch optimized expansion for long long or double vec_reve builtin.      Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2021-09-06 Haochen Gui gcc/     * config/rs6000/altivec.md

[PATCH, rs6000] Optimization for vec_xl_sext

2021-09-06 Thread HAO CHEN GUI via Gcc-patches
Hi,    The patch optimized the code generation for vec_xl_sext builtin. Now all the sign extensions are done on VSX registers directly.    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2021-09-06

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-09-06 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578162.html Thanks On 26/8/2021 9:19 AM, HAO CHEN GUI wrote: Hi Bill,    Thanks for your comments. Hi Segher,    Here is the ChangeLog and patch diff. Thanks. 2021-08-25 Haochen Gui gcc/     * config

[PATCH, rs6000] optimization for vec_reve builtin [PR100868]

2021-09-08 Thread HAO CHEN GUI via Gcc-patches
Hi,   The patch optimized for vec_reve builtin on rs6000. For V2DI and V2DF, it is implemented by xxswapd on all targets. For V16QI, V8HI, V4SI and V4SF, it is implemented by quadword byte reverse plus halfword/word byte reverse when p9_vector is defined.   Bootstrapped and tested on
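
The two-step description above (a quadword byte reverse followed by a per-element byte reverse) can be checked with a small portable C model. This is only a sketch of the arithmetic, not the patch's implementation: the function names are hypothetical, and plain byte arrays stand in for vector registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse n bytes in place. */
static void reverse_bytes(uint8_t *p, size_t n)
{
    for (size_t i = 0; i < n / 2; i++) {
        uint8_t tmp = p[i];
        p[i] = p[n - 1 - i];
        p[n - 1 - i] = tmp;
    }
}

/* Hypothetical model: reverse four 32-bit elements by reversing all
   16 bytes, then reversing the bytes inside each 4-byte word. */
void reverse_v4si_two_step(uint32_t v[4])
{
    uint8_t bytes[16];
    memcpy(bytes, v, sizeof bytes);
    reverse_bytes(bytes, sizeof bytes);      /* quadword byte reverse */
    for (size_t w = 0; w < 4; w++)
        reverse_bytes(bytes + 4 * w, 4);     /* word byte reverse */
    memcpy(v, bytes, sizeof bytes);
}

/* Asserts that the two-step form equals a plain element reversal. */
void check_two_step_reverse(void)
{
    uint32_t v[4] = {0x00010203u, 0x04050607u, 0x08090a0bu, 0x0c0d0e0fu};
    const uint32_t expected[4] =
        {0x0c0d0e0fu, 0x08090a0bu, 0x04050607u, 0x00010203u};
    reverse_v4si_two_step(v);
    assert(memcmp(v, expected, sizeof v) == 0);
}
```

The second step restores the byte order inside each element, so only the element order ends up reversed; the result does not depend on host endianness.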

Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-09-15 Thread HAO CHEN GUI via Gcc-patches
sd2q\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */ On 10/9/2021 8:18 PM, Bill Schmidt wrote: On 9/10/21 12:45 AM, HAO CHEN GUI wrote: Bill,

Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-09-09 Thread HAO CHEN GUI via Gcc-patches
mes {\mvextsw2d\M} 1 } } */ On 10/9/2021 4:49 AM, Bill Schmidt wrote: Hi Haochen, This patch was sent with "format=flowed", so it doesn't apply. That makes it harder to review.  Can you please make sure you disable line wrap from your patch submissions, at least in the patch part? On 9

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-10-13 Thread HAO CHEN GUI via Gcc-patches
ue, Oct 12, 2021 at 10:59 AM HAO CHEN GUI via Gcc-patches wrote: Hi, This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-10-13 Thread HAO CHEN GUI via Gcc-patches
On 13/10/2021 4:29 PM, Richard Biener wrote: On Wed, Oct 13, 2021 at 9:43 AM HAO CHEN GUI wrote: Richard, Thanks so much for your comments. As far as I know, VSX/altivec min/max instructions don't conform with C-Style Min/Max Macro. The fold converts it to MIN/MAX_EXPR then it has
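
For readers unfamiliar with the "C-style min/max macro" mentioned in this message, the sketch below shows why its result depends on operand order when one input is a NaN. It only illustrates the macro's semantics in portable C and makes no claim about what any particular VSX or AltiVec instruction returns; `C_STYLE_MIN` and the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* C-style minimum. Every ordered comparison with a NaN is false, so the
   second operand is returned whenever either operand is a NaN. */
#define C_STYLE_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Wrapper so the macro can be called like a function. */
double c_style_min(double a, double b)
{
    return C_STYLE_MIN(a, b);
}

/* Checks the order sensitivity; assumes default (non fast-math) semantics. */
void check_c_style_min_nan(void)
{
    double q = NAN;                        /* quiet NaN from <math.h> */
    assert(c_style_min(q, 1.0) == 1.0);    /* NaN < 1.0 is false -> b */
    assert(isnan(c_style_min(1.0, q)));    /* 1.0 < NaN is false -> b (NaN) */
    assert(c_style_min(2.0, 1.0) == 1.0);  /* ordinary case */
}
```

Under fast-math the compiler may assume NaNs never occur, which is presumably why the thread ties the fold to that setting.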

[PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-10-12 Thread HAO CHEN GUI via Gcc-patches
Hi,    This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay  for trunk? Any recommendations? Thanks a lot.    I re-send the

PATCH, rs6000] Optimization for vec_xl_sext

2021-10-14 Thread HAO CHEN GUI via Gcc-patches
Hi,   The patch optimizes the code generation for vec_xl_sext builtin. Now all the sign extensions are done on VSX registers directly.   Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.   I refined the patch

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-10-21 Thread HAO CHEN GUI via Gcc-patches
On 21/10/2021 12:19 AM, Segher Boessenkool wrote: > Hi! > > On Wed, Oct 20, 2021 at 05:04:56PM +0800, HAO CHEN GUI wrote: >> This patch disables gimple folding for float or double vec_min/max when  >> fast-math is not set. It makes vec_min/max conform with the guid

Re: [PATCH, rs6000] punish reload of lfiwzx when loading an int variable [PR102169, PR102146]

2021-10-14 Thread HAO CHEN GUI via Gcc-patches
On 14/10/2021 8:12 AM, Segher Boessenkool wrote: On Wed, Sep 29, 2021 at 04:32:19PM +0800, HAO CHEN GUI wrote:   The patch punishes reload of alternative pair of "d, Z" for movsi_internal1. The reload occurs if 'Z' doesn't match and generates an additional insn. So the memory rel

[PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-10-20 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch disables gimple folding for float or double vec_min/max when  fast-math is not set. It makes vec_min/max conform with the guide. Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay  for trunk? Any recommendations? Thanks a lot.   I refined the patch

Re: PATCH, rs6000] Optimization for vec_xl_sext

2021-10-19 Thread HAO CHEN GUI via Gcc-patches
Committed as r12-4494. Thanks to all of you. Gui Haochen On 15/10/2021 2:53 AM, David Edelsohn wrote: On Thu, Oct 14, 2021 at 2:17 AM HAO CHEN GUI wrote: Hi, The patch optimizes the code generation for vec_xl_sext builtin. Now all the sign extensions are done on VSX registers directly

Ping^1 [PATCH, rs6000] optimization for vec_reve builtin [PR100868]

2021-10-10 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579038.html Thanks On 8/9/2021 2:42 PM, HAO CHEN GUI wrote: Hi,   The patch optimized for vec_reve builtin on rs6000. For V2DI and V2DF, it is implemented by xxswapd on all targets. For V16QI, V8HI, V4SI

[PATCH, rs6000] Optimization for vec_xl_sext

2021-10-11 Thread HAO CHEN GUI via Gcc-patches
Hi,    The patch optimized the code generation for vec_xl_sext builtin. Now all the  sign extensions are done on VSX registers directly.    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this  okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2021-10-11

PING^3 [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-10-10 Thread HAO CHEN GUI via Gcc-patches
Hi,     Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578162.html Thanks On 22/9/2021 2:52 PM, HAO CHEN GUI wrote: Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578162.html Thanks On 6/9/2021 2:01 PM, HAO CHEN GUI wrote

Ping^1 [PATCH, rs6000] punish reload of lfiwzx when loading an int variable [PR102169, PR102146]

2021-10-10 Thread HAO CHEN GUI via Gcc-patches
Hi,     Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580479.html Thanks On 29/9/2021 4:32 PM, HAO CHEN GUI wrote: Hi,   The patch punishes reload of alternative pair of "d, Z" for movsi_internal1. The reload occurs if 'Z' doesn't match and

[PATCH, rs6000] punish reload of lfiwzx when loading an int variable [PR102169, PR102146]

2021-09-29 Thread HAO CHEN GUI via Gcc-patches
Hi,   The patch punishes reload of alternative pair of "d, Z" for movsi_internal1. The reload occurs if 'Z' doesn't match and generates an additional insn. So the memory reload should be punished.   Bootstrapped and tested on powerpc64le-linux with no regressions. Is this  okay for trunk? Any 

PING^1 [PATCH, rs6000] Optimization for vec_xl_sext

2021-09-22 Thread HAO CHEN GUI via Gcc-patches
Hi,   Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579434.html Thanks On 15/9/2021 3:35 PM, HAO CHEN GUI wrote: Bill,     Yes, I built the gcc with p10 binutils. Then power10_ok tests can pass. Thanks again for your kind explanation.     I finally realized

PING^2 [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-09-22 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578162.html Thanks On 6/9/2021 2:01 PM, HAO CHEN GUI wrote: Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578162.html Thanks On 26/8/2021 9:19 AM, HAO CHEN GUI wrote: Hi

[PATCH] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2021-12-01 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch modifies the combine pattern with a helper - change_pseudo_and_mask when recog fails. The helper converts a single pseudo to the pseudo AND with a mask if the outer operator is IOR/XOR/PLUS and the inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior
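
The idea of replacing a pseudo with "the pseudo AND its nonzero bits" rests on a simple identity: if only some bits of a value can ever be set, masking with exactly those bits changes nothing. The sketch below checks that identity for a shift + ior expression in plain C. It assumes, for illustration only, a value whose nonzero bits are the low 8 bits; the function names are hypothetical and this is not the `change_pseudo_and_mask` helper itself.

```c
#include <assert.h>
#include <stdint.h>

/* Shift + ior, with x used as it is. */
uint32_t combine_plain(uint32_t x, uint32_t y)
{
    return x | (y << 8);
}

/* Same expression with x made into an explicit AND with its
   (assumed) nonzero-bits mask 0xff. */
uint32_t combine_masked(uint32_t x, uint32_t y)
{
    return (x & 0xffu) | (y << 8);
}

/* For every x that fits the assumed mask, both forms agree. */
void check_mask_is_redundant(void)
{
    for (uint32_t x = 0; x <= 0xffu; x++)
        for (uint32_t y = 0; y < 4; y++)
            assert(combine_plain(x, y) == combine_masked(x, y));
}
```

The masked form computes the same value but exposes an explicit AND, which is the shape a machine-description pattern can match.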

[PATCH, rs6000] Implement mffscrni pattern

2021-12-19 Thread HAO CHEN GUI via Gcc-patches
Hi, I modified the patch according to David and Segher's advice. This patch defines a pattern for mffscrni. If the RN is a constant, it can call gen_rs6000_mffscrni directly. The "rs6000-builtin-new.def" defines prototype for builtin arguments. The pattern "rs6000_set_fpscr_rn" is then

[PATCH v3, rs6000] Implement mffscrni pattern

2021-12-21 Thread HAO CHEN GUI via Gcc-patches
Hi, I modified the patch according to reviewers' advice. This patch defines a pattern for mffscrni. If the RN is a constant, it can call gen_rs6000_mffscrni directly. The "rs6000-builtin-new.def" defines prototype for builtin arguments. The pattern "rs6000_set_fpscr_rn" is then broken as

[PATCH, rs6000] Fix ICE on expand bcd__ [PR100736]

2021-12-21 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch fixes the ICE in PR100736. It adds a reverse condition comparison when the condition code can be reversed and finite-math-only is set. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.

Re: [PATCH v3, rs6000] Implement mffscrni pattern

2021-12-21 Thread HAO CHEN GUI via Gcc-patches
Hi Segher, Thanks for your advice. Please see my explanation below. On 22/12/2021 1:05 AM, Segher Boessenkool wrote: > Hi! > > On Tue, Dec 21, 2021 at 04:08:06PM +0800, HAO CHEN GUI wrote: >> This patch defines a pattern for mffscrni. If the RN is a constant,

Re: [PATCH 0/3] Add zero cycle move support

2021-11-22 Thread HAO CHEN GUI via Gcc-patches
Bill and David,     Currently, the absolute jump table is not enabled by default. It can be enabled by the undocumented option "-mno-relative-jumptables". If the target supports named sections (have_named_sections), the feature can be enabled. We plan to enable the feature by default in GCC12 and

Re: [PATCH, rs6000] optimization for vec_reve builtin [PR100868]

2021-11-23 Thread HAO CHEN GUI via Gcc-patches
Thanks for your review. Committed as r12-5463. On 22/11/2021 10:56 AM, David Edelsohn wrote: > On Wed, Nov 17, 2021 at 3:28 AM HAO CHEN GUI wrote: >> Hi, >> >> The patch optimized for vec_reve builtin on rs6000. For V2DI and V2DF, it >> is implemented by xxswap

[PATCH, rs6000] optimization for vec_reve builtin [PR100868]

2021-11-17 Thread HAO CHEN GUI via Gcc-patches
Hi,   The patch optimized for vec_reve builtin on rs6000. For V2DI and V2DF, it is implemented by xxswapd on all targets. For V16QI, V8HI, V4SI and V4SF, it is implemented by quadword byte reverse plus halfword/word byte reverse when p9_vector is set.   Bootstrapped and tested on

[PATCH, rs6000] Optimization for vec_xl_sext

2021-11-15 Thread HAO CHEN GUI via Gcc-patches
Hi,    The patch optimizes the code generation for vec_xl_sext builtin. Now all the  sign extensions are done on VSX registers directly.    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this  okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2021-11-16 Haochen

Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-11-16 Thread HAO CHEN GUI via Gcc-patches
> :-) I know > because I needed to make corresponding adjustments to the new builtins code. > > Thanks, > Bill > > On 11/15/21 8:16 PM, HAO CHEN GUI wrote: >> Hi, >> >>    The patch optimizes the code generation for vec_xl_sext builtin. Now all  >>

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-11-01 Thread HAO CHEN GUI via Gcc-patches
ge to rs6000-call.c. I only see the new testcases. > > Please resend the complete patch. > > Thanks David > > On Mon, Nov 1, 2021 at 2:48 AM HAO CHEN GUI wrote: >> Hi, >> >> This patch disables gimple folding for VSX_BUILTIN_XVMINDP, >> VSX_BUILTIN_XVMA

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-11-02 Thread HAO CHEN GUI via Gcc-patches
+float vminf (float a, float b) +{ +  vector float va = vec_promote (a, PREF_F); +  vector float vb = vec_promote (b, PREF_F); +  return vec_extract (vec_min (va, vb), PREF_F); +} On 2/11/2021 9:12 PM, David Edelsohn wrote: > On Mon, Nov 1, 2021 at 10:40 PM HAO CHEN GUI wrote: >> D

[PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-11-01 Thread HAO CHEN GUI via Gcc-patches
Hi,   This patch disables gimple folding for VSX_BUILTIN_XVMINDP, VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMINFP and ALTIVEC_BUILTIN_VMAXFP when fast-math is not set.  With gimple folding enabled, the four built-ins will be implemented by C-type instructions - xs[min|max]cdp on P9 and P10

[PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread HAO CHEN GUI via Gcc-patches
Hi,     This patch modifies the combine pattern with a helper - change_pseudo_and_mask when recog fails. The helper converts a single pseudo to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior

[PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]

2021-12-12 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of a TI to a V2DI, then move the V2DI to V1TI. With the pattern, the subreg pass can do register split for TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out

Re: [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]

2021-12-13 Thread HAO CHEN GUI via Gcc-patches
Hi Segher, Thanks for your advice. Please see my comments. On 14/12/2021 6:59 AM, Segher Boessenkool wrote: > Hi! > > On Mon, Dec 13, 2021 at 05:22:06PM -0500, David Edelsohn wrote: >> On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI wrote: >>> --- a/gcc/config/rs6000/vsx

[PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]

2021-12-16 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of a TI to a V2DI. With the pattern, the subreg pass can do register split for TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out on P9. The new test case

[PATCH, rs6000] Implement mffscrni pattern

2021-12-16 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch defines a pattern for mffscrni. If the RN is a constant, it can call gen_rs6000_mffscrni directly. The "rs6000-builtin-new.def" defines prototype for builtin arguments. The pattern "rs6000_set_fpscr_rn" is then broken as the mode of its argument is DI while its corresponding

Re: [PATCH] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2021-12-02 Thread HAO CHEN GUI via Gcc-patches
Kewen, Many thanks for your comments. On 2/12/2021 10:21 AM, Kewen.Lin wrote: > Hi Haochen, > > on 2021/12/1 5:01 PM, HAO CHEN GUI via Gcc-patches wrote: >> Hi, >> This patch modifies the combine pattern with a helper - >> change_pseudo_and_mask when recog fa

[PATCH] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2021-12-07 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch modifies the combine pattern with a helper - change_pseudo_and_mask when recog fails. The helper converts a single pseudo to the pseudo AND with a mask if the outer operator is IOR/XOR/PLUS and the inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps on shift + ior

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread HAO CHEN GUI via Gcc-patches
Hi Segher,    Thanks for your review. Please see my comments. On 1/12/2021 2:11 AM, Segher Boessenkool wrote: > Hi! > > On Tue, Nov 30, 2021 at 04:46:34PM +0800, HAO CHEN GUI wrote: >>     This patch modifies the combine pattern with a helper - >> change_pseudo_and_

Ping [PATCH, rs6000] fix execution failure of parity_1.f90 on P10 [PR100952]

2021-07-18 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575036.html Thanks On 13/7/2021 9:38 AM, HAO CHEN GUI wrote: Hi,    I refined the patch according to Segher's advice. Is this okay for trunk? Any recommendations? Thanks a lot. On 6/7/2021 11:01 AM, HAO CHEN

Ping [PATCH, rs6000] fix failure test cases caused by disabling mode promotion for pseudos [PR100952]

2021-07-18 Thread HAO CHEN GUI via Gcc-patches
Hi,   Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574503.html Thanks. On 6/7/2021 11:11 AM, HAO CHEN GUI wrote: Hi    The patch changed matching conditions in pr81384.c and pr56605.c. The original conditions failed to match due to mode promotion disabled

Re: [PATCH, rs6000] fix failure test cases caused by disabling mode promotion for pseudos [PR100952]

2021-07-22 Thread HAO CHEN GUI via Gcc-patches
Segher,    Thanks for your advice. I tested it. "{ dg-final { scan-rtl-dump-times {\(compare:CC \((?:and|zero_extend):(?:DI) \((?:sub)?reg:[SD]I} 1 "combine" } }" works well. On 22/7/2021 6:51 AM, Segher Boessenkool wrote: Hi! On Tue, Jul 06, 2021 at 11:11:05AM +0800

Ping^1 [PATCH, rs6000] Fix ICE on expand bcd__ [PR100736]

2022-01-09 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587253.html Thanks On 21/12/2021 4:19 PM, HAO CHEN GUI wrote: > Hi, > This patch fixes the ICE in PR100736. It adds a reverse condition > comparison when the > condition code can be reverse

Ping^1 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]

2022-01-09 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587051.html Thanks On 17/12/2021 9:55 AM, HAO CHEN GUI wrote: > Hi, >This patch defines a new split pattern for TI to V1TI move. The pattern > concatenates two subreg:DI of > a

Ping^1 [PATCH] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2022-01-09 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586304.html Thanks On 7/12/2021 4:28 PM, HAO CHEN GUI wrote: > Hi, > This patch modifies the combine pattern with a helper - > change_pseudo_and_mask when recog fails. > The helper conve

[PATCH] Place jump tables in RELRO only when targets require local relocation to be placed in a read-write section

2022-01-11 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch sets "relocatable" of jump table to true when targets require local relocation to be placed in a read-write section - bit 0 is set in reloc_rw_mask. Jump tables are in local relocation, so they should be placed in RELRO only when both global and local relocation need to be

Re: Ping^1 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]

2022-01-10 Thread HAO CHEN GUI via Gcc-patches
Segher and David, Thanks for your explanation. I got it. The "\m" itself is a constraint escape. Gui Haochen On 11/1/2022 9:12 AM, Segher Boessenkool wrote: > On Mon, Jan 10, 2022 at 06:09:01PM -0500, David Edelsohn wrote: >> On Sun, Jan 9, 2022 at 10:16 PM

[PATCH, rs6000] Enable absolute jump table by default

2022-01-12 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch enables absolute jump table by default on rs6000. The relative jump tables are used when it's explicitly set by "rs6000_relative_jumptables", or jump tables are placed in text section but global relocation is required. Bootstrapped and tested on powerpc64-linux BE and LE

Re: [PATCH, rs6000] Enable absolute jump table by default

2022-01-12 Thread HAO CHEN GUI via Gcc-patches
Hi David, On 12/1/2022 10:44 PM, David Edelsohn wrote: > On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI wrote: >> >> Hi, >>This patch enables absolute jump table by default on rs6000. The relative >> jump tables are used when >>it's explicit s

Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]

2022-02-14 Thread HAO CHEN GUI via Gcc-patches
Segher, Thanks for your comments. Here are my comments and questions. Thanks. On 15/2/2022 5:36 AM, Segher Boessenkool wrote: > Hi! > > On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote: >> This patch removes TImode from mode iterator BOOL_128. Thus, bool >>

Ping^1 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694, PR93123]

2022-02-13 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590057.html Thanks On 9/2/2022 10:43 AM, HAO CHEN GUI wrote: > Hi, > This patch removes TImode from mode iterator BOOL_128. Thus, bool > operations (AND, IOR, XOR, NOT) > on TImode will be split to

Ping^1 [PATCH v3, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-02-13 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589006.html Thanks On 21/1/2022 5:28 PM, HAO CHEN GUI wrote: > Hi, >This patch adds a combine pattern for "CA minus one". As CA only has two > values (0 or 1), we could conver

Ping^2 [PATCH, rs6000] Fix ICE on expand bcd__ [PR100736]

2022-02-13 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587253.html Thanks On 10/1/2022 11:14 AM, HAO CHEN GUI wrote: > Hi, > > Gentle ping this: > > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587253.html > > Thanks >

[PATCHv2, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-03-16 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds V1TI mode into a new mode iterator used in vector comparison expands. With the patch, both built-ins and direct comparison could generate P10 new V1TI comparison instructions. Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. Is this okay for trunk?

Ping^1 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-03-14 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590958.html Thanks On 28/2/2022 11:17 AM, HAO CHEN GUI wrote: > Hi, > This patch corrects the match pattern in pr56605.c. The former pattern > is wrong and test case fails with GCC11. It should match

Ping^1 [PATCH, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-03-14 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591507.html Thanks On 10/3/2022 2:31 PM, HAO CHEN GUI wrote: > Hi, >This patch adds V1TI mode into mode iterator used in vector comparison > expands. With the patch, both built-ins and direct compari

PATCH, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-03-09 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds V1TI mode into mode iterator used in vector comparison expands. With the patch, both built-ins and direct comparison could generate P10 new V1TI comparison instructions. Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. Is this okay for trunk? Any

[PATCH v2, rs6000] Enable absolute jump table for PPC AIX and Linux

2022-02-16 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch enables absolute jump tables on PPC AIX and Linux. For AIX, the jump table is placed in data section. For Linux, it is placed in RELRO section when relocation is needed. Bootstrapped and tested on AIX, Linux BE and LE with no regressions. Is this okay for trunk? Any

[PATCH v3, rs6000] Enable absolute jump table for PPC AIX and Linux

2022-02-28 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch enables absolute jump tables on PPC AIX and Linux. For AIX, the jump table is placed in data section. For Linux, it is placed in RELRO section when relocation is needed. Bootstrapped and tested on AIX, Linux BE and LE with no regressions. Is this okay for trunk? Any

Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]

2022-02-16 Thread HAO CHEN GUI via Gcc-patches
Hi, On 15/2/2022 10:56 PM, Segher Boessenkool wrote: > On Tue, Feb 15, 2022 at 11:01:03AM +0800, HAO CHEN GUI wrote: > Hi! > >> On 15/2/2022 5:36 AM, Segher Boessenkool wrote: >>> On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote: >>> All that are argu

Re: [PATCH v2, rs6000] Enable absolute jump table for PPC AIX and Linux

2022-02-21 Thread HAO CHEN GUI via Gcc-patches
Kewen, Thanks so much for your advice. On 21/2/2022 5:42 PM, Kewen.Lin wrote: > Hi Haochen, > > Some minor comments are inlined. > > on 2022/2/16 4:42 PM, HAO CHEN GUI via Gcc-patches wrote: >> Hi, >>This patch enables absolute jump tables on PPC AIX and L

[PATCH, rs6000] Correct match pattern in pr56605.c

2022-02-27 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch corrects the match pattern in pr56605.c. The former pattern is wrong and the test case fails with GCC11. It should match the following insn on each subtarget after mode promotion is disabled. The patch needs to be backported to GCC11. //gimple _17 = (unsigned int) _20;

[PATCH v2, rs6000] Disable TImode from Bool expanders [PR100694, PR93123]

2022-02-20 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch disables TImode for Bool expanders. Thus a TI register can be split into two DI registers during expand. Potential optimizations can be implemented after the split. The new test case illustrates it. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this

[PATCH v3, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-03-20 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds V1TI mode into a new mode iterator used in vector comparison expands. Without the patch, the comparisons between two vector __int128 are converted to scalar comparisons with branches. The code is suboptimal. The patch fixes the issue. Now all comparisons between two vector

[PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]

2022-02-08 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch removes TImode from mode iterator BOOL_128. Thus, bool operations (AND, IOR, XOR, NOT) on TImode will be split to the relevant operations on word mode during expand (in optabs.c). Potential optimizations can be implemented after the split. The former practice splits it after
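
Splitting a 128-bit boolean operation into word-mode operations is valid because AND, IOR, XOR and NOT act on each bit independently. The sketch below models a 128-bit value as two 64-bit words and checks a word-wise AND against a byte-wise reference. The `ti_value` type and function names are hypothetical; this illustrates the identity only, not the expander code in GCC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a TImode value: two 64-bit words. */
typedef struct { uint64_t word[2]; } ti_value;

/* 128-bit AND performed as two independent word-mode ANDs. */
ti_value ti_and(ti_value a, ti_value b)
{
    ti_value r;
    r.word[0] = a.word[0] & b.word[0];
    r.word[1] = a.word[1] & b.word[1];
    return r;
}

/* Compare the word-wise result with a byte-by-byte AND of all 16 bytes. */
void check_ti_and(void)
{
    ti_value a = {{0xffff0000ffff0000ULL, 0x0123456789abcdefULL}};
    ti_value b = {{0x0f0f0f0f0f0f0f0fULL, 0xfedcba9876543210ULL}};
    ti_value r = ti_and(a, b);
    uint8_t ab[16], bb[16], rb[16];
    memcpy(ab, &a, sizeof ab);
    memcpy(bb, &b, sizeof bb);
    memcpy(rb, &r, sizeof rb);
    for (size_t i = 0; i < sizeof rb; i++)
        assert(rb[i] == (uint8_t)(ab[i] & bb[i]));
}
```

Once the halves are separate operations, each one can be simplified on its own, which matches the posting's remark that optimizations become possible after the split.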

[PATCH, rs6000] Enable absolute jump table for PPC Linux

2022-01-17 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch enables absolute jump table on PPC Linux. When PIC is set, the absolute jump tables are placed in RELRO section. Otherwise, they're placed in rodata section. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk? Any

[PATCH, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-18 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds a combine pattern for "CA minus one". As CA only has two values (0 or 1), we could convert following pattern (sign_extend:DI (plus:SI (reg:SI 98 ca) (const_int -1 [0x] to (plus:DI (reg:DI 98 ca) (const_int -1
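
The reasoning that CA can only be 0 or 1 is what makes the wider addition safe: for those two inputs, subtracting one in 32 bits and sign-extending gives the same result as subtracting one directly in 64 bits. A minimal C sketch of that equivalence follows; the function names are hypothetical, and C integer types stand in for the SImode and DImode operations.

```c
#include <assert.h>
#include <stdint.h>

/* Add -1 in 32 bits, then sign-extend to 64 bits. */
int64_t narrow_then_extend(int32_t ca)
{
    int32_t narrow = ca - 1;
    return (int64_t)narrow;
}

/* Add -1 directly in 64 bits. */
int64_t wide_add(int32_t ca)
{
    return (int64_t)ca - 1;
}

/* The two forms agree for both possible carry values. */
void check_ca_minus_one(void)
{
    for (int32_t ca = 0; ca <= 1; ca++)
        assert(narrow_then_extend(ca) == wide_add(ca));
    assert(wide_add(0) == -1);
    assert(wide_add(1) == 0);
}
```

For an arbitrary 32-bit input the two forms need not agree once the narrow addition wraps, so the restriction to the values 0 and 1 is what the argument relies on.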

[PATCH v3, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-21 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds a combine pattern for "CA minus one". As CA only has two values (0 or 1), we could convert following pattern (sign_extend:DI (plus:SI (reg:SI 98 ca) (const_int -1 [0x] to (plus:DI (reg:DI 98 ca) (const_int -1

Re: [PATCH v2, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-20 Thread HAO CHEN GUI via Gcc-patches
Thanks so much for your advice. Please see my comments. On 21/1/2022 5:42 AM, Segher Boessenkool wrote: > Hi! > > On Thu, Jan 20, 2022 at 01:46:48PM -0500, David Edelsohn wrote: >> On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI wrote: >>>This patch adds a combine p

[PATCH v2, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-19 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds a combine pattern for "CA minus one". As CA only has two values (0 or 1), we could convert following pattern (sign_extend:DI (plus:SI (reg:SI 98 ca) (const_int -1 [0x] to (plus:DI (reg:DI 98 ca) (const_int -1

Re: [PATCH, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-19 Thread HAO CHEN GUI via Gcc-patches
On 19/1/2022 3:52 PM, Andrew Pinski wrote: > On Tue, Jan 18, 2022 at 11:13 PM HAO CHEN GUI via Gcc-patches > wrote: >> >> Hi, >>This patch adds a combine pattern for "CA minus one". As CA only has two >> values (0 or 1), we could convert following p

Re: [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-10 Thread HAO CHEN GUI via Gcc-patches
Hi, On 9/4/2022 12:48 AM, will schmidt wrote: > On Mon, 2022-02-28 at 11:17 +0800, HAO CHEN GUI via Gcc-patches wrote: >> Hi, >> This patch corrects the match pattern in pr56605.c. The former pattern >> is wrong and test case fails with GCC11. It should match following insn

Re: [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-10 Thread HAO CHEN GUI via Gcc-patches
Hi, On 9/4/2022 3:36 AM, Segher Boessenkool wrote: > Hi! > > On Mon, Feb 28, 2022 at 11:17:27AM +0800, HAO CHEN GUI wrote: >> This patch corrects the match pattern in pr56605.c. The former pattern >> is wrong and test case fails with GCC11. It should match following insn

Ping^2 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-07 Thread HAO CHEN GUI via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590958.html Thanks On 15/3/2022 10:06 AM, HAO CHEN GUI wrote: > Hi, > Gentle ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590958.html > Thanks > > On 28/2/2022 AM 1

[PATCH-2v2, rs6000] Implement 32bit inline lrint [PR88558]

2023-09-03 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch implements 32bit inline lrint by "fctiw". It depends on patch 1 to do SImode move from FP registers on P7. Compared to the last version, the main change is to add tests for "lrintf" and adjust the count of corresponding instructions.

[PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-03 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch enables SImode in FP registers on P7. Instruction "fctiw" stores its integer output in an FP register. So SImode in FP registers needs to be enabled on P7 if we want to support "fctiw" on P7. The test case is in the second patch which implements 32bit inline lrint. Compared to the

Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-14 Thread HAO CHEN GUI via Gcc-patches
Hi Kewen, On 2023/9/12 17:33, Kewen.Lin wrote: > Ok, at least regression testing doesn't expose any needs to do disparaging > for this. Could you also test this patch with SPEC2017 for P7 and P8 > separately at options like -O2 or -O3, to see if there is any assembly > change, and if yes filtering

Re: [PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-30 Thread HAO CHEN GUI via Gcc-patches
Kewen, I refined the patch according to your comments and it passed bootstrap and regression test. I committed it as https://gcc.gnu.org/g:946b8967b905257ac9f140225db744c9a6ab91be Thanks Gui Haochen On 2023/8/29 16:55, Kewen.Lin wrote: > Hi Haochen, > > on 2023/8/29 10:50, HAO CHEN

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-08-20 Thread HAO CHEN GUI via Gcc-patches
itely reduce the rtx cost or it helps match patterns? Thanks a lot. Thanks Gui Haochen On 2023/8/5 7:32, Jeff Law wrote: > > > On 7/20/23 18:59, HAO CHEN GUI wrote: >> Hi Jeff, >> >> On 2023/7/21 5:27, Jeff Law wrote: >>> Wouldn't it make more sense to just try rot

Re: [PATCHv4, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-08-16 Thread HAO CHEN GUI via Gcc-patches
Committed after fixing the comments. https://gcc.gnu.org/g:a79cf858b39e01c80537bc5d47a5e9004418c267 Thanks Gui Haochen On 2023/8/14 15:47, Kewen.Lin wrote: > Hi Haochen, > > on 2023/8/14 10:18, HAO CHEN GUI wrote: >> Hi, >> This patch modifies vsx extract expand and gene

Re: [PATCH, rs6000] Skip redundant vector extract if the element is first element of dword0 [PR110429]

2023-08-16 Thread HAO CHEN GUI via Gcc-patches
Committed after tweaking and testing. https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=d471bdb0453de7b738f49148b66d57cb5871937d Thanks Gui Haochen On 2023/7/28 17:32, Kewen.Lin wrote: > Hi Haochen, > > on 2023/7/5 11:22, HAO CHEN GUI wrote: >> Hi, >> This patch skips redundan

[PATCHv4, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-08-13 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx for all sub targets when the mode is V4SI and the extracted element is word 1 from BE order. Also this patch adds an insn pattern for mfvsrwz which helps eliminate redundant zero extend. Compared to the last version, the main

[PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-28 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds a "TARGET_64BIT" check when calling vector load/store with length expand in expand_block_move. It matches the expand condition of "lxvl" and "stxvl" defined in vsx.md. This patch fixes the ICE that occurred with the test case on 32-bit Power10. Bootstrapped and tested on

[PATCHv2, rs6000] Extract the element in dword0 by mfvsrd and shift/mask [PR110331]

2023-08-22 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch implements the vector element extraction by mfvsrd and shift/mask when the element is in dword0 of the vector. Originally, it generates vsplat/mfvsrd on P8 and li/vextract on P9. Since mfvsrd has lower latency than vextract and rldicl has lower latency than vsplat, the new
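The shift/mask step described in this snippet can be sketched in portable C. This is a rough illustration and not the code the patch generates: it assumes the doubleword has already been moved into a 64-bit integer (the role "mfvsrd" plays in the snippet), it assumes elements are numbered from the most significant end, and the helper names `extract_word_from_dword` and `extract_byte_from_dword` are hypothetical.

```c
#include <stdint.h>

/* Extract 32-bit element idx (0 or 1) from a 64-bit doubleword.
   Assumption: element 0 is the most significant word. */
uint32_t extract_word_from_dword(uint64_t dword, unsigned idx)
{
    unsigned shift = (1u - idx) * 32u;  /* idx 0 -> shift 32, idx 1 -> shift 0 */
    return (uint32_t)((dword >> shift) & 0xFFFFFFFFu);
}

/* Extract 8-bit element idx (0..7) from a 64-bit doubleword.
   Assumption: element 0 is the most significant byte. */
uint8_t extract_byte_from_dword(uint64_t dword, unsigned idx)
{
    unsigned shift = (7u - idx) * 8u;   /* idx 0 -> shift 56, idx 7 -> shift 0 */
    return (uint8_t)((dword >> shift) & 0xFFu);
}
```

Each extraction is one shift followed by one mask on a general-purpose value, which is the kind of work a single rotate-and-mask instruction such as "rldicl" can do.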

[PATCH-1, rs6000] Enable SImode in FP register on P7 [PR88558]

2023-08-25 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch enables SImode in FP register on P7. Instruction "fctiw" stores its integer output in an FP register. So SImode in FP register needs to be enabled on P7 if we want to support "fctiw" on P7. The test case is in the second patch which implements 32bit inline lrint. Bootstrapped and

[PATCH-2, rs6000] Implement 32bit inline lrint [PR88558]

2023-08-25 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch implements 32bit inline lrint by "fctiw". It depends on patch 1 to do SImode move from FP register on P7. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Thanks Gui Haochen ChangeLog rs6000: support 32bit inline lrint gcc/ PR target/88558

Re: Ping^2 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-19 Thread HAO CHEN GUI via Gcc-patches
2022 at 08:54:14PM -0300, Alexandre Oliva wrote: >> On Apr 7, 2022, HAO CHEN GUI via Gcc-patches >> wrote: >> >>> Gentle ping this: >>>https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590958.html >>> Thanks >> >>>> On 28

Re: Ping^2 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-19 Thread HAO CHEN GUI via Gcc-patches
\(subreg:SI \(reg:DI} 1 "combine" } } */ I tested it and it is fine on all sub-targets. Thanks. On 20/4/2022 5:06 AM, Segher Boessenkool wrote: > On Tue, Apr 19, 2022 at 04:05:06PM +0800, HAO CHEN GUI wrote: >>I tested the test case on Linux and AIX with both big and little

[PATCH v4, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-05-12 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds a combine pattern for "CA minus one". As CA only has two values (0 or 1), we could convert the following pattern (sign_extend:DI (plus:SI (reg:SI 98 ca) (const_int -1 [0x] to (plus:DI (reg:DI 98 ca) (const_int -1

[PATCH v5, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-05-15 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds a combine pattern for "CA minus one". As CA only has two values (0 or 1), we could convert the following pattern (sign_extend:DI (plus:SI (reg:SI 98 ca) (const_int -1 [0x] to (plus:DI (reg:DI 98 ca) (const_int -1

[PATCH v5, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-05-30 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch adds V1TI mode into a new mode iterator used in vector comparison shift and rotation expands. Without the patch, the comparisons between two vector __int128 are converted to scalar comparisons and code is suboptimal. The patch fixes the issue. Now all comparisons between two

Re: [PATCH v4, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-05-25 Thread HAO CHEN GUI via Gcc-patches
check then? On 26/5/2022 11:22 AM, Kewen.Lin wrote: > Hi Haochen, > > on 2022/5/24 16:45, HAO CHEN GUI wrote: >> Hi, >>This patch adds V1TI mode into a new mode iterator used in vector >> comparison and rotation expands. Without the patch, the comparisons >

[PATCH v2, rs6000] Fix ICE on expand bcd__ [PR100736]

2022-05-26 Thread HAO CHEN GUI via Gcc-patches
Hi, This patch fixes the ICE reported in PR100736. It removes the check that the finite-math-only flag is not set from the "*_cc" pattern. With or without this flag, we can still use "cror" to check whether either of two bits of CC is set for "fp_two" codes. We don't need a reverse comparison
