Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-16 Thread luoxhu via Gcc-patches
On 2020/9/15 14:51, Richard Biener wrote: >> I only see VAR_DECL and PARM_DECL, is there any function to check the tree >> variable is global? I added DECL_REGISTER, but the RTL still expands to >> stack: > > is_global_var () or alternatively !auto_var_in_fn_p (), I think doing > IFN_SET

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread luoxhu via Gcc-patches
On 2020/9/14 17:47, Richard Biener wrote: On Mon, Sep 14, 2020 at 10:05 AM luoxhu wrote: Not sure whether this reflects the issues you discussed above. I constructed below test cases and tested with and without this patch, only if "a+c"(which means store only), the performance is getting

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread luoxhu via Gcc-patches
On 2020/9/10 18:08, Richard Biener wrote: > On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool > wrote: >> >> On Wed, Sep 09, 2020 at 04:28:19PM +0200, Richard Biener wrote: >>> On Wed, Sep 9, 2020 at 3:49 PM Segher Boessenkool >>> wrote: Hi! On Tue, Sep 08, 2020 at

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-08 Thread luoxhu via Gcc-patches
On 2020/9/8 16:26, Richard Biener wrote: >> Seems not only pseudo, for example "v = vec_insert (i, v, n);" >> the vector variable will be store to stack first, then [r112:DI] is a >> memory here to be processed. So the patch loads it from stack(insn #10) to >> temp vector register first, and

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-08 Thread luoxhu via Gcc-patches
Hi Richi, On 2020/9/7 19:57, Richard Biener wrote: > + if (TREE_CODE (to) == ARRAY_REF) > + { > + tree op0 = TREE_OPERAND (to, 0); > + if (TREE_CODE (op0) == VIEW_CONVERT_EXPR > + && expand_view_convert_to_vec_set (to, from, to_rtx)) > + { > +

[PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-06 Thread luoxhu via Gcc-patches
Hi, On 2020/9/4 18:23, Segher Boessenkool wrote: diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 03b00738a5e..00c65311f76 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2)

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-04 Thread luoxhu via Gcc-patches
On 2020/9/4 15:23, Richard Biener wrote: > On Fri, Sep 4, 2020 at 9:19 AM Richard Biener > wrote: >> >> On Fri, Sep 4, 2020 at 8:38 AM luoxhu wrote: >>> >>> >>> >>> On 2020/9/4 14:16, luoxhu via Gcc-patches wrote: >>>

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-04 Thread luoxhu via Gcc-patches
On 2020/9/4 14:16, luoxhu via Gcc-patches wrote: Hi, Yes, I checked and found that neither vec_set nor vec_extract supports a variable index on most targets; store_bit_field_1 and extract_bit_field_1 only consider using the optabs when the index is an integer value. Anyway, it shouldn't

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-04 Thread luoxhu via Gcc-patches
Hi, On 2020/9/3 18:29, Richard Biener wrote: > On Thu, Sep 3, 2020 at 11:20 AM luoxhu wrote: >> >> >> >> On 2020/9/2 17:30, Richard Biener wrote: so maybe bypass convert_vector_to_array_for_subscript for special circumstance like "i = v[n%4]" or "v[n&3]=i" to generate vec_extract

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-03 Thread luoxhu via Gcc-patches
On 2020/9/2 17:30, Richard Biener wrote: >> so maybe bypass convert_vector_to_array_for_subscript for special >> circumstance >> like "i = v[n%4]" or "v[n&3]=i" to generate vec_extract or vec_insert builtin >> call a relative simpler method? > I think you have it backward. You need to work
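
For readers following the "v[n&3]=i" case discussed above, here is a minimal sketch of a variable-index vector insert. It assumes the GCC/Clang generic vector extension rather than the Power-specific vec_insert builtin, so that it builds on any target; the names v4si, insert_elem and insert_then_read are hypothetical and do not come from the patch.

```c
/* Generic vector extension (GCC/Clang); an assumption made for portability,
   not the altivec.h vec_insert builtin discussed in the thread.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Variable-index insert: the lane is only known at run time, and the
   "& 3" keeps it inside the four lanes of the vector.  */
static v4si
insert_elem (v4si v, int i, unsigned int n)
{
  v[n & 3] = i;
  return v;
}

/* Hypothetical helper: insert I at lane N of {10, 20, 30, 40},
   then read back lane K.  */
int
insert_then_read (int i, unsigned int n, unsigned int k)
{
  v4si v = { 10, 20, 30, 40 };
  v = insert_elem (v, i, n);
  return v[k & 3];
}
```

Calling insert_then_read (99, 2, 2) reads back the inserted value, while any other lane still holds its initial element. How the store is expanded (through the stack or in registers) is the target-specific question the thread is about, and this sketch does not show it.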

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-02 Thread luoxhu via Gcc-patches
Hi, On 2020/9/1 21:07, Richard Biener wrote: > On Tue, Sep 1, 2020 at 10:11 AM luoxhu via Gcc-patches > wrote: >> >> Hi, >> >> On 2020/9/1 01:04, Segher Boessenkool wrote: >>> Hi! >>> >>> On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xion

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-01 Thread luoxhu via Gcc-patches
Hi, On 2020/9/1 00:47, will schmidt wrote: >> + tmode = TYPE_MODE (TREE_TYPE (arg0)); >> + mode1 = TYPE_MODE (TREE_TYPE (TREE_TYPE (arg0))); >> + mode2 = TYPE_MODE ((TREE_TYPE (arg2))); >> + gcc_assert (VECTOR_MODE_P (tmode)); >> + >> + op0 = expand_expr (arg0, NULL_RTX, tmode,

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-01 Thread luoxhu via Gcc-patches
Hi, On 2020/9/1 01:04, Segher Boessenkool wrote: > Hi! > > On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong Hu Luo wrote: >> vec_insert accepts 3 arguments, arg0 is input vector, arg1 is the value >> to be insert, arg2 is the place to insert arg1 to arg0. This patch adds >>

Re: [PATCH] ipa-inline: Improve growth accumulation for recursive calls

2020-08-14 Thread luoxhu via Gcc-patches
Hi, On 2020/8/13 20:52, Jan Hubicka wrote: >> Since there are no other callers outside of these specialized nodes, the >> guessed profile count should be same equal? Perf tool shows that even >> each specialized node is called only once, none of them take same time for >> each call: >> >>

Re: [PATCH] ipa-inline: Improve growth accumulation for recursive calls

2020-08-13 Thread luoxhu via Gcc-patches
Hi, On 2020/8/13 01:53, Jan Hubicka wrote: > Hello, > with Martin we spent some time looking into exchange2 and my > understanding of the problem is the following: > > There is the self recursive function digits_2 with the property that it > has 10 nested loops and calls itself from the

Re: [PATCH v5] dse: Remove partial load after full store for high part access[PR71309]

2020-08-05 Thread luoxhu via Gcc-patches
Hi Richard, On 2020/8/3 22:01, Richard Sandiford wrote: /* Try a wider mode if truncating the store mode to NEW_MODE requires a real instruction. */ if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)) @@ -1779,6 +1780,25 @@ find_shift_sequence

Re: [PATCH v5] dse: Remove partial load after full store for high part access[PR71309]

2020-08-03 Thread luoxhu via Gcc-patches
On 2020/8/3 22:01, Richard Sandiford wrote: /* Try a wider mode if truncating the store mode to NEW_MODE requires a real instruction. */ if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)) @@ -1779,6 +1780,25 @@ find_shift_sequence (poly_int64

[PATCH v5] dse: Remove partial load after full store for high part access[PR71309]

2020-08-03 Thread luoxhu via Gcc-patches
Thanks, the v5 update addresses the comments: 1. Move the const_rhs shift out of the loop; 2. Iterate from int size for read_mode. This patch can optimize (works for char/short/int/void*): 6: r119:TI=[r118:DI+0x10] 7: [r118:DI]=r119:TI 8: r121:DI=[r118:DI+0x8] => 6: r119:TI=[r118:DI+0x10] 16:
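
The idea of replacing a partial load that follows a full store can be illustrated at the C level. This is only an analogy under stated assumptions: it uses a 64-bit store and a 32-bit load at byte offset 4 instead of the TImode/DImode pair in the RTL above, and the function names are hypothetical. It is not the dse.c implementation.

```c
#include <stdint.h>
#include <string.h>

/* Full store to memory, then a partial load from the second half
   (byte offset 4) of the same slot.  */
uint32_t
reload_from_memory (uint64_t stored)
{
  unsigned char slot[8];
  uint32_t part;
  memcpy (slot, &stored, sizeof stored);
  memcpy (&part, slot + 4, sizeof part);
  return part;
}

/* The same value taken directly from the stored register, without
   touching memory.  Which half sits at offset 4 depends on byte order.  */
uint32_t
forward_from_register (uint64_t stored)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return (uint32_t) (stored >> 32);   /* offset 4 holds the high part */
#else
  return (uint32_t) stored;           /* offset 4 holds the low part */
#endif
}
```

Both functions return the same value for any input, which is why the load can be dropped once the stored value is known to be available in a register.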

Re: [PATCH v4] dse: Remove partial load after full store for high part access[PR71309]

2020-07-28 Thread luoxhu via Gcc-patches
Gentle ping in case this mail is missed, Thanks :) https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550602.html Xionghu On 2020/7/24 18:47, luoxhu via Gcc-patches wrote: Hi Richard, This is the updated version that could pass all regression test on Power9-LE. Just need another "may

[PATCH v4] dse: Remove partial load after full store for high part access[PR71309]

2020-07-24 Thread luoxhu via Gcc-patches
Hi Richard, This is the updated version that could pass all regression test on Power9-LE. Just need another "maybe_lt (GET_MODE_SIZE (new_mode), access_size)" before generating shift for store_info->const_rhs to ensure correct constant is generated, take testsuite/gfortran1/equiv_2.x for

Re: [PATCH v3] dse: Remove partial load after full store for high part access[PR71309]

2020-07-23 Thread luoxhu via Gcc-patches
On 2020/7/23 04:30, Richard Sandiford wrote: > > I now realise the reason is that the starting mode is too wide. > I think we should fix that by doing: > >FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT) > { >… > > and then add: > >if (maybe_lt (GET_MODE_SIZE

Re: [PATCH v3] dse: Remove partial load after full store for high part access[PR71309]

2020-07-22 Thread luoxhu via Gcc-patches
Hi, On 2020/7/22 19:05, Richard Sandiford wrote: > This wasn't really what I meant. Using subregs is fine, but I was > thinking of: > >/* Also try a wider mode if the necessary punning is either not >desirable or not possible. */ >if (!CONSTANT_P (store_info->rhs) >

Re: [PATCH v2] dse: Remove partial load after full store for high part access[PR71309]

2020-07-22 Thread luoxhu via Gcc-patches
Hi, On 2020/7/21 23:30, Richard Sandiford wrote: > Xiong Hu Luo writes:>> @@ -1872,9 +1872,27 @@ > get_stored_val (store_info *store_info, machine_mode read_mode, >> { >> poly_int64 shift = gap * BITS_PER_UNIT; >> poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap;

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-20 Thread luoxhu via Gcc-patches
On 2020/7/20 23:31, Segher Boessenkool wrote: On Mon, Jul 13, 2020 at 02:30:28PM +0800, luoxhu wrote: For extracting high part element from DImode register like: {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} split it before reload with "and mask" to avoid generating shift right 32 bit

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-14 Thread luoxhu via Gcc-patches
Hi David, On 2020/7/14 22:17, David Edelsohn wrote: > Unfortunately this patch is eliciting a number of new testsuite > failures, all like > > error: unrecognizable insn: > (insn 44 43 45 5 (parallel [ > (set (reg:SI 199) > (unspec:SI [ >

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-13 Thread luoxhu via Gcc-patches
Hi, On 2020/7/11 08:54, Segher Boessenkool wrote: > Hi! > > On Fri, Jul 10, 2020 at 09:39:40AM +0800, luoxhu wrote: >> OK, seems the md file needs a format tool too... > > Heh. Just make sure it looks good (that is, does what it looks like), > looks like the rest, etc. It's hard to do

Re: [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi

2020-07-12 Thread luoxhu via Gcc-patches
On 2020/7/11 08:28, Segher Boessenkool wrote: Hi! On Thu, Jul 09, 2020 at 09:14:45PM -0500, Xiong Hu Luo wrote: * config/rs6000/rs6000.md (rotl_unspec): New define_insn_and_split. +; rldimi with UNSPEC_SI_FROM_SF. +(define_insn_and_split "*rotl_unspec" Please have

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-10 Thread luoxhu via Gcc-patches
On 2020/7/10 03:25, Segher Boessenkool wrote: > >> + "TARGET_NO_SF_SUBREG" >> + "#" >> + "&& vsx_reg_sfsubreg_ok (operands[0], SFmode)" > > Put this in the insn condition? And since this is just a predicate, > you can just use it instead of gpc_reg_operand. > > (The split condition

Re: [PATCH 1/2] rs6000: Init V4SF vector without converting SP to DP

2020-07-09 Thread luoxhu via Gcc-patches
Update patch to keep the logic for non-TARGET_P8_VECTOR targets. Please ignore the previous [PATCH 1/2], sorry! Move V4SF to V4SI, initialize the vector as V4SI, and move it back to V4SF. A better instruction sequence can be generated on Power9: lfs + xxpermdi + xvcvdpsp + vmrgew => lwz + (sldi + or) +

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-09 Thread luoxhu via Gcc-patches
Hi, On 2020/7/10 03:25, Segher Boessenkool wrote: > Hi! > > On Thu, Jul 09, 2020 at 11:09:42AM +0800, luoxhu wrote: >>> Maybe change it back to just SI? It won't match often at all for QI or >>> HI anyway, it seems. Sorry for that detour. Should be good with the >>> above nits fixed :-) >> >>

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-08 Thread luoxhu via Gcc-patches
On 2020/7/9 06:43, Segher Boessenkool wrote: > Hi! > > On Wed, Jul 08, 2020 at 11:19:21AM +0800, luoxhu wrote: >> For extracting high part element from DImode register like: >> >> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} >> >> split it before reload with "and mask" to avoid

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-07 Thread luoxhu via Gcc-patches
On 2020/7/8 05:31, Segher Boessenkool wrote: > Hi! > > On Tue, Jul 07, 2020 at 04:39:58PM +0800, luoxhu wrote: >>> Lots of questions, sorry! >> >> Thanks for the nice suggestions of the initial patch contains many issues:), > > Pretty much all of it should *work*, it just can be improved and

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-07 Thread luoxhu via Gcc-patches
On 2020/7/7 08:18, Segher Boessenkool wrote: > Hi! > > On Sun, Jul 05, 2020 at 09:17:57PM -0500, Xionghu Luo wrote: >> For extracting high part element from DImode register like: >> >> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} >> >> split it before reload with "and mask" to avoid

Ping^1 : [PATCH] [stage1] ipa-cp: Fix PGO regression caused by r278808

2020-06-15 Thread luoxhu via Gcc-patches
Gentle ping... On 2020/6/1 09:45, Xionghu Luo wrote: Resending the patch for stage1: https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538186.html The performance of exchange2 built with PGO decreases by ~28% with r278808 because the profile count is set incorrectly. The cloned nodes are updated to

Re: [PATCH] rs6000: Use REAL_TYPE to copy when block move array in structure[PR65421]

2020-06-08 Thread luoxhu via Gcc-patches
Hi, On 2020/6/3 04:32, Segher Boessenkool wrote: > Hi Xiong Hu, > > On Tue, Jun 02, 2020 at 04:41:50AM -0500, Xionghu Luo wrote: >> Double array in structure as function arguments or return value is accessed >> by BLKmode, they are stored to stack and load from stack with redundant >> conversion

Re: [PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

2020-05-13 Thread luoxhu via Gcc-patches
On 2020/5/13 02:24, Richard Sandiford wrote: > luoxhu writes: >> + /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e: >> + >> + 73: r145:SI=r123:DI#0-0x1 >> + 74: r144:DI=zero_extend (r145:SI) >> + 75: r143:DI=r144:DI+0x1 >> + ... >> + 31: r135:CC=cmp

[PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

2020-05-12 Thread luoxhu via Gcc-patches
Minor refinement of the iteration non-overflow check, plus a test case for stage 1. This "subtract/extend/add" sequence has existed for a long time and still annoys us (PR37451, part of PR61837) when converting from 32 bits to 64 bits, as the ctr register is used as 64 bits on powerpc64. Andrew Pinski had a patch but
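
The "subtract/extend/add" sequence named in the subject can be written out in C to show why the fold needs a non-overflow condition. This is a sketch with hypothetical function names, not code from the patch: the first function mirrors the three-step form (add -1 in 32 bits, zero-extend, add +1 in 64 bits), the second is the folded form.

```c
#include <stdint.h>

/* Three-step form: subtract 1 in 32 bits, zero-extend to 64 bits,
   then add 1 in 64 bits.  */
uint64_t
count_unfolded (uint32_t n)
{
  uint32_t t = n - 1u;          /* wraps to 0xFFFFFFFF when n == 0 */
  return (uint64_t) t + 1u;
}

/* Folded form: a plain zero-extend.  */
uint64_t
count_folded (uint32_t n)
{
  return (uint64_t) n;
}
```

The two agree for every nonzero n. For n == 0 the 32-bit subtraction wraps, so the unfolded form yields 0x100000000 while the folded form yields 0, which is why the fold is only valid when the count is known not to wrap.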

Re: [PATCH v2] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine combination(PR83403)

2020-05-11 Thread luoxhu via Gcc-patches
On 2020-05-06 20:09, Richard Biener wrote: On Thu, 30 Apr 2020, luoxhu wrote: Update the patch with overflow check. Bootstrap and regression tested PASS on Power8-LE. Use determine_value_range to get value range info for fold convert expressions with internal operation

[PATCH v2] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine combination(PR83403)

2020-04-30 Thread luoxhu via Gcc-patches
Update the patch with overflow check. Bootstrap and regression tested PASS on Power8-LE. Use determine_value_range to get value range info for fold convert expressions with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on wrapping overflow inner type. i.e.: (long
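
A small C sketch may help show why the overflow check matters when a widening conversion is moved inside a PLUS_EXPR/MULT_EXPR on a wrapping type. The constants 10 and 1 and the function names are assumptions chosen for illustration only; the truncated example above is left as it is.

```c
#include <stdint.h>

/* Compute in the narrow (wrapping) unsigned type, then widen.  */
uint64_t
narrow_then_widen (uint32_t n)
{
  uint32_t t = (uint32_t) (n * 10u + 1u);   /* may wrap modulo 2^32 */
  return (uint64_t) t;
}

/* Widen first, then compute in the wide type.  */
uint64_t
widen_then_compute (uint32_t n)
{
  return (uint64_t) n * 10u + 1u;
}
```

For n = 100 both return 1001. For n = 0x20000000 the narrow multiplication wraps, so the first returns 1073741825 and the second 5368709121. The rewrite is therefore only safe when value range information shows that the narrow operations cannot wrap.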

Re: [PATCH] Add value range info for affine combination to improve store motion (PR83403)

2020-04-29 Thread luoxhu via Gcc-patches
On 2020/4/28 18:30, Richard Biener wrote: > > OK, I guess instead of get_range_info expr_to_aff_combination could > simply use determine_value_range (op0, , ) == VR_RANGE > (the && TREE_CODE (op0) == SSA_NAME check can then be removed)? > Tried with determine_value_range, it works and is

Re: [PATCH] Add value range info for affine combination to improve store motion (PR83403)

2020-04-28 Thread luoxhu via Gcc-patches
On 2020/4/28 15:01, Richard Biener wrote: > On Tue, 28 Apr 2020, Xionghu Luo wrote: > >> From: Xionghu Luo >> >> Get and propagate value range info to convert expressions with convert >> operation on PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow. i.e.: >> >> (long unsigned int)((unsigned

Re: [PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-20 Thread luoxhu via Gcc-patches
Tiny update to accommodate unsigned int compare. On 2020/4/20 16:21, luoxhu via Gcc-patches wrote: Hi, On 2020/4/18 00:32, Segher Boessenkool wrote: On Thu, Apr 16, 2020 at 08:21:40PM -0500, Segher Boessenkool wrote: On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote: luoxhu

Re: [PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-20 Thread luoxhu via Gcc-patches
Hi, On 2020/4/18 00:32, Segher Boessenkool wrote: > On Thu, Apr 16, 2020 at 08:21:40PM -0500, Segher Boessenkool wrote: >> On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote: >>> luoxhu--- via Gcc-patches writes: >>>> -count = simplify_gen_binary

Re: [PATCH v2] rs6000: Don't use HARD_FRAME_POINTER_REGNUM if it's not live in pro_and_epilogue (PR91518)

2020-04-16 Thread luoxhu via Gcc-patches
On 2020/4/17 08:52, Segher Boessenkool wrote: > Hi! > > On Mon, Apr 13, 2020 at 10:11:43AM +0800, luoxhu wrote: >> frame_pointer_needed is set to true in reload pass setup_can_eliminate, >> but regs_ever_live[31] is false, pro_and_epilogue uses it without live >> check causing CPU2006

[PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-15 Thread luoxhu--- via Gcc-patches
From: Xionghu Luo This "subtract/extend/add" sequence has existed for a long time and still annoys us (PR37451, PR61837) when converting from 32 bits to 64 bits, as the ctr register is used as 64 bits on powerpc64. Andrew Pinski had a patch, but it caused some issues and was reverted by Joseph S. Myers (PR37451,

[PATCH v2] rs6000: Don't use HARD_FRAME_POINTER_REGNUM if it's not live in pro_and_epilogue (PR91518)

2020-04-12 Thread luoxhu via Gcc-patches
This bug is exposed by the FRE refactoring in r263875. Comparing the FRE dump files shows no obvious change in the function that segfaults, which indicates a target issue. frame_pointer_needed is set to true in reload pass setup_can_eliminate, but regs_ever_live[31] is false, pro_and_epilogue uses it

Re: [PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-04-03 Thread luoxhu via Gcc-patches
On 2020/4/3 06:16, Segher Boessenkool wrote: > Hi! > > On Mon, Mar 30, 2020 at 11:59:57AM +0800, luoxhu wrote: >>> Do we want something later in the RTL pipeline to make "addi"s etc. again? > > (This would be a good thing to consider -- maybe a define_insn_and_split > will work. But see

Re: [PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-03-29 Thread luoxhu via Gcc-patches
On 2020/3/28 00:04, Segher Boessenkool wrote: Hi! On Fri, Mar 27, 2020 at 09:34:00AM +0800, luoxhu wrote: On 2020/3/27 07:59, Segher Boessenkool wrote: On Wed, Mar 25, 2020 at 11:15:22PM -0500, luo...@linux.ibm.com wrote: frame_pointer_needed is set to true in reload pass

Re: [PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-03-29 Thread luoxhu via Gcc-patches
On 2020/3/27 22:33, Segher Boessenkool wrote: > Hi! > > On Thu, Mar 26, 2020 at 05:06:43AM -0500, luo...@linux.ibm.com wrote: >> Remove split code from add3 to allow a later pass to split. >> This allows later logic to hoist out constant load in add instructions. >> In loop, lis+ori could be

Re: [PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-03-26 Thread luoxhu via Gcc-patches
On 2020/3/27 07:59, Segher Boessenkool wrote: > Hi! > > On Wed, Mar 25, 2020 at 11:15:22PM -0500, luo...@linux.ibm.com wrote: >> frame_pointer_needed is set to true in reload pass setup_can_eliminate, >> but regs_ever_live[31] is false, so pro_and_epilogue doesn't save/restore >> r31 even it

[PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-03-26 Thread luoxhu--- via Gcc-patches
From: Xionghu Luo Remove split code from add3 to allow a later pass to split. This allows later logic to hoist out the constant load in add instructions. In a loop, lis+ori can be hoisted out to improve performance compared with the previous addis+addi (about 15% in a typical case); the weak point is one more

[PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-03-25 Thread luoxhu--- via Gcc-patches
From: Xionghu Luo This P1 bug is exposed by the FRE refactoring in r263875. Comparing the FRE dump files shows no obvious change in the function that segfaults, which indicates a target issue. frame_pointer_needed is set to true in reload pass setup_can_eliminate, but regs_ever_live[31] is false, so