On Mon, Jun 20, 2016 at 11:53 AM, H.J. Lu <hjl.to...@gmail.com> wrote: > On Mon, Jun 20, 2016 at 10:31 AM, Ilya Enkovich <enkovich....@gmail.com> > wrote: >> On 20 Jun 09:45, H.J. Lu wrote: >>> On Mon, Jun 20, 2016 at 7:30 AM, Ilya Enkovich <enkovich....@gmail.com> >>> wrote: >>> > 2016-06-20 16:39 GMT+03:00 Uros Bizjak <ubiz...@gmail.com>: >>> >> On Mon, Jun 20, 2016 at 1:55 PM, H.J. Lu <hongjiu...@intel.com> wrote: >>> >>> TImode register referenced in debug insn can be converted to V1TImode >>> >>> by scalar to vector optimization. We need to convert a debug insn if >>> >>> it has a variable in a TImode register. >>> >>> >>> >>> Tested on x86-64. OK for trunk? >>> >> >>> >> Ilya, does this approach look good to you? Also, does DImode STV >>> >> conversion need similar handling of debug insns? >>> > >>> > DImode conversion doesn't change register mode (i.e. never calls >>> > PUT_MODE for registers). That keeps debug instructions valid. >>> > >>> > Overall I don't like the idea of having debug insns in candidates >>> > set and in chains. Looks like it is possible to have a chain >>> > consisting of a debug insn only which is weird (otherwise I don't >>> > see why we may return false in timode_scalar_chain::convert_insn). >>> >>> Yes, it can happen: >>> >>> (insn 11 8 12 2 (parallel [ >>> (set (reg/v:TI 91 [ <retval> ]) >>> (plus:TI (reg/v:TI 92 [ a ]) >>> (reg/v:TI 96 [ b ]))) >>> (clobber (reg:CC 17 flags)) >>> ]) y.i:5 210 {*addti3_doubleword} >>> (expr_list:REG_UNUSED (reg:CC 17 flags) >>> (nil))) >>> (debug_insn 12 11 13 2 (var_location:TI w (reg/v:TI 91 [ <retval> ])) y.i:5 >>> -1 >>> (nil)) >>> >>> >>> > What about other possible register uses? If debug insns are added >>> > to candidates then NONDEBUG_INSN_P check for uses in >>> > timode_check_non_convertible_regs becomes invalid, right? >>> >>> Debug insn has no impact on STV decision. We just need to convert >>> register referenced in debug insn from V1TImode to TImode in >>> timode_scalar_chain::convert_insn. >>> >>> > If we have (or want) to fix some register uses then it's probably >>> > would be better to visit register uses when we convert its mode >>> > and make required fix-ups. It seems better to me to not involve >>> > debug insns in analysis phase. >>> >>> Here is the updated patch to add debug insn, which references the >>> TImode register which will be converted to V1TImode to queue. >>> I am testing it now. >>> >> >> You still count and dump debug insns as optimized ones. Also we >> try to use virtual functions to cover differences in DI and TI >> optimizations and introducing additional TARGET_64BIT in common >> STV code is undesirable. >> >> Also your conversion now depends on instructions processing order. >> You will fail to process debug insn before non-debug ones. Required >> order is not guaranteed because processing depends on instruction >> UIDs only. >> >> I propose to modify transformation phase only like in the patch >> (untested) below. I rely on your code which assumes the only >> possible usage in debug insn is VAR_LOCATION. >> >> Thanks, >> Ilya >> -- >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >> index c5e5e12..ec955f0 100644 >> --- a/gcc/config/i386/i386.c >> +++ b/gcc/config/i386/i386.c >> @@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain >> >> private: >> void mark_dual_mode_def (df_ref def); >> + void fix_debug_reg_uses (rtx reg); >> void convert_insn (rtx_insn *insn); >> /* We don't convert registers to difference size. */ >> void convert_registers () {} >> @@ -3790,6 +3791,34 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn) >> df_insn_rescan (insn); >> } >> >> +/* Fix uses of converted REG in debug insns. */ >> + >> +void >> +timode_scalar_chain::fix_debug_reg_uses (rtx reg) >> +{ >> + df_ref ref; >> + for (ref = DF_REG_USE_CHAIN (REGNO (reg)); ref; ref = DF_REF_NEXT_REG >> (ref)) >> + { >> + rtx_insn *insn = DF_REF_INSN (ref); >> + >> + if (DEBUG_INSN_P (insn)) >> + { >> + /* It must be a debug insn with a TImode variable in register. */ >> + rtx val = PATTERN (insn); >> + gcc_assert (GET_MODE (val) == TImode >> + && GET_CODE (val) == VAR_LOCATION); >> + rtx loc = PAT_VAR_LOCATION_LOC (val); >> + gcc_assert (REG_P (loc) >> + && GET_MODE (loc) == V1TImode); >> + /* Convert V1TImode register, which has been updated by a SET >> + insn before, to SUBREG TImode. */ >> + PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0); >> + df_insn_rescan (insn); >> + return; >> + } >> + } >> +} >> + >> /* Convert INSN from TImode to V1T1mode. */ >> >> void >> @@ -3806,8 +3835,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn) >> rtx tmp = find_reg_equal_equiv_note (insn); >> if (tmp) >> PUT_MODE (XEXP (tmp, 0), V1TImode); >> + PUT_MODE (dst, V1TImode); >> + fix_debug_reg_uses (dst); >> } >> - /* FALLTHRU */ >> + break; >> case MEM: >> PUT_MODE (dst, V1TImode); >> break; > > Won't be it waste CPU cycles when there is no debug insin > which use TImode registers? >
Here is the updated patch. Tested on x86-64. OK for trunk? Thanks. -- H.J.
From df026e33ebae5fef48edc2c1c3a9a7036095aff9 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" <hjl.to...@gmail.com> Date: Sun, 19 Jun 2016 12:47:45 -0700 Subject: [PATCH] Convert V1TImode register to TImode in debug insn TImode register referenced in debug insn can be converted to V1TImode by scalar to vector optimization. After converting a TImode register to V1TImode, we need to check all debug insns on its use chain to convert the V1TImode register to SUBREG TImode. gcc/ 2016-06-21 H.J. Lu <hongjiu...@intel.com> Ilya Enkovich <ilya.enkov...@intel.com> PR target/71549 * config/i386/i386.c (timode_scalar_chain::fix_debug_reg_uses): New member function to convert V1TImode register to SUBREG TImode in debug insn. (timode_scalar_chain::convert_insn): Call fix_debug_reg_uses after changing register mode to V1TImode. gcc/testsuite/ 2016-06-21 H.J. Lu <hongjiu...@intel.com> PR target/71549 * gcc.target/i386/pr71549.c: New test. --- gcc/config/i386/i386.c | 38 ++++++++++++++++++++++++++++++++- gcc/testsuite/gcc.target/i386/pr71549.c | 24 +++++++++++++++++++++ 2 files changed, 61 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr71549.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 56a5b9c..0dd09ce 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -3139,6 +3139,7 @@ class timode_scalar_chain : public scalar_chain private: void mark_dual_mode_def (df_ref def); + void fix_debug_reg_uses (rtx reg); void convert_insn (rtx_insn *insn); /* We don't convert registers to difference size. */ void convert_registers () {} @@ -3790,6 +3791,39 @@ dimode_scalar_chain::convert_insn (rtx_insn *insn) df_insn_rescan (insn); } +/* Fix uses of converted REG in debug insns. */ + +void +timode_scalar_chain::fix_debug_reg_uses (rtx reg) +{ + if (!flag_var_tracking) + return; + + df_ref ref; + for (ref = DF_REG_USE_CHAIN (REGNO (reg)); + ref; + ref = DF_REF_NEXT_REG (ref)) + { + rtx_insn *insn = DF_REF_INSN (ref); + if (DEBUG_INSN_P (insn)) + { + /* It may be a debug insn with a TImode variable in + register. */ + rtx val = PATTERN (insn); + if (GET_MODE (val) != TImode) + continue; + gcc_assert (GET_CODE (val) == VAR_LOCATION); + rtx loc = PAT_VAR_LOCATION_LOC (val); + gcc_assert (REG_P (loc) + && GET_MODE (loc) == V1TImode); + /* Convert V1TImode register, which has been updated by a SET + insn before, to SUBREG TImode. */ + PAT_VAR_LOCATION_LOC (val) = gen_rtx_SUBREG (TImode, loc, 0); + df_insn_rescan (insn); + } + } +} + /* Convert INSN from TImode to V1T1mode. */ void @@ -3806,8 +3840,10 @@ timode_scalar_chain::convert_insn (rtx_insn *insn) rtx tmp = find_reg_equal_equiv_note (insn); if (tmp) PUT_MODE (XEXP (tmp, 0), V1TImode); + PUT_MODE (dst, V1TImode); + fix_debug_reg_uses (dst); } - /* FALLTHRU */ + break; case MEM: PUT_MODE (dst, V1TImode); break; diff --git a/gcc/testsuite/gcc.target/i386/pr71549.c b/gcc/testsuite/gcc.target/i386/pr71549.c new file mode 100644 index 0000000..8aac891 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr71549.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -g" } */ + +struct S1 +{ + int f0; + int f1; + int f2; + int:4; +} a, b; + +void +fn1 (struct S1 p1) +{ + a = p1; + int c = p1.f0; +} + +int +main () +{ + fn1 (b); + return 0; +} -- 2.5.5