[Bug target/52941] SH Target: Add support for movco.l / movli.l atomics on SH4A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52941 --- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-04-16 22:37:31 UTC --- (In reply to comment #5) The patch looks just fine. I don't mind whether those atomics are fully optimized or not ATM. Programs having atomics in the minor loop are pathological in the first place, I think.
[Bug target/52941] SH Target: Add support for movco.l / movli.l atomics on SH4A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52941
--- Comment #8 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-04-17 00:54:00 UTC ---
(In reply to comment #7)
> Created attachment 27173 [details]
> Proposed patch

Looks even better.

> Only one thing ... is it safe to do the @-r15, @+r15 stuff in the atomic
> sequence? I remember there were some border cases where things would blow
> up, but can't recall. I've also briefly checked with atomic vars being on
> the stack and it looks OK.

I don't know about such restrictions, though my knowledge of SH4A is very limited. Perhaps some weird interaction of ll/sc and cache? Anyway, if it's a border issue, the patch is OK. I'd like to pre-approve it.
[Bug target/52941] SH Target: Add support for movco.l / movli.l atomics on SH4A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52941
--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-04-13 03:29:25 UTC ---
(In reply to comment #2)
> One more thing regarding movco/movli ... do you think it's OK to use them
> also to do atomics on types smaller than SImode? As far as I can see it
> should be safe to do e.g. read SImode, modify QImode subreg, write-back
> SImode.

Yes, it'll produce false-positive cases (a store to a neighbouring byte of the same word forces a retry) but would be safe.
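The "read SImode, modify QImode subreg, write-back SImode" idea can be sketched in portable C. This is only an illustration, not the code the SH back-end emits: it uses GCC's `__atomic` compare-and-swap builtins in place of a movli.l / movco.l pair, and the function name and the byte-lane numbering are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Atomically add VAL to one byte lane (0..3, counted from the least
   significant byte of the word value) of the aligned 32-bit word *WORD.
   Returns the previous value of that byte.  The whole word is read,
   only the selected byte is changed, and the whole word is written back. */
static uint8_t fetch_add_u8_via_u32(uint32_t *word, unsigned lane, uint8_t val)
{
    assert(lane < 4);
    unsigned shift = lane * 8;
    uint32_t mask = (uint32_t)0xff << shift;
    uint32_t old_word = __atomic_load_n(word, __ATOMIC_RELAXED);
    uint32_t new_word;

    do {
        uint8_t old_byte = (uint8_t)((old_word & mask) >> shift);
        uint8_t new_byte = (uint8_t)(old_byte + val);   /* wraps within the byte */
        new_word = (old_word & ~mask) | ((uint32_t)new_byte << shift);
        /* On failure OLD_WORD is refreshed and the loop retries.  A change
           to any other byte of the word also causes a retry: that is the
           "false positive" mentioned in the comment above. */
    } while (!__atomic_compare_exchange_n(word, &old_word, new_word, 1,
                                          __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));

    return (uint8_t)((old_word & mask) >> shift);
}
```

The neighbouring bytes are never modified because they are copied unchanged from the word that was just read, so the only cost of sharing the word is the occasional spurious retry.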
[Bug target/52898] SH Target: Inefficient DImode comparisons
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52898
--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-04-12 01:13:15 UTC ---
(In reply to comment #2)
I don't know about their history. -mcbranchdi is enabled by default, though. See gcc/common/config/sh/sh-common.c:sh_option_optimization_table. Unfortunately, it looks like -mcmpeqdi causes many new failures on trunk.
[Bug target/52941] SH Target: Add support for movco.l / movli.l atomics on SH4A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52941
--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-04-12 01:18:42 UTC ---
(In reply to comment #0)
> Other than that, should we add another option '-mhard-atomic' (which would
> enable the movco/movli atomics on SH4A and disable all atomic insns for
> non-SH4A)?

I think so.

> Actually, I think the options should be '-msp-atomic' and '-mmp-atomic',
> where '-msp-atomic' would be the current '-msoft-atomic'.

I don't think that -msp/mmp-atomic are good names here. The SP/MP notion is not directly connected with the soft/hard implementation of atomics, even if soft atomics are impossible on a real MP system. Hard atomics should work with both SP and MP.
I guess that the point is the necessity of kernel (i.e. software) services. If the atomics require kernel services, they are soft atomics even if some of them utilize LL/SC-like insns. If they don't require any kernel services, they are hard atomics. Using -msp-atomic for soft atomics looks a bit misleading from this point of view.
Perhaps an unsurprising way would be to enable movco/movli on SH4A with both -msoft-atomic/-mhard-atomic if we can.
[Bug libstdc++/29366] atomics config for sh is weird
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29366
--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-04-12 01:20:06 UTC ---
(In reply to comment #2)
> I think some of the problems will disappear once PR 52941 is done.
> After that, and having the new atomic builtins of 4.7, we could get rid of
> the config/cpu/sh/atomicity.h file completely, if I'm not mistaken.

Agreed.
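As a rough illustration of why a hand-written atomicity.h becomes unnecessary, the two operations such a file typically provides can be expressed with the `__atomic` builtins introduced in GCC 4.7. This is a sketch in C under assumptions: the type and function names below are hypothetical stand-ins, not the libstdc++ declarations, and the memory order is chosen only for the example.

```c
#include <assert.h>

typedef int atomic_word_t;   /* hypothetical stand-in for the library's word type */

/* Add VAL to *MEM atomically and return the value *MEM held before. */
static atomic_word_t exchange_and_add(volatile atomic_word_t *mem, int val)
{
    assert(mem != 0);
    return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}

/* Add VAL to *MEM atomically, discarding the old value. */
static void atomic_add(volatile atomic_word_t *mem, int val)
{
    assert(mem != 0);
    (void)__atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}
```

With the target providing the atomic patterns (or a library fallback), the compiler decides how each call is implemented, so no per-CPU assembly is needed in the library.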
[Bug target/48806] ICE in reload_cse_simplify_operands, at postreload.c:403
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48806
--- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-22 21:39:51 UTC ---
Author: kkojima
Date: Thu Mar 22 21:39:45 2012
New Revision: 185714
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185714
Log:
Backported from mainline
2012-03-02 Kaz Kojima kkoj...@gcc.gnu.org
PR target/48596
PR target/48806
* config/sh/sh.c (sh_register_move_cost): Increase cost between GENERAL_REGS and FP_REGS for SImode.
Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/config/sh/sh.c
[Bug rtl-optimization/48596] [4.7/4.8 Regression] [SH] unable to find a register to spill in class 'FPUL_REGS'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48596
--- Comment #9 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-22 21:39:51 UTC ---
Author: kkojima
Date: Thu Mar 22 21:39:45 2012
New Revision: 185714
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185714
Log:
Backported from mainline
2012-03-02 Kaz Kojima kkoj...@gcc.gnu.org
PR target/48596
PR target/48806
* config/sh/sh.c (sh_register_move_cost): Increase cost between GENERAL_REGS and FP_REGS for SImode.
Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/config/sh/sh.c
[Bug rtl-optimization/48596] [4.7/4.8 Regression] [SH] unable to find a register to spill in class 'FPUL_REGS'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48596
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED

--- Comment #10 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-22 22:19:47 UTC ---
Fixed.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #35 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-20 01:45:14 UTC ---
(In reply to comment #34)
> Interesting, thanks! I'll also test your patch and send it around, OK?

OK, thanks!

> I'm a bit confused... was the issue caused by my patches for this PR, or by
> something else?

I guess that it was caused by other changes but was latent for a while.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #33 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-15 07:52:21 UTC ---
(In reply to comment #31)
> Created attachment 26859 [details]
> testresult on sh4-unknown-linux-gnu [trunk revision 185088].

FYI, looking into the libstdc++ failures for sh4-unknown-linux-gnu, it seems that the call insn was swapped before the prologue frame insns, which confuses the unwinder. -fno-delayed-branch also stops that swapping for these failing cases. The patch below works for me.

* config/sh/sh.c (sh_expand_prologue): Emit blockage at the end of prologue for unwinder and profiler.

--- ORIG/trunk/gcc/config/sh/sh.c 2012-03-06 10:28:32.0 +0900
+++ trunk/gcc/config/sh/sh.c 2012-03-14 20:22:15.0 +0900
@@ -7234,6 +7234,13 @@ sh_expand_prologue (void)
       emit_insn (gen_shcompact_incoming_args ());
     }
 
+  /* If we are profiling, make sure no instructions are scheduled before
+     the call to mcount.  Similarly if some call instructions are swapped
+     before frame related insns, it'll make unwinder confused because
+     currently SH has no unwind info for function epilogues.  */
+  if (crtl->profile || flag_exceptions || flag_unwind_tables)
+    emit_insn (gen_blockage ());
+
   if (flag_stack_usage_info)
     current_function_static_stack_size = stack_usage;
 }
[Bug target/52479] SH Target: SH4A DFmode fsca tests failing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52479
--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-16 03:21:08 UTC ---
There is no concrete definition of -ffast-math and users will have different expectations. Numerical programs for astrodynamics may expect precision even with -ffast-math and OTOH there are many programs which don't care about precision at all and prefer sincos instead of sinfcosf. I guess that the former doesn't make much sense for SH4A in the first place, and for users of the latter with -ffast-math, the proposed change may be a bit surprising. I have no strong opinion on this, though. I also won't object to the suggested patch.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #29 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-09 08:40:32 UTC ---
(In reply to comment #28)
Regtest on sh4-unknown-linux-gnu has been done successfully. Oleg, your patch is pre-approved.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #31 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-09 10:36:31 UTC --- Created attachment 26859 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26859 A test result testresult on sh4-unknown-linux-gnu [trunk revision 185088].
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #24 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-08 11:11:32 UTC ---
(In reply to comment #23)
> Kaz, if you have some time, could you try it out in your setup, too please?

On trunk revision 185088, for sh4-unknown-linux-gnu, the result of compare_tests is:

New tests that FAIL:

gfortran.dg/associated_4.f90 -O1 execution test
gfortran.dg/forall_4.f90 -O3 -fomit-frame-pointer execution test
gfortran.dg/forall_4.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test
gfortran.dg/forall_4.f90 -O3 -fomit-frame-pointer -funroll-loops execution test
gfortran.dg/forall_4.f90 -O3 -g execution test

Old tests that failed, that have disappeared: (Eeek!)

22_locale/ctype/is/char/3.cc execution test
27_io/basic_filebuf/underflow/wchar_t/9178.cc execution test
gfortran.dg/widechar_intrinsics_6.f90 -Os execution test
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #25 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-08 11:13:39 UTC --- Created attachment 26854 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26854 worked .s file associated_4_good.s
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #26 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-08 11:16:39 UTC --- Created attachment 26855 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26855 unworked .s file associated_4_bad.s I've attached .s files against gfortran.dg/associated_4.f90 -O1 with patched/unpatched compilers.
[Bug fortran/34040] Support for DOUBLE_TYPE_SIZE != 64 targets
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34040
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |olegendo at gcc dot gnu.org

--- Comment #12 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-08 22:15:33 UTC ---
*** Bug 52535 has been marked as a duplicate of this bug. ***
[Bug libfortran/52535] SH Target: libfortran won't build for sub-targets where DFmode is set to SFmode?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52535
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-08 22:15:32 UTC ---
This is a known issue as PR34040.
*** This bug has been marked as a duplicate of bug 34040 ***
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #28 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-09 01:44:52 UTC ---
(In reply to comment #27)
> Created attachment 26858 [details]
> Patch for the patch

It looks like all fortran regressions have gone away. I'll run full tests on sh4-unknown-linux-gnu.
[Bug target/52503] sh-wrs-vxworks: too many target masks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52503
--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-07 22:06:28 UTC ---
Author: kkojima
Date: Wed Mar 7 22:06:25 2012
New Revision: 185081
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185081
Log:
PR target/52503
* config/sh/sh.opt (msoft-atomic): Use Var instead of Mask.
* config/sh/linux.h (TARGET_DEFAULT): Remove MASK_SOFT_ATOMIC.
(SUBTARGET_OVERRIDE_OPTIONS): Define.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/linux.h
trunk/gcc/config/sh/sh.opt
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #15 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 08:49:27 UTC ---
(In reply to comment #14)
> I've run the testsuite on rev 184966 (without fortran though), but the
> failures that you've mentioned did not show up. Usually when I rebuild the
> whole toolchain including newlib, I have C/CPP/CXXFLAGS_FOR_TARGET set to
> '-Os -mpretend-cmove'. This time I removed those, but the results seem to be
> the same. Could you also please try again? This is suspicious...

I've seen the same failures on sh4-unknown-linux-gnu for trunk rev 184971. After backing the r184966 changes out, they went away. Weird.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #17 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 10:36:01 UTC --- Created attachment 26837 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26837 preprocessed file ctype_configure_char.i
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #18 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 10:37:13 UTC --- Created attachment 26838 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26838 worked .s file ctype_configure_char_good.s
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #19 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 10:38:22 UTC --- Created attachment 26839 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26839 unworked .s file ctype_configure_char_bad.s
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #20 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 10:40:31 UTC ---
(In reply to comment #16)
> Can we keep the r184966 changes anyways? I will keep an eye on these failures
> whether I can reproduce them. If you have some time, could you please send me
> the intermediate .i and .s files of the failing and passing version of the
> '22_locale/ctype/is/char/3.cc' test case?

I've confirmed that 22_locale/ctype/is/char/3.cc doesn't fail when linking with a libstdc++.a built with the compiler without the r184966 changes. The .s files for 3.cc are the same with both compilers. It looks like the problematic object is libstdc++-v3/src/c++98/ctype_configure_char.o, because the error went away when it was replaced with the other one. I've attached the .i and .s files for that file. The options used are:

COLLECT_GCC_OPTIONS='-shared-libgcc' '-B' '/exp/ldroot/dodes/xsh-gcc/./gcc' '-nostdinc++' '-L/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/src' '-L/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/src/.libs' '-B' '/usr/local/sh4-unknown-linux-gnu/bin/' '-B' '/usr/local/sh4-unknown-linux-gnu/lib/' '-isystem' '/usr/local/sh4-unknown-linux-gnu/include' '-isystem' '/usr/local/sh4-unknown-linux-gnu/sys-include' '-I' '/exp/ldroot/dodes/ORIG/trunk/libstdc++-v3/../libgcc' '-I' '/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/include/sh4-unknown-linux-gnu' '-I' '/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/include' '-I' '/exp/ldroot/dodes/ORIG/trunk/libstdc++-v3/libsupc++' '-fno-implicit-templates' '-Wall' '-Wextra' '-Wwrite-strings' '-Wcast-qual' '-Wabi' '-fdiagnostics-show-location=once' '-ffunction-sections' '-fdata-sections' '-frandom-seed=ctype_configure_char.lo' '-g' '-O2' '-D' '_GNU_SOURCE' '-S' '-fPIC' '-D' 'PIC' '-o'
[Bug target/52503] sh-wrs-vxworks: too many target masks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52503 --- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 23:18:16 UTC --- Created attachment 26845 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26845 A patch config/sh/linux.h requires a few changes too.
[Bug target/48806] ICE in reload_cse_simplify_operands, at postreload.c:403
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48806
--- Comment #4 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-06 00:26:30 UTC ---
It looks like the testcase came from FreeBSD kernel code:
http://www.leidinger.net/FreeBSD/dox/net80211/html/d7/d8d/ieee80211__crypto__ccmp_8c_source.html
gcc.c-torture/execute/pr20527-1.c is an example of a gcc testcase which includes a BSD-like license notice, though I'm not sure about copyright issues. The gcc list would be more appropriate for questions about copyrights.
[Bug target/52479] SH Target: SH4A DFmode fsca tests failing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52479
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aoliva at gcc dot gnu.org

--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-04 13:46:58 UTC ---
I'd like to add Alex to the CC list. Alex, what do you think?
[Bug target/52480] SH Target: SH4A movua.l does not work for big endian
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52480 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-05 05:30:18 UTC --- Created attachment 26831 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26831 A possible patch Looks to be a similar problem with PR52394.
[Bug target/52483] SH Target: Loads from volatile memory leave redundant sign/zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52483
--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-05 05:33:39 UTC ---
(In reply to comment #0)
> Maybe a few peepholes would help here?

Sure. A peephole looks reasonable for this.
[Bug rtl-optimization/48596] [4.7/4.8 Regression] [SH] unable to find a register to spill in class 'FPUL_REGS'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48596
--- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-02 23:59:16 UTC ---
Author: kkojima
Date: Fri Mar 2 23:59:08 2012
New Revision: 184844
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184844
Log:
PR target/48596
PR target/48806
* config/sh/sh.c (sh_register_move_cost): Increase cost between GENERAL_REGS and FP_REGS for SImode.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c
[Bug target/48806] ICE in reload_cse_simplify_operands, at postreload.c:403
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48806
--- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-02 23:59:17 UTC ---
Author: kkojima
Date: Fri Mar 2 23:59:08 2012
New Revision: 184844
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184844
Log:
PR target/48596
PR target/48806
* config/sh/sh.c (sh_register_move_cost): Increase cost between GENERAL_REGS and FP_REGS for SImode.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c
[Bug target/52441] SH Target: Double sign/zero extensions for function arguments
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52441
--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-01 22:00:14 UTC ---
(In reply to comment #0)
> The sign/zero extensions in the caller (_xx) are not emitted when using the
> original Renesas ABI (-mrenesas), which is correct.

Correct for efficiency, but not for robustness :-)

> Maybe this double sign/zero extension has some historical reason for some ABI
> backwards compatibilities in the GNU SH ABI... but shouldn't it actually be
> safe to leave out the sign/zero extensions on one side of the function call
> (either caller or callee)?

I don't know of any historical reason, but x86 uses that double sign/zero extension too. It wouldn't be a safe ABI change. There can exist hand-written functions depending on that behavior. It's too late to change the default behavior, I think. Of course, you can add a new -m option or function attribute changing it, though it shouldn't be the default for the non-Renesas ABI.
[Bug rtl-optimization/11736] Stackpointer messed up on SuperH
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11736 --- Comment #9 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-03-01 22:03:09 UTC --- I think so too.
[Bug target/49468] SH Target: inefficient integer abs code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49468
--- Comment #9 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-02-29 23:18:23 UTC ---
(In reply to comment #8)
Perhaps. Anyway, it looks fine to me except for one minor failure on sh64-elf:

xsh64-elf-combined/combined/libgcc/libgcc2.c: In function '__powisf2':
xsh64-elf-combined/combined/libgcc/libgcc2.c:1779:1: error: unrecognizable insn:
(insn 11 10 12 3 (set (reg:DI 170)
        (abs:DI (reg:DI 169))) xsh64-elf-combined/combined/libgcc/libgcc2.c:1770 -1
     (nil))
xsh64-elf-combined/combined/libgcc/libgcc2.c:1779:1: internal compiler error: in extract_insn, at recog.c:2123

The failure went away when the new absdi2 expander was restricted to TARGET_SH1.

--- gcc/config/sh/sh.md~ 2012-02-29 10:52:16.0 +0900
+++ gcc/config/sh/sh.md 2012-02-29 11:07:42.0 +0900
@@ -4538,7 +4538,7 @@ label:
   [(set (match_operand:DI 0 "arith_reg_dest" "")
        (abs:DI (match_operand:DI 1 "arith_reg_operand" "")))
    (clobber (reg:SI T_REG))]
-  ""
+  "TARGET_SH1"
   "")
 
 (define_insn_and_split "*absdi2"
[Bug target/52394] SH Target: SH2A defunct bitops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52394
--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-02-28 00:37:56 UTC ---
I guess that these tests now require -fno-strict-volatile-bitfields, though that isn't enough to avoid the failures. It looks like something goes wrong in expmed.c:{store, extract}_bit_field_1 and they decide to use the slow fallback {store, extract}_fixed_bit_field instead of generating insv/extv. Here is the suspicious part of {store, extract}_bit_field_1:

      /* Now convert from counting within UNIT to counting in EXT_MODE.  */
      if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
        xbitpos += GET_MODE_BITSIZE (ext_mode) - unit;

      unit = GET_MODE_BITSIZE (ext_mode);

      /* If BITS_BIG_ENDIAN is zero on a BYTES_BIG_ENDIAN machine, we count
         backwards from the size of the unit we are extracting from.
         Otherwise, we count bits from the most significant on a
         BYTES/BITS_BIG_ENDIAN machine.  */
      if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
        xbitpos = unit - bitsize - xbitpos;

In the problematic cases, xop0 is a QImode memory and ext_mode is SImode. The initial value of unit is 8. When the starting xbitpos is 3 and bitsize is 1, for example, these lines set xbitpos to 28! There is no insv/extv which inserts/extracts at such a bit position for QImode memory, so maybe_expand_insn for CODE_FOR_{insv, extv} fails. Perhaps these parts should be something like

      /* We have been counting XBITPOS within UNIT.
         Count instead within the size of the register.  */
      if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
        xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

      /* If BITS_BIG_ENDIAN is zero on a BYTES_BIG_ENDIAN machine, we count
         backwards from the size of the unit we are inserting into.
         Otherwise, we count bits from the most significant on a
         BYTES/BITS_BIG_ENDIAN machine.  */
      if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
        {
          if (!MEM_P (xop0))
            xbitpos = GET_MODE_BITSIZE (op_mode) - bitsize - xbitpos;
          else
            xbitpos = unit - bitsize - xbitpos;
        }

      unit = GET_MODE_BITSIZE (op_mode);

though I don't understand these routines well.
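The arithmetic in the comment above can be checked with a small stand-alone model. This is not GCC code: the struct and function names are hypothetical, the GCC macros and the MEM_P predicate are replaced by plain fields, and the endianness settings in the usage note (BYTES_BIG_ENDIAN set, BITS_BIG_ENDIAN zero) are an assumption inferred from the value 28 quoted in the comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Plain-data stand-ins for the values used by the quoted code. */
struct bitpos_in {
    bool bytes_big_endian;  /* models BYTES_BIG_ENDIAN */
    bool bits_big_endian;   /* models BITS_BIG_ENDIAN */
    bool is_mem;            /* models MEM_P (xop0) */
    int unit;               /* initial unit, 8 for a QImode memory */
    int mode_bits;          /* models GET_MODE_BITSIZE of ext_mode/op_mode */
    int bitsize;
    int xbitpos;
};

/* The computation as quoted from {store, extract}_bit_field_1. */
static int xbitpos_as_quoted(struct bitpos_in in)
{
    assert(in.bitsize > 0 && in.xbitpos + in.bitsize <= in.unit);
    int xbitpos = in.xbitpos;
    int unit = in.unit;

    if (in.bytes_big_endian && !in.is_mem)
        xbitpos += in.mode_bits - unit;
    unit = in.mode_bits;                    /* unit is widened first ... */
    if (in.bits_big_endian != in.bytes_big_endian)
        xbitpos = unit - in.bitsize - xbitpos;  /* ... then used for a memory too */
    return xbitpos;
}

/* The variant suggested in the comment: a memory keeps counting in UNIT. */
static int xbitpos_as_suggested(struct bitpos_in in)
{
    assert(in.bitsize > 0 && in.xbitpos + in.bitsize <= in.unit);
    int xbitpos = in.xbitpos;

    if (in.bytes_big_endian && !in.is_mem)
        xbitpos += in.mode_bits - in.unit;
    if (in.bits_big_endian != in.bytes_big_endian) {
        if (!in.is_mem)
            xbitpos = in.mode_bits - in.bitsize - xbitpos;
        else
            xbitpos = in.unit - in.bitsize - xbitpos;
    }
    return xbitpos;
}
```

For a QImode memory (unit 8) with a 32-bit mode, bitsize 1 and starting position 3, the quoted form yields 32 - 1 - 3 = 28, which lies outside the byte, while the suggested form yields 8 - 1 - 3 = 4.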
[Bug target/52049] SH Target: Inefficient constant address access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52049
--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2012-01-30 00:15:16 UTC ---
(In reply to comment #0)
> I'm not sure whether this is actually a problem of the SH back-end or of some
> middle-end passes. It happens for all sub-targets and regardless of the
> endianness.

I've tried these cases on arm/thumb and got similar results which do not look very good. From the rtl dumps, it looks like a general issue with postreload optimization on some targets.
[Bug target/50749] SH Target: Post-increment addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749
--- Comment #11 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-30 03:24:01 UTC ---
(In reply to comment #10)
> If OK, I'd like to change it from target PR to middle-end PR.

Sure.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #7 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-28 22:25:48 UTC ---
(In reply to comment #3)
> I haven't run all tests on it yet, but CSiBE shows average code size reduction
> of approx. -0.1% for -m4* with some code size increases in some files. Would
> something like that be OK for stage 3?

Looks good, though not appropriate for stage 3, I think.
[Bug target/51340] SH Target: Make -mfused-madd enabled by default
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51340
--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-28 22:31:27 UTC ---
(In reply to comment #2)
> Uhm, yes... The title should have been Enable -mfused-madd by -ffast-math

Do you mean something like this?

--- ORIG/trunk/gcc/config/sh/sh.c 2011-12-03 10:03:41.0 +0900
+++ trunk/gcc/config/sh/sh.c 2011-12-27 08:33:23.0 +0900
@@ -838,6 +838,11 @@ sh_option_override (void)
       align_functions = min_align;
     }
 
+  /* Default to use fmac insn when -ffast-math.  See PR target/29100.  */
+  if (global_options_set.x_TARGET_FMAC == 0
+      && fast_math_flags_set_p (&global_options))
+    TARGET_FMAC = 1;
+
   if (sh_fixed_range_str)
     sh_fix_range (sh_fixed_range_str);

> I don't know the exact semantics for the new patterns. All I know is that
> rounding is supposed to be done only once after the two operations. This is
> the case for the SH fmac insn. Not sure whether this is enough though.

It seems that we can use the fma pattern, though that would be another issue.
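The remark that a fused multiply-add rounds only once can be made concrete with a small C sketch. It is an illustration under assumptions, not SH code: the single rounding is emulated through a double-precision intermediate, which is exact for the inputs used here but is not a general replacement for a real fma, and both function names are hypothetical.

```c
#include <assert.h>

/* a*b + c with the product rounded to float before the addition,
   i.e. two roundings.  The volatile store forces the intermediate
   rounding and keeps the compiler from contracting the expression. */
static float mul_then_add(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

/* a*b + c with one final rounding to float, emulated in double.
   The product of two floats is exact in double; the addition is
   exact for the demo inputs below, not for all inputs. */
static float fused_like(float a, float b, float c)
{
    double exact = (double)a * (double)b + (double)c;
    return (float)exact;
}

/* a = 1 + 2^-12, so a*a = 1 + 2^-11 + 2^-24, which is a rounding tie in
   float and rounds to 1 + 2^-11.  Subtracting 1 + 2^-11 then gives 0 with
   two roundings but 2^-24 with a single rounding. */
static void rounding_demo(void)
{
    float a = 1.0f + 0x1p-12f;
    float c = -(1.0f + 0x1p-11f);
    assert(mul_then_add(a, a, c) == 0.0f);
    assert(fused_like(a, a, c) == 0x1p-24f);
}
```

The two results differ, which is why enabling a fused instruction changes observable results and is tied to an option such as -ffast-math or to an explicit fma pattern.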
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751
--- Comment #21 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-12 22:08:18 UTC ---
(In reply to comment #20)
> As far as I could observe it, this is mainly triggered by the following in
> sh_legitimate_index_p:
>
> +  if (mode == QImode && (unsigned) INTVAL (op) < 16)
> +    return true;

It seems that, with that hunk, recog.c:offsettable_address_addr_space_p always returns true for V2SF mode. Without that hunk, it returns false for that case. There are comments and lines in that function like

  /* Use QImode because an odd displacement may be automatically invalid
     for any wider mode.  But it should be valid for a single byte.  */
  good = (*addressp) (QImode, y, as);

where addressp is *memory_address_addr_space_p, which returns true with that hunk.

> You mean, by giving the user the option to turn off displacement addressing
> for e.g. some specific files / modules by specifying -mno-preferdisp or
> something like that? By anomalies do you mean code that gets worse because of
> too much pressure on R0 and all the reloads around it, or do you have any
> other bad use cases?

Yes and yes. Although I didn't look at all dis-improvements, it looks like r0 pressure is the primary factor.

> Another thing I could try out is to have load/store insns that allow
> arbitrary operands in displacement addressing like on SH2A, and split them
> into two insns of one load/store and one reg-reg move after reload. But that
> would probably require the R0 clobber in the expander which could make worse
> code in cases where displacement addressing is not used, I guess. Do you
> think this approach could make sense?

I guess that it could make worse code in some situations, as you say.

> Yep, sure. I've noticed that the latest version of the patch seems to fix
> some more testsuite failures. I will investigate which hunk is responsible
> for the fixes so that could be pulled out from the patch. OK?

Sounds great.
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751
--- Comment #19 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-11 23:57:13 UTC ---
(In reply to comment #18)
> The results look way better now.

I've tested your latest patch for sh4-unknown-linux-gnu and found no new regressions in the gcc testsuite. CSiBE with -O2 -fpic on that target shows 144 improvements and 28 dis-improvements in size over 896 files. The worst case is

  -4.34783 net/ipv4/ip_forward 704 736

which looks like a case of high r0 register pressure. The best one is

  25.7426 arch/testplatform/kernel/traps 10160 8080

which looks very impressive.

>    /* We want to enable the use of SUBREGs as a means to
>       VEC_SELECT a single element of a vector.  */
> +
> +  /* This effectively disallows using GENERAL_REGS for SFmode vector subregs.
> +     This can be problematic when SFmode vector subregs need to be accessed
> +     on the stack with displacement addressing, as it happens with -O0.
> +     Thus we allow the mode change for -O0.  */
>    if (to == SFmode && VECTOR_MODE_P (from) && GET_MODE_INNER (from) == SFmode)
> -    return (reg_classes_intersect_p (GENERAL_REGS, rclass));
> +    return optimize ? (reg_classes_intersect_p (GENERAL_REGS, rclass)) : false;

Rather than that, I guess that the QI/HImode disp addressing would be an optimization unneeded for -O0 in the first place. Perhaps something like a -mpreferdisp option and a TARGET_PREFER_DISP macro which are enabled by default but disabled at -O0 might help. It'll also help with some unfortunate anomalies for which those optimizations generate worse code.

> There are probably smarter ways of doing what the patch does. I have also
> tried out implementing it with predicates and constraints, few load/store
> insns and lots of alternatives in the insns. However, reload would refuse to
> select the displacement addressing due to pressure on R0 in many cases.

Maybe. Implementing it with predicates and constraints would be smarter if possible, but may be difficult because the register allocator handles the m constraint specially.

> Would something like the attached patch be acceptable (after some cleanups)?
> If so, I'd also start adding HImode displacement addressing support.

I think so, though we are in stage 3 and have to wait until the trunk returns to stage 1 or 2 before committing such changes. You have the time to implement HImode support.
BTW, the changes for white space, spelling and other clean-ups which are not essential for this work should be separated into another patch.
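The two CSiBE figures quoted above are consistent with a relative size change computed against the second (patched) size. The formula below is inferred from those two data points only and is not taken from CSiBE documentation; the function name is hypothetical.

```c
#include <assert.h>

/* Size change in percent, positive when the second size is smaller.
   Assumed formula: 100 * (before - after) / after. */
static double size_change_percent(unsigned before, unsigned after)
{
    assert(after != 0);
    return 100.0 * ((double)before - (double)after) / (double)after;
}
```

With this formula, sizes 704 and 736 give about -4.34783 and sizes 10160 and 8080 give about 25.7426, matching the worst and best cases listed in the comment.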
[Bug middle-end/51351] undefined reference to __sync_fetch_and_ior_4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51351
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-04 11:45:25 UTC ---
Thanks for the quick fix!
[Bug target/50814] SH Target: SHAD / SHLD instructions not used on SH2A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50814
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #7 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-02 23:42:56 UTC ---
Fixed on trunk.
[Bug target/51337] SH Target: Various testsuite ICEs for -m2a -O0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51337
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-02 23:44:34 UTC ---
Fixed.
[Bug target/50814] SH Target: SHAD / SHLD instructions not used on SH2A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50814
--- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-01 23:02:08 UTC ---
Author: kkojima
Date: Thu Dec 1 23:01:58 2011
New Revision: 181896
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181896
Log:
PR target/50814.
* config/sh/sh.c (expand_ashiftrt): Handle TARGET_SH2A same as TARGET_SH3.
(shl_sext_kind): Likewise.
* config/sh/sh.h (SH_DYNAMIC_SHIFT_COST): Likewise.
* config/sh/sh.md (ashlsi3_sh2a, ashrsi3_sh2a, lshrsi3_sh2a): Remove.
(ashlsi3_std): Handle TARGET_SH2A same as TARGET_SH3.
(ashlsi3): Likewise.
(ashrsi3_d): Likewise.
(lshrsi3_d): Likewise.
(lshrsi3): Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c
trunk/gcc/config/sh/sh.h
trunk/gcc/config/sh/sh.md
[Bug target/51337] SH Target: Various testsuite ICEs for -m2a -O0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51337 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-29 22:52:59 UTC ---
Author: kkojima
Date: Tue Nov 29 22:52:55 2011
New Revision: 181823
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181823
Log:
PR target/51337
* config/sh/sh.c (sh_secondary_reload): Add case when FPUL register is being loaded from a pseudo in memory.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c
[Bug middle-end/51351] New: undefined reference to __sync_fetch_and_ior_4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51351
Bug #: 51351
Summary: undefined reference to __sync_fetch_and_ior_4
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: kkoj...@gcc.gnu.org
On SH, there are libgomp test failures with undefined reference to `__sync_fetch_and_ior_4'. The documentation refers to __sync_fetch_and_or but not to __sync_fetch_and_ior.
[Bug target/50814] SH Target: SHAD / SHLD instructions not used on SH2A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50814 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-28 13:43:16 UTC --- BTW, when regtesting, I've found that there are many ICEs at -O0. A typical one is gcc.c-torture/compile/2923-1.c with -m2a -O0:
...: error: insn does not satisfy its constraints:
(insn 142 34 35 (set (mem/c:SI (plus:SI (reg/f:SI 14 r14)
        (const_int 36 [0x24])) [0 %sfp+-16 S4 A32])
    (reg:SI 150 fpul)) ... {movsi_ie}
  (nil))
...: internal compiler error: in extract_constrain_insn_cached, at recog.c:2052
which is solved by the hunk in the patch against PR50751:
--- gcc/config/sh/sh.c.orig 2011-11-28 10:03:04.0 +0900
+++ gcc/config/sh/sh.c 2011-11-28 15:09:01.0 +0900
@@ -12432,6 +12432,10 @@ sh_secondary_reload (bool in_p, rtx x, r
   if (rclass != GENERAL_REGS && REG_P (x)
       && TARGET_REGISTER_P (REGNO (x)))
     return GENERAL_REGS;
+  /* If here fall back to loading FPUL register through general regs.
+     Happens when FPUL has to be loaded from a reg allocated on the stack.  */
+  if (rclass == FPUL_REGS && !REG_P (x))
+    return GENERAL_REGS;
   return NO_REGS;
 }
Oleg, it seems that this is the right patch for an independent issue described in your comment. Could you please file it in Bugzilla and propose that patch on the gcc-patches list?
[Bug target/51340] SH Target: Make -mfused-madd enabled by default
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51340 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-28 23:09:32 UTC --- (In reply to comment #0) Is there any particular reason why this should not be enabled by default for SH targets that support the FMAC insn? PR29100? BTW, if SH fmac satisfies the semantics for fused multiplication and add operation, the fmaf4 instruction pattern would be better now.
[Bug target/50749] SH Target: Post-increment addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749 --- Comment #9 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-28 23:29:57 UTC --- (In reply to comment #8) Specifying -fno-tree-forwprop doesn't seem to have any effect on these cases. For that function, -fdump-tree-all shows that the tree loop ivopts optimization does it. Try -fno-tree-forwprop -fno-ivopts.
[Bug target/50814] SH Target: SHAD / SHLD instructions not used on SH2A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50814 --- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-28 00:09:17 UTC --- (In reply to comment #2) According to the SW manual document rej09b0051_sh2a.pdf the SHAD and SHLD insns have the same 2-byte format as on SH3: SHAD Rm, Rn: 01001100 SHLD Rm, Rn: 01001101 Am I missing something there? Ugh. You are right. I had assumed they were 4-byte insns ever since sh2a support was introduced at r85286.
[Bug target/50814] SH Target: SHAD / SHLD instructions not used on SH2A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50814 --- Comment #4 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-28 04:31:51 UTC --- Created attachment 25927 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25927 A patch I'm testing the attached patch.
[Bug target/51244] SH Target: Inefficient conditional branch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-22 22:33:43 UTC ---
return (a != b || a != c) ? b : c;
The test_func_0_NG and test_func_1_NG cases are related to the target implementation of cstoresi4. The middle end expands a complex conditional jump into cstores and simple conditional jumps. For the expression a != b, SH's cstoresi4 implementation uses sh.c:sh_emit_compare_and_set, which generates cmp/eq and movnegt insns, because we have no cmp/ne insn. Then we get the sequence
mov #-1,rn
negc rn,rm
tst #255,rm
which is essentially T_reg = T_reg. Usually combine catches such a situation, but negc might be too complex for combine. For this case, replacing the current movnegt expander by an insn, splitter and peephole, something like
(define_insn "movnegt"
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
        (plus:SI (reg:SI T_REG) (const_int -1)))
   (clobber (match_scratch:SI 1 "=r"))
   (clobber (reg:SI T_REG))]
  ""
  "#"
  [(set_attr "length" "4")])
(define_split
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
        (plus:SI (reg:SI T_REG) (const_int -1)))
   (clobber (match_scratch:SI 1 "=r"))
   (clobber (reg:SI T_REG))]
  "reload_completed"
  [(set (match_dup 1) (const_int -1))
   (parallel [(set (match_dup 0)
                   (neg:SI (plus:SI (reg:SI T_REG) (match_dup 1))))
              (set (reg:SI T_REG)
                   (ne:SI (ior:SI (reg:SI T_REG) (match_dup 1))
                          (const_int 0)))])]
  "")
(define_peephole2
  [(set (match_operand:SI 1 "" "") (const_int -1))
   (parallel [(set (match_operand:SI 0 "" "")
                   (neg:SI (plus:SI (reg:SI T_REG) (match_dup 1))))
              (set (reg:SI T_REG)
                   (ne:SI (ior:SI (reg:SI T_REG) (match_dup 1))
                          (const_int 0)))])
   (set (reg:SI T_REG)
        (eq:SI (match_operand:QI 3 "" "") (const_int 0)))]
  "REGNO (operands[3]) == REGNO (operands[0])
   && peep2_reg_dead_p (3, operands[0])
   && peep2_reg_dead_p (3, operands[1])"
  [(const_int 0)]
  "")
the above useless sequence could be removed, though we will miss the chance that the -1 can be CSE-ed when the cstore value is used.
This will cause slightly worse code for a loop like
int foo (int *a, int x, int n)
{
  int i;
  int count;
  for (i = 0; i < n; i++)
    count += (*(a + i) != x);
  return count;
}
though it may be relatively rare. BTW, OT, (a != b || a != c) ? b : c could be reduced to b, I think.
return a >= 0 && b >= 0 ? c : d;
x >= 0 is expanded to a sequence like
ra = not x
rb = -31
rc = ra >> (neg rb)
T = (rc == 0)
conditional jump
and combine tries to simplify it. combine simplifies b >= 0 successfully into shll and bt but fails to simplify a >= 0. It seems that combine doesn't do constant propagation well and misses the constant -31. In this case, a peephole like
(define_peephole2
  [(set (match_operand:SI 0 "arith_reg_dest" "")
        (not:SI (match_operand:SI 1 "arith_reg_operand" "")))
   (set (match_operand:SI 2 "arith_reg_dest" "")
        (const_int -31))
   (set (match_operand:SI 3 "arith_reg_dest" "")
        (lshiftrt:SI (match_dup 0) (neg:SI (match_dup 2))))
   (set (reg:SI T_REG)
        (eq:SI (match_operand:QI 4 "arith_reg_operand" "")
               (const_int 0)))
   (set (pc)
        (if_then_else (match_operator 5 "comparison_operator"
                        [(reg:SI T_REG) (const_int 0)])
                      (label_ref (match_operand 6 "" ""))
                      (pc)))]
  "REGNO (operands[3]) == REGNO (operands[4])
   && peep2_reg_dead_p (4, operands[0])
   && (peep2_reg_dead_p (4, operands[3])
       || rtx_equal_p (operands[2], operands[3]))
   && peep2_regno_dead_p (5, T_REG)"
  [(set (match_dup 2) (const_int -31))
   (set (reg:SI T_REG) (ge:SI (match_dup 1) (const_int 0)))
   (set (pc)
        (if_then_else (match_op_dup 7 [(reg:SI T_REG) (const_int 0)])
                      (label_ref (match_dup 6))
                      (pc)))]
{
  operands[7] = gen_rtx_fmt_ee (reverse_condition (GET_CODE (operands[5])),
                                GET_MODE (operands[5]),
                                XEXP (operands[5], 0), XEXP (operands[5], 1));
})
will be a workaround. It isn't ideal, but better than nothing.
return a == b ? test_sub0 (a, b) : test_sub1 (a, b);
return a != b ? test_sub0 (a, b) : test_sub1 (a, b);
This case is interesting. At -Os, the two calls are converted into one computed goto. A bit surprisingly, the conversion is done as a side effect of the combine-stack-adjustments pass.
That pass calls cleanup_cfg (flag_crossjumping ? CLEANUP_CROSSJUMP : 0); and the cross jumping optimization merges the two calls. With -Os -fno-delayed-branch, the OK case is compiled to
test_func_3_OK:
        mov     r4,r1
        cmp/eq  r5,r1
        mov.l   .L4,r0
        bf      .L3
        mov     r1,r5
        mov.l   .L5,r0
        bra     .L3
        nop
.L3:
        jmp     @r0
        nop
and the NG case
test_func_3_NG:
        mov     r4,r1
        cmp/eq  r5,r1
        bt      .L2
        mov.l   .L4,r0
        bra     .L3
        nop
.L2:
        mov.l   .L5,r0
        mov     r1,r5
.L3
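Two of the claims in this comment can be checked by brute force. The following is only a sketch: the helper names (negc, tst_imm, t_after_sequence, select_expr) are made up for illustration and are not GCC code, and the NEGC model assumes the usual description (Rn = 0 - Rm - T, with T set to the borrow). It shows that mov #-1 / negc / tst #255 leaves the T bit as it was, and that (a != b || a != c) ? b : c always yields b.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of one result register plus the SH T bit.  */
struct sh_state { uint32_t rn; int t; };

/* NEGC Rm,Rn (assumed semantics): Rn = 0 - Rm - T, T = borrow.  */
struct sh_state negc(uint32_t rm, int t)
{
    struct sh_state s;
    uint32_t tmp = 0u - rm;
    s.rn = tmp - (uint32_t)t;
    s.t = (0u < tmp) || (tmp < s.rn);
    return s;
}

/* TST #imm,R0: T = ((R0 & imm) == 0).  */
int tst_imm(uint32_t r0, uint32_t imm)
{
    return (r0 & imm) == 0;
}

/* mov #-1,rn ; negc rn,rm ; tst #255,rm, starting with T == t.  */
int t_after_sequence(int t)
{
    struct sh_state s = negc(0xffffffffu, t);  /* rm = 1 - t */
    return tst_imm(s.rn, 255u);
}

/* The expression from the first test case.  */
int select_expr(int a, int b, int c)
{
    return (a != b || a != c) ? b : c;
}
```

The second result follows because the condition is false only when a == b and a == c, in which case c equals b anyway.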
[Bug target/51241] SH Target: Unnecessary sign/zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51241 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-21 01:52:22 UTC --- Please put the description of the problem into the trail itself instead of an attachment next time. The problem looks like it can be split into several issues.
mov.b   @r4+,r3 ! 40 *extendqisi2_compact_mem_inc
extu.b  r3,r3   ! 41 *zero_extendqisi2_compact <--- extu.b r3, r0
mov     r3,r0   ! 75 movsi_ie/2 <--- ??
This mov insn is generated by reload. After all, SH's and #imm,* would be too restrictive.
exts.b  r3,r3   ! 50 *extendqisi2_compact <--- ??
and     #127,r0 ! 45 *andsi3_compact/1 <--- makes extu.b useless
cmp/pz  r3      ! 51 cmpgesi_t/2
As you pointed out, if cmp/pz is placed just after the mov.b insn, exts.b is not required. I don't know whether such a pass exists or not.
mov.l   @r4,r1  ! 7 movsi_ie/7 [length = 2]
swap.w  r1,r1   ! 13 rotlsi3_16 [length = 2]
exts.w  r1,r1   ! 14 *extendhisi2_compact/1 [length = 2]
rts             ! 21 *return_i [length = 2]
mov.b   r1,@r5  ! 10 *movqi/4 [length = 2]
The sequence of swap.w and exts.w is generated by the ashrsi2_16 insn and its splitter. exts.w could be removed by the combine pass, but the split is done after combine. Perhaps by replacing that insn and splitter with an expander like
(define_expand "ashrsi2_16"
  [(set (match_operand:SI 0 "arith_reg_dest" "")
        (rotate:SI (match_operand:SI 1 "arith_reg_operand" "")
                   (const_int 16)))
   (set (match_dup 0) (sign_extend:SI (match_dup 2)))]
  "TARGET_SH1"
  "operands[2] = gen_lowpart (HImode, operands[0]);")
combine will do the work.
negc    r1,r1   ! 10 negc [length = 2]
extu.b  r1,r0   ! 12 *zero_extendqisi2_compact [length = 2]
Again, usually the combine pass can remove such an extu.b. Perhaps it is because negc has the pattern
[(set (match_operand:SI 0 "arith_reg_dest" "=r")
      (neg:SI (plus:SI (reg:SI T_REG)
                       (match_operand:SI 1 "arith_reg_operand" "r"))))
 (set (reg:SI T_REG)
      (ne:SI (ior:SI (reg:SI T_REG) (match_dup 1))
             (const_int 0)))]
which would be too complex for combine.
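As a rough illustration of two of the observations above, here is a small C model. The function names are hypothetical and the code only mimics the instruction semantics on 32-bit values; it is not taken from GCC. It shows that swap.w followed by exts.w is an arithmetic shift right by 16, which is what ashrsi2_16 stands for, and that an extu.b after and #127 changes nothing.

```c
#include <assert.h>
#include <stdint.h>

/* swap.w Rm,Rn: exchange the upper and lower 16-bit halves.  */
uint32_t swap_w(uint32_t x)
{
    return (x << 16) | (x >> 16);
}

/* exts.w Rm,Rn: sign-extend the low 16 bits (kept as a bit pattern).  */
uint32_t exts_w(uint32_t x)
{
    return (x & 0xffffu) | ((x & 0x8000u) ? 0xffff0000u : 0u);
}

/* extu.b Rm,Rn: zero-extend the low 8 bits.  */
uint32_t extu_b(uint32_t x)
{
    return x & 0xffu;
}

/* Reference: arithmetic shift right by 16 on a 32-bit pattern.  */
uint32_t ashr16_ref(uint32_t x)
{
    return (x >> 16) | ((x & 0x80000000u) ? 0xffff0000u : 0u);
}

/* The two-insn sequence emitted for ashrsi2_16.  */
uint32_t ashr16_via_swap_exts(uint32_t x)
{
    return exts_w(swap_w(x));
}
```

Since and #127 already clears bits 7 and above, extu_b (x & 127) is always x & 127, which is why the extu.b in the first listing is useless.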
[Bug target/50694] SH Target: SH2A little endian does not actually work
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50694 --- Comment #10 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-13 23:00:15 UTC ---
Author: kkojima
Date: Sun Nov 13 23:00:10 2011
New Revision: 181340
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181340
Log:
PR target/50694
* config/sh/sh.h (IS_LITTLE_ENDIAN_OPTION, UNSUPPORTED_SH2A): New macros.
(DRIVER_SELF_SPECS): Use new macros to filter out unsupported options taking the default configuration into account.
* gcc.target/sh/pr21255-2-ml.c: Skip if -mb or -m5* is specified. Remove redundant runtime checks.
* gcc.target/sh/20080410-1.c: Skip if -mb is specified. Allow for other than -m4. Fix typos in comments.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.h
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/sh/20080410-1.c
trunk/gcc/testsuite/gcc.target/sh/pr21255-2-ml.c
[Bug target/22553] [4.4/4.5/4.6/4.7 regression] ICE building libstdc++
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22553 --- Comment #20 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-09 14:07:01 UTC --- (In reply to comment #19) So I think the workaround from r105496 can be safely removed now and then close this bug as fixed since 4.3.0 I've confirmed that there are no ICEs on SH after reverting the r105496 change, though I don't understand why the change pointed out in #19 fixes the issue pointed out by Joern in http://gcc.gnu.org/ml/gcc-patches/2005-09/msg01654.html
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751 --- Comment #14 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-11-02 00:57:59 UTC --- (In reply to comment #13) Apparently this makes something believe that loading the FPUL register from a displacement address is possible, which is of course not the case. However, I can't see any connection there... .ira dump would be your friend, though I suspect that your patch triggered off some other reload problem like PR48596. Could you try the change in #5 of that PR?
[Bug target/50749] SH Target: Post-increment addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749 --- Comment #7 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-30 23:36:27 UTC --- (In reply to comment #6) I wonder whether there might be something in the target code that suggests the early optimizers to do that? I've tried playing with the TARGET_ADDRESS_COST hook but it didn't have any effect in this case. -fdump-tree-all shows that forward propagation on SSA trees turns those memory accesses into simple array accesses. You can try -fno-tree-forwprop and see the effect of that option. It seems that there are no special knobs to control forwprop from the target side. The problem is that the SH target can't do those simple array accesses well in QI/HImode because of the lack of displacement addressing for those modes.
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751 --- Comment #12 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-27 22:30:39 UTC --- It seems that base_reg+index_reg addressing requires special handling in RA, and a move insn like
(define_insn "*movqi_m_reg_reg_store"
  [(set (mem:QI (plus:SI (match_operand:SI 0 "arith_reg_operand" "%z")
                         (match_operand:SI 1 "arith_reg_operand" "r")))
        (match_operand:QI 2 "arith_reg_operand" "r"))]
  "TARGET_SH1"
  "mov.b\t%2,@(%0,%1)"
  [(set_attr "type" "store")])
might be unexpected for RA.
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751 --- Comment #8 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-27 02:31:35 UTC --- (In reply to comment #7) Created attachment 25622 [details] asmcons and ira pass log for the reload failure of z insn constraint The original insn 13 was
(insn 13 12 14 3 (set (reg:SI 193)
    (plus:SI (subreg:SI (reg:QI 191 [ MEM[(char *)buf1_4(D) + 4B] ]) 0)
        (subreg:SI (reg:QI 192 [ MEM[(char *)buf0_1(D) + 5B] ]) 0)))
and RA chooses r1 and r0 as the registers into which the memory operands will be loaded. The problem is that we have no direct way to load buf1[4] into r1. In such a situation, a secondary reload is needed. See the description of TARGET_SECONDARY_RELOAD in the gcc manual. Here is a trial:
--- ORIG/trunk/gcc/config/sh/sh.c 2011-10-16 10:18:53.0 +0900
+++ trunk/gcc/config/sh/sh.c 2011-10-27 10:13:21.0 +0900
@@ -12430,6 +12453,10 @@ sh_secondary_reload (bool in_p, rtx x, r
   if (rclass != GENERAL_REGS && REG_P (x)
       && TARGET_REGISTER_P (REGNO (x)))
     return GENERAL_REGS;
+  if (rclass == GENERAL_REGS && mode == QImode
+      && MEM_P (x) && GET_CODE (XEXP (x, 0)) == PLUS
+      && CONST_INT_P (XEXP (XEXP (x, 0), 1)))
+    return R0_REGS;
   return NO_REGS;
 }
The ICE for your testcase went away with it, though I've got
../../../INTEST/trunk/zlib/trees.c: In function 'send_tree':
../../../INTEST/trunk/zlib/trees.c:797:1: error: unable to find a register to spill in class 'R0_REGS'
../../../INTEST/trunk/zlib/trees.c:797:1: error: this is the insn:
(insn 415 414 416 28 (set (mem:QI (plus:SI (reg/f:SI 6 r6 [orig:742 s_34(D)->pending_buf ] [742])
                (reg:SI 7 r7 [orig:307 D.4248 ] [307])) [0 *D.4249_209+0 S1 A8])
        (reg:QI 746 [ s_34(D)->bi_buf ])) ../../../INTEST/trunk/zlib/trees.c:780 206 {*movqi_m_reg_reg_store}
     (expr_list:REG_DEAD (reg:QI 746 [ s_34(D)->bi_buf ])
        (expr_list:REG_DEAD (reg/f:SI 6 r6 [orig:742 s_34(D)->pending_buf ] [742])
            (expr_list:REG_DEAD (reg:SI 7 r7 [orig:307 D.4248 ] [307])
                (nil)))))
../../../INTEST/trunk/zlib/trees.c:797:1: internal compiler error: in spill_failure, at reload1.c:2118
when bootstrapping.
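To make the register restriction behind this trial concrete, here is a small sketch of the addressing rule as I understand it from the SH instruction set: displacement loads take a 4-bit displacement scaled by the access size, and the byte and word forms exist only with R0 as the data register. The function disp_load_ok is a hypothetical helper for illustration and is not part of sh.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code: can "mov.{b,w,l} @(disp,Rm),Rn"
   be used directly for an access of SIZE bytes at displacement DISP
   with data register number DEST_REGNO?  */
bool disp_load_ok(int size, int disp, int dest_regno)
{
    if (size != 1 && size != 2 && size != 4)
        return false;
    /* 4-bit displacement field, scaled by the access size.  */
    if (disp < 0 || disp % size != 0 || disp / size > 15)
        return false;
    /* Byte and word displacement forms only exist with R0 as data reg.  */
    if (size < 4 && dest_regno != 0)
        return false;
    return true;
}
```

With this model, loading buf1[4] as a byte into r1 is rejected while the same load into r0 is accepted, which matches the need for a secondary reload through R0_REGS described above.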
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-24 23:05:08 UTC --- (In reply to comment #4) It seems that clobbering R0 in that expander is simply papering over the real problem. Although the reload issue is beyond me, the .ira dump for that impossible insn, which doesn't satisfy the z constraint, would be a starting point.
[Bug target/50694] SH Target: SH2A little endian does not actually work
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50694 --- Comment #8 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-20 22:40:27 UTC --- (In reply to comment #7) This problem doesn't require theoretical/mathematical completeness. There are many inappropriate combinations of options and configurations which don't produce any warning when running the compiler. The important thing is to warn about the ones that are very confusing from the user's point of view. So your patch in #6 or even the one-liner in #2 would be OK and enough for this PR, I think.
[Bug target/50814] SH Target: SHAD / SHLD instructions not used on SH2A
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50814 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-21 00:24:36 UTC --- (In reply to comment #0) It is also not clear to me why SH2A seems to require different handling for dynamic shifts than SH3 or SH4... It will be slightly different because sh2a's shad/shld are 4-byte insns. Perhaps something like the patch below will work, though I haven't tested it at all.
diff -up ORIG/gcc/config/sh/sh.h gcc/config/sh/sh.h
--- ORIG/gcc/config/sh/sh.h 2011-04-23 09:43:19.0 +0900
+++ gcc/config/sh/sh.h 2011-10-21 08:15:25.0 +0900
@@ -2371,7 +2371,8 @@ extern int current_function_interrupt;
 #define ACCUMULATE_OUTGOING_ARGS TARGET_ACCUMULATE_OUTGOING_ARGS

 #define SH_DYNAMIC_SHIFT_COST \
-  (TARGET_HARD_SH4 ? 1 : TARGET_SH3 ? (optimize_size ? 1 : 2) : 20)
+  (TARGET_HARD_SH4 ? 1 : TARGET_SH3 ? (optimize_size ? 1 : 2) \
+   : TARGET_SH2A ? 2 : 20)


 #define NUM_MODES_FOR_MODE_SWITCHING { FP_MODE_NONE }
diff -up ORIG/gcc/config/sh/sh.c gcc/config/sh/sh.c
--- ORIG/gcc/config/sh/sh.c 2011-07-29 09:31:42.0 +0900
+++ gcc/config/sh/sh.c 2011-10-21 09:03:36.0 +0900
@@ -3246,7 +3246,7 @@ expand_ashiftrt (rtx *operands)
   char func[18];
   int value;

-  if (TARGET_SH3)
+  if (TARGET_SH3 || TARGET_SH2A)
     {
       if (!CONST_INT_P (operands[2]))
         {
diff -up ORIG/gcc/config/sh/sh.md gcc/config/sh/sh.md
--- ORIG/gcc/config/sh/sh.md 2011-08-02 09:47:17.0 +0900
+++ gcc/config/sh/sh.md 2011-10-21 08:58:49.0 +0900
@@ -3424,15 +3424,6 @@ label:
 ;;
 ;; shift left

-(define_insn "ashlsi3_sh2a"
-  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
-        (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0")
-                   (match_operand:SI 2 "arith_reg_operand" "r")))]
-  "TARGET_SH2A"
-  "shad\t%2,%0"
-  [(set_attr "type" "arith")
-   (set_attr "length" "4")])
-
 ;; This pattern is used by init_expmed for computing the costs of shift
 ;; insns.

@@ -3441,14 +3432,14 @@ label:
         (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0,0,0,0")
                    (match_operand:SI 2 "nonmemory_operand" "r,M,P27,?ri")))
    (clobber (match_scratch:SI 3 "=X,X,X,r"))]
-  "TARGET_SH3
+  "(TARGET_SH3 || TARGET_SH2A)
    || (TARGET_SH1 && satisfies_constraint_P27 (operands[2]))"
   "@
    shld\t%2,%0
    add\t%0,%0
    shll%O2\t%0
    #"
-  "TARGET_SH3
+  "(TARGET_SH3 || TARGET_SH2A)
    && reload_completed
    && CONST_INT_P (operands[2])
    && ! satisfies_constraint_P27 (operands[2])"
@@ -3457,7 +3448,11 @@ label:
     [(set (match_dup 0) (ashift:SI (match_dup 1) (match_dup 3)))
      (clobber (match_dup 4))])]
   "operands[4] = gen_rtx_SCRATCH (SImode);"
-  [(set_attr "length" "*,*,*,4")
+  [(set_attr_alternative "length"
+     [(if_then_else
+        (ne (symbol_ref "TARGET_SH2A") (const_int 0))
+        (const_int 4) (const_int 2))
+      (const_int 2) (const_int 2) (const_int 4)])
    (set_attr "type" "dyn_shift,arith,arith,arith")])

 (define_insn "ashlhi3_k"
@@ -3584,15 +3579,6 @@ label:
 ; arithmetic shift right
 ;

-(define_insn "ashrsi3_sh2a"
-  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
-        (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
-                     (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))]
-  "TARGET_SH2A"
-  "shad\t%2,%0"
-  [(set_attr "type" "dyn_shift")
-   (set_attr "length" "4")])
-
 (define_insn "ashrsi3_k"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")
         (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
@@ -3687,9 +3673,13 @@ label:
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")
         (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
                      (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))]
-  "TARGET_SH3"
+  "TARGET_SH3 || TARGET_SH2A"
   "shad\t%2,%0"
-  [(set_attr "type" "dyn_shift")])
+  [(set_attr_alternative "length"
+     [(if_then_else
+        (ne (symbol_ref "TARGET_SH2A") (const_int 0))
+        (const_int 4) (const_int 2))])
+   (set_attr "type" "dyn_shift")])

 (define_insn "ashrsi3_n"
   [(set (reg:SI R4_REG)
@@ -3735,22 +3725,17 @@ label:

 ;; logical shift right

-(define_insn "lshrsi3_sh2a"
-  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
-        (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
-                     (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))]
-  "TARGET_SH2A"
-  "shld\t%2,%0"
-  [(set_attr "type" "dyn_shift")
-   (set_attr "length" "4")])
-
 (define_insn "lshrsi3_d"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")
         (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
                      (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))]
-  "TARGET_SH3"
+  "TARGET_SH3 || TARGET_SH2A"
   "shld\t%2,%0"
-  [(set_attr "type" "dyn_shift")])
+  [(set_attr "type" "dyn_shift")
+   (set_attr_alternative "length"
+     [(if_then_else
+        (ne (symbol_ref "TARGET_SH2A") (const_int 0))
+        (const_int 4) (const_int 2))])])

 ;; Only the single bit shift clobbers the T bit.
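For readers who find the nested conditional in SH_DYNAMIC_SHIFT_COST hard to scan, here is a plain C rendering of the macro before and after the proposed change. It is only a sketch: struct sh_target and both function names are invented for illustration, and the flags stand in for the TARGET_* macros and optimize_size.

```c
#include <assert.h>

/* Hypothetical stand-ins for TARGET_HARD_SH4, TARGET_SH3, TARGET_SH2A
   and optimize_size.  */
struct sh_target {
    int hard_sh4;
    int sh3;
    int sh2a;
    int optimize_size;
};

/* Cost as defined before the patch in this comment.  */
int dynamic_shift_cost_old(const struct sh_target *t)
{
    return t->hard_sh4 ? 1
         : t->sh3 ? (t->optimize_size ? 1 : 2)
         : 20;
}

/* Cost with the proposed extra TARGET_SH2A arm.  */
int dynamic_shift_cost_new(const struct sh_target *t)
{
    return t->hard_sh4 ? 1
         : t->sh3 ? (t->optimize_size ? 1 : 2)
         : t->sh2a ? 2
         : 20;
}
```

In this model, an SH2A-only configuration drops from the prohibitive cost of 20 to 2, which is what lets the shift expanders consider shad/shld at all.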
[Bug target/50749] SH Target: Post-increment addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749 --- Comment #4 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-19 21:36:56 UTC --- (In reply to comment #3) USE_LOAD_POST_INCREMENT and USE_STORE_PRE_DECREMENT are used only in move_by_pieces, which is for some block operations when MOVE_BY_PIECES_P says OK. They don't disable post_inc/pre_dec addressing for SI/DImode in general, I think. It seems that they are 0 for SI/DImode because we have displacement addressing for a limited size of memory chunk in these modes, though I may be wrong about it. I'm a bit curious to see what happens if they are changed to non-zero for SI/DImode.
[Bug target/50694] SH Target: SH2A little endian does not actually work
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50694 --- Comment #4 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-18 22:24:32 UTC --- (In reply to comment #3) There are no real uses of SH1/SH2/SH2E/SH3E cores anymore, I think. I agree that taking care of -m2e is not worth it. Perhaps the same goes for -m1. Anyway, your change looks plausible to me.
[Bug target/50694] SH Target: SH2A little endian does not actually work
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50694 --- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-18 22:50:19 UTC --- (In reply to comment #5) I'll send in a patch with a couple of other cosmetic changes later, OK? Please go for it.
[Bug target/50694] SH Target: SH2A little endian does not actually work
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50694 Kazumoto Kojima kkojima at gcc dot gnu.org changed:
What             |Removed     |Added
Priority         |P3          |P4
Status           |UNCONFIRMED |NEW
Last reconfirmed |            |2011-10-16
Ever Confirmed   |0           |1
--- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-16 23:28:48 UTC --- (In reply to comment #1) Ah. The one-liner
-#define DRIVER_SELF_SPECS "%{m2a:%{ml:%eSH2a does not support little-endian}}"
+#define DRIVER_SELF_SPECS "%{m2a*:%{ml:%eSH2a does not support little-endian}}"
should work.
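The difference between %{m2a:...} and %{m2a*:...} is that the first form fires only for the exact switch -m2a, while the starred form fires for any switch whose name starts with -m2a (for example -m2a-nofpu). The following is a toy model of that matching rule only. The function names are hypothetical and this is not how the driver is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: does spec name NAME ("m2a" or "m2a*") match the
   switch SW, given without its leading '-'?  */
bool spec_name_matches(const char *name, const char *sw)
{
    size_t n = strlen(name);
    if (n > 0 && name[n - 1] == '*')
        return strncmp(name, sw, n - 1) == 0;  /* prefix match */
    return strcmp(name, sw) == 0;              /* exact match */
}

/* True when %{NAME:%{ml:%e...}} would raise the error for the given
   list of switches.  */
bool rejects_little_endian(const char *name, const char **sw, int nsw)
{
    bool cpu = false;
    bool ml = false;
    for (int i = 0; i < nsw; i++) {
        if (spec_name_matches(name, sw[i]))
            cpu = true;
        if (strcmp(sw[i], "ml") == 0)
            ml = true;
    }
    return cpu && ml;
}
```

Under this model, -m2a-nofpu -ml slips past the original spec and is caught only by the starred one.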
[Bug target/50749] SH Target: Post-increment addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-16 23:33:40 UTC --- GCC turns ordinary mem accesses into post_inc/pre_dec accesses in the auto_inc_dec pass. I guess that the auto_inc_dec pass can't find post_inc insns well in that case because other tree/rtl optimizers have already tweaked the code. If this is the case, the problem would not be target specific.
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751 --- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-17 00:29:55 UTC --- This is a known issue. See the comment just before sh.c:sh_legitimate_index_p. Unfortunately, I guess this PR might be marked as WONTFIX.
[Bug target/50749] SH Target: Post-increment addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749 --- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-17 00:32:39 UTC --- *** Bug 50750 has been marked as a duplicate of this bug. ***
[Bug target/50750] SH Target: Pre-decrement addressing used only for first memory access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50750 Kazumoto Kojima kkojima at gcc dot gnu.org changed:
What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |            |DUPLICATE
--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-17 00:32:39 UTC --- Looks like a duplicate of PR50749. *** This bug has been marked as a duplicate of bug 50749 ***
[Bug target/50751] SH Target: Displacement addressing does not work for QImode and HImode
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50751 --- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-17 00:51:15 UTC --- (In reply to comment #2) Yeah, I know this has been around for a while. I'd like to take my chances anyway :) Welcome to the spill-failure-for-class-'R0_REGS' club :-)
[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #12 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-14 23:06:06 UTC --- (In reply to comment #11) Created attachment 25491 [details] Proposed patch including test case Looks fine. A few very minor style nits:
+ if (GET_CODE (XEXP (x, 0)) == AND /* tst instruction. */
This comment looks a bit bogus. A full-sentence comment would be better.
+
+
There are some extra empty lines. GNU/GCC coding style says that only one empty line is needed. I know that there are extra empty lines already, but we should not add new ones :-)
[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #13 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-15 02:32:56 UTC ---
Author: kkojima
Date: Sat Oct 15 02:32:53 2011
New Revision: 180020
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180020
Log:
PR target/49263
* config/sh/sh.h (ZERO_EXTRACT_ANDMASK): New macro.
* config/sh/sh.c (sh_rtx_costs): Add test instruction case.
* config/sh/sh.md (tstsi_t): Name existing insn. Make inner and instruction commutative.
(tsthi_t, tstqi_t, tstqi_t_zero, tstsi_t_and_not, tstsi_t_zero_extract_eq, tstsi_t_zero_extract_xor, tstsi_t_zero_extract_subreg_xor_little, tstsi_t_zero_extract_subreg_xor_big): New insns.
(*movsicc_t_false, *movsicc_t_true): Replace space with tab in asm output.
(*andsi_compact): Reorder alternatives so that K08 is considered first.
* gcc.target/sh/pr49263.c: New.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c
trunk/gcc/config/sh/sh.h
trunk/gcc/config/sh/sh.md
trunk/gcc/testsuite/ChangeLog
[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #10 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-11 01:47:03 UTC --- (In reply to comment #9) 3) only zero_extract special cases looks to be dominant. I'm sorry, I forgot to mention that it was just a proof of concept hack of mine, just to see whether it has any chance to work at all. I think it would be better to change/fix the behavior of the combine pass in this regard, so that it tries matching combined patterns without sophisticated transformations. I will try asking on the gcc list about that. I see. I also expect that the experts have some idea for this issue. I think it would be a bit too much checking out each individual pattern. I don't think that it's too much. Those numbers can be easily collected for CSiBE. If your patterns are named, you could simply add -dap -save-temps to the compiler options specified when running CSiBE's create-config, and then get the number of occurrences of testsi_6, for example, with something like
grep testsi_6 `find . -name "*.s" -print` | wc -l
after running the CSiBE size test.
[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #8 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-10-10 01:31:42 UTC --- (In reply to comment #7) Option 2 seems more robust even if it seems less effective, what do you think? Another combine pass to reduce size by less than 0.3% on one target would not be acceptable, I guess. ~10 new patterns would be overkill for that result, though I still expect that only a few of those patterns were dominant. Could you get numbers on which patterns were used in the former option?
[Bug bootstrap/49486] [4.7 Regression] Bootstrap failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49486 --- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-09-28 21:43:06 UTC ---
Author: kkojima
Date: Wed Sep 28 21:43:01 2011
New Revision: 179320
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179320
Log:
PR target/49486
* config/sh/sh.md (negdi2): Move expansion into split to allow more combination options. Add T_REG clobber.
(abssi2): New expander.
(*negdi2, *abssi2, *negabssi2): New insns.
(cneg): Change from insn to insn_and_split. Rename to negsi_cond. Add alternative for non-SH4.
* gcc.target/sh/pr49468-si.c: New.
Added:
trunk/gcc/testsuite/gcc.target/sh/pr49468-si.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.md
trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/50287] [4.7 Regression] FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c compilation, -O2 -flto
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50287 --- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-09-07 00:26:13 UTC --- (In reply to comment #4) Testcase that fails on i686-linux for me. FYI, the testcase is failing also for arm-eabi, mips-elf and sh-elf.
[Bug tree-optimization/50287] New: FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c compilation, -O2 -flto
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50287 Bug #: 50287 Summary: FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c compilation, -O2 -flto Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassig...@gcc.gnu.org ReportedBy: kkoj...@gcc.gnu.org Target: arm-eabi sh*-*-* Several gcc.c-torture/execute/builtins/*-chk.c tests fail for ARM and SH with -O2 -flto: gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c: In function '__vsnprintf_chk': gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c:398:1: error: number of operands and imm-links don't agree in statement # .MEM_57 = VDEF .MEM_22 ap = ap_18(D); gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c:398:1: internal compiler error: verify_ssa failed A reduced testcase for arm-eabi: static char buf[4096]; int __attribute__((format(printf,4,0))) foo (char *str, unsigned int len, unsigned int size, const char *fmt, __builtin_va_list ap); int foo (char *str, unsigned int len, unsigned int size, const char *fmt, __builtin_va_list ap) { if (!size) return 0; if (size len) bar (str, buf, size + 1); else bar (str, buf, len - 1); return 0; } It has started to fail after revision 178386. It seems that the fix for PR49886 reveals this issue. -fno-partial-inlining makes the ICE go away.
[Bug target/50068] Invalid memory access in incr_ticks_for_insn
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50068 --- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-17 22:49:21 UTC --- Author: kkojima Date: Wed Aug 17 22:49:18 2011 New Revision: 177839 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177839 Log: PR target/50068 * config/sh/sh.c (sh_output_mi_thunk): Don't call dbr_schedule. Modified: trunk/gcc/ChangeLog trunk/gcc/config/sh/sh.c
[Bug target/50068] Invalid memory access in incr_ticks_for_insn
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50068
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|shle--netbsdelf             |sh*-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011-08-16
                 CC|                            |kkojima at gcc dot gnu.org
     Ever Confirmed|0                           |1
      Known to fail|                            |4.4.6, 4.5.3, 4.6.1, 4.7.0

--- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-16 13:00:12 UTC ---
I've added a

  gcc_assert (last_basic_block >= NUM_FIXED_BLOCKS);

line to init_resource_info and confirmed that trunk and all released branches fail with the testcase given in #1 for sh4-unknown-linux-gnu. Perhaps

  if (optimize > 0 && flag_delayed_branch)
    dbr_schedule (insns);

in sh.c:sh_output_mi_thunk might not be a big deal. I'm testing a patch which simply removes these lines.
[Bug rtl-optimization/49977] [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49977
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED

--- Comment #11 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-10 02:41:57 UTC ---
Now the test result for hppa64-hp-hpux11.11 looks good:
http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00952.html
I'd like to close this PR.
[Bug rtl-optimization/49686] [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49686 --- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-04 12:18:09 UTC ---
It seems that the problem has come back on trunk revision 177305 for SH. There are many EH test failures which go away with -fno-delayed-branch, and the testcase in #1 is assembled to

foo:
.LFB0:
        tst     r4,r4
        bt/s    .L2
        sts.l   pr,@-r15
        mov.l   .L3,r0
        jsr     @r0
        nop

with -O1 -fexceptions -fnon-call-exceptions.
[Bug rtl-optimization/49686] [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49686 --- Comment #8 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-04 13:57:14 UTC --- Thanks for checking cris-elf. I'd like to open a new PR.
[Bug rtl-optimization/49977] New: [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49977 Summary: [4.7 Regression] CFI notes are missed for delayed slot Product: gcc Version: 4.7.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: unassig...@gcc.gnu.org ReportedBy: kkoj...@gcc.gnu.org CC: r...@gcc.gnu.org, h...@gcc.gnu.org Target: sh4-unknown-linux-gnu, cris-elf Many EH tests fail on SH and CRIS. These failures went away with -fno-delayed-branch on SH. The symptoms are quite similar to those of PR49686.
[Bug rtl-optimization/49977] [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49977 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-04 21:18:55 UTC --- (In reply to comment #2)
> Kaz, can you enumerate some specific tests that are now failing?

I've got

FAIL: gcc.dg/cleanup-10.c execution test
FAIL: gcc.dg/cleanup-11.c execution test
FAIL: g++.dg/eh/crossjump1.C execution test
FAIL: g++.dg/eh/unexpected1.C execution test
FAIL: g++.dg/ext/cleanup-10.C execution test
FAIL: g++.dg/ext/cleanup-11.C execution test
FAIL: g++.dg/torture/pr49115.C -O1 execution test
...

A tiny testcase in #1 of PR49686

int
foo (int a)
{
  if (a)
    bar ();
  return 1;
}

is again compiled to

foo:
.LFB0:
        tst     r4,r4
        bt/s    .L2
        sts.l   pr,@-r15
        mov.l   .L3,r0
        jsr     @r0
        nop

with -O1 -fexceptions -fnon-call-exceptions.
[Bug rtl-optimization/49982] New: [4.7 Regression] ICE in fixup_args_size_notes, at expr.c:3625
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49982
Summary: [4.7 Regression] ICE in fixup_args_size_notes, at expr.c:3625
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Keywords: ice-on-valid-code
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: kkoj...@gcc.gnu.org
CC: r...@gcc.gnu.org
Target: sh-*-*

For sh-elf, gcc.c-torture/compile/20030224-1.c fails at -O0 -m4 with an ICE:

internal compiler error: in fixup_args_size_notes, at expr.c:3625

#0  fancy_abort (file=0x88edae8 "../../ORIG/trunk/gcc/expr.c", line=3625, function=0x88ee6af "fixup_args_size_notes") at ../../ORIG/trunk/gcc/diagnostic.c:893
#1  0x0828560f in fixup_args_size_notes (prev=0xb7f8e18c, last=0xb7f8e1b0, end_args_size=0) at ../../ORIG/trunk/gcc/expr.c:3625

where prev and last are

(gdb) call debug_rtx(prev)
(insn 183 182 304 6 (clobber (mem:BLK (reg/f:SI 15 r15) [0 A8])) 20030224-1.c:16 -1
     (nil))
(gdb) call debug_rtx(last)
(insn 184 305 185 6 (set (reg/f:SI 15 r15)
        (reg/f:SI 15 r15)) 20030224-1.c:16 176 {movsi_ie}
     (expr_list:REG_ARGS_SIZE (const_int 0 [0])
        (expr_list:REG_DEAD (reg:SI 76 fr12 [260])
            (nil))))

It seems that the latter (set stack_pointer_rtx stack_pointer_rtx) insn confuses fixup_args_size_notes. The patch below works for me.

--- ORIG/trunk/gcc/expr.c 2011-08-04 10:13:24.0 +0900
+++ trunk/gcc/expr.c 2011-08-04 20:53:14.0 +0900
@@ -3628,6 +3628,8 @@ fixup_args_size_notes (rtx prev, rtx las
               && XEXP (SET_SRC (set), 0) == stack_pointer_rtx
               && CONST_INT_P (XEXP (SET_SRC (set), 1)))
             this_delta = INTVAL (XEXP (SET_SRC (set), 1));
+          else if (SET_SRC (set) == stack_pointer_rtx)
+            this_delta = 0;
           else
             saw_unknown = true;
         }
[Bug rtl-optimization/48596] [4.7 Regression] [SH] unable to find a register to spill in class 'FPUL_REGS'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48596 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-08-02 23:53:30 UTC ---
I was trying to find a way to solve this without penalizing -O2 or higher, though it's not easy for me. It seems that the target's register_move_cost is the way to discourage using FP registers for a pointer. Unfortunately, Pmode is simply SImode in our case, so this also discourages using an FP reg as cheap storage for SImode. I've tried

--- ORIG/trunk/gcc/config/sh/sh.c 2011-08-01 09:22:27.0 +0900
+++ trunk/gcc/config/sh/sh.c 2011-08-01 09:41:25.0 +0900
@@ -11472,8 +11472,18 @@ sh_register_move_cost (enum machine_mode
        && REGCLASS_HAS_GENERAL_REG (srcclass))
       || (REGCLASS_HAS_GENERAL_REG (dstclass)
           && REGCLASS_HAS_FP_REG (srcclass)))
-    return ((TARGET_SHMEDIA ? 4 : TARGET_FMOVD ? 8 : 12)
-            * ((GET_MODE_SIZE (mode) + 7) / 8U));
+    {
+      if (TARGET_SHMEDIA)
+        return 4 * ((GET_MODE_SIZE (mode) + 7) / 8U);
+      else
+        {
+          /* Discourage trying to use fp regs for a pointer.  */
+          int addend = (mode == Pmode) ? 40 : 0;
+
+          return (((TARGET_FMOVD ? 8 : 12) + addend)
+                  * ((GET_MODE_SIZE (mode) + 7) / 8U));
+        }
+    }
 
   if ((dstclass == FPUL_REGS
        && REGCLASS_HAS_GENERAL_REG (srcclass))

on the current trunk and looked at some CSiBE test results. A bit surprisingly, there are no code size regressions and one 2% improvement for teem-1.6.0-src src/bane/gkmsTxf, which shrinks from 3256 bytes to 3192 bytes. Now I'm inclined to apply it on trunk if it passes the bootstrap/regression/other tests.
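To get a feel for the magnitudes in the patch above, here is a minimal stand-alone sketch of the cost formula for moves between FP and general registers. It is not the real sh_register_move_cost: the function name fp_gen_move_cost and its plain-integer parameters are hypothetical stand-ins for TARGET_SHMEDIA, TARGET_FMOVD, GET_MODE_SIZE (mode) and the mode == Pmode test. Since Pmode is SImode (4 bytes) here, every SImode move is treated as a pointer move.

```c
/* Hypothetical model of the patched FP <-> general register move cost.
   shmedia and fmovd stand in for TARGET_SHMEDIA and TARGET_FMOVD,
   mode_size for GET_MODE_SIZE (mode), is_pmode for (mode == Pmode).  */
static unsigned
fp_gen_move_cost (int shmedia, int fmovd, unsigned mode_size, int is_pmode)
{
  /* Number of 8-byte units the mode occupies, rounded up.  */
  unsigned units = (mode_size + 7) / 8U;
  unsigned addend;

  if (shmedia)
    return 4 * units;

  /* Discourage trying to use fp regs for a pointer.  */
  addend = is_pmode ? 40 : 0;
  return ((fmovd ? 8 : 12) + addend) * units;
}
```

Under these assumptions a 4-byte SImode move costs 52 instead of 12 when TARGET_FMOVD is off, and 48 instead of 8 when it is on, while an 8-byte non-pointer mode keeps its old cost.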
[Bug target/49880] SuperH: ICE when -m4 is used with -mdiv=call-div1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49880 --- Comment #2 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-07-31 23:01:17 UTC --- Author: kkojima Date: Sun Jul 31 23:01:14 2011 New Revision: 176990 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176990 Log: PR target/49880 * config/sh/sh.md (udivsi3_i1): Enable for TARGET_DIVIDE_CALL_DIV1. (divsi3_i1): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/config/sh/sh.md
[Bug target/49880] SuperH: ICE when -m4 is used with -mdiv=call-div1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49880
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|shle--netbsdelf             |sh*-*-*
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |ice-on-valid-code
   Last reconfirmed|                            |2011.07.28 22:50:01
                 CC|                            |kkojima at gcc dot gnu.org
               Host|i386--netbsdelf             |
     Ever Confirmed|0                           |1
      Known to fail|                            |4.2.5, 4.3.6, 4.4.7, 4.5.5,
                   |                            |4.6.2, 4.7.0
              Build|i386--netbsdelf             |

--- Comment #1 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-07-28 22:50:01 UTC ---
I've confirmed that trunk and all released compilers fail with -m4 -mdiv=call-div1. I'm testing the patch below.

        * config/sh/sh.md (udivsi3_i1): Enable for TARGET_DIVIDE_CALL_DIV1.
        (divsi3_i1): Likewise.

--- ORIG/trunk/gcc/config/sh/sh.md 2011-07-20 09:27:11.0 +0900
+++ trunk/gcc/config/sh/sh.md 2011-07-28 06:49:41.0 +0900
@@ -1609,7 +1609,7 @@
    (clobber (reg:SI PR_REG))
    (clobber (reg:SI R4_REG))
    (use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "TARGET_SH1 && ! TARGET_SH4"
+  "TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1)"
   "jsr @%1%#"
   [(set_attr "type" "sfunc")
    (set_attr "needs_delay_slot" "yes")])
@@ -1815,7 +1815,7 @@
    (clobber (reg:SI R2_REG))
    (clobber (reg:SI R3_REG))
    (use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "TARGET_SH1 && ! TARGET_SH4"
+  "TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1)"
   "jsr @%1%#"
   [(set_attr "type" "sfunc")
    (set_attr "needs_delay_slot" "yes")])
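The effect of the condition change in the patch above can be sketched as two small predicates. This is only an illustration: old_cond and new_cond are hypothetical names, the three booleans stand in for TARGET_SH1, TARGET_SH4 and TARGET_DIVIDE_CALL_DIV1, and it assumes that -m4 also sets TARGET_SH1.

```c
#include <stdbool.h>

/* Old insn condition: TARGET_SH1 && ! TARGET_SH4.  */
static bool
old_cond (bool sh1, bool sh4, bool call_div1)
{
  (void) call_div1;   /* The old condition ignored the division strategy.  */
  return sh1 && !sh4;
}

/* New insn condition:
   TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1).  */
static bool
new_cond (bool sh1, bool sh4, bool call_div1)
{
  return sh1 && (!sh4 || call_div1);
}
```

With all three flags set, as assumed for -m4 -mdiv=call-div1, old_cond rejects the insn while new_cond accepts it. For every other combination the two predicates agree, so targets without call-div1 on SH4 are unaffected.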
[Bug rtl-optimization/49686] New: [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49686
Summary: [4.7 Regression] CFI notes are missed for delayed slot
Product: gcc
Version: unknown
Status: UNCONFIRMED
Keywords: EH
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: kkoj...@gcc.gnu.org
CC: r...@gcc.gnu.org
Target: sh4-unknown-linux-gnu

Many EH tests fail on SH after the recent dwarf2 clean up. These failures go away with -fno-delayed-branch. A tiny testcase is

int
foo (int a)
{
  if (a)
    bar ();
  return 1;
}

and with -O1 -fexceptions -fnon-call-exceptions, the assembler output of the new compiler starts like

foo:
.LFB0:
        tst     r4,r4
        bt/s    .L2
        sts.l   pr,@-r15

while the old compiler outputs CFI for the last frame related insn sts.l pr,@-r15 in the delayed slot:

foo:
.LFB0:
        tst     r4,r4
.LCFI0:
        bt/s    .L2
        sts.l   pr,@-r15

It seems that dwarf2out_frame_debug emits CFI notes in the middle of the elements of a SEQUENCE and they are lost. The patch below works for me.

--- ORIG/trunk/gcc/dwarf2cfi.c 2011-07-09 14:42:50.0 +0900
+++ trunk/gcc/dwarf2cfi.c 2011-07-09 14:46:18.0 +0900
@@ -2170,11 +2170,10 @@ dwarf2out_frame_debug_expr (rtx expr)
    sets SP or FP (adjusting how we calculate the frame address) or saves a
    register to the stack.  If INSN is NULL_RTX, initialize our state.
 
-   If AFTER_P is false, we're being called before the insn is emitted,
-   otherwise after.  Call instructions get invoked twice.  */
+   Notes are inserted at WHERE.  Call instructions get invoked twice.  */
 
 static void
-dwarf2out_frame_debug (rtx insn, bool after_p)
+dwarf2out_frame_debug (rtx insn, rtx where)
 {
   rtx note, n;
   bool handled_one = false;
@@ -2183,13 +2182,13 @@ dwarf2out_frame_debug (rtx insn, bool af
   /* Remember where we are to insert notes.  Do not separate tablejump
      insns from their ADDR_DIFF_VEC.  Putting the note after the VEC
      should be ok.  */
-  if (after_p)
+  if (insn == where)
     {
       if (!tablejump_p (insn, NULL, &cfi_insn))
-        cfi_insn = insn;
+        cfi_insn = where;
     }
   else
-    cfi_insn = PREV_INSN (insn);
+    cfi_insn = where;
 
   if (!NONJUMP_INSN_P (insn) || clobbers_queued_reg_save (insn))
     dwarf2out_flush_queued_reg_saves ();
@@ -2200,7 +2199,7 @@ dwarf2out_frame_debug (rtx insn, bool af
          matter if the stack pointer is not the CFA register anymore but
          is still used to save registers.  */
       if (!ACCUMULATE_OUTGOING_ARGS)
-        dwarf2out_notice_stack_adjust (insn, after_p);
+        dwarf2out_notice_stack_adjust (insn, (insn == where));
       cfi_insn = NULL;
       return;
     }
@@ -2434,7 +2433,7 @@ create_cfi_notes (void)
 
       if (BARRIER_P (insn))
         {
-          dwarf2out_frame_debug (insn, false);
+          dwarf2out_frame_debug (insn, PREV_INSN (insn));
           continue;
         }
 
@@ -2469,7 +2468,7 @@ create_cfi_notes (void)
       pat = PATTERN (insn);
       if (asm_noperands (pat) >= 0)
         {
-          dwarf2out_frame_debug (insn, false);
+          dwarf2out_frame_debug (insn, PREV_INSN (insn));
           continue;
         }
 
@@ -2477,14 +2476,14 @@ create_cfi_notes (void)
         {
           int i, n = XVECLEN (pat, 0);
           for (i = 1; i < n; ++i)
-            dwarf2out_frame_debug (XVECEXP (pat, 0, i), false);
+            dwarf2out_frame_debug (XVECEXP (pat, 0, i), PREV_INSN (insn));
         }
 
       if (CALL_P (insn)
           || find_reg_note (insn, REG_CFA_FLUSH_QUEUE, NULL))
-        dwarf2out_frame_debug (insn, false);
+        dwarf2out_frame_debug (insn, PREV_INSN (insn));
 
-      dwarf2out_frame_debug (insn, true);
+      dwarf2out_frame_debug (insn, insn);
     }
 }
 
[Bug rtl-optimization/49686] [4.7 Regression] CFI notes are missed for delayed slot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49686 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-07-09 21:13:52 UTC ---
Thanks for the quick fix!

(In reply to comment #1)
> Does the regression look something like this?

For sh, the failures were

FAIL: g++.dg/compat/eh/unexpected1 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: g++.dg/cpp0x/lambda/lambda-eh2.C execution test
FAIL: g++.dg/eh/crossjump1.C execution test
FAIL: g++.dg/eh/unexpected1.C execution test
FAIL: g++.dg/ext/cleanup-10.C execution test
FAIL: g++.dg/ext/cleanup-11.C execution test
FAIL: g++.dg/torture/pr49115.C -O1 execution test
...
FAIL: 18_support/exception_ptr/lifespan.cc execution test
FAIL: 18_support/nested_exception/rethrow_if_nested.cc execution test
FAIL: 18_support/nested_exception/throw_with_nested.cc execution test
FAIL: 20_util/function/1.cc execution test
FAIL: 20_util/hash/chi2_quality.cc execution test
FAIL: 20_util/hash/quality.cc execution test
FAIL: 21_strings/basic_string/append/char/1.cc execution test
FAIL: 21_strings/basic_string/append/wchar_t/1.cc execution test
FAIL: 21_strings/basic_string/cons/char/1.cc execution test
FAIL: 21_strings/basic_string/cons/char/3.cc execution test
...
[Bug target/49468] SH Target: inefficient integer abs code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49468 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-06-27 06:39:40 UTC ---
Argh, I also missed the clobbers. Looks fine to me now, except that insn_and_split *negdi2 forgot to set constraints, plus some minor coding style issues below.

The first comment should start with a capital letter and end with a period. Also please follow the GCC C coding style even for C program segments in the .md file. C lines in the patch start with a tab instead of 2 spaces. A long conditional should be broken like

  (cond
   ? value0
   : value1)

instead of

  (cond ?
   value0 :
   value1)

Please use braces

{
  int low_word = ...
  ...
  emit_insn (...
  DONE;
})

instead of

  int low_word = ...
  ...
  emit_insn (...
  DONE;
)

especially when new variables are used, though those braces aren't required with the current gen* tools.

+ emit_insn (gen_negsi_cond (operands[0], operands[1], operands[1],
+ GEN_INT (1)));

The first line has an extra space after the last comma and the indentation of the 2nd line doesn't match the GCC coding standard. BTW, you could use const[01]_rtx for GEN_INT ([01]):

  emit_insn (gen_negsi_cond (operands[0], operands[1], operands[1],
                             const1_rtx));

There are similar extra white space + broken indentation issues:

+(define_insn_and_split "negsi_cond"
+ [(set (match_operand:SI 0 "arith_reg_dest" "=r,r")
+ (if_then_else:SI (eq:SI (reg:SI T_REG)
+ (match_operand:SI 3 "const_int_operand" "M,N"))
...
+ emit_label_after (skip_neg_label,
+ emit_insn (gen_negsi2 (operands[0], operands[1])));
...

Perhaps a mail or editor problem?
[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #6 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-06-27 05:14:36 UTC --- (In reply to comment #5)
> Anyway, why not just add all the currently known-to-work cases? What are your concerns regarding that? I can imagine that it is a maintenance burden to keep all those definitions and special cases in the MD up-to-date (bit rot etc). Do you have anything other than that in mind?

Yep, maintenance burden, but I don't mean ack/nak for anything. If it's fruitful enough, we should take that route. If it gives a 5% improvement on a usual working set such as CSiBE, hundreds of lines would be OK, but if it's ~0.5% or less, it doesn't look worth adding many patterns for that.

> Isn't there a way to tell the combine pass not to do so, but instead first look deeper at what is in the MD?

I don't know how to do it cleanly.

> I guess this might generate wrong code for e.g. if (x & -2). When x has any bits[31:1] set this must return true. The code after the peephole optimization will look only at the lower 8 bits and would possibly return false for x = 0xFF00, which is wrong. So it should be satisfies_constraint_K08 only, shouldn't it?

You are right. That peephole was simply 'something like this'.
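The wrong-code concern about testing x against -2 with an 8-bit immediate can be checked with a small stand-alone sketch. The helper names test_full, test_imm8 and fits_k08 are hypothetical, and fits_k08 only assumes that the K08 constraint means an unsigned 8-bit constant (0..255); none of this is code from the SH back end.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the source asks for: is any bit of x selected by mask set?  */
static bool
test_full (uint32_t x, int32_t mask)
{
  return (x & (uint32_t) mask) != 0;
}

/* What a tst #imm8,r0 style test can see: only the low 8 bits of mask,
   zero-extended.  */
static bool
test_imm8 (uint32_t x, int32_t mask)
{
  return (x & ((uint32_t) mask & 0xFFu)) != 0;
}

/* Assumed meaning of K08: the constant is in the range 0..255.  */
static bool
fits_k08 (int32_t mask)
{
  return mask >= 0 && mask <= 255;
}
```

For x = 0xFF00 and mask = -2, test_full is true but test_imm8 is false, which is the miscompilation described above. Since fits_k08 (-2) is false, guarding the transformation with that range check would exclude this case, and for masks inside the range the two tests agree.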
[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #4 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-06-22 22:34:04 UTC ---
Yes, that peephole doesn't catch all the patterns which could make use of tst #imm8,r0. Perhaps it would be a good idea to get numbers from a test such as CSiBE with the vanilla compiler and with the compiler patched with the new insns/peepholes. If something covers 80% of the possible cases in the usual working set, it would be successful enough for such a micro-optimization, I guess.

The cost patch looks fine to me. Could you propose it as a separate patch on the gcc-patches list with an appropriate ChangeLog entry? When proposing it, please mention how you've tested it. The numbers you got with the patch are also highly welcome.

BTW, do you have an FSF copyright assignment for your GCC work? Although the cost patch itself is essentially several lines, which doesn't require a copyright assignment, the other changes you've proposed clearly require the paperwork, I think.
[Bug target/49468] SH Target: inefficient integer abs code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49468
Kazumoto Kojima kkojima at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-06-22 22:37:28 UTC ---
On sh4-unknown-linux-gnu, this patch causes two new failures in the libstdc++ testsuite:

FAIL: 27_io/basic_ostream/inserters_arithmetic/char/7.cc execution test
FAIL: 27_io/basic_ostream/inserters_arithmetic/wchar_t/7.cc execution test

I can't find any differences between the code generated for those test cases by the compilers with and without your patch, and the failures go away if the tests are run with a libstdc++ library built with the unpatched compiler. So it seems that something in the libstdc++ library is miscompiled. Weird and hard to see what is going on, ATM.
[Bug target/49307] [4.5/4.6/4.7 Regression] ICE in spill_failure, at reload1.c:2113
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49307 --- Comment #4 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-06-16 22:02:48 UTC --- Author: kkojima Date: Thu Jun 16 22:02:45 2011 New Revision: 175116 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175116 Log: PR target/49307 * config/sh/sh.md (UNSPEC_CHKADD): New. (chk_guard_add): New define_insn_and_split. (symGOT_load): Use chk_guard_add instead of blockage. Added: branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr49307.c Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/config/sh/sh.md branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
[Bug target/49307] [4.5/4.6/4.7 Regression] ICE in spill_failure, at reload1.c:2113
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49307 --- Comment #5 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-06-16 22:08:23 UTC --- Author: kkojima Date: Thu Jun 16 22:08:20 2011 New Revision: 175118 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175118 Log: PR target/49307 * config/sh/sh.md (UNSPEC_CHKADD): New. (chk_guard_add): New define_insn_and_split. (symGOT_load): Use chk_guard_add instead of blockage. Added: branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr49307.c Modified: branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/config/sh/sh.md branches/gcc-4_5-branch/gcc/testsuite/ChangeLog