Re: [Patch,AVR]: Improve loading of 32-bit constants
2011/7/6 Georg-Johann Lay a...@gjlay.de:
Denis Chertykov wrote:
I have asked about an example of *d instead of !d. Just svn GCC with *d vs svn GCC with !d.
Denis.

Is the patch ok with the original !d instead of *d?

Ok.
Denis.
Re: [1/11] Use targetm.shift_truncation_mask more consistently
Bernd Schmidt ber...@codesourcery.com writes: On 07/06/11 20:06, Richard Sandiford wrote: Bernd Schmidt ber...@codesourcery.com writes: At some point we've grown a shift_truncation_mask hook, but we're not using it everywhere we're masking shift counts. This patch changes the instances I found. The documentation reads: Note that, unlike @code{SHIFT_COUNT_TRUNCATED}, this function does @emph{not} apply to general shift rtxes; it applies only to instructions that are generated by the named shift patterns. Ouch. That is one seriously misnamed hook then. Yeah. I take the blame for that, sorry :-( I think you need to update the documentation, and check that existing target definitions do in fact apply to shift rtxes as well. Until I can do that, I've reverted this patch. Thanks. Richard
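
The distinction drawn in the quoted documentation is about when a shift count may be assumed to be masked. A minimal sketch of what "truncating a shift count by a mask" means, assuming a 32-bit operand; `toy_shift_left` and its convention that a mask of 0 means "no truncation guaranteed" are illustrative and are not GCC code:

```c
#include <stdint.h>

/* Hypothetical model, not GCC code: shift VALUE left by COUNT, first
   truncating COUNT with MASK.  A MASK of 0 stands for "the target makes
   no truncation guarantee", in which case the count is used as given.  */
static uint32_t
toy_shift_left (uint32_t value, unsigned int count, unsigned int mask)
{
  if (mask != 0)
    count &= mask;

  /* Shifting a 32-bit value by 32 or more is undefined in C, so model
     the "all bits shifted out" result explicitly.  */
  return count < 32 ? value << count : 0;
}
```

With a mask of 31, a count of 33 behaves like a count of 1; with a mask of 0 the same count shifts every bit out. Code that simplifies shift counts has to know which of the two behaviours actually applies to the rtx in front of it, which is the concern raised in the review.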
Re: [CFT][PATCH 0/6] Move dwarf2 cfi creation to a new pass
On 7 Jul 2011, at 00:15, Bernd Schmidt wrote: On 07/03/11 22:01, Richard Henderson wrote: Bernd's original patch to optimize dwarf2 cfi for shrink-wrapping is difficult to analyze because that optimization was done via a random debugging hook during final, and the cfi notes are deleted at the end of final so that we don't get debug comparison failures. By pulling the note creation out to a separate pass, we can dump the notes and thus debug the optimization. So far I've tested this only on x86_64-linux. It needs a bit more testing across other targets before going in. Any help that can be given there would be welcome. I'm trying to help by running ARM tests, but I've managed to screw up by running out of disk space, so I'm starting again from scratch now. I've run once through on i686-darwin9 (on the basis that it should make no difference, that seems to be the case). I still need to figure out a way to suppress DW2 epilogue info in unwind frames (for Darwin variants that can't handle them) ... ... will try and merge my patch-in-progress with your changes. Iain
RFA: Fix bogus mode in choose_reload_regs
This patch fixes an ICE in smallest_mode_for_size on the attached testcase. The smallest_mode_for_size call comes from this part of the reload inheritance code in choose_reload_regs:

    if (byte == 0)
      need_mode = mode;
    else
      need_mode
        = smallest_mode_for_size
            (GET_MODE_BITSIZE (mode) + byte * BITS_PER_UNIT,
             GET_MODE_CLASS (mode) == MODE_PARTIAL_INT
             ? MODE_INT : GET_MODE_CLASS (mode));

    if ((GET_MODE_SIZE (GET_MODE (last_reg))
         >= GET_MODE_SIZE (need_mode))

Here we have found that the pseudo register we need was last reloaded into LAST_REG. The mode size check is making sure LAST_REG defines every byte of the value we need (which is at byte offset BYTE and has mode MODE).

In the attached testcase, LAST_REG is XImode (a 256-bit integer), and the value we need is the last vector quarter of it. BYTE is 24 and MODE is V4SF. The problem is that we then look for a 256-bit vector:

    smallest_mode_for_size (64 + 24 * 8, MODE_VECTOR_FLOAT)

but no such mode exists.

Note that this is the only use of need_mode. I don't believe the mode that is being calculated here is fundamental in any way, or that it's used later in the reload process. We have already checked that the mode change is allowed:

    #ifdef CANNOT_CHANGE_MODE_CLASS
              /* Verify that the register it's in can be used in
                 mode MODE.  */
              && !REG_CANNOT_CHANGE_MODE_P (REGNO (reg_last_reload_reg[regno]),
                                            GET_MODE (reg_last_reload_reg[regno]),
                                            mode)
    #endif

and have already calculated which hard register we would need to use after the mode change:

    i = REGNO (last_reg);
    i += subreg_regno_offset (i, GET_MODE (last_reg), byte, mode);

So once we have verified that the register is suitable, we can (and do) simply use register I in mode MODE.

I think the current mode is a historical left-over.
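
The failing lookup described above can be modelled in a few lines. This is a simplified sketch, not GCC code: the mode table, the `toy_` names and the NULL return are all assumptions made for illustration (per the report above, the real smallest_mode_for_size hits an ICE where this toy returns NULL).

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's mode classes.  */
enum toy_class { TOY_INT, TOY_VECTOR_FLOAT };

struct toy_mode
{
  const char *name;
  enum toy_class cls;
  unsigned int bits;
};

/* An illustrative mode table; the real set depends on the target.
   Note that it has a 256-bit integer but no 256-bit float vector.  */
static const struct toy_mode toy_modes[] = {
  { "int64", TOY_INT, 64 },
  { "int128", TOY_INT, 128 },
  { "int256", TOY_INT, 256 },
  { "vfloat64", TOY_VECTOR_FLOAT, 64 },
  { "vfloat128", TOY_VECTOR_FLOAT, 128 },
};

/* Return the narrowest mode of class CLS with at least BITS bits,
   or NULL if the table has none.  */
static const struct toy_mode *
toy_smallest_mode_for_size (unsigned int bits, enum toy_class cls)
{
  const struct toy_mode *best = NULL;
  size_t i;

  for (i = 0; i < sizeof toy_modes / sizeof toy_modes[0]; i++)
    if (toy_modes[i].cls == cls
        && toy_modes[i].bits >= bits
        && (best == NULL || toy_modes[i].bits < best->bits))
      best = &toy_modes[i];

  return best;
}
```

Asking this toy for `64 + 24 * 8` bits in the vector-float class finds nothing, while the same request in the integer class succeeds. That asymmetry is why a size check phrased in terms of "a mode of the same class that is big enough" can fail even though the register itself is perfectly usable.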
Back in 2000 this code was a simple check that the old register entirely encompassed the new one:

    i = REGNO (last_reg) + word;
    last_class = REGNO_REG_CLASS (i);
    if ((GET_MODE_SIZE (GET_MODE (last_reg))
         >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)

The register we were interested in was (reg:MODE I), and this check made sure that the old reload register defined every byte of (reg:MODE I).

When CLASS_CANNOT_CHANGE_SIZE was introduced, the code became:

    i = REGNO (last_reg) + word;
    last_class = REGNO_REG_CLASS (i);
    if (
    #ifdef CLASS_CANNOT_CHANGE_SIZE
        (TEST_HARD_REG_BIT
         (reg_class_contents[CLASS_CANNOT_CHANGE_SIZE], i)
         ? (GET_MODE_SIZE (GET_MODE (last_reg))
            == GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)
         : (GET_MODE_SIZE (GET_MODE (last_reg))
            >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD))
    #else
        (GET_MODE_SIZE (GET_MODE (last_reg))
         >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)
    #endif

But I think this was bogus. The new size of the register was:

    GET_MODE_SIZE (mode)

rather than:

    GET_MODE_SIZE (mode) + word * UNITS_PER_WORD

Maybe something like:

    word == 0
    && GET_MODE_SIZE (mode) == GET_MODE_SIZE (GET_MODE (last_reg))

would have been more accurate.

Anyway, CLASS_CANNOT_CHANGE_SIZE proved to be too limited, so it was replaced with CLASS_CANNOT_CHANGE_MODE. The code above then became:

    need_mode = smallest_mode_for_size ((word+1) * UNITS_PER_WORD,
                                        GET_MODE_CLASS (mode));

    if (
    #ifdef CLASS_CANNOT_CHANGE_MODE
        (TEST_HARD_REG_BIT
         (reg_class_contents[(int) CLASS_CANNOT_CHANGE_MODE], i)
         ? ! CLASS_CANNOT_CHANGE_MODE_P (GET_MODE (last_reg),
                                         need_mode)
         : (GET_MODE_SIZE (GET_MODE (last_reg))
            >= GET_MODE_SIZE (need_mode)))
    #else
        (GET_MODE_SIZE (GET_MODE (last_reg))
         >= GET_MODE_SIZE (need_mode))
    #endif

with need_mode providing a mode of the same size as the then-preexisting size check.
I think this mode is bogus for the same reason, and in 2005 I changed the final mode argument from need_mode to mode: http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01665.html That patch also fixed the smallest_mode_for_size argument so that it was a bit count rather than a byte count. Unfortunately, it seems I failed to realise that need_mode was in fact completely
Re: [PATCH, testsuite] Fix for PR49519, miscompiled 447.dealII in SPEC CPU 2006
Let me try again: I've prepared a patch for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519

It fixes a problem in tailcall optimization: the check for stack overlap was not strict enough. The patch adds another check for the clobbered stack area. If an address comes from a register, we have no idea where that address points. That means we must act conservatively: the address possibly overlaps the stack area of interest, so we should not perform tailcall optimization.

ChangeLog entry:

2011-07-06  Kirill Yukhin  kirill.yuk...@intel.com

    PR middle-end/49519
    * calls.c (mem_overlaps_already_clobbered_arg_p): Additional check if address is stored in register. If so - give up.
    (check_sibcall_argument_overlap_1): Do not perform check of overlapping when it is call to address.

testsuite/ChangeLog entry:

2011-07-06  Kirill Yukhin  kirill.yuk...@intel.com

    * g++.dg/torture/pr49519.C: New test for tailcall fix.

Bootstrapped; the new test fails without the patch and passes when it is applied. This fixes the SPEC2006/447.dealII miscompile.

Ok for trunk?

Thanks, K

Attachment: pr49519-1.gcc.patch (binary data)
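
The conservative rule described in this message can be sketched as follows. This is an illustrative model only, assuming a byte map of the outgoing argument area; the `toy_` types and function are hypothetical and do not reproduce the actual code in calls.c.

```c
#include <stdbool.h>

/* Hypothetical classification of where an address comes from.  */
enum toy_addr_kind
{
  TOY_ADDR_ARG_OFFSET,  /* known offset into the argument area */
  TOY_ADDR_REGISTER     /* held in a register, destination unknown */
};

struct toy_addr
{
  enum toy_addr_kind kind;
  int offset;           /* meaningful for TOY_ADDR_ARG_OFFSET only */
  int size;             /* bytes accessed */
};

/* Return true if a read through ADDR might touch a byte of the argument
   area that has already been overwritten.  CLOBBERED[i] is true when
   byte i of the AREA_SIZE-byte area has been clobbered.  */
static bool
toy_may_overlap_clobbered (const struct toy_addr *addr,
                           const bool *clobbered, int area_size)
{
  int i;

  /* Unknown destination: assume the worst and give up.  */
  if (addr->kind == TOY_ADDR_REGISTER)
    return true;

  for (i = addr->offset; i < addr->offset + addr->size; i++)
    if (i >= 0 && i < area_size && clobbered[i])
      return true;

  return false;
}
```

A caller that gets `true` back would simply not perform the sibling call, trading a missed optimization for correctness.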
Re: [testsuite] fixes for gcc.target/arm/mla-1.c
OK for trunk, and for 4.6 in a few days if no problems? This is OK. Thanks, Ramana
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner meiss...@linux.vnet.ibm.com wrote:

This patch adds an option to not load the static chain (r11) for 64-bit PowerPC calls through function pointers (or virtual functions). Most of the languages on the PowerPC do not need the static chain to be loaded when called, and adding this instruction can slow down code that calls very short functions.

In addition, if the function does not call alloca, setjmp or deal with exceptions where the stack is modified, the compiler can move the store of the TOC value for the current function to the prologue of the function, rather than at each call site.

The effect of these patches is to speed up 464.h264ref in the Spec 2006 benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the save of the TOC register is hoisted). I believe this is due to the load of the current function's TOC (r2) having to wait until the store queue is drained with the store just before the call. Unfortunately, I do see a 3% slowdown in 429.mcf, and I don't know what the cause is.

I have bootstrapped the compiler and saw that there were no regressions in make check. Is it ok to install in the trunk?

Hum. Can't the compiler figure this out itself per-call-site? At least the name of the command-line switch -m[no-]r11 is meaningless to me. Points-to information should be able to tell you if the function pointer points to a nested function.

Richard.

[gcc] 2011-07-06 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/rs6000-protos.h (rs6000_call_indirect_aix): New declaration. (rs6000_save_toc_in_prologue_p): Ditto. * config/rs6000/rs6000.opt (-mr11): New switch to disable loading up the static chain (r11) during indirect function calls. (-msave-toc-indirect): New undocumented debug switch. * config/rs6000/rs6000.c (struct machine_function): Add save_toc_in_prologue field to note whether the prologue needs to save the TOC value in the reserved stack location.
(rs6000_emit_prologue): Use TOC_REGNUM instead of 2. If we need to save the TOC in the prologue, do so. (rs6000_trampoline_init): Don't allow creating AIX style trampolines if -mno-r11 is in effect. (rs6000_call_indirect_aix): New function to create AIX style indirect calls, adding support for -mno-r11 to suppress loading the static chain, and saving the TOC in the prologue instead of the call body. (rs6000_save_toc_in_prologue_p): Return true if we are saving the TOC in the prologue. * config/rs6000/rs6000.md (STACK_POINTER_REGNUM): Add more fixed register numbers. (TOC_REGNUM): Ditto. (STATIC_CHAIN_REGNUM): Ditto. (ARG_POINTER_REGNUM): Ditto. (SFP_REGNO): Delete, unused. (TOC_SAVE_OFFSET_32BIT): Add constants for AIX TOC save and function descriptor offsets. (TOC_SAVE_OFFSET_64BIT): Ditto. (AIX_FUNC_DESC_TOC_32BIT): Ditto. (AIX_FUNC_DESC_TOC_64BIT): Ditto. (AIX_FUNC_DESC_SC_32BIT): Ditto. (AIX_FUNC_DESC_SC_64BIT): Ditto. (ptrload): New mode attribute for the appropriate load of a pointer. (call_indirect_aix32): Delete, rewrite AIX indirect function calls. (call_indirect_aix64): Ditto. (call_value_indirect_aix32): Ditto. (call_value_indirect_aix64): Ditto. (call_indirect_nonlocal_aix32_internal): Ditto. (call_indirect_nonlocal_aix32): Ditto. (call_indirect_nonlocal_aix64_internal): Ditto. (call_indirect_nonlocal_aix64): Ditto. (call): Rewrite AIX indirect function calls. Add support for eliminating the static chain, and for moving the save of the TOC to the function prologue. (call_value): Ditto. (call_indirect_aixptrsize): Ditto. (call_indirect_aixptrsize_internal): Ditto. (call_indirect_aixptrsize_internal2): Ditto. (call_indirect_aixptrsize_nor11): Ditto. (call_value_indirect_aixptrsize): Ditto. (call_value_indirect_aixptrsize_internal): Ditto. (call_value_indirect_aixptrsize_internal2): Ditto. (call_value_indirect_aixptrsize_nor11): Ditto. (call_nonlocal_aix32): Relocate in the rs6000.md file. (call_nonlocal_aix64): Ditto. 
* doc/invoke.texi (RS/6000 and PowerPC Options): Add -mr11 and -mno-r11 documentation. [gcc/testsuite] 2011-07-06 Michael Meissner meiss...@linux.vnet.ibm.com * gcc.target/powerpc/no-r11-1.c: New test for -mr11, -mno-r11. * gcc.target/powerpc/no-r11-2.c: Ditto. * gcc.target/powerpc/no-r11-3.c: Ditto. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
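
To make the static-chain discussion concrete, here is a simplified model of a descriptor-based indirect call. It is a sketch under assumptions: the three-field descriptor (entry point, TOC, static chain) mirrors the items the ChangeLog names, but the `toy_` types, the explicit chain parameter and the `load_static_chain` flag are inventions for illustration, not the rs6000 implementation.

```c
#include <stddef.h>

/* In this model the static chain is passed as an explicit first
   argument; on the real target it would live in a register (r11).  */
typedef long (*toy_entry_fn) (void *static_chain, long arg);

/* Hypothetical function descriptor: entry point, TOC pointer and
   static chain, in that order.  */
struct toy_func_desc
{
  toy_entry_fn entry;
  void *toc;
  void *static_chain;
};

/* An ordinary function: never looks at the static chain.  */
static long
toy_plain_callee (void *static_chain, long arg)
{
  (void) static_chain;
  return arg + 1;
}

/* Stand-in for a nested function: reads a variable of its enclosing
   function through the static chain.  */
static long
toy_nested_callee (void *static_chain, long arg)
{
  const long *outer_local = static_chain;
  return *outer_local + arg;
}

/* Call through FD.  When LOAD_STATIC_CHAIN is zero the chain load is
   skipped, which is only safe if the callee ignores the chain.  */
static long
toy_call_indirect (const struct toy_func_desc *fd, long arg,
                   int load_static_chain)
{
  void *chain = load_static_chain ? fd->static_chain : NULL;
  return fd->entry (chain, arg);
}
```

Calling `toy_plain_callee` gives the same result with or without the chain load, whereas `toy_nested_callee` needs it. Knowing per call site which kind of callee is possible is what the points-to suggestion in the reply is after.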
Re: Remove obsolete %[] specs operator
On Thu, Jul 7, 2011 at 2:03 AM, Joseph S. Myers jos...@codesourcery.com wrote: The %[] spec operator is marked as obsolete and not used by any specs in GCC; I'm also not sure it would work properly now the canonical form of -D options is defined to have separate argument. This patch removes support for that obsolete operator. Bootstrapped with no regressions on x86_64-unknown-linux-gnu. OK to commit? Ok. Thanks, Richard. 2011-07-06 Joseph Myers jos...@codesourcery.com * gcc.c (%[Spec]): Don't document. (struct spec_list): Update comment. (do_spec_1): Don't handle %[Spec]. * doc/invoke.texi (%[@var{name}]): Remove documentation of spec. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 175919) +++ gcc/doc/invoke.texi (working copy) @@ -9768,9 +9768,6 @@ Use this when inconsistent options are d @item %(@var{name}) Substitute the contents of spec string @var{name} at this point. -@item %[@var{name}] -Like @samp{%(@dots{})} but put @samp{__} around @option{-D} arguments. - @item %x@{@var{option}@} Accumulate an option for @samp{%X}. Index: gcc/gcc.c === --- gcc/gcc.c (revision 175919) +++ gcc/gcc.c (working copy) @@ -438,7 +438,6 @@ or with constant text in a single argume This may be combined with '.', '!', ',', '|', and '*' as above. %(Spec) processes a specification defined in a specs file as *Spec: - %[Spec] as above, but put __ around -D arguments The conditional text X in a %{S:X} or similar construct may contain other nested % constructs or spaces, or even newlines. They are @@ -1149,8 +1148,8 @@ static const char *multilib_dir; static const char *multilib_os_dir; /* Structure to keep track of the specs that have been defined so far. - These are accessed using %(specname) or %[specname] in a compiler - or link spec. */ + These are accessed using %(specname) in a compiler or link + spec. */ struct spec_list { @@ -5212,11 +5211,7 @@ do_spec_1 (const char *spec, int inswitc /* Process a string found as the value of a spec given by name. 
This feature allows individual machine descriptions - to add and use their own specs. - %[...] modifies -D options the way %P does; - %(...) uses the spec unmodified. */ - case '[': - warning (0, use of obsolete %%[ operator in specs); + to add and use their own specs. */ case '(': { const char *name = p; @@ -5225,7 +5220,7 @@ do_spec_1 (const char *spec, int inswitc /* The string after the S/P is the name of a spec that is to be processed. */ - while (*p *p != ')' *p != ']') + while (*p *p != ')') p++; /* See if it's in the list. */ @@ -5234,63 +5229,20 @@ do_spec_1 (const char *spec, int inswitc { name = *(sl-ptr_spec); #ifdef DEBUG_SPECS - fnotice (stderr, Processing spec %c%s%c, which is '%s'\n, - c, sl-name, (c == '(') ? ')' : ']', name); + fnotice (stderr, Processing spec (%s), which is '%s'\n, + sl-name, name); #endif break; } if (sl) { - if (c == '(') - { - value = do_spec_1 (name, 0, NULL); - if (value != 0) - return value; - } - else - { - char *x = (char *) alloca (strlen (name) * 2 + 1); - char *buf = x; - const char *y = name; - int flag = 0; - - /* Copy all of NAME into BUF, but put __ after - every -D and at the end of each arg. */ - while (1) - { - if (! strncmp (y, -D, 2)) - { - *x++ = '-'; - *x++ = 'D'; - *x++ = '_'; - *x++ = '_'; - y += 2; - flag = 1; - continue; - } - else if (flag - (*y == ' ' || *y == '\t' || *y == '=' - || *y == '}' || *y == 0)) - { - *x++ = '_'; - *x++ = '_'; - flag = 0; - } -
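
For readers who never met the removed operator: per the deleted documentation, %[name] behaved like %(name) but put "__" around -D arguments. The standalone sketch below shows that text transformation. It loosely follows the removed loop visible in the diff, but the parts the excerpt cuts off are filled in by assumption, the function name is hypothetical, and the buffer is sized more generously than in the removed code.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of SPEC in which every "-D" is followed
   by "__" and the macro name that follows is terminated by "__".  The
   name is taken to end at a space, tab, '=', '}' or end of string.
   Returns NULL if allocation fails; the caller frees the result.  */
static char *
toy_underscore_defines (const char *spec)
{
  /* Each 2-character "-D" adds at most 4 characters, so three times
     the input length (plus the terminator) is always enough.  */
  char *buf = malloc (strlen (spec) * 3 + 1);
  char *x = buf;
  const char *y = spec;
  int in_define = 0;

  if (buf == NULL)
    return NULL;

  for (;;)
    {
      if (strncmp (y, "-D", 2) == 0)
        {
          memcpy (x, "-D__", 4);
          x += 4;
          y += 2;
          in_define = 1;
          continue;
        }

      if (in_define
          && (*y == ' ' || *y == '\t' || *y == '='
              || *y == '}' || *y == '\0'))
        {
          *x++ = '_';
          *x++ = '_';
          in_define = 0;
        }

      if (*y == '\0')
        break;

      *x++ = *y++;
    }

  *x = '\0';
  return buf;
}
```

For example, the input `-Dunix -Dvax=1` comes back as `-D__unix__ -D__vax__=1`.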
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:

Hum. Can't the compiler figure this out itself per-call-site? At least the name of the command-line switch -m[no-]r11 is meaningless to me. Points-to information should be able to tell you if the function pointer points to a nested function.

Yeah. E.g. for C++ virtual method calls I believe all function pointers in vtables should always ignore the static chain pointer, etc., because you can't have a nested method.

Jakub
Re: plugin event for C/C++ declarations
On Tue, Dec 22, 2009 at 11:45 AM, Diego Novillo dnovi...@google.com wrote:
On Tue, Dec 22, 2009 at 13:00, Brian Hackett bhackett1...@gmail.com wrote:

Hi, this patch adds a new plugin event FINISH_DECL, which is invoked at every finish_decl in the C and C++ frontends. Previously there did not seem to be a way for a plugin to see the definition for a global that is never used in the input file, or the initializer for a global which is declared before a function but defined after. This event isn't restricted to just globals though, but also covers locals, fields, and parameters (C frontend only).

Thanks for your patch. This will be great to fix http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41757 but we need to wait for your copyright assignment to go through before we can accept it.

Hi, this is a patch from a few months ago which I was not able to get an assignment for. The FSF has a personal copyright assignment for me, but I could not get one from my employer at the time, Stanford (according to Stanford's policies they would not claim copyright on this patch). I now work for Mozilla, which (I understand) has a company-wide copyright assignment. Are there issues if I rewrite the patch from scratch and resubmit it?

Original patch (9 new lines of code, doc change and new regression): http://gcc.gnu.org/ml/gcc-patches/2009-12/msg01032.html

Brian

Hi,

Once again, this is a ping for the long-proposed patch by Brian Hackett. See the last thread about it here: http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00315.html

Find below the patch, fixed for a recent revision (changed gcc/testsuite/g++.dg/plugin/decl_plugin.c global and local var decl detection).

Romain Geissler

2011-07-07 Romain Geissler romain.geiss...@gmail.com 2010-04-14 Brian Hackett bhackett1...@gmail.com gcc/ChangeLog: * plugin.def: Add event for finish_decl. * plugin.c (register_callback, invoke_plugin_callbacks): Same. * c-decl.c (finish_decl): Invoke callbacks on above event.
* doc/plugins.texi: Document above event. gcc/cp/ChangeLog: * decl.c (cp_finish_decl): Invoke callbacks on finish_decl event. gcc/testsuite/ChangeLog: * g++.dg/plugin/decl_plugin.c: New test plugin. * g++.dg/plugin/decl-plugin-test.C: Testcase for above plugin. * g++.dg/plugin/plugin.exp: Add above testcase. Index: gcc/doc/plugins.texi === --- gcc/doc/plugins.texi(revision 175907) +++ gcc/doc/plugins.texi(working copy) @@ -151,6 +151,7 @@ enum plugin_event @{ PLUGIN_PASS_MANAGER_SETUP,/* To hook into pass manager. */ PLUGIN_FINISH_TYPE, /* After finishing parsing a type. */ + PLUGIN_FINISH_DECL, /* After finishing parsing a declaration. */ PLUGIN_FINISH_UNIT, /* Useful for summary processing. */ PLUGIN_PRE_GENERICIZE,/* Allows to see low level AST in C and C++ frontends. */ PLUGIN_FINISH,/* Called before GCC exits. */ Index: gcc/plugin.def === --- gcc/plugin.def (revision 175907) +++ gcc/plugin.def (working copy) @@ -24,6 +24,9 @@ DEFEVENT (PLUGIN_PASS_MANAGER_SETUP) /* After finishing parsing a type. */ DEFEVENT (PLUGIN_FINISH_TYPE) +/* After finishing parsing a declaration. */ +DEFEVENT (PLUGIN_FINISH_DECL) + /* Useful for summary processing. 
*/ DEFEVENT (PLUGIN_FINISH_UNIT) Index: gcc/testsuite/g++.dg/plugin/plugin.exp === --- gcc/testsuite/g++.dg/plugin/plugin.exp (revision 175907) +++ gcc/testsuite/g++.dg/plugin/plugin.exp (working copy) @@ -51,7 +51,8 @@ set plugin_test_list [list \ { pragma_plugin.c pragma_plugin-test-1.C } \ { selfassign.c self-assign-test-1.C self-assign-test-2.C self-assign-test-3.C } \ { dumb_plugin.c dumb-plugin-test-1.C } \ -{ header_plugin.c header-plugin-test.C } ] +{ header_plugin.c header-plugin-test.C } \ +{ decl_plugin.c decl-plugin-test.C } ] foreach plugin_test $plugin_test_list { # Replace each source file with its full-path name Index: gcc/testsuite/g++.dg/plugin/decl-plugin-test.C === --- gcc/testsuite/g++.dg/plugin/decl-plugin-test.C (revision 0) +++ gcc/testsuite/g++.dg/plugin/decl-plugin-test.C (revision 0) @@ -0,0 +1,32 @@ + + +extern int global; // { dg-warning Decl Global global } +int global_array[] = { 1, 2, 3 }; // { dg-warning Decl Global global_array } + +int takes_args(int arg1, int arg2) +{ + int local = arg1 + arg2 + global; // { dg-warning Decl Local local } + return local + 1; +} + +int global = 12; // { dg-warning Decl Global global } + +struct test_str { + int field; // { dg-warning Decl Field field } +}; + +class test_class { + int class_field1; // { dg-warning Decl Field class_field1 } + int
Re: [v3] Correctly determine baseline_subdir for 64-bit default Solaris gcc
Hi, Ok for mainline if that passes? I'm going to trust you Rainer on this and it seems very safe on x86_64-linux anyway. Please wait just one more day or so and then check it in. Thanks, Paolo.
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
On Thu, Jul 7, 2011 at 11:03 AM, Jakub Jelinek ja...@redhat.com wrote:
On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:

Hum. Can't the compiler figure this out itself per-call-site? At least the name of the command-line switch -m[no-]r11 is meaningless to me. Points-to information should be able to tell you if the function pointer points to a nested function.

Yeah. E.g. for C++ virtual method calls I believe all function pointers in vtables should always ignore the static chain pointer, etc., because you can't have a nested method.

For this kind of FE specific info you could use a flag on the CALL_EXPR as well.

Richard.
Re: [v3] Correctly determine baseline_subdir for 64-bit default Solaris gcc
Hi Paolo, Ok for mainline if that passes? I'm going to trust you Rainer on this and it seems very safe on x86_64-linux anyway. Please wait just one more day or so and then check it in. ok, will do. The x86_64-unknown-linux-gnu bootstrap has completed without regressions and the correct baselines were used for both multilibs. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [Patch, AVR]: Fix PR46779
Denis Chertykov wrote:
2011/6/27 Georg-Johann Lay:
Denis Chertykov wrote:

The main problem for me is that the new addressing mode produces worse code in many tests.

You have an example source?

In attachment.
Denis.

Hi Denis. I had a look at the sources you sent.

sort.c:
===

There is some difference because of register allocation, but the new code does not look awfully bad, just a bit different because of different register allocation that might need some more bytes. The difference is *not* because of denying fake X addressing; it's because of the new avr_hard_regno_mode_ok implementation to fix PR46779. When I add

      if (GET_MODE_SIZE (mode) == 1)
        return 1;

    + if (SImode == mode && regno == 28)
    +   return 0;

      return regno % 2 == 0;

to that function, the difference in code disappears.

pr.c:
=

I get the following sizes, with pr-0 the original compile and pr with my patch:

    avr-size pr-0.o
       text    data     bss     dec     hex filename
       2824      24       0    2848     b20 pr-0.o

    avr-size pr.o
       text    data     bss     dec     hex filename
       2564      24       0    2588     a1c pr.o

So the size actually decreased significantly. Avoiding SI in avr_hard_regno_mode_ok like above does not change code size.

Note that I did *not* use the version from the git repository; I could not get reasonable code out of it (even after some fixes). Hundreds of testsuite crashes... I used the initial patch that I posted; I attached it again for reference. Note that LEGITIMIZE_RELOAD_ADDRESS is still not implemented there.

Did you decide about the fix for PR46779? http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html Is it ok to commit?

I think the fix for PR46779 and the fix for fake X addresses (PR46278) should be separate patches and not intermixed.
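
The predicate fragment quoted in the message can be tried out in isolation. The sketch below is a standalone model, not the AVR backend: `toy_mode` encodes a mode simply as its size in bytes, and only the three tests shown in the message are reproduced.

```c
/* Hypothetical modes, encoded as their size in bytes.  */
enum toy_mode { TOY_QI = 1, TOY_HI = 2, TOY_SI = 4 };

/* Model of the quoted fragment: may a value of MODE start at hard
   register REGNO?  */
static int
toy_hard_regno_mode_ok (int regno, enum toy_mode mode)
{
  /* Single-byte values fit in any register.  */
  if ((int) mode == 1)
    return 1;

  /* The extra test from the message: no 4-byte value starting at
     register 28.  */
  if (mode == TOY_SI && regno == 28)
    return 0;

  /* Multi-byte values must start at an even register.  */
  return regno % 2 == 0;
}
```

With this model a 2-byte value may still start at register 28, a 4-byte value may not, and any multi-byte value starting at an odd register is rejected.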
Johann Index: config/avr/avr.md === --- config/avr/avr.md (revision 175956) +++ config/avr/avr.md (working copy) @@ -246,8 +246,8 @@ (define_expand movqi ) (define_insn *movqi - [(set (match_operand:QI 0 nonimmediate_operand =r,d,Qm,r,q,r,*r) - (match_operand:QI 1 general_operand rL,i,rL,Qm,r,q,i))] + [(set (match_operand:QI 0 nonimmediate_operand =r,d,m,r,q,r,*r) + (match_operand:QI 1 general_operand rL,i,rL,m,r,q,i))] (register_operand (operands[0],QImode) || register_operand (operands[1], QImode) || const0_rtx == operands[1]) * return output_movqi (insn, operands, NULL); @@ -295,15 +295,6 @@ (define_expand movhi } }) -(define_insn *movhi_sp - [(set (match_operand:HI 0 register_operand =q,r) -(match_operand:HI 1 register_operand r,q))] - ((stack_register_operand(operands[0], HImode) register_operand (operands[1], HImode)) -|| (register_operand (operands[0], HImode) stack_register_operand(operands[1], HImode))) - * return output_movhi (insn, operands, NULL); - [(set_attr length 5,2) - (set_attr cc none,none)]) - (define_insn movhi_sp_r_irq_off [(set (match_operand:HI 0 stack_register_operand =q) (unspec_volatile:HI [(match_operand:HI 1 register_operand r)] @@ -427,8 +418,8 @@ (define_insn *reload_insi (define_insn *movsi - [(set (match_operand:SI 0 nonimmediate_operand =r,r,r,Qm,!d,r) -(match_operand:SI 1 general_operand r,L,Qm,rL,i,i))] + [(set (match_operand:SI 0 nonimmediate_operand =r,r,r,m,!d,r) +(match_operand:SI 1 general_operand r,L,m,rL,i,i))] (register_operand (operands[0],SImode) || register_operand (operands[1],SImode) || const0_rtx == operands[1]) { @@ -455,8 +446,8 @@ (define_expand movsf }) (define_insn *movsf - [(set (match_operand:SF 0 nonimmediate_operand =r,r,r,Qm,!d,r) -(match_operand:SF 1 general_operand r,G,Qm,rG,F,F))] + [(set (match_operand:SF 0 nonimmediate_operand =r,r,r,m,!d,r) +(match_operand:SF 1 general_operand r,G,m,rG,F,F))] register_operand (operands[0], SFmode) || register_operand (operands[1], SFmode) || operands[1] == CONST0_RTX 
(SFmode) @@ -1592,8 +1583,8 @@ (define_mode_attr rotx [(DI r,r,X) ( (define_mode_attr rotsmode [(DI QI) (SI HI) (HI QI)]) (define_expand rotlmode3 - [(parallel [(set (match_operand:HIDI 0 register_operand ) - (rotate:HIDI (match_operand:HIDI 1 register_operand ) + [(parallel [(set (match_operand:HISI 0 register_operand ) + (rotate:HISI (match_operand:HISI 1 register_operand ) (match_operand:VOID 2 const_int_operand ))) (clobber (match_dup 3))])] @@ -1692,7 +1683,7 @@ (define_split ; ashlqi3_const6 (define_insn *ashlqi3 [(set (match_operand:QI 0 register_operand =r,r,r,r,!d,r,r) (ashift:QI (match_operand:QI 1 register_operand 0,0,0,0,0,0,0) - (match_operand:QI 2 general_operand r,L,P,K,n,n,Qm)))] + (match_operand:QI 2 general_operand r,L,P,K,n,n,m)))] * return ashlqi3_out (insn, operands, NULL); [(set_attr length 5,0,1,2,4,6,9) @@ -1701,7 +1692,7 @@ (define_insn
Re: Provide 64-bit default Solaris/x86 configuration (PR target/39150)
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

There has long been some clamoring for an amd64-*-solaris2 configuration similar to sparcv9-sun-solaris2. I've resisted this for quite some time, primarily because it doubles the maintenance effort of testing both the 32-bit default and 64-bit default configurations.

[...]

I think practically the whole patch falls under the Solaris maintainership, with the possible exception of the change to the copy of libtool.m4 in libgo/config. This is not for the technical content, but for the special commit rules to that directory. Ian? Anyway, this part of the patch will have to go to upstream libtool. Ralf, could you take care of that?

Bootstrapped without regression on i386-pc-solaris2.10 (both 32-bit default and 64-bit default configurations), i386-pc-solaris2.11 and sparc-sun-solaris2.11 in progress.

[...]

Once all the bootstraps have finished, I'll commit this patch (at least the non-libgo parts) unless anything unexpected comes up.

All bootstraps have completed without regressions, so I've installed the patch as is, after verifying that the libgo parts aren't present in the upstream Go repo. I've also synced the toplevel configure.ac/configure changes to src.

One other issue: it was suggested that the 64-bit compiler might actually be faster than a 32-bit one. The bootstrap times, at least, tell a different story: on a Sun Fire X4450 running Solaris 10 with 4 x 2.93 GHz Quad-Core Xeon X7350, I find for make -j32 + make -j32 -k check for both multilibs:

                        64-bit        32-bit
    as/ld     real    1:59:28.66    1:52:15.55
              user    7:14:33.93    6:43:25.84
              sys     5:26:30.66    4:41:02.78

    gas/ld    real    2:02:47.64    1:54:24.51
              user    7:10:41.93    6:39:39.39
              sys     5:37:15.86    4:51:41.02

    gas/gld   real    1:59:57.13    1:45:13.18
              user    7:57:37.13    7:11:41.83
              sys     5:11:58.14    4:04:26.97

Same picture on a Sun Fire X4600 M2 running Solaris 11 with 8 x 2.6 GHz Dual-Core Opteron 8218.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
On Mon, Jul 4, 2011 at 4:26 PM, Andrew Stubbs a...@codesourcery.com wrote:
On 28/06/11 15:14, Andrew Stubbs wrote:
On 28/06/11 13:33, Andrew Stubbs wrote:
On 23/06/11 15:41, Andrew Stubbs wrote:

If one or both of the inputs to a widening multiply are of unsigned type then the compiler will attempt to use usmul_widen_optab or umul_widen_optab, respectively. That works fine, but only if the target supports those operations directly. Otherwise, it just bombs out and reverts to the normal inefficient non-widening multiply.

This patch attempts to catch these cases and use an alternative signed widening multiply instruction, if one of those is available. I believe this should be legal as long as the top bit of both inputs is guaranteed to be zero. The code achieves this guarantee by zero-extending the inputs to a wider mode (which must still be narrower than the output mode).

OK?

This update fixes the testsuite issue Janis pointed out.

And this one fixes up the wmul-5.c testcase also. The patch has changed the correct result.

Here's an update for the context changed by the update to patch 3. The content of the patch has not changed.

    + gimple stmt = gimple_build_assign (result, fold_convert (type, val));

Please use gimple_build_assign_with_ops.

    -convert_mult_to_widen (gimple stmt)
    +convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)

The comment needs updating for the new parameter.

    + type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);

Don't use type_for_mode; use build_nonstandard_integer_type (GET_MODE_PRECISION (from_mode), 0) instead. Both types are equal, so please share the temporary variable you create

    + rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
    +                               create_tmp_var (type1, NULL), rhs1, type1);
    + rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
    +                               create_tmp_var (type2, NULL), rhs2, type2);

here (CSE create_tmp_var).

    + type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
    + mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
    +                                    create_tmp_var (type1, NULL),
    +                                    mult_rhs1, type1);
    + mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
    +                                    create_tmp_var (type2, NULL),
    +                                    mult_rhs2, type2);

Likewise.

Thanks,
Richard.

Andrew
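
The arithmetic behind the patch description can be checked with plain C integers. This is only a sketch of the idea, assuming 16-bit unsigned inputs, a 32-bit intermediate and a 64-bit result; the function name is hypothetical and nothing here corresponds to the gimple code under review.

```c
#include <stdint.h>

/* Multiply two unsigned 16-bit values using only a signed widening
   multiply (32 x 32 -> 64 bits).  */
static int64_t
toy_umul_via_signed_widening (uint16_t a, uint16_t b)
{
  /* Zero-extend to a wider signed type.  Every uint16_t value fits in
     int32_t, so the top bit of each extended input is clear and the
     value read as signed equals the value read as unsigned.  */
  int32_t wide_a = (int32_t) a;
  int32_t wide_b = (int32_t) b;

  /* Models the signed widening multiply instruction.  */
  return (int64_t) wide_a * (int64_t) wide_b;
}
```

Because both extended inputs are non-negative, the signed product is identical to the unsigned one, even for the largest inputs. The intermediate width (32 bits) sits strictly between the input width and the output width, matching the constraint stated in the patch description.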
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hi Frank,

I could either commit the current version with the MFWRAP_SPEC addition and work from there, or wait until those failures are understood and fixed, too.

Committing now would be fine, assuming no regressions on a primary platform.

Below is the patch I've actually committed, after the x86_64-unknown-linux-gnu bootstrap completed without regressions. In fact, 4 failures were fixed:

    -FAIL: libmudflap.c/pass-stratcliff.c (test for excess errors)
    -FAIL: libmudflap.c/pass-stratcliff.c (-static) (test for excess errors)
    -FAIL: libmudflap.c/pass-stratcliff.c (-O2) (test for excess errors)
    -FAIL: libmudflap.c/pass-stratcliff.c (-O3) (test for excess errors)

    /vol/gcc/src/hg/trunk/local/libmudflap/testsuite/libmudflap.c/pass-stratcliff.c:253:21: warning: extra tokens at end of #ifndef directive [enabled by default]

which was introduced by me in my last patch, but went unnoticed ;-)

Rainer

2011-06-29  Rainer Orth  r...@cebitec.uni-bielefeld.de

    gcc:
    libmudflap/49550
    * gcc.c (MFWRAP_SPEC): Also wrap mmap64.

    libmudflap:
    libmudflap/49550
    * mf-runtime.c (__wrap_main) [__sun__ && __svr4__]: Don't register stdin, stdout, stderr. Register __ctype, __ctype_mask.
    * configure.ac: Check for mmap64. Check for rawmemchr, stpcpy, mempcpy.
    * configure: Regenerate.
    * config.h.in: Regenerate.
    * mf-hooks1.c [HAVE_MMAP64] (__mf_0fn_mmap64): New function. (mmap64): New wrapper function.
    * mf-impl.h (__mf_dynamic_index) [HAVE_MMAP64]: Add dyn_mmap64.
    * mf-runtime.c (__mf_dynamic) [HAVE_MMAP64]: Handle mmap64.
    * mf-hooks2.c [HAVE_GETMNTENT && HAVE_SYS_MNTTAB_H]: Implement getmntent wrapper.
    * mf-hooks3.c (_REENTRANT): Define.
    * testsuite/libmudflap.c/heap-scalestress.c (SCALE): Reduce to 1.
    * testsuite/libmudflap.c/pass-stratcliff.c: Include ../config.h. (MIN): Define. Use HAVE_RAWMEMCHR, HAVE_STPCPY, HAVE_MEMPCPY as guards.
    * testsuite/libmudflap.c/pass47-frag.c: Expect __ctype warning on *-*-solaris2.*.
diff --git a/gcc/gcc.c b/gcc/gcc.c --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -518,7 +518,7 @@ proper position among the other output f /* XXX: should exactly match hooks provided by libmudflap.a */ #define MFWRAP_SPEC %{static: %{fmudflap|fmudflapth: \ --wrap=malloc --wrap=free --wrap=calloc --wrap=realloc\ - --wrap=mmap --wrap=munmap --wrap=alloca\ + --wrap=mmap --wrap=mmap64 --wrap=munmap --wrap=alloca\ } %{fmudflapth: --wrap=pthread_create\ }} %{fmudflap|fmudflapth: --wrap=main} #endif diff --git a/libmudflap/configure.ac b/libmudflap/configure.ac --- a/libmudflap/configure.ac +++ b/libmudflap/configure.ac @@ -75,7 +75,9 @@ AC_CHECK_FUNCS(getservent getservbyname AC_CHECK_FUNCS(getprotoent getprotobyname getprotobynumber) AC_CHECK_FUNCS(getmntent setmntent addmntent) AC_CHECK_FUNCS(inet_ntoa mmap munmap) +AC_CHECK_FUNCS(mmap64) AC_CHECK_FUNCS(__libc_freeres) +AC_CHECK_FUNCS(rawmemchr stpcpy mempcpy) AC_TRY_COMPILE([#include sys/types.h #include sys/ipc.h diff --git a/libmudflap/mf-hooks1.c b/libmudflap/mf-hooks1.c --- a/libmudflap/mf-hooks1.c +++ b/libmudflap/mf-hooks1.c @@ -1,5 +1,5 @@ /* Mudflap: narrow-pointer bounds-checking by tree rewriting. - Copyright (C) 2002, 2003, 2004, 2009 Free Software Foundation, Inc. + Copyright (C) 2002, 2003, 2004, 2009, 2011 Free Software Foundation, Inc. Contributed by Frank Ch. Eigler f...@redhat.com and Graydon Hoare gray...@redhat.com @@ -414,6 +414,61 @@ WRAPPER(int , munmap, void *start, size_ #endif /* HAVE_MMAP */ +#ifdef HAVE_MMAP64 +#if PIC +/* A special bootstrap variant. 
*/ +void * +__mf_0fn_mmap64 (void *start, size_t l, int prot, int f, int fd, off64_t off) +{ + return (void *) -1; +} +#endif + + +#undef mmap +WRAPPER(void *, mmap64, + void *start, size_t length, int prot, + int flags, int fd, off64_t offset) +{ + DECLARE(void *, mmap64, void *, size_t, int, + int, int, off64_t); + void *result; + BEGIN_PROTECT (mmap64, start, length, prot, flags, fd, offset); + + result = CALL_REAL (mmap64, start, length, prot, + flags, fd, offset); + + /* + VERBOSE_TRACE (mmap64 (%08lx, %08lx, ...) = %08lx\n, +(uintptr_t) start, (uintptr_t) length, +(uintptr_t) result); + */ + + if (result != (void *)-1) +{ + /* Register each page as a heap object. Why not register it all +as a single segment? That's so that a later munmap() call +can unmap individual pages. XXX: would __MF_TYPE_GUESS make +this more automatic? */ + size_t ps = getpagesize (); + uintptr_t base = (uintptr_t) result; + uintptr_t offset; + + for (offset=0; offsetlength; offset+=ps) + { + /* XXX: We could map PROT_NONE to __MF_TYPE_NOACCESS. */ + /* XXX: Unaccessed HEAP pages are reported as
Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
On Mon, Jul 4, 2011 at 4:29 PM, Andrew Stubbs a...@codesourcery.com wrote: On 28/06/11 16:08, Andrew Stubbs wrote: On 23/06/11 15:41, Andrew Stubbs wrote: This patch removes the restriction that the inputs to a widening multiply must be of the same mode. It does this by extending the smaller of the two inputs to match the larger; therefore, it remains the case that subsequent code (in the expand pass, for example) can rely on the type of rhs1 being the input type of the operation, and the gimple verification code is still valid. OK? This update fixes the testcase issue Janis highlighted. And this one updates the context changed by my update to patch 3. The content of the patch has not changed. Similar to the previous patch + if (TYPE_MODE (type2) != from_mode) +{ + type2 = lang_hooks.types.type_for_mode (from_mode, + TYPE_UNSIGNED (type2)); use build_nonstandard_integer_type. + if (cast1) +rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), rhs1, type1); + if (cast2) +rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), rhs2, type2); and CSE create_tmp_var - at this point type1 and type2 should be the same, right? So I guess it would be a good place to assert types_compatible_p (type1, type2). gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1)); gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2)); and that's now seemingly redundant ... it should probably be gimple_assign_set_rhs1 (stmt, rhs1);, no? A conversion isn't a valid rhs1/2. Similar oddity in convert_plusminus_to_widen. 
+ if (TYPE_MODE (type2) != TYPE_MODE (type1)) +{ + type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1), + TYPE_UNSIGNED (type2)); + cast2 = true; +} + + if (cast1) +mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), + mult_rhs1, type1); + if (cast2) +mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), + mult_rhs2, type2); see above. Thanks, Richard. Andrew
[PATCH, MELT] new function register_data_handler
Hi,

this patch adds a new function that makes it easier to add a pragma handler. In the past, we modified the :sysdata_meltpragmas field of initial_system_data directly. The new function takes a list of the pragma handlers that we want to add. The reason is that the field :sysdata_meltpragmas is a tuple (its fixed size is mandatory because we use an index to recognize the handler later). Each time we call register_pragma_handler, we recreate the tuple, so we pass a list of handlers in order not to call it too often.

This function should work with GCC 4.6, but should be used with care there, as we can only register a single pragma named melt (maybe we could use another function specifically for 4.6?).

Thanks!
Pierre Vittet

2011-07-07  Pierre Vittet  pier...@pvittet.com

* melt/warmelt-base.melt (register_pragma_handler): new function.

Index: gcc/melt/warmelt-base.melt
===
--- gcc/melt/warmelt-base.melt (revision 175906)
+++ gcc/melt/warmelt-base.melt (working copy)
@@ -1135,6 +1135,42 @@ registered with $REGISTER_PASS_EXECUTION_HOOK.}#
 }#)
 )))
 
+;;register a new pragma handler.
+(defun register_pragma_handler (lsthandler)
+  :doc #{register a list of new pragma handlers. As :sysdata_meltpragmas must
+  be a tuple (we use an index to recognize handlers), we have to recreate this
+  tuple each time we call this function. That why $LSTHANDLER is a list of
+  handlers (class_gcc_pragma) and not a single object. }#
+  (assert_msg "register_pragma_handler takes a list as argument."
+              (is_list lsthandler))
+  (let ((oldtuple (get_field :sysdata_meltpragmas initial_system_data))
+        (:long oldsize 0))
+    (if notnull oldtuple)
+      (setq oldsize (multiple_length oldtuple))
+    (let ((:long newsize (+i (multiple_length oldtuple)
+                             (list_length lsthandler)))
+          (newtuple (make_multiple discr_multiple newsize))
+          (:long i 0))
+      ;;copy in oldhandlers in the newtuple
+      (foreach_in_multiple
+        (oldtuple)
+        (curhander :long iunused)
+        (multiple_put_nth newtuple i curhander)
+        (setq i (+i i 1))
+      )
+      ;;add new handler from lsthandler
+      (foreach_in_list
+        (lsthandler)
+        (curpair curhandler)
+        (assert_msg "register_pragma_handler must be a list of class_gcc_pragma."
+                    (is_a curhandler class_gcc_pragma))
+        (multiple_put_nth newtuple i curhandler)
+        (setq i (+i i 1))
+      )
+      (put_fields initial_system_data :sysdata_meltpragmas newtuple)
+))
+)
+
 the descriptions of values which are not ctype related.
 (defclass class_value_descriptor
@@ -2361,6 +2397,7 @@ polyhedron values.}#
  ppstrbuf_mixbigint
  read_file
  register_pass_execution_hook
+ register_pragma_handler
  retrieve_value_descriptor_list
  some_integer_greater_than
  some_integer_multiple
Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
On Mon, Jul 4, 2011 at 4:31 PM, Andrew Stubbs a...@codesourcery.com wrote: On 28/06/11 16:30, Andrew Stubbs wrote: On 23/06/11 15:42, Andrew Stubbs wrote: This patch fixes the case where widening multiply-and-accumulate were not recognised because the multiplication itself is not actually widening. This can happen when you have DI + SI * SI - the multiplication will be done in SImode as a non-widening multiply, and it's only the final accumulate step that is widening. This was not recognised for two reasons: 1. is_widening_mult_p inferred the output type from the multiply statement, which in not useful in this case. 2. The inputs to the multiply instruction may not have been converted at all (because they're not being widened), so the pattern match failed. The patch fixes these issues by making the output type explicit, and by permitting unconverted inputs (the types are still checked, so this is safe). OK? This update fixes Janis' testsuite issue. This updates the context changed by my update to patch 3. The content of this patch has not changed. Ok. Thanks, Richard. Andrew
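For readers following the thread, here is a minimal C sketch of the source shape this patch is about (a 64-bit accumulate fed by a 32-bit multiply). The function names are made up for illustration, and it assumes `int` is 32 bits wide so that `uint32_t` arithmetic is not promoted. Unsigned types are used so that the narrow multiply wraps instead of invoking undefined behavior.

```c
#include <stdint.h>

/* The multiply is carried out in 32 bits (wraps modulo 2^32);
   only the final addition is 64 bits wide.  This is the
   "DI + SI * SI" shape described in the message above. */
uint64_t macc_narrow_mult(uint64_t acc, uint32_t a, uint32_t b)
{
    uint32_t product = (uint32_t)(a * b);
    return acc + product;
}

/* The operands are widened before multiplying, so this is a true
   32x32->64 widening multiply followed by the accumulate. */
uint64_t macc_widening_mult(uint64_t acc, uint32_t a, uint32_t b)
{
    return acc + (uint64_t)a * b;
}
```

The two functions agree whenever the product fits in 32 bits and differ otherwise, which is why the type checks mentioned in the message still matter when unconverted inputs are accepted.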
[PATCH] Fix dead_debug_insert_before ICE (PR debug/49522, take 3)
On Wed, Jul 06, 2011 at 10:36:02PM +0200, Eric Botcazou wrote:

And here is a version that passed bootstrap/regtest on x86_64-linux and i686-linux:

2011-07-06  Jakub Jelinek  ja...@redhat.com

PR debug/49522
* df-problems.c (dead_debug_reset): Remove dead_debug_uses referencing debug insns that have been reset.
(dead_debug_insert_before): Don't assert reg is non-NULL, instead return immediately if it is NULL.

* gcc.dg/debug/pr49522.c: New test.

Sorry, our messages crossed. I'd set a flag in the first loop. In the end, it's up to you.

Actually, looking at it some more, dead_debug_use structs referencing the same insn are always adjacent because of the way they are added using dead_debug_add. While some of the dead_debug_use records might precede the record that causes the reset, it isn't hard to remember a pointer to the pointer to the first entry for the current insn.

So, here is a new patch which doesn't need two loops; it just might go a little bit backwards to unchain the dead_debug_use records for the reset insn. It still needs the change of the gcc_assert (reg) into if (reg == NULL) return;, because with this the debug->used bitmap is sometimes a false positive (saying that a regno is referenced even when it isn't). But here it is IMHO better to occasionally live with the false positives, which just means we'll sometimes walk the chain once in dead_debug_reset or dead_debug_insert_before before resetting it, than to recompute the bitmap (we'd need a second loop for that, bitmap_clear (debug->used) and populate it again).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-07  Jakub Jelinek  ja...@redhat.com

PR debug/49522
* df-problems.c (dead_debug_reset): Remove dead_debug_uses referencing debug insns that have been reset.
(dead_debug_insert_before): Don't assert reg is non-NULL, instead return immediately if it is NULL.

* gcc.dg/debug/pr49522.c: New test.
--- gcc/df-problems.c.jj 2011-07-07 02:32:45.928547053 +0200
+++ gcc/df-problems.c 2011-07-07 09:57:34.846464573 +0200
@@ -3096,6 +3096,7 @@ static void
 dead_debug_reset (struct dead_debug *debug, unsigned int dregno)
 {
   struct dead_debug_use **tailp = &debug->head;
+  struct dead_debug_use **insnp = &debug->head;
   struct dead_debug_use *cur;
   rtx insn;
 
@@ -3113,9 +3114,21 @@ dead_debug_reset (struct dead_debug *deb
             debug->to_rescan = BITMAP_ALLOC (NULL);
           bitmap_set_bit (debug->to_rescan, INSN_UID (insn));
           XDELETE (cur);
+          if (tailp != insnp && DF_REF_INSN ((*insnp)->use) == insn)
+            tailp = insnp;
+          while ((cur = *tailp) && DF_REF_INSN (cur->use) == insn)
+            {
+              *tailp = cur->next;
+              XDELETE (cur);
+            }
+          insnp = tailp;
         }
       else
-        tailp = &(*tailp)->next;
+        {
+          if (DF_REF_INSN ((*insnp)->use) != DF_REF_INSN (cur->use))
+            insnp = tailp;
+          tailp = &(*tailp)->next;
+        }
     }
 }
 
@@ -3174,7 +3187,8 @@ dead_debug_insert_before (struct dead_de
         tailp = &(*tailp)->next;
     }
 
-  gcc_assert (reg);
+  if (reg == NULL)
+    return;
 
   /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL).  */
   dval = make_debug_expr_from_rtl (reg);
--- gcc/testsuite/gcc.dg/debug/pr49522.c.jj 2011-07-04 10:54:23.0 +0200
+++ gcc/testsuite/gcc.dg/debug/pr49522.c 2011-07-04 10:54:02.0 +0200
@@ -0,0 +1,41 @@
+/* PR debug/49522 */
+/* { dg-do compile } */
+/* { dg-options "-fcompare-debug" } */
+
+int val1 = 0L;
+volatile int val2 = 7L;
+long long val3;
+int *ptr = &val1;
+
+static int
+func1 ()
+{
+  return 0;
+}
+
+static short int
+func2 (short int a, unsigned int b)
+{
+  return !b ? a : a >> b;
+}
+
+static unsigned long long
+func3 (unsigned long long a, unsigned long long b)
+{
+  return !b ? a : a % b;
+}
+
+void
+func4 (unsigned short arg1, int arg2)
+{
+  for (arg2 = 0; arg2 < 2; arg2++)
+    {
+      *ptr = func3 (func3 (10, func2 (val3, val2)), val3);
+      for (arg1 = -14; arg1 < 14; arg1 = func1 ())
+        {
+          *ptr = -1;
+          if (foo ())
+            ;
+        }
+    }
+}

Jakub
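The patch above relies on the pointer-to-pointer idiom for unlinking nodes from a singly linked list (`tailp` always holds the address of the link that would have to be rewritten). A stand-alone sketch of just that idiom follows; `struct use_node` and the function names are hypothetical stand-ins rather than anything from df-problems.c, and the sketch leaves out the `insnp` back-tracking that the real patch adds.

```c
#include <stdlib.h>

struct use_node {
    int insn_id;              /* stands in for DF_REF_INSN (cur->use) */
    struct use_node *next;
};

/* Push a node at the front of the list; returns 0 on success. */
int use_list_push(struct use_node **head, int insn_id)
{
    struct use_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->insn_id = insn_id;
    n->next = *head;
    *head = n;
    return 0;
}

/* Unlink and free every node whose insn_id matches; returns the count. */
int use_list_remove_insn(struct use_node **head, int insn_id)
{
    struct use_node **tailp = head;  /* address of the link to rewrite */
    struct use_node *cur;
    int removed = 0;

    while ((cur = *tailp) != NULL) {
        if (cur->insn_id == insn_id) {
            *tailp = cur->next;      /* splice out; tailp stays put */
            free(cur);
            removed++;
        } else {
            tailp = &cur->next;      /* advance to the next link */
        }
    }
    return removed;
}

/* Count the nodes in the list. */
int use_list_length(const struct use_node *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Because `tailp` points at the link and not at the node, removing the head needs no special case, which is the same property the patch uses when it rewinds `tailp` to `insnp`.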
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 07/07/11 10:58, Richard Guenther wrote: I think you should assume that series of widenings, (int)(short)char_variable are already combined. Thus I believe you only need to consider a single conversion in valid_types_for_madd_p. Hmm, I'm not so sure. I'll look into it a bit further. +/* Check the input types, TYPE1 and TYPE2 to a widening multiply, what are those types? Is TYPE1 the result type and TYPE2 the operand type? If so why TYPE1 and TYPE2 are the inputs to the multiply. I thought I explained that in the comment before the function. + initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2); this?! The result of the multiply will be this many bits wide. This may be narrower than the type that holds it. E.g., 16-bit * 8-bit gives a result at most 24-bits wide, which will usually be held in a 32- or 64-bit variable. + initial_unsigned = TYPE_UNSIGNED (type1) TYPE_UNSIGNED (type2); that also looks odd. So probably TYPE1 isn't the result type. If they are the types of the operands, then what operand is EXPR for? EXPR, as the comment says, is the addition that follows the multiply. - if (TREE_CODE (rhs1) == SSA_NAME) + for (tmp = rhs1, rhs1_code = ERROR_MARK; + TREE_CODE (tmp) == SSA_NAME + (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK); + tmp = gimple_assign_rhs1 (rhs1_stmt)) { - rhs1_stmt = SSA_NAME_DEF_STMT (rhs1); - if (is_gimple_assign (rhs1_stmt)) - rhs1_code = gimple_assign_rhs_code (rhs1_stmt); + rhs1_stmt = SSA_NAME_DEF_STMT (tmp); + if (!is_gimple_assign (rhs1_stmt)) + break; + rhs1_code = gimple_assign_rhs_code (rhs1_stmt); } the result looks a bit like spaghetti code ... and lacks a comment on what it is trying to do. It looks like it sees through an arbitrary number of conversions - possibly ones that will make the macc invalid, as for (short)int-var * short-var + int-var. So you'll be pessimizing code by doing that unconditionally. As I said above you should at most consider one intermediate conversion. 
Ok, I need to add a comment here. The code does indeed look back through an arbitrary number of conversions. It is searching for the last real operation before the addition, hoping to find a multiply.

I believe the code should be arranged such that only valid conversions are looked through in the first place. Valid, in that the resulting types should still match the macc constraints.

Well, it might be possible to discard some conversions initially, but until the multiply is found, and its input types are known, we can't know for certain what conversions are valid.

I think I need to explain what's going on here more clearly.

1. It finds an addition statement. It's not known yet whether it is part of a multiply-and-accumulate, or not.

2. It follows the conversion chain back from each operand to see if it finds a multiply, or widening multiply statement.

3. If it finds a non-widening multiply, it checks it to see if it could be a widening multiply-and-accumulate (it will already have been rejected as a widening multiply on its own, but the addition might be in a wider mode, or the target might provide multiply-and-accumulate insns that don't have corresponding widening multiply insns).

4. (This is the new bit!) It looks to see if there are any conversions between the multiply and addition that can safely be ignored.

5. If we get here, then emit any necessary conversion statements, and convert the addition to a WIDEN_MULT_PLUS_EXPR.

Before these changes, any conversion between the multiply and addition statements would prevent optimization, even though there are many cases where the conversions are valid, and even inserted automatically.

I'm going to go away and find out whether there are really any cases where there can legitimately be more than one conversion, and at least update my patch with better commenting.

Thanks for your review.

Andrew
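The bit-width argument made earlier in this message (an unsigned 16-bit by 8-bit multiply needs at most 24 bits, i.e. the sum of the operand precisions) can be checked with a small stand-alone C sketch. The helper names are hypothetical, and the sketch only handles unsigned operands of 1 to 32 bits each.

```c
#include <stdint.h>

/* Number of bits needed to represent v (0 needs 0 bits). */
unsigned bits_needed(uint64_t v)
{
    unsigned bits = 0;
    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Largest product of an unsigned bits1-bit value and an unsigned
   bits2-bit value.  Assumes 1 <= bits1, bits2 <= 32, so the
   product always fits in 64 bits. */
uint64_t max_unsigned_product(unsigned bits1, unsigned bits2)
{
    uint64_t max1 = (UINT64_C(1) << bits1) - 1;
    uint64_t max2 = (UINT64_C(1) << bits2) - 1;
    return max1 * max2;
}
```

For any pair of widths in that range, `bits_needed(max_unsigned_product(b1, b2))` never exceeds `b1 + b2`, which corresponds to the `initial_bitsize` sum quoted above.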
[build] Move dfp-bit support to toplevel libgcc
The next patch in the `move to toplevel libgcc' series is hopefully easier to get review and approval for. This one moves dfp-bit and related build stuff to libgcc. I think it's completely straight forward: it moves D{32, 64, 128}PBIT{, _FUNCS}, related Makefile fragments, and the source files themselves over. The only part that may require revision is the location of dfp-bit.? in libgcc: I've kept them in libgcc/config, as they lived in gcc/config before, but one might as well argue that they are generic and belong into libgcc itself. Bootstrapped without regressions on x86_64-unknown-linux-gnu. Ok for mainline? Thanks. Rainer 2011-06-22 Rainer Orth r...@cebitec.uni-bielefeld.de gcc: * config/dfp-bit.c, config/dfp-bit.h: Move to ../libgcc/config. * config/t-dfprules: Likewise. * config.gcc (i[34567]86-*-linux*, i[34567]86-*-kfreebsd*-gnu, i[34567]86-*-knetbsd*-gnu, i[34567]86-*-gnu*, i[34567]86-*-kopensolaris*-gnu): Remove t-dfprules from tmake_file. (x86_64-*-linux*, x86_64-*-kfreebsd*-gnu, x86_64-*-knetbsd*-gnu): Likewise. (i[34567]86-*-cygwin*): Likewise. (i[34567]86-*-mingw*, x86_64-*-mingw*): Likewise. (powerpc-*-linux*, powerpc64-*-linux*): Likewise. * Makefile.in (D32PBIT_FUNCS, D64PBIT_FUNCS, D128PBIT_FUNCS): Remove. (libgcc.mvars): Remove DFP_ENABLE, DFP_CFLAGS, D32PBIT_FUNCS, D64PBIT_FUNCS, D128PBIT_FUNCS. libgcc: * config/dfp-bit.c, config/dfp-bit.h: New files. * Makefile.in (D32PBIT_FUNCS, D64PBIT_FUNCS, D128PBIT_FUNCS): New variables. ($(d32pbit-o)): Use $(srcdir) to refer to dfp-bit.c ($(d64pbit-o)): Likewise. ($(d128pbit-o)): Likewise. * config/t-dfprules: New file. * config.host (i[34567]86-*-linux*, i[34567]86-*-kfreebsd*-gnu, i[34567]86-*-knetbsd*-gnu, i[34567]86-*-gnu*, i[34567]86-*-kopensolaris*-gnu): Add t-dfprules to tmake_file. (x86_64-*-linux*, x86_64-*-kfreebsd*-gnu, x86_64-*-knetbsd*-gnu): Likewise. (i[34567]86-*-cygwin*): Likewise. (i[34567]86-*-mingw*, x86_64-*-mingw*): Likewise. (powerpc-*-linux*, powerpc64-*-linux*): Likewise. 
diff --git a/gcc/Makefile.in b/gcc/Makefile.in --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1548,30 +1548,6 @@ TPBIT_FUNCS = _pack_tf _unpack_tf _addsu _lt_tf _le_tf _unord_tf _si_to_tf _tf_to_si _negate_tf _make_tf \ _tf_to_df _tf_to_sf _thenan_tf _tf_to_usi _usi_to_tf -D32PBIT_FUNCS = _addsub_sd _div_sd _mul_sd _plus_sd _minus_sd \ - _eq_sd _ne_sd _lt_sd _gt_sd _le_sd _ge_sd \ - _sd_to_si _sd_to_di _sd_to_usi _sd_to_udi \ - _si_to_sd _di_to_sd _usi_to_sd _udi_to_sd \ - _sd_to_sf _sd_to_df _sd_to_xf _sd_to_tf \ - _sf_to_sd _df_to_sd _xf_to_sd _tf_to_sd \ - _sd_to_dd _sd_to_td _unord_sd _conv_sd - -D64PBIT_FUNCS = _addsub_dd _div_dd _mul_dd _plus_dd _minus_dd \ - _eq_dd _ne_dd _lt_dd _gt_dd _le_dd _ge_dd \ - _dd_to_si _dd_to_di _dd_to_usi _dd_to_udi \ - _si_to_dd _di_to_dd _usi_to_dd _udi_to_dd \ - _dd_to_sf _dd_to_df _dd_to_xf _dd_to_tf \ - _sf_to_dd _df_to_dd _xf_to_dd _tf_to_dd \ - _dd_to_sd _dd_to_td _unord_dd _conv_dd - -D128PBIT_FUNCS = _addsub_td _div_td _mul_td _plus_td _minus_td \ - _eq_td _ne_td _lt_td _gt_td _le_td _ge_td \ - _td_to_si _td_to_di _td_to_usi _td_to_udi \ - _si_to_td _di_to_td _usi_to_td _udi_to_td \ - _td_to_sf _td_to_df _td_to_xf _td_to_tf \ - _sf_to_td _df_to_td _xf_to_td _tf_to_td \ - _td_to_sd _td_to_dd _unord_td _conv_td - # These might cause a divide overflow trap and so are compiled with # unwinder info. 
LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4 @@ -1929,14 +1905,6 @@ libgcc.mvars: config.status Makefile $(L echo DPBIT_FUNCS = '$(DPBIT_FUNCS)' tmp-libgcc.mvars echo TPBIT = '$(TPBIT)' tmp-libgcc.mvars echo TPBIT_FUNCS = '$(TPBIT_FUNCS)' tmp-libgcc.mvars - echo DFP_ENABLE = '$(DFP_ENABLE)' tmp-libgcc.mvars - echo DFP_CFLAGS='$(DFP_CFLAGS)' tmp-libgcc.mvars - echo D32PBIT='$(D32PBIT)' tmp-libgcc.mvars - echo D32PBIT_FUNCS='$(D32PBIT_FUNCS)' tmp-libgcc.mvars - echo D64PBIT='$(D64PBIT)' tmp-libgcc.mvars - echo D64PBIT_FUNCS='$(D64PBIT_FUNCS)' tmp-libgcc.mvars - echo D128PBIT='$(D128PBIT)' tmp-libgcc.mvars - echo D128PBIT_FUNCS='$(D128PBIT_FUNCS)' tmp-libgcc.mvars echo GCC_EXTRA_PARTS = '$(GCC_EXTRA_PARTS)' tmp-libgcc.mvars echo SHLIB_LINK = '$(subst $(GCC_FOR_TARGET),$$(GCC_FOR_TARGET),$(SHLIB_LINK))' tmp-libgcc.mvars echo SHLIB_INSTALL = '$(SHLIB_INSTALL)' tmp-libgcc.mvars diff --git a/gcc/config.gcc b/gcc/config.gcc --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1299,7 +1299,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfree i[34567]86-*-kopensolaris*-gnu) tm_file=${tm_file} i386/gnu-user.h
Re: [PATCH, MELT] new function register_data_handler
On Thu, Jul 07, 2011 at 12:10:30PM +0200, Pierre Vittet wrote: Hi, this patch add a new function allowing to add a pragma handler more easily. In the past, we were directly modifying the :sysdata_meltpragmas field of initial_system_data. 2011-07-07 Pierre Vittet pier...@pvittet.com * melt/warmelt-base.melt (register_pragma_handler): new function. Thanks. I committed it on the MELT branch. Committed revision 175962. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
On 07/07/11 11:04, Richard Guenther wrote: Both types are equal, so please share the temporary variable you create + rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), rhs1, type1); + rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), rhs2, type2); here (CSE create_tmp_var). I'm sorry, I don't understand this? This takes code like this: r1 = a; r2 = b; result = r1 + r2; And transforms it to this: r1 = a; r2 = b; t1 = (type1) r1; t2 = (type2) r2; result = t1 + t2; Yes, type1 == type2, but r1 != r2, so t1 != t2. I don't see where the common expression is here? But then, I am something of a newbie to tree optimizations. Andrew
[Patch, Fortran] PR fortran/49648 ICE with use-associated array-returning function
Hello, this is the patch I posted yesterday on bugzilla at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49648#c8 The problem is a NULL pointer encountered during code generation when trying to get the rank from the array spec. The array ref's array spec is normally copied in resolve_ref from the symbol's one. It is not the case, however, in this special case (use-associated function return variable whose shape involves a function call). This patch calls gfc_resolve_array_spec on sym-result, which calls gfc_resolve_expr on every bound, which in turn calls resolve_ref on them. As pointed out by Tobias in the PR audit trail, there could be some similar bugs with character lengths. The character length variant of the testcase doesn't ICE however, so I have decided to propose the patch as is, because it should be a step forward anyway. Regression tested on x86_64-unknown-freebsd8.2. OK for trunk? Should I backport to the branches? Mikael 2011-07-07 Mikael Morin mikael.mo...@sfr.fr gcc/fortran PR fortran/49648 * resolve.c (resolve_symbol): Force resolution of function result's array specification. gcc/testsuite PR fortran/49648 * gfortran.dg/result_in_spec_4.f90: New test. diff --git a/resolve.c b/resolve.c index f484a22..cbf403c 100644 --- a/resolve.c +++ b/resolve.c @@ -12198,6 +12198,8 @@ resolve_symbol (gfc_symbol *sym) } } } + else if (mp_flag sym-attr.flavor == FL_PROCEDURE sym-attr.function) +gfc_resolve_array_spec (sym-result-as, false); /* Assumed size arrays and assumed shape arrays must be dummy arguments. Array-spec's of implied-shape should have been resolved to ! { dg-do compile } ! ! PR fortran/49648 ! ICE for calls to a use-associated function returning an array whose spec ! depends on a function call. ! 
Contributed by Tobias Burnus bur...@net-b.de module m2 COMPLEX, SAVE, ALLOCATABLE :: P(:) contains FUNCTION getPhaseMatrix() RESULT(PM) COMPLEX:: PM(SIZE(P),3) PM=0.0 END FUNCTION end module m2 module m use m2 contains SUBROUTINE gf_generateEmbPot() COMPLEX :: sigma2(3,3) sigma2 = MATMUL(getPhaseMatrix(), sigma2) END SUBROUTINE end module m ! { dg-final { cleanup-modules m m2 } }
Re: [build] Move dfp-bit support to toplevel libgcc
Paolo Bonzini bonz...@gnu.org writes: i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i[34567]86-*-gnu* | i[34567]86-*-kopensolaris*-gnu) extra_parts=$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o -tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm +tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm t-dfprules md_unwind_header=i386/linux-unwind.h ;; x86_64-*-linux* | x86_64-*-kfreebsd*-gnu | x86_64-*-knetbsd*-gnu) extra_parts=$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o -tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm +tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm t-dfprules md_unwind_header=i386/linux-unwind.h ;; This conflicts with the Hurd/k*BSD patch. Patch is okay if you take care of committing both, but please wait 48 hours I see Thomas already committed his, but my patch hadn't been updated for top-of-tree. or so, and please post the updated patch with config/dfp-bit.c moved to dfp-bit.c (config/t-dfprules should stay there). Ok, will do. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
On Thu, Jul 7, 2011 at 12:41 PM, Andrew Stubbs andrew.stu...@gmail.com wrote: On 07/07/11 11:04, Richard Guenther wrote: Both types are equal, so please share the temporary variable you create + rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), rhs1, type1); + rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), rhs2, type2); here (CSE create_tmp_var). I'm sorry, I don't understand this? This takes code like this: r1 = a; r2 = b; result = r1 + r2; And transforms it to this: r1 = a; r2 = b; t1 = (type1) r1; t2 = (type2) r2; result = t1 + t2; Yes, type1 == type2, but r1 != r2, so t1 != t2. I don't see where the common expression is here? But then, I am something of a newbie to tree optimizations. create_tmp_var creates a var-decl, build_and_insert_casts builds an SSA name from it. You can build multiple SSA names from a single VAR_DECL, so no need to waste two VAR_DECLs for temporaries of the same type. Richard. Andrew
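The point about sharing one declaration between several SSA names can be pictured with a toy model. This is deliberately not GCC's tree or gimple API: `toy_var_decl`, `toy_ssa_name`, and `toy_make_ssa_name` are invented for illustration, and the per-declaration version counter is a simplification.

```c
/* Toy stand-in for a VAR_DECL: one declaration of a temporary. */
struct toy_var_decl {
    const char *name;
    unsigned next_version;   /* simplification: counter kept per decl */
};

/* Toy stand-in for an SSA name: a (declaration, version) pair. */
struct toy_ssa_name {
    struct toy_var_decl *var;
    unsigned version;
};

/* Each call yields a distinct SSA name based on the same declaration. */
struct toy_ssa_name toy_make_ssa_name(struct toy_var_decl *var)
{
    struct toy_ssa_name n;
    n.var = var;
    n.version = var->next_version++;
    return n;
}
```

In this picture, t1 and t2 from the question are two names (say tmp_1 and tmp_2) that are distinct values but refer to the same underlying declaration, so a single temporary of the common type is enough for both casts.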
[PATCH][1/n] Do not force sizetype for POINTER_PLUS_EXPR
This is the first of a series of enabling patches to make POINTER_PLUS_EXPR not forcefully take a sizetype offset (I'm still not 100% sure what requirements I will end up implementing, but the first goal is to have fewer TYPE_IS_SIZETYPE types).

This patch removes the (T *)index +p (int)PTR -> PTR +p index folding. We shouldn't change what the user specified as the pointer base as we can't be sure we don't mess up here, considering

  int foo(int *p, uintptr_t o)
  {
    return *((uintptr_t)p + (int *)o);
  }
  int main ()
  {
    int res = 0;
    return foo((int *)0, (uintptr_t)&res);
  }

If the o argument in foo is really the offset, then the C code is invoking undefined behavior, as you may not do anything with an integer which you converted to a pointer other than converting it back.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2011-07-07  Richard Guenther  rguent...@suse.de

* fold-const.c (fold_binary_loc): Remove index +p PTR -> PTR +p index folding.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c (revision 175920)
+++ gcc/fold-const.c (working copy)
@@ -9484,13 +9484,6 @@ fold_binary_loc (location_t loc,
                                      fold_convert_loc (loc, sizetype,
                                                        arg0)));
 
-      /* index +p PTR -> PTR +p index */
-      if (POINTER_TYPE_P (TREE_TYPE (arg1))
-          && INTEGRAL_TYPE_P (TREE_TYPE (arg0)))
-        return fold_build2_loc (loc, POINTER_PLUS_EXPR, type,
-                                fold_convert_loc (loc, type, arg1),
-                                fold_convert_loc (loc, sizetype, arg0));
-
       /* (PTR +p B) +p A -> PTR +p (B + A) */
       if (TREE_CODE (arg0) == POINTER_PLUS_EXPR)
         {
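As a source-level illustration of the two ideas in this message, here is a small C sketch with well-defined behavior. The function names are hypothetical. The first function writes the integer operand on the left of `+`, which is still ordinary pointer arithmetic on `base`; the second shows the one operation described above as safe for a pointer converted to an integer, namely converting it back.

```c
#include <stddef.h>
#include <stdint.h>

/* Integer on the left, pointer on the right: the pointer operand
   (the base) is still `base`, whichever side it is written on. */
int load_index_plus_base(size_t index, const int *base)
{
    return *(index + base);
}

/* Round trip through uintptr_t: convert to an integer and straight
   back to the same pointer type, then dereference. */
int load_via_roundtrip(const int *p)
{
    uintptr_t bits = (uintptr_t)(const void *)p;
    const int *q = (const int *)(const void *)bits;
    return *q;
}
```

The example in the message differs from `load_index_plus_base` in that both of its operands started life as something else (a pointer cast to an integer and an integer cast to a pointer), so the folder cannot tell which one the user meant as the base.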
[go]: Port to ALPHA arch - epoll problems
On Tue, Jul 5, 2011 at 10:12 PM, Ian Lance Taylor i...@google.com wrote: What remains is a couple of unrelated failures in the testsuite: Epoll unexpected fd=0 pollServer: unexpected wakeup for fd=0 mode=w panic: test timed out ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 7123 Aborted ./a.out -test.short -test.timeout=$timeout $@ FAIL: http gmake[2]: *** [http/check] Error 1 2011/07/05 18:43:28 Test RPC server listening on 127.0.0.1:50334 2011/07/05 18:43:28 Test HTTP RPC server listening on 127.0.0.1:49010 2011/07/05 18:43:28 rpc.Serve: accept:accept tcp 127.0.0.1:50334: Resource temporarily unavailable FAIL: rpc gmake[2]: *** [rpc/check] Error 1 2011/07/05 18:44:22 Test WebSocket server listening on 127.0.0.1:40893 Epoll unexpected fd=0 pollServer: unexpected wakeup for fd=0 mode=w panic: test timed out ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 12993 Aborted ./a.out -test.short -test.timeout=$timeout $@ FAIL: websocket gmake[2]: *** [websocket/check] Error 1 ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 13945 Segmentation fault ./a.out -test.short -test.timeout=$timeout $@ FAIL: compress/flate gmake[2]: *** [compress/flate/check] Error 1 Any ideas how to attack these? None of these look familiar to me. An Epoll unexpected fd error means that epoll returned information about a file descriptor which the program didn't ask about. Not sure why that would happen. Particularly for fd 0, since epoll is only used for network connections, which fd 0 presumably is not. The way to look into these is to cd to TARGET/libgo and run make GOTESTFLAGS=--keep http/check (or whatever/check). That will leave a directory gotest in your libgo directory. The executable a.out in that directory is the test case. You can debug the test case using gdb in more or less the usual way. It's a bit painful to set breakpoints by function name, but setting breakpoints by file:line works fine. 
Printing variables works as well as it ever does, but the variables are printed in C form rather than Go form.

It turned out that the EpollEvent definition in libgo/syscalls/epoll/socket_epoll.go is non-portable (if not outright dangerous...). The definition does have a FIXME comment, but does not take into account the effects of __attribute__((__packed__)) from system headers. Unlike the alpha header, the x86 header adds __attribute__((__packed__)) to the struct epoll_event definition in sys/epoll.h.

To illustrate the problem, please run the following test:

--cut here--
#include <stdint.h>
#include <stdio.h>

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;

struct epoll_event
{
  uint32_t events;
  epoll_data_t data;
};

struct packed_epoll_event
{
  uint32_t events;
  epoll_data_t data;
} __attribute__ ((__packed__));

struct fake_epoll_event
{
  uint32_t events;
  int32_t fd;
  int32_t pad;
};

int
main ()
{
  struct epoll_event *ep;
  struct packed_epoll_event *pep;
  struct fake_epoll_event fep;

  fep.events = 0xfe;
  fep.fd = 9;
  fep.pad = 0;

  ep = (struct epoll_event *) &fep;
  pep = (struct packed_epoll_event *) &fep;

  printf ("%#x %i\n", ep->events, ep->data.fd);
  printf ("%#x %i\n", pep->events, pep->data.fd);
  return 0;
}
--cut here--

./a.out
0xfe 0
0xfe 9

So, the first line simulates alpha, the second simulates x86_64. 32-bit targets are OK in both cases:

./a.out
0xfe 9
0xfe 9

By changing the definition of EpollEvent to the form that suits alpha:

type EpollEvent struct {
  Events uint32;
  Pad int32;
  Fd int32;
};

both timeouts got fixed and the correct FD was passed to and from the syscall.

Uros.
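The layout difference behind the test above can also be read off directly with `offsetof`, without any pointer casts. The following is a sketch under the assumption of a GCC-compatible compiler (for the `__packed__` attribute); the type and function names are made up so that they do not clash with the real sys/epoll.h definitions.

```c
#include <stddef.h>
#include <stdint.h>

typedef union toy_epoll_data {
    void *ptr;
    int fd;
    uint32_t u32;
    uint64_t u64;
} toy_epoll_data_t;

/* Natural layout: `data` is aligned as the union requires. */
struct natural_event {
    uint32_t events;
    toy_epoll_data_t data;
};

/* Packed layout: `data` follows `events` with no padding. */
struct packed_event {
    uint32_t events;
    toy_epoll_data_t data;
} __attribute__((__packed__));

/* Byte offset of `data` in the naturally aligned struct. */
size_t natural_data_offset(void)
{
    return offsetof(struct natural_event, data);
}

/* Byte offset of `data` in the packed struct. */
size_t packed_data_offset(void)
{
    return offsetof(struct packed_event, data);
}
```

In the packed struct the offset is always 4. In the natural struct it equals the alignment of the union whenever that alignment is larger than 4, which is typically 8 on 64-bit targets. That 4-byte gap is what the Pad field in the alpha-friendly EpollEvent definition accounts for.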
Re: [Patch, Fortran] Add stat=/errmsg= support to _gfortran_caf_register
On Thursday 07 July 2011 07:35:07 Tobias Burnus wrote: diff --git a/libgfortran/caf/mpi.c b/libgfortran/caf/mpi.c index 83f39f6..2d4af6b 100644 --- a/libgfortran/caf/mpi.c +++ b/libgfortran/caf/mpi.c @@ -103,10 +110,19 @@ _gfortran_caf_register (ptrdiff_t size, caf_register_t type, /* Token contains only a list of pointers. */ local = malloc (size); token = malloc (sizeof (void*) * caf_num_images); + Trailing blanks + if (unlikely (local == NULL || token == NULL)) +goto error; /* token[img-1] is the address of the token in image img. */ - MPI_Allgather (local, sizeof (void*), MPI_BYTE, - token, sizeof (void*), MPI_BYTE, MPI_COMM_WORLD); + err = MPI_Allgather (local, sizeof (void*), MPI_BYTE, token, +sizeof (void*), MPI_BYTE, MPI_COMM_WORLD); + if (unlikely (err)) +{ + free (local); + free (token); + goto error; +} if (type == CAF_REGTYPE_COARRAY_STATIC) { This will return the same error (memory allocation failure) as in the case just above. Is this expected or should it have an error of its own? + char *msg; + if (caf_is_finalized) Space indentation + msg = Failed to allocate coarray - stopped images; Also I'm wondering whether it would be pertinent to share the error handling between single.c (one error) and mpi.c (2 or 3 errors) as the codes are very close (with an interface such as handle_error (int *stat, char *errmsg, int errmsg_len, char *actual_error)). Build and regtested on x86-64-linux. OK for the trunk? The above is nitpicking, and I leave the final decision to you and Daniel, so the patch is basically OK with the two indentation nits fixed. Mikael
[PATCH][C] Fixup pointer-int-sum
This tries to make sense of the comments and code in the code doing the index * size multiplication in pointer-int-sum. It also fixes a bogus integer-constant conversion which results in not properly canonicalized integer constants.

The comment in the code claims the index * size multiplication is carried out signed, which doesn't match the code which does it unsigned (the comment dates back to rev. 6733 where we _did_ carry out the multiplication in a signed type, using c_common_type_for_size (TYPE_PRECISION (sizetype), 0)).

The following patch makes us preserve the signedness of intop so that for signed intop the multiplication will be known to not overflow (what is actually the C semantics - is the multiplication allowed to overflow for unsigned intop? If not I guess the original code of always choosing a signed type was more correct and we should go back to it instead?)

The comment also claims there is a sign-extension of t to sizetype - that's not true either, it's just a sign-change.

Joseph, do we want an unconditional (un-)signed multiplication (before the patch it's unsigned), or what the patch does?

Bootstrapped and tested on x86_64-unknown-linux-gnu. I'll also test the unconditionally signed variant.

Thanks,
Richard.

2011-07-07 Richard Guenther rguent...@suse.de

	* c-common.c (pointer_int_sum): Do the index times size
	multiplication in the signedness of index. Properly strip
	overflow flags.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 175962)
+++ gcc/c-family/c-common.c (working copy)
@@ -3737,23 +3737,22 @@ pointer_int_sum (location_t loc, enum tr
 
   /* Convert the integer argument to a type the same size as sizetype
      so the multiply won't overflow spuriously.  */
-  if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype)
-      || TYPE_UNSIGNED (TREE_TYPE (intop)) != TYPE_UNSIGNED (sizetype))
+  if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype))
     intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
-                                             TYPE_UNSIGNED (sizetype)), intop);
+                                             TYPE_UNSIGNED (TREE_TYPE (intop))),
+                     intop);
 
   /* Replace the integer argument with a suitable product by the object size.
-     Do this multiplication as signed, then convert to the appropriate type
-     for the pointer operation and disregard an overflow that occured only
-     because of the sign-extension change in the latter conversion.  */
+     Do this multiplication in a widened intop type, then convert to the
+     appropriate type for the pointer operation and disregard an overflow
+     that occured only because of the sign-change in the latter conversion.  */
   {
     tree t = build_binary_op (loc,
                               MULT_EXPR, intop,
                               convert (TREE_TYPE (intop), size_exp), 1);
     intop = convert (sizetype, t);
     if (TREE_OVERFLOW_P (intop) && !TREE_OVERFLOW (t))
-      intop = build_int_cst_wide (TREE_TYPE (intop), TREE_INT_CST_LOW (intop),
-                                  TREE_INT_CST_HIGH (intop));
+      intop = double_int_to_tree (sizetype, tree_to_double_int (intop));
   }
 
   /* Create the sum or difference.  */
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 07/07/11 11:26, Andrew Stubbs wrote:
On 07/07/11 10:58, Richard Guenther wrote:
I think you should assume that series of widenings, (int)(short)char_variable are already combined. Thus I believe you only need to consider a single conversion in valid_types_for_madd_p.

Hmm, I'm not so sure. I'll look into it a bit further.

OK, here's a test case that gives multiple conversions:

long long
foo (long long a, signed char b, signed char c)
{
  int bc = b * c;
  return a + (short)bc;
}

The dump right before the widen_mult pass gives:

foo (long long int a, signed char b, signed char c)
{
  int bc;
  long long int D.2018;
  short int D.2017;
  long long int D.2016;
  int D.2015;
  int D.2014;

<bb 2>:
  D.2014_2 = (int) b_1(D);
  D.2015_4 = (int) c_3(D);
  bc_5 = D.2014_2 * D.2015_4;
  D.2017_6 = (short int) bc_5;
  D.2018_7 = (long long int) D.2017_6;
  D.2016_9 = D.2018_7 + a_8(D);
  return D.2016_9;
}

Here we have a multiply and accumulate done the long way. The 8-bit inputs are widened to 32-bit, multiplied to give a 32-bit result (of which only the lower 16-bits contain meaningful data), then truncated to 16-bits, and sign-extended up to 64-bits ready for the 64-bit addition.

This is slightly contrived, perhaps, but not unlike the sort of thing that might occur when you have inline functions and macros, and most importantly - it is mathematically valid!

So, here's the output from my patched widen_mult pass:

foo (long long int a, signed char b, signed char c)
{
  int bc;
  long long int D.2018;
  short int D.2017;
  long long int D.2016;
  int D.2015;
  int D.2014;

<bb 2>:
  D.2014_2 = (int) b_1(D);
  D.2015_4 = (int) c_3(D);
  bc_5 = b_1(D) w* c_3(D);
  D.2017_6 = (short int) bc_5;
  D.2018_7 = (long long int) D.2017_6;
  D.2016_9 = WIDEN_MULT_PLUS_EXPR <b_1(D), c_3(D), a_8(D)>;
  return D.2016_9;
}

As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is now redundant. (Ideally, this would be removed now, but in fact it doesn't get eliminated until the RTL into_cfglayout pass. This is not new behaviour.)

My point is that it's possible to have at least two conversions to examine. Is it possible to have more? I don't know, but once I'm dealing with two I might as well deal with an arbitrary number.

Andrew
[go]: Many valgrind errors (use of uninit value, jump depends on uninit value) in the testsuite
On Tue, Jul 5, 2011 at 10:12 PM, Ian Lance Taylor i...@google.com wrote: What remains is a couple of unrelated failures in the testsuite: ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 13945 Segmentation fault ./a.out -test.short -test.timeout=$timeout $@ FAIL: compress/flate gmake[2]: *** [compress/flate/check] Error 1 Any ideas how to attack these? None of these look familiar to me. The compress/flate test sometimes passes and sometimes doesn't. I have run the resulting executable through valgrind, and there are many (i.e. hundreds of) warnings about uses and calls that depend on uninitialized variables, also on x86_64. ATM, I would like to just report the problems found with valgrind; given the number of them, it looks to me that something is wrong with the library. Uros.
Re: plugin event for C/C++ declarations
On 11-07-07 05:06 , Romain Geissler wrote: gcc/ChangeLog: * plugin.def: Add event for finish_decl. * plugin.c (register_callback, invoke_plugin_callbacks): Same. * c-decl.c (finish_decl): Invoke callbacks on above event. * doc/plugins.texi: Document above event. gcc/cp/ChangeLog: * decl.c (cp_finish_decl): Invoke callbacks on finish_decl event. gcc/testsuite/ChangeLog: * g++.dg/plugin/decl_plugin.c: New test plugin. * g++.dg/plugin/decl-plugin-test.C: Testcase for above plugin. * g++.dg/plugin/plugin.exp: Add above testcase. OK. This one fell through the cracks in my inbox. Apologies. Diego.
Re: [Patch, Fortran] PR fortran/49648 ICE with use-associated array-returning function
Dear Mikael, On 07/07/2011 12:42 PM, Mikael Morin wrote: this is the patch I posted yesterday on bugzilla at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49648#c8 This patch calls gfc_resolve_array_spec on sym->result, which calls gfc_resolve_expr on every bound, which in turn calls resolve_ref on them. As pointed out by Tobias in the PR audit trail, there could be some similar bugs with character lengths. The character length variant of the testcase doesn't ICE however, so I have decided to propose the patch as is, because it should be a step forward anyway. My impression is that the type-spec - contrary to the array spec - is shared between the function symbol and the result symbol. That's also what I get for the example you posted, when looking at the expressions in the debugger. Thus, it seems that the array spec is the only case where one needs to do something. Regression tested on x86_64-unknown-freebsd8.2. OK for trunk? Should I backport to the branches? OK. Regarding backporting: I don't know; I don't have a strong opinion. It's not a regression - but it is also a simple fix. Thus, backporting to 4.6 should be OK, but I wouldn't port it to older versions. Thanks a lot for the patch and going through resolve.c! Tobias
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs andrew.stu...@gmail.com wrote:
On 07/07/11 11:26, Andrew Stubbs wrote:
On 07/07/11 10:58, Richard Guenther wrote:
I think you should assume that series of widenings, (int)(short)char_variable are already combined. Thus I believe you only need to consider a single conversion in valid_types_for_madd_p.

Hmm, I'm not so sure. I'll look into it a bit further.

OK, here's a test case that gives multiple conversions:

long long
foo (long long a, signed char b, signed char c)
{
  int bc = b * c;
  return a + (short)bc;
}

The dump right before the widen_mult pass gives:

foo (long long int a, signed char b, signed char c)
{
  int bc;
  long long int D.2018;
  short int D.2017;
  long long int D.2016;
  int D.2015;
  int D.2014;

<bb 2>:
  D.2014_2 = (int) b_1(D);
  D.2015_4 = (int) c_3(D);
  bc_5 = D.2014_2 * D.2015_4;
  D.2017_6 = (short int) bc_5;

Ok, so you have a truncation that is a no-op value-wise. I would argue that this truncation should be removed independent of whether we have a widening multiply instruction or not.

The technically most capable place to remove non-value-changing truncations (and combine them with a successive conversion) would be value-range propagation. Which already knows:

Value ranges after VRP:

b_1(D): VARYING
D.2698_2: [-128, 127]
c_3(D): VARYING
D.2699_4: [-128, 127]
bc_5: [-16256, 16384]
D.2701_6: [-16256, 16384]
D.2702_7: [-16256, 16384]
a_8(D): VARYING
D.2700_9: VARYING

thus truncating bc_5 to short does not change the value.

The simplification could be made when looking at the statement

  D.2018_7 = (long long int) D.2017_6;

in vrp_fold_stmt, based on the fact that this conversion converts from a value-preserving intermediate conversion. Thus the transform would replace the D.2017_6 operand with bc_5.

So yes, the case appears - but it shouldn't ;)

I'll cook up a quick patch for VRP.

Thanks,
Richard.

  D.2016_9 = D.2018_7 + a_8(D);
  return D.2016_9;
}

Here we have a multiply and accumulate done the long way. The 8-bit inputs are widened to 32-bit, multiplied to give a 32-bit result (of which only the lower 16-bits contain meaningful data), then truncated to 16-bits, and sign-extended up to 64-bits ready for the 64-bit addition.

This is slightly contrived, perhaps, but not unlike the sort of thing that might occur when you have inline functions and macros, and most importantly - it is mathematically valid!

So, here's the output from my patched widen_mult pass:

foo (long long int a, signed char b, signed char c)
{
  int bc;
  long long int D.2018;
  short int D.2017;
  long long int D.2016;
  int D.2015;
  int D.2014;

<bb 2>:
  D.2014_2 = (int) b_1(D);
  D.2015_4 = (int) c_3(D);
  bc_5 = b_1(D) w* c_3(D);
  D.2017_6 = (short int) bc_5;
  D.2018_7 = (long long int) D.2017_6;
  D.2016_9 = WIDEN_MULT_PLUS_EXPR <b_1(D), c_3(D), a_8(D)>;
  return D.2016_9;
}

As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is now redundant. (Ideally, this would be removed now, but in fact it doesn't get eliminated until the RTL into_cfglayout pass. This is not new behaviour.)

My point is that it's possible to have at least two conversions to examine. Is it possible to have more? I don't know, but once I'm dealing with two I might as well deal with an arbitrary number.

Andrew
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hello!

diff --git a/libmudflap/testsuite/libmudflap.c/pass47-frag.c b/libmudflap/testsuite/libmudflap.c/pass47-frag.c
--- a/libmudflap/testsuite/libmudflap.c/pass47-frag.c
+++ b/libmudflap/testsuite/libmudflap.c/pass47-frag.c
@@ -8,3 +8,5 @@ int main ()
 	     tolower (buf[4]) == 'o' && tolower ('X') == 'x' &&
 	     isdigit (buf[3])) == 0 && isalnum ('4'));
 }
+
+/* { dg-warning "cannot track unknown size extern .__ctype." "Solaris __ctype declared without size" { target *-*-solaris2.* } 0 } */

This is handled differently throughout the mudflap testsuite:

/* Ignore a warning that is irrelevant to the purpose of this test. */
/* { dg-prune-output ".*mudflap cannot track unknown size extern.*" } */

Uros.
Re: [PATCH] Fix UNRESOLVED gcc.dg/graphite/pr37485.c
Hello! Committed. Richard. 2011-07-07 Richard Guenther rguent...@suse.de * gcc.dg/graphite/pr37485.c: Add -floop-block. Heh, you were faster by a minute! Uros.
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On Thu, Jul 7, 2011 at 2:28 PM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs andrew.stu...@gmail.com wrote: On 07/07/11 11:26, Andrew Stubbs wrote: On 07/07/11 10:58, Richard Guenther wrote: I think you should assume that series of widenings, (int)(short)char_variable are already combined. Thus I believe you only need to consider a single conversion in valid_types_for_madd_p. Hmm, I'm not so sure. I'll look into it a bit further. OK, here's a test case that gives multiple conversions: long long foo (long long a, signed char b, signed char c) { int bc = b * c; return a + (short)bc; } The dump right before the widen_mult pass gives: foo (long long int a, signed char b, signed char c) { int bc; long long int D.2018; short int D.2017; long long int D.2016; int D.2015; int D.2014; bb 2: D.2014_2 = (int) b_1(D); D.2015_4 = (int) c_3(D); bc_5 = D.2014_2 * D.2015_4; D.2017_6 = (short int) bc_5; Ok, so you have a truncation that is a no-op value-wise. I would argue that this truncation should be removed independent on whether we have a widening multiply instruction or not. The technically most capable place to remove non-value-changing truncations (and combine them with a successive conversion) would be value-range propagation. Which already knows: Value ranges after VRP: b_1(D): VARYING D.2698_2: [-128, 127] c_3(D): VARYING D.2699_4: [-128, 127] bc_5: [-16256, 16384] D.2701_6: [-16256, 16384] D.2702_7: [-16256, 16384] a_8(D): VARYING D.2700_9: VARYING thus truncating bc_5 to short does not change the value. The simplification could be made when looking at the statement D.2018_7 = (long long int) D.2017_6; in vrp_fold_stmt, based on the fact that this conversion converts from a value-preserving intermediate conversion. Thus the transform would replace the D.2017_6 operand with bc_5. So yes, the case appears - but it shouldn't ;) I'll cook up a quick patch for VRP. Like the attached. I'll finish and properly test it. 
Richard.
Re: [Patch, Fortran] Add stat=/errmsg= support to _gfortran_caf_register
On 07/07/2011 01:35 PM, Mikael Morin wrote:
if (type == CAF_REGTYPE_COARRAY_STATIC) {
This will return the same error (memory allocation failure) as in the case just above. Is this expected or should it have an error of its own?

I think it is OK in either case. CAF_REGTYPE_COARRAY_STATIC is an automatic allocation for static coarrays, e.g.

  REAL, SAVE :: my_coarray(1000,1000,10)[*]

is allocated at startup (via a constructor) while the other case is for allocatable coarrays of the form

  REAL, ALLOCATABLE :: my_alloc_coarray(:, :, :)[:]
  ALLOCATE (my_alloc_coarray(1000,1000,10)[*])

I admit that it might not be obvious to the user that there is an explicit allocate in the first case. However, one allocates memory in either case and, thus, one could leave the message as is. In particular, I would assume that on most systems, the size of static coarrays is small enough that the message does not trigger. However, if you think that the message could be clearer, I could also change it.

+ msg = "Failed to allocate coarray - stopped images";
Also I'm wondering whether it would be pertinent to share the error handling between single.c (one error) and mpi.c (2 or 3 errors) as the codes are very close (with an interface such as handle_error (int *stat, char *errmsg, int errmsg_len, char *actual_error)).

The question is where to handle it; in principle, single.c and mpi.c are completely separate files - and both might be compiled by the user/system administrator, contrary to the rest of GCC.

Well, single.c is actually automatically compiled as a static library and installed as libcaf_single.a. The MPI version is never compiled automatically. Thus, anyone who wants to use gfortran with coarrays (based on mpi.c) has to do the following:

a) Fetch libcaf.h and mpi.c
b) Compile mpi.c, e.g., using mpicc -g -O2 -c mpi.c
c) Link the resulting mpi.o (or libcaf_mpi.a) to the Fortran program.

As the user/sysadmin has to do the compilation himself, I would like to make it as easy as possible. The current idea is to have just a single C file plus a header file and no further dependency. Other communication backends could be added by simply creating a new file and implementing the library calls.

Thus, I do not see how one could best share single.c and mpi.c error messages. But if you have a good idea, I am open to change the current implementation. (See also http://gcc.gnu.org/wiki/CoarrayLib )

Build and regtested on x86-64-linux. OK for the trunk?
The above is nitpicking, and I leave the final decision to you and Daniel, so the patch is basically OK with the two indentation nits fixed.

I have now committed the patch with only the nits fixed (Rev.175966). But given that the coarray support - especially with regards to the library - is still in flux, we can still change everything, including the ABI of the library and the file organization. I am sure that not all design decisions are optimal.

Thanks for the review!

Tobias
[committed] Regimplify last 2 ARRAY_*REF operands and last COMPONENT_REF operand (PR middle-end/49640)
Hi!

The attached testcase ICEs, because gimple_regimplify_operands ignores lb: and sz: operands on ARRAY*_REF (and last operand on COMPONENT_REF), assuming that if it is non-NULL, it is valid GIMPLE and doesn't need further processing. That is true for gimplification, as FEs/generic leave those operands NULL and only gimplification sets them, but when we need to regimplify them, e.g. for OpenMP (or perhaps inlining etc.), it wouldn't do anything.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk and 4.6 branch.

2011-07-07 Jakub Jelinek ja...@redhat.com

	PR middle-end/49640
	* gimplify.c (gimplify_compound_lval): For last 2 ARRAY_*REF
	operands and last COMPONENT_REF operand call gimplify_expr on it
	if non-NULL.

	* gcc.dg/gomp/pr49640.c: New test.

--- gcc/gimplify.c.jj 2011-06-17 11:02:19.0 +0200
+++ gcc/gimplify.c 2011-07-07 10:56:30.0 +0200
@@ -2010,8 +2010,14 @@ gimplify_compound_lval (tree *expr_p, gi
                   ret = MIN (ret, tret);
                 }
             }
+          else
+            {
+              tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p, post_p,
+                                    is_gimple_reg, fb_rvalue);
+              ret = MIN (ret, tret);
+            }
 
-          if (!TREE_OPERAND (t, 3))
+          if (TREE_OPERAND (t, 3) == NULL_TREE)
             {
               tree elmt_type = TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, 0)));
               tree elmt_size = unshare_expr (array_ref_element_size (t));
@@ -2031,11 +2037,17 @@ gimplify_compound_lval (tree *expr_p, gi
                   ret = MIN (ret, tret);
                 }
             }
+          else
+            {
+              tret = gimplify_expr (&TREE_OPERAND (t, 3), pre_p, post_p,
+                                    is_gimple_reg, fb_rvalue);
+              ret = MIN (ret, tret);
+            }
         }
       else if (TREE_CODE (t) == COMPONENT_REF)
         {
           /* Set the field offset into T and gimplify it.  */
-          if (!TREE_OPERAND (t, 2))
+          if (TREE_OPERAND (t, 2) == NULL_TREE)
             {
               tree offset = unshare_expr (component_ref_field_offset (t));
               tree field = TREE_OPERAND (t, 1);
@@ -2054,6 +2066,12 @@ gimplify_compound_lval (tree *expr_p, gi
                   ret = MIN (ret, tret);
                 }
             }
+          else
+            {
+              tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p, post_p,
+                                    is_gimple_reg, fb_rvalue);
+              ret = MIN (ret, tret);
+            }
         }
     }
 
--- gcc/testsuite/gcc.dg/gomp/pr49640.c.jj 2011-07-07 11:07:08.0 +0200
+++ gcc/testsuite/gcc.dg/gomp/pr49640.c 2011-07-07 11:05:19.0 +0200
@@ -0,0 +1,29 @@
+/* PR middle-end/49640 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -std=gnu99 -fopenmp" } */
+
+void
+foo (int N, int M, int K, int P, int Q, int R, int i, int j, int k,
+     unsigned char x[P][Q][R], int y[N][M][K])
+{
+  int ii, jj, kk;
+
+#pragma omp parallel for private(ii,jj,kk)
+  for (ii = 0; ii < P; ++ii)
+    for (jj = 0; jj < Q; ++jj)
+      for (kk = 0; kk < R; ++kk)
+        y[i + ii][j + jj][k + kk] = x[ii][jj][kk];
+}
+
+void
+bar (int N, int M, int K, int P, int Q, int R, int i, int j, int k,
+     unsigned char x[P][Q][R], float y[N][M][K], float factor, float zero)
+{
+  int ii, jj, kk;
+
+#pragma omp parallel for private(ii,jj,kk)
+  for (ii = 0; ii < P; ++ii)
+    for (jj = 0; jj < Q; ++jj)
+      for (kk = 0; kk < R; ++kk)
+        y[i + ii][j + jj][k + kk] = factor * x[ii][jj][kk] + zero;
+}

Jakub
[PATCH] Fix complex {*,/} real or real * complex handling in C FE (PR c/49644)
Hi!

For MULT_EXPR and TRUNC_DIV_EXPR, both sides of COMPLEX_EXPR contain a copy of the non-complex operand, which means its side-effects can be evaluated twice. For PLUS_EXPR/MINUS_EXPR they appear just in one of the operands and thus it works fine as is.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6?

2011-07-07 Jakub Jelinek ja...@redhat.com

	PR c/49644
	* c-typeck.c (build_binary_op): For MULT_EXPR and TRUNC_DIV_EXPR with
	one non-complex and one complex argument, call c_save_expr on both
	operands.

	* gcc.c-torture/execute/pr49644.c: New test.

--- gcc/c-typeck.c.jj 2011-05-31 08:03:10.0 +0200
+++ gcc/c-typeck.c 2011-07-07 11:47:31.0 +0200
@@ -10032,6 +10032,8 @@ build_binary_op (location_t location, en
           if (first_complex)
             {
               op0 = c_save_expr (op0);
+              if (code == MULT_EXPR || code == TRUNC_DIV_EXPR)
+                op1 = c_save_expr (op1);
               real = build_unary_op (EXPR_LOCATION (orig_op0), REALPART_EXPR,
                                      op0, 1);
               imag = build_unary_op (EXPR_LOCATION (orig_op0), IMAGPART_EXPR,
@@ -10052,6 +10054,8 @@ build_binary_op (location_t location, en
             }
           else
             {
+              if (code == MULT_EXPR)
+                op0 = c_save_expr (op0);
               op1 = c_save_expr (op1);
               real = build_unary_op (EXPR_LOCATION (orig_op1), REALPART_EXPR,
                                      op1, 1);
--- gcc/testsuite/gcc.c-torture/execute/pr49644.c.jj 2011-07-07 11:48:34.0 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr49644.c 2011-07-07 11:35:52.0 +0200
@@ -0,0 +1,16 @@
+/* PR c/49644 */
+
+extern void abort (void);
+
+int
+main (void)
+{
+  _Complex double a[12], *c = a, s = 3.0 + 1.0i;
+  double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b;
+  int i;
+  for (i = 0; i < 6; i++)
+    *c++ = *d++ * s;
+  if (c != a + 6 || d != b + 6)
+    abort ();
+  return 0;
+}

Jakub
Re: PATCH [1/n] X32: Add initial -x32 support
On Wed, Jul 6, 2011 at 9:22 AM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Jul 6, 2011 at 8:02 AM, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Jul 6, 2011 at 4:48 PM, H.J. Lu hjl.to...@gmail.com wrote: Hi Paolo, DJ, Nathanael, Alexandre, Ralf, Is the change . * configure.ac: Support --enable-x32. * configure: Regenerated. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..bddabeb 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib, [], [enable_multilib=yes]) AC_SUBST(enable_multilib) +# With x32 support +AC_ARG_ENABLE(x32, +[ --enable-x32 enable x32 library support for multiple ABIs], Looks like a very very generic switch for a global configury ... we already have --with-multilib-list (SH only), why not extend that to also work for x86_64? Richard. +[], [enable_x32=no]) + # Enable __cxa_atexit for C++. AC_ARG_ENABLE(__cxa_atexit, [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])], OK? Thanks. Here is the updated patch to use --with-multilib-list=x32. Paolo, DJ, Nathanael, Alexandre, Ralf, Is the configure.ac change --- * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * doc/install.texi: Document --with-multilib-list=x32. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..a73f758 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -795,7 +795,7 @@ esac], [enable_languages=c]) AC_ARG_WITH(multilib-list, -[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH only)])], +[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH and x86-64 only)])], :, with_multilib_list=default) diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 49aac95..a5d266c 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1049,8 +1049,10 @@ sysv, aix. @item --with-multilib-list=@var{list} @itemx --without-multilib-list Specify what multilibs to build. -Currently only implemented for sh*-*-*. 
+Currently only implemented for sh*-*-* and x86-64-*-linux*. +@table @code +@item sh*-*-* @var{list} is a comma separated list of CPU names. These must be of the form @code{sh*} or @code{m*} (in which case they match the compiler option for that processor). The list should not contain any endian options - @@ -1082,6 +1084,12 @@ only little endian SH4AL: --with-multilib-list=sh4al,!mb/m4al @end smallexample +@item x86-64-*-linux* +If @var{list} is @code{x32}, x32 run-time library will be enabled. By +default, x32 run-time library is disabled. + +@end table + @item --with-endian=@var{endians} Specify what endians to use. Currently only implemented for sh*-*-*. --- OK? Thanks. -- H.J. --- 2011-07-06 H.J. Lu hongjiu...@intel.com * config.gcc: Support --with-multilib-list=x32 for x86 Linux targets. * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * config/i386/gnu-user64.h (SPEC_64): Support x32. (SPEC_32): Likewise. (ASM_SPEC): Likewise. (LINK_SPEC): Likewise. (TARGET_THREAD_SSP_OFFSET): Likewise. (TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise. (SPEC_X32): New. * config/i386/i386.h (TARGET_X32): New. (TARGET_LP64): New. (LONG_TYPE_SIZE): Likewise. (POINTER_SIZE): Likewise. (POINTERS_EXTEND_UNSIGNED): Likewise. (OPT_ARCH64): Support x32. (OPT_ARCH32): Likewise. * config/i386/i386.opt (mx32): New. * config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/t-linux-x32: New. * config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New. (BIONIC_DYNAMIC_LINKERX32): Likewise. (GNU_USER_DYNAMIC_LINKERX32): Likewise. * doc/install.texi: Document --with-multilib-list=x32. * doc/invoke.texi: Document -mx32. Hi Uros, This new version only adds a comment to configure.ac. OK to install? Thanks. -- H.J.
Re: RFA: Fix bogus mode in choose_reload_regs
Richard Sandiford wrote: gcc/ * reload1.c (choose_reload_regs): Use mode sizes to check whether an old relaod register completely defines the required value. gcc/testsuite/ * gcc.target/arm/neon-modes-3.c: New test. This is OK. Thanks, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: PATCH [1/n] X32: Add initial -x32 support
On Thu, Jul 7, 2011 at 2:59 PM, H.J. Lu hjl.to...@gmail.com wrote: Hi Paolo, DJ, Nathanael, Alexandre, Ralf, Is the change . * configure.ac: Support --enable-x32. * configure: Regenerated. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..bddabeb 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib, [], [enable_multilib=yes]) AC_SUBST(enable_multilib) +# With x32 support +AC_ARG_ENABLE(x32, +[ --enable-x32 enable x32 library support for multiple ABIs], Looks like a very very generic switch for a global configury ... we already have --with-multilib-list (SH only), why not extend that to also work for x86_64? Richard. +[], [enable_x32=no]) + # Enable __cxa_atexit for C++. AC_ARG_ENABLE(__cxa_atexit, [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])], OK? Thanks. Here is the updated patch to use --with-multilib-list=x32. Paolo, DJ, Nathanael, Alexandre, Ralf, Is the configure.ac change --- * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * doc/install.texi: Document --with-multilib-list=x32. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..a73f758 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -795,7 +795,7 @@ esac], [enable_languages=c]) AC_ARG_WITH(multilib-list, -[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH only)])], +[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH and x86-64 only)])], :, with_multilib_list=default) diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 49aac95..a5d266c 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1049,8 +1049,10 @@ sysv, aix. @item --with-multilib-list=@var{list} @itemx --without-multilib-list Specify what multilibs to build. -Currently only implemented for sh*-*-*. +Currently only implemented for sh*-*-* and x86-64-*-linux*. +@table @code +@item sh*-*-* @var{list} is a comma separated list of CPU names. 
These must be of the form @code{sh*} or @code{m*} (in which case they match the compiler option for that processor). The list should not contain any endian options - @@ -1082,6 +1084,12 @@ only little endian SH4AL: --with-multilib-list=sh4al,!mb/m4al @end smallexample +@item x86-64-*-linux* +If @var{list} is @code{x32}, x32 run-time library will be enabled. By +default, x32 run-time library is disabled. + +@end table + @item --with-endian=@var{endians} Specify what endians to use. Currently only implemented for sh*-*-*. --- OK? Thanks. -- H.J. --- 2011-07-06 H.J. Lu hongjiu...@intel.com * config.gcc: Support --with-multilib-list=x32 for x86 Linux targets. * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * config/i386/gnu-user64.h (SPEC_64): Support x32. (SPEC_32): Likewise. (ASM_SPEC): Likewise. (LINK_SPEC): Likewise. (TARGET_THREAD_SSP_OFFSET): Likewise. (TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise. (SPEC_X32): New. * config/i386/i386.h (TARGET_X32): New. (TARGET_LP64): New. (LONG_TYPE_SIZE): Likewise. (POINTER_SIZE): Likewise. (POINTERS_EXTEND_UNSIGNED): Likewise. (OPT_ARCH64): Support x32. (OPT_ARCH32): Likewise. * config/i386/i386.opt (mx32): New. * config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/t-linux-x32: New. * config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New. (BIONIC_DYNAMIC_LINKERX32): Likewise. (GNU_USER_DYNAMIC_LINKERX32): Likewise. * doc/install.texi: Document --with-multilib-list=x32. * doc/invoke.texi: Document -mx32. Hi Uros, This new version only adds a comment to configure.ac. OK to install? OK. Thanks, Uros.
Re: CFT: Move unwinder to toplevel libgcc
Tristan Gingold ging...@adacore.com writes: Otherwise, the patch is unchanged from the original submission: [build] Move unwinder to toplevel libgcc http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01452.html Unfortunately, it hasn't seen much comment. I'm now looking for testers especially on platforms with more change and approval of those parts: * Several IA-64 targets: ia64*-*-linux* ia64*-*-hpux* ia64-hp-*vms* For ia64-hp-vms, consider your patch approved if the parts for ia64 are. In case of break, I will fix them. In that case, perhaps Steve could have a look? I'd finally like to make some progress on this patch. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: RFA: Fix bogus mode in choose_reload_regs
On 7 July 2011 09:09, Richard Sandiford richard.sandif...@linaro.org wrote: gcc/ * reload1.c (choose_reload_regs): Use mode sizes to check whether an old relaod register completely defines the required value. s/relaod/reload/ Jay.
[PATCH] Fix folding of -(unsigned)(a * -b)
Folding of $subject is currently broken (noticed that when playing with types in pointer_int_sum). We happily ignore the fact that the negate operates on an unsigned type and change it to operate on a signed one - which may cause new undefined overflow. Seen with the testcase below which aborts with current trunk. The fix is to not strip sign-changing conversions as already done for ABS_EXPR. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2011-07-07 Richard Guenther rguent...@suse.de * fold-const.c (fold_unary_loc): Do not strip sign-changes for NEGATE_EXPR. * gcc.dg/ftrapv-3.c: New testcase. Index: gcc/fold-const.c === --- gcc/fold-const.c (revision 175962) +++ gcc/fold-const.c (working copy) @@ -7561,7 +7561,7 @@ fold_unary_loc (location_t loc, enum tre if (arg0) { if (CONVERT_EXPR_CODE_P (code) - || code == FLOAT_EXPR || code == ABS_EXPR) + || code == FLOAT_EXPR || code == ABS_EXPR || code == NEGATE_EXPR) { /* Don't use STRIP_NOPS, because signedness of argument type matters. */ Index: gcc/testsuite/gcc.dg/ftrapv-3.c === --- gcc/testsuite/gcc.dg/ftrapv-3.c (revision 0) +++ gcc/testsuite/gcc.dg/ftrapv-3.c (revision 0) @@ -0,0 +1,16 @@ +/* { dg-do run } */ +/* { dg-options "-ftrapv" } */ + +extern void abort (void); +unsigned long +foo (long i, long j) +{ + /* We may not fold this to (unsigned long)(i * j). */ + return -(unsigned long)(i * -j); +} +int main() +{ + if (foo (-__LONG_MAX__ - 1, -1) != -(unsigned long)(-__LONG_MAX__ - 1)) +abort (); + return 0; +}
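For readers following along, here is a rough sketch of why the fold is unsafe for the testcase's inputs. It is not part of the patch: the helper names are made up for illustration, and it assumes the GCC/Clang builtin `__builtin_mul_overflow` is available.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not from the patch: true if a * b does not fit
   in a long.  Assumes the GCC/Clang builtin __builtin_mul_overflow.  */
static bool mul_overflows(long a, long b)
{
    long ignored;
    return __builtin_mul_overflow(a, b, &ignored);
}

/* The expression from the testcase.  The multiplication is signed, but
   the negation is done in unsigned long, where wraparound is defined.  */
static unsigned long negate_unsigned(long i, long j)
{
    return -(unsigned long)(i * -j);
}

/* Self-check using the testcase's inputs i = LONG_MIN, j = -1.  */
static void check_negate_fold(void)
{
    /* Original form multiplies LONG_MIN by -j == 1: no overflow.  */
    assert(!mul_overflows(LONG_MIN, 1));
    /* Folded form (unsigned long)(i * j) multiplies LONG_MIN by -1,
       which overflows in the signed type.  */
    assert(mul_overflows(LONG_MIN, -1));
    assert(negate_unsigned(LONG_MIN, -1) == -(unsigned long)LONG_MIN);
}
```

Calling `check_negate_fold()` from a test driver exercises both products; with `-ftrapv` the folded form would be the one that traps.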
Re: PATCH [1/n] X32: Add initial -x32 support
Did you even _think_ of looking at the sh configury, and do something vaguely similar for x86? You should not duplicate t-linux64 at all. Instead, in config.gcc set m64/m32 as the default value for with_multilib_list on i386 biarch and x86_64. Pass $with_multilib_list to t-linux64 using TM_MULTILIB_CONFIG. Then, do something like comma=, MULTILIB_OPTIONS= $(subst $(comma),/,@TM_MULTILIB_CONFIG@) MULTILIB_DIRNAMES = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS))) MULTILIB_OSDIRNAMES = 64=../lib64 MULTILIB_OSDIRNAMES += 32=$(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib) MULTILIB_OSDIRNAMES += x32=../libx32 in config/t-linux64. (Each on one line, apologies for any wrapping) The option will be used as --with-multilib-list=m64,m32,mx32 (allowing the user to omit some of the variants, too). Paolo
[PATCH] Make VRP optimize useless conversions
The following patch teaches VRP to disregard the intermediate conversion in a sequence (T1)(T2)val if that sequence is value-preserving for val. There are possibly some more cases that could be handled when a sign-change is involved but the following is a first safe step. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2011-07-07 Richard Guenther rguent...@suse.de * tree-vrp.c (simplify_conversion_using_ranges): New function. (simplify_stmt_using_ranges): Call it. * gcc.dg/tree-ssa/vrp58.c: New testcase. Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c (revision 175962) --- gcc/tree-vrp.c (working copy) *** simplify_switch_using_ranges (gimple stm *** 7342,7347 --- 7342,7378 return false; } + /* Simplify an integral conversion from an SSA name in STMT. */ + + static bool + simplify_conversion_using_ranges (gimple stmt) + { + tree rhs1 = gimple_assign_rhs1 (stmt); + gimple def_stmt = SSA_NAME_DEF_STMT (rhs1); + value_range_t *final, *inner; + + /* Obtain final and inner value-ranges for a conversion + sequence (final-type)(intermediate-type)inner-type. */ + final = get_value_range (gimple_assign_lhs (stmt)); + if (final-type != VR_RANGE) + return false; + if (!is_gimple_assign (def_stmt) + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))) + return false; + rhs1 = gimple_assign_rhs1 (def_stmt); + if (TREE_CODE (rhs1) != SSA_NAME) + return false; + inner = get_value_range (rhs1); + if (inner-type != VR_RANGE) + return false; + if (!tree_int_cst_equal (final-min, inner-min) + || !tree_int_cst_equal (final-max, inner-max)) + return false; + gimple_assign_set_rhs1 (stmt, rhs1); + update_stmt (stmt); + return true; + } + /* Simplify STMT using ranges if possible. 
*/ static bool *** simplify_stmt_using_ranges (gimple_stmt_ *** 7351,7356 --- 7382,7388 if (is_gimple_assign (stmt)) { enum tree_code rhs_code = gimple_assign_rhs_code (stmt); + tree rhs1 = gimple_assign_rhs1 (stmt); switch (rhs_code) { *** simplify_stmt_using_ranges (gimple_stmt_ *** 7364,7370 or identity if the RHS is zero or one, and the LHS are known to be boolean values. Transform all TRUTH_*_EXPR into BIT_*_EXPR if both arguments are known to be boolean values. */ ! if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt return simplify_truth_ops_using_ranges (gsi, stmt); break; --- 7396,7402 or identity if the RHS is zero or one, and the LHS are known to be boolean values. Transform all TRUTH_*_EXPR into BIT_*_EXPR if both arguments are known to be boolean values. */ ! if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))) return simplify_truth_ops_using_ranges (gsi, stmt); break; *** simplify_stmt_using_ranges (gimple_stmt_ *** 7373,7387 than zero and the second operand is an exact power of two. */ case TRUNC_DIV_EXPR: case TRUNC_MOD_EXPR: ! if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))) integer_pow2p (gimple_assign_rhs2 (stmt))) return simplify_div_or_mod_using_ranges (stmt); break; /* Transform ABS (X) into X or -X as appropriate. */ case ABS_EXPR: ! if (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME ! INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt return simplify_abs_using_ranges (stmt); break; --- 7405,7419 than zero and the second operand is an exact power of two. */ case TRUNC_DIV_EXPR: case TRUNC_MOD_EXPR: ! if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)) integer_pow2p (gimple_assign_rhs2 (stmt))) return simplify_div_or_mod_using_ranges (stmt); break; /* Transform ABS (X) into X or -X as appropriate. */ case ABS_EXPR: ! if (TREE_CODE (rhs1) == SSA_NAME ! 
INTEGRAL_TYPE_P (TREE_TYPE (rhs1))) return simplify_abs_using_ranges (stmt); break; *** simplify_stmt_using_ranges (gimple_stmt_ *** 7390,7399 /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are already set. */ ! if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt return simplify_bit_ops_using_ranges (gsi, stmt); break; default: break; } --- 7422,7437 /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are
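As a source-level illustration of the (T1)(T2)val case described at the top of this mail, the sketch below compares a conversion through an intermediate type with the direct conversion over a range of values. It is only a model under assumptions: the function names are invented here, and the real pass works on GIMPLE value ranges rather than by enumeration.

```c
#include <assert.h>
#include <limits.h>

/* (T1)(T2)val with T1 = long, T2 = unsigned char, val of type int.  */
static long via_uchar(int v)
{
    return (long)(unsigned char)v;
}

/* The same conversion with the intermediate step dropped.  */
static long direct(int v)
{
    return (long)v;
}

/* Returns 1 when both forms agree for every value in [lo, hi].  */
static int same_on_range(int lo, int hi)
{
    for (long long v = lo; v <= hi; v++)
        if (via_uchar((int)v) != direct((int)v))
            return 0;
    return 1;
}

/* Self-check: the intermediate conversion is value-preserving only
   while val stays within the range of unsigned char.  */
static void check_conversions(void)
{
    assert(same_on_range(0, UCHAR_MAX));
    assert(!same_on_range(0, UCHAR_MAX + 1));
    assert(via_uchar(UCHAR_MAX + 1) == 0);
}
```

When the known range of val fits the intermediate type, as in the first assertion, dropping the inner conversion does not change the result.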
Re: Remove unused t-* fragments
On 7/6/2011 4:14 PM, Joseph S. Myers wrote: 2011-07-06 Joseph Myers jos...@codesourcery.com * config/i386/t-crtpic, config/i386/t-svr3dbx, config/pa/t-pa: Remove. Ok for pa. Dave -- John David Anglin dave.ang...@bell.net
Re: [PATCH, graphite]: Fix UNRESOLVED: gcc.dg/graphite/pr37485.c scan-tree-dump-times graphite Loop blocked
On Thu, Jul 7, 2011 at 05:36, Uros Bizjak ubiz...@gmail.com wrote: Hello! We should add loop blocking flags (the same as in graphite.exp) if we want to check graphite tree dump. 2011-07-07 Uros Bizjak ubiz...@gmail.com * gcc.dg/graphite/pr37485.c (dg-options): Add -floop-block -fno-loop-strip-mine -fno-loop-interchange -ffast-math. Tested on x86_64-pc-linux-gnu {,-m32}. OK for mainline? Yes, thanks, Sebastian
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
On Mon, Jul 4, 2011 at 8:55 PM, Kai Tietz ktiet...@googlemail.com wrote: Ok, reworked version. The folding of X op X and !X op !X seems indeed not being necessary. So function simplifies much. Bootstrapped and regression tested for all standard languages (plus Ada and Obj-C++). Ok for apply? Ok with a proper changelog entry. Thanks, Richard. Regards, Kai Index: gcc-head/gcc/tree-ssa-forwprop.c === --- gcc-head.orig/gcc/tree-ssa-forwprop.c +++ gcc-head/gcc/tree-ssa-forwprop.c @@ -1602,6 +1602,129 @@ simplify_builtin_call (gimple_stmt_itera return false; } +/* Checks if expression has type of one-bit precision, or is a known + truth-valued expression. */ +static bool +truth_valued_ssa_name (tree name) +{ + gimple def; + tree type = TREE_TYPE (name); + + if (!INTEGRAL_TYPE_P (type)) + return false; + /* Don't check here for BOOLEAN_TYPE as the precision isn't + necessarily one and so ~X is not equal to !X. */ + if (TYPE_PRECISION (type) == 1) + return true; + def = SSA_NAME_DEF_STMT (name); + if (is_gimple_assign (def)) + return truth_value_p (gimple_assign_rhs_code (def), type); + return false; +} + +/* Helper routine for simplify_bitwise_binary_1 function. + Return for the SSA name NAME the expression X if it mets condition + NAME = !X. Otherwise return NULL_TREE. + Detected patterns for NAME = !X are: + !X and X == 0 for X with integral type. + X ^ 1, X != 1,or ~X for X with integral type with precision of one. */ +static tree +lookup_logical_inverted_value (tree name) +{ + tree op1, op2; + enum tree_code code; + gimple def; + + /* If name has none-intergal type, or isn't a SSA_NAME, then + return. */ + if (TREE_CODE (name) != SSA_NAME + || !INTEGRAL_TYPE_P (TREE_TYPE (name))) + return NULL_TREE; + def = SSA_NAME_DEF_STMT (name); + if (!is_gimple_assign (def)) + return NULL_TREE; + + code = gimple_assign_rhs_code (def); + op1 = gimple_assign_rhs1 (def); + op2 = NULL_TREE; + + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. 
+ If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, + or BIT_NOT_EXPR, then return. */ + if (code == EQ_EXPR || code == NE_EXPR + || code == BIT_XOR_EXPR) + op2 = gimple_assign_rhs2 (def); + + switch (code) + { + case TRUTH_NOT_EXPR: + return op1; + case BIT_NOT_EXPR: + if (truth_valued_ssa_name (name)) + return op1; + break; + case EQ_EXPR: + /* Check if we have X == 0 and X has an integral type. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; + if (integer_zerop (op2)) + return op1; + break; + case NE_EXPR: + /* Check if we have X != 1 and X is a truth-valued. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; + if (integer_onep (op2) && truth_valued_ssa_name (op1)) + return op1; + break; + case BIT_XOR_EXPR: + /* Check if we have X ^ 1 and X is truth valued. */ + if (integer_onep (op2) && truth_valued_ssa_name (op1)) + return op1; + break; + default: + break; + } + + return NULL_TREE; +} + +/* Optimize ARG1 CODE ARG2 to a constant for bitwise binary + operations CODE, if one operand has the logically inverted + value of the other. */ +static tree +simplify_bitwise_binary_1 (enum tree_code code, tree type, + tree arg1, tree arg2) +{ + tree anot; + + /* If CODE isn't a bitwise binary operation, return NULL_TREE. */ + if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR + && code != BIT_XOR_EXPR) + return NULL_TREE; + + /* First check if operands ARG1 and ARG2 are equal. If so + return NULL_TREE as this optimization is handled fold_stmt. */ + if (arg1 == arg2) + return NULL_TREE; + /* See if we have in arguments logical-not patterns. */ + if (((anot = lookup_logical_inverted_value (arg1)) == NULL_TREE + || anot != arg2) + && ((anot = lookup_logical_inverted_value (arg2)) == NULL_TREE + || anot != arg1)) + return NULL_TREE; + + /* X & !X -> 0. */ + if (code == BIT_AND_EXPR) + return fold_convert (type, integer_zero_node); + /* X | !X -> 1 and X ^ !X -> 1, if X is truth-valued. */ + if (truth_valued_ssa_name (anot)) + return fold_convert (type, integer_one_node); + + /* ???
Otherwise result is (X != 0 ? X : 1). not handled. */ + return NULL_TREE; +} + /* Simplify bitwise binary operations. Return true if a transformation applied, otherwise return false. */ @@ -1769,6 +1892,15 @@ simplify_bitwise_binary (gimple_stmt_ite return true; } + /* Try simple folding for X op !X, and X op X. */ + res = simplify_bitwise_binary_1 (code, TREE_TYPE (arg1), arg1, arg2); + if (res != NULL_TREE) + { + gimple_assign_set_rhs_from_tree
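The identities the patch relies on can be checked at the C level with a small sketch like the one below. This is only an illustration with made-up function names; the patch itself operates on GIMPLE statements, not on C source.

```c
#include <assert.h>

/* X & !X: always 0, whatever the value of x.  */
static int and_not(int x) { return x & !x; }

/* X | !X: 1 only when x is truth-valued (0 or 1).  */
static int or_not(int x) { return x | !x; }

/* X ^ !X: likewise 1 only for truth-valued x.  */
static int xor_not(int x) { return x ^ !x; }

/* Self-check covering truth-valued and general operands.  */
static void check_not_patterns(void)
{
    assert(and_not(0) == 0 && and_not(1) == 0 && and_not(5) == 0);
    assert(or_not(0) == 1 && or_not(1) == 1);
    assert(xor_not(0) == 1 && xor_not(1) == 1);
    /* For a general X the result is (X != 0 ? X : 1), the case the
       patch leaves unhandled.  */
    assert(or_not(4) == 4 && xor_not(4) == 4);
}
```

The last assertion shows why the `|` and `^` cases are restricted to truth-valued operands while the `&` case is not.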
Re: [PATCH 4/6] Shrink-wrapping
Bernd Schmidt ber...@codesourcery.com writes: This adds the actual optimization, and reworks the JUMP_LABEL handling for return blocks. See the introduction mail or the new comment ahead of thread_prologue_and_epilogue_insns for more notes. It seems a shame to have both (return) and (simple_return). You said that we need the distinction in order to cope with targets like ARM, whose (return) instruction actually performs some of the epilogue too. It feels like the load of the saved registers should really be expressed in rtl, in parallel with the return. I realise that'd prevent conditional returns though. Maybe there's no elegant way out... With the hidden loads, it seems like we'll have a situation in which the values of call-saved registers will appear to be different for different real incoming edges to the exit block. Is JUMP_LABEL ever null after this change? (In fully-complete rtl sequences, I mean.) It looked like some of the null checks in the patch might not be necessary any more. JUMP_LABEL also seems somewhat misnamed after this change; maybe JUMP_TARGET would be better? I'm the last person who should be recommending names though. I know it's a pain, but it'd really help if you could split the JUMP_LABEL == a return rtx stuff out. I think it'd also be worth splitting the RETURN_ADDR_REGNUM bit out into a separate patch, and handling other things in a more generic way. E.g. the default INCOMING_RETURN_ADDR_RTX could then be: #define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM) and df.c:df_get_exit_block_use_set should include RETURN_ADDR_REGNUM when epilogue_completed. It'd be nice to handle cases in which all references to the stack pointer are to the incoming arguments. Maybe mention the fact that we don't as another source of conservatism? 
It'd also be nice to get rid of all these big blocks of code that are conditional on preprocessor macros, but I realise you're just following existing practice in the surrounding code, so again it can be left to a future cleanup. @@ -1280,7 +1297,7 @@ force_nonfallthru_and_redirect (edge e, basic_block force_nonfallthru (edge e) { - return force_nonfallthru_and_redirect (e, e-dest); + return force_nonfallthru_and_redirect (e, e-dest, NULL_RTX); } Maybe assert here that e-dest isn't the exit block? I realise it will be caught by the: gcc_assert (jump_label == simple_return_rtx); check, but an assert here would make it more obvious what had gone wrong. - if (GET_CODE (x) == RETURN) + if (GET_CODE (x) == RETURN || GET_CODE (x) == SIMPLE_RETURN) ANY_RETURN_P (x). A few other cases. @@ -5654,6 +5658,7 @@ init_emit_regs (void) /* Assign register numbers to the globally defined register rtx. */ pc_rtx = gen_rtx_fmt_ (PC, VOIDmode); ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode); + simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode); cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode); stack_pointer_rtx = gen_raw_REG (Pmode, STACK_POINTER_REGNUM); frame_pointer_rtx = gen_raw_REG (Pmode, FRAME_POINTER_REGNUM); It'd be nice to s/ret_rtx/return_rtx/ for consistency, but that can happen anytime. +/* Return true if INSN requires the stack frame to be set up. 
*/ +static bool +requires_stack_frame_p (rtx insn) +{ + HARD_REG_SET hardregs; + unsigned regno; + + if (!INSN_P (insn) || DEBUG_INSN_P (insn)) +return false; + if (CALL_P (insn)) +return !SIBLING_CALL_P (insn); + if (for_each_rtx (PATTERN (insn), frame_required_for_rtx, NULL)) +return true; + CLEAR_HARD_REG_SET (hardregs); + note_stores (PATTERN (insn), record_hard_reg_sets, hardregs); + AND_COMPL_HARD_REG_SET (hardregs, call_used_reg_set); + for (regno = 0; regno FIRST_PSEUDO_REGISTER; regno++) +if (TEST_HARD_REG_BIT (hardregs, regno) + df_regs_ever_live_p (regno)) + return true; This can be done as a follow-up, but it looks like df should be using a HARD_REG_SET here, and that we should be able to get at it directly. + FOR_EACH_EDGE (e, ei, bb-preds) + if (!bitmap_bit_p (bb_antic_flags, e-src-index)) + { + VEC_quick_push (basic_block, vec, e-src); + bitmap_set_bit (bb_on_list, e-src-index); + } !bitmap_bit_p (bb_on_list, e-src-index) ? + } + while (!VEC_empty (basic_block, vec)) + { + basic_block tmp_bb = VEC_pop (basic_block, vec); + edge e; + edge_iterator ei; + bool all_set = true; + + bitmap_clear_bit (bb_on_list, tmp_bb-index); + FOR_EACH_EDGE (e, ei, tmp_bb-succs) + { + if (!bitmap_bit_p (bb_antic_flags, e-dest-index)) + { + all_set = false; + break; + } + } + if (all_set) + { + bitmap_set_bit (bb_antic_flags, tmp_bb-index); + FOR_EACH_EDGE (e, ei, tmp_bb-preds) + if (!bitmap_bit_p (bb_antic_flags,
Re: [PATCH][C] Fixup pointer-int-sum
On Thu, 7 Jul 2011, Richard Guenther wrote: not overflow (what is actually the C semantics - is the multiplication allowed to overflow for unsigned intop? If not Overflow is not allowed. Formally the multiplication is as-if to infinite precision, and then there is undefined behavior if the result of the addition (to infinite precision) is outside the array pointed to - wrapping around by some multiple of the whole address space is not allowed. In practice, as previously discussed objects half or more of the address space do not work reliably because of the problems doing pointer subtraction, so always using a signed type shouldn't break anything that actually worked reliably (though how unreliable things were with large malloced objects - which unfortunately glibc's malloc can provide - if the source code didn't use pointer subtraction, I don't know). In GCC's terms half or more of the address space generally means half the range of size_t. (m32c has ptrdiff_t wider than size_t in some cases. On such unusual architectures it ought to be possible to have objects whose size is up to SIZE_MAX bytes and have pointer addition and subtraction work reliably, which would suggest using ptrdiff_t for arithmetic in such cases, but the code checking sizes for arrays of constant size uses the signed type corresponding to size_t, so you could only get a larger object through malloc or VLAs.) The patch is OK. Unconditionally signed is also OK, though I don't see any advantage over this version. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Fix complex {*,/} real or real * complex handling in C FE (PR c/49644)
On Thu, 7 Jul 2011, Jakub Jelinek wrote: Hi! For MULT_EXPR and TRUNC_DIV_EXPR, both sides of COMPLEX_EXPR contain a copy of the non-complex operand, which means its side-effects can be evaluated twice. For PLUS_EXPR/MINUS_EXPR they appear just in one of the operands and thus it works fine as is. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6? OK, but I think you need a similar patch for the C++ front end as well. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH][C] Fixup pointer-int-sum
On Thu, 7 Jul 2011, Joseph S. Myers wrote: On Thu, 7 Jul 2011, Richard Guenther wrote: not overflow (what is actually the C semantics - is the multiplication allowed to overflow for unsigned intop? If not Overflow is not allowed. Formally the multiplication is as-if to infinite precision, and then there is undefined behavior if the result of the addition (to infinite precision) is outside the array pointed to - wrapping around by some multiple of the whole address space is not allowed. In practice, as previously discussed objects half or more of the address space do not work reliably because of the problems doing pointer subtraction, so always using a signed type shouldn't break anything that actually worked reliably (though how unreliable things were with large malloced objects - which unfortunately glibc's malloc can provide - if the source code didn't use pointer subtraction, I don't know). In GCC's terms half or more of the address space generally means half the range of size_t. (m32c has ptrdiff_t wider than size_t in some cases. On such unusual architectures it ought to be possible to have objects whose size is up to SIZE_MAX bytes and have pointer addition and subtraction work reliably, which would suggest using ptrdiff_t for arithmetic in such cases, but the code checking sizes for arrays of constant size uses the signed type corresponding to size_t, so you could only get a larger object through malloc or VLAs.) The patch is OK. Unconditionally signed is also OK, though I don't see any advantage over this version. Ok, I'll defer the decision to the time I have settled on a final solution to get rid of the (unsigned) sizetype offset operand for POINTER_PLUS_EXPR. The least invasive idea is to introduce a new signed ptrofftype to replace all sizetype conversions at places we build POINTER_PLUS_EXPRs. That would favor unconditionally signed. 
The moderate invasive idea is to allow both a signed and an unsigned ptrofftype (but still force a common precision), with all the fun that arises from combining (ptr p+ off1) p+ off2 with different signs for the offset operand ... Thanks, Richard.
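The signed-versus-unsigned offset question can be illustrated on plain integers. The sketch below is not GCC code: it models addresses as `uintptr_t`, the helper names are hypothetical, and it assumes `size_t` and `uintptr_t` have the same width where that matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset applied as an unsigned, sizetype-like value.  */
static uintptr_t add_unsigned_offset(uintptr_t addr, size_t off)
{
    return addr + off;
}

/* Offset applied as a signed value of pointer width.  */
static uintptr_t add_signed_offset(uintptr_t addr, ptrdiff_t off)
{
    return addr + (uintptr_t)off;
}

/* True when an unsigned offset would read as negative if it were
   reinterpreted as a signed value of the same width.  */
static int looks_negative(size_t off)
{
    return off > SIZE_MAX / 2;
}

/* Self-check: both forms agree modulo the address width, but only
   when the two offset types share a common precision.  */
static void check_offsets(void)
{
    assert(add_signed_offset(1000, -4) == 996);
    if (sizeof(size_t) == sizeof(uintptr_t))
        assert(add_unsigned_offset(1000, (size_t)-4) == 996);
    assert(looks_negative((size_t)-4));
    assert(!looks_negative(4));
}
```

An offset above half the range of size_t is exactly the kind that reads as a negative displacement once the offset type is signed.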
Re: [Patch,testsuite]: Filter more test cases to fit target capabilities
On Jul 6, 2011, at 10:26 AM, Georg-Johann Lay wrote: Hi, I am struggling against hundreds of fails in the testsuite because many cases are not carefully written, e.g. stuff like shifting an int by 19 bits if int is only 16 bits wide. Ok to commit? Ok.
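A test that shifts by a fixed count can guard against narrow int targets along these lines. This is a sketch only, not taken from the patch; the macro and function names are invented here, and it assumes int has no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Width of int on this target, assuming no padding bits.  */
#define INT_BITS ((int)(sizeof(int) * CHAR_BIT))

/* Returns 1 if shifting an int-sized value by count is defined.  */
static int shift_count_ok(int count)
{
    return count >= 0 && count < INT_BITS;
}

/* Shift in unsigned arithmetic; yields 0 when count is out of range
   instead of invoking undefined behavior.  */
static unsigned shl_checked(unsigned value, int count)
{
    return shift_count_ok(count) ? value << count : 0u;
}

/* Self-check of the guard at the boundaries.  */
static void check_shift_guard(void)
{
    assert(shift_count_ok(0));
    assert(!shift_count_ok(-1));
    assert(!shift_count_ok(INT_BITS));
    assert(shl_checked(1u, 3) == 8u);
    assert(shl_checked(1u, INT_BITS) == 0u);
}
```

On a target with 16-bit int, `shift_count_ok(19)` is false, which is the situation described in the mail.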
Re: PATCH [1/n] X32: Add initial -x32 support
On Thu, Jul 7, 2011 at 6:21 AM, Paolo Bonzini bonz...@gnu.org wrote: Did you even _think_ of looking at the sh configury, and do something vaguely similar for x86? You should not duplicate t-linux64 at all. Instead, in config.gcc set m64/m32 as the default value for with_multilib_list on i386 biarch and x86_64. Pass $with_multilib_list to t-linux64 using TM_MULTILIB_CONFIG. Then, do something like comma=, MULTILIB_OPTIONS = $(subst $(comma),/,@TM_MULTILIB_CONFIG@) MULTILIB_DIRNAMES = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS))) MULTILIB_OSDIRNAMES = 64=../lib64 MULTILIB_OSDIRNAMES += 32=$(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib) MULTILIB_OSDIRNAMES += x32=../libx32 in config/t-linux64. (Each on one line, apologies for any wrapping) The option will be used as --with-multilib-list=m64,m32,mx32 (allowing the user to omit some of the variants, too). This is an excellent suggestion. Here is the updated patch. It uses TM_MULTILIB_CONFIG and removes config/i386/t-linux-x32. Uros, is this OK for trunk to replace the patch you approved earlier? Thanks. -- H.J. --- 2011-07-07 H.J. Lu hongjiu...@intel.com * config.gcc: Support --with-multilib-list for x86 Linux targets. * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * config/i386/gnu-user64.h (SPEC_64): Support x32. (SPEC_32): Likewise. (ASM_SPEC): Likewise. (LINK_SPEC): Likewise. (TARGET_THREAD_SSP_OFFSET): Likewise. (TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise. (SPEC_X32): New. * config/i386/i386.h (TARGET_X32): New. (TARGET_LP64): New. (LONG_TYPE_SIZE): Likewise. (POINTER_SIZE): Likewise. (POINTERS_EXTEND_UNSIGNED): Likewise. (OPT_ARCH64): Support x32. (OPT_ARCH32): Likewise. * config/i386/i386.opt (mx32): New. * config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise. (GLIBC_DYNAMIC_LINKERX32): Likewise. 
* config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New. (BIONIC_DYNAMIC_LINKERX32): Likewise. (GNU_USER_DYNAMIC_LINKERX32): Likewise. * config/i386/t-linux64: Support TM_MULTILIB_CONFIG. * doc/install.texi: Document --with-multilib-list for Linux/x86-64. * doc/invoke.texi: Document -mx32.
diff --git a/gcc/config.gcc b/gcc/config.gcc index c77f40b..449409e 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1280,6 +1280,22 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i tm_file=${tm_file} i386/x86-64.h i386/gnu-user64.h i386/linux64.h tm_defines=${tm_defines} TARGET_BI_ARCH=1 tmake_file=${tmake_file} i386/t-linux64 + x86_multilibs=${with_multilib_list} + if test $x86_multilibs = default; then + x86_multilibs=m64,m32 + fi + x86_multilibs=`echo $x86_multilibs | sed -e 's/,/ /g'` + for x86_multilib in ${x86_multilibs}; do + case ${x86_multilib} in + m32 | m64 | mx32) + TM_MULTILIB_CONFIG=${TM_MULTILIB_CONFIG},${x86_multilib} + ;; +
Re: [PATCH] Fix complex {*,/} real or real * complex handling in C FE (PR c/49644)
On Thu, Jul 07, 2011 at 02:55:45PM +, Joseph S. Myers wrote: On Thu, 7 Jul 2011, Jakub Jelinek wrote: For MULT_EXPR and TRUNC_DIV_EXPR, both sides of COMPLEX_EXPR contain a copy of the non-complex operand, which means its side-effects can be evaluated twice. For PLUS_EXPR/MINUS_EXPR they appear just in one of the operands and thus it works fine as is. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6? OK, but I think you need a similar patch for the C++ front end as well. Indeed, thanks. Attached is the corresponding C++ patch and simplified C patch (with c_save_expr calls right in the switch stmt for the cases that need it instead of another condition before). Jakub 2011-07-07 Jakub Jelinek ja...@redhat.com PR c/49644 * typeck.c (cp_build_binary_op): For MULT_EXPR and TRUNC_DIV_EXPR with one non-complex and one complex argument, call save_expr on both operands. * g++.dg/torture/pr49644.C: New test. --- gcc/cp/typeck.c.jj 2011-06-21 16:45:52.0 +0200 +++ gcc/cp/typeck.c 2011-07-07 17:00:17.0 +0200 @@ -4338,6 +4338,7 @@ cp_build_binary_op (location_t location, { case MULT_EXPR: case TRUNC_DIV_EXPR: + op1 = save_expr (op1); imag = build2 (resultcode, real_type, imag, op1); /* Fall through. */ case PLUS_EXPR: @@ -4356,6 +4357,7 @@ cp_build_binary_op (location_t location, switch (code) { case MULT_EXPR: + op0 = save_expr (op0); imag = build2 (resultcode, real_type, op0, imag); /* Fall through. 
*/ case PLUS_EXPR: --- gcc/testsuite/g++.dg/torture/pr49644.C.jj 2011-07-07 17:01:21.0 +0200 +++ gcc/testsuite/g++.dg/torture/pr49644.C 2011-07-07 17:01:27.0 +0200 @@ -0,0 +1,17 @@ +// PR c/49644 +// { dg-do run } + +extern "C" void abort (); + +int +main () +{ + _Complex double a[12], *c = a, s = 3.0 + 1.0i; + double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b; + int i; + for (i = 0; i < 6; i++) +*c++ = *d++ * s; + if (c != a + 6 || d != b + 6) +abort (); + return 0; +} 2011-07-07 Jakub Jelinek ja...@redhat.com PR c/49644 * c-typeck.c (build_binary_op): For MULT_EXPR and TRUNC_DIV_EXPR with one non-complex and one complex argument, call c_save_expr on both operands. * gcc.c-torture/execute/pr49644.c: New test. --- gcc/c-typeck.c.jj 2011-05-31 08:03:10.0 +0200 +++ gcc/c-typeck.c 2011-07-07 11:47:31.0 +0200 @@ -10040,6 +10040,7 @@ build_binary_op (location_t location, en { case MULT_EXPR: case TRUNC_DIV_EXPR: + op1 = c_save_expr (op1); imag = build2 (resultcode, real_type, imag, op1); /* Fall through. */ case PLUS_EXPR: @@ -10060,6 +10061,7 @@ build_binary_op (location_t location, en switch (code) { case MULT_EXPR: + op0 = c_save_expr (op0); imag = build2 (resultcode, real_type, op0, imag); /* Fall through. */ case PLUS_EXPR: --- gcc/testsuite/gcc.c-torture/execute/pr49644.c.jj 2011-07-07 11:48:34.0 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr49644.c 2011-07-07 11:35:52.0 +0200 @@ -0,0 +1,16 @@ +/* PR c/49644 */ + +extern void abort (void); + +int +main () +{ + _Complex double a[12], *c = a, s = 3.0 + 1.0i; + double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b; + int i; + for (i = 0; i < 6; i++) +*c++ = *d++ * s; + if (c != a + 6 || d != b + 6) +abort (); + return 0; +}
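The double-evaluation problem can be modelled in plain C by writing out the two component products by hand. The sketch below is an illustration under assumptions, not the front end's actual expansion: `next_value` stands in for an operand with side effects (such as `*d++` in the testcase), and all names are hypothetical.

```c
#include <assert.h>
#include <complex.h>

static int reads;  /* counts how often the real operand is evaluated */

/* Stand-in for an operand with a side effect.  */
static double next_value(void)
{
    reads++;
    return 2.0;
}

/* Expansion that mentions the real operand in both components, so it
   is evaluated twice.  */
static double complex scale_twice(double complex s)
{
    return (creal(s) * next_value()) + (cimag(s) * next_value()) * I;
}

/* Evaluate the operand once and reuse the saved value, which is
   roughly what wrapping it in save_expr achieves.  */
static double complex scale_once(double complex s)
{
    double r = next_value();
    return (creal(s) * r) + (cimag(s) * r) * I;
}

/* Runs f on 3 + 1i and reports how many operand reads it caused.  */
static int reads_after(double complex (*f)(double complex))
{
    reads = 0;
    (void)f(3.0 + 1.0 * I);
    return reads;
}

/* Self-check: same value, different number of evaluations.  */
static void check_complex_scale(void)
{
    assert(reads_after(scale_twice) == 2);
    assert(reads_after(scale_once) == 1);
    assert(creal(scale_once(3.0 + 1.0 * I)) == 6.0);
    assert(cimag(scale_once(3.0 + 1.0 * I)) == 2.0);
}
```

Both functions compute the same product; only the number of times the side effect runs differs, which is what the `c != a + 6 || d != b + 6` check in the testcase detects.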
Re: [ARM] Deprecate -mwords-little-endian
Richard Earnshaw rearn...@arm.com writes: On 29/06/11 12:28, Richard Sandiford wrote: ARM has an option called -mwords-little-endian that provides big-endian compatibility with pre-2.8 compilers. When I asked Richard about it, he seemed to think it had outlived its usefulness, so this patch deprecates it. We can then remove it once 4.7 is out. Tested on arm-linux-gnueabi. OK to install? If so, I'll do a patch for the web page as well. Please also update the in-line help text in arm.opt. OK with that change. Thanks. I've attached the patch I applied below. How's this for the docs change? -- Index: htdocs/gcc-4.7/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v retrieving revision 1.20 diff -u -r1.20 changes.html --- htdocs/gcc-4.7/changes.html 6 Jul 2011 23:37:11 - 1.20 +++ htdocs/gcc-4.7/changes.html 7 Jul 2011 15:17:03 - @@ -43,6 +43,9 @@ only intended as a migration aid from SunOS 4 to SunOS 5. The <code>-compat-bsd</code> compiler option is not recognized any longer.</li> + +<li>The ARM port's <code>-mwords-little-endian</code> option has +been deprecated. It will be removed in a future release.</li> </ul> <h2>General Optimizer Improvements</h2> -- I wondered about expanding it a bit (describing why the option was added and why it's no longer needed). It felt like overkill for such a niche option though. Richard gcc/ * doc/invoke.texi (mwords-little-endian): Deprecate. * config/arm/arm.opt (mwords-little-endian): Likewise. * config/arm/arm.c (arm_option_override): Warn about the deprecation of -mwords-little-endian. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi 2011-07-04 09:09:02.0 +0100 +++ gcc/doc/invoke.texi 2011-07-04 13:50:06.0 +0100 @@ -10239,7 +10239,7 @@ Generate code for a little-endian word o order. That is, a byte order of the form @samp{32107654}. Note: this option should only be used if you require compatibility with code for big-endian ARM processors generated by versions of the compiler prior to -2.8. +2.8. 
This option is now deprecated. @item -mcpu=@var{name} @opindex mcpu Index: gcc/config/arm/arm.opt === --- gcc/config/arm/arm.opt 2011-06-22 16:46:28.0 +0100 +++ gcc/config/arm/arm.opt 2011-07-04 13:52:38.0 +0100 @@ -235,7 +235,7 @@ Tune code for the given processor mwords-little-endian Target Report RejectNegative Mask(LITTLE_WORDS) -Assume big endian bytes, little endian words +Assume big endian bytes, little endian words. This option is deprecated. mvectorize-with-neon-quad Target Report Mask(NEON_VECTORIZE_QUAD) Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c 2011-07-01 05:37:51.0 +0100 +++ gcc/config/arm/arm.c 2011-07-04 13:50:06.0 +0100 @@ -1483,6 +1483,10 @@ arm_option_override (void) if (TARGET_APCS_FLOAT) warning (0, "passing floating point arguments in fp regs not yet supported"); + if (TARGET_LITTLE_WORDS) +warning (OPT_Wdeprecated, "%<mwords-little-endian%> is deprecated and " +"will be removed in a future release"); + /* Initialize boolean versions of the flags, for use in the arm.md file. */ arm_arch3m = (insn_flags & FL_ARCH3M) != 0; arm_arch4 = (insn_flags & FL_ARCH4) != 0;
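For anyone unfamiliar with the layout the option selects, the byte order "32107654" from the documentation can be sketched as below. This is an illustration only, with hypothetical function names, and it assumes byte 0 denotes the least significant byte of a 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value with big-endian bytes inside each 32-bit word
   but with the low word first, i.e. memory order 3 2 1 0 7 6 5 4.  */
static void store_words_little(uint64_t value, unsigned char out[8])
{
    for (int word = 0; word < 2; word++) {        /* low word first */
        uint32_t w = (uint32_t)(value >> (32 * word));
        for (int b = 0; b < 4; b++)               /* high byte first */
            out[4 * word + b] = (unsigned char)(w >> (8 * (3 - b)));
    }
}

/* Convenience wrapper: the byte found at memory position index.  */
static unsigned words_little_byte(uint64_t value, int index)
{
    unsigned char buf[8];
    store_words_little(value, buf);
    return buf[index];
}

/* Self-check: each byte of the input carries its own significance
   number, so the memory image spells out the documented order.  */
static void check_words_little(void)
{
    static const unsigned expected[8] = { 3, 2, 1, 0, 7, 6, 5, 4 };
    for (int i = 0; i < 8; i++)
        assert(words_little_byte(0x0706050403020100ULL, i) == expected[i]);
}
```

With the input `0x0706050403020100`, the stored bytes read 3, 2, 1, 0, 7, 6, 5, 4 in increasing address order.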
Re: PATCH [1/n] X32: Add initial -x32 support
On Thu, Jul 7, 2011 at 17:12, Uros Bizjak ubiz...@gmail.com wrote: On Thu, Jul 7, 2011 at 5:02 PM, H.J. Lu hjl.to...@gmail.com wrote: Did you even _think_ of looking at the sh configury, and do something vaguely similar for x86? You should not duplicate t-linux64 at all. Instead, in config.gcc set m64/m32 as the default value for with_multilib_list on i386 biarch and x86_64. Pass $with_multilib_list to t-linux64 using TM_MULTILIB_CONFIG. Then, do something like

comma=,
MULTILIB_OPTIONS = $(subst $(comma),/,@TM_MULTILIB_CONFIG@)
MULTILIB_DIRNAMES = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS)))
MULTILIB_OSDIRNAMES = 64=../lib64
MULTILIB_OSDIRNAMES += 32=$(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)
MULTILIB_OSDIRNAMES += x32=../libx32

in config/t-linux64. (Each on one line, apologies for any wrapping) The option will be used as --with-multilib-list=m64,m32,mx32 (allowing the user to omit some of the variants, too).

This is an excellent suggestion. Here is the updated patch. It uses TM_MULTILIB_CONFIG and removes config/i386/t-linux-x32. Uros, is this OK for trunk to replace the patch you approved earlier?

Er, the approval was for x86 parts, I will leave approval for build parts to Paolo.

Yes, build parts are okay too. Paolo
Re: [PATCH 4/6] Shrink-wrapping
Whee! Thanks for reviewing (reviving?) this old thing. I should be posting an up-to-date version of this, but for the moment it has to wait until dwarf2out is sorted out, and I'm rather busy with other stuff. I hope to squeeze this in in the not too distant future. I'll try to answer some of the questions now... On 07/07/11 16:34, Richard Sandiford wrote: Bernd Schmidt ber...@codesourcery.com writes: This adds the actual optimization, and reworks the JUMP_LABEL handling for return blocks. See the introduction mail or the new comment ahead of thread_prologue_and_epilogue_insns for more notes. It seems a shame to have both (return) and (simple_return). Yes, but the distinction exists and must be represented somehow - you can have both in the same function. You said that we need the distinction in order to cope with targets like ARM, whose (return) instruction actually performs some of the epilogue too. It feels like the load of the saved registers should really be expressed in rtl, in parallel with the return. I realise that'd prevent conditional returns though. Maybe there's no elegant way out... It certainly would make it harder to transform branches to conditional returns. It would also require examining every port to see if it needs changes to its return patterns. It probably only affects ARM though, but that target is important enough that we should support the feature (i.e. conditional returns that pop registers). If we described conditional returns only as COND_EXEC maybe... AFAICT only ia64, arm, frv and c6x have conditional return. I'll have to think about it. Note that some interface changes will be necessary in any case - passing NULL as a new jump label simply isn't informative enough when redirecting a jump; we must be able to distinguish between the two forms of return at this level. So the ret_rtx/simple_return_rtx may turn out to be the simplest solution after all. 
With the hidden loads, it seems like we'll have a situation in which the values of call-saved registers will appear to be different for different real incoming edges to the exit block. Probably true, but I doubt we have any code that would notice. Can you imagine anything that would care? Is JUMP_LABEL ever null after this change? (In fully-complete rtl sequences, I mean.) It looked like some of the null checks in the patch might not be necessary any more. It shouldn't be, and it's possible that a few of these tests survived when they shouldn't have. JUMP_LABEL also seems somewhat misnamed after this change; maybe JUMP_TARGET would be better? Maybe. I dread the renaming patch though. It'd also be nice to get rid of all these big blocks of code that are conditional on preprocessor macros, but I realise you're just following existing practice in the surrounding code, so again it can be left to a future cleanup. Yeah, this function is quite horrid - so many different paths through it. However, it looks like the only target without HAVE_prologue is actually pdp11, so we're carrying some unnecessary baggage for purely retrocomputing purposes. Paul, can you fix that? ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode); + simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode); It'd be nice to s/ret_rtx/return_rtx/ for consistency, but that can happen anytime. Unfortunately there's another macro called return_rtx. + df_regs_ever_live_p (regno)) + return true; This can be done as a follow-up, but it looks like df should be using a HARD_REG_SET here, and that we should be able to get at it directly. For the df_regs_ever_live thing? Could change that, yes. [...] AIUI, this prevents the optimisation for things like if (a) { switch (b) { case 1: ...stuff that requires a frame... break; case 2: ...stuff that requires a frame... break; default: ...stuff that doesn't require a frame... break; } } The switch won't be in ANTIC, but it will have two successors that are. Is that right? 
Would it work to do something like: [...] IIRC the problem here is making sure to match up prologues and epilogues - the latter should not occur on any path that had a prologue set up and vice versa. I think something more clever would break on e.g. if (c) goto label; if (a) { switch (b) { case 1: ...stuff that requires a frame... break; case 2: ...stuff that requires a frame... break; default: ...stuff that doesn't require a frame... label: ...more stuff that doesn't require a frame... break; } } If you add a prologue before the switch, two paths join at label where one needs a prologue and the other doesn't. Does the JUMP_LABEL (returnjump) = ret_rtx; handle targets that use things like (set (pc) (reg RA)) as their return? Probably worth adding a comment if so. It simply
Re: [ARM] Deprecate -mwords-little-endian
On 07/07/11 16:18, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 29/06/11 12:28, Richard Sandiford wrote: ARM has an option called -mwords-little-endian that provides big-endian compatibility with pre-2.8 compilers. When I asked Richard about it, he seemed to think it had outlived its usefulness, so this patch deprecates it. We can then remove it once 4.7 is out. Tested on arm-linux-gnueabi. OK to install? If so, I'll do a patch for the web page as well. Please also update the in-line help text in arm.opt. OK with that change. Thanks. I've attached the patch I applied below. How's this for the docs change?

--
Index: htdocs/gcc-4.7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.20
diff -u -r1.20 changes.html
--- htdocs/gcc-4.7/changes.html 6 Jul 2011 23:37:11 - 1.20
+++ htdocs/gcc-4.7/changes.html 7 Jul 2011 15:17:03 -
@@ -43,6 +43,9 @@
 only intended as a migration aid from SunOS 4 to SunOS 5. The
 <code>-compat-bsd</code> compiler option is not recognized any longer.</li>
+
+<li>The ARM port's <code>-mwords-little-endian</code> option has
+been deprecated. It will be removed in a future release.</li>
 </ul>

Looks fine to me, but please allow 24 hours for the web maintainers to comment if they wish. R.
Re: Generic hwloop support library
On 07/05/11 21:25, Richard Sandiford wrote: (Could you bootstrap this on x86_64 to check for things like that? That has no loop_end pattern so it wouldn't be much of a test, but a x86_64 x bfin compiler has no warnings in this file with the intptr_t thing fixed. A C bootstrap only should be fine of course, since the code isn't going to be run.) + hwloop_info loops = NULL; Unnecessary initialisation (or at least, it should be). ? The value is used inside the loop to initialize next of the first loop. Committed with these changes (except the last). Bernd
Re: [PATCH] Fix folding of -(unsigned)(a * -b)
Hi, On Thu, 7 Jul 2011, Richard Guenther wrote:

Index: gcc/fold-const.c
===
--- gcc/fold-const.c (revision 175962)
+++ gcc/fold-const.c (working copy)
@@ -7561,7 +7561,7 @@ fold_unary_loc (location_t loc, enum tre
   if (arg0)
     {
       if (CONVERT_EXPR_CODE_P (code)
-          || code == FLOAT_EXPR || code == ABS_EXPR)
+          || code == FLOAT_EXPR || code == ABS_EXPR || code == NEGATE_EXPR)
         {
           /* Don't use STRIP_NOPS, because signedness of argument type
              matters. */

Um, so why would stripping a signchange ever be okay? There are many other unary codes that behave similarly enough to FLOAT_EXPR or CONVERT_EXPR that it's not obvious to me why those would allow sign stripping but the above not. When the operand is float or fixed point types then STRIP_SIGN_NOPS and STRIP_NOPS aren't different, and when the operands are integer types I don't see how we can ignore sign-changing nops.

I'm thinking about: VEC_UNPACK_HI_EXPR, VEC_UNPACK_LO_EXPR and PAREN_EXPR. Perhaps BIT_NOT_EXPR. Perhaps also NON_LVALUE_EXPR. All these can conceivably have integer operands, where signedness seems to matter.

I think these are harmless: CONJ_EXPR, FIXED_CONVERT_EXPR, FIX_TRUNC_EXPR, ADDR_SPACE_CONVERT_EXPR as their operands are either float/fixed-point types or pointers, but as said in those cases STRIP_NOPS and STRIP_SIGN_NOPS are equivalent.

So, why not simply always use STRIP_SIGN_NOPS?

Ciao, Michael.
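As a source-level illustration of why a sign-changing conversion cannot simply be dropped under a unary operation, here is a minimal C sketch. It is not GCC code: the function names are made up for this note, and it only mirrors what the folder would see with and without the int -> unsigned conversion on the operand.

```c
#include <limits.h>

/* Hypothetical helpers, not part of GCC. */

/* A FLOAT_EXPR-like operation applied to the original signed operand. */
double float_of_signed(int x)
{
    return (double)x;
}

/* The same operation applied after an int -> unsigned conversion.
   Stripping that conversion changes the result for negative x. */
double float_of_unsigned_view(int x)
{
    return (double)(unsigned int)x;
}

/* A NEGATE_EXPR-like operation done in unsigned arithmetic. It is well
   defined for every input, including INT_MIN, because unsigned negation
   wraps modulo UINT_MAX + 1. Doing the negation in int instead would
   overflow for INT_MIN. */
unsigned int negate_as_unsigned(int x)
{
    return -(unsigned int)x;
}
```

For x = -1 the two float helpers disagree (one yields -1.0, the other the value of UINT_MAX), which is the kind of difference a sign-preserving strip is meant to protect.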
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote: On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: This patch adds an option to not load the static chain (r11) for 64-bit PowerPC calls through function pointers (or virtual function). Most of the languages on the PowerPC do not need the static chain being loaded when called, and adding this instruction can slow down code that calls very short functions. In addition, if the function does not call alloca, setjmp or deal with exceptions where the stack is modified, the compiler can move the store of the TOC value for the current function to the prologue of the function, rather than at each call site. The effect of these patches is to speed up 464.h264ref in the Spec 2006 benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the save of the TOC register is hoisted). I believe this is due to the load of the current function's TOC (r2) having to wait until the store queue is drained with the store just before the call. Unfortunately, I do see a 3% slowdown in 429.mcf, which I don't know what the cause is. I have bootstraped the compiler and saw that there were no regressions in make check. Is it ok to install in the trunk? Hum. Can't the compiler figure this our itself per-call-site? At least the name of the command-line switch -m[no-]r11 is meaningless to me. Points-to information should be able to tell you if the function pointer points to a nested function. No, the compiler cannot figure it out. Consider the case where a function is passed a pointer to a function, such as the standard library function qsort. The call may come from any random module, that isn't part of the compilation suite, such as if the function being passed the pointer is in a shared library. You don't know whether the function pointed to uses the static chain (i.e. 
nested function call with trampoline, call to PL/I, or other language that does use the static chain, which is part of the ABI). The point of the switch is similar to -ffast-math where you say you are willing to ignore some corner cases in the standard in order to get better performance. I certainly can call the switch -mno-static-chain, which is perhaps more meaningful (at least to us compiler folk, I'm not sure static chain means much to the normal programmer). -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Ping Re: Remove config.gcc support for *local* configurations
Ping. This patch http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02408.html is pending review. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Fix dead_debug_insert_before ICE (PR debug/49522, take 3)
So, here is a new patch which doesn't need two loops, just might go a little bit backwards to unchain dead_debug_use for the reset insn. It still needs the change of the gcc_assert (reg) into if (reg == NULL) return;, because the dead-used bitmap is with this sometimes a false positive (saying that a regno is referenced even when it isn't). But here it is IMHO better to occasionally live with the false positives, which just means we'll sometimes once walk the chain in dead_debug_reset or dead_debug_insert_before before resetting it, than to recompute the bitmap (we'd need a second loop for that, bitmap_clear (debug->used) and populate it again).

Fine with me for both points, but move some bits of these explanations to the code itself because this isn't obvious. For example see below.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-07 Jakub Jelinek ja...@redhat.com

PR debug/49522
* df-problems.c (dead_debug_reset): Remove dead_debug_uses referencing debug insns that have been reset.
(dead_debug_insert_before): Don't assert reg is non-NULL, instead return immediately if it is NULL.
* gcc.dg/debug/pr49522.c: New test.

OK, thanks.

--- gcc/df-problems.c.jj 2011-07-07 02:32:45.928547053 +0200
+++ gcc/df-problems.c 2011-07-07 09:57:34.846464573 +0200
@@ -3096,6 +3096,7 @@ static void
 dead_debug_reset (struct dead_debug *debug, unsigned int dregno)
 {
   struct dead_debug_use **tailp = &debug->head;
+  struct dead_debug_use **insnp = &debug->head;
   struct dead_debug_use *cur;
   rtx insn;

@@ -3113,9 +3114,21 @@ dead_debug_reset (struct dead_debug *deb
             debug->to_rescan = BITMAP_ALLOC (NULL);
           bitmap_set_bit (debug->to_rescan, INSN_UID (insn));
           XDELETE (cur);
+          if (tailp != insnp && DF_REF_INSN ((*insnp)->use) == insn)
+            tailp = insnp;

/* If the current use isn't the first one attached to INSN, go back to this first use. We assume that the uses attached to an insn are adjacent. */

+          while ((cur = *tailp) && DF_REF_INSN (cur->use) == insn)
+            {
+              *tailp = cur->next;
+              XDELETE (cur);
+            }
+          insnp = tailp;

/* Then remove all the other uses attached to INSN. */

         }
       else
-        tailp = &(*tailp)->next;
+        {
+          if (DF_REF_INSN ((*insnp)->use) != DF_REF_INSN (cur->use))
+            insnp = tailp;
+          tailp = &(*tailp)->next;
+        }
     }
 }

@@ -3174,7 +3187,8 @@ dead_debug_insert_before (struct dead_de
         tailp = &(*tailp)->next;
     }

-  gcc_assert (reg);
+  if (reg == NULL)
+    return;

/* We may have dangling bits in debug->used for registers that were part of a multi-register use, one component of which has been reset. */

-- Eric Botcazou
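The "go a little bit backwards" idea in this patch can be sketched outside GCC as follows. This is a simplified stand-in, not the df-problems.c code: struct use, push_use, count_uses and reset_uses are hypothetical names, plain ints replace insns and registers, and the sketch relies on the same assumption stated in the review, namely that uses attached to one insn are adjacent in the list.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a dead_debug_use chain entry. */
struct use {
    int insn;            /* id of the instruction owning this use */
    int regno;           /* register referenced by this use */
    struct use *next;
};

/* Prepend a use to the list; allocation failure is ignored in this sketch. */
void push_use(struct use **head, int insn, int regno)
{
    struct use *u = malloc(sizeof *u);
    if (u == NULL)
        return;
    u->insn = insn;
    u->regno = regno;
    u->next = *head;
    *head = u;
}

/* Count the entries in the list. */
int count_uses(const struct use *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Drop every use of any insn that references REGNO, in a single pass.
   tailp is the link currently examined; insnp remembers the link to the
   first use of the insn tailp is currently inside. */
void reset_uses(struct use **head, int regno)
{
    struct use **tailp = head;
    struct use **insnp = head;
    struct use *cur;

    while ((cur = *tailp) != NULL) {
        if (cur->regno == regno) {
            int insn = cur->insn;

            /* Step back to the first use of this insn if we are past it. */
            if (tailp != insnp && (*insnp)->insn == insn)
                tailp = insnp;

            /* Unlink and free all adjacent uses of the same insn. */
            while ((cur = *tailp) != NULL && cur->insn == insn) {
                *tailp = cur->next;
                free(cur);
            }
            insnp = tailp;
        } else {
            /* Entering a different insn: remember where it starts. */
            if ((*insnp)->insn != cur->insn)
                insnp = tailp;
            tailp = &cur->next;
        }
    }
}
```

Keeping a second pointer-to-link avoids a separate pass over the chain; the price is the adjacency assumption, which is why the review asks for it to be stated in a comment.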
Re: CFT: Move unwinder to toplevel libgcc
On Thu, 2011-07-07 at 15:08 +0200, Rainer Orth wrote: Tristan Gingold ging...@adacore.com writes: Otherwise, the patch is unchanged from the original submission: [build] Move unwinder to toplevel libgcc http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01452.html Unfortunately, it hasn't seen much comment. I'm now looking for testers especially on platforms with more change and approval of those parts: * Several IA-64 targets: ia64*-*-linux* ia64*-*-hpux* ia64-hp-*vms* For ia64-hp-vms, consider your patch approved if the parts for ia64 are. In case of break, I will fix them. In that case, perhaps Steve could have a look? I'd finally like to make some progress on this patch. Thanks. Rainer I just tried builds on ia64 linux and HP-UX and both builds failed. I am re-trying the HP-UX build with --with-system-libunwind to see if that changes things but that should be the default on IA64 HP-UX. On Linux (debian) the build stopped with: /test/big-foot1/gcc/nightly/gcc-ia64-debian-linux-gnu-trunk/ia64-debian-linux-gnu/bin/ranlib libgcov.a make[3]: *** No rule to make target `/test/big-foot1/gcc/nightly/src/trunk/libgcc/unwind-sjlj.c', needed by `unwind-sjlj.o'. Stop. make[3]: Leaving directory `/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc/ia64-debian-linux-gnu/libgcc' make[2]: *** [all-stage1-target-libgcc] Error 2 make[2]: Leaving directory `/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc' make[1]: *** [stage1-bubble] Error 2 make[1]: Leaving directory `/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc' make: *** [bootstrap] Error 2 The patch appeared to install correctly into my source tree and I ran autoreconf to regenerate the configure files. It looks like patch didn't handle the unwind files that moved. I will try doing that by hand and see if that fixes things. Steve Ellcey s...@cup.hp.com
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
On Thu, Jul 7, 2011 at 5:47 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote: On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: This patch adds an option to not load the static chain (r11) for 64-bit PowerPC calls through function pointers (or virtual function). Most of the languages on the PowerPC do not need the static chain being loaded when called, and adding this instruction can slow down code that calls very short functions. In addition, if the function does not call alloca, setjmp or deal with exceptions where the stack is modified, the compiler can move the store of the TOC value for the current function to the prologue of the function, rather than at each call site. The effect of these patches is to speed up 464.h264ref in the Spec 2006 benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the save of the TOC register is hoisted). I believe this is due to the load of the current function's TOC (r2) having to wait until the store queue is drained with the store just before the call. Unfortunately, I do see a 3% slowdown in 429.mcf, which I don't know what the cause is. I have bootstraped the compiler and saw that there were no regressions in make check. Is it ok to install in the trunk? Hum. Can't the compiler figure this our itself per-call-site? At least the name of the command-line switch -m[no-]r11 is meaningless to me. Points-to information should be able to tell you if the function pointer points to a nested function. No, the compiler cannot figure it out. Consider the case where a function is passed a pointer to a function, such as the standard library function qsort. The call may come from any random module, that isn't part of the compilation suite, such as if the function being passed the pointer is in a shared library. You don't know whether the function pointed to uses the static chain (i.e. 
nested function call with trampoline, call to PL/I, or other language that does use the static chain, which is part of the ABI). The point of the switch is similar to -ffast-math where you say you are willing to ignore some corner cases in the standard in order to get better performance. Well, I guess you don't propose to build glibc with -mno-r11? The compiler certainly can't figure out in _all_ cases - but it should be able to handle most of the cases (with LTO even more cases) ok, no? I also wonder why loading a register is so expensive compared to the actual call ... I certainly can call the switch -mno-static-chain, which is perhaps more meaningful (at least to us compiler folk, I'm not sure static chain means much to the normal programmer). Well, that's up to the target maintainers to decide, maybe -mno-nested-functions instead? Richard.
Re: [PATCH 4/6] Shrink-wrapping
On 07/07/11 15:34, Richard Sandiford wrote: It seems a shame to have both (return) and (simple_return). You said that we need the distinction in order to cope with targets like ARM, whose (return) instruction actually performs some of the epilogue too. It feels like the load of the saved registers should really be expressed in rtl, in parallel with the return. I realise that'd prevent conditional returns though. Maybe there's no elegant way out...

You'd still need to deal with distinct returns for shrink-wrapped code when the full (return) expands to

    ldm sp, {regs..., pc}

The shrink wrapped version would always be

    bx lr

There are also cases (eg on v4T) where the Thumb return sequence sometimes has to pop into a lo register before branching to that return address, eg

    pop {r3}
    bx r3

in order to get interworking. R.
Re: [PATCH] Make VRP optimize useless conversions
Hi, On Thu, 7 Jul 2011, Richard Guenther wrote:

+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  gimple def_stmt = SSA_NAME_DEF_STMT (rhs1);
+  value_range_t *final, *inner;
+
+  /* Obtain final and inner value-ranges for a conversion
+     sequence (final-type)(intermediate-type)inner-type. */
+  final = get_value_range (gimple_assign_lhs (stmt));
+  if (final->type != VR_RANGE)
+    return false;
+  if (!is_gimple_assign (def_stmt)
+      || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
+    return false;
+  rhs1 = gimple_assign_rhs1 (def_stmt);
+  if (TREE_CODE (rhs1) != SSA_NAME)
+    return false;
+  inner = get_value_range (rhs1);
+  if (inner->type != VR_RANGE)
+    return false;
+  if (!tree_int_cst_equal (final->min, inner->min)
+      || !tree_int_cst_equal (final->max, inner->max))
+    return false;

I think that's a bit too conservative. Granted in current VRP it might work, but think about an intermediate truncation plus widening:

short s; short d = (short)(signed char)s;

It wouldn't be wrong for VRP to assign d the range [-16384,16383], suboptimal but correct. That would trigger your function in removing the truncation, and _that_ would be incorrect. The bounds of VRP aren't reliably tight. You probably want to recheck if the intermediate conversion isn't truncating the known input range of rhs1.

Ciao, Michael.
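To see the truncation concern at the source level, here is a small C sketch. It is only an illustration under stated assumptions: through_signed_char and intermediate_is_lossless are hypothetical names, not VRP functions, and the out-of-range short -> signed char conversion is implementation-defined in C (GCC documents it as reduction modulo 2^N).

```c
#include <limits.h>
#include <stdbool.h>

/* The conversion chain from the example: short -> signed char -> short.
   For inputs outside the signed char range the intermediate step changes
   the value, so it cannot be removed. */
short through_signed_char(short s)
{
    return (short)(signed char)s;
}

/* Hypothetical check in the spirit of the suggestion above: the
   intermediate conversion is only removable when the known input range
   [lo, hi] already fits in the intermediate type. */
bool intermediate_is_lossless(int lo, int hi)
{
    return lo >= SCHAR_MIN && hi <= SCHAR_MAX;
}
```

With an input range such as [-16384, 16383] the check fails, so the intermediate conversion would have to stay even if the final and inner ranges happened to compare equal.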
Re: [testsuite] ARM wmul tests: require arm_dsp_multiply
On 06/07/11 18:33, Janis Johnson wrote: On 06/29/2011 06:25 AM, Richard Earnshaw wrote: On 23/06/11 22:38, Janis Johnson wrote: Tests wmul-[1234].c and mla-2.c in gcc.target/arm require support that the arm backend identifies as TARGET_DSP_MULTIPLY. The tests all specify a -march option with that support, but it is overridden by multilib flags. This patch adds a new effective target, arm_dsp_multiply, and requires it for those tests instead of having them specify a -march value. This means that the tests will be skipped for older targets and test coverage relies on testing for some newer multilibs. The same effective target is needed for tests smlaltb-1.c, smlaltt-1.c, smlatb-1.c, and smlatt-1.c, but those also need to be renamed so the scans don't pass just because the file name is in the assembly file. OK for trunk, and later for 4.6? (btw, I'm currently testing ARM compile-only tests with 43 sets of multilib flags) I've recently approved a patch from James Greenhalgh (http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01852.html) that defines __ARM_DSP_MULTIPLY when these features are available. That should simplify your target-supports change and also serve as a check that we aren't erroneously defining that macro. R. This version uses the new macro from James Greenhalgh, making the effective-target check trivial. The patch removes -march options from the tests, and adds a tab to the scans in smla*.c so the scan won't match the file name; there are other arm tests that use tab in the search target. OK for trunk, and later for 4.6? Putting this patch on 4.6 requires the new macro there as well. I have no objections if the branch maintainers are happy with this. R.
Re: [PATCH] New IPA-CP with real function cloning
Hi, patch is long, so let me review it in more passes. 2011-06-22 Martin Jambor mjam...@suse.cz * ipa-prop.h: Include alloc-pool.h. (ipa_lattice_type): Removed. (ipcp_value_source): New type. (ipcp_value): Likewise. (ipcp_values_pool): Declare. (ipcp_sources_pool): Likewise. (ipa_param_descriptor): Removed. (ipcp_lattice): Removed fileds type and constant. Added fields decl, values, values_count, contains_variable, bottom, used and virt_call. (ipa_node_params): New fields lattices, known_vals, clone_for_all_contexts and noe dead, removed fields params and count_scale. (ipa_get_param): Updated. (ipa_param_cannot_devirtualize_p): Removed. (ipa_param_types_vec_empty): Likewise. (ipa_edge_args): New field next_edge_clone. (ipa_func_list): Removed. (ipa_init_func_list): Removed declaration. (ipa_push_func_to_list_1): Likewise. (ipa_pop_func_from_list): Likewise. (ipa_push_func_to_list): Removed. (ipa_lattice_from_jfunc): Remove declaration. (ipa_get_jf_pass_through_result): Declare. (ipa_get_jf_ancestor_result): Likewise. (ipa_value_from_jfunc): Likewise. (ipa_get_lattice): Update. (ipa_lat_is_single_const): New function. * ipa-prop.c (ipa_push_func_to_list_1): Removed. (ipa_init_func_list): Likewise. (ipa_pop_func_from_list): Likewise. (ipa_get_param_decl_index): Fix coding style. (ipa_populate_param_decls): Update to use new lattices. (ipa_initialize_node_params): Likewise. (visit_ref_for_mod_analysis): Likewise. (ipa_analyze_params_uses): Likewise. (ipa_free_node_params_substructures): Likewise. (ipa_edge_duplication_hook): Add the new edge to the list of edge clones. (ipa_node_duplication_hook): Update to use new lattices. (ipa_free_all_structures_after_ipa_cp): Free alloc pools. (ipa_free_all_structures_after_iinln): Likewise. (ipa_write_node_info): Update to use new lattices. (ipa_read_node_info): Likewise. (ipa_get_jf_pass_through_result): New function. (ipa_get_jf_ancestor_result): Likewise. (ipa_value_from_jfunc): Likewise. 
(ipa_cst_from_jfunc): Reimplemented using ipa_value_from_jfunc. * ipa-cp.c: Reimplemented. * params.def (PARAM_DEVIRT_TYPE_LIST_SIZE): Removed. (PARAM_IPA_CP_VALUE_LIST_SIZE): New parameter. * Makefile.in (IPA_PROP_H): Added alloc-pool.h to dependencies. * doc/invoke.texi (devirt-type-list-size): Removed description. (ipa-cp-value-list-size): Added description. * testsuite/gcc.dg/ipa/ipa-1.c: Updated testcase dump scan. * testsuite/gcc.dg/ipa/ipa-2.c: Likewise. * testsuite/gcc.dg/ipa/ipa-3.c: Likewise and made functions static. * testsuite/gcc.dg/ipa/ipa-4.c: Updated testcase dump scan. * testsuite/gcc.dg/ipa/ipa-5.c: Likewise. * testsuite/gcc.dg/ipa/ipa-7.c: Xfail test. * testsuite/gcc.dg/ipa/ipa-8.c: Updated testcase dump scan. * testsuite/gcc.dg/ipa/ipacost-1.c: Likewise. * testsuite/gcc.dg/ipa/ipacost-2.c: Likewise. * testsuite/gcc.dg/ipa/ipcp-1.c: New test. * testsuite/gcc.dg/ipa/ipcp-2.c: Likewise. * testsuite/gcc.dg/tree-ssa/ipa-cp-1.c: Updated testcase. /* Interprocedural analyses. Copyright (C) 2005, 2007, 2008, 2009, 2010 2011 Free Software Foundation, Inc. /* The following definitions and interfaces are used by interprocedural analyses or parameters. */ /* ipa-prop.c stuff (ipa-cp, indirect inlining): */ I was bit thinking about it and probably we could make ipa-prop and ipa-inline-analysis to be stand alone analysis passes, instead of something called either from inliner or ipa-cp analysis stage. But that could be done incrementally. /* A jump function for a callsite represents the values passed as actual arguments of the callsite. There are three main types of values : Pass-through - the caller's formal parameter is passed as an actual argument, possibly one simple operation performed on it. Constant - a constant (is_gimple_ip_invariant)is passed as an actual argument. Unknown - neither of the above. IPA_JF_CONST_MEMBER_PTR stands for C++ member pointers, it is a special constant in this regard. Other constants are represented with IPA_JF_CONST. 
While we are at docs, I would bit expand. It seems to me that for someone not familiar with the concept is not clear at all why member pointers are special. (i.e. explain that they are non-gimple-regs etc.) IPA_JF_ANCESTOR is a special pass-through jump function, which means that the result is an address of a part of the object pointed to by the formal parameter to which the function refers. It is mainly intended to represent getting
[patch tree-optimization]: [1 of 3]: Boolify compares more
Hello, This patch - first of series - adds to fold and some helper routines support for one-bit precision bitwise folding and detection. This patch is necessary for - next patch of series - boolification of comparisons. Bootstrapped and regression tested for all standard-languages (plus Ada and Obj-C++) on host x86_64-pc-linux-gnu. Ok for apply? Regards, Kai ChangeLog 2011-07-07 Kai Tietz kti...@redhat.com * fold-const.c (fold_truth_not_expr): Handle one bit precision bitwise operations. (fold_range_test): Likewise. (fold_truthop): Likewise. (fold_binary_loc): Likewise. (fold_truth_andor): Function replaces truth_andor label. (fold_ternary_loc): Use truth_value_type_p instead of truth_value_p. * gimple.c (canonicalize_cond_expr_cond): Likewise. * gimplify.c (gimple_boolify): Likewise. * tree-ssa-structalias.c (find_func_aliases): Likewise. * tree-ssa-forwprop.c (truth_valued_ssa_name): Likewise. * tree.h (truth_value_type_p): New function. (truth_value_p): Implemented as macro via truth_value_type_p. Index: gcc-head/gcc/fold-const.c === --- gcc-head.orig/gcc/fold-const.c +++ gcc-head/gcc/fold-const.c @@ -3074,20 +3074,35 @@ fold_truth_not_expr (location_t loc, tre case INTEGER_CST: return constant_boolean_node (integer_zerop (arg), type); +case BIT_AND_EXPR: + if (integer_onep (TREE_OPERAND (arg, 1))) + return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0)); + if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1) +return NULL_TREE; + /* fall through */ case TRUTH_AND_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); - return build2_loc (loc, TRUTH_OR_EXPR, type, + return build2_loc (loc, (code == BIT_AND_EXPR ? BIT_IOR_EXPR + : TRUTH_OR_EXPR), type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)), invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1))); +case BIT_IOR_EXPR: + if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1) +return NULL_TREE; + /* fall through. 
*/ case TRUTH_OR_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); - return build2_loc (loc, TRUTH_AND_EXPR, type, + return build2_loc (loc, (code == BIT_IOR_EXPR ? BIT_AND_EXPR + : TRUTH_AND_EXPR), type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)), invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1))); - +case BIT_XOR_EXPR: + if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1) +return NULL_TREE; + /* fall through. */ case TRUTH_XOR_EXPR: /* Here we can invert either operand. We invert the first operand unless the second operand is a TRUTH_NOT_EXPR in which case our @@ -3095,10 +3110,14 @@ fold_truth_not_expr (location_t loc, tre negation of the second operand. */ if (TREE_CODE (TREE_OPERAND (arg, 1)) == TRUTH_NOT_EXPR) - return build2_loc (loc, TRUTH_XOR_EXPR, type, TREE_OPERAND (arg, 0), + return build2_loc (loc, code, type, TREE_OPERAND (arg, 0), + TREE_OPERAND (TREE_OPERAND (arg, 1), 0)); + else if (TREE_CODE (TREE_OPERAND (arg, 1)) == BIT_NOT_EXPR + && TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 1))) == 1) + return build2_loc (loc, code, type, TREE_OPERAND (arg, 0), TREE_OPERAND (TREE_OPERAND (arg, 1), 0)); else - return build2_loc (loc, TRUTH_XOR_EXPR, type, + return build2_loc (loc, code, type, invert_truthvalue_loc (loc, TREE_OPERAND (arg, 0)), TREE_OPERAND (arg, 1)); @@ -3116,6 +3135,11 @@ fold_truth_not_expr (location_t loc, tre invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)), invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1))); + +case BIT_NOT_EXPR: + if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1) +return NULL_TREE; + /* fall through */ case TRUTH_NOT_EXPR: return TREE_OPERAND (arg, 0); @@ -3158,11 +3182,6 @@ fold_truth_not_expr (location_t loc, tre return build1_loc (loc, TREE_CODE (arg), type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); -case BIT_AND_EXPR: - if (!integer_onep (TREE_OPERAND (arg, 1))) - return NULL_TREE; - return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0)); - case SAVE_EXPR: return build1_loc (loc,
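The one-bit cases in the hunk above rely on the usual truth identities also holding for the bitwise operators when every operand is 0 or 1. The following is only a sketch in plain C, not GCC code; `not1` and `check_one_bit_identities` are names made up for illustration. It checks those identities exhaustively:

```c
#include <assert.h>

/* Negation of a one-bit value: for operands restricted to 0/1,
   a one-bit BIT_NOT behaves like x ^ 1.  Hypothetical helper.  */
unsigned not1 (unsigned x)
{
  return x ^ 1u;
}

/* Check, for all one-bit a and b, the identities that allow
   BIT_IOR/BIT_XOR/BIT_NOT to be inverted like their TRUTH_*
   counterparts.  Returns the number of (a, b) pairs checked.  */
int check_one_bit_identities (void)
{
  unsigned a, b;
  int checked = 0;

  for (a = 0; a <= 1; a++)
    for (b = 0; b <= 1; b++)
      {
        /* !(a | b) == !a & !b: the OR case becomes an AND.  */
        assert (not1 (a | b) == (not1 (a) & not1 (b)));
        /* !(a ^ b) == !a ^ b == a ^ !b: either operand may be inverted.  */
        assert (not1 (a ^ b) == (not1 (a) ^ b));
        assert (not1 (a ^ b) == (a ^ not1 (b)));
        /* !(a ^ !b) == a ^ b: an already negated operand is stripped.  */
        assert (not1 (a ^ not1 (b)) == (a ^ b));
        checked++;
      }
  return checked;
}
```

Calling `check_one_bit_identities ()` from a `main` walks all four operand pairs. The identities do not hold for wider types, which is presumably why the patch returns NULL_TREE when the precision is not 1.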
[patch tree-optimization]: [2 of 3]: Boolify compares more
Hello, This patch - second of the series - adds boolification of comparisons in the gimplifier. For this, casts from/to boolean are marked as not useless, and in fold_unary_loc casts to non-boolean integral types are preserved. The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly necessary - as long as fold-const handles 1-bit precision bitwise expressions with truth logic - but it has been shown to short-cut some more expensive folding, so I kept it within this patch. The adjusted testcase gcc.dg/uninit-15.c indicates that, due to optimization, we lose the variable's declaration in this case. But this might be expected. In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c test-case. It's caused by always having boolean type on conditions, so the vectorizer sees different types, which it doesn't handle right now. Maybe this issue could be special-cased for boolean types in tree-vect-loop, by making the operand for the used condition equal to the vector type. But this is a subject for a different patch and not addressed by this series. There is a regression in tree-ssa/vrp47.c, and the fix is addressed by the 3rd patch of this series. Bootstrapped and regression tested for all standard languages (plus Ada and Obj-C++) on host x86_64-pc-linux-gnu. Ok for apply? Regards, Kai ChangeLog 2011-07-07 Kai Tietz kti...@redhat.com * fold-const.c (fold_unary_loc): Preserve non-boolean-typed casts. * gimplify.c (gimple_boolify): Handle boolification of comparisons. (gimplify_expr): Boolify non aggregate-typed comparisons. * tree-cfg.c (verify_gimple_comparison): Check result type of comparison expression. * tree-ssa.c (useless_type_conversion_p): Preserve incompatible casts from/to boolean. * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification support for one-bit-precision typed X for cases X != 0 and X == 0. (forward_propagate_comparison): Adjust test of condition result. * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted. 
* gcc.dg/tree-ssa/pr21031.c: Likewise. * gcc.dg/tree-ssa/pr30978.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise. * gcc.dg/binop-xor1.c: Mark it as expected fail. * gcc.dg/binop-xor3.c: Likewise. * gcc.dg/uninit-15.c: Adjust reported message. Index: gcc-head/gcc/fold-const.c === --- gcc-head.orig/gcc/fold-const.c +++ gcc-head/gcc/fold-const.c @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre non-integral type. Do not fold the result as that would not simplify further, also folding again results in recursions. */ - if (INTEGRAL_TYPE_P (type)) + if (TREE_CODE (type) == BOOLEAN_TYPE) return build2_loc (loc, TREE_CODE (op0), type, TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1)); - else + else if (!INTEGRAL_TYPE_P (type)) return build3_loc (loc, COND_EXPR, type, op0, fold_convert (type, boolean_true_node), fold_convert (type, boolean_false_node)); Index: gcc-head/gcc/gimplify.c === --- gcc-head.orig/gcc/gimplify.c +++ gcc-head/gcc/gimplify.c @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr) case TRUTH_NOT_EXPR: TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0)); - /* FALLTHRU */ -case EQ_EXPR: case NE_EXPR: -case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR: /* These expressions always produce boolean results. */ - TREE_TYPE (expr) = boolean_type_node; + if (TREE_CODE (type) != BOOLEAN_TYPE) + TREE_TYPE (expr) = boolean_type_node; return expr; default: + if (COMPARISON_CLASS_P (expr)) + { + /* There expressions always prduce boolean results. */ + if (TREE_CODE (type) != BOOLEAN_TYPE) + TREE_TYPE (expr) = boolean_type_node; + return expr; + } /* Other expressions that get here must have boolean values, but might need to be converted to the appropriate mode. 
*/ - if (type == boolean_type_node) + if (TREE_CODE (type) == BOOLEAN_TYPE) return expr; return fold_convert_loc (loc, boolean_type_node, expr); } @@ -6763,7 +6768,7 @@ gimplify_expr (tree *expr_p, gimple_seq tree org_type = TREE_TYPE (*expr_p); *expr_p = gimple_boolify (*expr_p); - if (org_type != boolean_type_node) + if (!useless_type_conversion_p (org_type, TREE_TYPE (*expr_p))) { *expr_p = fold_convert (org_type, *expr_p); ret = GS_OK; @@ -7208,7 +7213,7 @@ gimplify_expr (tree *expr_p, gimple_seq
[patch tree-optimization]: [3 of 3]: Boolify compares more
Hello, This patch - third of the series - fixes VRP to handle bitwise operations on one-bit precision types. It also introduces a second VRP pass, limited to non-switch-statement ranges. Bootstrapped and regression tested for all standard languages (plus Ada and Obj-C++) on host x86_64-pc-linux-gnu. Ok for apply? Regards, Kai ChangeLog 2011-07-07 Kai Tietz kti...@redhat.com * tree-vrp.c (in_second_pass): New static variable. (extract_range_from_binary_expr): Add handling for BIT_IOR_EXPR, BIT_AND_EXPR, and BIT_NOT_EXPR. (register_edge_assert_for_1): Add handling for 1-bit BIT_IOR_EXPR and BIT_NOT_EXPR. (register_edge_assert_for): Add handling for 1-bit BIT_IOR_EXPR. (ssa_name_get_inner_ssa_name_p): New helper function. (ssa_name_get_cast_to_p): New helper function. (simplify_truth_ops_using_ranges): Handle prefixed cast instruction for result, and add support for one bit precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR, and BIT_NOT_EXPR. (simplify_stmt_using_ranges): Add handling for one bit precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR, and BIT_NOT_EXPR. (vrp_finalize): Do substitute and fold pass a second time for vrp_stmt and preserve switch-edge simplification on second run. (simplify_switch_using_ranges): Preserve rerun of function in second pass. Index: gcc-head/gcc/tree-vrp.c === --- gcc-head.orig/gcc/tree-vrp.c +++ gcc-head/gcc/tree-vrp.c @@ -74,6 +74,9 @@ struct value_range_d typedef struct value_range_d value_range_t; +/* This flag indicates that we are doing a second pass of VRP. */ +static bool in_second_pass = false; + /* Set of SSA names found live during the RPO traversal of the function for still active basic-blocks. */ static sbitmap *live; @@ -2232,6 +2235,7 @@ extract_range_from_binary_expr (value_ra some cases. 
*/ if (code != BIT_AND_EXPR && code != TRUTH_AND_EXPR + && code != BIT_IOR_EXPR && code != TRUTH_OR_EXPR && code != TRUNC_DIV_EXPR && code != FLOOR_DIV_EXPR @@ -2291,6 +2295,8 @@ extract_range_from_binary_expr (value_ra else set_value_range_to_varying (vr); } + else if (code == BIT_IOR_EXPR) +set_value_range_to_varying (vr); else gcc_unreachable (); @@ -2300,11 +2306,13 @@ extract_range_from_binary_expr (value_ra /* For integer ranges, apply the operation to each end of the range and see what we end up with. */ if (code == TRUTH_AND_EXPR - || code == TRUTH_OR_EXPR) + || code == TRUTH_OR_EXPR + || ((code == BIT_AND_EXPR || code == BIT_IOR_EXPR) + && TYPE_PRECISION (TREE_TYPE (op1)) == 1)) { /* If one of the operands is zero, we know that the whole expression evaluates zero. */ - if (code == TRUTH_AND_EXPR + if ((code == TRUTH_AND_EXPR || code == BIT_AND_EXPR) && ((vr0.type == VR_RANGE && integer_zerop (vr0.min) && integer_zerop (vr0.max)) @@ -2317,7 +2325,7 @@ extract_range_from_binary_expr (value_ra } /* If one of the operands is one, we know that the whole expression evaluates one. */ - else if (code == TRUTH_OR_EXPR + else if ((code == TRUTH_OR_EXPR || code == BIT_IOR_EXPR) && ((vr0.type == VR_RANGE && integer_onep (vr0.min) && integer_onep (vr0.max)) @@ -2809,7 +2817,7 @@ extract_range_from_unary_expr (value_ran cannot easily determine a resulting range. */ if (code == FIX_TRUNC_EXPR || code == FLOAT_EXPR - || code == BIT_NOT_EXPR + || (code == BIT_NOT_EXPR && TYPE_PRECISION (type) != 1) || code == CONJ_EXPR) { /* We can still do constant propagation here. */ @@ -3976,7 +3984,9 @@ build_assert_expr_for (tree cond, tree v tree a = build2 (ASSERT_EXPR, TREE_TYPE (v), v, cond); assertion = gimple_build_assign (n, a); } - else if (TREE_CODE (cond) == TRUTH_NOT_EXPR) + else if (TREE_CODE (cond) == TRUTH_NOT_EXPR + || (TREE_CODE (cond) == BIT_NOT_EXPR + && TYPE_PRECISION (TREE_TYPE (cond)) == 1)) { /* Given !V, build the assignment N = false. */ tree op0 = TREE_OPERAND (cond, 0); @@ -4531,7 +4541,9 @@ register_edge_assert_for_1 (tree op, enu retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def), code, e, bsi); } - else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR) + else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR + || (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR + && TYPE_PRECISION (TREE_TYPE (op)) == 1)) { /* Recurse, flipping CODE. */ code = invert_tree_comparison (code, false); @@ -4617,6 +4629,9 @@
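The range rule that the patch extends to one-bit BIT_AND_EXPR and BIT_IOR_EXPR ("if one of the operands is zero, the whole expression evaluates zero", and the dual for one) can be sketched outside GCC as follows. This is a toy model under stated assumptions: `struct range`, `is_const`, `and1_range` and `ior1_range` are hypothetical names, and GCC's `value_range_t` carries more state (range kind, equivalences) than the two bounds used here.

```c
#include <assert.h>

/* Toy closed range [min, max] for a one-bit value, so
   0 <= min <= max <= 1.  Not GCC's value_range_t.  */
struct range
{
  int min;
  int max;
};

/* Nonzero if R is the singleton range [v, v].  */
int is_const (struct range r, int v)
{
  return r.min == v && r.max == v;
}

/* Range of a & b for one-bit operands.  */
struct range and1_range (struct range a, struct range b)
{
  struct range r = { 0, 1 };    /* Unknown one-bit value.  */

  if (is_const (a, 0) || is_const (b, 0))
    r.max = 0;                  /* One operand is zero: result is zero.  */
  else if (is_const (a, 1) && is_const (b, 1))
    r.min = 1;                  /* Both operands are one: result is one.  */
  return r;
}

/* Range of a | b for one-bit operands.  */
struct range ior1_range (struct range a, struct range b)
{
  struct range r = { 0, 1 };

  if (is_const (a, 1) || is_const (b, 1))
    r.min = 1;                  /* One operand is one: result is one.  */
  else if (is_const (a, 0) && is_const (b, 0))
    r.max = 0;                  /* Both operands are zero: result is zero.  */
  return r;
}
```

For example, combining the singleton `{0, 0}` with the unknown range `{0, 1}` gives `{0, 0}` under AND and stays `{0, 1}` under OR.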
Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests
On 07/07/11 00:26, Janis Johnson wrote: For three tests in gcc.target/arm that don't depend on processor-specific behavior, don't specify the -march option. This makes dg-prune-output for warnings about conflicts unnecessary, so remove it. Two of these tests are for internal compiler errors that showed up with particular values of -march. I think it's fine to test them with normal multilibs, some of which will use those -march values, and others of which could trigger a closely-related ICE. If there's a desire to use specific options in a test, I'd prefer to see it done in a copy of the test that is skipped for all multilibs but the default. OK for trunk, and for 4.6 after a few days? gcc-20110706-3 2011-07-06 Janis Johnson jani...@codesourcery.com * gcc.target/arm/pr41679.c: Remove -march options and unneeded dg-prune-output. * gcc.target/arm/pr46883.c: Likewise. * gcc.target/arm/xor-and.c: Likewise. Index: gcc.target/arm/pr41679.c I think this should just be moved to gcc.c-torture/compile. There doesn't seem to be anything processor-specific here. Index: gcc.target/arm/pr46883.c Likewise. Index: gcc.target/arm/xor-and.c === --- gcc.target/arm/xor-and.c (revision 175921) +++ gcc.target/arm/xor-and.c (working copy) @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O -march=armv6 } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-options -O } */ unsigned short foo (unsigned short x) { The purpose of this test seems to be to ensure that when compiling for v6 we don't get particular instructions. Removing the -march flag means we won't normally test this in the way intended (i.e. unless the multilibs explicitly test v6). This is one of those cases where I think the intention really is to force one particular instruction set. R.
Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
[...] On Jul 7, 2011, at 5:53 PM, Richard Guenther wrote: On Thu, Jul 7, 2011 at 5:47 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: I certainly can call the switch -mno-static-chain, which is perhaps more meaningful (at least to us compiler folk, I'm not sure static chain means much to the normal programmer). Well, that's up to the target maintainers to decide, maybe -mno-nested-functions instead? Isn't that an issue of pointers to nested functions rather than of nested functions themselves? If so, -mno-nested-function-pointers would be more accurate. That's somewhat important from an Ada POV, as nested subprograms are common but access/pointer to nested subprogram is not very usual. My two cents. Tristan.
Re: Fix PR 49014
On 07/01/2011 10:50 AM, Andrey Belevantsev wrote: On 26.05.2011 17:32, Andrey Belevantsev wrote: On 25.05.2011 19:31, Bernd Schmidt wrote: On 05/25/2011 03:29 PM, Andrey Belevantsev wrote: I think the hook is a better idea than the attribute because nobody will care to mark all offending insns with an attribute. I don't know. IIRC when I looked at sh or whatever the broken port was, it was only two insns - there would still be some value in being able to assert that all other insns have a reservation. OK, I will take a look on x86-64 and will get back with more information. Andrey So, I have made an attempt to bootstrap on x86-64 with the extra assert in selective scheduling that assumes the DFA state always changes when issuing a recog_memoized =0 insn (patch attached). Indeed, there are just a few general insns that don't have proper reservations. However, it was a surprise to me to see that almost any insn with SSE registers fails this assert and thus does not get properly scheduled. Overall, the work on fixing those seems doable, it took just a day to get the compiler bootstrapped (of course, the testsuite may bring much more issues). So, if there is an agreement on marking a few offending insns with the new attribute, we can proceed with the help of somebody from the x86 land on fixing those and researching for other targets. The changes in sel-sched.c are OK with me. The i386.md changes look OK to me too, but you should ask an x86 maintainer to approve them. I think you should describe the attribute in the documentation because it is common to all targets. I cannot approve the common.opt changes because they make the selective scheduler the default for the 2nd insn scheduling pass on all targets. Such a change should be justified by thorough testing and benchmarking (compilation speed, code size, performance improvements) on several platforms (at least on the major ones).
Re: [patch tree-optimization]: [3 of 3]: Boolify compares more
2011/7/7 Paolo Bonzini bonz...@gnu.org: On 07/07/2011 06:07 PM, Kai Tietz wrote: + /* We redo folding here one time for allowing to inspect more + complex reductions. */ + substitute_and_fold (op_with_constant_singleton_value_range, + vrp_fold_stmt, false); + /* We need to mark this second pass to avoid re-entering of same + edges for switch statments. */ + in_second_pass = true; substitute_and_fold (op_with_constant_singleton_value_range, vrp_fold_stmt, false); + in_second_pass = false; This needs a much better explanation. Paolo Well, I can work on a better comment. The complex reductions I mean here are cases like: int x; int y; _Bool D1; _Bool D2; _Bool D3; int R; D1 = x[0..1] != 0; D2 = y[0..1] != 0; D3 = D1 & D2; R = (int) D3; (a testcase is already present, see tree-ssa/vrp47.c). In the first pass VRP replaces this with: D1 = (_Bool) x[0..1]; D2 = (_Bool) y[0..1]; D3 = D1 & D2; R = (int) D3; Only in the second pass can the reduction R = x[0..1] & y[0..1] happen. In general it is sad that VRP can't insert new statements during the pass right now. That would cause issues in the range tables, which aren't designed for insertions. Otherwise we could also simplify things like D1 = x[0..1] != 0; D2 = y[0..1] == 0; D3 = D1 & D2; R = (int) D3; to R = x[0..1] & (y[0..1] ^ 1). Regards, Kai
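The reductions described in this reply can be checked by hand in plain C. The sketch below is only an illustration of the arithmetic, not of what VRP does internally; the function names are made up, and the bitwise AND is an assumption about the operator intended in the example. It compares the boolean form with the reduced form for operands known to lie in [0, 1]:

```c
#include <assert.h>
#include <stdbool.h>

/* R = (int) ((x != 0) & (y != 0)), written out step by step.  */
int before_reduction (int x, int y)
{
  bool d1 = x != 0;
  bool d2 = y != 0;
  bool d3 = d1 & d2;
  return (int) d3;
}

/* The reduced form, valid only when x and y are in [0, 1].  */
int after_reduction (int x, int y)
{
  return x & y;
}

/* R = (int) ((x != 0) & (y == 0)).  */
int before_mixed (int x, int y)
{
  bool d1 = x != 0;
  bool d2 = y == 0;
  bool d3 = d1 & d2;
  return (int) d3;
}

/* The reduced form of the mixed case, again only for [0, 1].  */
int after_mixed (int x, int y)
{
  return x & (y ^ 1);
}

/* Returns 1 if both reductions agree on every x, y in [0, 1].  */
int reductions_agree (void)
{
  int x, y;

  for (x = 0; x <= 1; x++)
    for (y = 0; y <= 1; y++)
      if (before_reduction (x, y) != after_reduction (x, y)
          || before_mixed (x, y) != after_mixed (x, y))
        return 0;
  return 1;
}
```

The range information is essential: with x = 2 and y = 1 the boolean form yields 1 while `x & y` yields 0, so the reduction is only safe once the operands are known to be in [0, 1].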
[patch] Disable static build for libjava
As discussed at the Google GCC gathering, disable the build of static libraries in libjava, which should cut the build time of libjava by 50%. The static libjava build isn't useful out of the box, and I don't see it packaged by Linux distributions either. The AC_PROG_LIBTOOL check is needed to get access to the enable_shared macro. I'm unsure about the check in the switch construct. Taken from libtool.m4, and determining the value of enable_shared_with_static_runtimes. Ok for the trunk? 2011-07-07 Matthias Klose d...@ubuntu.com * Makefile.def (target_modules/libjava): Pass $(libjava_disable_static). * configure.ac: Check for libtool, pass --disable-static in libjava_disable_static. * Makefile.in: Regenerate. * configure: Likewise. Index: Makefile.def === --- Makefile.def(revision 175963) +++ Makefile.def(working copy) @@ -132,7 +132,8 @@ target_modules = { module= winsup; }; target_modules = { module= libgloss; no_check=true; }; target_modules = { module= libffi; }; -target_modules = { module= libjava; raw_cxx=true; }; +target_modules = { module= libjava; raw_cxx=true; + extra_configure_flags=$(libjava_disable_static); }; target_modules = { module= zlib; }; target_modules = { module= boehm-gc; }; target_modules = { module= rda; }; Index: configure.ac === --- configure.ac(revision 175963) +++ configure.ac(working copy) @@ -443,6 +443,16 @@ ;; esac +AC_PROG_LIBTOOL +if test x$enable_shared = xyes ; then + case $host_cpu in + cygwin* | mingw* | pw32* | cegcc*) +;; + *) +libjava_disable_static=--disable-static + esac +fi +AC_SUBST(libjava_disable_static) # Disable libmudflap on some systems. if test x$enable_libmudflap = x ; then
Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests
On 07/07/2011 09:14 AM, Richard Earnshaw wrote: On 07/07/11 00:26, Janis Johnson wrote: Index: gcc.target/arm/pr41679.c I think this should just be moved to gcc.c-torture/compile. There doesn't seem to be anything processor-specific here. Index: gcc.target/arm/pr46883.c Likewise. OK, I'll do that. Index: gcc.target/arm/xor-and.c === --- gcc.target/arm/xor-and.c (revision 175921) +++ gcc.target/arm/xor-and.c (working copy) @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O -march=armv6 } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-options -O } */ unsigned short foo (unsigned short x) { The purpose of this test seems to be to ensure that when compiling for v6 we don't get particular instructions. Removing the -march flag means we won't normally test this in the way intended (ie unless the multilibs explicitly test v6). This is one of those cases where I think the intention really is to force one particular instruction set. R. It passes everywhere, do you want to know when it stops passing for some other multilib, or just care about armv6? If you only care about armv6 then the test should be limited to run with the default multilib instead of having to muck around checking for incompatible options. Janis
Re: [testsuite] arm thumb tests: remove -march= and dg-prune-output from 9 tests
On 07/07/11 00:28, Janis Johnson wrote: This patch removes -march= from nine tests that also check for relevant effective targets. If -march is removed there is no need to ignore compiler warnings about conflicting options with dg-prune-output, so the patch removes that from the tests. OK for trunk, and for 4.6 in a few days? gcc-20110706-4 2011-07-06 Janis Johnson jani...@codesourcery.com * gcc.target/arm/pr39839.c: Remove -march option and unneeded dg-prune-output. * gcc.target/arm/pr40657-2.c: Likewise. * gcc.target/arm/pr40956.c: Likewise. * gcc.target/arm/pr42235.c: Likewise. * gcc.target/arm/pr42495.c: Likewise. * gcc.target/arm/pr42505.c: Likewise. * gcc.target/arm/pr42574.c: Likewise. * gcc.target/arm/pr46934.c: Likewise. * gcc.target/arm/thumb-branch1.c: Likewise. Index: gcc.target/arm/pr39839.c === --- gcc.target/arm/pr39839.c (revision 175921) +++ gcc.target/arm/pr39839.c (working copy) @@ -1,6 +1,5 @@ -/* { dg-options -mthumb -Os -march=armv5te -mthumb-interwork -fpic } */ +/* { dg-options -mthumb -Os -mthumb-interwork -fpic } */ /* { dg-require-effective-target arm_thumb1_ok } */ -/* { dg-prune-output switch .* conflicts with } */ /* { dg-final { scan-assembler-not str\[\\t \]*r.,\[\\t \]*.sp, } } */ I think this test should work in both ARM and Thumb mode and for any Thumb variant. So I'd be inclined to remove arm_thumb1_ok and change the dg-options to -Os -fpic. Index: gcc.target/arm/pr40657-2.c OK. Index: gcc.target/arm/pr40956.c === --- gcc.target/arm/pr40956.c (revision 175921) +++ gcc.target/arm/pr40956.c (working copy) @@ -1,7 +1,6 @@ -/* { dg-options -mthumb -Os -fpic -march=armv5te } */ +/* { dg-options -mthumb -Os -fpic } */ /* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-require-effective-target fpic } */ -/* { dg-prune-output switch .* conflicts with } */ /* Make sure the constant 0 is loaded into register only once. 
*/ /* { dg-final { scan-assembler-times mov\[\\t \]*r., #0 1 } } */ Same comment as for pr39839.c Index: gcc.target/arm/pr42235.c OK. Index: gcc.target/arm/pr42495.c OK. Index: gcc.target/arm/pr42505.c === --- gcc.target/arm/pr42505.c (revision 175921) +++ gcc.target/arm/pr42505.c (working copy) @@ -1,6 +1,5 @@ -/* { dg-options -mthumb -Os -march=armv5te } */ +/* { dg-options -mthumb -Os } */ /* { dg-require-effective-target arm_thumb1_ok } */ -/* { dg-prune-output switch .* conflicts with } */ /* { dg-final { scan-assembler-not str\[\\t \]*r.,\[\\t \]*.sp, } } */ Same comment as for pr39839.c Index: gcc.target/arm/pr42574.c OK Index: gcc.target/arm/pr46934.c There's nothing cpu specific here, this should be in gcc.c-torture/compile. Index: gcc.target/arm/thumb-branch1.c OK.
Re: CFT: Move unwinder to toplevel libgcc
On Thu, 2011-07-07 at 15:08 +0200, Rainer Orth wrote: In that case, perhaps Steve could have a look? I'd finally like to make some progress on this patch. Thanks. Rainer It looks like the GCC build is trying to compile unwind-ia64.c on IA64 HP-UX even though it should not use or need this file. Using --with-system-libunwind doesn't seem to help. I am not sure where this should be handled under the new setup. Previously config.gcc would either include or not include t-glibc-libunwind in the Makefile to build or not build this file. This might be coming from t-eh-ia64 rather than t-glibc-libunwind. Both of these include unwind-ia64.c. Steve Ellcey s...@cup.hp.com
Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests
On 07/07/11 17:30, Janis Johnson wrote: On 07/07/2011 09:14 AM, Richard Earnshaw wrote: On 07/07/11 00:26, Janis Johnson wrote: Index: gcc.target/arm/xor-and.c === --- gcc.target/arm/xor-and.c(revision 175921) +++ gcc.target/arm/xor-and.c(working copy) @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O -march=armv6 } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-options -O } */ unsigned short foo (unsigned short x) { The purpose of this test seems to be to ensure that when compiling for v6 we don't get particular instructions. Removing the -march flag means we won't normally test this in the way intended (ie unless the multilibs explicitly test v6). This is one of those cases where I think the intention really is to force one particular instruction set. R. It passes everywhere, do you want to know when it stops passing for some other multilib, or just care about armv6? If you only care about armv6 then the test should be limited to run with the default multilib instead of having to muck around checking for incompatible options. We only care about v6 here, I think. There aren't really any multilib issues, since it's a compile-only test. I don't mind not testing it for non-default multilibs, but it should be marked as 'skipped' or recorded in some other way, so that the total number of tests is the same for each variant. BTW, can the testsuite ever be run with no default multilib? If so, then I don't think we should always skip the test. R.
Re: [PATCH 4/6] Shrink-wrapping
On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote: ... It'd also be nice to get rid of all these big blocks of code that are conditional on preprocessor macros, but I realise you're just following existing practice in the surrounding code, so again it can be left to a future cleanup. Yeah, this function is quite horrid - so many different paths through it. However, it looks like the only target without HAVE_prologue is actually pdp11, so we're carrying some unnecessary baggage for purely retrocomputing purposes. Paul, can you fix that? Sure, but... I searched for HAVE_prologue and I can't find any place that set it. There are tests for it, but I see nothing that defines it (other than df-scan.c which defines it as zero if it's not defined, not sure what the point of that is). I must be missing something... paul
Re: [PATCH 4/6] Shrink-wrapping
On 07/07/11 10:58, Paul Koning wrote: On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote: ... It'd also be nice to get rid of all these big blocks of code that are conditional on preprocessor macros, but I realise you're just following existing practice in the surrounding code, so again it can be left to a future cleanup. Yeah, this function is quite horrid - so many different paths through it. However, it looks like the only target without HAVE_prologue is actually pdp11, so we're carrying some unnecessary baggage for purely retrocomputing purposes. Paul, can you fix that? Sure, but... I searched for HAVE_prologue and I can't find any place that set it. There are tests for it, but I see nothing that defines it (other than df-scan.c which defines it as zero if it's not defined, not sure what the point of that is). I must be missing something... Isn't it defined by the insn-foo generators based on the existence of a prologue/epilogue insn in the MD file? jeff
Re: [patch] Disable static build for libjava
On 07/07/2011 09:57 AM, Matthias Klose wrote: On 07/07/2011 06:51 PM, David Daney wrote: On 07/07/2011 09:27 AM, Matthias Klose wrote: As discussed at the Google GCC gathering, disable the build of static libraries in libjava, which should cut the build time of libjava by 50%. The static libjava build isn't useful out of the box, and I don't see it packaged by Linux distributions either. The AC_PROG_LIBTOOL check is needed to get access to the enable_shared macro. I'm unsure about the check in the switch construct. Taken from libtool.m4, and determining the value of enable_shared_with_static_runtimes. Ok for the trunk? 2011-07-07 Matthias Klosed...@ubuntu.com * Makefile.def (target_modules/libjava): Pass $(libjava_disable_static). * configure.ac: Check for libtool, pass --disable-static in libjava_disable_static. * Makefile.in: Regenerate. * configure: Likewise. My autoconf fu is not what it used to be. It is fine if static libraries are disabled by default, but it should be possible to enable them from the configure command line. It is unclear to me if this patch does that. no. I assume an extra option --enable-static-libjava would be needed. Not being a libjava maintainer, I cannot force you to add something like that as part of the patch, but I think it would be a good idea. Also I would like to go on record as disagreeing with the statement that 'static libjava build isn't useful out of the box' I remember that there were some restrictions with the static library. but maybe I'm wrong. There are restrictions, but it is still useful for some embedded environments. David Daney
Re: [PATCH 4/6] Shrink-wrapping
On Jul 7, 2011, at 1:00 PM, Jeff Law wrote: On 07/07/11 10:58, Paul Koning wrote: On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote: ... It'd also be nice to get rid of all these big blocks of code that are conditional on preprocessor macros, but I realise you're just following existing practice in the surrounding code, so again it can be left to a future cleanup. Yeah, this function is quite horrid - so many different paths through it. However, it looks like the only target without HAVE_prologue is actually pdp11, so we're carrying some unnecessary baggage for purely retrocomputing purposes. Paul, can you fix that? Sure, but... I searched for HAVE_prologue and I can't find any place that set it. There are tests for it, but I see nothing that defines it (other than df-scan.c which defines it as zero if it's not defined, not sure what the point of that is). I must be missing something... Isn't it defined by the insn-foo generators based on the existence of a prologue/epilogue insn in the MD file? Thanks, that must be what I was missing. So someone is generating HAVE_%s, and that's why grep didn't find HAVE_prologue? From a note by Richard Henderson (June 30, 2011) it sounds like rs6000 is the other platform that still generates asm prologues. But yes, I said I would do this. It sounds like doing it soon would help Bernd a lot. Let me try to accelerate it. paul
Re: CFT: Move unwinder to toplevel libgcc
On Thu, 2011-07-07 at 15:08 +0200, Rainer Orth wrote: In that case, perhaps Steve could have a look? I'd finally like to make some progress on this patch. Thanks. Rainer When doing an IA64 Linux build (where I do need to compile unwind-ia64.c) I am dying with this failure: In file included from /test/big-foot1/gcc/nightly/src/trunk/libgcc/config/ia64/unwind-ia64.c:35:0: ./md-unwind-support.h:42:7: error: unknown type name '_Unwind_FrameState' ./md-unwind-support.h:132:54: error: unknown type name '_Unwind_FrameState' /test/big-foot1/gcc/nightly/src/trunk/libgcc/config/ia64/unwind-ia64.c: In function 'uw_update_reg_address': /test/big-foot1/gcc/nightly/src/trunk/libgcc/config/ia64/unwind-ia64.c:1931:11: warning: cast discards '__attribute__((const))' qualifier from pointer target type [-Wcast-qual] make[3]: *** [unwind-ia64.o] Error 1 make[3]: Leaving directory `/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc/ia64-debian-linux-gnu/libgcc' make[2]: *** [all-stage1-target-libgcc] Error 2 Steve Ellcey s...@cup.hp.com
Re: [PATCH 4/6] Shrink-wrapping
On 07/07/11 11:05, Paul Koning wrote: On Jul 7, 2011, at 1:00 PM, Jeff Law wrote: On 07/07/11 10:58, Paul Koning wrote: On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote: ... It'd also be nice to get rid of all these big blocks of code that are conditional on preprocessor macros, but I realise you're just following existing practice in the surrounding code, so again it can be left to a future cleanup. Yeah, this function is quite horrid - so many different paths through it. However, it looks like the only target without HAVE_prologue is actually pdp11, so we're carrying some unnecessary baggage for purely retrocomputing purposes. Paul, can you fix that? Sure, but... I searched for HAVE_prologue and I can't find any place that set it. There are tests for it, but I see nothing that defines it (other than df-scan.c which defines it as zero if it's not defined, not sure what the point of that is). I must be missing something... Isn't it defined by the insn-foo generators based on the existence of a prologue/epilogue insn in the MD file? Thanks, that must be what I was missing. So someone is generating HAVE_%s, and that's why grep didn't find HAVE_prologue? Yup. Jeff
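To illustrate why a literal search comes up empty: if a generator builds the macro name from the pattern name with a format string, the full identifier never appears anywhere in the checked-in sources. The function below is a hypothetical stand-in written for this note, not the actual generator code, and the trailing `1` is a placeholder for whatever value or condition the real tool emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write "#define HAVE_<pattern_name> 1" into BUF.  Returns the number
   of characters written, or -1 if BUF is too small.  Hypothetical
   sketch: only the "HAVE_%s" shape comes from the discussion.  */
int emit_have_macro (char *buf, size_t size, const char *pattern_name)
{
  int n;

  assert (buf != NULL && pattern_name != NULL);
  n = snprintf (buf, size, "#define HAVE_%s 1", pattern_name);
  if (n < 0 || (size_t) n >= size)
    return -1;
  return n;
}
```

With `pattern_name` set to `prologue`, the buffer holds `#define HAVE_prologue 1`, while the source of the function itself contains only `HAVE_%s`. That is the situation the grep ran into.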
Re: Fix PR 49014
On 07/01/11 16:50, Andrey Belevantsev wrote: On 26.05.2011 17:32, Andrey Belevantsev wrote: On 25.05.2011 19:31, Bernd Schmidt wrote: On 05/25/2011 03:29 PM, Andrey Belevantsev wrote: I think the hook is a better idea than the attribute because nobody will care to mark all offending insns with an attribute. I don't know. IIRC when I looked at sh or whatever the broken port was, it was only two insns - there would still be some value in being able to assert that all other insns have a reservation. OK, I will take a look on x86-64 and will get back with more information. Andrey So, I have made an attempt to bootstrap on x86-64 with the extra assert in selective scheduling that assumes the DFA state always changes when issuing a recog_memoized =0 insn (patch attached). Indeed, there are just a few general insns that don't have proper reservations. However, it was a surprise to me to see that almost any insn with SSE registers fails this assert and thus does not get properly scheduled. Probably because it's picking a scheduling description for an old CPU? With -mcpu=pentium probably none of the newer patterns has a reservation. That may scupper any plans to use this attribute on i386. Overall, the work on fixing those seems doable, it took just a day to get the compiler bootstrapped (of course, the testsuite may bring much more issues). So, if there is an agreement on marking a few offending insns with the new attribute, we can proceed with the help of somebody from the x86 land on fixing those and researching for other targets. +(set (attr nondfa_insn) (if_then_else (eq_attr alternative 3,4,5,6) (const_int 1) (const_int 0))) I think this shouldn't use (const_int x); you want to be able to write (set_attr nondfa_insn 0,0,0,1,1,1,1) Bernd
Re: [PATCH 4/6] Shrink-wrapping
On 07/07/11 19:05, Paul Koning wrote: From a note by Richard Henderson (June 30, 2011) it sounds like rs6000 is the other platform that still generates asm prologues. But yes, I said I would do this. It sounds like doing it soon would help Bernd a lot. Let me try to accelerate it. Maybe not a whole lot, but it would allow us to simplify some code. Bernd
PATCH: Support -mx32 in GCC tests
Hi, On Linux/x86-64, when we pass RUNTESTFLAGS=--target_board='unix{-mx32}' to GCC tests, we can't check lp64/ilp32 for availability of 64bit x86 instructions. This patch adds ia32 and x32 effective targets. OK for trunk? Thanks. H.J.
---
2011-07-07 H.J. Lu hongjiu...@intel.com

    * lib/target-supports.exp (check_effective_target_ia32): New.
    (check_effective_target_x32): Likewise.
    (check_effective_target_vect_cmdline_needed): Also check x32.

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 7db156f..b5b8782 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1512,6 +1512,28 @@ proc check_effective_target_ilp32 { } {
     }]
 }
 
+# Return 1 if we're generating ia32 code using default options, 0
+# otherwise.
+
+proc check_effective_target_ia32 { } {
+    return [check_no_compiler_messages ia32 object {
+        int dummy[sizeof (int) == 4
+                  && sizeof (void *) == 4
+                  && sizeof (long) == 4 ? 1 : -1] = { __i386__ };
+    }]
+}
+
+# Return 1 if we're generating x32 code using default options, 0
+# otherwise.
+
+proc check_effective_target_x32 { } {
+    return [check_no_compiler_messages x32 object {
+        int dummy[sizeof (int) == 4
+                  && sizeof (void *) == 4
+                  && sizeof (long) == 4 ? 1 : -1] = { __x86_64__ };
+    }]
+}
+
 # Return 1 if we're generating 32-bit or larger integers using default
 # options, 0 otherwise.
 
@@ -1713,7 +1735,8 @@ proc check_effective_target_vect_cmdline_needed { } {
     if { [istarget alpha*-*-*]
          || [istarget ia64-*-*]
          || (([istarget x86_64-*-*] || [istarget i?86-*-*])
-             && [check_effective_target_lp64])
+             && ([check_effective_target_x32]
+                 || [check_effective_target_lp64]))
          || ([istarget powerpc*-*-*]
              && ([check_effective_target_powerpc_spe]
                  || [check_effective_target_powerpc_altivec]))
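The distinction the new effective targets draw can be sketched as follows. This is a model, not the DejaGnu implementation: the function name and its arguments are assumptions, and the lp64 condition is the conventional LP64 definition (4-byte int, 8-byte long and pointer) rather than something shown in the patch. The ia32 and x32 conditions mirror the sizeof checks and the __i386__ / __x86_64__ macros used in the patch.

```python
def effective_targets(sizeof_int, sizeof_long, sizeof_ptr, macros):
    """Model which effective-target checks would succeed for a data model.

    `macros` is the set of predefined macro names assumed to be available.
    """
    ilp32 = sizeof_int == 4 and sizeof_long == 4 and sizeof_ptr == 4
    # Assumption: conventional LP64 definition, not taken from the patch.
    lp64 = sizeof_int == 4 and sizeof_long == 8 and sizeof_ptr == 8
    return {
        "ilp32": ilp32,
        "lp64": lp64,
        # The patch's checks compile only if the named macro is defined.
        "ia32": ilp32 and "__i386__" in macros,
        "x32": ilp32 and "__x86_64__" in macros,
    }


# Hypothetical inputs for the three x86 cases discussed in the thread.
ia32_case = effective_targets(4, 4, 4, {"__i386__"})
x32_case = effective_targets(4, 4, 4, {"__x86_64__"})
lp64_case = effective_targets(4, 8, 8, {"__x86_64__"})

print(x32_case)
```

Both ia32 and x32 satisfy ilp32, so ilp32 and lp64 alone cannot tell whether 64-bit x86 instructions are available. That is why the last hunk accepts x32 or lp64 on x86 targets.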
Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests
On 07/07/2011 09:48 AM, Richard Earnshaw wrote: On 07/07/11 17:30, Janis Johnson wrote: On 07/07/2011 09:14 AM, Richard Earnshaw wrote: On 07/07/11 00:26, Janis Johnson wrote:

Index: gcc.target/arm/xor-and.c
===================================================================
--- gcc.target/arm/xor-and.c (revision 175921)
+++ gcc.target/arm/xor-and.c (working copy)
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -march=armv6" } */
-/* { dg-prune-output "switch .* conflicts with" } */
+/* { dg-options "-O" } */
 
 unsigned short foo (unsigned short x)
 {

The purpose of this test seems to be to ensure that when compiling for v6 we don't get particular instructions. Removing the -march flag means we won't normally test this in the way intended (ie unless the multilibs explicitly test v6). This is one of those cases where I think the intention really is to force one particular instruction set. R. It passes everywhere, do you want to know when it stops passing for some other multilib, or just care about armv6? If you only care about armv6 then the test should be limited to run with the default multilib instead of having to muck around checking for incompatible options. We only care about v6 here, I think. There aren't really any multilib issues, since it's a compile-only test. I don't mind not testing it for non-default multilibs, but it should be marked as 'skipped' or recorded in some other way, so that the total number of tests is the same for each variant. The total number of tests is not the same. A test that compiles and does a scan is 2 tests when it is run but is only reported as 1 UNSUPPORTED. We don't currently have a way to count things like dg-final or dg-error as UNSUPPORTED if the entire test is skipped. BTW, can the testsuite ever be run with no default multilib? If so, then I don't think we should always skip the test. R. I don't know. I can leave it the way it is, always specifying -march and ignoring warnings about conflicting options. That doesn't guarantee, though, that it will ever use the specified -march option because unless there is a default multilib or one that doesn't use -march, the one in the test will always be overridden by multilib options. Janis
Re: PATCH: Support -mx32 in GCC tests
On Jul 7, 2011, at 10:29 AM, H.J. Lu wrote: On Linux/x86-64, when we pass RUNTESTFLAGS=--target_board='unix{-mx32}' to GCC tests, we can't check lp64/ilp32 for availability of 64bit x86 instructions. This patch adds ia32 and x32 effective targets. OK for trunk? Ok.
[PATCH 0/3] Fix PR47654 and PR49649
Hi,

First there are two cleanup patches independent of the fix:

  Start counting nesting level from 0.
  Do not compute twice type, lb, and ub.

Then the patch that fixes PR47654:

  Fix PR47654: Compute LB and UB of a CLAST expression.

One of the reasons we cannot determine the IV type only from the polyhedral representation is that as in the testcase of PR47654, we are asked to generate an induction variable going from 0 to 127. That could be represented with a char. However the upper bound expression of the loop generated by CLOOG is min (127, 51*scat_1 + 50) and that would overflow if we use a char type. To evaluate a type in which the expression 51*scat_1 + 50 does not overflow, we have to compute an upper and lower bound for the expression.

To fix the problem exposed by Tobias:

  for (i = 0 ; i < 2; i++)
    for (j = i ; j < i + 1; j++)
      for (k = j ; k < j + 1; k++)
        for (m = k ; m < k + 1; m++)
          for (n = m ; n < m + 1; n++)
            A[0] += A[n];

  I am a little bit afraid that we will increase the type size by an order of magnitude (or at least one bit) for each nesting level.

Instead of computing the lb and ub of scat_1 in 51*scat_1 + 50 based on the type of scat_1 (that we already code generated when building the outer loop), we use the polyhedral representation to get an accurate lb and ub for scat_1.

When translating the substitutions of a user statement using this precise method, like for example S5 in vect-pr43423.c:

  for (scat_1=0;scat_1<=min(T_3-1,T_4-1);scat_1++) {
    S5(scat_1);

we get a type that is too precise: based on the interval [0,99] we get the type unsigned char when the type of scat_1 is int, misleading the vectorizer due to the insertion of spurious casts:

  # Access function 0: (int) {(<unnamed-unsigned:8>) graphite_IV.7_56, +, 1}_3;
  #) affine dependence test not usable: access function not affine or constant.

So we have to keep around the previous code gcc_type_for_clast_* that computes the type of an expression as the max precision of the components of that expression, and use that when computing the types of substitution expressions.

The patches passed together a full bootstrap and test on amd64-linux. Ok for trunk?

Thanks, Sebastian
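The bound computation described above can be sketched with plain interval arithmetic. This is not Graphite code: the function names, the candidate precisions, the choice of a signed 8-bit char and the range [0, 2] for scat_1 are all assumptions for illustration. Only the expression 51*scat_1 + 50 and the interval [0, 99] come from the message.

```python
def affine_bounds(coef, const, lb, ub):
    """Bounds of coef*x + const for x in [lb, ub]."""
    low = coef * lb + const
    high = coef * ub + const
    return (min(low, high), max(low, high))


def smallest_type(lb, ub):
    """Return (bits, is_unsigned) for the narrowest type holding [lb, ub].

    Unsigned is preferred when the interval is non-negative.
    """
    for bits in (8, 16, 32, 64):
        if lb >= 0 and ub <= 2 ** bits - 1:
            return (bits, True)
        if lb >= -(2 ** (bits - 1)) and ub <= 2 ** (bits - 1) - 1:
            return (bits, False)
    raise ValueError("interval does not fit in 64 bits")


# Bounds taken from the *type* of scat_1, assuming a signed 8-bit char.
from_type = affine_bounds(51, 50, -128, 127)

# Bounds taken from a hypothetical precise range for scat_1, [0, 2].
from_polyhedron = affine_bounds(51, 50, 0, 2)

print(from_type, smallest_type(*from_type))
print(from_polyhedron, smallest_type(*from_polyhedron))
print(smallest_type(0, 99))
```

Under these assumptions the type-based bounds force a wider type than the precise ones, which is the growth per nesting level that Tobias was worried about. The last call reproduces the [0, 99] case from the message, where the precise method picks an 8-bit unsigned type even though scat_1 itself is an int.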