Re: [Patch,AVR]: Improve loading of 32-bit constants

2011-07-07 Thread Denis Chertykov
2011/7/6 Georg-Johann Lay a...@gjlay.de:
 Denis Chertykov wrote:
 I asked for an example of *d instead of !d.
 Just svn GCC with *d vs. svn GCC with !d.

 Denis.

 Is the patch ok with the original !d instead of *d ?

Ok.

Denis.


Re: [1/11] Use targetm.shift_truncation_mask more consistently

2011-07-07 Thread Richard Sandiford
Bernd Schmidt ber...@codesourcery.com writes:
 On 07/06/11 20:06, Richard Sandiford wrote:
 Bernd Schmidt ber...@codesourcery.com writes:
 At some point we've grown a shift_truncation_mask hook, but we're not
 using it everywhere we're masking shift counts. This patch changes the
 instances I found.
 
 The documentation reads:
 
  Note that, unlike @code{SHIFT_COUNT_TRUNCATED}, this function does
  @emph{not} apply to general shift rtxes; it applies only to instructions
  that are generated by the named shift patterns.

 Ouch. That is one seriously misnamed hook then.

Yeah.  I take the blame for that, sorry :-(

 I think you need to update the documentation, and check that existing
 target definitions do in fact apply to shift rtxes as well.

 Until I can do that, I've reverted this patch.

Thanks.

Richard
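
For readers following the thread, here is a minimal sketch of what "masking a shift count" means. It does not use GCC's hook; sketch_truncation_mask is a hypothetical stand-in that assumes the mask is the operand width minus one, which is only one possible target behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a target's shift truncation mask.
   Assumption: the target uses only the low log2(bitsize) bits of
   the shift count, so the mask is bitsize - 1.  */
static unsigned int
sketch_truncation_mask (unsigned int bitsize)
{
  /* The sketch only handles power-of-two widths.  */
  assert (bitsize != 0 && (bitsize & (bitsize - 1)) == 0);
  return bitsize - 1;
}

/* Shift VALUE left by COUNT the way such a target would: the count
   is ANDed with the mask before the shift is performed.  */
static uint32_t
sketch_shift_left (uint32_t value, unsigned int count)
{
  unsigned int mask = sketch_truncation_mask (32);
  return value << (count & mask);
}
```

Under that assumption a shift by 33 behaves like a shift by 1. Whether the same holds for general shift rtxes, rather than only for the named shift patterns, is exactly the question raised above.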


Re: [CFT][PATCH 0/6] Move dwarf2 cfi creation to a new pass

2011-07-07 Thread Iain Sandoe


On 7 Jul 2011, at 00:15, Bernd Schmidt wrote:


On 07/03/11 22:01, Richard Henderson wrote:

Bernd's original patch to optimize dwarf2 cfi for shrink-wrapping
is difficult to analyze because that optimization was done via a
random debugging hook during final, and the cfi notes are deleted
at the end of final so that we don't get debug comparison failures.

By pulling the note creation out to a separate pass, we can dump
the notes and thus debug the optimization.

So far I've tested this only on x86_64-linux.  It needs a bit more
testing across other targets before going in.  Any help that can
be given there would be welcome.


I'm trying to help by running ARM tests, but I've managed to screw up by
running out of disk space, so I'm starting again from scratch now.


I've run once through on i686-darwin9 (on the basis that it should
make no difference, that seems to be the case).


I still need to figure out a way to suppress DW2 epilogue info in
unwind frames (for Darwin variants that can't handle them) ...

... will try and merge my patch-in-progress with your changes.

Iain



RFA: Fix bogus mode in choose_reload_regs

2011-07-07 Thread Richard Sandiford
This patch fixes an ICE in smallest_mode_for_size on the attached testcase.
The smallest_mode_for_size call comes from this part of the reload
inheritance code in choose_reload_regs:

  if (byte == 0)
    need_mode = mode;
  else
    need_mode
      = smallest_mode_for_size
        (GET_MODE_BITSIZE (mode) + byte * BITS_PER_UNIT,
         GET_MODE_CLASS (mode) == MODE_PARTIAL_INT
         ? MODE_INT : GET_MODE_CLASS (mode));

  if ((GET_MODE_SIZE (GET_MODE (last_reg))
       >= GET_MODE_SIZE (need_mode))

Here we have found that the pseudo register we need was last reloaded
into LAST_REG.  The mode size check is making sure LAST_REG defines
every byte of the value we need (which is at byte offset BYTE and
has mode MODE).

In the attached testcase, LAST_REG is XImode (a 256-bit integer),
and the value we need is the last vector quarter of it.  BYTE is 24
and MODE is V4SF.  The problem is that we then look for a 256-bit vector:

smallest_mode_for_size (64 + 24 * 8, MODE_VECTOR_FLOAT)

but no such mode exists.
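
The arithmetic can be checked with a small standalone sketch. Nothing here is GCC code: the sketch_ names are hypothetical, and the figures (64 bits, byte offset 24, a 32-byte LAST_REG) are taken from the call shown above.

```c
#include <assert.h>

enum { SKETCH_BITS_PER_UNIT = 8 };

/* Bit count that the inheritance code passes to
   smallest_mode_for_size when the byte offset is nonzero.  */
static unsigned int
sketch_need_bits (unsigned int mode_bits, unsigned int byte)
{
  assert (mode_bits % SKETCH_BITS_PER_UNIT == 0);
  return mode_bits + byte * SKETCH_BITS_PER_UNIT;
}

/* The property the size check is really after: a register of
   OUTER_BYTES bytes defines every byte of a value of INNER_BYTES
   bytes that starts at byte offset BYTE.  */
static int
sketch_covers (unsigned int outer_bytes, unsigned int byte,
               unsigned int inner_bytes)
{
  return outer_bytes >= byte + inner_bytes;
}
```

With these inputs sketch_need_bits yields 256, so a 256-bit mode of the value's class is requested even though the coverage test itself needs only byte counts.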

Note that this is the only use of need_mode.  I don't believe the mode
that is being calculated here is fundamental in any way, or that it's
used later in the reload process.  We have already checked that the mode
change is allowed:

#ifdef CANNOT_CHANGE_MODE_CLASS
  /* Verify that the register it's in can be used in
     mode MODE.  */
  && !REG_CANNOT_CHANGE_MODE_P (REGNO (reg_last_reload_reg[regno]),
                                GET_MODE (reg_last_reload_reg[regno]),
                                mode)
#endif

and have already calculated which hard register we would need to
use after the mode change:

  i = REGNO (last_reg);
  i += subreg_regno_offset (i, GET_MODE (last_reg), byte, mode);

So once we have verified that the register is suitable, we can (and do)
simply use register I in mode MODE.

I think the current mode is a historical left-over.  Back in 2000 this code
was a simple check that the old register entirely encompassed the new one:

  i = REGNO (last_reg) + word;
  last_class = REGNO_REG_CLASS (i);
  if ((GET_MODE_SIZE (GET_MODE (last_reg))
       >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)

The register we were interested in was (reg:MODE I), and this check made
sure that the old reload register defined every byte of (reg:MODE I).
When CLASS_CANNOT_CHANGE_SIZE was introduced, the code became:

  i = REGNO (last_reg) + word;
  last_class = REGNO_REG_CLASS (i);
  if (
#ifdef CLASS_CANNOT_CHANGE_SIZE
      (TEST_HARD_REG_BIT
       (reg_class_contents[CLASS_CANNOT_CHANGE_SIZE], i)
       ? (GET_MODE_SIZE (GET_MODE (last_reg))
          == GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)
       : (GET_MODE_SIZE (GET_MODE (last_reg))
          >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD))
#else
      (GET_MODE_SIZE (GET_MODE (last_reg))
       >= GET_MODE_SIZE (mode) + word * UNITS_PER_WORD)
#endif

But I think this was bogus.  The new size of the register was:

   GET_MODE_SIZE (mode)

rather than:

   GET_MODE_SIZE (mode) + word * UNITS_PER_WORD

Maybe something like:

   word == 0 && GET_MODE_SIZE (mode) == GET_MODE_SIZE (GET_MODE (last_reg))

would have been more accurate.  Anyway, CLASS_CANNOT_CHANGE_SIZE proved
to be too limited, so it was replaced with CLASS_CANNOT_CHANGE_MODE.
The code above then became:

  need_mode = smallest_mode_for_size ((word+1) * UNITS_PER_WORD,
  GET_MODE_CLASS (mode));

  if (
#ifdef CLASS_CANNOT_CHANGE_MODE
      (TEST_HARD_REG_BIT
       (reg_class_contents[(int) CLASS_CANNOT_CHANGE_MODE], i)
       ? ! CLASS_CANNOT_CHANGE_MODE_P (GET_MODE (last_reg),
                                       need_mode)
       : (GET_MODE_SIZE (GET_MODE (last_reg))
          >= GET_MODE_SIZE (need_mode)))
#else
      (GET_MODE_SIZE (GET_MODE (last_reg))
       >= GET_MODE_SIZE (need_mode))
#endif

with need_mode providing a mode of the same size as the then-preexisting
size check.  I think this mode is bogus for the same reason, and in 2005
I changed the final mode argument from need_mode to mode:

http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01665.html

That patch also fixed the smallest_mode_for_size argument so that it
was a bit count rather than a byte count.  Unfortunately, it seems
I failed to realise that need_mode was in fact completely 

Re: [PATCH, testsuite] Fix for PR49519, miscompiled 447.dealII in SPEC CPU 2006

2011-07-07 Thread Kirill Yukhin
Let me try again:
I've prepared a patch for: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519

It fixes a problem in the tailcall optimization: the check for stack
overlap was not strict enough.
The patch adds another check for the clobbered stack area. If an address
comes from a register, we know nothing about where it points.
That means we must be conservative: the address may overlap the stack
area of interest, so we should not perform the tailcall optimization.
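
The conservative rule can be sketched in standalone C. This is not the calls.c code: struct sketch_addr, its fields, and the byte-map representation of the clobbered area are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of where an address comes from.  */
enum sketch_addr_kind { SKETCH_ADDR_ARG_OFFSET, SKETCH_ADDR_REGISTER };

struct sketch_addr
{
  enum sketch_addr_kind kind;
  int offset;   /* Byte offset into the argument area; used only
                   for SKETCH_ADDR_ARG_OFFSET.  */
  int size;     /* Number of bytes accessed.  */
};

/* Return true if the access may touch an already clobbered byte of
   the outgoing argument area.  CLOBBERED has AREA_SIZE entries.  */
static bool
sketch_may_overlap_clobbered (const struct sketch_addr *addr,
                              const bool *clobbered, int area_size)
{
  assert (addr->size > 0);

  /* Destination unknown: assume the worst and report an overlap,
     which makes the caller give up on the sibling call.  */
  if (addr->kind == SKETCH_ADDR_REGISTER)
    return true;

  for (int i = addr->offset; i < addr->offset + addr->size; i++)
    if (i >= 0 && i < area_size && clobbered[i])
      return true;
  return false;
}
```

The design choice is the usual one for alias questions: when the analysis cannot prove the access is disjoint, it answers "may overlap" and the optimization is skipped.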

ChangeLog entry:
2011-07-06  Kirill Yukhin  kirill.yuk...@intel.com

   PR middle-end/49519
   * calls.c (mem_overlaps_already_clobbered_arg_p): Additional
   check if address is stored in register. If so - give up.
   (check_sibcall_argument_overlap_1): Do not perform check of
   overlapping when it is call to address.

testsuite/ChangeLog entry:
2011-07-06  Kirill Yukhin  kirill.yuk...@intel.com

   * g++.dg/torture/pr49519.C: New test for tailcall fix.

Bootstrapped; the new test fails without the patch and passes with it applied.
This fixes the SPEC2006/447.dealII miscompile.

Ok for trunk?

Thanks, K


pr49519-1.gcc.patch
Description: Binary data


Re: [testsuite] fixes for gcc.target/arm/mla-1.c

2011-07-07 Thread Ramana Radhakrishnan
 OK for trunk, and for 4.6 in a few days if no problems?


This is OK.

Thanks,
Ramana


Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 This patch adds an option to not load the static chain (r11) for 64-bit PowerPC
 calls through function pointers (or virtual function).  Most of the languages
 on the PowerPC do not need the static chain being loaded when called, and
 adding this instruction can slow down code that calls very short functions.

 In addition, if the function does not call alloca, setjmp or deal with
 exceptions where the stack is modified, the compiler can move the store of the
 TOC value for the current function to the prologue of the function, rather than
 at each call site.

 The effect of these patches is to speed up 464.h264ref in the Spec 2006
 benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the
 save of the TOC register is hoisted).  I believe this is due to the load of the
 current function's TOC (r2) having to wait until the store queue is drained
 with the store just before the call.

 Unfortunately, I do see a 3% slowdown in 429.mcf, whose cause I don't know.

 I have bootstrapped the compiler and saw that there were no regressions in make
 check.  Is it ok to install in the trunk?

Hum.  Can't the compiler figure this out itself per-call-site?  At least
the name of the command-line switch -m[no-]r11 is meaningless to me.
Points-to information should be able to tell you if the function pointer
points to a nested function.

Richard.

 [gcc]
 2011-07-06  Michael Meissner  meiss...@linux.vnet.ibm.com

        * config/rs6000/rs6000-protos.h (rs6000_call_indirect_aix): New
        declaration.
        (rs6000_save_toc_in_prologue_p): Ditto.

        * config/rs6000/rs6000.opt (-mr11): New switch to disable loading
        up the static chain (r11) during indirect function calls.
        (-msave-toc-indirect): New undocumented debug switch.

        * config/rs6000/rs6000.c (struct machine_function): Add
        save_toc_in_prologue field to note whether the prologue needs to
        save the TOC value in the reserved stack location.
        (rs6000_emit_prologue): Use TOC_REGNUM instead of 2.  If we need
        to save the TOC in the prologue, do so.
        (rs6000_trampoline_init): Don't allow creating AIX style
        trampolines if -mno-r11 is in effect.
        (rs6000_call_indirect_aix): New function to create AIX style
        indirect calls, adding support for -mno-r11 to suppress loading
        the static chain, and saving the TOC in the prologue instead of
        the call body.
        (rs6000_save_toc_in_prologue_p): Return true if we are saving the
        TOC in the prologue.

        * config/rs6000/rs6000.md (STACK_POINTER_REGNUM): Add more fixed
        register numbers.
        (TOC_REGNUM): Ditto.
        (STATIC_CHAIN_REGNUM): Ditto.
        (ARG_POINTER_REGNUM): Ditto.
        (SFP_REGNO): Delete, unused.
        (TOC_SAVE_OFFSET_32BIT): Add constants for AIX TOC save and
        function descriptor offsets.
        (TOC_SAVE_OFFSET_64BIT): Ditto.
        (AIX_FUNC_DESC_TOC_32BIT): Ditto.
        (AIX_FUNC_DESC_TOC_64BIT): Ditto.
        (AIX_FUNC_DESC_SC_32BIT): Ditto.
        (AIX_FUNC_DESC_SC_64BIT): Ditto.
        (ptrload): New mode attribute for the appropriate load of a
        pointer.
        (call_indirect_aix32): Delete, rewrite AIX indirect function
        calls.
        (call_indirect_aix64): Ditto.
        (call_value_indirect_aix32): Ditto.
        (call_value_indirect_aix64): Ditto.
        (call_indirect_nonlocal_aix32_internal): Ditto.
        (call_indirect_nonlocal_aix32): Ditto.
        (call_indirect_nonlocal_aix64_internal): Ditto.
        (call_indirect_nonlocal_aix64): Ditto.
        (call): Rewrite AIX indirect function calls.  Add support for
        eliminating the static chain, and for moving the save of the TOC
        to the function prologue.
        (call_value): Ditto.
        (call_indirect_aixptrsize): Ditto.
        (call_indirect_aixptrsize_internal): Ditto.
        (call_indirect_aixptrsize_internal2): Ditto.
        (call_indirect_aixptrsize_nor11): Ditto.
        (call_value_indirect_aixptrsize): Ditto.
        (call_value_indirect_aixptrsize_internal): Ditto.
        (call_value_indirect_aixptrsize_internal2): Ditto.
        (call_value_indirect_aixptrsize_nor11): Ditto.
        (call_nonlocal_aix32): Relocate in the rs6000.md file.
        (call_nonlocal_aix64): Ditto.

        * doc/invoke.texi (RS/6000 and PowerPC Options): Add -mr11 and
        -mno-r11 documentation.
 [gcc/testsuite]
 2011-07-06  Michael Meissner  meiss...@linux.vnet.ibm.com

        * gcc.target/powerpc/no-r11-1.c: New test for -mr11, -mno-r11.
        * gcc.target/powerpc/no-r11-2.c: Ditto.
        * gcc.target/powerpc/no-r11-3.c: Ditto.

 --
 Michael Meissner, IBM
 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
 meiss...@linux.vnet.ibm.com     fax +1 (978) 399-6899



Re: Remove obsolete %[] specs operator

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 2:03 AM, Joseph S. Myers jos...@codesourcery.com wrote:
 The %[] spec operator is marked as obsolete and not used by any specs
 in GCC; I'm also not sure it would work properly now the canonical
 form of -D options is defined to have separate argument.  This patch
 removes support for that obsolete operator.
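
 For reference, the transformation the removed operator performed can be
 sketched in standalone C.  This is an illustration modelled on the deleted
 lines visible in the patch, not the GCC code itself: sketch_wrap_defines is a
 hypothetical name, the tail of the loop is an assumption because the quoted
 diff is cut off, and the caller is assumed to supply an output buffer of at
 least 3 * strlen (spec) + 1 bytes.

```c
#include <assert.h>
#include <string.h>

/* Copy SPEC into OUT, putting "__" after every "-D" and again at
   the end of the macro name that follows it.  */
static void
sketch_wrap_defines (const char *spec, char *out)
{
  const char *y = spec;
  int in_define = 0;

  assert (spec != NULL && out != NULL);
  for (;;)
    {
      if (strncmp (y, "-D", 2) == 0)
        {
          memcpy (out, "-D__", 4);
          out += 4;
          y += 2;
          in_define = 1;
          continue;
        }
      if (in_define
          && (*y == ' ' || *y == '\t' || *y == '='
              || *y == '}' || *y == 0))
        {
          /* End of the macro name: close it with "__".  */
          *out++ = '_';
          *out++ = '_';
          in_define = 0;
        }
      if (*y == 0)
        break;
      *out++ = *y++;
    }
  *out = 0;
}
```

 The sketch only recognises -D with the macro name attached, so a separate
 "-D foo" form is not wrapped as intended.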

 Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
 commit?

Ok.

Thanks,
Richard.

 2011-07-06  Joseph Myers  jos...@codesourcery.com

        * gcc.c (%[Spec]): Don't document.
        (struct spec_list): Update comment.
        (do_spec_1): Don't handle %[Spec].
        * doc/invoke.texi (%[@var{name}]): Remove documentation of spec.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 175919)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -9768,9 +9768,6 @@ Use this when inconsistent options are d
  @item %(@var{name})
  Substitute the contents of spec string @var{name} at this point.

 -@item %[@var{name}]
 -Like @samp{%(@dots{})} but put @samp{__} around @option{-D} arguments.
 -
  @item %x@{@var{option}@}
  Accumulate an option for @samp{%X}.

 Index: gcc/gcc.c
 ===
 --- gcc/gcc.c   (revision 175919)
 +++ gcc/gcc.c   (working copy)
 @@ -438,7 +438,6 @@ or with constant text in a single argume
           This may be combined with '.', '!', ',', '|', and '*' as above.

  %(Spec) processes a specification defined in a specs file as *Spec:
 - %[Spec] as above, but put __ around -D arguments

  The conditional text X in a %{S:X} or similar construct may contain
  other nested % constructs or spaces, or even newlines.  They are
 @@ -1149,8 +1148,8 @@ static const char *multilib_dir;
  static const char *multilib_os_dir;

  /* Structure to keep track of the specs that have been defined so far.
 -   These are accessed using %(specname) or %[specname] in a compiler
 -   or link spec.  */
 +   These are accessed using %(specname) in a compiler or link
 +   spec.  */

  struct spec_list
  {
 @@ -5212,11 +5211,7 @@ do_spec_1 (const char *spec, int inswitc

            /* Process a string found as the value of a spec given by name.
               This feature allows individual machine descriptions
 -              to add and use their own specs.
 -              %[...] modifies -D options the way %P does;
 -              %(...) uses the spec unmodified.  */
 -         case '[':
 -           warning (0, "use of obsolete %%[ operator in specs");
 +              to add and use their own specs.  */
          case '(':
            {
              const char *name = p;
 @@ -5225,7 +5220,7 @@ do_spec_1 (const char *spec, int inswitc

              /* The string after the S/P is the name of a spec that is to be
                 processed.  */
 -             while (*p && *p != ')' && *p != ']')
 +             while (*p && *p != ')')
                p++;

              /* See if it's in the list.  */
 @@ -5234,63 +5229,20 @@ do_spec_1 (const char *spec, int inswitc
                  {
                    name = *(sl->ptr_spec);
  #ifdef DEBUG_SPECS
 -                   fnotice (stderr, "Processing spec %c%s%c, which is '%s'\n",
 -                           c, sl->name, (c == '(') ? ')' : ']', name);
 +                   fnotice (stderr, "Processing spec (%s), which is '%s'\n",
 +                            sl->name, name);
  #endif
                    break;
                  }

              if (sl)
                {
 -                 if (c == '(')
 -                   {
 -                     value = do_spec_1 (name, 0, NULL);
 -                     if (value != 0)
 -                       return value;
 -                   }
 -                 else
 -                   {
 -                     char *x = (char *) alloca (strlen (name) * 2 + 1);
 -                     char *buf = x;
 -                     const char *y = name;
 -                     int flag = 0;
 -
 -                     /* Copy all of NAME into BUF, but put __ after
 -                        every -D and at the end of each arg.  */
 -                     while (1)
 -                       {
 -                         if (! strncmp (y, "-D", 2))
 -                           {
 -                             *x++ = '-';
 -                             *x++ = 'D';
 -                             *x++ = '_';
 -                             *x++ = '_';
 -                             y += 2;
 -                             flag = 1;
 -                             continue;
 -                           }
 -                         else if (flag
 -                                && (*y == ' ' || *y == '\t' || *y == '='
 -                                      || *y == '}' || *y == 0))
 -                           {
 -                             *x++ = '_';
 -                             *x++ = '_';
 -                             flag = 0;
 -                           }
 -                         

Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-07 Thread Jakub Jelinek
On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:
 Hum.  Can't the compiler figure this out itself per-call-site?  At least
 the name of the command-line switch -m[no-]r11 is meaningless to me.
 Points-to information should be able to tell you if the function pointer
 points to a nested function.

Yeah.  E.g. for C++ virtual method calls I believe all function pointers in
vtables should always ignore the static chain pointer, etc., because you
can't have a nested method.

Jakub
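
To make the static chain discussion concrete, here is a rough standalone model. It assumes the AIX-style three-word function descriptor (entry point, TOC, static chain) that the patch's AIX_FUNC_DESC_* constants refer to. The struct and function names are hypothetical, and the chain is passed as an ordinary argument instead of in r11.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a three-word function descriptor.  */
struct sketch_func_desc
{
  long (*entry) (void *static_chain, long arg);
  void *toc;            /* Callee's TOC pointer (unused here).  */
  void *static_chain;   /* Frame of the enclosing function.  */
};

/* Default indirect call: load the chain from the descriptor.  */
static long
sketch_call_with_chain (const struct sketch_func_desc *fd, long arg)
{
  return fd->entry (fd->static_chain, arg);
}

/* What skipping the load amounts to: the chain is never read.  */
static long
sketch_call_without_chain (const struct sketch_func_desc *fd, long arg)
{
  return fd->entry (NULL, arg);
}

/* An ordinary function or method: ignores the chain.  */
static long
sketch_plain_callee (void *static_chain, long arg)
{
  (void) static_chain;
  return arg + 1;
}

/* Models a nested function: reads a local of its parent through
   the chain, so it breaks if the chain was not loaded.  */
static long
sketch_nested_callee (void *static_chain, long arg)
{
  assert (static_chain != NULL);
  return *(long *) static_chain + arg;
}
```

Only the nested callee depends on the chain, which is why a per-call-site decision (points-to information, or knowing the target is a C++ virtual method) would be enough to drop the load safely.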


Re: plugin event for C/C++ declarations

2011-07-07 Thread Romain Geissler
 On Tue, Dec 22, 2009 at 11:45 AM, Diego Novillo dnovi...@google.com wrote:
 On Tue, Dec 22, 2009 at 13:00, Brian Hackett bhackett1...@gmail.com wrote:
 Hi, this patch adds a new plugin event FINISH_DECL, which is invoked
 at every finish_decl in the C and C++ frontends.  Previously there did
 not seem to be a way for a plugin to see the definition for a global
 that is never used in the input file, or the initializer for a global
 which is declared before a function but defined after.  This event
 isn't restricted to just globals though, but also locals, fields, and
 parameters (C frontend only).

 Thanks for your patch.  This will be great to fix
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41757 but we need to wait
 for your copyright assignment to go through before we can accept it.

 Hi, this is a patch from a few months ago which I was not able to get
 an assignment for.  The FSF has a personal copyright assignment for
 me, but I could not get one from my employer at the time, Stanford
 (according to Stanford's policies they would not claim copyright on
 this patch).  I now work for Mozilla, which (I understand) has a
 company-wide copyright assignment.  Are there issues if I rewrite the
 patch from scratch and resubmit it?

 Original patch (9 new lines of code, doc change and new regression):

 http://gcc.gnu.org/ml/gcc-patches/2009-12/msg01032.html

 Brian

Hi,

Once again, this is a ping for the patch Brian Hackett proposed a long time ago.
See the last thread about it here:
http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00315.html

Below is the patch updated for a recent revision (the global and local var
decl detection in gcc/testsuite/g++.dg/plugin/decl_plugin.c changed).

Romain Geissler

2011-07-07  Romain Geissler  romain.geiss...@gmail.com
2010-04-14  Brian Hackett  bhackett1...@gmail.com

gcc/ChangeLog:

   * plugin.def: Add event for finish_decl.
   * plugin.c (register_callback, invoke_plugin_callbacks): Same.
   * c-decl.c (finish_decl): Invoke callbacks on above event.
   * doc/plugins.texi: Document above event.

gcc/cp/ChangeLog:

   * decl.c (cp_finish_decl): Invoke callbacks on finish_decl event.

gcc/testsuite/ChangeLog:

   * g++.dg/plugin/decl_plugin.c: New test plugin.
   * g++.dg/plugin/decl-plugin-test.C: Testcase for above plugin.
   * g++.dg/plugin/plugin.exp: Add above testcase.


Index: gcc/doc/plugins.texi
===
--- gcc/doc/plugins.texi(revision 175907)
+++ gcc/doc/plugins.texi(working copy)
@@ -151,6 +151,7 @@ enum plugin_event
 @{
   PLUGIN_PASS_MANAGER_SETUP,/* To hook into pass manager.  */
   PLUGIN_FINISH_TYPE,   /* After finishing parsing a type.  */
+  PLUGIN_FINISH_DECL,   /* After finishing parsing a declaration. */
   PLUGIN_FINISH_UNIT,   /* Useful for summary processing.  */
   PLUGIN_PRE_GENERICIZE,/* Allows to see low level AST in C
and C++ frontends.  */
   PLUGIN_FINISH,/* Called before GCC exits.  */
Index: gcc/plugin.def
===
--- gcc/plugin.def  (revision 175907)
+++ gcc/plugin.def  (working copy)
@@ -24,6 +24,9 @@ DEFEVENT (PLUGIN_PASS_MANAGER_SETUP)
 /* After finishing parsing a type.  */
 DEFEVENT (PLUGIN_FINISH_TYPE)

+/* After finishing parsing a declaration. */
+DEFEVENT (PLUGIN_FINISH_DECL)
+
 /* Useful for summary processing.  */
 DEFEVENT (PLUGIN_FINISH_UNIT)

Index: gcc/testsuite/g++.dg/plugin/plugin.exp
===
--- gcc/testsuite/g++.dg/plugin/plugin.exp  (revision 175907)
+++ gcc/testsuite/g++.dg/plugin/plugin.exp  (working copy)
@@ -51,7 +51,8 @@ set plugin_test_list [list \
 { pragma_plugin.c pragma_plugin-test-1.C } \
 { selfassign.c self-assign-test-1.C self-assign-test-2.C
self-assign-test-3.C } \
 { dumb_plugin.c dumb-plugin-test-1.C } \
-{ header_plugin.c header-plugin-test.C } ]
+{ header_plugin.c header-plugin-test.C } \
+{ decl_plugin.c decl-plugin-test.C } ]

 foreach plugin_test $plugin_test_list {
 # Replace each source file with its full-path name
Index: gcc/testsuite/g++.dg/plugin/decl-plugin-test.C
===
--- gcc/testsuite/g++.dg/plugin/decl-plugin-test.C  (revision 0)
+++ gcc/testsuite/g++.dg/plugin/decl-plugin-test.C  (revision 0)
@@ -0,0 +1,32 @@
+
+
+extern int global; // { dg-warning "Decl Global global" }
+int global_array[] = { 1, 2, 3 }; // { dg-warning "Decl Global global_array" }
+
+int takes_args(int arg1, int arg2)
+{
+  int local = arg1 + arg2 + global; // { dg-warning "Decl Local local" }
+  return local + 1;
+}
+
+int global = 12; // { dg-warning "Decl Global global" }
+
+struct test_str {
+  int field; // { dg-warning "Decl Field field" }
+};
+
+class test_class {
+  int class_field1; // { dg-warning "Decl Field class_field1" }
+  int 

Re: [v3] Correctly determine baseline_subdir for 64-bit default Solaris gcc

2011-07-07 Thread Paolo Carlini

Hi,

Ok for mainline if that passes?
I'm going to trust you Rainer on this and it seems very safe on 
x86_64-linux anyway. Please wait just one more day or so and then check 
it in.


Thanks,
Paolo.


Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 11:03 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:
 Hum.  Can't the compiler figure this out itself per-call-site?  At least
 the name of the command-line switch -m[no-]r11 is meaningless to me.
 Points-to information should be able to tell you if the function pointer
 points to a nested function.

 Yeah.  E.g. for C++ virtual method calls I believe all function pointers in
 vtables should always ignore the static chain pointer, etc., because you
 can't have a nested method.

For this kind of FE specific info you could use a flag on the CALL_EXPR
as well.

Richard.


Re: [v3] Correctly determine baseline_subdir for 64-bit default Solaris gcc

2011-07-07 Thread Rainer Orth
Hi Paolo,

 Ok for mainline if that passes?
 I'm going to trust you Rainer on this and it seems very safe on
 x86_64-linux anyway. Please wait just one more day or so and then check it
 in.

ok, will do.  The x86_64-unknown-linux-gnu bootstrap has completed
without regressions and the correct baselines were used for both
multilibs.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Patch, AVR]: Fix PR46779

2011-07-07 Thread Georg-Johann Lay
Denis Chertykov wrote:
 2011/6/27 Georg-Johann Lay:
 Denis Chertykov wrote:

 The main problem for me is that the new addressing mode produces
 worse code in many tests.
 Do you have an example source?
 
 In attachment.
 
 Denis.

Hi Denis.

I had a look at the sources you sent.

sort.c:
===

There is some difference because of register allocation, but the new
code does not look awfully bad, just a bit different because of
different register allocation that might need some more bytes.

The difference is *not* because of denying fake X addressing, it's
because of the new avr_hard_regno_mode_ok implementation to fix
PR46779.  When I add


  if (GET_MODE_SIZE (mode) == 1)
    return 1;

+ if (SImode == mode && regno == 28)
+   return 0;

  return regno % 2 == 0;

to that function, the difference in code disappears.
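
The effect of the extra check can be tried out in a standalone sketch. It models only the three tests quoted above, not the complete avr_hard_regno_mode_ok; the sketch_ names and the reduced mode enum are hypothetical.

```c
#include <assert.h>

/* Hypothetical, reduced set of machine modes with byte sizes.  */
enum sketch_mode { SKETCH_QI, SKETCH_HI, SKETCH_SI };

static int
sketch_mode_size (enum sketch_mode mode)
{
  switch (mode)
    {
    case SKETCH_QI: return 1;
    case SKETCH_HI: return 2;
    default:        return 4;
    }
}

/* Nonzero if a value of MODE may live in hard register REGNO.  */
static int
sketch_hard_regno_mode_ok (int regno, enum sketch_mode mode)
{
  assert (regno >= 0);

  /* Single bytes fit in any register.  */
  if (sketch_mode_size (mode) == 1)
    return 1;

  /* The experimental addition: keep 32-bit integers out of the
     group starting at register 28.  */
  if (mode == SKETCH_SI && regno == 28)
    return 0;

  /* Multi-byte values must start at an even register.  */
  return regno % 2 == 0;
}
```

With the check in place a 32-bit integer may still start at register 24 but not at 28, while 8-bit and 16-bit values in register 28 are unaffected.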

pr.c:
=

I get the following sizes with pr-0 the original compile and pr with
my patch:

avr-size pr-0.o
   text    data     bss     dec     hex filename
   2824      24       0    2848     b20 pr-0.o
avr-size pr.o
   text    data     bss     dec     hex filename
   2564      24       0    2588     a1c pr.o

So the size actually decreased significantly.  Avoiding SI in
avr_hard_regno_mode_ok like above does not change code size.



Note that I did *not* use the version from the git repository; I could
not get reasonable code out of it (even after some fixes).  Hundreds
of testsuite crashes...

I used the initial patch that I posted; I attached it again for
reference. Note that LEGITIMIZE_RELOAD_ADDRESS is still not
implemented there.



Did you decide about the fix for PR46779?

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00810.html

Is it ok to commit?

I think the fix for PR46779 and the fix for fake X addresses (PR46278)
should be separate patches and not intermixed.

Johann

Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 175956)
+++ config/avr/avr.md	(working copy)
@@ -246,8 +246,8 @@ (define_expand movqi
   )
 
 (define_insn *movqi
-  [(set (match_operand:QI 0 nonimmediate_operand =r,d,Qm,r,q,r,*r)
-	(match_operand:QI 1 general_operand   rL,i,rL,Qm,r,q,i))]
+  [(set (match_operand:QI 0 nonimmediate_operand =r,d,m,r,q,r,*r)
+	(match_operand:QI 1 general_operand   rL,i,rL,m,r,q,i))]
   (register_operand (operands[0],QImode)
 || register_operand (operands[1], QImode) || const0_rtx == operands[1])
   * return output_movqi (insn, operands, NULL);
@@ -295,15 +295,6 @@ (define_expand movhi
 }
 })
 
-(define_insn *movhi_sp
-  [(set (match_operand:HI 0 register_operand =q,r)
-(match_operand:HI 1 register_operand  r,q))]
-  ((stack_register_operand(operands[0], HImode)  register_operand (operands[1], HImode))
-|| (register_operand (operands[0], HImode)  stack_register_operand(operands[1], HImode)))
-  * return output_movhi (insn, operands, NULL);
-  [(set_attr length 5,2)
-   (set_attr cc none,none)])
-
 (define_insn movhi_sp_r_irq_off
   [(set (match_operand:HI 0 stack_register_operand =q)
 (unspec_volatile:HI [(match_operand:HI 1 register_operand  r)] 
@@ -427,8 +418,8 @@ (define_insn *reload_insi
 
 
 (define_insn *movsi
-  [(set (match_operand:SI 0 nonimmediate_operand =r,r,r,Qm,!d,r)
-(match_operand:SI 1 general_operand   r,L,Qm,rL,i,i))]
+  [(set (match_operand:SI 0 nonimmediate_operand =r,r,r,m,!d,r)
+(match_operand:SI 1 general_operand   r,L,m,rL,i,i))]
   (register_operand (operands[0],SImode)
 || register_operand (operands[1],SImode) || const0_rtx == operands[1])
   {
@@ -455,8 +446,8 @@ (define_expand movsf
 })
 
 (define_insn *movsf
-  [(set (match_operand:SF 0 nonimmediate_operand =r,r,r,Qm,!d,r)
-(match_operand:SF 1 general_operand   r,G,Qm,rG,F,F))]
+  [(set (match_operand:SF 0 nonimmediate_operand =r,r,r,m,!d,r)
+(match_operand:SF 1 general_operand   r,G,m,rG,F,F))]
   register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode)
|| operands[1] == CONST0_RTX (SFmode)
@@ -1592,8 +1583,8 @@ (define_mode_attr rotx [(DI r,r,X) (
 (define_mode_attr rotsmode [(DI QI) (SI HI) (HI QI)])
 
 (define_expand rotlmode3
-  [(parallel [(set (match_operand:HIDI 0 register_operand )
-		   (rotate:HIDI (match_operand:HIDI 1 register_operand )
+  [(parallel [(set (match_operand:HISI 0 register_operand )
+		   (rotate:HISI (match_operand:HISI 1 register_operand )
 (match_operand:VOID 2 const_int_operand )))
 		(clobber (match_dup 3))])]
   
@@ -1692,7 +1683,7 @@ (define_split ; ashlqi3_const6
 (define_insn *ashlqi3
   [(set (match_operand:QI 0 register_operand   =r,r,r,r,!d,r,r)
 	(ashift:QI (match_operand:QI 1 register_operand 0,0,0,0,0,0,0)
-		   (match_operand:QI 2 general_operand  r,L,P,K,n,n,Qm)))]
+		   (match_operand:QI 2 general_operand  r,L,P,K,n,n,m)))]
   
   * return ashlqi3_out (insn, operands, NULL);
   [(set_attr length 5,0,1,2,4,6,9)
@@ -1701,7 +1692,7 @@ (define_insn 

Re: Provide 64-bit default Solaris/x86 configuration (PR target/39150)

2011-07-07 Thread Rainer Orth
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 There has long been some clamoring for a amd64-*-solaris2 configuration
 similar to sparcv9-sun-solaris2.  I've resisted this for quite some
 time, primarily because it doubles the maintenance effort of testing
 both the 32-bit default and 64-bit default configurations.
[...]
 I think practically the whole patch falls under the Solaris
 maintainership, with the possible exception of the change to the copy of
 libtool.m4 in libgo/config.  This is not for the technical content, but
 for the special commit rules to that directory.  Ian?
  
 Anyway, this part of the patch will have to go to upstream libtool.
 Ralf, could you take care of that?

 Bootstrapped without regression on i386-pc-solaris2.10 (both 32-bit
 default and 64-bit default configurations), i386-pc-solaris2.11 and
 sparc-sun-solaris2.11 in progress.
[...]
 Once all the bootstraps have finished, I'll commit this patch (at least
 the non-libgo parts) unless anything unexpected comes up.

All bootstraps have completed without regressions, so I've installed the
patch as is, after verifying that the libgo parts aren't present in the
upstream Go repo.

I've also synced the toplevel configure.ac/configure changes to src.

One other issue: it was suggested that the 64-bit compiler might
actually be faster than a 32-bit one.  At least the bootstrap times tell
a different story: on a Sun Fire X4450 running Solaris 10 with 4 x 2.93
GHz Quad-Core Xeon X7350, I find for make -j32 + make -j32 -k check
for both multilibs:

              64-bit      32-bit

as/ld
  real    1:59:28.66  1:52:15.55
  user    7:14:33.93  6:43:25.84
  sys     5:26:30.66  4:41:02.78

gas/ld
  real    2:02:47.64  1:54:24.51
  user    7:10:41.93  6:39:39.39
  sys     5:37:15.86  4:51:41.02

gas/gld
  real    1:59:57.13  1:45:13.18
  user    7:57:37.13  7:11:41.83
  sys     5:11:58.14  4:04:26.97

Same picture on a Sun Fire X4600 M2 running Solaris 11 with 8 x 2.6 GHz
Dual-Core Opteron 8218.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-07 Thread Richard Guenther
On Mon, Jul 4, 2011 at 4:26 PM, Andrew Stubbs a...@codesourcery.com wrote:
 On 28/06/11 15:14, Andrew Stubbs wrote:

 On 28/06/11 13:33, Andrew Stubbs wrote:

 On 23/06/11 15:41, Andrew Stubbs wrote:

 If one or both of the inputs to a widening multiply are of unsigned type
 then the compiler will attempt to use usmul_widen_optab or
 umul_widen_optab, respectively.

 That works fine, but only if the target supports those operations
 directly. Otherwise, it just bombs out and reverts to the normal
 inefficient non-widening multiply.

 This patch attempts to catch these cases and use an alternative signed
 widening multiply instruction, if one of those is available.

 I believe this should be legal as long as the top bit of both inputs is
 guaranteed to be zero. The code achieves this guarantee by
 zero-extending the inputs to a wider mode (which must still be narrower
 than the output mode).
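
The zero-extension argument can be checked at a small scale. The sketch below is an illustration only, not code from the patch: smul_widen_hi is a hypothetical stand-in for a target's signed 16x16->32 widening multiply, and the 8-bit input width is chosen so that every input pair can be tested.

```c
#include <stdint.h>

/* Hypothetical stand-in for a signed 16x16->32 widening multiply
   instruction; the name is an assumption made for this sketch.  */
static int32_t
smul_widen_hi (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Unsigned 8x8->32 multiply done with the signed operation.
   Zero-extending to 16 bits leaves the top bit of each input clear,
   so the signed and unsigned interpretations of the inputs agree.  */
static uint32_t
umul_qi_via_signed (uint8_t a, uint8_t b)
{
  int16_t wa = (int16_t) a;   /* value 0..255, sign bit is 0 */
  int16_t wb = (int16_t) b;
  return (uint32_t) smul_widen_hi (wa, wb);
}

/* Count the input pairs for which the result differs from a plain
   unsigned multiply.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      if (umul_qi_via_signed ((uint8_t) a, (uint8_t) b) != a * b)
        bad++;
  return bad;
}
```

The intermediate 16-bit mode is wider than the 8-bit inputs and narrower than the 32-bit result, which mirrors the constraint stated above.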

 OK?

 This update fixes the testsuite issue Janis pointed out.

 And this one fixes up the wmul-5.c testcase also. The patch has changed
 the correct result.

 Here's an update for the context changed by the update to patch 3.

 The content of the patch has not changed.

+  gimple stmt = gimple_build_assign (result, fold_convert (type, val));

please use gimple_build_assign_with_ops

-convert_mult_to_widen (gimple stmt)
+convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)

The comment needs updating for the new parameter.

+ type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0);

don't use type_for_mode, use build_nonstandard_integer_type
(GET_MODE_PRECISION (from_mode), 0) instead.

Both types are equal, so please share the temporary variable you
create

+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+   create_tmp_var (type1, NULL),
rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+   create_tmp_var (type2, NULL),
rhs2, type2);

here (CSE create_tmp_var).

+ type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+ mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+create_tmp_var (type1, NULL),
+mult_rhs1, type1);
+ mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+create_tmp_var (type2, NULL),
+mult_rhs2, type2);

Likewise.

Thanks,
Richard.

 Andrew



Re: Improve Solaris mudflap support (PR libmudflap/49550)

2011-07-07 Thread Rainer Orth
Hi Frank,

 I could either commit the current version with the MFWRAP_SPEC addition
 and work from there, or wait until those failures are understood and
 fixed, too.

 Committing now would be fine, assuming no regressions on a primary
 platform.

below is the patch I've actually committed, after the
x86_64-unknown-linux-gnu bootstrap completed without regressions.  In
fact, 4 failures were fixed:

-FAIL: libmudflap.c/pass-stratcliff.c (test for excess errors)
-FAIL: libmudflap.c/pass-stratcliff.c (-static) (test for excess errors)
-FAIL: libmudflap.c/pass-stratcliff.c (-O2) (test for excess errors)
-FAIL: libmudflap.c/pass-stratcliff.c (-O3) (test for excess errors)

/vol/gcc/src/hg/trunk/local/libmudflap/testsuite/libmudflap.c/pass-stratcliff.c:253:21:
 warning: extra tokens at end of #ifndef directive [enabled by default]

which was introduced by me in my last patch, but went unnoticed ;-)

Rainer


2011-06-29  Rainer Orth  r...@cebitec.uni-bielefeld.de

gcc:
libmudflap/49550
* gcc.c (MFWRAP_SPEC): Also wrap mmap64.

libmudflap:
libmudflap/49550
* mf-runtime.c (__wrap_main) [__sun__ && __svr4__]: Don't register
stdin, stdout, stderr.
Register __ctype, __ctype_mask.

* configure.ac: Check for mmap64.
Check for rawmemchr, stpcpy, mempcpy.
* configure: Regenerate.
* config.h.in: Regenerate.
* mf-hooks1.c [HAVE_MMAP64] (__mf_0fn_mmap64): New function.
(mmap64): New wrapper function.
* mf-impl.h (__mf_dynamic_index) [HAVE_MMAP64]: Add dyn_mmap64.
* mf-runtime.c (__mf_dynamic) [HAVE_MMAP64]: Handle mmap64.

* mf-hooks2.c [HAVE_GETMNTENT && HAVE_SYS_MNTTAB_H]: Implement
getmntent wrapper.

* mf-hooks3.c (_REENTRANT): Define.

* testsuite/libmudflap.c/heap-scalestress.c (SCALE): Reduce to 1.

* testsuite/libmudflap.c/pass-stratcliff.c: Include ../config.h.
(MIN): Define.
Use HAVE_RAWMEMCHR, HAVE_STPCPY, HAVE_MEMPCPY as guards.

* testsuite/libmudflap.c/pass47-frag.c: Expect __ctype warning on
*-*-solaris2.*.

diff --git a/gcc/gcc.c b/gcc/gcc.c
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -518,7 +518,7 @@ proper position among the other output f
 /* XXX: should exactly match hooks provided by libmudflap.a */
 #define MFWRAP_SPEC " %{static: %{fmudflap|fmudflapth: \
  --wrap=malloc --wrap=free --wrap=calloc --wrap=realloc\
- --wrap=mmap --wrap=munmap --wrap=alloca\
+ --wrap=mmap --wrap=mmap64 --wrap=munmap --wrap=alloca\
 } %{fmudflapth: --wrap=pthread_create\
 }} %{fmudflap|fmudflapth: --wrap=main}"
 #endif
diff --git a/libmudflap/configure.ac b/libmudflap/configure.ac
--- a/libmudflap/configure.ac
+++ b/libmudflap/configure.ac
@@ -75,7 +75,9 @@ AC_CHECK_FUNCS(getservent getservbyname 
 AC_CHECK_FUNCS(getprotoent getprotobyname getprotobynumber)
 AC_CHECK_FUNCS(getmntent setmntent addmntent)
 AC_CHECK_FUNCS(inet_ntoa mmap munmap)
+AC_CHECK_FUNCS(mmap64)
 AC_CHECK_FUNCS(__libc_freeres)
+AC_CHECK_FUNCS(rawmemchr stpcpy mempcpy)
 
 AC_TRY_COMPILE([#include <sys/types.h>
 #include <sys/ipc.h>
diff --git a/libmudflap/mf-hooks1.c b/libmudflap/mf-hooks1.c
--- a/libmudflap/mf-hooks1.c
+++ b/libmudflap/mf-hooks1.c
@@ -1,5 +1,5 @@
 /* Mudflap: narrow-pointer bounds-checking by tree rewriting.
-   Copyright (C) 2002, 2003, 2004, 2009 Free Software Foundation, Inc.
+   Copyright (C) 2002, 2003, 2004, 2009, 2011 Free Software Foundation, Inc.
Contributed by Frank Ch. Eigler f...@redhat.com
and Graydon Hoare gray...@redhat.com
 
@@ -414,6 +414,61 @@ WRAPPER(int , munmap, void *start, size_
 #endif /* HAVE_MMAP */
 
 
+#ifdef HAVE_MMAP64
+#if PIC
+/* A special bootstrap variant. */
+void *
+__mf_0fn_mmap64 (void *start, size_t l, int prot, int f, int fd, off64_t off)
+{
+  return (void *) -1;
+}
+#endif
+
+
+#undef mmap
+WRAPPER(void *, mmap64,
+   void  *start,  size_t length, int prot,
+   int flags, int fd, off64_t offset)
+{
+  DECLARE(void *, mmap64, void *, size_t, int,
+   int, int, off64_t);
+  void *result;
+  BEGIN_PROTECT (mmap64, start, length, prot, flags, fd, offset);
+
+  result = CALL_REAL (mmap64, start, length, prot,
+   flags, fd, offset);
+
+  /*
+  VERBOSE_TRACE ("mmap64 (%08lx, %08lx, ...) = %08lx\n",
+(uintptr_t) start, (uintptr_t) length,
+(uintptr_t) result);
+  */
+
+  if (result != (void *)-1)
+{
+  /* Register each page as a heap object.  Why not register it all
+as a single segment?  That's so that a later munmap() call
+can unmap individual pages.  XXX: would __MF_TYPE_GUESS make
+this more automatic?  */
+  size_t ps = getpagesize ();
+  uintptr_t base = (uintptr_t) result;
+  uintptr_t offset;
+
+  for (offset=0; offset<length; offset+=ps)
+   {
+ /* XXX: We could map PROT_NONE to __MF_TYPE_NOACCESS. */
+ /* XXX: Unaccessed HEAP pages are reported as 

Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs

2011-07-07 Thread Richard Guenther
On Mon, Jul 4, 2011 at 4:29 PM, Andrew Stubbs a...@codesourcery.com wrote:
 On 28/06/11 16:08, Andrew Stubbs wrote:

 On 23/06/11 15:41, Andrew Stubbs wrote:

 This patch removes the restriction that the inputs to a widening
 multiply must be of the same mode.

 It does this by extending the smaller of the two inputs to match the
 larger; therefore, it remains the case that subsequent code (in the
 expand pass, for example) can rely on the type of rhs1 being the input
 type of the operation, and the gimple verification code is still valid.
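
As a source-level picture of "extending the smaller of the two inputs to match the larger", here is a minimal sketch. It is not taken from the patch; smul_widen_hi is a hypothetical stand-in for a signed 16x16->32 widening multiply, and the 8-bit and 16-bit input widths are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a signed 16x16->32 widening multiply
   that requires both inputs to have the same width.  */
static int32_t
smul_widen_hi (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Multiply an 8-bit value by a 16-bit value with a 32-bit result.
   The narrower operand is sign-extended first, so that both inputs
   of the widening multiply have the same (larger) width.  */
static int32_t
mul_qi_hi (int8_t a, int16_t b)
{
  int16_t wa = a;   /* sign-extend the smaller input */
  return smul_widen_hi (wa, b);
}
```

After the extension, both operands of the multiply have the same type, which is the property the paragraph above says later code can keep relying on.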

 OK?

 This update fixes the testcase issue Janis highlighted.

 And this one updates the context changed by my update to patch 3.

 The content of the patch has not changed.

Similar to the previous patch

+  if (TYPE_MODE (type2) != from_mode)
+{
+  type2 = lang_hooks.types.type_for_mode (from_mode,
+ TYPE_UNSIGNED (type2));

use build_nonstandard_integer_type.

+  if (cast1)
+rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type1, NULL), rhs1, type1);
+  if (cast2)
+rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+ create_tmp_var (type2, NULL), rhs2, type2);

and CSE create_tmp_var - at this point type1 and type2 should be
the same, right?  So I guess it would be a good place to assert
types_compatible_p (type1, type2).

   gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1));
   gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2));

and that's now seemingly redundant ... it should probably be
gimple_assign_set_rhs1 (stmt, rhs1);, no?  A conversion isn't
a valid rhs1/2.  Similar oddity in convert_plusminus_to_widen.

+  if (TYPE_MODE (type2) != TYPE_MODE (type1))
+{
+  type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1),
+ TYPE_UNSIGNED (type2));
+  cast2 = true;
+}
+
+  if (cast1)
+mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+  create_tmp_var (type1, NULL),
+  mult_rhs1, type1);
+  if (cast2)
+mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+  create_tmp_var (type2, NULL),
+  mult_rhs2, type2);

see above.

Thanks,
Richard.

 Andrew



[PATCH, MELT] new function register_data_handler

2011-07-07 Thread Pierre Vittet
Hi, this patch adds a new function that makes it easier to register a
pragma handler. Previously, we modified the :sysdata_meltpragmas
field of initial_system_data directly.


The function takes a list of the new pragma handlers that we want to
add. The reason is that the :sysdata_meltpragmas field is a tuple (its
fixed size is mandatory, because we use the index to recognize the
handler later). Each call to register_pragma_handler recreates the
tuple, so we pass a list of handlers in order not to call it too often.
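
The design choice (rebuild a fixed-size table once per batch instead of once per handler) can be sketched outside MELT. The C below is an analogy only: struct handler, struct registry and register_handlers are hypothetical names, and nothing here reflects the MELT runtime's actual data layout.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler record and fixed-size table ("tuple").  */
struct handler { const char *name; };
struct registry { struct handler **items; size_t count; };

/* Append all N handlers from LIST with a single rebuild of the table.
   Existing entries keep their index.  Returns 0 on success and -1 if
   the allocation fails (the registry is then left unchanged).  */
static int
register_handlers (struct registry *reg, struct handler **list, size_t n)
{
  size_t newcount = reg->count + n;
  struct handler **fresh = malloc (newcount * sizeof *fresh);

  if (fresh == NULL && newcount != 0)
    return -1;
  if (reg->count != 0)
    memcpy (fresh, reg->items, reg->count * sizeof *fresh);
  if (n != 0)
    memcpy (fresh + reg->count, list, n * sizeof *fresh);
  free (reg->items);
  reg->items = fresh;
  reg->count = newcount;
  return 0;
}
```

Registering k handlers one at a time would copy the table k times; passing them as one list copies it once.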


This function should work with GCC 4.6, but should be used with care,
as we can only register a single pragma named melt there (maybe we
could use another function specifically for 4.6?).


Thanks!

Pierre Vittet
2011-07-07  Pierre Vittet  pier...@pvittet.com

* melt/warmelt-base.melt (register_pragma_handler): New function.
Index: gcc/melt/warmelt-base.melt
===
--- gcc/melt/warmelt-base.melt  (revision 175906)
+++ gcc/melt/warmelt-base.melt  (working copy)
@@ -1135,6 +1135,42 @@ registered with $REGISTER_PASS_EXECUTION_HOOK.}#
   }#)
 )))
 
+;;register a new pragma handler.
+(defun register_pragma_handler (lsthandler)
+  :doc #{register a list of new pragma handlers.  As :sysdata_meltpragmas must
+  be a tuple (we use an index to recognize handlers), we have to recreate this
+  tuple each time we call this function.  That why $LSTHANDLER is a list of
+  handlers (class_gcc_pragma) and not a single object.  }#
+  (assert_msg "register_pragma_handler takes a list as argument."
+(is_list lsthandler))
+  (let ((oldtuple (get_field :sysdata_meltpragmas initial_system_data))
+(:long oldsize 0))
+(if notnull oldtuple)
+  (setq oldsize (multiple_length oldtuple))
+(let ((:long newsize (+i (multiple_length oldtuple)
+  (list_length lsthandler)))
+  (newtuple (make_multiple discr_multiple newsize))
+  (:long i 0))
+;;copy in oldhandlers in the newtuple
+(foreach_in_multiple
+(oldtuple)
+(curhander :long iunused)
+  (multiple_put_nth newtuple i curhander)
+  (setq i (+i i 1))
+)
+;;add new handler from lsthandler
+(foreach_in_list
+(lsthandler)
+(curpair curhandler)
+  (assert_msg "register_pragma_handler must be a list of class_gcc_pragma."
+(is_a curhandler class_gcc_pragma))
+  (multiple_put_nth newtuple i curhandler)
+  (setq i (+i i 1))
+)
+(put_fields initial_system_data :sysdata_meltpragmas newtuple)
+))
+)
+
 
  the descriptions of values which are not ctype related.
 (defclass class_value_descriptor
@@ -2361,6 +2397,7 @@ polyhedron values.}#
  ppstrbuf_mixbigint
  read_file
  register_pass_execution_hook
+ register_pragma_handler
  retrieve_value_descriptor_list
  some_integer_greater_than
  some_integer_multiple


Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-07-07 Thread Richard Guenther
On Mon, Jul 4, 2011 at 4:31 PM, Andrew Stubbs a...@codesourcery.com wrote:
 On 28/06/11 16:30, Andrew Stubbs wrote:

 On 23/06/11 15:42, Andrew Stubbs wrote:

 This patch fixes the case where a widening multiply-and-accumulate was
 not recognised because the multiplication itself is not actually
 widening.

 This can happen when you have DI + SI * SI - the multiplication will
 be done in SImode as a non-widening multiply, and it's only the final
 accumulate step that is widening.
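
 In C terms, the DI + SI * SI shape looks like the first function below; the second is the widening form. This is an illustrative sketch with made-up function names, assuming a target where int is 32 bits wide. The two agree whenever the 32-bit product does not overflow.

```c
#include <stdint.h>

/* The shape described above: with a 32-bit int, the product of two
   32-bit operands is computed in 32 bits, and only the addition is
   done in 64 bits.  */
static int64_t
macc_narrow_mul (int64_t acc, int32_t a, int32_t b)
{
  return acc + a * b;
}

/* The widening multiply-and-accumulate form: the operands are
   widened before the multiply, so the product has 64 bits.  */
static int64_t
macc_widening (int64_t acc, int32_t a, int32_t b)
{
  return acc + (int64_t) a * b;
}
```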

 This was not recognised for two reasons:

 1. is_widening_mult_p inferred the output type from the multiply
 statement, which is not useful in this case.

 2. The inputs to the multiply instruction may not have been converted at
 all (because they're not being widened), so the pattern match failed.

 The patch fixes these issues by making the output type explicit, and by
 permitting unconverted inputs (the types are still checked, so this is
 safe).

 OK?

 This update fixes Janis' testsuite issue.

 This updates the context changed by my update to patch 3.

 The content of this patch has not changed.

Ok.

Thanks,
Richard.

 Andrew



[PATCH] Fix dead_debug_insert_before ICE (PR debug/49522, take 3)

2011-07-07 Thread Jakub Jelinek
On Wed, Jul 06, 2011 at 10:36:02PM +0200, Eric Botcazou wrote:
  And here is a version that passed bootstrap/regtest on x86_64-linux and
  i686-linux:
 
  2011-07-06  Jakub Jelinek  ja...@redhat.com
 
  PR debug/49522
  * df-problems.c (dead_debug_reset): Remove dead_debug_uses
  referencing debug insns that have been reset.
  (dead_debug_insert_before): Don't assert reg is non-NULL,
  instead return immediately if it is NULL.
 
  * gcc.dg/debug/pr49522.c: New test.
 
 Sorry, our messages crossed.  I'd set a flag in the first loop.  In the end, 
 it's up to you.

Actually, looking at it some more, dead_debug_use structs referencing the
same insn are always adjacent because of the way they are added by
dead_debug_add.  While some of the dead_debug_use records might precede
the record that causes the reset, it isn't hard to remember a pointer
to the pointer to the first entry for the current insn.

So, here is a new patch which doesn't need two loops; it just might go a
little bit backwards to unchain the dead_debug_use records for the reset insn.
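
The single-pass idea can be shown on a plain linked list. The sketch below is a simplified model, not GCC code: struct use, make_use and reset_uses are hypothetical names, records carry bare integers instead of df refs, and free stands in for XDELETE. It assumes, as stated above, that records for the same insn are adjacent.

```c
#include <stdlib.h>

/* Simplified model of a dead_debug_use record.  */
struct use { int insn; int regno; struct use *next; };

/* Allocate a record and link it in front of NEXT.  */
static struct use *
make_use (int insn, int regno, struct use *next)
{
  struct use *u = malloc (sizeof *u);
  if (u == NULL)
    abort ();
  u->insn = insn;
  u->regno = regno;
  u->next = next;
  return u;
}

/* Remove every record belonging to an insn that has at least one
   record referencing REGNO, in a single pass over the list.  */
static void
reset_uses (struct use **head, int regno)
{
  struct use **tailp = head;
  struct use **insnp = head;  /* link to the first record of the current insn */
  struct use *cur;

  while ((cur = *tailp) != NULL)
    {
      if (cur->regno == regno)
        {
          int insn = cur->insn;
          *tailp = cur->next;
          free (cur);
          /* Earlier records of the same insn may precede this one:
             step back to the remembered start of its run.  */
          if (tailp != insnp && (*insnp)->insn == insn)
            tailp = insnp;
          /* Unchain all remaining records of that insn.  */
          while ((cur = *tailp) != NULL && cur->insn == insn)
            {
              *tailp = cur->next;
              free (cur);
            }
          insnp = tailp;
        }
      else
        {
          if ((*insnp)->insn != cur->insn)
            insnp = tailp;
          tailp = &cur->next;
        }
    }
}
```

The only backward movement is the jump from tailp to insnp, which is bounded by the length of one insn's run of records.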

It still needs the change of the gcc_assert (reg) into if (reg == NULL)
return;, because with this patch the debug->used bitmap can sometimes give
a false positive (saying that a regno is referenced even when it isn't).
But IMHO it is better to occasionally live with the false positives,
which just mean we'll sometimes walk the chain once in dead_debug_reset
or dead_debug_insert_before before resetting it, than to recompute the
bitmap (we'd need a second loop for that, bitmap_clear (debug->used) and
populate it again).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-07  Jakub Jelinek  ja...@redhat.com

PR debug/49522
* df-problems.c (dead_debug_reset): Remove dead_debug_uses
referencing debug insns that have been reset.
(dead_debug_insert_before): Don't assert reg is non-NULL,
instead return immediately if it is NULL.

* gcc.dg/debug/pr49522.c: New test.

--- gcc/df-problems.c.jj	2011-07-07 02:32:45.928547053 +0200
+++ gcc/df-problems.c   2011-07-07 09:57:34.846464573 +0200
@@ -3096,6 +3096,7 @@ static void
 dead_debug_reset (struct dead_debug *debug, unsigned int dregno)
 {
   struct dead_debug_use **tailp = &debug->head;
+  struct dead_debug_use **insnp = &debug->head;
   struct dead_debug_use *cur;
   rtx insn;
 
@@ -3113,9 +3114,21 @@ dead_debug_reset (struct dead_debug *deb
             debug->to_rescan = BITMAP_ALLOC (NULL);
           bitmap_set_bit (debug->to_rescan, INSN_UID (insn));
           XDELETE (cur);
+          if (tailp != insnp && DF_REF_INSN ((*insnp)->use) == insn)
+            tailp = insnp;
+          while ((cur = *tailp) && DF_REF_INSN (cur->use) == insn)
+            {
+              *tailp = cur->next;
+              XDELETE (cur);
+            }
+          insnp = tailp;
         }
       else
-        tailp = &(*tailp)->next;
+        {
+          if (DF_REF_INSN ((*insnp)->use) != DF_REF_INSN (cur->use))
+            insnp = tailp;
+          tailp = &(*tailp)->next;
+        }
+   }
 }
 }
 
@@ -3174,7 +3187,8 @@ dead_debug_insert_before (struct dead_de
         tailp = &(*tailp)->next;
 }
 
-  gcc_assert (reg);
+  if (reg == NULL)
+return;
 
   /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL).  */
   dval = make_debug_expr_from_rtl (reg);
--- gcc/testsuite/gcc.dg/debug/pr49522.c.jj 2011-07-04 10:54:23.0 +0200
+++ gcc/testsuite/gcc.dg/debug/pr49522.c 2011-07-04 10:54:02.0 +0200
@@ -0,0 +1,41 @@
+/* PR debug/49522 */
+/* { dg-do compile } */
+/* { dg-options "-fcompare-debug" } */
+
+int val1 = 0L;
+volatile int val2 = 7L;
+long long val3;
+int *ptr = &val1;
+
+static int
+func1 ()
+{
+  return 0;
+}
+
+static short int
+func2 (short int a, unsigned int b)
+{
+  return !b ? a : a  b;
+}
+
+static unsigned long long
+func3 (unsigned long long a, unsigned long long b)
+{
+  return !b ? a : a % b;
+}
+
+void
+func4 (unsigned short arg1, int arg2)
+{
+  for (arg2 = 0; arg2  2; arg2++)
+{
+  *ptr = func3 (func3 (10, func2 (val3, val2)), val3);
+  for (arg1 = -14; arg1  14; arg1 = func1 ())
+   {
+ *ptr = -1;
+ if (foo ())
+   ;
+   }
+}
+}


Jakub


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-07 Thread Andrew Stubbs

On 07/07/11 10:58, Richard Guenther wrote:

I think you should assume that series of widenings, (int)(short)char_variable
are already combined.  Thus I believe you only need to consider a single
conversion in valid_types_for_madd_p.


Hmm, I'm not so sure. I'll look into it a bit further.


+/* Check the input types, TYPE1 and TYPE2 to a widening multiply,

what are those types?  Is TYPE1 the result type and TYPE2 the
operand type?  If so why


TYPE1 and TYPE2 are the inputs to the multiply. I thought I explained 
that in the comment before the function.



+  initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);

this?!


The result of the multiply will be this many bits wide. This may be 
narrower than the type that holds it.


E.g., 16-bit * 8-bit gives a result at most 24-bits wide, which will 
usually be held in a 32- or 64-bit variable.
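
The 24-bit figure can be checked with a few lines of C. This is a small illustration with made-up helper names (bit_width, max_product), limited to unsigned inputs of at most 32 bits each so that the product fits in 64 bits.

```c
#include <stdint.h>

/* Number of bits needed to represent V (0 for V == 0).  */
static unsigned
bit_width (uint64_t v)
{
  unsigned n = 0;
  while (v != 0)
    {
      n++;
      v >>= 1;
    }
  return n;
}

/* Largest product of an unsigned BITS1-bit value and an unsigned
   BITS2-bit value; both widths must be at most 32.  */
static uint64_t
max_product (unsigned bits1, unsigned bits2)
{
  uint64_t max1 = (UINT64_C (1) << bits1) - 1;
  uint64_t max2 = (UINT64_C (1) << bits2) - 1;
  return max1 * max2;
}
```

For 16-bit and 8-bit inputs the largest product is 0xFFFF * 0xFF = 0xFEFF01, which needs 24 bits, matching the sum of the two input precisions.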



+  initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);

that also looks odd.  So probably TYPE1 isn't the result type.  If they
are the types of the operands, then what operand is EXPR for?


EXPR, as the comment says, is the addition that follows the multiply.


-  if (TREE_CODE (rhs1) == SSA_NAME)
+  for (tmp = rhs1, rhs1_code = ERROR_MARK;
+   TREE_CODE (tmp) == SSA_NAME
+   && (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK);
+   tmp = gimple_assign_rhs1 (rhs1_stmt))
  {
-  rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
-  if (is_gimple_assign (rhs1_stmt))
-   rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+  rhs1_stmt = SSA_NAME_DEF_STMT (tmp);
+  if (!is_gimple_assign (rhs1_stmt))
+   break;
+  rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
  }

the result looks a bit like spaghetti code ... and lacks a comment
on what it is trying to do.  It looks like it sees through an arbitrary
number of conversions - possibly ones that will make the
macc invalid, as for (short)int-var * short-var + int-var.  So you'll
be pessimizing code by doing that unconditionally.  As I said
above you should at most consider one intermediate conversion.


Ok, I need to add a comment here. The code does indeed look back through 
an arbitrary number of conversions. It is searching for the last real 
operation before the addition, hoping to find a multiply.



I believe the code should be arranged such that only valid
conversions are looked through in the first place.  Valid, in
that the resulting types should still match the macc constraints.


Well, it might be possible to discard some conversions initially, but 
until the multiply is found, and its input types are known, we can't 
know for certain what conversions are valid.


I think I need to explain what's going on here more clearly.

  1. It finds an addition statement. It's not known yet whether it is 
part of a multiply-and-accumulate, or not.


  2. It follows the conversion chain back from each operand to see if 
it finds a multiply, or widening multiply statement.


  3. If it finds a non-widening multiply, it checks it to see if it 
could be a widening multiply-and-accumulate (it will already have been 
rejected as a widening multiply on its own, but the addition might be 
in a wider mode, or the target might provide multiply-and-accumulate 
insns that don't have corresponding widening multiply insns).


  4. (This is the new bit!) It looks to see if there are any 
conversions between the multiply and addition that can safely be ignored.


  5. If we get here, then emit any necessary conversion statements, and 
convert the addition to a WIDEN_MULT_PLUS_EXPR.
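
Step 2 above, following the conversion chain back from an operand, can be modelled on a toy statement structure. The sketch below is not GCC code and does not use GIMPLE: enum code, struct stmt and find_feeding_mult are hypothetical, and the validity check of step 4 is deliberately left out.

```c
#include <stddef.h>

/* Toy statement codes; these only mimic the roles of the real
   tree codes.  */
enum code { CODE_MULT, CODE_WIDEN_MULT, CODE_CONVERT, CODE_PLUS, CODE_OTHER };

/* Toy statement: RHS1 is the statement defining the first operand,
   or NULL if there is none.  */
struct stmt { enum code code; const struct stmt *rhs1; };

/* Starting from the statement that defines an operand of the
   addition, skip any number of conversions and return the multiply
   that feeds it, or NULL if the chain ends in something else.  */
static const struct stmt *
find_feeding_mult (const struct stmt *def)
{
  while (def != NULL && def->code == CODE_CONVERT)
    def = def->rhs1;
  if (def != NULL
      && (def->code == CODE_MULT || def->code == CODE_WIDEN_MULT))
    return def;
  return NULL;
}
```

Whether each skipped conversion is safe to ignore can only be decided once the multiply and its input types are known, which is why the sketch separates finding the multiply from validating the chain.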


Before these changes, any conversion between the multiply and addition 
statements would prevent optimization, even though there are many cases 
where the conversions are valid, and even inserted automatically.


I'm going to go away and find out whether there are really any cases 
where there can legitimately be more than one conversion, and at least 
update my patch with better commenting.


Thanks for your review.

Andrew


[build] Move dfp-bit support to toplevel libgcc

2011-07-07 Thread Rainer Orth
The next patch in the `move to toplevel libgcc' series is hopefully
easier to get review and approval for.  This one moves dfp-bit and
related build stuff to libgcc.  I think it's completely
straightforward: it moves D{32, 64, 128}PBIT{, _FUNCS}, related Makefile
fragments, and the source files themselves over.  The only part that may
require revision is the location of dfp-bit.? in libgcc: I've kept them
in libgcc/config, as they lived in gcc/config before, but one might as
well argue that they are generic and belong in libgcc itself.

Bootstrapped without regressions on x86_64-unknown-linux-gnu.

Ok for mainline?

Thanks.
Rainer


2011-06-22  Rainer Orth  r...@cebitec.uni-bielefeld.de

gcc:
* config/dfp-bit.c, config/dfp-bit.h: Move to ../libgcc/config.
* config/t-dfprules: Likewise.
* config.gcc (i[34567]86-*-linux*, i[34567]86-*-kfreebsd*-gnu,
i[34567]86-*-knetbsd*-gnu, i[34567]86-*-gnu*,
i[34567]86-*-kopensolaris*-gnu): Remove t-dfprules from tmake_file.
(x86_64-*-linux*, x86_64-*-kfreebsd*-gnu, x86_64-*-knetbsd*-gnu):
Likewise.
(i[34567]86-*-cygwin*): Likewise.
(i[34567]86-*-mingw*,  x86_64-*-mingw*): Likewise.
(powerpc-*-linux*, powerpc64-*-linux*): Likewise.
* Makefile.in (D32PBIT_FUNCS, D64PBIT_FUNCS, D128PBIT_FUNCS): Remove.
(libgcc.mvars): Remove DFP_ENABLE, DFP_CFLAGS, D32PBIT_FUNCS,
D64PBIT_FUNCS, D128PBIT_FUNCS.

libgcc:
* config/dfp-bit.c, config/dfp-bit.h: New files.
* Makefile.in (D32PBIT_FUNCS, D64PBIT_FUNCS, D128PBIT_FUNCS): New
variables.
($(d32pbit-o)): Use $(srcdir) to refer to dfp-bit.c
($(d64pbit-o)): Likewise.
($(d128pbit-o)): Likewise.
* config/t-dfprules: New file.
* config.host (i[34567]86-*-linux*, i[34567]86-*-kfreebsd*-gnu,
i[34567]86-*-knetbsd*-gnu, i[34567]86-*-gnu*,
i[34567]86-*-kopensolaris*-gnu): Add t-dfprules to tmake_file.
(x86_64-*-linux*, x86_64-*-kfreebsd*-gnu, x86_64-*-knetbsd*-gnu):
Likewise.
(i[34567]86-*-cygwin*): Likewise.
(i[34567]86-*-mingw*,  x86_64-*-mingw*): Likewise.
(powerpc-*-linux*, powerpc64-*-linux*): Likewise.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1548,30 +1548,6 @@ TPBIT_FUNCS = _pack_tf _unpack_tf _addsu
 _lt_tf _le_tf _unord_tf _si_to_tf _tf_to_si _negate_tf _make_tf \
 _tf_to_df _tf_to_sf _thenan_tf _tf_to_usi _usi_to_tf
 
-D32PBIT_FUNCS = _addsub_sd _div_sd _mul_sd _plus_sd _minus_sd \
-   _eq_sd _ne_sd _lt_sd _gt_sd _le_sd _ge_sd \
-   _sd_to_si _sd_to_di _sd_to_usi _sd_to_udi \
-   _si_to_sd _di_to_sd _usi_to_sd _udi_to_sd \
-   _sd_to_sf _sd_to_df _sd_to_xf _sd_to_tf \
-   _sf_to_sd _df_to_sd _xf_to_sd _tf_to_sd \
-   _sd_to_dd _sd_to_td _unord_sd _conv_sd
-
-D64PBIT_FUNCS = _addsub_dd _div_dd _mul_dd _plus_dd _minus_dd \
-   _eq_dd _ne_dd _lt_dd _gt_dd _le_dd _ge_dd \
-   _dd_to_si _dd_to_di _dd_to_usi _dd_to_udi \
-   _si_to_dd _di_to_dd _usi_to_dd _udi_to_dd \
-   _dd_to_sf _dd_to_df _dd_to_xf _dd_to_tf \
-   _sf_to_dd _df_to_dd _xf_to_dd _tf_to_dd \
-   _dd_to_sd _dd_to_td _unord_dd _conv_dd
-
-D128PBIT_FUNCS = _addsub_td _div_td _mul_td _plus_td _minus_td \
-   _eq_td _ne_td _lt_td _gt_td _le_td _ge_td \
-   _td_to_si _td_to_di _td_to_usi _td_to_udi \
-   _si_to_td _di_to_td _usi_to_td _udi_to_td \
-   _td_to_sf _td_to_df _td_to_xf _td_to_tf \
-   _sf_to_td _df_to_td _xf_to_td _tf_to_td \
-   _td_to_sd _td_to_dd _unord_td _conv_td
-
 # These might cause a divide overflow trap and so are compiled with
 # unwinder info.
 LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4
@@ -1929,14 +1905,6 @@ libgcc.mvars: config.status Makefile $(L
 	echo DPBIT_FUNCS = '$(DPBIT_FUNCS)' >> tmp-libgcc.mvars
 	echo TPBIT = '$(TPBIT)' >> tmp-libgcc.mvars
 	echo TPBIT_FUNCS = '$(TPBIT_FUNCS)' >> tmp-libgcc.mvars
-	echo DFP_ENABLE = '$(DFP_ENABLE)' >> tmp-libgcc.mvars
-	echo DFP_CFLAGS='$(DFP_CFLAGS)' >> tmp-libgcc.mvars
-	echo D32PBIT='$(D32PBIT)' >> tmp-libgcc.mvars
-	echo D32PBIT_FUNCS='$(D32PBIT_FUNCS)' >> tmp-libgcc.mvars
-	echo D64PBIT='$(D64PBIT)' >> tmp-libgcc.mvars
-	echo D64PBIT_FUNCS='$(D64PBIT_FUNCS)' >> tmp-libgcc.mvars
-	echo D128PBIT='$(D128PBIT)' >> tmp-libgcc.mvars
-	echo D128PBIT_FUNCS='$(D128PBIT_FUNCS)' >> tmp-libgcc.mvars
 	echo GCC_EXTRA_PARTS = '$(GCC_EXTRA_PARTS)' >> tmp-libgcc.mvars
 	echo SHLIB_LINK = '$(subst $(GCC_FOR_TARGET),$$(GCC_FOR_TARGET),$(SHLIB_LINK))' >> tmp-libgcc.mvars
 	echo SHLIB_INSTALL = '$(SHLIB_INSTALL)' >> tmp-libgcc.mvars
diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1299,7 +1299,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfree
i[34567]86-*-kopensolaris*-gnu) tm_file=${tm_file} i386/gnu-user.h 

Re: [PATCH, MELT] new function register_data_handler

2011-07-07 Thread Basile Starynkevitch
On Thu, Jul 07, 2011 at 12:10:30PM +0200, Pierre Vittet wrote:
 Hi, this patch add a new function allowing to add a pragma handler
 more easily. In the past, we were directly modifying the
 :sysdata_meltpragmas field of initial_system_data.
 

 2011-07-07  Pierre Vittet  pier...@pvittet.com
 
   * melt/warmelt-base.melt (register_pragma_handler): new function.


Thanks. I committed it on the MELT branch. Committed revision 175962.


-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-07 Thread Andrew Stubbs

On 07/07/11 11:04, Richard Guenther wrote:

Both types are equal, so please share the temporary variable you
create

+ rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
+   create_tmp_var (type1, NULL),
rhs1, type1);
+ rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
+   create_tmp_var (type2, NULL),
rhs2, type2);

here (CSE create_tmp_var).


I'm sorry, I don't understand this?

This takes code like this:

  r1 = a;
  r2 = b;
  result = r1 + r2;

And transforms it to this:

  r1 = a;
  r2 = b;
  t1 = (type1) r1;
  t2 = (type2) r2;
  result = t1 + t2;

Yes, type1 == type2, but r1 != r2, so t1 != t2.

I don't see where the common expression is here? But then, I am 
something of a newbie to tree optimizations.


Andrew


[Patch, Fortran] PR fortran/49648 ICE with use-associated array-returning function

2011-07-07 Thread Mikael Morin
Hello,

this is the patch I posted yesterday on bugzilla at 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49648#c8

The problem is a NULL pointer encountered during code generation when trying 
to get the rank from the array spec.

The array ref's array spec is normally copied in resolve_ref from the symbol's 
one. It is not the case, however, in this special case (use-associated 
function return variable whose shape involves a function call).

This patch calls gfc_resolve_array_spec on sym->result, which calls 
gfc_resolve_expr on every bound, which in turn calls resolve_ref on them.

As pointed out by Tobias in the PR audit trail, there could be some similar 
bugs with character lengths. The character length variant of the testcase 
doesn't ICE however, so I have decided to propose the patch as is, because it 
should be a step forward anyway.

Regression tested on x86_64-unknown-freebsd8.2. OK for trunk? Should I 
backport to the branches?

Mikael
2011-07-07  Mikael Morin  mikael.mo...@sfr.fr

gcc/fortran
	PR fortran/49648
	* resolve.c (resolve_symbol): Force resolution of function result's
	array specification.

gcc/testsuite
	PR fortran/49648
	* gfortran.dg/result_in_spec_4.f90: New test.

diff --git a/resolve.c b/resolve.c
index f484a22..cbf403c 100644
--- a/resolve.c
+++ b/resolve.c
@@ -12198,6 +12198,8 @@ resolve_symbol (gfc_symbol *sym)
 	}
 	}
 }
+  else if (mp_flag && sym->attr.flavor == FL_PROCEDURE && sym->attr.function)
+    gfc_resolve_array_spec (sym->result->as, false);
 
   /* Assumed size arrays and assumed shape arrays must be dummy
  arguments.  Array-spec's of implied-shape should have been resolved to
! { dg-do compile }
!
! PR fortran/49648
! ICE for calls to a use-associated function returning an array whose spec
! depends on a function call.

! Contributed by Tobias Burnus bur...@net-b.de

module m2
  COMPLEX, SAVE, ALLOCATABLE :: P(:)
contains
  FUNCTION getPhaseMatrix() RESULT(PM)
COMPLEX:: PM(SIZE(P),3)
PM=0.0
  END FUNCTION
end module m2

module m
  use m2
contains
   SUBROUTINE gf_generateEmbPot()
  COMPLEX :: sigma2(3,3)
  sigma2 = MATMUL(getPhaseMatrix(), sigma2)
   END SUBROUTINE
end module m

! { dg-final { cleanup-modules m m2 } }


Re: [build] Move dfp-bit support to toplevel libgcc

2011-07-07 Thread Rainer Orth
Paolo Bonzini bonz...@gnu.org writes:

   i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | 
 i[34567]86-*-knetbsd*-gnu | i[34567]86-*-gnu* | 
 i[34567]86-*-kopensolaris*-gnu)
  extra_parts=$extra_parts crtprec32.o crtprec64.o crtprec80.o 
 crtfastmath.o
 -tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm
 +tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm t-dfprules
  md_unwind_header=i386/linux-unwind.h
  ;;
   x86_64-*-linux* | x86_64-*-kfreebsd*-gnu | x86_64-*-knetbsd*-gnu)
  extra_parts=$extra_parts crtprec32.o crtprec64.o crtprec80.o 
 crtfastmath.o
 -tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm
 +tmake_file=${tmake_file} i386/t-crtpc i386/t-crtfm t-dfprules
  md_unwind_header=i386/linux-unwind.h
  ;;

 This conflicts with the Hurd/k*BSD patch.
 
 Patch is okay if you take care of committing both, but please wait 48 hours

I see Thomas already committed his, but my patch hadn't been updated for
top-of-tree.

 or so, and please post the updated patch with config/dfp-bit.c moved to
 dfp-bit.c (config/t-dfprules should stay there).

Ok, will do.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 12:41 PM, Andrew Stubbs andrew.stu...@gmail.com wrote:
 On 07/07/11 11:04, Richard Guenther wrote:

 Both types are equal, so please share the temporary variable you
 create

 +         rhs1 = build_and_insert_cast (gsi, gimple_location (stmt),
 +                                       create_tmp_var (type1, NULL),
 rhs1, type1);
 +         rhs2 = build_and_insert_cast (gsi, gimple_location (stmt),
 +                                       create_tmp_var (type2, NULL),
 rhs2, type2);

 here (CSE create_tmp_var).

 I'm sorry, I don't understand this?

 This takes code like this:

  r1 = a;
  r2 = b;
  result = r1 + r2;

 And transforms it to this:

  r1 = a;
  r2 = b;
  t1 = (type1) r1;
  t2 = (type2) r2;
  result = t1 + t2;

 Yes, type1 == type2, but r1 != r2, so t1 != t2.

 I don't see where the common expression is here? But then, I am something of
 a newbie to tree optimizations.

create_tmp_var creates a VAR_DECL, build_and_insert_cast builds an
SSA name from it.  You can build multiple SSA names from a single
VAR_DECL, so no need to waste two VAR_DECLs for temporaries
of the same type.

Richard.

 Andrew



[PATCH][1/n] Do not force sizetype for POINTER_PLUS_EXPR

2011-07-07 Thread Richard Guenther

This is the first of a series of enabling patches to make 
POINTER_PLUS_EXPR not forcefully take a sizetype offset
(I'm still not 100% sure what requirements I will end up implementing,
but the first goal is to have fewer TYPE_IS_SIZETYPE types).

This patch removes the (T *)index +p (int)PTR -> PTR +p index
folding.  We shouldn't change what the user specified as
the pointer base as we can't be sure we don't mess up here,
considering

int foo(int *p, uintptr_t o)
{
  return *((uintptr_t)p + (int *)o);
}
int main ()
{
  int res = 0;
  return foo((int *)0, (uintptr_t)res);
}

if the o argument in foo is really the offset then the
C code is invoking undefined behavior as you may not do
anything with an integer which you converted to a pointer
other than converting it back.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2011-07-07  Richard Guenther  rguent...@suse.de

* fold-const.c (fold_binary_loc): Remove index +p PTR -> PTR +p index
folding.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 175920)
+++ gcc/fold-const.c(working copy)
@@ -9484,13 +9484,6 @@ fold_binary_loc (location_t loc,
  fold_convert_loc (loc, sizetype,
arg0)));
 
-  /* index +p PTR -> PTR +p index */
-  if (POINTER_TYPE_P (TREE_TYPE (arg1))
-  && INTEGRAL_TYPE_P (TREE_TYPE (arg0)))
-return fold_build2_loc (loc, POINTER_PLUS_EXPR, type,
-   fold_convert_loc (loc, type, arg1),
-   fold_convert_loc (loc, sizetype, arg0));
-
   /* (PTR +p B) +p A -> PTR +p (B + A) */
   if (TREE_CODE (arg0) == POINTER_PLUS_EXPR)
{


[go]: Port to ALPHA arch - epoll problems

2011-07-07 Thread Uros Bizjak
On Tue, Jul 5, 2011 at 10:12 PM, Ian Lance Taylor i...@google.com wrote:

 What remains is a couple of unrelated failures in the testsuite:

 Epoll unexpected fd=0
 pollServer: unexpected wakeup for fd=0 mode=w
 panic: test timed out
 ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388:  7123 Aborted
                 ./a.out -test.short -test.timeout=$timeout $@
 FAIL: http
 gmake[2]: *** [http/check] Error 1

 2011/07/05 18:43:28 Test RPC server listening on 127.0.0.1:50334
 2011/07/05 18:43:28 Test HTTP RPC server listening on 127.0.0.1:49010
 2011/07/05 18:43:28 rpc.Serve: accept:accept tcp 127.0.0.1:50334:
 Resource temporarily unavailable
 FAIL: rpc
 gmake[2]: *** [rpc/check] Error 1

 2011/07/05 18:44:22 Test WebSocket server listening on 127.0.0.1:40893
 Epoll unexpected fd=0
 pollServer: unexpected wakeup for fd=0 mode=w
 panic: test timed out
 ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 12993 Aborted
                 ./a.out -test.short -test.timeout=$timeout $@
 FAIL: websocket
 gmake[2]: *** [websocket/check] Error 1

 ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 13945
 Segmentation fault      ./a.out -test.short -test.timeout=$timeout
 $@
 FAIL: compress/flate
 gmake[2]: *** [compress/flate/check] Error 1

 Any ideas how to attack these?

 None of these look familiar to me.

 An Epoll unexpected fd error means that epoll returned information
 about a file descriptor which the program didn't ask about.  Not sure
 why that would happen.  Particularly for fd 0, since epoll is only used
 for network connections, which fd 0 presumably is not.

 The way to look into these is to cd to TARGET/libgo and run make
 GOTESTFLAGS=--keep http/check (or whatever/check).  That will leave a
 directory gotest in your libgo directory.  The executable a.out in
 that directory is the test case.  You can debug the test case using gdb
 in more or less the usual way.  It's a bit painful to set breakpoints by
 function name, but setting breakpoints by file:line works fine.
 Printing variables works as well as it ever does, but the variables are
 printed in C form rather than Go form.

It turned out that the EpollEvent definition in
libgo/syscalls/epoll/socket_epoll.go is non-portable (if not outright
dangerous...). The definition does have a FIXME comment, but it does not
take into account the effects of __attribute__((__packed__)) from
system headers. Unlike the alpha header, the x86 sys/epoll.h header
adds __attribute__((__packed__)) to the struct epoll_event definition.

To illustrate the problem, please run the following test:

--cut here--
#include <stdint.h>
#include <stdio.h>

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;

struct epoll_event
{
  uint32_t events;
  epoll_data_t data;
};

struct packed_epoll_event
{
  uint32_t events;
  epoll_data_t data;
} __attribute__ ((__packed__));

struct fake_epoll_event
{
  uint32_t events;
  int32_t fd;
  int32_t pad;
};

int
main ()
{
  struct epoll_event *ep;
  struct packed_epoll_event *pep;

  struct fake_epoll_event fep;

  fep.events = 0xfe;
  fep.fd = 9;
  fep.pad = 0;

  ep = (struct epoll_event *) &fep;
  pep = (struct packed_epoll_event *) &fep;

  printf ("%#x %i\n", ep->events, ep->data.fd);
  printf ("%#x %i\n", pep->events, pep->data.fd);
  return 0;
}
--cut here--

./a.out
0xfe 0
0xfe 9

So, the first line simulates the alpha, the second simulates x86_64.
32bit targets are OK in both cases:

./a.out
0xfe 9
0xfe 9

By changing the definition of EpollEvent to the form that suits alpha:

type EpollEvent struct {
  Events uint32;
  Pad int32;
  Fd int32;
};

both timeouts got fixed and the correct FD was passed to and from the syscall.

Uros.


Re: [Patch, Fortran] Add stat=/errmsg= support to _gfortran_caf_register

2011-07-07 Thread Mikael Morin
On Thursday 07 July 2011 07:35:07 Tobias Burnus wrote:
 diff --git a/libgfortran/caf/mpi.c b/libgfortran/caf/mpi.c
 index 83f39f6..2d4af6b 100644
 --- a/libgfortran/caf/mpi.c
 +++ b/libgfortran/caf/mpi.c
 @@ -103,10 +110,19 @@ _gfortran_caf_register (ptrdiff_t size, caf_register_t 
type,
/* Token contains only a list of pointers.  */
local = malloc (size);
token = malloc (sizeof (void*) * caf_num_images);
 +  
Trailing blanks
 +  if (unlikely (local == NULL || token == NULL))
 +goto error;
  
/* token[img-1] is the address of the token in image img.  */
 -  MPI_Allgather (local, sizeof (void*), MPI_BYTE,
 -  token,  sizeof (void*), MPI_BYTE, MPI_COMM_WORLD);
 +  err = MPI_Allgather (local, sizeof (void*), MPI_BYTE, token,
 +sizeof (void*), MPI_BYTE, MPI_COMM_WORLD);
 +  if (unlikely (err))
 +{
 +  free (local);
 +  free (token);
 +  goto error;
 +}
  
if (type == CAF_REGTYPE_COARRAY_STATIC)
  {
This will return the same error (memory allocation failure) as in the case 
just above. Is this expected or should it have an error of its own?

 +   char *msg;
 +  if (caf_is_finalized)
Space indentation
 + msg = "Failed to allocate coarray - stopped images";


Also I'm wondering whether it would be pertinent to share the error handling 
between single.c (one error) and mpi.c (2 or 3 errors) as the codes are very 
close (with an interface such as handle_error (int *stat, char *errmsg, int 
errmsg_len, char *actual_error)).


 Build and regtested on x86-64-linux.
 OK for the trunk?
The above is nitpicking, and I leave the final decision to you and Daniel, so 
the patch is basically OK with the two indentation nits fixed.

Mikael



[PATCH][C] Fixup pointer-int-sum

2011-07-07 Thread Richard Guenther

This tries to make sense of the comments and code in the code
doing the index - size multiplication in pointer-int-sum.  It
also fixes a bogus integer-constant conversion which results
in not properly canonicalized integer constants.

The comment in the code claims the index - size multiplication
is carried out signed, which doesn't match the code which does
it unsigned (the comment dates back to rev. 6733 where we _did_
carry out the multiplication in a signed type, using
c_common_type_for_size (TYPE_PRECISION (sizetype), 0)).  The
following patch makes us preserve the signedness of intop
so that for signed intop the multiplication will be known to
not overflow (what is actually the C semantics - is the
multiplication allowed to overflow for unsigned intop?  If not
I guess the original code of always choosing a signed type was
more correct and we should go back to it instead?)

The comment also claims there is a sign-extension of t
to sizetype - that's not true either, it's just a sign-change.

Joseph, do we want an unconditional (un-)signed multiplication
(before the patch it's unsigned), or what the patch does?

Bootstrapped and tested on x86_64-unknown-linux-gnu.  I'll also
test the unconditionally signed variant.

Thanks,
Richard.

2011-07-07  Richard Guenther  rguent...@suse.de

* c-common.c (pointer_int_sum): Do the index times size
multiplication in the signedness of index.  Properly
strip overflow flags.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 175962)
+++ gcc/c-family/c-common.c (working copy)
@@ -3737,23 +3737,22 @@ pointer_int_sum (location_t loc, enum tr
 
   /* Convert the integer argument to a type the same size as sizetype
  so the multiply won't overflow spuriously.  */
-  if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype)
-  || TYPE_UNSIGNED (TREE_TYPE (intop)) != TYPE_UNSIGNED (sizetype))
+  if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype))
 intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
-TYPE_UNSIGNED (sizetype)), intop);
+TYPE_UNSIGNED (TREE_TYPE (intop))),
+intop);
 
   /* Replace the integer argument with a suitable product by the object size.
- Do this multiplication as signed, then convert to the appropriate type
- for the pointer operation and disregard an overflow that occured only
- because of the sign-extension change in the latter conversion.  */
+ Do this multiplication in a widened intop type, then convert to the
+ appropriate type for the pointer operation and disregard an overflow
+ that occured only because of the sign-change in the latter conversion.  */
   {
 tree t = build_binary_op (loc,
  MULT_EXPR, intop,
  convert (TREE_TYPE (intop), size_exp), 1);
 intop = convert (sizetype, t);
 if (TREE_OVERFLOW_P (intop) && !TREE_OVERFLOW (t))
-  intop = build_int_cst_wide (TREE_TYPE (intop), TREE_INT_CST_LOW (intop),
- TREE_INT_CST_HIGH (intop));
+  intop = double_int_to_tree (sizetype, tree_to_double_int (intop));
   }
 
   /* Create the sum or difference.  */


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-07 Thread Andrew Stubbs

On 07/07/11 11:26, Andrew Stubbs wrote:

On 07/07/11 10:58, Richard Guenther wrote:

I think you should assume that series of widenings,
(int)(short)char_variable
are already combined.  Thus I believe you only need to consider a single
conversion in valid_types_for_madd_p.


Hmm, I'm not so sure. I'll look into it a bit further.


OK, here's a test case that gives multiple conversions:

  long long
  foo (long long a, signed char b, signed char c)
  {
int bc = b * c;
return a + (short)bc;
  }

The dump right before the widen_mult pass gives:

  foo (long long int a, signed char b, signed char c)
  {
int bc;
long long int D.2018;
short int D.2017;
long long int D.2016;
int D.2015;
int D.2014;

  <bb 2>:
D.2014_2 = (int) b_1(D);
D.2015_4 = (int) c_3(D);
bc_5 = D.2014_2 * D.2015_4;
D.2017_6 = (short int) bc_5;
D.2018_7 = (long long int) D.2017_6;
D.2016_9 = D.2018_7 + a_8(D);
return D.2016_9;

  }

Here we have a multiply and accumulate done the long way. The 8-bit 
inputs are widened to 32-bit, multiplied to give a 32-bit result (of 
which only the lower 16-bits contain meaningful data), then truncated to 
16-bits, and sign-extended up to 64-bits ready for the 64-bit addition.


This is slightly contrived, perhaps, but not unlike the sort of thing that 
might occur when you have inline functions and macros, and most 
importantly, it is mathematically valid!



So, here's the output from my patched widen_mult pass:

  foo (long long int a, signed char b, signed char c)
  {
int bc;
long long int D.2018;
short int D.2017;
long long int D.2016;
int D.2015;
int D.2014;

  <bb 2>:
D.2014_2 = (int) b_1(D);
D.2015_4 = (int) c_3(D);
bc_5 = b_1(D) w* c_3(D);
D.2017_6 = (short int) bc_5;
D.2018_7 = (long long int) D.2017_6;
D.2016_9 = WIDEN_MULT_PLUS_EXPR b_1(D), c_3(D), a_8(D);
return D.2016_9;

  }

As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is 
now redundant. (Ideally, this would be removed now, but in fact it 
doesn't get eliminated until the RTL into_cfglayout pass. This is not 
new behaviour.)



My point is that it's possible to have at least two conversions to 
examine. Is it possible to have more? I don't know, but once I'm dealing 
with two I might as well deal with an arbitrary number.


Andrew


[go]: Many valgrind errors (use of uninit value, jump depends on uninit value) in the testsuite

2011-07-07 Thread Uros Bizjak
On Tue, Jul 5, 2011 at 10:12 PM, Ian Lance Taylor i...@google.com wrote:

 What remains is a couple of unrelated failures in the testsuite:

 ../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 13945
 Segmentation fault      ./a.out -test.short -test.timeout=$timeout
 $@
 FAIL: compress/flate
 gmake[2]: *** [compress/flate/check] Error 1

 Any ideas how to attack these?

 None of these look familiar to me.

The compress/flate test sometimes passes and sometimes doesn't. I have run
the resulting executable through valgrind, and there are many
(i.e. hundreds of) warnings about uses and calls that depend on
uninitialized variables, also on x86_64.

ATM, I would like to just report problems with valgrind, and due to
the number of them, it looks to me that something is wrong with the
library.

Uros.


Re: plugin event for C/C++ declarations

2011-07-07 Thread Diego Novillo

On 11-07-07 05:06 , Romain Geissler wrote:


gcc/ChangeLog:

* plugin.def: Add event for finish_decl.
* plugin.c (register_callback, invoke_plugin_callbacks): Same.
* c-decl.c (finish_decl): Invoke callbacks on above event.
* doc/plugins.texi: Document above event.

gcc/cp/ChangeLog:

* decl.c (cp_finish_decl): Invoke callbacks on finish_decl event.

gcc/testsuite/ChangeLog:

* g++.dg/plugin/decl_plugin.c: New test plugin.
* g++.dg/plugin/decl-plugin-test.C: Testcase for above plugin.
* g++.dg/plugin/plugin.exp: Add above testcase.


OK.  This one fell through the cracks in my inbox.  Apologies.


Diego.


Re: [Patch, Fortran] PR fortran/49648 ICE with use-associated array-returning function

2011-07-07 Thread Tobias Burnus

Dear Mikael,

On 07/07/2011 12:42 PM, Mikael Morin wrote:

this is the patch I posted yesterday on bugzilla at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49648#c8

This patch calls gfc_resolve_array_spec on sym->result, which calls
gfc_resolve_expr on every bound, which in turn calls resolve_ref on them.

As pointed out by Tobias in the PR audit trail, there could be some similar
bugs with character lengths. The character length variant of the testcase
doesn't ICE however, so I have decided to propose the patch as is, because it
should be a step forward anyway.


My impression is that the type-spec - contrary to the array spec - is 
shared between the function symbol and the result symbol. That's also 
what I get for the example you posted, when looking at the expressions 
in the debugger. Thus, it seems that the array spec is the only case where 
one needs to do something.



Regression tested on x86_64-unknown-freebsd8.2. OK for trunk? Should I
backport to the branches?


OK. Regarding backporting: I don't know; I don't have a strong opinion. 
It's not a regression - but it is also a simple fix. Thus, backporting 
to 4.6 should be OK, but I wouldn't port it to older versions.


Thanks a lot for the patch and for going through resolve.c!

Tobias


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs andrew.stu...@gmail.com wrote:
 On 07/07/11 11:26, Andrew Stubbs wrote:

 On 07/07/11 10:58, Richard Guenther wrote:

 I think you should assume that series of widenings,
 (int)(short)char_variable
 are already combined.  Thus I believe you only need to consider a single
 conversion in valid_types_for_madd_p.

 Hmm, I'm not so sure. I'll look into it a bit further.

 OK, here's a test case that gives multiple conversions:

  long long
  foo (long long a, signed char b, signed char c)
  {
    int bc = b * c;
    return a + (short)bc;
  }

 The dump right before the widen_mult pass gives:

  foo (long long int a, signed char b, signed char c)
  {
    int bc;
    long long int D.2018;
    short int D.2017;
    long long int D.2016;
    int D.2015;
    int D.2014;

  bb 2:
    D.2014_2 = (int) b_1(D);
    D.2015_4 = (int) c_3(D);
    bc_5 = D.2014_2 * D.2015_4;
    D.2017_6 = (short int) bc_5;

Ok, so you have a truncation that is a no-op value-wise.  I would
argue that this truncation should be removed independent of
whether we have a widening multiply instruction or not.

The technically most capable place to remove non-value-changing
truncations (and combine them with a successive conversion)
would be value-range propagation.  Which already knows:

Value ranges after VRP:

b_1(D): VARYING
D.2698_2: [-128, 127]
c_3(D): VARYING
D.2699_4: [-128, 127]
bc_5: [-16256, 16384]
D.2701_6: [-16256, 16384]
D.2702_7: [-16256, 16384]
a_8(D): VARYING
D.2700_9: VARYING

thus truncating bc_5 to short does not change the value.

The simplification could be made when looking at the
statement

    D.2018_7 = (long long int) D.2017_6;

in vrp_fold_stmt, based on the fact that this conversion
converts from a value-preserving intermediate conversion.
Thus the transform would replace the D.2017_6 operand
with bc_5.

So yes, the case appears - but it shouldn't ;)

I'll cook up a quick patch for VRP.

Thanks,
Richard.

    D.2016_9 = D.2018_7 + a_8(D);
    return D.2016_9;

  }

 Here we have a multiply and accumulate done the long way. The 8-bit inputs
 are widened to 32-bit, multiplied to give a 32-bit result (of which only the
 lower 16-bits contain meaningful data), then truncated to 16-bits, and
 sign-extended up to 64-bits ready for the 64-bit addition.

 This is slight contrived, perhaps, but not unlike the sort of thing that
 might occur when you have inline functions and macros, and most importantly
 - it is mathematically valid!


 So, here's the output from my patched widen_mult pass:

  foo (long long int a, signed char b, signed char c)
  {
    int bc;
    long long int D.2018;
    short int D.2017;
    long long int D.2016;
    int D.2015;
    int D.2014;

  bb 2:
    D.2014_2 = (int) b_1(D);
    D.2015_4 = (int) c_3(D);
    bc_5 = b_1(D) w* c_3(D);
    D.2017_6 = (short int) bc_5;
    D.2018_7 = (long long int) D.2017_6;
    D.2016_9 = WIDEN_MULT_PLUS_EXPR b_1(D), c_3(D), a_8(D);
    return D.2016_9;

  }

 As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is now
 redundant. (Ideally, this would be removed now, but in fact it doesn't get
 eliminated until the RTL into_cfglayout pass. This is not new behaviour.)


 My point is that it's possible to have at least two conversions to examine.
 Is it possible to have more? I don't know, but once I'm dealing with two I
 might as well deal with an arbitrary number.

 Andrew



Re: Improve Solaris mudflap support (PR libmudflap/49550)

2011-07-07 Thread Uros Bizjak
Hello!

 diff --git a/libmudflap/testsuite/libmudflap.c/pass47-frag.c 
 b/libmudflap/testsuite/libmudflap.c/pass47-frag.c
  --- a/libmudflap/testsuite/libmudflap.c/pass47-frag.c
  +++ b/libmudflap/testsuite/libmudflap.c/pass47-frag.c
 @@ -8,3 +8,5 @@ int main ()
   tolower (buf[4]) == 'o' && tolower ('X') == 'x' &&
   isdigit (buf[3])) == 0 && isalnum ('4'));
  }
 +
 +/* { dg-warning "cannot track unknown size extern .__ctype." "Solaris 
 __ctype declared without size" { target *-*-solaris2.* } 0 } */

This is handled differently throughout the mudflap testsuite:

/* Ignore a warning that is irrelevant to the purpose of this test.  */
/* { dg-prune-output ".*mudflap cannot track unknown size extern.*" } */

Uros.


Re: [PATCH] Fix UNRESOLVED gcc.dg/graphite/pr37485.c

2011-07-07 Thread Uros Bizjak
Hello!

 Committed.

 Richard.

 2011-07-07  Richard Guenther  rguent...@suse.de

   * gcc.dg/graphite/pr37485.c: Add -floop-block.

Heh, you were faster by a minute!

Uros.


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 2:28 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs andrew.stu...@gmail.com wrote:
 On 07/07/11 11:26, Andrew Stubbs wrote:

 On 07/07/11 10:58, Richard Guenther wrote:

 I think you should assume that series of widenings,
 (int)(short)char_variable
 are already combined.  Thus I believe you only need to consider a single
 conversion in valid_types_for_madd_p.

 Hmm, I'm not so sure. I'll look into it a bit further.

 OK, here's a test case that gives multiple conversions:

  long long
  foo (long long a, signed char b, signed char c)
  {
    int bc = b * c;
    return a + (short)bc;
  }

 The dump right before the widen_mult pass gives:

  foo (long long int a, signed char b, signed char c)
  {
    int bc;
    long long int D.2018;
    short int D.2017;
    long long int D.2016;
    int D.2015;
    int D.2014;

  bb 2:
    D.2014_2 = (int) b_1(D);
    D.2015_4 = (int) c_3(D);
    bc_5 = D.2014_2 * D.2015_4;
    D.2017_6 = (short int) bc_5;

 Ok, so you have a truncation that is a no-op value-wise.  I would
 argue that this truncation should be removed independent on
 whether we have a widening multiply instruction or not.

 The technically most capable place to remove non-value-changing
 truncations (and combine them with a successive conversion)
 would be value-range propagation.  Which already knows:

 Value ranges after VRP:

 b_1(D): VARYING
 D.2698_2: [-128, 127]
 c_3(D): VARYING
 D.2699_4: [-128, 127]
 bc_5: [-16256, 16384]
 D.2701_6: [-16256, 16384]
 D.2702_7: [-16256, 16384]
 a_8(D): VARYING
 D.2700_9: VARYING

 thus truncating bc_5 to short does not change the value.

 The simplification could be made when looking at the
 statement

    D.2018_7 = (long long int) D.2017_6;

 in vrp_fold_stmt, based on the fact that this conversion
 converts from a value-preserving intermediate conversion.
 Thus the transform would replace the D.2017_6 operand
 with bc_5.

 So yes, the case appears - but it shouldn't ;)

 I'll cook up a quick patch for VRP.

Like the attached.  I'll finish and properly test it.

Richard.


p
Description: Binary data


Re: [Patch, Fortran] Add stat=/errmsg= support to _gfortran_caf_register

2011-07-07 Thread Tobias Burnus

On 07/07/2011 01:35 PM, Mikael Morin wrote:

if (type == CAF_REGTYPE_COARRAY_STATIC)
  {

This will return the same error (memory allocation failure) as in the case
just above. Is this expected or should it have an error of its own?


I think it is OK in either case. CAF_REGTYPE_COARRAY_STATIC is an 
automatic allocation for static coarrays, e.g.

   REAL, SAVE :: my_coarray(1000,1000,10)[*]
is allocated at startup (via a constructor) while the other case is for 
allocatable coarrays of the form

   REAL, ALLOCATABLE :: my_alloc_coarray(:, :, :)[:]
   ALLOCATE (my_alloc_coarray(1000,1000,10)[*])

I admit that it might not be obvious to the user that there is an 
explicit allocate in the first case. However, one allocates memory in 
either case and, thus, one could leave the message as is. In particular, 
I would assume that on most systems, the size of static coarrays is 
small enough that the message does not trigger.
However, if you think that the message could be clearer, I could also 
change it.



+   msg = "Failed to allocate coarray - stopped images";


Also I'm wondering whether it would be pertinent to share the error handling
between single.c (one error) and mpi.c (2 or 3 errors) as the codes are very
close (with an interface such as handle_error (int *stat, char *errmsg, int
errmsg_len, char *actual_error)).


The question is where to handle it; in principle, single.c and mpi.c are 
completely separate files - and both might be compiled by the 
user/system administrator, contrary to the rest of GCC. Well, single.c 
is actually automatically compiled as static library and installed as 
libcaf_single.a. The MPI version is never compiled automatically.


Thus, anyone who wants to use gfortran with coarrays (based on mpi.c), 
has to do:

a) Fetch libcaf.h and mpi.c
b) Compile mpi.c, e.g., using mpicc -g -O2 -c mpi.c
c) Link the resulting mpi.o (or libcaf_mpi.a) to the Fortran program.

As the user/sysadmin has to do the compilation himself, I would like to 
make it as easy as possible. The current idea is to have just a single C 
file plus a header file and no further dependency. Other communication 
backends could be added by simply creating a new file and implementing 
the library calls.


Thus, I do not see how one could best share single.c and mpi.c error 
messages. But if you have a good idea, I am open to change the current 
implementation.


(See also http://gcc.gnu.org/wiki/CoarrayLib )


Build and regtested on x86-64-linux.
OK for the trunk?

The above is nitpicking, and I leave the final decision to you and Daniel, so
the patch is basically OK with the two indentation nits fixed.


I have now committed the patch with only the nits fixed (Rev.175966). 
But given that the coarray support - especially with regards to the 
library - is still in flux, we can still change everything, including 
the ABI of the library and the file organization. I am sure that not all 
design decisions are optimal.


Thanks for the review!

Tobias


[committed] Regimplify last 2 ARRAY_*REF operands and last COMPONENT_REF operand (PR middle-end/49640)

2011-07-07 Thread Jakub Jelinek
Hi!

The attached testcase ICEs, because gimple_regimplify_operands ignores
lb: and sz: operands on ARRAY*_REF (and last operand on COMPONENT_REF),
assuming that if it is non-NULL, it is valid GIMPLE and doesn't need
further processing.  That is true for gimplification, as FEs/generic
leave those operands NULL and only gimplification sets them, but when
we need to regimplify them, e.g. for OpenMP (or perhaps inlining etc.),
it wouldn't do anything.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk and 4.6 branch.

2011-07-07  Jakub Jelinek  ja...@redhat.com

PR middle-end/49640
* gimplify.c (gimplify_compound_lval): For last 2 ARRAY_*REF
operands and last COMPONENT_REF operand call gimplify_expr on it
if non-NULL.

* gcc.dg/gomp/pr49640.c: New test.

--- gcc/gimplify.c.jj   2011-06-17 11:02:19.0 +0200
+++ gcc/gimplify.c  2011-07-07 10:56:30.0 +0200
@@ -2010,8 +2010,14 @@ gimplify_compound_lval (tree *expr_p, gi
  ret = MIN (ret, tret);
}
}
+ else
+   {
+ tret = gimplify_expr (TREE_OPERAND (t, 2), pre_p, post_p,
+   is_gimple_reg, fb_rvalue);
+ ret = MIN (ret, tret);
+   }
 
- if (!TREE_OPERAND (t, 3))
+ if (TREE_OPERAND (t, 3) == NULL_TREE)
{
  tree elmt_type = TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, 0)));
  tree elmt_size = unshare_expr (array_ref_element_size (t));
@@ -2031,11 +2037,17 @@ gimplify_compound_lval (tree *expr_p, gi
  ret = MIN (ret, tret);
}
}
+ else
+   {
+ tret = gimplify_expr (TREE_OPERAND (t, 3), pre_p, post_p,
+   is_gimple_reg, fb_rvalue);
+ ret = MIN (ret, tret);
+   }
}
   else if (TREE_CODE (t) == COMPONENT_REF)
{
  /* Set the field offset into T and gimplify it.  */
- if (!TREE_OPERAND (t, 2))
+ if (TREE_OPERAND (t, 2) == NULL_TREE)
{
  tree offset = unshare_expr (component_ref_field_offset (t));
  tree field = TREE_OPERAND (t, 1);
@@ -2054,6 +2066,12 @@ gimplify_compound_lval (tree *expr_p, gi
  ret = MIN (ret, tret);
}
}
+ else
+   {
+ tret = gimplify_expr (TREE_OPERAND (t, 2), pre_p, post_p,
+   is_gimple_reg, fb_rvalue);
+ ret = MIN (ret, tret);
+   }
}
 }
 
--- gcc/testsuite/gcc.dg/gomp/pr49640.c.jj  2011-07-07 11:07:08.0 
+0200
+++ gcc/testsuite/gcc.dg/gomp/pr49640.c 2011-07-07 11:05:19.0 +0200
@@ -0,0 +1,29 @@
+/* PR middle-end/49640 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -std=gnu99 -fopenmp" } */
+
+void
+foo (int N, int M, int K, int P, int Q, int R, int i, int j, int k,
+ unsigned char x[P][Q][R], int y[N][M][K])
+{
+  int ii, jj, kk;
+
+#pragma omp parallel for private(ii,jj,kk)
+  for (ii = 0; ii < P; ++ii)
+for (jj = 0; jj < Q; ++jj)
+  for (kk = 0; kk < R; ++kk)
+   y[i + ii][j + jj][k + kk] = x[ii][jj][kk];
+}
+
+void
+bar (int N, int M, int K, int P, int Q, int R, int i, int j, int k,
+ unsigned char x[P][Q][R], float y[N][M][K], float factor, float zero)
+{
+  int ii, jj, kk;
+
+#pragma omp parallel for private(ii,jj,kk)
+  for (ii = 0; ii < P; ++ii)
+for (jj = 0; jj < Q; ++jj)
+  for (kk = 0; kk < R; ++kk)
+   y[i + ii][j + jj][k + kk] = factor * x[ii][jj][kk] + zero;
+}

Jakub


[PATCH] Fix complex {*,/} real or real * complex handling in C FE (PR c/49644)

2011-07-07 Thread Jakub Jelinek
Hi!

For MULT_EXPR and TRUNC_DIV_EXPR, both sides of COMPLEX_EXPR contain
a copy of the non-complex operand, which means its side-effects can be
evaluated twice.  For PLUS_EXPR/MINUS_EXPR they appear just in one of
the operands and thus it works fine as is.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/4.6?

2011-07-07  Jakub Jelinek  ja...@redhat.com

PR c/49644
* c-typeck.c (build_binary_op): For MULT_EXPR and TRUNC_DIV_EXPR with
one non-complex and one complex argument, call c_save_expr on both
operands.

* gcc.c-torture/execute/pr49644.c: New test.

--- gcc/c-typeck.c.jj   2011-05-31 08:03:10.0 +0200
+++ gcc/c-typeck.c  2011-07-07 11:47:31.0 +0200
@@ -10032,6 +10032,8 @@ build_binary_op (location_t location, en
  if (first_complex)
{
  op0 = c_save_expr (op0);
+ if (code == MULT_EXPR || code == TRUNC_DIV_EXPR)
+   op1 = c_save_expr (op1);
  real = build_unary_op (EXPR_LOCATION (orig_op0), REALPART_EXPR,
 op0, 1);
  imag = build_unary_op (EXPR_LOCATION (orig_op0), IMAGPART_EXPR,
@@ -10052,6 +10054,8 @@ build_binary_op (location_t location, en
}
  else
{
+ if (code == MULT_EXPR)
+   op0 = c_save_expr (op0);
  op1 = c_save_expr (op1);
  real = build_unary_op (EXPR_LOCATION (orig_op1), REALPART_EXPR,
 op1, 1);
--- gcc/testsuite/gcc.c-torture/execute/pr49644.c.jj2011-07-07 
11:48:34.0 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr49644.c   2011-07-07 
11:35:52.0 +0200
@@ -0,0 +1,16 @@
+/* PR c/49644 */
+
+extern void abort (void);
+
+int
+main (void)
+{
+  _Complex double a[12], *c = a, s = 3.0 + 1.0i;
+  double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b;
+  int i;
+  for (i = 0; i < 6; i++)
+*c++ = *d++ * s;
+  if (c != a + 6 || d != b + 6)
+abort ();
+  return 0;
+}

Jakub


Re: PATCH [1/n] X32: Add initial -x32 support

2011-07-07 Thread H.J. Lu
On Wed, Jul 6, 2011 at 9:22 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jul 6, 2011 at 8:02 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, Jul 6, 2011 at 4:48 PM, H.J. Lu hjl.to...@gmail.com wrote:
 Hi Paolo, DJ, Nathanael, Alexandre, Ralf,

 Is the change
 .
        * configure.ac: Support --enable-x32.
        * configure: Regenerated.

 diff --git a/gcc/configure.ac b/gcc/configure.ac
 index 5f3641b..bddabeb 100644
 --- a/gcc/configure.ac
 +++ b/gcc/configure.ac
 @@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib,
  [], [enable_multilib=yes])
  AC_SUBST(enable_multilib)

 +# With x32 support
 +AC_ARG_ENABLE(x32,
 +[  --enable-x32            enable x32 library support for multiple ABIs],

 Looks like a very very generic switch for a global configury ... we already
 have --with-multilib-list (SH only), why not extend that to also work
 for x86_64?

 Richard.

 +[], [enable_x32=no])
 +
  # Enable __cxa_atexit for C++.
  AC_ARG_ENABLE(__cxa_atexit,
  [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])],

 OK?

 Thanks.


 Here is the updated patch to use --with-multilib-list=x32.

 Paolo, DJ, Nathanael, Alexandre, Ralf, Is the configure.ac change

 ---
        * configure.ac: Mention x86-64 for --with-multilib-list.
        * configure: Regenerated.

        * doc/install.texi: Document --with-multilib-list=x32.

 diff --git a/gcc/configure.ac b/gcc/configure.ac
 index 5f3641b..a73f758 100644
 --- a/gcc/configure.ac
 +++ b/gcc/configure.ac
 @@ -795,7 +795,7 @@ esac],
  [enable_languages=c])

  AC_ARG_WITH(multilib-list,
 -[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH only)])],
 +[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH and
 x86-64 only)])],
  :,
  with_multilib_list=default)

 diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
 index 49aac95..a5d266c 100644
 --- a/gcc/doc/install.texi
 +++ b/gcc/doc/install.texi
 @@ -1049,8 +1049,10 @@ sysv, aix.
  @item --with-multilib-list=@var{list}
  @itemx --without-multilib-list
  Specify what multilibs to build.
 -Currently only implemented for sh*-*-*.
 +Currently only implemented for sh*-*-* and x86-64-*-linux*.

 +@table @code
 +@item sh*-*-*
  @var{list} is a comma separated list of CPU names.  These must be of the
  form @code{sh*} or @code{m*} (in which case they match the compiler option
  for that processor).  The list should not contain any endian options -
 @@ -1082,6 +1084,12 @@ only little endian SH4AL:
  --with-multilib-list=sh4al,!mb/m4al
  @end smallexample

 +@item x86-64-*-linux*
 +If @var{list} is @code{x32}, x32 run-time library will be enabled.  By
 +default, x32 run-time library is disabled.
 +
 +@end table
 +
  @item --with-endian=@var{endians}
  Specify what endians to use.
  Currently only implemented for sh*-*-*.
 ---

 OK?

 Thanks.

 --
 H.J.
 ---
 2011-07-06  H.J. Lu  hongjiu...@intel.com

        * config.gcc: Support --with-multilib-list=x32 for x86 Linux
        targets.

        * configure.ac: Mention x86-64 for --with-multilib-list.
        * configure: Regenerated.

        * config/i386/gnu-user64.h (SPEC_64): Support x32.
        (SPEC_32): Likewise.
        (ASM_SPEC): Likewise.
        (LINK_SPEC): Likewise.
        (TARGET_THREAD_SSP_OFFSET): Likewise.
        (TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise.
        (SPEC_X32): New.

        * config/i386/i386.h (TARGET_X32): New.
        (TARGET_LP64): New.
        (LONG_TYPE_SIZE): Likewise.
        (POINTER_SIZE): Likewise.
        (POINTERS_EXTEND_UNSIGNED): Likewise.
        (OPT_ARCH64): Support x32.
        (OPT_ARCH32): Likewise.

        * config/i386/i386.opt (mx32): New.

        * config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New.
        (GLIBC_DYNAMIC_LINKERX32): Likewise.
        * config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise.
        (GLIBC_DYNAMIC_LINKERX32): Likewise.

        * config/i386/t-linux-x32: New.

        * config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New.
        (BIONIC_DYNAMIC_LINKERX32): Likewise.
        (GNU_USER_DYNAMIC_LINKERX32): Likewise.

        * doc/install.texi: Document --with-multilib-list=x32.

        * doc/invoke.texi: Document -mx32.


Hi Uros,

This new version only adds a comment to configure.ac.  OK to install?

Thanks.

-- 
H.J.


Re: RFA: Fix bogus mode in choose_reload_regs

2011-07-07 Thread Ulrich Weigand
Richard Sandiford wrote:

 gcc/
   * reload1.c (choose_reload_regs): Use mode sizes to check whether
   an old relaod register completely defines the required value.
 
 gcc/testsuite/
   * gcc.target/arm/neon-modes-3.c: New test.

This is OK.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: PATCH [1/n] X32: Add initial -x32 support

2011-07-07 Thread Uros Bizjak
On Thu, Jul 7, 2011 at 2:59 PM, H.J. Lu hjl.to...@gmail.com wrote:

 Hi Paolo, DJ, Nathanael, Alexandre, Ralf,

 Is the change
 .
        * configure.ac: Support --enable-x32.
        * configure: Regenerated.

 diff --git a/gcc/configure.ac b/gcc/configure.ac
 index 5f3641b..bddabeb 100644
 --- a/gcc/configure.ac
 +++ b/gcc/configure.ac
 @@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib,
  [], [enable_multilib=yes])
  AC_SUBST(enable_multilib)

 +# With x32 support
 +AC_ARG_ENABLE(x32,
 +[  --enable-x32            enable x32 library support for multiple ABIs],

 Looks like a very very generic switch for a global configury ... we already
 have --with-multilib-list (SH only), why not extend that to also work
 for x86_64?

 Richard.

 +[], [enable_x32=no])
 +
  # Enable __cxa_atexit for C++.
  AC_ARG_ENABLE(__cxa_atexit,
  [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])],

 OK?

 Thanks.


 Here is the updated patch to use --with-multilib-list=x32.

 Paolo, DJ, Nathanael, Alexandre, Ralf, Is the configure.ac change

 ---
        * configure.ac: Mention x86-64 for --with-multilib-list.
        * configure: Regenerated.

        * doc/install.texi: Document --with-multilib-list=x32.

 diff --git a/gcc/configure.ac b/gcc/configure.ac
 index 5f3641b..a73f758 100644
 --- a/gcc/configure.ac
 +++ b/gcc/configure.ac
 @@ -795,7 +795,7 @@ esac],
  [enable_languages=c])

  AC_ARG_WITH(multilib-list,
 -[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH only)])],
 +[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH and
 x86-64 only)])],
  :,
  with_multilib_list=default)

 diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
 index 49aac95..a5d266c 100644
 --- a/gcc/doc/install.texi
 +++ b/gcc/doc/install.texi
 @@ -1049,8 +1049,10 @@ sysv, aix.
  @item --with-multilib-list=@var{list}
  @itemx --without-multilib-list
  Specify what multilibs to build.
 -Currently only implemented for sh*-*-*.
 +Currently only implemented for sh*-*-* and x86-64-*-linux*.

 +@table @code
 +@item sh*-*-*
  @var{list} is a comma separated list of CPU names.  These must be of the
  form @code{sh*} or @code{m*} (in which case they match the compiler option
  for that processor).  The list should not contain any endian options -
 @@ -1082,6 +1084,12 @@ only little endian SH4AL:
  --with-multilib-list=sh4al,!mb/m4al
  @end smallexample

 +@item x86-64-*-linux*
 +If @var{list} is @code{x32}, x32 run-time library will be enabled.  By
 +default, x32 run-time library is disabled.
 +
 +@end table
 +
  @item --with-endian=@var{endians}
  Specify what endians to use.
  Currently only implemented for sh*-*-*.
 ---

 OK?

 Thanks.

 --
 H.J.
 ---
 2011-07-06  H.J. Lu  hongjiu...@intel.com

        * config.gcc: Support --with-multilib-list=x32 for x86 Linux
        targets.

        * configure.ac: Mention x86-64 for --with-multilib-list.
        * configure: Regenerated.

        * config/i386/gnu-user64.h (SPEC_64): Support x32.
        (SPEC_32): Likewise.
        (ASM_SPEC): Likewise.
        (LINK_SPEC): Likewise.
        (TARGET_THREAD_SSP_OFFSET): Likewise.
        (TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise.
        (SPEC_X32): New.

        * config/i386/i386.h (TARGET_X32): New.
        (TARGET_LP64): New.
        (LONG_TYPE_SIZE): Likewise.
        (POINTER_SIZE): Likewise.
        (POINTERS_EXTEND_UNSIGNED): Likewise.
        (OPT_ARCH64): Support x32.
        (OPT_ARCH32): Likewise.

        * config/i386/i386.opt (mx32): New.

        * config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New.
        (GLIBC_DYNAMIC_LINKERX32): Likewise.
        * config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise.
        (GLIBC_DYNAMIC_LINKERX32): Likewise.

        * config/i386/t-linux-x32: New.

        * config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New.
        (BIONIC_DYNAMIC_LINKERX32): Likewise.
        (GNU_USER_DYNAMIC_LINKERX32): Likewise.

        * doc/install.texi: Document --with-multilib-list=x32.

        * doc/invoke.texi: Document -mx32.


 Hi Uros,

 This new version only adds a comment to configure.ac.  OK to install?

OK.

Thanks,
Uros.


Re: CFT: Move unwinder to toplevel libgcc

2011-07-07 Thread Rainer Orth
Tristan Gingold ging...@adacore.com writes:

 Otherwise, the patch is unchanged from the original submission:
 
  [build] Move unwinder to toplevel libgcc
  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01452.html
 
 Unfortunately, it hasn't seen much comment.  I'm now looking for testers
 especially on platforms with more change and approval of those parts:
 
 * Several IA-64 targets:
 
  ia64*-*-linux*
ia64*-*-hpux*
ia64-hp-*vms*

 For ia64-hp-vms, consider your patch approved if the parts for ia64 are.
 If anything breaks, I will fix it.

In that case, perhaps Steve could have a look?  I'd finally like to make
some progress on this patch.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: RFA: Fix bogus mode in choose_reload_regs

2011-07-07 Thread Jay Foad
On 7 July 2011 09:09, Richard Sandiford richard.sandif...@linaro.org wrote:
 gcc/
        * reload1.c (choose_reload_regs): Use mode sizes to check whether
        an old relaod register completely defines the required value.

s/relaod/reload/

Jay.


[PATCH] Fix folding of -(unsigned)(a * -b)

2011-07-07 Thread Richard Guenther

Folding of $subject is currently broken (noticed that when playing
with types in pointer_int_sum).  We happily ignore the fact
that the negate operates on an unsigned type and change it to
operate on a signed one - which may cause new undefined overflow.
Seen with the testcase below which aborts with current trunk.

The fix is to not strip sign-changing conversions as already
done for ABS_EXPR.
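
To illustrate why the type of the negate matters, here is a minimal
sketch.  The helper name negate_as_unsigned is hypothetical and not part
of the patch.  Negation carried out in unsigned long wraps modulo 2^N and
is always defined, whereas the same negation carried out in long would
overflow for LONG_MIN.

```c
#include <limits.h>

/* Hypothetical helper, not part of the patch: negate X in the unsigned
   type.  Unsigned negation wraps modulo ULONG_MAX + 1, so it is defined
   for every input, including LONG_MIN.  */
unsigned long
negate_as_unsigned (long x)
{
  return -(unsigned long) x;
}
```

With the arguments used in the testcase (i = LONG_MIN, j = -1), i * -j is
LONG_MIN * 1 and does not overflow, and negating that as unsigned is
defined.  Rewriting the expression to (unsigned long)(i * j) would instead
compute LONG_MIN * -1 in the signed type, which overflows.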

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-07-07  Richard Guenther  rguent...@suse.de

* fold-const.c (fold_unary_loc): Do not strip sign-changes
for NEGATE_EXPR.

* gcc.dg/ftrapv-3.c: New testcase.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 175962)
+++ gcc/fold-const.c(working copy)
@@ -7561,7 +7561,7 @@ fold_unary_loc (location_t loc, enum tre
   if (arg0)
 {
   if (CONVERT_EXPR_CODE_P (code)
- || code == FLOAT_EXPR || code == ABS_EXPR)
+ || code == FLOAT_EXPR || code == ABS_EXPR || code == NEGATE_EXPR)
{
  /* Don't use STRIP_NOPS, because signedness of argument type
 matters.  */
Index: gcc/testsuite/gcc.dg/ftrapv-3.c
===
--- gcc/testsuite/gcc.dg/ftrapv-3.c (revision 0)
+++ gcc/testsuite/gcc.dg/ftrapv-3.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-options "-ftrapv" } */
+
+extern void abort (void);
+unsigned long
+foo (long i, long j)
+{
+  /* We may not fold this to (unsigned long)(i * j).  */
+  return -(unsigned long)(i * -j);
+}
+int main()
+{
+  if (foo (-__LONG_MAX__ - 1, -1) != -(unsigned long)(-__LONG_MAX__ - 1))
+abort ();
+  return 0;
+}


Re: PATCH [1/n] X32: Add initial -x32 support

2011-07-07 Thread Paolo Bonzini
Did you even _think_ of looking at the sh configury, and do something
vaguely similar for x86?

You should not duplicate t-linux64 at all.  Instead, in config.gcc set
m64/m32 as the default value for with_multilib_list on i386 biarch and
x86_64.  Pass $with_multilib_list to t-linux64 using
TM_MULTILIB_CONFIG.  Then, do something like

comma=,
MULTILIB_OPTIONS= $(subst $(comma),/,@TM_MULTILIB_CONFIG@)
MULTILIB_DIRNAMES   = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS)))
MULTILIB_OSDIRNAMES  = 64=../lib64
MULTILIB_OSDIRNAMES += 32=$(if $(wildcard $(shell echo
$(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)
MULTILIB_OSDIRNAMES += x32=../libx32

in config/t-linux64.  (Each on one line, apologies for any wrapping)

The option will be used as --with-multilib-list=m64,m32,mx32 (allowing
the user to omit some of the variants, too).

Paolo


[PATCH] Make VRP optimize useless conversions

2011-07-07 Thread Richard Guenther

The following patch teaches VRP to disregard the intermediate
conversion in a sequence (T1)(T2)val if that sequence is
value-preserving for val.  There are possibly some more
cases that could be handled when a sign-change is involved
but the following is a first safe step.
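
As an illustration of a value-preserving (T1)(T2)val sequence, here is a
small sketch.  Both function names are hypothetical and the example is not
taken from the patch or from vrp58.c.  Because val is masked to [0, 127],
the intermediate conversion to signed char cannot change the value, so
both functions compute the same result.  Whether VRP actually rewrites
this particular input is not something the sketch verifies.

```c
/* Hypothetical illustration: val is known to lie in [0, 127], so the
   intermediate conversion to signed char is value-preserving.  */
int
via_intermediate (unsigned int x)
{
  unsigned int val = x & 0x7f;
  return (int) (signed char) val;
}

/* The same computation with the intermediate conversion dropped.  */
int
direct (unsigned int x)
{
  unsigned int val = x & 0x7f;
  return (int) val;
}
```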

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-07-07  Richard Guenther  rguent...@suse.de

* tree-vrp.c (simplify_conversion_using_ranges): New function.
(simplify_stmt_using_ranges): Call it.

* gcc.dg/tree-ssa/vrp58.c: New testcase.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c  (revision 175962)
--- gcc/tree-vrp.c  (working copy)
*** simplify_switch_using_ranges (gimple stm
*** 7342,7347 
--- 7342,7378 
return false;
  }
  
+ /* Simplify an integral conversion from an SSA name in STMT.  */
+ 
+ static bool
+ simplify_conversion_using_ranges (gimple stmt)
+ {
+   tree rhs1 = gimple_assign_rhs1 (stmt);
+   gimple def_stmt = SSA_NAME_DEF_STMT (rhs1);
+   value_range_t *final, *inner;
+ 
+   /* Obtain final and inner value-ranges for a conversion
+  sequence (final-type)(intermediate-type)inner-type.  */
+   final = get_value_range (gimple_assign_lhs (stmt));
+   if (final->type != VR_RANGE)
+ return false;
+   if (!is_gimple_assign (def_stmt)
+   || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
+ return false;
+   rhs1 = gimple_assign_rhs1 (def_stmt);
+   if (TREE_CODE (rhs1) != SSA_NAME)
+ return false;
+   inner = get_value_range (rhs1);
+   if (inner->type != VR_RANGE)
+ return false;
+   if (!tree_int_cst_equal (final->min, inner->min)
+   || !tree_int_cst_equal (final->max, inner->max))
+ return false;
+   gimple_assign_set_rhs1 (stmt, rhs1);
+   update_stmt (stmt);
+   return true;
+ }
+ 
  /* Simplify STMT using ranges if possible.  */
  
  static bool
*** simplify_stmt_using_ranges (gimple_stmt_
*** 7351,7356 
--- 7382,7388 
if (is_gimple_assign (stmt))
  {
enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
+   tree rhs1 = gimple_assign_rhs1 (stmt);
  
switch (rhs_code)
{
*** simplify_stmt_using_ranges (gimple_stmt_
*** 7364,7370 
 or identity if the RHS is zero or one, and the LHS are known
 to be boolean values.  Transform all TRUTH_*_EXPR into
   BIT_*_EXPR if both arguments are known to be boolean values.  */
! if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))
return simplify_truth_ops_using_ranges (gsi, stmt);
  break;
  
--- 7396,7402 
 or identity if the RHS is zero or one, and the LHS are known
 to be boolean values.  Transform all TRUTH_*_EXPR into
   BIT_*_EXPR if both arguments are known to be boolean values.  */
! if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
return simplify_truth_ops_using_ranges (gsi, stmt);
  break;
  
*** simplify_stmt_using_ranges (gimple_stmt_
*** 7373,7387 
 than zero and the second operand is an exact power of two.  */
case TRUNC_DIV_EXPR:
case TRUNC_MOD_EXPR:
! if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))
   && integer_pow2p (gimple_assign_rhs2 (stmt)))
return simplify_div_or_mod_using_ranges (stmt);
  break;
  
/* Transform ABS (X) into X or -X as appropriate.  */
case ABS_EXPR:
! if (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
!  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))
return simplify_abs_using_ranges (stmt);
  break;
  
--- 7405,7419 
 than zero and the second operand is an exact power of two.  */
case TRUNC_DIV_EXPR:
case TRUNC_MOD_EXPR:
! if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
   && integer_pow2p (gimple_assign_rhs2 (stmt)))
return simplify_div_or_mod_using_ranges (stmt);
  break;
  
/* Transform ABS (X) into X or -X as appropriate.  */
case ABS_EXPR:
! if (TREE_CODE (rhs1) == SSA_NAME
!  && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
return simplify_abs_using_ranges (stmt);
  break;
  
*** simplify_stmt_using_ranges (gimple_stmt_
*** 7390,7399 
  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 if all the bits being cleared are already cleared or
 all the bits being set are already set.  */
! if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))
return simplify_bit_ops_using_ranges (gsi, stmt);
  break;
  
default:
  break;
}
--- 7422,7437 
  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 if all the bits being cleared are already cleared or
 all the bits being set are 

Re: Remove unused t-* fragments

2011-07-07 Thread John David Anglin

On 7/6/2011 4:14 PM, Joseph S. Myers wrote:

2011-07-06  Joseph Myers  jos...@codesourcery.com

* config/i386/t-crtpic, config/i386/t-svr3dbx, config/pa/t-pa:
Remove.

Ok for pa.

Dave
--
John David Anglin  dave.ang...@bell.net



Re: [PATCH, graphite]: Fix UNRESOLVED: gcc.dg/graphite/pr37485.c scan-tree-dump-times graphite Loop blocked

2011-07-07 Thread Sebastian Pop
On Thu, Jul 7, 2011 at 05:36, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 We should add loop blocking flags (the same as in graphite.exp) if we
 want to check graphite tree dump.

 2011-07-07  Uros Bizjak  ubiz...@gmail.com

        * gcc.dg/graphite/pr37485.c (dg-options): Add -floop-block
        -fno-loop-strip-mine -fno-loop-interchange -ffast-math.

 Tested on x86_64-pc-linux-gnu {,-m32}.

 OK for mainline?

Yes, thanks,
Sebastian


Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns

2011-07-07 Thread Richard Guenther
On Mon, Jul 4, 2011 at 8:55 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Ok, reworked version.  The folding of X op X and !X op !X seems indeed
 not being necessary. So function simplifies much.

 Bootstrapped and regression tested for all standard languages (plus
 Ada and Obj-C++). Ok for apply?

Ok with a proper changelog entry.

Thanks,
Richard.
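
For reference, the identities the patch relies on can be checked with a
small sketch.  The function names are hypothetical and not taken from the
patch.  X & !X is 0 for any integral X, while X | !X and X ^ !X are 1 only
when X is truth-valued (0 or 1).

```c
/* Hypothetical illustration of the folded patterns.  */

/* X & !X: always 0, for any integral X.  */
int
and_with_not (int x)
{
  return x & !x;
}

/* X | !X: 1 when X is 0 or 1; otherwise X itself.  */
int
or_with_not (int x)
{
  return x | !x;
}

/* X ^ !X: 1 when X is 0 or 1; otherwise X itself.  */
int
xor_with_not (int x)
{
  return x ^ !x;
}
```

For a non-truth-valued X such as 5, the OR and XOR forms yield X rather
than 1, which is why the patch only folds them when X is known to be
truth-valued.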

 Regards,
 Kai

 Index: gcc-head/gcc/tree-ssa-forwprop.c
 ===
 --- gcc-head.orig/gcc/tree-ssa-forwprop.c
 +++ gcc-head/gcc/tree-ssa-forwprop.c
 @@ -1602,6 +1602,129 @@ simplify_builtin_call (gimple_stmt_itera
   return false;
  }

 +/* Checks if expression has type of one-bit precision, or is a known
 +   truth-valued expression.  */
 +static bool
 +truth_valued_ssa_name (tree name)
 +{
 +  gimple def;
 +  tree type = TREE_TYPE (name);
 +
 +  if (!INTEGRAL_TYPE_P (type))
 +    return false;
 +  /* Don't check here for BOOLEAN_TYPE as the precision isn't
 +     necessarily one and so ~X is not equal to !X.  */
 +  if (TYPE_PRECISION (type) == 1)
 +    return true;
 +  def = SSA_NAME_DEF_STMT (name);
 +  if (is_gimple_assign (def))
 +    return truth_value_p (gimple_assign_rhs_code (def), type);
 +  return false;
 +}
 +
 +/* Helper routine for simplify_bitwise_binary_1 function.
 +   Return for the SSA name NAME the expression X if it meets condition
 +   NAME = !X. Otherwise return NULL_TREE.
 +   Detected patterns for NAME = !X are:
 +     !X and X == 0 for X with integral type.
 +     X ^ 1, X != 1,or ~X for X with integral type with precision of one.  */
 +static tree
 +lookup_logical_inverted_value (tree name)
 +{
 +  tree op1, op2;
 +  enum tree_code code;
 +  gimple def;
 +
 +  /* If name has non-integral type, or isn't a SSA_NAME, then
 +     return.  */
 +  if (TREE_CODE (name) != SSA_NAME
 +      || !INTEGRAL_TYPE_P (TREE_TYPE (name)))
 +    return NULL_TREE;
 +  def = SSA_NAME_DEF_STMT (name);
 +  if (!is_gimple_assign (def))
 +    return NULL_TREE;
 +
 +  code = gimple_assign_rhs_code (def);
 +  op1 = gimple_assign_rhs1 (def);
 +  op2 = NULL_TREE;
 +
 +  /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand.
 +     If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR,
 +     or BIT_NOT_EXPR, then return.  */
 +  if (code == EQ_EXPR || code == NE_EXPR
 +      || code == BIT_XOR_EXPR)
 +    op2 = gimple_assign_rhs2 (def);
 +
 +  switch (code)
 +    {
 +    case TRUTH_NOT_EXPR:
 +      return op1;
 +    case BIT_NOT_EXPR:
 +      if (truth_valued_ssa_name (name))
 +       return op1;
 +      break;
 +    case EQ_EXPR:
 +      /* Check if we have X == 0 and X has an integral type.  */
 +      if (!INTEGRAL_TYPE_P (TREE_TYPE (op1)))
 +       break;
 +      if (integer_zerop (op2))
 +       return op1;
 +      break;
 +    case NE_EXPR:
 +      /* Check if we have X != 1 and X is a truth-valued.  */
 +      if (!INTEGRAL_TYPE_P (TREE_TYPE (op1)))
 +       break;
 +      if (integer_onep (op2) && truth_valued_ssa_name (op1))
 +       return op1;
 +      break;
 +    case BIT_XOR_EXPR:
 +      /* Check if we have X ^ 1 and X is truth valued.  */
 +      if (integer_onep (op2) && truth_valued_ssa_name (op1))
 +       return op1;
 +      break;
 +    default:
 +      break;
 +    }
 +
 +  return NULL_TREE;
 +}
 +
 +/* Optimize ARG1 CODE ARG2 to a constant for bitwise binary
 +   operations CODE, if one operand has the logically inverted
 +   value of the other.  */
 +static tree
 +simplify_bitwise_binary_1 (enum tree_code code, tree type,
 +                          tree arg1, tree arg2)
 +{
 +  tree anot;
 +
 +  /* If CODE isn't a bitwise binary operation, return NULL_TREE.  */
 +  if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR
 +      && code != BIT_XOR_EXPR)
 +    return NULL_TREE;
 +
 +  /* First check if operands ARG1 and ARG2 are equal.  If so
 +     return NULL_TREE as this optimization is handled by fold_stmt.  */
 +  if (arg1 == arg2)
 +    return NULL_TREE;
 +  /* See if we have in arguments logical-not patterns.  */
 +  if (((anot = lookup_logical_inverted_value (arg1)) == NULL_TREE
 +       || anot != arg2)
 +      && ((anot = lookup_logical_inverted_value (arg2)) == NULL_TREE
 +         || anot != arg1))
 +    return NULL_TREE;
 +
 +  /* X & !X -> 0.  */
 +  if (code == BIT_AND_EXPR)
 +    return fold_convert (type, integer_zero_node);
 +  /* X | !X -> 1 and X ^ !X -> 1, if X is truth-valued.  */
 +  if (truth_valued_ssa_name (anot))
 +    return fold_convert (type, integer_one_node);
 +
 +  /* ??? Otherwise result is (X != 0 ? X : 1).  not handled.  */
 +  return NULL_TREE;
 +}
 +
  /* Simplify bitwise binary operations.
    Return true if a transformation applied, otherwise return false.  */

 @@ -1769,6 +1892,15 @@ simplify_bitwise_binary (gimple_stmt_ite
       return true;
     }

 +  /* Try simple folding for X op !X, and X op X.  */
 +  res = simplify_bitwise_binary_1 (code, TREE_TYPE (arg1), arg1, arg2);
 +  if (res != NULL_TREE)
 +    {
 +      gimple_assign_set_rhs_from_tree 

Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Richard Sandiford
Bernd Schmidt ber...@codesourcery.com writes:
 This adds the actual optimization, and reworks the JUMP_LABEL handling
 for return blocks. See the introduction mail or the new comment ahead of
 thread_prologue_and_epilogue_insns for more notes.

It seems a shame to have both (return) and (simple_return).  You said
that we need the distinction in order to cope with targets like ARM,
whose (return) instruction actually performs some of the epilogue too.
It feels like the load of the saved registers should really be expressed
in rtl, in parallel with the return.  I realise that'd prevent
conditional returns though.  Maybe there's no elegant way out...

With the hidden loads, it seems like we'll have a situation in which the
values of call-saved registers will appear to be different for different
real incoming edges to the exit block.

Is JUMP_LABEL ever null after this change?  (In fully-complete rtl
sequences, I mean.)  It looked like some of the null checks in the
patch might not be necessary any more.

JUMP_LABEL also seems somewhat misnamed after this change; maybe
JUMP_TARGET would be better?  I'm the last person who should be
recommending names though.

I know it's a pain, but it'd really help if you could split the
JUMP_LABEL == a return rtx stuff out.

I think it'd also be worth splitting the RETURN_ADDR_REGNUM bit out into
a separate patch, and handling other things in a more generic way.
E.g. the default INCOMING_RETURN_ADDR_RTX could then be:

  #define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM)

and df.c:df_get_exit_block_use_set should include RETURN_ADDR_REGNUM
when epilogue_completed.

It'd be nice to handle cases in which all references to the stack pointer
are to the incoming arguments.  Maybe mention the fact that we don't as
another source of conservatism?

It'd also be nice to get rid of all these big blocks of code that are
conditional on preprocessor macros, but I realise you're just following
existing practice in the surrounding code, so again it can be left to
a future cleanup.

 @@ -1280,7 +1297,7 @@ force_nonfallthru_and_redirect (edge e, 
  basic_block
  force_nonfallthru (edge e)
  {
 -  return force_nonfallthru_and_redirect (e, e->dest);
 +  return force_nonfallthru_and_redirect (e, e->dest, NULL_RTX);
  }

Maybe assert here that e->dest isn't the exit block?  I realise it
will be caught by the:

gcc_assert (jump_label == simple_return_rtx);

check, but an assert here would make it more obvious what had gone wrong.

 -  if (GET_CODE (x) == RETURN)
 +  if (GET_CODE (x) == RETURN || GET_CODE (x) == SIMPLE_RETURN)

ANY_RETURN_P (x).  A few other cases.

 @@ -5654,6 +5658,7 @@ init_emit_regs (void)
/* Assign register numbers to the globally defined register rtx.  */
pc_rtx = gen_rtx_fmt_ (PC, VOIDmode);
ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode);
 +  simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode);
cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode);
stack_pointer_rtx = gen_raw_REG (Pmode, STACK_POINTER_REGNUM);
frame_pointer_rtx = gen_raw_REG (Pmode, FRAME_POINTER_REGNUM);

It'd be nice to s/ret_rtx/return_rtx/ for consistency, but that can
happen anytime.

 +/* Return true if INSN requires the stack frame to be set up.  */
 +static bool
 +requires_stack_frame_p (rtx insn)
 +{
 +  HARD_REG_SET hardregs;
 +  unsigned regno;
 +
 +  if (!INSN_P (insn) || DEBUG_INSN_P (insn))
 +return false;
 +  if (CALL_P (insn))
 +return !SIBLING_CALL_P (insn);
 +  if (for_each_rtx (&PATTERN (insn), frame_required_for_rtx, NULL))
 +return true;
 +  CLEAR_HARD_REG_SET (hardregs);
 +  note_stores (PATTERN (insn), record_hard_reg_sets, &hardregs);
 +  AND_COMPL_HARD_REG_SET (hardregs, call_used_reg_set);
 +  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 +if (TEST_HARD_REG_BIT (hardregs, regno)
 +  && df_regs_ever_live_p (regno))
 +  return true;

This can be done as a follow-up, but it looks like df should be using
a HARD_REG_SET here, and that we should be able to get at it directly.

 +   FOR_EACH_EDGE (e, ei, bb->preds)
 + if (!bitmap_bit_p (bb_antic_flags, e->src->index))
 +   {
 + VEC_quick_push (basic_block, vec, e->src);
 + bitmap_set_bit (bb_on_list, e->src->index);
 +   }

 !bitmap_bit_p (bb_on_list, e->src->index) ?

 + }
 +  while (!VEC_empty (basic_block, vec))
 + {
 +   basic_block tmp_bb = VEC_pop (basic_block, vec);
 +   edge e;
 +   edge_iterator ei;
 +   bool all_set = true;
 +
 +   bitmap_clear_bit (bb_on_list, tmp_bb->index);
 +   FOR_EACH_EDGE (e, ei, tmp_bb->succs)
 + {
 +   if (!bitmap_bit_p (bb_antic_flags, e->dest->index))
 + {
 +   all_set = false;
 +   break;
 + }
 + }
 +   if (all_set)
 + {
 +   bitmap_set_bit (bb_antic_flags, tmp_bb->index);
 +   FOR_EACH_EDGE (e, ei, tmp_bb->preds)
 + if (!bitmap_bit_p (bb_antic_flags, 

Re: [PATCH][C] Fixup pointer-int-sum

2011-07-07 Thread Joseph S. Myers
On Thu, 7 Jul 2011, Richard Guenther wrote:

 not overflow (what is actually the C semantics - is the
 multiplication allowed to overflow for unsigned intop?  If not

Overflow is not allowed.  Formally the multiplication is as-if to infinite 
precision, and then there is undefined behavior if the result of the 
addition (to infinite precision) is outside the array pointed to - 
wrapping around by some multiple of the whole address space is not 
allowed.

In practice, as previously discussed objects half or more of the address 
space do not work reliably because of the problems doing pointer 
subtraction, so always using a signed type shouldn't break anything that 
actually worked reliably (though how unreliable things were with large 
malloced objects - which unfortunately glibc's malloc can provide - if the 
source code didn't use pointer subtraction, I don't know).
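
The pointer-subtraction limit mentioned above can be stated as a simple
size check.  This is only a sketch: the helper name is hypothetical, and
it assumes the limiting factor is that the byte distance across the object
must be representable in ptrdiff_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the distance between the first byte
   and one past the last byte of an object of NBYTES bytes fits in
   ptrdiff_t, so that subtracting two char pointers into the object
   cannot overflow.  The comparison is done in uintmax_t so that it also
   works where ptrdiff_t is wider than size_t.  */
int
object_size_fits_ptrdiff (size_t nbytes)
{
  return (uintmax_t) nbytes <= (uintmax_t) PTRDIFF_MAX;
}
```

On common targets where ptrdiff_t is the signed type corresponding to
size_t, this rejects exactly the objects of half the address space or
more.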

In GCC's terms half or more of the address space generally means half the 
range of size_t.  (m32c has ptrdiff_t wider than size_t in some cases.  On 
such unusual architectures it ought to be possible to have objects whose 
size is up to SIZE_MAX bytes and have pointer addition and subtraction 
work reliably, which would suggest using ptrdiff_t for arithmetic in such 
cases, but the code checking sizes for arrays of constant size uses the 
signed type corresponding to size_t, so you could only get a larger object 
through malloc or VLAs.)

The patch is OK.  Unconditionally signed is also OK, though I don't see 
any advantage over this version.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix complex {*,/} real or real * complex handling in C FE (PR c/49644)

2011-07-07 Thread Joseph S. Myers
On Thu, 7 Jul 2011, Jakub Jelinek wrote:

 Hi!
 
 For MULT_EXPR and TRUNC_DIV_EXPR, both sides of COMPLEX_EXPR contain
 a copy of the non-complex operand, which means its side-effects can be
 evaluated twice.  For PLUS_EXPR/MINUS_EXPR they appear just in one of
 the operands and thus it works fine as is.
 
 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk/4.6?

OK, but I think you need a similar patch for the C++ front end as well.
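
The double evaluation described in the quoted text can be observed with a
real operand that has a visible side effect.  The following is a minimal
sketch with hypothetical names, separate from the pr49644.c testcase.  A
front end that saves the real operand before building the COMPLEX_EXPR
should evaluate it exactly once.

```c
#include <complex.h>

static int calls;

/* Hypothetical real operand with a side effect: counts its own calls.  */
static double
next_value (void)
{
  ++calls;
  return 2.0;
}

/* Multiply a real operand that has a side effect by a complex value and
   report how many times the real operand was evaluated.  */
int
count_real_operand_evaluations (void)
{
  _Complex double s = 3.0 + 1.0 * I;
  volatile _Complex double r;

  calls = 0;
  r = next_value () * s;
  (void) r;
  return calls;
}
```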

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][C] Fixup pointer-int-sum

2011-07-07 Thread Richard Guenther
On Thu, 7 Jul 2011, Joseph S. Myers wrote:

 On Thu, 7 Jul 2011, Richard Guenther wrote:
 
  not overflow (what is actually the C semantics - is the
  multiplication allowed to overflow for unsigned intop?  If not
 
 Overflow is not allowed.  Formally the multiplication is as-if to infinite 
 precision, and then there is undefined behavior if the result of the 
 addition (to infinite precision) is outside the array pointed to - 
 wrapping around by some multiple of the whole address space is not 
 allowed.
 
 In practice, as previously discussed objects half or more of the address 
 space do not work reliably because of the problems doing pointer 
 subtraction, so always using a signed type shouldn't break anything that 
 actually worked reliably (though how unreliable things were with large 
 malloced objects - which unfortunately glibc's malloc can provide - if the 
 source code didn't use pointer subtraction, I don't know).
 
 In GCC's terms half or more of the address space generally means half the 
 range of size_t.  (m32c has ptrdiff_t wider than size_t in some cases.  On 
 such unusual architectures it ought to be possible to have objects whose 
 size is up to SIZE_MAX bytes and have pointer addition and subtraction 
 work reliably, which would suggest using ptrdiff_t for arithmetic in such 
 cases, but the code checking sizes for arrays of constant size uses the 
 signed type corresponding to size_t, so you could only get a larger object 
 through malloc or VLAs.)
 
 The patch is OK.  Unconditionally signed is also OK, though I don't see 
 any advantage over this version.

Ok, I'll defer the decision to the time I have settled on a final
solution to get rid of the (unsigned) sizetype offset operand
for POINTER_PLUS_EXPR.  The least invasive idea is to introduce a
new signed ptrofftype to replace all sizetype conversions at places
we build POINTER_PLUS_EXPRs.  That would favor unconditionally
signed.  The moderate invasive idea is to allow both a signed
and an unsigned ptrofftype (but still force a common precision),
with all the fun that arises from combining (ptr p+ off1) p+ off2
with different signs for the offset operand ...

Thanks,
Richard.


Re: [Patch,testsuite]: Filter more test cases to fit target capabilities

2011-07-07 Thread Mike Stump
On Jul 6, 2011, at 10:26 AM, Georg-Johann Lay wrote:
 Hi, I am struggling against hundreds of fails in the testsuite because
 many cases are not carefully written, e.g. stuff like shifting an int
 by 19 bits if int is only 16 bits wide.

 Ok to commit?

Ok.


Re: PATCH [1/n] X32: Add initial -x32 support

2011-07-07 Thread H.J. Lu
On Thu, Jul 7, 2011 at 6:21 AM, Paolo Bonzini bonz...@gnu.org wrote:
 Did you even _think_ of looking at the sh configury, and do something
 vaguely similar for x86?

 You should not duplicate t-linux64 at all.  Instead, in config.gcc set
 m64/m32 as the default value for with_multilib_list on i386 biarch and
 x86_64.  Pass $with_multilib_list to t-linux64 using
 TM_MULTILIB_CONFIG.  Then, do something like

 comma=,
 MULTILIB_OPTIONS    = $(subst $(comma),/,@TM_MULTILIB_CONFIG@)
 MULTILIB_DIRNAMES   = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS)))
 MULTILIB_OSDIRNAMES  = 64=../lib64
 MULTILIB_OSDIRNAMES += 32=$(if $(wildcard $(shell echo
 $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)
 MULTILIB_OSDIRNAMES += x32=../libx32

 in config/t-linux64.  (Each on one line, apologies for any wrapping)

 The option will be used as --with-multilib-list=m64,m32,mx32 (allowing
 the user to omit some of the variants, too).


This is an excellent suggestion.  Here is the updated patch. It
uses TM_MULTILIB_CONFIG and removes config/i386/t-linux-x32.

Uros, is this OK for trunk to replace the patch you approved earlier?

Thanks.

-- 
H.J.
---
2011-07-07  H.J. Lu  hongjiu...@intel.com

* config.gcc: Support --with-multilib-list for x86 Linux
targets.

* configure.ac: Mention x86-64 for --with-multilib-list.
* configure: Regenerated.

* config/i386/gnu-user64.h (SPEC_64): Support x32.
(SPEC_32): Likewise.
(ASM_SPEC): Likewise.
(LINK_SPEC): Likewise.
(TARGET_THREAD_SSP_OFFSET): Likewise.
(TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise.
(SPEC_X32): New.

* config/i386/i386.h (TARGET_X32): New.
(TARGET_LP64): New.
(LONG_TYPE_SIZE): Likewise.
(POINTER_SIZE): Likewise.
(POINTERS_EXTEND_UNSIGNED): Likewise.
(OPT_ARCH64): Support x32.
(OPT_ARCH32): Likewise.

* config/i386/i386.opt (mx32): New.

* config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New.
(GLIBC_DYNAMIC_LINKERX32): Likewise.
* config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise.
(GLIBC_DYNAMIC_LINKERX32): Likewise.

* config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New.
(BIONIC_DYNAMIC_LINKERX32): Likewise.
(GNU_USER_DYNAMIC_LINKERX32): Likewise.

* config/i386/t-linux64: Support TM_MULTILIB_CONFIG.

* doc/install.texi: Document --with-multilib-list for
Linux/x86-64.

* doc/invoke.texi: Document -mx32.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c77f40b..449409e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1280,6 +1280,22 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
                tm_file="${tm_file} i386/x86-64.h i386/gnu-user64.h i386/linux64.h"
                tm_defines="${tm_defines} TARGET_BI_ARCH=1"
                tmake_file="${tmake_file} i386/t-linux64"
+               x86_multilibs="${with_multilib_list}"
+               if test "$x86_multilibs" = "default"; then
+                       x86_multilibs="m64,m32"
+               fi
+               x86_multilibs=`echo $x86_multilibs | sed -e 's/,/ /g'`
+               for x86_multilib in ${x86_multilibs}; do
+                       case ${x86_multilib} in
+                       m32 | m64 | mx32)
+                               TM_MULTILIB_CONFIG="${TM_MULTILIB_CONFIG},${x86_multilib}"
+                               ;;
+ 

Re: [PATCH] Fix complex {*,/} real or real * complex handling in C FE (PR c/49644)

2011-07-07 Thread Jakub Jelinek
On Thu, Jul 07, 2011 at 02:55:45PM +, Joseph S. Myers wrote:
 On Thu, 7 Jul 2011, Jakub Jelinek wrote:
  For MULT_EXPR and TRUNC_DIV_EXPR, both sides of COMPLEX_EXPR contain
  a copy of the non-complex operand, which means its side-effects can be
  evaluated twice.  For PLUS_EXPR/MINUS_EXPR they appear just in one of
  the operands and thus it works fine as is.
  
  Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
  ok for trunk/4.6?
 
 OK, but I think you need a similar patch for the C++ front end as well.

Indeed, thanks.  Attached is the corresponding C++ patch and simplified
C patch (with c_save_expr calls right in the switch stmt for the cases
that need it instead of another condition before).
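
The hazard being fixed can be sketched in plain C, outside the compiler. This is only an analogy, not the front-end code: `SCALE_TWICE`, `scale_once`, `next_real` and `eval_count` are hypothetical names. The macro pastes the real operand into both the real and the imaginary product, much like the unfixed COMPLEX_EXPR, while the function evaluates it once, which is the effect save_expr is meant to have.

```c
/* Counts how often the real operand expression is evaluated.  */
int eval_count;

/* A real operand with a visible side effect.  */
double
next_real (void)
{
  eval_count++;
  return 2.0;
}

/* Analogue of the bug: X appears in both halves of the result, so its
   side effects happen twice.  */
#define SCALE_TWICE(re, im, x) ((re) = (re) * (x), (im) = (im) * (x))

/* Analogue of the fix: the operand is evaluated once by the caller and
   the saved value is used for both halves.  */
void
scale_once (double *re, double *im, double x)
{
  *re *= x;
  *im *= x;
}
```

Calling `SCALE_TWICE (re, im, next_real ())` bumps `eval_count` by two, whereas `scale_once (&re, &im, next_real ())` bumps it by one. The `*d++ * s` loop in the new tests checks the same property through the pointer increment.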

Jakub
2011-07-07  Jakub Jelinek  ja...@redhat.com

PR c/49644
* typeck.c (cp_build_binary_op): For MULT_EXPR and TRUNC_DIV_EXPR with
one non-complex and one complex argument, call save_expr on both
operands.

* g++.dg/torture/pr49644.C: New test.

--- gcc/cp/typeck.c.jj  2011-06-21 16:45:52.0 +0200
+++ gcc/cp/typeck.c 2011-07-07 17:00:17.0 +0200
@@ -4338,6 +4338,7 @@ cp_build_binary_op (location_t location,
{
case MULT_EXPR:
case TRUNC_DIV_EXPR:
+ op1 = save_expr (op1);
  imag = build2 (resultcode, real_type, imag, op1);
  /* Fall through.  */
case PLUS_EXPR:
@@ -4356,6 +4357,7 @@ cp_build_binary_op (location_t location,
  switch (code)
{
case MULT_EXPR:
+ op0 = save_expr (op0);
  imag = build2 (resultcode, real_type, op0, imag);
  /* Fall through.  */
case PLUS_EXPR:
--- gcc/testsuite/g++.dg/torture/pr49644.C.jj   2011-07-07 17:01:21.0 +0200
+++ gcc/testsuite/g++.dg/torture/pr49644.C  2011-07-07 17:01:27.0 +0200
@@ -0,0 +1,17 @@
+// PR c/49644
+// { dg-do run }
+
+extern "C" void abort ();
+
+int
+main ()
+{
+  _Complex double a[12], *c = a, s = 3.0 + 1.0i;
+  double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b;
+  int i;
+  for (i = 0; i < 6; i++)
+*c++ = *d++ * s;
+  if (c != a + 6 || d != b + 6)
+abort ();
+  return 0;
+}
2011-07-07  Jakub Jelinek  ja...@redhat.com

PR c/49644
* c-typeck.c (build_binary_op): For MULT_EXPR and TRUNC_DIV_EXPR with
one non-complex and one complex argument, call c_save_expr on both
operands.

* gcc.c-torture/execute/pr49644.c: New test.

--- gcc/c-typeck.c.jj   2011-05-31 08:03:10.0 +0200
+++ gcc/c-typeck.c  2011-07-07 11:47:31.0 +0200
@@ -10040,6 +10040,7 @@ build_binary_op (location_t location, en
{
case MULT_EXPR:
case TRUNC_DIV_EXPR:
+ op1 = c_save_expr (op1);
  imag = build2 (resultcode, real_type, imag, op1);
  /* Fall through.  */
case PLUS_EXPR:
@@ -10060,6 +10061,7 @@ build_binary_op (location_t location, en
  switch (code)
{
case MULT_EXPR:
+ op0 = c_save_expr (op0);
  imag = build2 (resultcode, real_type, op0, imag);
  /* Fall through.  */
case PLUS_EXPR:
--- gcc/testsuite/gcc.c-torture/execute/pr49644.c.jj    2011-07-07 11:48:34.0 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr49644.c   2011-07-07 11:35:52.0 +0200
@@ -0,0 +1,16 @@
+/* PR c/49644 */
+
+extern void abort (void);
+
+int
+main ()
+{
+  _Complex double a[12], *c = a, s = 3.0 + 1.0i;
+  double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b;
+  int i;
+  for (i = 0; i < 6; i++)
+*c++ = *d++ * s;
+  if (c != a + 6 || d != b + 6)
+abort ();
+  return 0;
+}


Re: [ARM] Deprecate -mwords-little-endian

2011-07-07 Thread Richard Sandiford
Richard Earnshaw rearn...@arm.com writes:
 On 29/06/11 12:28, Richard Sandiford wrote:
 ARM has an option called -mwords-little-endian that provides big-endian
 compatibility with pre-2.8 compilers.  When I asked Richard about it,
 he seemed to think it had outlived its usefulness, so this patch
 deprecates it.  We can then remove it once 4.7 is out.
 
 Tested on arm-linux-gnueabi.  OK to install?  If so, I'll do a patch
 for the web page as well.
 

 Please also update the in-line help text in arm.opt.  OK with that change.

Thanks.  I've attached the patch I applied below.

How's this for the docs change?

--
Index: htdocs/gcc-4.7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.20
diff -u -r1.20 changes.html
--- htdocs/gcc-4.7/changes.html 6 Jul 2011 23:37:11 -   1.20
+++ htdocs/gcc-4.7/changes.html 7 Jul 2011 15:17:03 -
@@ -43,6 +43,9 @@
 only intended as a migration aid from SunOS 4 to SunOS 5.  The
 <code>-compat-bsd</code> compiler option is not recognized any
 longer.</li>
+
+<li>The ARM port's <code>-mwords-little-endian</code> option has
+been deprecated.  It will be removed in a future release.</li>
   </ul>
 
 <h2>General Optimizer Improvements</h2>
--

I wondered about expanding it a bit (describing why the option was added
and why it's no longer needed).  It felt like overkill for such a niche
option though.

Richard


gcc/
* doc/invoke.texi (mwords-little-endian): Deprecate.
* config/arm/arm.opt (mwords-little-endian): Likewise.
* config/arm/arm.c (arm_option_override): Warn about the deprecation
of -mwords-little-endian.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi 2011-07-04 09:09:02.0 +0100
+++ gcc/doc/invoke.texi 2011-07-04 13:50:06.0 +0100
@@ -10239,7 +10239,7 @@ Generate code for a little-endian word o
 order.  That is, a byte order of the form @samp{32107654}.  Note: this
 option should only be used if you require compatibility with code for
 big-endian ARM processors generated by versions of the compiler prior to
-2.8.
+2.8.  This option is now deprecated.
 
 @item -mcpu=@var{name}
 @opindex mcpu
Index: gcc/config/arm/arm.opt
===
--- gcc/config/arm/arm.opt  2011-06-22 16:46:28.0 +0100
+++ gcc/config/arm/arm.opt  2011-07-04 13:52:38.0 +0100
@@ -235,7 +235,7 @@ Tune code for the given processor
 
 mwords-little-endian
 Target Report RejectNegative Mask(LITTLE_WORDS)
-Assume big endian bytes, little endian words
+Assume big endian bytes, little endian words.  This option is deprecated.
 
 mvectorize-with-neon-quad
 Target Report Mask(NEON_VECTORIZE_QUAD)
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c2011-07-01 05:37:51.0 +0100
+++ gcc/config/arm/arm.c2011-07-04 13:50:06.0 +0100
@@ -1483,6 +1483,10 @@ arm_option_override (void)
   if (TARGET_APCS_FLOAT)
     warning (0, "passing floating point arguments in fp regs not yet supported");
 
+  if (TARGET_LITTLE_WORDS)
+    warning (OPT_Wdeprecated, "%<mwords-little-endian%> is deprecated and "
+             "will be removed in a future release");
+
   /* Initialize boolean versions of the flags, for use in the arm.md file.  */
   arm_arch3m = (insn_flags & FL_ARCH3M) != 0;
   arm_arch4 = (insn_flags & FL_ARCH4) != 0;


Re: PATCH [1/n] X32: Add initial -x32 support

2011-07-07 Thread Paolo Bonzini
On Thu, Jul 7, 2011 at 17:12, Uros Bizjak ubiz...@gmail.com wrote:
 On Thu, Jul 7, 2011 at 5:02 PM, H.J. Lu hjl.to...@gmail.com wrote:

 Did you even _think_ of looking at the sh configury, and do something
 vaguely similar for x86?

 You should not duplicate t-linux64 at all.  Instead, in config.gcc set
 m64/m32 as the default value for with_multilib_list on i386 biarch and
 x86_64.  Pass $with_multilib_list to t-linux64 using
 TM_MULTILIB_CONFIG.  Then, do something like

 comma=,
 MULTILIB_OPTIONS    = $(subst $(comma),/,@TM_MULTILIB_CONFIG@)
 MULTILIB_DIRNAMES   = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS)))
 MULTILIB_OSDIRNAMES  = 64=../lib64
 MULTILIB_OSDIRNAMES += 32=$(if $(wildcard $(shell echo
 $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)
 MULTILIB_OSDIRNAMES += x32=../libx32

 in config/t-linux64.  (Each on one line, apologies for any wrapping)

 The option will be used as --with-multilib-list=m64,m32,mx32 (allowing
 the user to omit some of the variants, too).


 This is an excellent suggestion.  Here is the updated patch. It
 uses TM_MULTILIB_CONFIG and removes config/i386/t-linux-x32.

 Uros, is this OK for trunk to replace the patch you approved earlier?

 Er, the approval was for x86 parts, I will leave approval for build
 parts to Paolo.

Yes, build parts are okay too.

Paolo


Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Bernd Schmidt
Whee! Thanks for reviewing (reviving?) this old thing.

I should be posting an up-to-date version of this, but for the moment it
has to wait until dwarf2out is sorted out, and I'm rather busy with
other stuff. I hope to squeeze this in in the not too distant future.

I'll try to answer some of the questions now...

On 07/07/11 16:34, Richard Sandiford wrote:
 Bernd Schmidt ber...@codesourcery.com writes:
 This adds the actual optimization, and reworks the JUMP_LABEL handling
 for return blocks. See the introduction mail or the new comment ahead of
 thread_prologue_and_epilogue_insns for more notes.
 
 It seems a shame to have both (return) and (simple_return).

Yes, but the distinction exists and must be represented somehow - you
can have both in the same function.

 You said
 that we need the distinction in order to cope with targets like ARM,
 whose (return) instruction actually performs some of the epilogue too.
 It feels like the load of the saved registers should really be expressed
 in rtl, in parallel with the return.  I realise that'd prevent
 conditional returns though.  Maybe there's no elegant way out...

It certainly would make it harder to transform branches to conditional
returns. It would also require examining every port to see if it needs
changes to its return patterns. It probably only affects ARM though, but
that target is important enough that we should support the feature (i.e.
conditional returns that pop registers).

If we described conditional returns only as COND_EXEC maybe... AFAICT
only ia64, arm, frv and c6x have conditional return. I'll have to think
about it.

Note that some interface changes will be necessary in any case - passing
NULL as a new jump label simply isn't informative enough when
redirecting a jump; we must be able to distinguish between the two forms
of return at this level. So the ret_rtx/simple_return_rtx may turn out
to be the simplest solution after all.

 With the hidden loads, it seems like we'll have a situation in which the
 values of call-saved registers will appear to be different for different
 real incoming edges to the exit block.

Probably true, but I doubt we have any code that would notice. Can you
imagine anything that would care?

 Is JUMP_LABEL ever null after this change?  (In fully-complete rtl
 sequences, I mean.)  It looked like some of the null checks in the
 patch might not be necessary any more.

It shouldn't be, and it's possible that a few of these tests survived
when they shouldn't have.

 JUMP_LABEL also seems somewhat misnamed after this change; maybe
 JUMP_TARGET would be better?

Maybe. I dread the renaming patch though.

 It'd also be nice to get rid of all these big blocks of code that are
 conditional on preprocessor macros, but I realise you're just following
 existing practice in the surrounding code, so again it can be left to
 a future cleanup.

Yeah, this function is quite horrid - so many different paths through it.

However, it looks like the only target without HAVE_prologue is actually
pdp11, so we're carrying some unnecessary baggage for purely
retrocomputing purposes. Paul, can you fix that?

ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode);
 +  simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode);
 
 It'd be nice to s/ret_rtx/return_rtx/ for consistency, but that can
 happen anytime.

Unfortunately there's another macro called return_rtx.

 + df_regs_ever_live_p (regno))
 +  return true;
 
 This can be done as a follow-up, but it looks like df should be using
 a HARD_REG_SET here, and that we should be able to get at it directly.

For the df_regs_ever_live thing? Could change that, yes.

[...]
 AIUI, this prevents the optimisation for things like
 
   if (a) {
 switch (b) {
   case 1:
 ...stuff that requires a frame...
 break;
   case 2:
 ...stuff that requires a frame...
 break;
   default:
 ...stuff that doesn't require a frame...
 break;
 }
   }
 
 The switch won't be in ANTIC, but it will have two successors that are.
 Is that right?
 
 Would it work to do something like:
 
[...]

IIRC the problem here is making sure to match up prologues and epilogues
- the latter should not occur on any path that had a prologue set up and
vice versa. I think something more clever would break on e.g.

   if (c)
 goto label;
   if (a) {
 switch (b) {
   case 1:
 ...stuff that requires a frame...
 break;
   case 2:
 ...stuff that requires a frame...
 break;
   default:
 ...stuff that doesn't require a frame...
label:
 ...more stuff that doesn't require a frame...
 break;
 }
   }

If you add a prologue before the switch, two paths join at label where
one needs a prologue and the other doesn't.

 Does the JUMP_LABEL (returnjump) = ret_rtx; handle targets that
 use things like (set (pc) (reg RA)) as their return?  Probably worth
 adding a comment if so.

It simply 

Re: [ARM] Deprecate -mwords-little-endian

2011-07-07 Thread Richard Earnshaw
On 07/07/11 16:18, Richard Sandiford wrote:
 Richard Earnshaw rearn...@arm.com writes:
 On 29/06/11 12:28, Richard Sandiford wrote:
 ARM has an option called -mwords-little-endian that provides big-endian
 compatibility with pre-2.8 compilers.  When I asked Richard about it,
 he seemed to think it had outlived its usefulness, so this patch
 deprecates it.  We can then remove it once 4.7 is out.

 Tested on arm-linux-gnueabi.  OK to install?  If so, I'll do a patch
 for the web page as well.


 Please also update the in-line help text in arm.opt.  OK with that change.
 
 Thanks.  I've attached the patch I applied below.
 
 How's this for the docs change?
 
 --
 Index: htdocs/gcc-4.7/changes.html
 ===
 RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
 retrieving revision 1.20
 diff -u -r1.20 changes.html
 --- htdocs/gcc-4.7/changes.html   6 Jul 2011 23:37:11 -   1.20
 +++ htdocs/gcc-4.7/changes.html   7 Jul 2011 15:17:03 -
 @@ -43,6 +43,9 @@
  only intended as a migration aid from SunOS 4 to SunOS 5.  The
  <code>-compat-bsd</code> compiler option is not recognized any
  longer.</li>
 +
 +<li>The ARM port's <code>-mwords-little-endian</code> option has
 +been deprecated.  It will be removed in a future release.</li>
 </ul>
  

Looks fine to me, but please allow 24 hours for the web maintainers to
comment if they wish.

R.



Re: Generic hwloop support library

2011-07-07 Thread Bernd Schmidt
On 07/05/11 21:25, Richard Sandiford wrote:

 (Could you bootstrap this on x86_64 to check for things like that?

That has no loop_end pattern so it wouldn't be much of a test, but a
x86_64 x bfin compiler has no warnings in this file with the intptr_t
thing fixed.

 A C bootstrap only should be fine of course, since the code isn't
 going to be run.)
 
 +  hwloop_info loops = NULL;
 
 Unnecessary initialisation (or at least, it should be).

? The value is used inside the loop to initialize next of the first loop.

Committed with these changes (except the last).


Bernd


Re: [PATCH] Fix folding of -(unsigned)(a * -b)

2011-07-07 Thread Michael Matz
Hi,

On Thu, 7 Jul 2011, Richard Guenther wrote:

 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c  (revision 175962)
 +++ gcc/fold-const.c  (working copy)
 @@ -7561,7 +7561,7 @@ fold_unary_loc (location_t loc, enum tre
if (arg0)
  {
if (CONVERT_EXPR_CODE_P (code)
 -   || code == FLOAT_EXPR || code == ABS_EXPR)
 +   || code == FLOAT_EXPR || code == ABS_EXPR || code == NEGATE_EXPR)
   {
 /* Don't use STRIP_NOPS, because signedness of argument type
matters.  */

Um, so why would stripping a signchange ever be okay?  There are many 
other unary codes that behave similar enough to FLOAT_EXPR, or 
CONVERT_EXPR that it's not obvious to me why those would allow sign 
stripping but the above not.  When the operand is float or fixed point 
types then STRIP_SIGN_NOPS and STRIP_NOPS aren't different, and when the 
operands are integer types I don't see how we can ignore sign-changing 
nops.  I'm thinking about:
VEC_UNPACK_HI_EXPR, VEC_UNPACK_LO_EXPR and PAREN_EXPR
Perhaps BIT_NOT_EXPR.  Perhaps also NON_LVALUE_EXPR.  All these can 
conceivably have integer operands, where signedness seems to matter.

I think these are harmless: CONJ_EXPR, FIXED_CONVERT_EXPR, FIX_TRUNC_EXPR, 
ADDR_SPACE_CONVERT_EXPR as their operands are either float/fixed-point 
types or pointers, but as said in those cases STRIP_NOPS and 
STRIP_SIGN_NOPS are equivalent.

So, why not simply always use STRIP_SIGN_NOPS?


Ciao,
Michael.


Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-07 Thread Michael Meissner
On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:
 On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
  This patch adds an option to not load the static chain (r11) for 64-bit 
  PowerPC
  calls through function pointers (or virtual function).  Most of the 
  languages
  on the PowerPC do not need the static chain being loaded when called, and
  adding this instruction can slow down code that calls very short functions.
 
  In addition, if the function does not call alloca, setjmp or deal with
  exceptions where the stack is modified, the compiler can move the store of 
  the
  TOC value for the current function to the prologue of the function, rather 
  than
  at each call site.
 
  The effect of these patches is to speed up 464.h264ref in the Spec 2006
  benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the
  save of the TOC register is hoisted).  I believe this is due to the load of 
  the
  current function's TOC (r2) having to wait until the store queue is drained
  with the store just before the call.
 
  Unfortunately, I do see a 3% slowdown in 429.mcf, which I don't know what 
  the
  cause is.
 
  I have bootstrapped the compiler and saw that there were no regressions in 
  make
  check.  Is it ok to install in the trunk?
 
 Hum.  Can't the compiler figure this out itself per-call-site?  At least
 the name of the command-line switch -m[no-]r11 is meaningless to me.
 Points-to information should be able to tell you if the function pointer
 points to a nested function.

No, the compiler cannot figure it out.  Consider the case where a function is
passed a pointer to a function, such as the standard library function qsort.
The call may come from any random module, that isn't part of the compilation
suite, such as if the function being passed the pointer is in a shared library.
You don't know whether the function pointed to uses the static chain
(i.e. nested function call with trampoline, call to PL/I, or other language
that does use the static chain, which is part of the ABI).  The point of the
switch is similar to -ffast-math where you say you are willing to ignore some
corner cases in the standard in order to get better performance.

I certainly can call the switch -mno-static-chain, which is perhaps more
meaningful (at least to us compiler folk, I'm not sure static chain means much
to the normal programmer).

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899


Ping Re: Remove config.gcc support for *local* configurations

2011-07-07 Thread Joseph S. Myers
Ping.  This patch 
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02408.html is pending 
review.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix dead_debug_insert_before ICE (PR debug/49522, take 3)

2011-07-07 Thread Eric Botcazou
 So, here is a new patch which doesn't need two loops, just might go a
 little bit backwards to unchain dead_debug_use for the reset insn.

 It still needs the change of the gcc_assert (reg) into if (reg == NULL)
 return;, because the debug->used bitmap is with this sometimes a false
 positive (saying that a regno is referenced even when it isn't).
 But here it is IMHO better to occasionally live with the false positives,
 which just means we'll sometimes once walk the chain in dead_debug_reset
 or dead_debug_insert_before before resetting it, than to recompute the
 bitmap (we'd need a second loop for that, bitmap_clear (debug->used) and
 populate it again).

Fine with me for both points, but move some bits of these explanations to the 
code itself because this isn't obvious.  For example see below.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

 2011-07-07  Jakub Jelinek  ja...@redhat.com

   PR debug/49522
   * df-problems.c (dead_debug_reset): Remove dead_debug_uses
   referencing debug insns that have been reset.
   (dead_debug_insert_before): Don't assert reg is non-NULL,
   instead return immediately if it is NULL.

   * gcc.dg/debug/pr49522.c: New test.

OK, thanks.

 --- gcc/df-problems.c.jj  2011-07-07 02:32:45.928547053 +0200
 +++ gcc/df-problems.c 2011-07-07 09:57:34.846464573 +0200
 @@ -3096,6 +3096,7 @@ static void
  dead_debug_reset (struct dead_debug *debug, unsigned int dregno)
  {
    struct dead_debug_use **tailp = &debug->head;
 +  struct dead_debug_use **insnp = &debug->head;
struct dead_debug_use *cur;
rtx insn;

 @@ -3113,9 +3114,21 @@ dead_debug_reset (struct dead_debug *deb
          debug->to_rescan = BITMAP_ALLOC (NULL);
        bitmap_set_bit (debug->to_rescan, INSN_UID (insn));
        XDELETE (cur);
 +      if (tailp != insnp && DF_REF_INSN ((*insnp)->use) == insn)
 +        tailp = insnp;

/* If the current use isn't the first one attached to INSN, go back to this
   first use.  We assume that the uses attached to an insn are adjacent.  */

 +      while ((cur = *tailp) && DF_REF_INSN (cur->use) == insn)
 +        {
 +          *tailp = cur->next;
 +          XDELETE (cur);
 +        }
 +      insnp = tailp;

/* Then remove all the other uses attached to INSN.  */

   }
else
 -      tailp = &(*tailp)->next;
 +      {
 +        if (DF_REF_INSN ((*insnp)->use) != DF_REF_INSN (cur->use))
 +          insnp = tailp;
 +        tailp = &(*tailp)->next;
 +      }
  }
  }

 @@ -3174,7 +3187,8 @@ dead_debug_insert_before (struct dead_de
        tailp = &(*tailp)->next;
  }

 -  gcc_assert (reg);
 +  if (reg == NULL)
 +return;

/* We may have dangling bits in debug->used for registers that were part of
   a multi-register use, one component of which has been reset.  */


-- 
Eric Botcazou


Re: CFT: Move unwinder to toplevel libgcc

2011-07-07 Thread Steve Ellcey
On Thu, 2011-07-07 at 15:08 +0200, Rainer Orth wrote:
 Tristan Gingold ging...@adacore.com writes:
 
  Otherwise, the patch is unchanged from the original submission:
  
 [build] Move unwinder to toplevel libgcc
 http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01452.html
  
  Unfortunately, it hasn't seen much comment.  I'm now looking for testers
  especially on platforms with more change and approval of those parts:
  
  * Several IA-64 targets:
  
 ia64*-*-linux*
 ia64*-*-hpux*
 ia64-hp-*vms*
 
  For ia64-hp-vms, consider your patch approved if the parts for ia64 are.
  In case of break, I will fix them.
 
 In that case, perhaps Steve could have a look?  I'd finally like to make
 some progress on this patch.
 
 Thanks.
 Rainer

I just tried builds on ia64 linux and HP-UX and both builds failed.  I
am re-trying the HP-UX build with --with-system-libunwind to see if that
changes things but that should be the default on IA64 HP-UX.

On Linux (debian) the build stopped with:

/test/big-foot1/gcc/nightly/gcc-ia64-debian-linux-gnu-trunk/ia64-debian-linux-gnu/bin/ranlib
 libgcov.a
make[3]: *** No rule to make target
`/test/big-foot1/gcc/nightly/src/trunk/libgcc/unwind-sjlj.c', needed by
`unwind-sjlj.o'.  Stop.
make[3]: Leaving directory
`/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc/ia64-debian-linux-gnu/libgcc'
make[2]: *** [all-stage1-target-libgcc] Error 2
make[2]: Leaving directory
`/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory
`/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc'
make: *** [bootstrap] Error 2


The patch appeared to install correctly into my source tree and I ran 
autoreconf to regenerate the
configure files.  It looks like patch didn't handle the unwind files that 
moved.  I will try doing
that by hand and see if that fixes things.

Steve Ellcey
s...@cup.hp.com



Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-07 Thread Richard Guenther
On Thu, Jul 7, 2011 at 5:47 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:
 On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
  This patch adds an option to not load the static chain (r11) for 64-bit 
  PowerPC
  calls through function pointers (or virtual function).  Most of the 
  languages
  on the PowerPC do not need the static chain being loaded when called, and
  adding this instruction can slow down code that calls very short functions.
 
  In addition, if the function does not call alloca, setjmp or deal with
  exceptions where the stack is modified, the compiler can move the store of 
  the
  TOC value for the current function to the prologue of the function, rather 
  than
  at each call site.
 
  The effect of these patches is to speed up 464.h264ref in the Spec 2006
  benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but 
  the
  save of the TOC register is hoisted).  I believe this is due to the load 
  of the
  current function's TOC (r2) having to wait until the store queue is drained
  with the store just before the call.
 
  Unfortunately, I do see a 3% slowdown in 429.mcf, which I don't know what 
  the
  cause is.
 
  I have bootstrapped the compiler and saw that there were no regressions in 
  make
  check.  Is it ok to install in the trunk?

 Hum.  Can't the compiler figure this out itself per-call-site?  At least
 the name of the command-line switch -m[no-]r11 is meaningless to me.
 Points-to information should be able to tell you if the function pointer
 points to a nested function.

 No, the compiler cannot figure it out.  Consider the case where a function is
 passed a pointer to a function, such as the standard library function qsort.
 The call may come from any random module, that isn't part of the compilation
 suite, such as if the function being passed the pointer is in a shared 
 library.
 You don't know whether the function pointed to uses the static chain
 (i.e. nested function call with trampoline, call to PL/I, or other language
 that does use the static chain, which is part of the ABI).  The point of the
 switch is similar to -ffast-math where you say you are willing to ignore some
 corner cases in the standard in order to get better performance.

Well, I guess you don't propose to build glibc with -mno-r11?  The compiler
certainly can't figure out in _all_ cases - but it should be able to handle
most of the cases (with LTO even more cases) ok, no?

I also wonder why loading a register is so expensive compared to the
actual call ...

 I certainly can call the switch -mno-static-chain, which is perhaps more
 meaningful (at least to us compiler folk, I'm not sure static chain means much
 to the normal programmer).

Well, that's up to the target maintainers to decide, maybe
-mno-nested-functions instead?

Richard.


Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Richard Earnshaw
On 07/07/11 15:34, Richard Sandiford wrote:
 It seems a shame to have both (return) and (simple_return).  You said
 that we need the distinction in order to cope with targets like ARM,
 whose (return) instruction actually performs some of the epilogue too.
 It feels like the load of the saved registers should really be expressed
 in rtl, in parallel with the return.  I realise that'd prevent
 conditional returns though.  Maybe there's no elegant way out...

You'd still need to deal with distinct returns for shrink-wrapped code
when the full (return) expands to

ldm sp, {regs..., pc}

The shrink wrapped version would always be
bx  lr

There are also cases (eg on v4T) where the Thumb return sequence
sometimes has to pop into a lo register before branching to that return
address, eg

pop {r3}
bx  r3

in order to get interworking.

R.



Re: [PATCH] Make VRP optimize useless conversions

2011-07-07 Thread Michael Matz
Hi,

On Thu, 7 Jul 2011, Richard Guenther wrote:

 +   tree rhs1 = gimple_assign_rhs1 (stmt);
 +   gimple def_stmt = SSA_NAME_DEF_STMT (rhs1);
 +   value_range_t *final, *inner;
 + 
 +   /* Obtain final and inner value-ranges for a conversion
 +  sequence (final-type)(intermediate-type)inner-type.  */
 +   final = get_value_range (gimple_assign_lhs (stmt));
 +   if (final->type != VR_RANGE)
 + return false;
 +   if (!is_gimple_assign (def_stmt)
 +   || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
 + return false;
 +   rhs1 = gimple_assign_rhs1 (def_stmt);
 +   if (TREE_CODE (rhs1) != SSA_NAME)
 + return false;
 +   inner = get_value_range (rhs1);
 +   if (inner->type != VR_RANGE)
 + return false;
 +   if (!tree_int_cst_equal (final->min, inner->min)
 +   || !tree_int_cst_equal (final->max, inner->max))
 + return false;

I think that's a bit too conservative.  Granted in current VRP it might 
work, but think about an intermediate truncation plus widening:

  short s;
  short d = (short)(signed char)s;

It wouldn't be wrong for VRP to assign d the range [-16384,16383], 
suboptimal but correct.  That would trigger your function in removing the 
truncation, and _that_ would be incorrect.  The bounds of VRP aren't 
reliably tight.  You probably want to recheck if the intermediate 
conversion isn't truncating the known input range of rhs1.
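
To illustrate the concern, here is a minimal C sketch (not part of the patch
under review; the helper names are made up for this example).  Whether the
intermediate (signed char) conversion can be dropped depends on the actual
input range of s.  Note that converting an out-of-range value to signed char
is implementation-defined in C; GCC documents reduction modulo 2^N, so with
an 8-bit signed char the value 300 is expected to become 44 there.

```c
#include <limits.h>

/* Hypothetical helper: the conversion chain from the example above,
   (short)(signed char)s.  */
short convert_via_signed_char (short s)
{
  return (short) (signed char) s;
}

/* Hypothetical helper: nonzero when S already fits in signed char, i.e.
   when the intermediate conversion does not truncate.  */
int fits_signed_char (short s)
{
  return s >= SCHAR_MIN && s <= SCHAR_MAX;
}

/* Hypothetical helper: nonzero when the whole chain leaves S unchanged,
   which is the condition under which removing it would be safe.  */
int chain_is_identity (short s)
{
  return convert_via_signed_char (s) == s;
}
```

For s = 100 the chain is the identity and the truncation could be removed;
for s = 300 it is not, which is why the known input range of rhs1 has to be
checked against the intermediate type.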


Ciao,
Michael.


Re: [testsuite] ARM wmul tests: require arm_dsp_multiply

2011-07-07 Thread Richard Earnshaw
On 06/07/11 18:33, Janis Johnson wrote:
 On 06/29/2011 06:25 AM, Richard Earnshaw wrote:
 On 23/06/11 22:38, Janis Johnson wrote:
 Tests wmul-[1234].c and mla-2.c in gcc.target/arm require support that
 the arm backend identifies as TARGET_DSP_MULTIPLY.  The tests all
 specify a -march option with that support, but it is overridden by
 multilib flags.

 This patch adds a new effective target, arm_dsp_multiply, and requires
 it for those tests instead of having them specify a -march value.  This
 means that the tests will be skipped for older targets and test coverage
 relies on testing for some newer multilibs.

 The same effective target is needed for tests smlaltb-1.c, smlaltt-1.c,
 smlatb-1.c, and smlatt-1.c, but those also need to be renamed so the
 scans don't pass just because the file name is in the assembly file.

 OK for trunk, and later for 4.6?

 (btw, I'm currently testing ARM compile-only tests with 43 sets of
 multilib flags)


 I've recently approved a patch from James Greenhalgh
 (http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01852.html) that defines
 __ARM_DSP_MULTIPLY when these features are available.  That should
 simplify your target-supports change and also serve as a check that we
 aren't erroneously defining that macro.

 R.
 
 This version uses the new macro from James Greenhalgh, making the
 effective-target check trivial.  The patch removes -march options from
 the tests, and adds a tab to the scans in smla*.c so the scan won't
 match the file name; there are other arm tests that use tab in the
 search target.
 
 OK for trunk, and later for 4.6?  Putting this patch on 4.6 requires the
 new macro there as well.
 
 

I have no objections if the branch maintainers are happy with this.

R.



Re: [PATCH] New IPA-CP with real function cloning

2011-07-07 Thread Jan Hubicka
Hi,
patch is long, so let me review it in more passes.
 
 
 2011-06-22  Martin Jambor  mjam...@suse.cz
 
   * ipa-prop.h: Include alloc-pool.h.
   (ipa_lattice_type): Removed.
   (ipcp_value_source): New type.
   (ipcp_value): Likewise.
   (ipcp_values_pool): Declare.
   (ipcp_sources_pool): Likewise.
   (ipa_param_descriptor): Removed.
   (ipcp_lattice): Removed fields type and constant. Added fields decl,
   values, values_count, contains_variable, bottom, used and virt_call.
   (ipa_node_params): New fields lattices, known_vals,
   clone_for_all_contexts and node_dead, removed fields params and
   count_scale.
   (ipa_get_param): Updated.
   (ipa_param_cannot_devirtualize_p): Removed.
   (ipa_param_types_vec_empty): Likewise.
   (ipa_edge_args): New field next_edge_clone.
   (ipa_func_list): Removed.
   (ipa_init_func_list): Removed declaration.
   (ipa_push_func_to_list_1): Likewise.
   (ipa_pop_func_from_list): Likewise.
   (ipa_push_func_to_list): Removed.
   (ipa_lattice_from_jfunc): Remove declaration.
   (ipa_get_jf_pass_through_result): Declare.
   (ipa_get_jf_ancestor_result): Likewise.
   (ipa_value_from_jfunc): Likewise.
   (ipa_get_lattice): Update.
   (ipa_lat_is_single_const): New function.
   * ipa-prop.c (ipa_push_func_to_list_1): Removed.
   (ipa_init_func_list): Likewise.
   (ipa_pop_func_from_list): Likewise.
   (ipa_get_param_decl_index): Fix coding style.
   (ipa_populate_param_decls): Update to use new lattices.
   (ipa_initialize_node_params): Likewise.
   (visit_ref_for_mod_analysis): Likewise.
   (ipa_analyze_params_uses): Likewise.
   (ipa_free_node_params_substructures): Likewise.
   (ipa_edge_duplication_hook): Add the new edge to the list of edge
   clones.
   (ipa_node_duplication_hook): Update to use new lattices.
   (ipa_free_all_structures_after_ipa_cp): Free alloc pools.
   (ipa_free_all_structures_after_iinln): Likewise.
   (ipa_write_node_info): Update to use new lattices.
   (ipa_read_node_info): Likewise.
   (ipa_get_jf_pass_through_result): New function.
   (ipa_get_jf_ancestor_result): Likewise.
   (ipa_value_from_jfunc): Likewise.
   (ipa_cst_from_jfunc): Reimplemented using ipa_value_from_jfunc.
   * ipa-cp.c: Reimplemented.
   * params.def (PARAM_DEVIRT_TYPE_LIST_SIZE): Removed.
   (PARAM_IPA_CP_VALUE_LIST_SIZE): New parameter.
   * Makefile.in (IPA_PROP_H): Added alloc-pool.h to dependencies.
 
   * doc/invoke.texi (devirt-type-list-size): Removed description.
   (ipa-cp-value-list-size): Added description.
 
   * testsuite/gcc.dg/ipa/ipa-1.c: Updated testcase dump scan.
   * testsuite/gcc.dg/ipa/ipa-2.c: Likewise.
   * testsuite/gcc.dg/ipa/ipa-3.c: Likewise and made functions static.
   * testsuite/gcc.dg/ipa/ipa-4.c: Updated testcase dump scan.
   * testsuite/gcc.dg/ipa/ipa-5.c: Likewise.
   * testsuite/gcc.dg/ipa/ipa-7.c: Xfail test.
   * testsuite/gcc.dg/ipa/ipa-8.c: Updated testcase dump scan.
   * testsuite/gcc.dg/ipa/ipacost-1.c: Likewise.
   * testsuite/gcc.dg/ipa/ipacost-2.c: Likewise.
   * testsuite/gcc.dg/ipa/ipcp-1.c: New test.
   * testsuite/gcc.dg/ipa/ipcp-2.c: Likewise.
   * testsuite/gcc.dg/tree-ssa/ipa-cp-1.c: Updated testcase.

 /* Interprocedural analyses.
Copyright (C) 2005, 2007, 2008, 2009, 2010
2011
Free Software Foundation, Inc.
 
 
 /* The following definitions and interfaces are used by
interprocedural analyses or parameters.  */
 
 /* ipa-prop.c stuff (ipa-cp, indirect inlining):  */

I have been thinking about this a bit: we could probably make ipa-prop
and ipa-inline-analysis stand-alone analysis passes, instead of
something called from either the inliner or the ipa-cp analysis stage. But
that could be done incrementally.

 
 /* A jump function for a callsite represents the values passed as actual
arguments of the callsite. There are three main types of values :
 
Pass-through - the caller's formal parameter is passed as an actual
   argument, possibly one simple operation performed on it.
Constant - a constant (is_gimple_ip_invariant)is passed as an actual
   argument.
Unknown  - neither of the above.
 
IPA_JF_CONST_MEMBER_PTR stands for C++ member pointers, it is a special
constant in this regard.  Other constants are represented with 
 IPA_JF_CONST.

While we are at the docs, I would expand this a bit. It seems to me that for
someone not familiar with the concept it is not at all clear why member
pointers are special (i.e. explain that they are non-gimple-regs etc.).

 
IPA_JF_ANCESTOR is a special pass-through jump function, which means that
the result is an address of a part of the object pointed to by the formal
parameter to which the function refers.  It is mainly intended to represent
getting 

[patch tree-optimization]: [1 of 3]: Boolify compares more

2011-07-07 Thread Kai Tietz
Hello,

This patch, the first of the series, adds support for folding and detecting
one-bit precision bitwise operations to fold and to some helper routines.
It is a prerequisite for the next patch of the series, which boolifies
comparisons.
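
As a rough illustration of why one-bit precision matters here (a sketch with
hypothetical helper names, not code from the patch): when every operand is
restricted to the values 0 and 1, bitwise AND/IOR/XOR/NOT behave like their
TRUTH_* counterparts, so an inversion can be pushed into the operands using
De Morgan's laws.  One-bit precision is modeled below by masking with 1.

```c
/* Bitwise NOT in a one-bit unsigned type, modeled by masking with 1.  */
unsigned not1 (unsigned a)
{
  return ~a & 1u;
}

/* Returns 1 if, for all one-bit operands, inverting a bitwise operation
   gives the same result as the rewritten form, and 0 otherwise.  */
int one_bit_inversions_agree (void)
{
  for (unsigned a = 0; a <= 1; a++)
    for (unsigned b = 0; b <= 1; b++)
      {
        /* !(a & b) == !a | !b  */
        if (not1 (a & b) != (not1 (a) | not1 (b)))
          return 0;
        /* !(a | b) == !a & !b  */
        if (not1 (a | b) != (not1 (a) & not1 (b)))
          return 0;
        /* !(a ^ b) == !a ^ b, i.e. inverting one operand is enough.  */
        if (not1 (a ^ b) != (not1 (a) ^ b))
          return 0;
      }
  return 1;
}
```

The check is exhaustive because a one-bit type has only two values per
operand.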

Bootstrapped and regression tested for all standard-languages (plus
Ada and Obj-C++) on host x86_64-pc-linux-gnu.

OK to apply?

Regards,
Kai

ChangeLog

2011-07-07  Kai Tietz  kti...@redhat.com

* fold-const.c (fold_truth_not_expr): Handle
one bit precision bitwise operations.
(fold_range_test): Likewise.
(fold_truthop): Likewise.
(fold_binary_loc): Likewise.
(fold_truth_andor): Function replaces truth_andor
label.
(fold_ternary_loc): Use truth_value_type_p instead
of truth_value_p.
* gimple.c (canonicalize_cond_expr_cond): Likewise.
* gimplify.c (gimple_boolify): Likewise.
* tree-ssa-structalias.c (find_func_aliases): Likewise.
* tree-ssa-forwprop.c (truth_valued_ssa_name): Likewise.
* tree.h (truth_value_type_p): New function.
(truth_value_p): Implemented as macro via truth_value_type_p.


Index: gcc-head/gcc/fold-const.c
===
--- gcc-head.orig/gcc/fold-const.c
+++ gcc-head/gcc/fold-const.c
@@ -3074,20 +3074,35 @@ fold_truth_not_expr (location_t loc, tre
 case INTEGER_CST:
   return constant_boolean_node (integer_zerop (arg), type);

+case BIT_AND_EXPR:
+  if (integer_onep (TREE_OPERAND (arg, 1)))
+   return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0));
+  if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1)
+return NULL_TREE;
+  /* fall through */
 case TRUTH_AND_EXPR:
   loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
   loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
-  return build2_loc (loc, TRUTH_OR_EXPR, type,
+  return build2_loc (loc, (code == BIT_AND_EXPR ? BIT_IOR_EXPR
+   : TRUTH_OR_EXPR), type,
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)),
 invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1)));

+case BIT_IOR_EXPR:
+  if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1)
+return NULL_TREE;
+  /* fall through.  */
 case TRUTH_OR_EXPR:
   loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
   loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
-  return build2_loc (loc, TRUTH_AND_EXPR, type,
+  return build2_loc (loc, (code == BIT_IOR_EXPR ? BIT_AND_EXPR
+   : TRUTH_AND_EXPR), type,
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)),
 invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1)));
-
+case BIT_XOR_EXPR:
+  if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1)
+return NULL_TREE;
+  /* fall through.  */
 case TRUTH_XOR_EXPR:
   /* Here we can invert either operand.  We invert the first operand
 unless the second operand is a TRUTH_NOT_EXPR in which case our
@@ -3095,10 +3110,14 @@ fold_truth_not_expr (location_t loc, tre
 negation of the second operand.  */

   if (TREE_CODE (TREE_OPERAND (arg, 1)) == TRUTH_NOT_EXPR)
-   return build2_loc (loc, TRUTH_XOR_EXPR, type, TREE_OPERAND (arg, 0),
+   return build2_loc (loc, code, type, TREE_OPERAND (arg, 0),
+  TREE_OPERAND (TREE_OPERAND (arg, 1), 0));
+  else if (TREE_CODE (TREE_OPERAND (arg, 1)) == BIT_NOT_EXPR
+   && TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 1))) == 1)
+   return build2_loc (loc, code, type, TREE_OPERAND (arg, 0),
   TREE_OPERAND (TREE_OPERAND (arg, 1), 0));
   else
-   return build2_loc (loc, TRUTH_XOR_EXPR, type,
+   return build2_loc (loc, code, type,
   invert_truthvalue_loc (loc, TREE_OPERAND (arg, 0)),
   TREE_OPERAND (arg, 1));

@@ -3116,6 +3135,11 @@ fold_truth_not_expr (location_t loc, tre
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)),
 invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1)));

+
+case BIT_NOT_EXPR:
+  if (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg, 0))) != 1)
+return NULL_TREE;
+  /* fall through */
 case TRUTH_NOT_EXPR:
   return TREE_OPERAND (arg, 0);

@@ -3158,11 +3182,6 @@ fold_truth_not_expr (location_t loc, tre
   return build1_loc (loc, TREE_CODE (arg), type,
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)));

-case BIT_AND_EXPR:
-  if (!integer_onep (TREE_OPERAND (arg, 1)))
-   return NULL_TREE;
-  return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0));
-
 case SAVE_EXPR:
   return build1_loc (loc, 

[patch tree-optimization]: [2 of 3]: Boolify compares more

2011-07-07 Thread Kai Tietz
Hello,

This patch, the second of the series, adds boolification of comparisons in
the gimplifier.  For this, casts from/to boolean are marked as not useless,
and in fold_unary_loc casts to non-boolean integral types are preserved.
The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly
necessary, as long as fold-const handles 1-bit precision bitwise expressions
with truth logic, but it has been shown to short-cut some more expensive
folding.  So I kept it within this patch.

The adjusted testcase gcc.dg/uninit-15.c indicates that, due to
optimization, we lose the variable's declaration in this case.  But this is
probably to be expected.

In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
test case.  It is caused by conditions always having boolean type, so the
vectorizer sees different types, which it does not handle right now.  Maybe
this issue could be special-cased for boolean types in tree-vect-loop, by
making the operand for the used condition equal to the vector type.
But this is a subject for a different patch and is not addressed by this
series.

There is a regression in tree-ssa/vrp47.c; the fix is addressed by the 3rd
patch of this series.

Bootstrapped and regression tested for all standard-languages (plus
Ada and Obj-C++) on host x86_64-pc-linux-gnu.

OK to apply?

Regards,
Kai


ChangeLog

2011-07-07  Kai Tietz  kti...@redhat.com

* fold-const.c (fold_unary_loc): Preserve
non-boolean-typed casts.
* gimplify.c (gimple_boolify): Handle boolification
of comparisons.
(gimplify_expr): Boolify non-aggregate-typed
comparisons.
* tree-cfg.c (verify_gimple_comparison): Check result
type of comparison expression.
* tree-ssa.c (useless_type_conversion_p): Preserve incompatible
casts from/to boolean.
* tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
support for one-bit-precision typed X for cases X != 0 and X == 0.
(forward_propagate_comparison): Adjust test of condition
result.


* gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
* gcc.dg/tree-ssa/pr21031.c: Likewise.
* gcc.dg/tree-ssa/pr30978.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
* gcc.dg/binop-xor1.c: Mark it as expected fail.
* gcc.dg/binop-xor3.c: Likewise.
* gcc.dg/uninit-15.c: Adjust reported message.

Index: gcc-head/gcc/fold-const.c
===
--- gcc-head.orig/gcc/fold-const.c
+++ gcc-head/gcc/fold-const.c
@@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
 non-integral type.
 Do not fold the result as that would not simplify further, also
 folding again results in recursions.  */
- if (INTEGRAL_TYPE_P (type))
+ if (TREE_CODE (type) == BOOLEAN_TYPE)
return build2_loc (loc, TREE_CODE (op0), type,
   TREE_OPERAND (op0, 0),
   TREE_OPERAND (op0, 1));
- else
+ else if (!INTEGRAL_TYPE_P (type))
return build3_loc (loc, COND_EXPR, type, op0,
   fold_convert (type, boolean_true_node),
   fold_convert (type, boolean_false_node));
Index: gcc-head/gcc/gimplify.c
===
--- gcc-head.orig/gcc/gimplify.c
+++ gcc-head/gcc/gimplify.c
@@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)

 case TRUTH_NOT_EXPR:
   TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
-  /* FALLTHRU */

-case EQ_EXPR: case NE_EXPR:
-case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR:
   /* These expressions always produce boolean results.  */
-  TREE_TYPE (expr) = boolean_type_node;
+  if (TREE_CODE (type) != BOOLEAN_TYPE)
+   TREE_TYPE (expr) = boolean_type_node;
   return expr;

 default:
+  if (COMPARISON_CLASS_P (expr))
+   {
+ /* These expressions always produce boolean results.  */
+ if (TREE_CODE (type) != BOOLEAN_TYPE)
+   TREE_TYPE (expr) = boolean_type_node;
+ return expr;
+   }
   /* Other expressions that get here must have boolean values, but
 might need to be converted to the appropriate mode.  */
-  if (type == boolean_type_node)
+  if (TREE_CODE (type) == BOOLEAN_TYPE)
return expr;
   return fold_convert_loc (loc, boolean_type_node, expr);
 }
@@ -6763,7 +6768,7 @@ gimplify_expr (tree *expr_p, gimple_seq
tree org_type = TREE_TYPE (*expr_p);

*expr_p = gimple_boolify (*expr_p);
-   if (org_type != boolean_type_node)
+   if (!useless_type_conversion_p (org_type, TREE_TYPE (*expr_p)))
  {
*expr_p = fold_convert (org_type, *expr_p);
ret = GS_OK;
@@ -7208,7 +7213,7 @@ gimplify_expr (tree *expr_p, gimple_seq
   

[patch tree-optimization]: [3 of 3]: Boolify compares more

2011-07-07 Thread Kai Tietz
Hello,

This patch, the third of the series, fixes VRP to handle bitwise operations
on one-bit precision types.
It also introduces a second VRP pass, limited to the non-switch-statement
range.
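
A simplified model of the range rules this patch extends to one-bit
BIT_AND_EXPR and BIT_IOR_EXPR may help when reading the diff below.  This is
a sketch only: struct bit_range and the helpers are hypothetical and are not
GCC's value_range_t interface.  If either operand of an AND is known to be
zero, the result is zero; if either operand of an IOR is known to be one,
the result is one; otherwise the result stays at [0, 1] unless both operands
are known.

```c
/* Hypothetical stand-in for a value range restricted to one-bit values.
   Invariant: 0 <= min <= max <= 1.  */
struct bit_range { int min; int max; };

struct bit_range make_range (int min, int max)
{
  struct bit_range r = { min, max };
  return r;
}

/* Nonzero if R contains exactly the single value V.  */
int is_singleton (struct bit_range r, int v)
{
  return r.min == v && r.max == v;
}

/* Range of A & B for one-bit operands.  */
struct bit_range range_of_and (struct bit_range a, struct bit_range b)
{
  struct bit_range r = { 0, 1 };        /* Unknown by default.  */
  if (is_singleton (a, 0) || is_singleton (b, 0))
    r.max = 0;                          /* One operand is zero: result 0.  */
  else if (is_singleton (a, 1) && is_singleton (b, 1))
    r.min = 1;                          /* Both operands are one: result 1.  */
  return r;
}

/* Range of A | B for one-bit operands.  */
struct bit_range range_of_ior (struct bit_range a, struct bit_range b)
{
  struct bit_range r = { 0, 1 };        /* Unknown by default.  */
  if (is_singleton (a, 1) || is_singleton (b, 1))
    r.min = 1;                          /* One operand is one: result 1.  */
  else if (is_singleton (a, 0) && is_singleton (b, 0))
    r.max = 0;                          /* Both operands are zero: result 0.  */
  return r;
}
```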

Bootstrapped and regression tested for all standard-languages (plus
Ada and Obj-C++) on host x86_64-pc-linux-gnu.

OK to apply?

Regards,
Kai

ChangeLog

2011-07-07  Kai Tietz  kti...@redhat.com

* tree-vrp.c (in_second_pass): New static variable.
(extract_range_from_binary_expr): Add handling for
BIT_IOR_EXPR, BIT_AND_EXPR, and BIT_NOT_EXPR.
(register_edge_assert_for_1): Add handling for 1-bit
BIT_IOR_EXPR and BIT_NOT_EXPR.
(register_edge_assert_for): Add handling for 1-bit
BIT_IOR_EXPR.
(ssa_name_get_inner_ssa_name_p): New helper function.
(ssa_name_get_cast_to_p): New helper function.
(simplify_truth_ops_using_ranges): Handle prefixed
cast instruction for result, and add support for one
bit precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR,
and BIT_NOT_EXPR.
(simplify_stmt_using_ranges): Add handling for one bit
precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR,
and BIT_NOT_EXPR.
(vrp_finalize): Do substitute and fold pass a second
time for vrp_stmt and preserve switch-edge simplification
on second run.
(simplify_switch_using_ranges): Preserve rerun of function
in second pass.

Index: gcc-head/gcc/tree-vrp.c
===
--- gcc-head.orig/gcc/tree-vrp.c
+++ gcc-head/gcc/tree-vrp.c
@@ -74,6 +74,9 @@ struct value_range_d

 typedef struct value_range_d value_range_t;

+/* This flag indicates that we are doing a second pass of VRP.  */
+static bool in_second_pass = false;
+
 /* Set of SSA names found live during the RPO traversal of the function
for still active basic-blocks.  */
 static sbitmap *live;
@@ -2232,6 +2235,7 @@ extract_range_from_binary_expr (value_ra
  some cases.  */
   if (code != BIT_AND_EXPR
       && code != TRUTH_AND_EXPR
+      && code != BIT_IOR_EXPR
       && code != TRUTH_OR_EXPR
       && code != TRUNC_DIV_EXPR
       && code != FLOOR_DIV_EXPR
@@ -2291,6 +2295,8 @@ extract_range_from_binary_expr (value_ra
  else
set_value_range_to_varying (vr);
}
+  else if (code == BIT_IOR_EXPR)
+set_value_range_to_varying (vr);
   else
gcc_unreachable ();

@@ -2300,11 +2306,13 @@ extract_range_from_binary_expr (value_ra
   /* For integer ranges, apply the operation to each end of the
  range and see what we end up with.  */
   if (code == TRUTH_AND_EXPR
-  || code == TRUTH_OR_EXPR)
+  || code == TRUTH_OR_EXPR
+  || ((code == BIT_AND_EXPR || code == BIT_IOR_EXPR)
+   && TYPE_PRECISION (TREE_TYPE (op1)) == 1))
 {
   /* If one of the operands is zero, we know that the whole
 expression evaluates zero.  */
-  if (code == TRUTH_AND_EXPR
+  if ((code == TRUTH_AND_EXPR || code == BIT_AND_EXPR)
   && ((vr0.type == VR_RANGE
 && integer_zerop (vr0.min)
 && integer_zerop (vr0.max))
@@ -2317,7 +2325,7 @@ extract_range_from_binary_expr (value_ra
}
   /* If one of the operands is one, we know that the whole
 expression evaluates one.  */
-  else if (code == TRUTH_OR_EXPR
+  else if ((code == TRUTH_OR_EXPR || code == BIT_IOR_EXPR)
 && ((vr0.type == VR_RANGE
  && integer_onep (vr0.min)
  && integer_onep (vr0.max))
@@ -2809,7 +2817,7 @@ extract_range_from_unary_expr (value_ran
  cannot easily determine a resulting range.  */
   if (code == FIX_TRUNC_EXPR
   || code == FLOAT_EXPR
-  || code == BIT_NOT_EXPR
+  || (code == BIT_NOT_EXPR && TYPE_PRECISION (type) != 1)
   || code == CONJ_EXPR)
 {
   /* We can still do constant propagation here.  */
@@ -3976,7 +3984,9 @@ build_assert_expr_for (tree cond, tree v
   tree a = build2 (ASSERT_EXPR, TREE_TYPE (v), v, cond);
   assertion = gimple_build_assign (n, a);
 }
-  else if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+  else if (TREE_CODE (cond) == TRUTH_NOT_EXPR
+  || (TREE_CODE (cond) == BIT_NOT_EXPR
+   && TYPE_PRECISION (TREE_TYPE (cond)) == 1))
 {
   /* Given !V, build the assignment N = false.  */
   tree op0 = TREE_OPERAND (cond, 0);
@@ -4531,7 +4541,9 @@ register_edge_assert_for_1 (tree op, enu
   retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def),
code, e, bsi);
 }
-  else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR)
+  else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR
+  || (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR
+   && TYPE_PRECISION (TREE_TYPE (op)) == 1))
 {
   /* Recurse, flipping CODE.  */
   code = invert_tree_comparison (code, false);
@@ -4617,6 +4629,9 @@ 

Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests

2011-07-07 Thread Richard Earnshaw
On 07/07/11 00:26, Janis Johnson wrote:
 For three tests in gcc.target/arm that don't depend on processor-specific
 behavior, don't specify the -march option.  This makes dg-prune-output
 for warnings about conflicts unnecessary, so remove it.
 
 Two of these tests are for internal compiler errors that showed up with
 particular values of -march.  I think it's fine to test them with normal
 multilibs, some of which will use those -march values, and others of
 which could trigger a closely-related ICE.
 
 If there's a desire to use specific options in a test, I'd prefer to see
 it done in a copy of the test that is skipped for all multilibs but the
 default.
 
 OK for trunk, and for 4.6 after a few days?
 
 
 gcc-20110706-3
 
 
 2011-07-06  Janis Johnson  jani...@codesourcery.com
 
   * gcc.target/arm/pr41679.c: Remove -march options and unneeded
   dg-prune-output.
   * gcc.target/arm/pr46883.c: Likewise.
   * gcc.target/arm/xor-and.c: Likewise.
 
 Index: gcc.target/arm/pr41679.c

I think this should just be moved to gcc.c-torture/compile.  There
doesn't seem to be anything processor-specific here.

 Index: gcc.target/arm/pr46883.c

Likewise.

 Index: gcc.target/arm/xor-and.c
 ===
 --- gcc.target/arm/xor-and.c  (revision 175921)
 +++ gcc.target/arm/xor-and.c  (working copy)
 @@ -1,6 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options "-O -march=armv6" } */
 -/* { dg-prune-output "switch .* conflicts with" } */
 +/* { dg-options "-O" } */
  
  unsigned short foo (unsigned short x)
  {

The purpose of this test seems to be to ensure that when compiling for
v6 we don't get particular instructions.  Removing the -march flag means
we won't normally test this in the way intended (ie unless the multilibs
explicitly test v6).  This is one of those cases where I think the
intention really is to force one particular instruction set.

R.




Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-07 Thread Tristan Gingold
[...]

On Jul 7, 2011, at 5:53 PM, Richard Guenther wrote:

 On Thu, Jul 7, 2011 at 5:47 PM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
 I certainly can call the switch -mno-static-chain, which is perhaps more
 meaningful (at least to us compiler folk, I'm not sure static chain means
 much to the normal programmer).
 
 Well, that's up to the target maintainers to decide, maybe
 -mno-nested-functions instead?

Isn't that an issue of pointers to nested functions rather than of nested
functions?  So -mno-nested-function-pointers would be more accurate.

That's somewhat important from an Ada POV as nested subprograms are common, but
access/pointer to nested subprogram is not very usual.

My two cents.
Tristan.



Re: Fix PR 49014

2011-07-07 Thread Vladimir Makarov

On 07/01/2011 10:50 AM, Andrey Belevantsev wrote:

On 26.05.2011 17:32, Andrey Belevantsev wrote:

On 25.05.2011 19:31, Bernd Schmidt wrote:

On 05/25/2011 03:29 PM, Andrey Belevantsev wrote:
I think the hook is a better idea than the attribute because nobody will
care to mark all offending insns with an attribute.

I don't know. IIRC when I looked at sh or whatever the broken port was,
it was only two insns - there would still be some value in being able to
assert that all other insns have a reservation.

OK, I will take a look on x86-64 and will get back with more information.

Andrey
So, I have made an attempt to bootstrap on x86-64 with the extra
assert in selective scheduling that assumes the DFA state always
changes when issuing a recog_memoized >= 0 insn (patch attached).
Indeed, there are just a few general insns that don't have proper
reservations.  However, it was a surprise to me to see that almost any
insn with SSE registers fails this assert and thus does not get
properly scheduled.

Overall, the work on fixing those seems doable; it took just a day to
get the compiler bootstrapped (of course, the testsuite may bring up many
more issues).  So, if there is an agreement on marking a few offending
insns with the new attribute, we can proceed with the help of somebody
from the x86 land on fixing those and researching other targets.


The changes in sel-sched.c are OK with me.  The i386.md changes look OK to
me too, but you should ask an x86 maintainer to approve them.

I think you should describe the attribute in the documentation because
it is common to all targets.

I cannot approve the common.opt changes because they make the selective
scheduler the default for the 2nd insn scheduling pass on all targets.
Such a change should be justified by thorough testing and benchmarking
(compilation speed, code size, performance improvements) on several
platforms (at least on the major ones).


Re: [patch tree-optimization]: [3 of 3]: Boolify compares more

2011-07-07 Thread Kai Tietz
2011/7/7 Paolo Bonzini bonz...@gnu.org:
 On 07/07/2011 06:07 PM, Kai Tietz wrote:

 +  /* We redo folding here one time for allowing to inspect more
 +     complex reductions.  */
 +  substitute_and_fold (op_with_constant_singleton_value_range,
 +                      vrp_fold_stmt, false);
 +  /* We need to mark this second pass to avoid re-entering of same
 +     edges for switch statements.  */
 +  in_second_pass = true;
    substitute_and_fold (op_with_constant_singleton_value_range,
                       vrp_fold_stmt, false);
 +  in_second_pass = false;

 This needs a much better explanation.

 Paolo

Well, I can work on a better comment.  The complex reductions I mean
here are cases like

int x;
int y;
_Bool D1;
_Bool D2;
_Bool D3;
int R;

D1 = x[0..1] != 0;
D2 = y[0..1] != 0;
D3 = D1 & D2
R = (int) D3

(a testcase is already present, see tree-ssa/vrp47.c).

In the first pass, VRP replaces this with:

D1 = (_Bool) x[0..1];
D2 = (_Bool) y[0..1];
D3 = D1 & D2
R = (int) D3

Only in the second pass can the reduction

R = x[0..1] & y[0..1]

happen.  In general it is a pity that VRP can't insert new statements
during the pass right now; doing so would cause issues in the range
tables, which aren't designed for insertions.  Otherwise, we could
also simplify things like

D1 = x[0..1] != 0;
D2 = y[0..1] == 0;
D3 = D1 & D2
R = (int) D3

to
R = x[0..1] & (y[0..1] ^ 1)
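
The reductions discussed above can be checked exhaustively for inputs in
[0, 1] with a small C sketch (the function names are invented for this
example; this is not the vrp47.c testcase):

```c
/* R = (int) ((x != 0) & (y != 0)), the form before the reduction.  */
int and_before (int x, int y)
{
  _Bool d1 = x != 0;
  _Bool d2 = y != 0;
  _Bool d3 = d1 & d2;
  return (int) d3;
}

/* R = x & y, equivalent only when x and y are known to be in [0, 1].  */
int and_after (int x, int y)
{
  return x & y;
}

/* R = (int) ((x != 0) & (y == 0)), the form before the reduction.  */
int andnot_before (int x, int y)
{
  _Bool d1 = x != 0;
  _Bool d2 = y == 0;
  _Bool d3 = d1 & d2;
  return (int) d3;
}

/* R = x & (y ^ 1), again assuming x and y are in [0, 1].  */
int andnot_after (int x, int y)
{
  return x & (y ^ 1);
}

/* Returns 1 if both reductions agree for every x, y in [0, 1].  */
int reductions_agree_on_0_1 (void)
{
  for (int x = 0; x <= 1; x++)
    for (int y = 0; y <= 1; y++)
      if (and_before (x, y) != and_after (x, y)
          || andnot_before (x, y) != andnot_after (x, y))
        return 0;
  return 1;
}
```

Outside the [0, 1] range the two forms differ, for example for x = 2 and
y = 1, which is why the reduction depends on the range information.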

Regards,
Kai


[patch] Disable static build for libjava

2011-07-07 Thread Matthias Klose
As discussed at the Google GCC gathering, disable the build of static libraries
in libjava, which should cut the build time of libjava by 50%.  The static
libjava build isn't useful out of the box, and I don't see it packaged by Linux
distributions either.

The AC_PROG_LIBTOOL check is needed to get access to the enable_shared macro.
I'm unsure about the check in the switch construct.  It is taken from
libtool.m4, where it determines the value of
enable_shared_with_static_runtimes.

Ok for the trunk?

2011-07-07  Matthias Klose  d...@ubuntu.com

* Makefile.def (target_modules/libjava): Pass
$(libjava_disable_static).
* configure.ac: Check for libtool, pass --disable-static
in libjava_disable_static.
* Makefile.in: Regenerate.
* configure: Likewise.

Index: Makefile.def
===
--- Makefile.def(revision 175963)
+++ Makefile.def(working copy)
@@ -132,7 +132,8 @@
 target_modules = { module= winsup; };
 target_modules = { module= libgloss; no_check=true; };
 target_modules = { module= libffi; };
-target_modules = { module= libjava; raw_cxx=true; };
+target_modules = { module= libjava; raw_cxx=true;
+   extra_configure_flags=$(libjava_disable_static); };
 target_modules = { module= zlib; };
 target_modules = { module= boehm-gc; };
 target_modules = { module= rda; };
Index: configure.ac
===
--- configure.ac(revision 175963)
+++ configure.ac(working copy)
@@ -443,6 +443,16 @@
   ;;
 esac

+AC_PROG_LIBTOOL
+if test x$enable_shared = xyes ; then
+  case $host_cpu in
+  cygwin* | mingw* | pw32* | cegcc*)
+;;
+  *)
+libjava_disable_static=--disable-static
+  esac
+fi
+AC_SUBST(libjava_disable_static)

 # Disable libmudflap on some systems.
 if test x$enable_libmudflap = x ; then



Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests

2011-07-07 Thread Janis Johnson
On 07/07/2011 09:14 AM, Richard Earnshaw wrote:
 On 07/07/11 00:26, Janis Johnson wrote:

 Index: gcc.target/arm/pr41679.c
 
 I think this should just be moved to gcc.c-torture/compile.  There
 doesn't seem to be anything processor-specific here.
 
 Index: gcc.target/arm/pr46883.c
 
 Likewise.

OK, I'll do that.

 Index: gcc.target/arm/xor-and.c
 ===
 --- gcc.target/arm/xor-and.c (revision 175921)
 +++ gcc.target/arm/xor-and.c (working copy)
 @@ -1,6 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options "-O -march=armv6" } */
 -/* { dg-prune-output "switch .* conflicts with" } */
 +/* { dg-options "-O" } */
  
  unsigned short foo (unsigned short x)
  {
 
 The purpose of this test seems to be to ensure that when compiling for
 v6 we don't get particular instructions.  Removing the -march flag means
 we won't normally test this in the way intended (ie unless the multilibs
 explicitly test v6).  This is one of those cases where I think the
 intention really is to force one particular instruction set.
 
 R.

It passes everywhere.  Do you want to know when it stops passing for some
other multilib, or do you just care about armv6?  If you only care about
armv6 then the test should be limited to run with the default multilib
instead of having to muck around checking for incompatible options.

Janis



Re: [testsuite] arm thumb tests: remove -march= and dg-prune-output from 9 tests

2011-07-07 Thread Richard Earnshaw
On 07/07/11 00:28, Janis Johnson wrote:
 This patch removes -march= from nine tests that also check for relevant
 effective targets.  If -march is removed there is no need to ignore
 compiler warnings about conflicting options with dg-prune-output, so the
 patch removes that from the tests.
 
 OK for trunk, and for 4.6 in a few days?
 
 
 gcc-20110706-4
 
 
 2011-07-06  Janis Johnson  jani...@codesourcery.com
 
   * gcc.target/arm/pr39839.c: Remove -march option and unneeded
   dg-prune-output.
   * gcc.target/arm/pr40657-2.c: Likewise.
   * gcc.target/arm/pr40956.c: Likewise.
   * gcc.target/arm/pr42235.c: Likewise.
   * gcc.target/arm/pr42495.c: Likewise.
   * gcc.target/arm/pr42505.c: Likewise.
   * gcc.target/arm/pr42574.c: Likewise.
   * gcc.target/arm/pr46934.c: Likewise.
   * gcc.target/arm/thumb-branch1.c: Likewise.
 
 Index: gcc.target/arm/pr39839.c
 ===
 --- gcc.target/arm/pr39839.c  (revision 175921)
 +++ gcc.target/arm/pr39839.c  (working copy)
 @@ -1,6 +1,5 @@
 -/* { dg-options "-mthumb -Os -march=armv5te -mthumb-interwork -fpic" }  */
 +/* { dg-options "-mthumb -Os -mthumb-interwork -fpic" }  */
  /* { dg-require-effective-target arm_thumb1_ok } */
 -/* { dg-prune-output "switch .* conflicts with" } */
  /* { dg-final { scan-assembler-not "str\[\\t \]*r.,\[\\t \]*.sp," } } */
  

I think this test should work in both ARM and Thumb mode and for any
Thumb variant.  So I'd be inclined to remove arm_thumb1_ok and change
the dg-options to -Os -fpic.

 Index: gcc.target/arm/pr40657-2.c

OK.


 Index: gcc.target/arm/pr40956.c
 ===
 --- gcc.target/arm/pr40956.c  (revision 175921)
 +++ gcc.target/arm/pr40956.c  (working copy)
 @@ -1,7 +1,6 @@
 -/* { dg-options "-mthumb -Os -fpic -march=armv5te" }  */
 +/* { dg-options "-mthumb -Os -fpic" }  */
  /* { dg-require-effective-target arm_thumb1_ok } */
  /* { dg-require-effective-target fpic } */
 -/* { dg-prune-output "switch .* conflicts with" } */
  /* Make sure the constant 0 is loaded into register only once.  */
  /* { dg-final { scan-assembler-times "mov\[\\t \]*r., #0" 1 } } */
  

Same comment as for pr39839.c

 Index: gcc.target/arm/pr42235.c

OK.

 Index: gcc.target/arm/pr42495.c

OK.

 Index: gcc.target/arm/pr42505.c
 ===
 --- gcc.target/arm/pr42505.c  (revision 175921)
 +++ gcc.target/arm/pr42505.c  (working copy)
 @@ -1,6 +1,5 @@
 -/* { dg-options "-mthumb -Os -march=armv5te" }  */
 +/* { dg-options "-mthumb -Os" }  */
  /* { dg-require-effective-target arm_thumb1_ok } */
 -/* { dg-prune-output "switch .* conflicts with" } */
  /* { dg-final { scan-assembler-not "str\[\\t \]*r.,\[\\t \]*.sp," } } */
  

Same comment as for pr39839.c

 Index: gcc.target/arm/pr42574.c

OK

 Index: gcc.target/arm/pr46934.c

There's nothing CPU-specific here; this should be in gcc.c-torture/compile.

 Index: gcc.target/arm/thumb-branch1.c

OK.





Re: CFT: Move unwinder to toplevel libgcc

2011-07-07 Thread Steve Ellcey
On Thu, 2011-07-07 at 15:08 +0200, Rainer Orth wrote:

 In that case, perhaps Steve could have a look?  I'd finally like to make
 some progress on this patch.
 
 Thanks.
 Rainer

It looks like the GCC build is trying to compile unwind-ia64.c on IA64
HP-UX even though it should not use or need this file.  Using
--with-system-libunwind doesn't seem to help.  I am not sure where this
should be handled under the new setup.  Previously config.gcc would
either include or not include t-glibc-libunwind in the Makefile to build
or not build this file.  This might be coming from t-eh-ia64 rather
than t-glibc-libunwind.  Both of these include unwind-ia64.c.

Steve Ellcey
s...@cup.hp.com



Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests

2011-07-07 Thread Richard Earnshaw
On 07/07/11 17:30, Janis Johnson wrote:
 On 07/07/2011 09:14 AM, Richard Earnshaw wrote:
 On 07/07/11 00:26, Janis Johnson wrote:
 Index: gcc.target/arm/xor-and.c
 ===
 --- gcc.target/arm/xor-and.c(revision 175921)
 +++ gcc.target/arm/xor-and.c(working copy)
 @@ -1,6 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options "-O -march=armv6" } */
 -/* { dg-prune-output "switch .* conflicts with" } */
 +/* { dg-options "-O" } */
  
  unsigned short foo (unsigned short x)
  {

 The purpose of this test seems to be to ensure that when compiling for
 v6 we don't get particular instructions.  Removing the -march flag means
 we won't normally test this in the way intended (ie unless the multilibs
 explicitly test v6).  This is one of those cases where I think the
 intention really is to force one particular instruction set.

 R.
 
 It passes everywhere, do you want to know when it stops passing for some
 other multilib, or just care about armv6?  If you only care about armv6
 then the test should be limited to run with the default multilib instead
 of having to muck around checking for incompatible options.
 

We only care about v6 here, I think.  There aren't really any multilib
issues, since it's a compile-only test.  I don't mind not testing it for
non-default multilibs, but it should be marked as 'skipped' or recorded
in some other way, so that the total number of tests is the same for
each variant.

BTW, can the testsuite ever be run with no default multilib?  If so,
then I don't think we should always skip the test.

R.



Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Paul Koning

On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote:

 ...
 
 It'd also be nice to get rid of all these big blocks of code that are
 conditional on preprocessor macros, but I realise you're just following
 existing practice in the surrounding code, so again it can be left to
 a future cleanup.
 
 Yeah, this function is quite horrid - so many different paths through it.
 
 However, it looks like the only target without HAVE_prologue is actually
 pdp11, so we're carrying some unnecessary baggage for purely
 retrocomputing purposes. Paul, can you fix that?

Sure, but...  I searched for HAVE_prologue and I can't find any place that sets
it.  There are tests for it, but I see nothing that defines it (other than
df-scan.c which defines it as zero if it's not defined, not sure what the point 
of that is).

I must be missing something...

paul




Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Jeff Law
On 07/07/11 10:58, Paul Koning wrote:
 
 On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote:
 
 ...

 It'd also be nice to get rid of all these big blocks of code that are
 conditional on preprocessor macros, but I realise you're just following
 existing practice in the surrounding code, so again it can be left to
 a future cleanup.

 Yeah, this function is quite horrid - so many different paths through it.

 However, it looks like the only target without HAVE_prologue is actually
 pdp11, so we're carrying some unnecessary baggage for purely
 retrocomputing purposes. Paul, can you fix that?
 
 Sure, but...  I searched for HAVE_prologue and I can't find any place that 
 sets it.  There are tests for it, but I see nothing that defines it (other
 than df-scan.c which defines it as zero if it's not defined, not sure what 
 the point of that is).
 
 I must be missing something...
Isn't it defined by the insn-foo generators based on the existence of a
prologue/epilogue insn in the MD file?

jeff


Re: [patch] Disable static build for libjava

2011-07-07 Thread David Daney

On 07/07/2011 09:57 AM, Matthias Klose wrote:

On 07/07/2011 06:51 PM, David Daney wrote:

On 07/07/2011 09:27 AM, Matthias Klose wrote:

As discussed at the Google GCC gathering, disable the build of static libraries
in libjava, which should cut the build time of libjava by 50%.  The static
libjava build isn't useful out of the box, and I don't see it packaged by Linux
distributions either.

The AC_PROG_LIBTOOL check is needed to get access to the enable_shared macro.
I'm unsure about the check in the switch construct; it is taken from libtool.m4
and determines the value of enable_shared_with_static_runtimes.

Ok for the trunk?

2011-07-07  Matthias Klose  d...@ubuntu.com

  * Makefile.def (target_modules/libjava): Pass
  $(libjava_disable_static).
  * configure.ac: Check for libtool, pass --disable-static
  in libjava_disable_static.
  * Makefile.in: Regenerate.
  * configure: Likewise.



My autoconf fu is not what it used to be.  It is fine if static libraries are
disabled by default, but it should be possible to enable them from the configure
command line.  It is unclear to me if this patch does that.


no. I assume an extra option --enable-static-libjava would be needed.


Not being a libjava maintainer, I cannot force you to add something like 
that as part of the patch, but I think it would be a good idea.





Also I would like to go on record as disagreeing with the statement that 'static
libjava build isn't useful out of the box'


I remember that there were some restrictions with the static library, but maybe
I'm wrong.



There are restrictions, but it is still useful for some embedded 
environments.


David Daney


Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Paul Koning

On Jul 7, 2011, at 1:00 PM, Jeff Law wrote:

 On 07/07/11 10:58, Paul Koning wrote:
 
 On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote:
 
 ...
 
 It'd also be nice to get rid of all these big blocks of code that are
 conditional on preprocessor macros, but I realise you're just following
 existing practice in the surrounding code, so again it can be left to
 a future cleanup.
 
 Yeah, this function is quite horrid - so many different paths through it.
 
 However, it looks like the only target without HAVE_prologue is actually
 pdp11, so we're carrying some unnecessary baggage for purely
 retrocomputing purposes. Paul, can you fix that?
 
 Sure, but...  I searched for HAVE_prologue and I can't find any place that 
 sets it.  There are tests for it, but I see nothing that defines it (other
 than df-scan.c which defines it as zero if it's not defined, not sure what 
 the point of that is).
 
 I must be missing something...
 Isn't it defined by the insn-foo generators based on the existence of a
 prologue/epilogue insn in the MD file?

Thanks, that must be what I was missing.  So someone is generating HAVE_%s, and 
that's why grep didn't find HAVE_prologue?
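
To make the grep puzzle concrete, here is a minimal sketch of the mechanism as I understand it from Jeff's description.  It is not the actual generator code: the format string and the helper names (have_macro_format, emit_have_macro, format_mentions, target_has_prologue_pattern) are made up for illustration.  The point is only that a generator printing "HAVE_%s" never contains the literal text "HAVE_prologue", and that a zero fallback like the one in df-scan.c lets the macro be tested in ordinary C code even when the MD file has no such pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical format string.  The literal text "HAVE_prologue" never
   appears in it, so grepping the generator source for that name fails.  */
static const char have_macro_format[] = "#define HAVE_%s 1\n";

/* Write the macro definition for one named pattern into BUF.
   Returns the number of characters written, excluding the NUL.  */
static int emit_have_macro(char *buf, size_t size, const char *pattern_name)
{
    int n = snprintf(buf, size, have_macro_format, pattern_name);
    assert(n >= 0 && (size_t)n < size);   /* caller must supply enough room */
    return n;
}

/* Return nonzero if NAME occurs literally in the format string,
   which is roughly what a grep over the generator would look for.  */
static int format_mentions(const char *name)
{
    return strstr(have_macro_format, name) != NULL;
}

/* Zero fallback in the style described for df-scan.c (an assumption about
   its purpose): with it, HAVE_prologue is always usable in an expression.  */
#ifndef HAVE_prologue
#define HAVE_prologue 0
#endif

static int target_has_prologue_pattern(void)
{
    return HAVE_prologue != 0;
}
```

Calling emit_have_macro with "prologue" yields the line "#define HAVE_prologue 1", even though that name never appears in the format string itself.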

From a note by Richard Henderson (June 30, 2011) it sounds like rs6000 is the 
other platform that still generates asm prologues.  But yes, I said I would do 
this.  It sounds like doing it soon would help Bernd a lot.  Let me try to 
accelerate it.

paul




Re: CFT: Move unwinder to toplevel libgcc

2011-07-07 Thread Steve Ellcey
On Thu, 2011-07-07 at 15:08 +0200, Rainer Orth wrote:

 In that case, perhaps Steve could have a look?  I'd finally like to make
 some progress on this patch.
 
 Thanks.
 Rainer

When doing an IA64 Linux build (where I do need to compile
unwind-ia64.c) I am dying with this failure:

In file included from /test/big-foot1/gcc/nightly/src/trunk/libgcc/config/ia64/unwind-ia64.c:35:0:
./md-unwind-support.h:42:7: error: unknown type name '_Unwind_FrameState'
./md-unwind-support.h:132:54: error: unknown type name '_Unwind_FrameState'
/test/big-foot1/gcc/nightly/src/trunk/libgcc/config/ia64/unwind-ia64.c: In function 'uw_update_reg_address':
/test/big-foot1/gcc/nightly/src/trunk/libgcc/config/ia64/unwind-ia64.c:1931:11: warning: cast discards '__attribute__((const))' qualifier from pointer target type [-Wcast-qual]
make[3]: *** [unwind-ia64.o] Error 1
make[3]: Leaving directory `/test/big-foot1/gcc/nightly/build-ia64-debian-linux-gnu-trunk/obj_gcc/ia64-debian-linux-gnu/libgcc'
make[2]: *** [all-stage1-target-libgcc] Error 2

Steve Ellcey
s...@cup.hp.com



Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Jeff Law
On 07/07/11 11:05, Paul Koning wrote:
 
 On Jul 7, 2011, at 1:00 PM, Jeff Law wrote:
 
 On 07/07/11 10:58, Paul Koning wrote:
 
 On Jul 7, 2011, at 11:38 AM, Bernd Schmidt wrote:
 
 ...
 
 It'd also be nice to get rid of all these big blocks of code
 that are conditional on preprocessor macros, but I realise
 you're just following existing practice in the surrounding
 code, so again it can be left to a future cleanup.
 
 Yeah, this function is quite horrid - so many different paths
 through it.
 
 However, it looks like the only target without HAVE_prologue is
 actually pdp11, so we're carrying some unnecessary baggage for
 purely retrocomputing purposes. Paul, can you fix that?
 
 Sure, but...  I searched for HAVE_prologue and I can't find any
 place that sets it.  There are tests for it, but I see nothing
 that defines it (other than df-scan.c which defines it as zero if
 it's not defined, not sure what the point of that is).
 
 I must be missing something...
 Isn't it defined by the insn-foo generators based on the existence
 of a prologue/epilogue insn in the MD file?
 
 Thanks, that must be what I was missing.  So someone is generating
 HAVE_%s, and that's why grep didn't find HAVE_prologue?
Yup.

Jeff


Re: Fix PR 49014

2011-07-07 Thread Bernd Schmidt
On 07/01/11 16:50, Andrey Belevantsev wrote:
 On 26.05.2011 17:32, Andrey Belevantsev wrote:
 On 25.05.2011 19:31, Bernd Schmidt wrote:
 On 05/25/2011 03:29 PM, Andrey Belevantsev wrote:
 I think the hook is a better idea than the attribute because nobody
 will
 care to mark all offending insns with an attribute.

 I don't know. IIRC when I looked at sh or whatever the broken port was,
 it was only two insns - there would still be some value in being able to
 assert that all other insns have a reservation.
 OK, I will take a look on x86-64 and will get back with more information.

 Andrey
 So, I have made an attempt to bootstrap on x86-64 with the extra assert
 in selective scheduling that assumes the DFA state always changes when
 issuing a recog_memoized >= 0 insn (patch attached).  Indeed, there are
 just a few general insns that don't have proper reservations.  However,
 it was a surprise to me to see that almost any insn with SSE registers
 fails this assert and thus does not get properly scheduled.

Probably because it's picking a scheduling description for an old CPU?
With -mcpu=pentium probably none of the newer patterns has a reservation.

That may scupper any plans to use this attribute on i386.

 Overall, the work on fixing those seems doable, it took just a day to
 get the compiler bootstrapped (of course, the testsuite may bring much
 more issues).  So, if there is an agreement on marking a few offending
 insns with the new attribute, we can proceed with the help of somebody
 from the x86 land on fixing those and researching for other targets.

+(set (attr "nondfa_insn") (if_then_else (eq_attr "alternative" "3,4,5,6") (const_int 1) (const_int 0)))

I think this shouldn't use (const_int x); you want to be able to write
 (set_attr "nondfa_insn" "0,0,0,1,1,1,1")


Bernd


Re: [PATCH 4/6] Shrink-wrapping

2011-07-07 Thread Bernd Schmidt
On 07/07/11 19:05, Paul Koning wrote:
 From a note by Richard Henderson (June 30, 2011) it sounds like
 rs6000 is the other platform that still generates asm prologues.  But
 yes, I said I would do this.  It sounds like doing it soon would help
 Bernd a lot.  Let me try to accelerate it.

Maybe not a whole lot, but it would allow us to simplify some code.


Bernd


PATCH: Support -mx32 in GCC tests

2011-07-07 Thread H.J. Lu
Hi,

On Linux/x86-64, when we pass

RUNTESTFLAGS=--target_board='unix{-mx32}'

to GCC tests, we can't check lp64/ilp32 for availability of 64-bit x86
instructions.  This patch adds ia32 and x32 effective targets.  OK for
trunk?

Thanks.


H.J.
---
2011-07-07  H.J. Lu  hongjiu...@intel.com

* lib/target-supports.exp (check_effective_target_ia32): New.
(check_effective_target_x32): Likewise.
(check_effective_target_vect_cmdline_needed): Also check x32.

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 7db156f..b5b8782 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1512,6 +1512,28 @@ proc check_effective_target_ilp32 { } {
 }]
 }
 
+# Return 1 if we're generating ia32 code using default options, 0
+# otherwise.
+
+proc check_effective_target_ia32 { } {
+    return [check_no_compiler_messages ia32 object {
+        int dummy[sizeof (int) == 4
+                  && sizeof (void *) == 4
+                  && sizeof (long) == 4 ? 1 : -1] = { __i386__ };
+    }]
+}
+
+# Return 1 if we're generating x32 code using default options, 0
+# otherwise.
+
+proc check_effective_target_x32 { } {
+    return [check_no_compiler_messages x32 object {
+        int dummy[sizeof (int) == 4
+                  && sizeof (void *) == 4
+                  && sizeof (long) == 4 ? 1 : -1] = { __x86_64__ };
+    }]
+}
+
 # Return 1 if we're generating 32-bit or larger integers using default
 # options, 0 otherwise.
 
@@ -1713,7 +1735,8 @@ proc check_effective_target_vect_cmdline_needed { } {
         if { [istarget alpha*-*-*]
              || [istarget ia64-*-*]
              || (([istarget x86_64-*-*] || [istarget i?86-*-*])
-                 && [check_effective_target_lp64])
+                 && ([check_effective_target_x32]
+                     || [check_effective_target_lp64]))
              || ([istarget powerpc*-*-*]
                  && ([check_effective_target_powerpc_spe]
                      || [check_effective_target_powerpc_altivec]))
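
For readers unfamiliar with the trick used in the two new procs, here is a small C sketch of it.  The array size evaluates to -1, and compilation fails, unless the size conditions hold; as I read the patch, the initializer naming __i386__ or __x86_64__ additionally fails to compile when that macro is not predefined.  The enum and the helper classify_x86_abi are hypothetical names used only for illustration, and they take the sizes as parameters so the logic can be exercised on any host.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time check in the same style as the patch: the array size is -1,
   and compilation fails, unless the condition holds.  */
typedef char int_is_at_least_2_bytes[sizeof (int) >= 2 ? 1 : -1];

enum x86_abi { ABI_IA32, ABI_X32, ABI_LP64, ABI_OTHER };

/* Hypothetical helper mirroring the conditions the effective-target checks
   encode: ia32 and x32 are both ILP32 and differ only in whether the
   compiler targets the x86-64 instruction set.  */
static enum x86_abi classify_x86_abi(size_t int_size, size_t long_size,
                                     size_t ptr_size, int is_x86_64)
{
    assert(int_size > 0 && long_size > 0 && ptr_size > 0);
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return is_x86_64 ? ABI_X32 : ABI_IA32;
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return ABI_LP64;
    return ABI_OTHER;
}
```

This also shows why lp64/ilp32 alone cannot answer the question: x32 is ILP32 by size, yet 64-bit x86 instructions are available.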


Re: [testsuite] arm tests: remove -march= and dg-prune-output from 3 tests

2011-07-07 Thread Janis Johnson
On 07/07/2011 09:48 AM, Richard Earnshaw wrote:
 On 07/07/11 17:30, Janis Johnson wrote:
 On 07/07/2011 09:14 AM, Richard Earnshaw wrote:
 On 07/07/11 00:26, Janis Johnson wrote:
 Index: gcc.target/arm/xor-and.c
 ===
 --- gcc.target/arm/xor-and.c   (revision 175921)
 +++ gcc.target/arm/xor-and.c   (working copy)
 @@ -1,6 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options -O -march=armv6 } */
 -/* { dg-prune-output switch .* conflicts with } */
 +/* { dg-options -O } */
  
  unsigned short foo (unsigned short x)
  {

 The purpose of this test seems to be to ensure that when compiling for
 v6 we don't get particular instructions.  Removing the -march flag means
 we won't normally test this in the way intended (ie unless the multilibs
 explicitly test v6).  This is one of those cases where I think the
 intention really is to force one particular instruction set.

 R.

 It passes everywhere, do you want to know when it stops passing for some
 other multilib, or just care about armv6?  If you only care about armv6
 then the test should be limited to run with the default multilib instead
 of having to muck around checking for incompatible options.

 
 We only care about v6 here, I think.  There aren't really any multilib
 issues, since it's a compile-only test.  I don't mind not testing it for
 non-default multilibs, but it should be marked as 'skipped' or recorded
 in some other way, so that the total number of tests is the same for
 each variant.

The total number of tests is not the same.  A test that compiles and does
a scan is 2 tests when it is run but is only reported as 1 UNSUPPORTED.
We don't currently have a way to count things like dg-final or dg-error
as UNSUPPORTED if the entire test is skipped.

 BTW, can the testsuite ever be run with no default multilib?  If so,
 then I don't think we should always skip the test.
 
 R.

I don't know.  I can leave it the way it is, always specifying -march
and ignoring warnings about conflicting options.  That doesn't guarantee,
though, that it will ever use the specified -march option because unless
there is a default multilib or one that doesn't use -march, the one in
the test will always be overridden by multilib options.

Janis



Re: PATCH: Support -mx32 in GCC tests

2011-07-07 Thread Mike Stump
On Jul 7, 2011, at 10:29 AM, H.J. Lu wrote:
 On Linux/x86-64, when we pass
 
 RUNTESTFLAGS=--target_board='unix{-mx32}'
 
 to GCC tests, we can't check lp64/ilp32 for availability of 64-bit x86
 instructions.  This patch adds ia32 and x32 effective targets.  OK for
 trunk?

Ok.


[PATCH 0/3] Fix PR47654 and PR49649

2011-07-07 Thread Sebastian Pop
Hi,

First there are two cleanup patches independent of the fix:

  Start counting nesting level from 0.
  Do not compute twice type, lb, and ub.

Then the patch that fixes PR47654:

  Fix PR47654: Compute LB and UB of a CLAST expression.

One of the reasons we cannot determine the IV type only from the
polyhedral representation is that as in the testcase of PR47654, we
are asked to generate an induction variable going from 0 to 127.  That
could be represented with a char.  However the upper bound
expression of the loop generated by CLOOG is min (127, 51*scat_1 + 50)
and that would overflow if we use a char type.  To evaluate a type
in which the expression 51*scat_1 + 50 does not overflow, we have to
compute an upper and lower bound for the expression.
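
As a rough illustration of that computation (a sketch only, not the code in the patch), the bounds of an affine expression a*x + b follow from the bounds of x, and the smallest signed width that holds them can then be searched for.  The names interval, affine_bounds and signed_bits_needed are hypothetical, and the range [0, 2] for scat_1 is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int64_t lb, ub; } interval;

/* Bounds of a*x + b for x in [x.lb, x.ub]; a negative a swaps the ends.  */
static interval affine_bounds(int64_t a, interval x, int64_t b)
{
    interval r;
    int64_t p, q;
    assert(x.lb <= x.ub);
    p = a * x.lb + b;
    q = a * x.ub + b;
    r.lb = p < q ? p : q;
    r.ub = p < q ? q : p;
    return r;
}

/* Smallest two's-complement width, in bits, that holds every value in r.
   A signed type of width w covers [-2^(w-1), 2^(w-1) - 1].  */
static int signed_bits_needed(interval r)
{
    int bits = 1;
    while (bits < 64
           && (r.lb < -((int64_t) 1 << (bits - 1))
               || r.ub > ((int64_t) 1 << (bits - 1)) - 1))
        bits++;
    return bits;
}
```

With scat_1 assumed to lie in [0, 2], 51*scat_1 + 50 ranges over [50, 152]: the induction variable itself fits in 8 signed bits, but the bound expression needs 9.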

To fix the problem exposed by Tobias:

 for (i = 0 ; i < 2; i++)
   for (j = i ; j < i + 1; j++)
     for (k = j ; k < j + 1; k++)
       for (m = k ; m < k + 1; m++)
         for (n = m ; n < m + 1; n++)
           A[0] += A[n];
 
 I am a little bit afraid that we will increase the type size by an
 order of magnitude (or at least one bit) for each nesting level.

instead of computing the lb and ub of scat_1 in 51*scat_1 + 50 based
on the type of scat_1 (that we already code generated when building
the outer loop), we use the polyhedral representation to get an
accurate lb and ub for scat_1.

When translating the substitutions of a user statement using this
precise method, like for example S5 in vect-pr43423.c:

  for (scat_1=0;scat_1<=min(T_3-1,T_4-1);scat_1++) {
    S5(scat_1);

we get a type that is too precise: based on the interval [0,99] we get
the type unsigned char when the type of scat_1 is int, misleading
the vectorizer due to the insertion of spurious casts:

#  Access function 0: (int) {(<unnamed-unsigned:8>) graphite_IV.7_56, +, 1}_3;
#)
affine dependence test not usable: access function not affine or constant.

So we have to keep around the previous code gcc_type_for_clast_* that
computes the type of an expression as the max precision of the
components of that expression, and use that when computing the types
of substitution expressions.

Together, the patches passed a full bootstrap and test on amd64-linux.
Ok for trunk?

Thanks,
Sebastian

