RE: [PATCH] Fix stack red zone bug (PR38644)

2011-10-09 Thread Jiangning Liu


 -----Original Message-----
 From: Richard Henderson [mailto:r...@redhat.com]
 Sent: Saturday, October 01, 2011 3:05 AM
 To: Jiangning Liu
 Cc: 'Jakub Jelinek'; 'Richard Guenther'; Andrew Pinski; gcc-
 patc...@gcc.gnu.org
 Subject: Re: [PATCH] Fix stack red zone bug (PR38644)
 
 On 09/29/2011 06:13 PM, Jiangning Liu wrote:
 
 
  -----Original Message-----
  From: Jakub Jelinek [mailto:ja...@redhat.com]
  Sent: Thursday, September 29, 2011 6:14 PM
  To: Jiangning Liu
  Cc: 'Richard Guenther'; Andrew Pinski; gcc-patches@gcc.gnu.org
  Subject: Re: [PATCH] Fix stack red zone bug (PR38644)
 
  On Thu, Sep 29, 2011 at 06:08:50PM +0800, Jiangning Liu wrote:
  As far as I know, different back-ends implement different
  prologues/epilogues in GCC. If one day this part can be refined and
  abstracted as well, I would say solving this stack-red-zone problem in
  shared prologue/epilogue code would be a perfect solution, and the
  barrier can be inserted there.
 
  I'm not saying you are wrong about keeping the scheduler on a pure
  barrier interface. From an engineering point of view, I only feel my
  proposal is good enough so far, because this patch at least solves the
  problem for all targets in a quite simple way. Maybe it can be improved
  in future based on this.
 
  But you don't want to listen to any other alternative; other backends
  are happy with being able to put the best kind of barrier at the best
  spot in the epilogue and don't need a generic solution which won't
  model the target diversity very well anyway.
 
  Jakub,
 
  I appreciate your attention to this issue.
 
  1) Can you clarify which the other back-ends are? Do they cover most of
  the back-ends supported by GCC right now?
 
 Your red-stack barrier issue is *exactly* the same as the frame pointer
 barrier issue, which affects many backends.
 
 That is, if the frame pointer is initialized before the local stack
 frame is allocated, then one has to add a barrier such that memory
 references based on the frame pointer are not scheduled before the
 local stack frame allocation.
 
 One example of this is in the i386 port, where the prologue looks like
 
   push %ebp
   mov %esp, %ebp
   sub $frame, %esp
 
 The rtl we emit for that subtraction looks like
 
 (define_insn "pro_epilogue_adjust_stack_<mode>_add"
   [(set (match_operand:P 0 "register_operand" "=r,r")
         (plus:P (match_operand:P 1 "register_operand" "0,r")
                 (match_operand:P 2 "<nonmemory_operand>" "r<i>,l<i>")))
    (clobber (reg:CC FLAGS_REG))
    (clobber (mem:BLK (scratch)))]
 
 Note the final clobber, which is a memory scheduling barrier.
 
 Other targets use similar tricks.  For instance arm stack_tie.
 
 Honestly, I've found nothing convincing throughout this thread that
 suggests to me that this problem should be handled generically.
 

Richard H.,

Thanks for explaining this with an x86 example.

The key point is that, if possible, fixing it in the middle end benefits
all ports directly and avoids the bug-fixing burden in the back-ends,
rather than fixing this problem port by port.

Actually, the debate here is now whether the memory barrier is properly
modeled throughout GCC rather than in a single component, because my
current understanding is that the scheduler is not the only component
that uses memory barriers.

Thanks,
-Jiangning

 
 r~






RE: [PATCH] Fix stack red zone bug (PR38644)

2011-10-09 Thread Jiangning Liu


 -----Original Message-----
 From: Richard Guenther [mailto:richard.guent...@gmail.com]
 Sent: Friday, September 30, 2011 8:57 PM
 To: Jiangning Liu; Jakub Jelinek; Richard Guenther; Andrew Pinski; gcc-
 patc...@gcc.gnu.org; richard.sandif...@linaro.org
 Subject: Re: [PATCH] Fix stack red zone bug (PR38644)
 
 On Fri, Sep 30, 2011 at 2:46 PM, Richard Sandiford
 richard.sandif...@linaro.org wrote:
  Jiangning Liu jiangning@arm.com writes:
  You seem to feel strongly about this because it's a wrong-code bug
  that is very easy to introduce and often very hard to detect.  And I
  definitely sympathise with that.  If we were going to do it in a
  target-independent way, though, I think it would be better to scan
  patterns like "epilogue" and automatically introduce barriers before
  assignments to stack_pointer_rtx (subject to the kind of hook in your
  patch).  But I still don't think that's better than leaving the onus
  on the backend.  The backend is still responsible for much more
  complicated things like determining the correct deallocation and
  register-restore sequence, and for determining the correct CFI
  sequence.
 
 
  I think the middle end in GCC is actually the shared code rather than
  the part exactly in the middle. A pass working on RTL can be middle-end
  code simply because it is shared by all targets, and some passes can
  even work for both GIMPLE and RTL.
 
  Actually, some optimizations need to work through the shared part
  (middle end) plus the target-specific part (back end). You think the
  interface between the shared part and the target-specific part should
  use a barrier as the proper model. To some extent I agree with this.
  However, it doesn't mean the fix should be in the back end rather than
  the middle end, because obviously this problem is a common ABI issue
  for all targets. If we can abstract this issue into the shared part,
  why shouldn't we do it in the middle end to reduce the onus on the
  back end? The back end should handle the target-specific things rather
  than only the complicated things.
 
  And for avoidance of doubt, the automatic barrier insertion that I
  described would be one way of doing it in target-independent code.
  But...
 
  If a complicated problem can be implemented in shared code, we still
  want to put it into the middle end rather than the back end. I believe
  the optimizations based on SSA form are complicated enough, but they
  are all in the middle end. This is the logic I'm seeing in GCC.
 
  The situation here is different.  The target-independent rtl code is
  being given a blob of instructions that the backend has generated for
  the epilogue.  There's no fine-tuning beyond that.  E.g. we don't have
  separate patterns for "restore registers", "deallocate stack",
  "return": we just have one monolithic "epilogue" pattern.  The
  target-independent code has very little control.
 
  In contrast, after the tree optimisers have handed off the initial IL,
  the tree optimisers are more or less in full control.  There are very
  few cases where we generate further trees outside the middle-end.  The
  only case I know off-hand is the innards of va_start and va_arg, which
  can be generated by the backend.
 
  So let's suppose we had a similar situation there, where we wanted
  va_arg to do something special in a certain situation.  If we had the
  same three choices of:
 
   1. use an on-the-side hook to represent the special something
   2. scan the code generated by the backend and automatically
      inject the special something at an appropriate place
   3. require each backend to do it properly from the start
 
  (OK, slightly prejudiced wording :-)) I think we'd still choose 3.
 
  For this particular issue, I don't think the hook interface I'm
  proposing is more complicated than the barrier. Instead, it makes it
  easier for a back-end implementer to be aware of the potential issue
  before actually solving the stack red zone problem, because it is very
  clearly listed in the target hook list.
 
  The point for "model it in the IL" supporters like myself is that we
  have both many backends and many rtl passes.  Putting it in a hook
  keeps things simple for the backends, but it means that every rtl pass
  must be aware of this on-the-side dependency.  Perhaps sched2 really is
  the only pass that needs to look at the hook at present.  But perhaps
  not.  E.g. dbr_schedule (not a problem on ARM, I realise) also reorders
  instructions, so maybe it would need to be audited to see whether any
  calls to this hook are needed.  And perhaps we'd add more rtl passes
  later.
 
  The point behind using a barrier is that the rtl passes do not then
  need to treat the stack-deallocation dependency as a special case.
  They can just use the normal analysis and get it right.
 
  In other words, we're both arguing for safety here.
 
 Indeed.  It's certainly not only scheduling that can move instructions,
 but RTL PRE, combine, ifcvt all can 

Out-of-order update of new_spill_reg_store[]

2011-10-09 Thread Richard Sandiford
This patch fixes an ordering problem in reload: the output reloads are
emitted in reverse operand order, but new_spill_reg_store[] is updated
in forward reload order.  This causes problems if the same register is
used for two reloads.

I saw this hit on mips64-linux-gnu/-mabi=64 as a failure in
execute/scal-to-vec1.c at -O3.  The reloads were:

Reloads for insn # 580
Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:SI 5 $5)
Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
        MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:SI 32 $f0 [1655])
        reload_reg_rtx: (reg:SI 65 lo)
        secondary_out_reload = 0

Reload 2: reload_out (SI) = (reg:SI 1656)
        GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
        reload_out_reg: (reg:SI 1656)
        reload_reg_rtx: (reg:SI 5 $5)

So $5 is first stored in 1656 (operand 3), then $5 is used as a secondary
reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
use of 1655 ends up inheriting this second reload of $5, so we try to
delete the original output copy.  The problem is that we delete the
wrong one: we delete the store of $5 to 1656 rather than the copy of
$5 to 1655/$f0.

The fix I went for is to clear new_spill_reg_store[] for all reloads
as a separate pass (rather than in the main do_{input,output}_reload
loop), then only allow new_spill_reg_store[] to be set if the associated
reload register reaches the end of the reload sequence.

emit_input_reload_insns has:

  /* Output a special code sequence for this case, and forget about
     spill reg information.  */
  new_spill_reg_store[REGNO (reloadreg)] = NULL;
  inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);

I think this store is redundant: emit_reload_insns should already have
cleared it, both before and after the patch.  (The code was originally:

  /* Output a special code sequence for this case.  */
  new_spill_reg_store[REGNO (reloadreg)]
    = inc_for_reload (reloadreg, oldequiv, rl->out,
                      rl->inc);

but was changed because we can't inherit auto-inc reloads as easily
as that.  So the nullification came from an existing new_spill_reg_store[]
assignment, rather than being added explicitly.)

Also, emit_reload_insns has two blocks to record inheritance information:
one for spill registers and one for non-spill registers.  The spill version
checks that the reload register reaches the end of the sequence, and I think
the non-spill version should too.

Tested on mips64-linux-gnu and x86_64-linux-gnu.  It fixes the testcase
(by deleting the correct instruction -- the inheritance still happens).
Bernd, Uli, does this look OK?

Richard


gcc/
* reload1.c (reload_regs_reach_end_p): Replace with...
(reload_reg_rtx_reaches_end_p): ...this function.
(new_spill_reg_store): Update commentary.
(emit_input_reload_insns): Don't clear new_spill_reg_store here.
(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
before setting new_spill_reg_store.
(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
Also use reload_reg_rtx_reaches_end_p when recording inheritance
information for non-spill reload registers.

Index: gcc/reload1.c
===
--- gcc/reload1.c   2011-10-08 16:32:26.0 +0100
+++ gcc/reload1.c   2011-10-08 16:32:26.0 +0100
@@ -5499,15 +5499,15 @@ reload_reg_reaches_end_p (unsigned int r
 }
 
 /* Like reload_reg_reaches_end_p, but check that the condition holds for
-   every register in the range [REGNO, REGNO + NREGS).  */
+   every register in REG.  */
 
 static bool
-reload_regs_reach_end_p (unsigned int regno, int nregs, int reloadnum)
+reload_reg_rtx_reaches_end_p (rtx reg, int reloadnum)
 {
-  int i;
+  unsigned int i;
 
-  for (i = 0; i < nregs; i++)
-    if (!reload_reg_reaches_end_p (regno + i, reloadnum))
+  for (i = REGNO (reg); i < END_REGNO (reg); i++)
+    if (!reload_reg_reaches_end_p (i, reloadnum))
   return false;
   return true;
 }
@@ -7052,7 +7052,9 @@ static rtx operand_reload_insns = 0;
 static rtx other_operand_reload_insns = 0;
 static rtx other_output_reload_insns[MAX_RECOG_OPERANDS];
 
-/* Values to be put in spill_reg_store are put here first.  */
+/* Values to be put in spill_reg_store are put here first.  Instructions
+   must only be placed here if the associated reload register reaches
+   the end of the instruction's reload sequence.  */
 static rtx new_spill_reg_store[FIRST_PSEUDO_REGISTER];
 static HARD_REG_SET reg_reloaded_died;
 
@@ -7213,9 +7215,7 @@ emit_input_reload_insns (struct insn_cha
 
   /* Prevent normal processing of this reload.  */
   special = 1;
-  /* Output a special code sequence for this case, and forget about
-spill 

Re: PATCH RFA: New configure option --with-native-system-header-dir

2011-10-09 Thread Ian Lance Taylor
Ian Lance Taylor i...@google.com writes:

 So, it seems to me that we should:

   * Remove SYSTEM_INCLUDE_DIR, which is undefined and unnecessary.

   * Move the definition of NATIVE_SYSTEM_HEADER_DIR into config.gcc
 (named native_system_header_dir).  The default is /usr/include.
 This appears to be necessary since the configure script itself needs
 to know this value.

   * Have the configure script use NATIVE_SYSTEM_HEADER_DIR when setting
 target_header_dir.

   * Arrange for Makefile to define NATIVE_SYSTEM_HEADER_DIR when
 compiling cppdefault.c (i.e., add it to PREPROCESSOR_DEFINES in
 Makefile.in).

   * Replace STANDARD_INCLUDE_DIR in cppdefault.c with
 NATIVE_SYSTEM_HEADER_DIR.

   * Remove STANDARD_INCLUDE_DIR.

   * Add the --with-native-system-header-dir option.


This patch implements this proposal.  Only lightly tested so far.  How
does this look if testing succeeds?

Ian


2011-10-08  Simon Baldwin  sim...@google.com
Ian Lance Taylor  i...@google.com

* configure.ac: Add --with-native-system-header-dir.  Set and
substitute NATIVE_SYSTEM_HEADER_DIR.  Use native_system_header
when setting target_header_dir.
* config.gcc: Always set native_system_header_dir.
(*-*-gnu*): Set native_system_header_dir.  Don't use t-gnu.
(i[34567]86-pc-msdosdjgpp*): Set native_system_header_dir.  Don't
use i386/t-djgpp.
(i[34567]86-*-mingw* | x86_64-*-mingw*): Set
native_system_header_dir.
(spu-*-elf*): Set native_system_header_dir.
* Makefile.in (NATIVE_SYSTEM_HEADER_DIR): Set to
@NATIVE_SYSTEM_HEADER_DIR@.
(PREPROCESSOR_DEFINES): Define NATIVE_SYSTEM_HEADER_DIR.
* cppdefault.c (STANDARD_INCLUDE_DIR): Don't define.
(NATIVE_SYSTEM_HEADER_COMPONENT): Rename from
STANDARD_INCLUDE_COMPONENT.
(cpp_include_defaults): Don't use SYSTEM_INCLUDE_DIR.  Rename
STANDARD_INCLUDE_DIR to NATIVE_SYSTEM_HEADER_DIR.
* system.h: Poison SYSTEM_INCLUDE_DIR, STANDARD_INCLUDE_DIR, and
STANDARD_INCLUDE_COMPONENT.
* config/i386/t-mingw32 (NATIVE_SYSTEM_HEADER_DIR): Remove.
* config/i386/t-mingw-w32: Likewise.
* config/i386/t-mingw-w64: Likewise.
* config/spu/t-spu-elf: Likewise.
* config/i386/t-djgpp: Remove.
* config/t-gnu: Remove.
* config/i386/mingw32.h (STANDARD_INCLUDE_DIR): Don't define.
(NATIVE_SYSTEM_HEADER_COMPONENT): Rename from
STANDARD_INCLUDE_COMPONENT.
* config/i386/djgpp.h (STANDARD_INCLUDE_DIR): Don't define.
* config/spu/spu-elf.h: Likewise.
* config/vms/xm-vms.h: Likewise.
* config/gnu.h: Likewise.
* config/openbsd.h (INCLUDE_DEFAULTS): Change STANDARD_INCLUDE_DIR
and STANDARD_INCLUDE_COMPONENT to NATIVE_SYSTEM_HEADER_DIR and
NATIVE_SYSTEM_HEADER_COMPONENT.
* doc/install.texi (Configuration): Document
--with-native-system-header-dir.  Mention it in the documentation
for --with-sysroot and --with-build-sysroot.
* doc/tm.texi.in (Driver): Don't document SYSTEM_INCLUDE_DIR or
STANDARD_INCLUDE_DIR.  Rename STANDARD_INCLUDE_COMPONENT to
NATIVE_SYSTEM_HEADER_COMPONENT.  Rename uses of
STANDARD_INCLUDE_DIR to NATIVE_SYSTEM_HEADER_DIR.
* doc/fragments.texi (Target Fragment): Don't document
NATIVE_SYSTEM_HEADER_DIR.
* configure, doc/tm.texi: Rebuild.


Index: doc/fragments.texi
===
--- doc/fragments.texi	(revision 179696)
+++ doc/fragments.texi	(working copy)
@@ -1,5 +1,6 @@
 @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-@c 1999, 2000, 2001, 2003, 2004, 2005, 2008 Free Software Foundation, Inc.
+@c 1999, 2000, 2001, 2003, 2004, 2005, 2008, 2011
+@c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.
 
@@ -128,12 +129,6 @@ compiler.  In that case, set @code{MULTI
 of options to be used for all builds.  If you set this, you should
 probably set @code{CRTSTUFF_T_CFLAGS} to a dash followed by it.
 
-@findex NATIVE_SYSTEM_HEADER_DIR
-@item NATIVE_SYSTEM_HEADER_DIR
-If the default location for system headers is not @file{/usr/include},
-you must set this to the directory containing the headers.  This value
-should match the value of the @code{SYSTEM_INCLUDE_DIR} macro.
-
 @findex SPECS
 @item SPECS
 Unfortunately, setting @code{MULTILIB_EXTRA_OPTS} is not enough, since
Index: doc/tm.texi.in
===
--- doc/tm.texi.in	(revision 179696)
+++ doc/tm.texi.in	(working copy)
@@ -468,33 +468,15 @@ initialize the necessary environment var
 Define this macro as a C string constant if you wish to override the
 standard choice of @file{/usr/local/include} as the default prefix to
 try when searching for local header files.  

Re: [google] record compiler options to .note sections

2011-10-09 Thread Jakub Jelinek
On Sun, Oct 09, 2011 at 09:18:25AM +0800, Dehao Chen wrote:
 Unfortunately -frecord-gcc-switches cannot serve our purpose because
 the recorded switches are mergeable, i.e. the linker will merge all
 options to a set of strings. However, object files may have distinct
 compile options. We want to preserve every object file's compile
 options when doing a LIPO build.

And -grecord-gcc-switches?  That one, although it is mergeable, still
preserves every object file's compile options.

Jakub


RE: [PATCH] Fix stack red zone bug (PR38644)

2011-10-09 Thread Jiangning Liu


 -----Original Message-----
 From: Richard Sandiford richard.sandif...@linaro.org
 Date: Fri, Sep 30, 2011 at 8:46 PM
 Subject: Re: [PATCH] Fix stack red zone bug (PR38644)
 To: Jiangning Liu jiangning@arm.com
 Cc: Jakub Jelinek ja...@redhat.com, Richard Guenther
 richard.guent...@gmail.com, Andrew Pinski pins...@gmail.com,
 gcc-patches@gcc.gnu.org
 
 
 Jiangning Liu jiangning@arm.com writes:
  You seem to feel strongly about this because it's a wrong-code bug
  that is very easy to introduce and often very hard to detect.  And I
  definitely sympathise with that.  If we were going to do it in a
  target-independent way, though, I think it would be better to scan
  patterns like "epilogue" and automatically introduce barriers before
  assignments to stack_pointer_rtx (subject to the kind of hook in your
  patch).  But I still don't think that's better than leaving the onus
  on the backend.  The backend is still responsible for much more
  complicated things like determining the correct deallocation and
  register-restore sequence, and for determining the correct CFI
  sequence.
 
 
  I think the middle end in GCC is actually the shared code rather than
  the part exactly in the middle. A pass working on RTL can be middle-end
  code simply because it is shared by all targets, and some passes can
  even work for both GIMPLE and RTL.
 
  Actually, some optimizations need to work through the shared part
  (middle end) plus the target-specific part (back end). You think the
  interface between the shared part and the target-specific part should
  use a barrier as the proper model. To some extent I agree with this.
  However, it doesn't mean the fix should be in the back end rather than
  the middle end, because obviously this problem is a common ABI issue
  for all targets. If we can abstract this issue into the shared part,
  why shouldn't we do it in the middle end to reduce the onus on the
  back end? The back end should handle the target-specific things rather
  than only the complicated things.
 
 And for avoidance of doubt, the automatic barrier insertion that I
 described would be one way of doing it in target-independent code.
 But...
 
  If a complicated problem can be implemented in shared code, we still
  want to put it into the middle end rather than the back end. I believe
  the optimizations based on SSA form are complicated enough, but they
  are all in the middle end. This is the logic I'm seeing in GCC.
 
 The situation here is different.  The target-independent rtl code is
 being given a blob of instructions that the backend has generated for
 the epilogue.  There's no fine-tuning beyond that.  E.g. we don't have
 separate patterns for "restore registers", "deallocate stack",
 "return": we just have one monolithic "epilogue" pattern.  The
 target-independent code has very little control.
 
 In contrast, after the tree optimisers have handed off the initial IL,
 the tree optimisers are more or less in full control.  There are very
 few cases where we generate further trees outside the middle-end.  The
 only case I know off-hand is the innards of va_start and va_arg, which
 can be generated by the backend.
 
 So let's suppose we had a similar situation there, where we wanted
 va_arg to do something special in a certain situation.  If we had the
 same three choices of:
 
  1. use an on-the-side hook to represent the special something
  2. scan the code generated by the backend and automatically
     inject the special something at an appropriate place
  3. require each backend to do it properly from the start
 
 (OK, slightly prejudiced wording :-)) I think we'd still choose 3.
 

Richard S.,

Although I implemented va_arg for a commercial compiler a long time ago,
I have forgotten all the details. :-) I'm not sure whether va_arg is a
good example to compare with this stack red zone case.

  For this particular issue, I don't think the hook interface I'm
  proposing is more complicated than the barrier. Instead, it makes it
  easier for a back-end implementer to be aware of the potential issue
  before actually solving the stack red zone problem, because it is very
  clearly listed in the target hook list.
 
 The point for "model it in the IL" supporters like myself is that we
 have both many backends and many rtl passes.  Putting it in a hook
 keeps things simple for the backends, but it means that every rtl pass
 must be aware of this on-the-side dependency.  Perhaps sched2 really is
 the only pass that needs to look at the hook at present.  But perhaps
 not.  E.g. dbr_schedule (not a problem on ARM, I realise) also reorders
 instructions, so maybe it would need to be audited to see whether any
 calls to this hook are needed.  And perhaps we'd add more rtl passes
 later.

Let me rephrase your justification in my own words.

===

We can't compare adding a new pass and adding a new port, because they are
totally different things. But it implies with my proposal the burden may
still be added 

Fix for PR libobjc/49883 (clang + gcc 4.6 runtime = broken) and a small related clang fix

2011-10-09 Thread Nicola Pero
This patch fixes PR libobjc/49883.  To fix it, I installed clang and tried
out what happens if you compile Objective-C code using clang and targeting
the GCC runtime.

Unfortunately, the report was correct in that clang is producing incorrect
code and abusing the higher bits of the class->info field to store some
other information.  On the good side, the fix I proposed in the discussion
of PR libobjc/49883 actually works. :-)

So, I applied that fix.

I also found that clang still emits calls to the objc_lookup_class()
function, so this patch also adds that function back into the runtime to
get code compiled with clang to work.

Committed to trunk.

Thanks

PS: In case anyone wonders, I do want the GNU Objective-C Runtime to be
usable with free, non-GCC Objective-C compilers.  It should obviously work
perfectly with GCC, the GNU compiler, which is its natural partner, but some
people would like to use it with other free compilers and that seems a
reasonable request.  Refusing that request just provides an incentive to
write and support other Objective-C runtimes, which is a waste of time and
resources. ;-)

Index: init.c
===
--- init.c  (revision 179711)
+++ init.c  (working copy)
@@ -643,6 +643,15 @@
   assert (CLS_ISMETA (class->class_pointer));
   DEBUG_PRINTF (" installing class '%s'\n", class->name);
 
+  /* Workaround for a bug in clang: Clang may set flags other than
+_CLS_CLASS and _CLS_META even when compiling for the
+traditional ABI (version 8), confusing our runtime.  Try to
+wipe these flags out.  */
+  if (CLS_ISCLASS (class))
+   __CLS_INFO (class) = _CLS_CLASS;
+  else
+   __CLS_INFO (class) = _CLS_META;
+
   /* Initialize the subclass list to be NULL.  In some cases it
 isn't and this crashes the program.  */
   class->subclass_list = NULL;
Index: class.c
===
--- class.c (revision 179711)
+++ class.c (working copy)
@@ -764,6 +764,15 @@
   return objc_get_class (name)->class_pointer;
 }
 
+/* This is not used by GCC, but the clang compiler seems to use it
+   when targetting the GNU runtime.  That's wrong, but we have it to
+   be compatible.  */
+Class
+objc_lookup_class (const char *name)
+{
+  return objc_getClass (name);
+}
+
 /* This is used when the implementation of a method changes.  It goes
through all classes, looking for the ones that have these methods
(either method_a or method_b; method_b can be NULL), and reloads
Index: ChangeLog
===
--- ChangeLog   (revision 179711)
+++ ChangeLog   (working copy)
@@ -1,3 +1,18 @@
+2011-10-09  Nicola Pero  nicola.p...@meta-innovation.com
+
+   PR libobjc/49883
+   * init.c (__objc_exec_class): Work around a bug in clang's code
+   generation.  Clang sets the class-info field to values different
+   from 0x1 or 0x2 (the only allowed values in the traditional GNU
+   Objective-C runtime ABI) to store some additional information, but
+   this breaks backwards compatibility.  Wipe out all the bits in the
+   fields other than the first two upon loading a class.
+
+2011-10-09  Nicola Pero  nicola.p...@meta-innovation.com
+
+   * class.c (objc_lookup_class): Added back for compatibility with
+   clang which seems to emit calls to it.
+
 2011-10-08  Richard Frith-Macdonald r...@gnu.org
 Nicola Pero  nicola.p...@meta-innovation.com




Re: [RFC] Slightly fix up vgather* patterns

2011-10-09 Thread Uros Bizjak
On Sat, Oct 8, 2011 at 5:43 PM, Jakub Jelinek ja...@redhat.com wrote:

 The AVX2 docs say that the insns will #UD if any of the mask, src and index
 registers are the same, but e.g. on
 #include <x86intrin.h>

 __m256 m;
 float f[1024];

 __m256
 foo (void)
 {
  __m256i mi = (__m256i) m;
  return _mm256_mask_i32gather_ps (m, f, mi, m, 4);
 }

 which is IMHO valid and should for m being zero vector just return a
 zero vector and clear mask (in this case it was already cleared) we compile
 it as
        vmovdqa m(%rip), %ymm1
        vmovaps %ymm1, %ymm0
        vgatherdps      %ymm1, (%rax, %ymm1, 4), %ymm0
 and thus IMHO it will #UD.  Also, the insns should make it clear that
 the mask register is modified too (the patch clobbers it, perhaps
 we could instead say that it zeros the register (which is true if
 it doesn't segfault), but then what if a segfault handler chooses to
 continue with the next insn and doesn't clear the mask register?).
 Still, the insn description is imprecise, saying that it loads from mem
 at the address register is wrong and perhaps some DCE might delete
 what shouldn't be deleted.  So, either it should (use (mem (scratch)))
 or something similar, or in the unspec list all the memory locations
 that are being read
 (mem:scalarssemode (plus:SI (reg:SI) (vec_select:SI (match_operand:V4SI)
 (parallel [(const_int N)]
 for N 0 through something (but it is complicated by Pmode size vs.
 the need to do nothing/truncate/sign_extend the vec_select to the right
 mode).

 What do you think?

Regarding the clear of mask operand: I agree that this should be
modelled as a clobber. Zeroing can't be guaranteed due to the fact you
described above.

About memory - can't we use
(mem:BLK (match_operand:P "register_operand" "r")) here?

BTW: No need to use %c modifier:

/* Meaning of CODE:
   L,W,B,Q,S,T -- print the opcode suffix for specified size of operand.
   C -- print opcode suffix for set/cmov insn.
   c -- like C, but print reversed condition
   ...
*/

Uros.


Re: [patch] C6X unwinding/exception handling

2011-10-09 Thread Matthias Klose
This did break libobjc and libjava on arm-linux-gnueabi.

libobjc now has an undefined reference to _Unwind_decode_target2, which can be
avoided with

--- libobjc/exception.c.orig2011-07-21 15:33:57.0 +
+++ libobjc/exception.c 2011-10-09 10:53:12.554940776 +
@@ -182,7 +182,7 @@
   _Unwind_Ptr ptr;

   ptr = (_Unwind_Ptr) (info->TType - (i * 4));
-  ptr = _Unwind_decode_target2 (ptr);
+  ptr = _Unwind_decode_typeinfo_ptr (info->ttype_base, (_Unwind_Word) ptr);

   /* NULL ptr means catch-all.  Note that if the class is not found,
  this will abort the program.  */

libjava fails to build, the same change doesn't work for libjava/exception.cc,
because the struct lsda_header_info in exception.cc is missing the ttype_base
member. Any suggestions?

On 09/13/2011 02:48 PM, Paul Brook wrote:
 C6X uses an unwinding/exception handling scheme very similar to that
 defined by the ARM EABI.  The core of the unwinder is the same, so I've
 pulled it out into a common file.

 Other than the obvious target specific bits, the main compiler visible
 difference is that the C6X assembler generates the unwinding tables from
 DWARF .cfi directives, rather than the separate set of directives used by
 the ARM assembler.

 The libstdc++ changes probably deserve a bit of explanation. The ttype_base
 field was clearly used in an early draft of the ARM EABI, and the current
 ARM definition is a compatible subset of that used by C6X.
 _GLIBCXX_OVERRIDE_TTYPE_ENCODING is an unfortunate hack because when doing
 the ARM implementation I failed to realise ttype_encoding was the same
 thing as R_ARM_TARGET2.  We now have a lot of ARM binaries floating around
 with that field set incorrectly, so it's either this or an ABI bump.
 
 I've updated the patch to accommodate the move to libgcc/, done a quick
 sanity recheck of arm-linux and c6x-elf and applied to svn.
 
 P.S. in case it's not clear from my description, the libstdc++ changes
 aren't really a new hack, it's just making an old one more obvious.
 
 Paul
 
 2011-09-13  Paul Brook  p...@codesourcery.com
  
   gcc/
   * config/arm/arm.h (ASM_PREFERRED_EH_DATA_FORMAT): Define.
   (ARM_TARGET2_DWARF_FORMAT): Provide default definition.
   * config/arm/linux-eabi.h (ARM_TARGET2_DWARF_FORMAT): Define.
   * config/arm/symbian.h (ARM_TARGET2_DWARF_FORMAT): Define.
   * config/arm/uclinux-eabi.h (ARM_TARGET2_DWARF_FORMAT): Define.
   * config/arm/t-bpabi (EXTRA_HEADERS): Add unwind-arm-common.h.
   * config/arm/t-symbian (EXTRA_HEADERS): Add unwind-arm-common.h.
   * config/c6x/c6x.c (c6x_output_file_unwind): Don't rely on dwarf2 code
   enabling unwind tables.
   (c6x_debug_unwind_info): New function.
   (TARGET_ARM_EABI_UNWINDER): Define.
   (TARGET_DEBUG_UNWIND_INFO): Define.
   * config/c6x/c6x.h (DWARF_FRAME_RETURN_COLUMN): Define.
   (TARGET_EXTRA_CFI_SECTION): Remove.
   * config/c6x/t-c6x-elf (EXTRA_HEADERS): Set.
   * ginclude/unwind-arm-common.h: New file.
 
   libgcc/
   * config.host (tic6x-*-*): Add c6x/t-c6x-elf.  Set unwind_header.
   * unwind-c.c (PERSONALITY_FUNCTION): Use UNWIND_POINTER_REG.
   * unwind-arm-common.inc: New file.
   * config/arm/unwind-arm.c: Use unwind-arm-common.inc.
   * config/arm/unwind-arm.h: Use unwind-arm-common.h.
   (_GLIBCXX_OVERRIDE_TTYPE_ENCODING): Define.
   * config/c6x/libunwind.S: New file.
   * config/c6x/pr-support.c: New file.
   * config/c6x/unwind-c6x.c: New file.
   * config/c6x/unwind-c6x.h: New file.
   * config/c6x/t-c6x-elf: New file.
 
 
   libstdc++-v3/
   * libsupc++/eh_arm.cc (__cxa_end_cleanup): Add C6X implementation.
   * libsupc++/eh_call.cc (__cxa_call_unexpected): Set rtti_base.
   * libsupc++/eh_personality.cc (NO_SIZE_OF_ENCODED_VALUE): Remove
   __ARM_EABI_UNWINDER__ check.
   (parse_lsda_header): Check _GLIBCXX_OVERRIDE_TTYPE_ENCODING.
   (get_ttype_entry): Use generic implementation on ARM EABI.
   (check_exception_spec): Use _Unwind_decode_typeinfo_ptr and
   UNWIND_STACK_REG.
   (PERSONALITY_FUNCTION): Set ttype_base.



[Patch, Fortran, committed] PR 50659: [4.4/4.5/4.6/4.7 Regression] ICE with PROCEDURE statement

2011-10-09 Thread Janus Weil
Hi all,

I have just committed as obvious a patch for an ICE-on-valid problem
with PROCEDURE statements:

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179723

The problem was the following: When setting up an external procedure
or procedure pointer (declared via a PROCEDURE statement), we copy the
expressions for the array bounds and string length from the interface
symbol given in the PROCEDURE declaration (cf.
'resolve_procedure_interface'). If those expressions depend on the
actual args of the interface, we have to replace those args by the
args of the new procedure symbol that we're setting up. This is what
'gfc_expr_replace_symbols' / 'replace_symbol' does. Unfortunately we
failed to check whether the symbol we try to replace is actually a
dummy!
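
A minimal sketch of the missing check described above, not the actual
gfortran code. The struct and function names (sym_sketch,
replace_symbol_sketch) are hypothetical, and matching arguments by name
is an assumption made for illustration. The point is only that a symbol
which is not a dummy argument (for example a use-associated module
variable) is left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a front-end symbol.  */
struct sym_sketch
{
  const char *name;  /* symbol name */
  bool dummy;        /* true if the symbol is a dummy argument */
};

/* Return the symbol that should be used in the copied expression.
   Only dummy arguments of the interface are mapped to the arguments of
   the new procedure; everything else is returned unchanged.  */
static const struct sym_sketch *
replace_symbol_sketch (const struct sym_sketch *sym,
                       const struct sym_sketch *new_args, size_t n_args)
{
  size_t i;

  /* This is the check that was missing: never replace a non-dummy.  */
  if (!sym->dummy)
    return sym;

  for (i = 0; i < n_args; i++)
    if (strcmp (new_args[i].name, sym->name) == 0)
      return &new_args[i];

  return sym;
}
```

With this guard, a module variable that happens to share its name with
an argument of the new procedure is not rewritten.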

Contrary to Andrew's initial assumption, I think the test case is
valid. I could neither find a compiler which rejects it, nor a
restriction in the standard which makes it invalid. The relevant part
of F08 is probably chapter 7.1.11 (Specification expression). This
states that a specification expression can contain variables, which
are made accessible via use association.

I'm planning to apply the patch to the 4.6, 4.5 and 4.4 branches soon.

Cheers,
Janus


[C++ Patch] PR 50660

2011-10-09 Thread Paolo Carlini

Hi,

another duplicated diagnostic message. This one happens for snippets 
like the below due to the temporary for the const ref:


int g(const int);
int m2()
{
return g(__null);
}

50660.C:4:18: warning: passing NULL to non-pointer argument 1 of ‘int 
g(const int)’
50660.C:4:18: warning: passing NULL to non-pointer argument 1 of ‘int 
g(const int)’


I'm changing conversion_null_warnings to return true when a warning is
actually produced; convert_like_real checks that result before calling
itself recursively. I think it should be safe in that case to suppress
all further conversion warnings. Otherwise, as a last resort that would
certainly work, we could add an issue_conversion_null_warnings parameter
to convert_like_real.
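
To illustrate the idea outside of the front end, here is a small
self-contained sketch. It is not the code from call.c: the conv_sketch
struct, the warning counter and the function names are all hypothetical.
It only shows how returning a bool from the warning routine lets the
recursive caller clear its flag so that nested conversions do not repeat
the diagnostic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical conversion chain: each step wraps an inner step.  */
struct conv_sketch
{
  bool from_null;            /* would this step warn about NULL?  */
  struct conv_sketch *next;  /* inner conversion, or NULL at the end */
};

static int warnings_emitted;

/* Emit the warning if it applies.  Returns true if warned.  */
static bool
null_warning_sketch (const struct conv_sketch *c)
{
  if (!c->from_null)
    return false;
  fprintf (stderr, "warning: passing NULL to non-pointer argument\n");
  warnings_emitted++;
  return true;
}

/* Walk the chain recursively.  Once a warning has been issued, the
   flag is cleared so that inner steps stay silent.  */
static void
convert_chain_sketch (const struct conv_sketch *c, bool issue_warnings)
{
  if (c == NULL)
    return;
  if (issue_warnings && null_warning_sketch (c))
    issue_warnings = false;
  convert_chain_sketch (c->next, issue_warnings);
}
```

With two nested steps that would both warn, only the outer one produces
a message.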


Patch tested x86_64-linux.

Thanks,
Paolo.

/
2011-10-09  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50660
* call.c (conversion_null_warnings): Return true when a warning
is actually emitted.
(convert_like_real): When conversion_null_warnings returns true
set issue_conversion_warnings to false.

Index: call.c
===
--- call.c  (revision 179720)
+++ call.c  (working copy)
@@ -5509,9 +5509,9 @@ build_temp (tree expr, tree type, int flags,
 
 /* Perform warnings about peculiar, but valid, conversions from/to NULL.
EXPR is implicitly converted to type TOTYPE.
-   FN and ARGNUM are used for diagnostics.  */
+   FN and ARGNUM are used for diagnostics.  Returns true if warned.  */
 
-static void
+static bool
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
   tree t = non_reference (totype);
@@ -5526,6 +5526,7 @@ conversion_null_warnings (tree totype, tree expr,
   else
 warning_at (input_location, OPT_Wconversion_null,
             "converting to non-pointer type %qT from NULL", t);
+  return true;
 }
 
   /* Issue warnings if false is converted to a NULL pointer */
@@ -5538,7 +5539,9 @@ conversion_null_warnings (tree totype, tree expr,
   else
 warning_at (input_location, OPT_Wconversion_null,
             "converting %<false%> to pointer type %qT", t);
+  return true;
 }
+  return false;
 }
 
 /* Perform the conversions in CONVS on the expression EXPR.  FN and
@@ -5624,8 +5627,9 @@ convert_like_real (conversion *convs, tree expr, t
   return cp_convert (totype, expr);
 }
 
-  if (issue_conversion_warnings && (complain & tf_warning))
-    conversion_null_warnings (totype, expr, fn, argnum);
+  if (issue_conversion_warnings && (complain & tf_warning)
+      && conversion_null_warnings (totype, expr, fn, argnum))
+    issue_conversion_warnings = false;
 
   switch (convs->kind)
 {


[CRIS] Hookize PREFERRED_RELOAD_CLASS

2011-10-09 Thread Anatoly Sokolov
 Hello.

  This patch removes the obsolete PREFERRED_RELOAD_CLASS macro from the CRIS
back end in GCC and introduces the equivalent TARGET_PREFERRED_RELOAD_CLASS
target hook.

  Regression tested on cris-axis-elf.

  OK to install?

* config/cris/cris.c (cris_preferred_reload_class): New function.
(TARGET_PREFERRED_RELOAD_CLASS): Define.
* config/cris/cris.h (PREFERRED_RELOAD_CLASS): Remove.

Index: gcc/config/cris/cris.c
===
--- gcc/config/cris/cris.c  (revision 179721)
+++ gcc/config/cris/cris.c  (working copy)
@@ -123,6 +123,8 @@
 static void cris_file_start (void);
 static void cris_init_libfuncs (void);
 
+static reg_class_t cris_preferred_reload_class (rtx, reg_class_t);
+
 static int cris_register_move_cost (enum machine_mode, reg_class_t, 
reg_class_t);
 static int cris_memory_move_cost (enum machine_mode, reg_class_t, bool);
 static bool cris_rtx_costs (rtx, int, int, int, int *, bool);
@@ -198,6 +200,9 @@
 #undef TARGET_INIT_LIBFUNCS
 #define TARGET_INIT_LIBFUNCS cris_init_libfuncs
 
+#undef TARGET_PREFERRED_RELOAD_CLASS
+#define TARGET_PREFERRED_RELOAD_CLASS cris_preferred_reload_class
+
 #undef TARGET_REGISTER_MOVE_COST
 #define TARGET_REGISTER_MOVE_COST cris_register_move_cost
 #undef TARGET_MEMORY_MOVE_COST
@@ -1342,6 +1347,31 @@
   return false;
 }
 
+
+/* Worker function for TARGET_PREFERRED_RELOAD_CLASS.
+
+   It seems like gcc (2.7.2 and 2.9x of 2000-03-22) may send NO_REGS as
+   the class for a constant (testcase: __Mul in arit.c).  To avoid forcing
+   out a constant into the constant pool, we will trap this case and
+   return something a bit more sane.  FIXME: Check if this is a bug.
+   Beware that we must not override classes that can be specified as
+   constraint letters, or else asm operands using them will fail when
+   they need to be reloaded.  FIXME: Investigate whether that constitutes
+   a bug.  */
+
+static reg_class_t
+cris_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t rclass)
+{
+  if (rclass != ACR_REGS
+      && rclass != MOF_REGS
+      && rclass != SRP_REGS
+      && rclass != CC0_REGS
+      && rclass != SPECIAL_REGS)
+return GENERAL_REGS;
+
+  return rclass;
+}
+
 /* Worker function for TARGET_REGISTER_MOVE_COST.  */
 
 static int
Index: gcc/config/cris/cris.h
===
--- gcc/config/cris/cris.h  (revision 179721)
+++ gcc/config/cris/cris.h  (working copy)
@@ -583,22 +583,6 @@
 /* See REGNO_OK_FOR_BASE_P.  */
 #define REGNO_OK_FOR_INDEX_P(REGNO) REGNO_OK_FOR_BASE_P(REGNO)
 
-/* It seems like gcc (2.7.2 and 2.9x of 2000-03-22) may send NO_REGS as
-   the class for a constant (testcase: __Mul in arit.c).  To avoid forcing
-   out a constant into the constant pool, we will trap this case and
-   return something a bit more sane.  FIXME: Check if this is a bug.
-   Beware that we must not override classes that can be specified as
-   constraint letters, or else asm operands using them will fail when
-   they need to be reloaded.  FIXME: Investigate whether that constitutes
-   a bug.  */
-#define PREFERRED_RELOAD_CLASS(X, CLASS)   \
- ((CLASS) != ACR_REGS  \
-  && (CLASS) != MOF_REGS   \
-  && (CLASS) != SRP_REGS   \
-  && (CLASS) != CC0_REGS   \
-  && (CLASS) != SPECIAL_REGS   \
-  ? GENERAL_REGS : (CLASS))
-
 /* We can't move special registers to and from memory in smaller than
word_mode.  We also can't move between special registers.  Luckily,
-1, as returned by true_regnum for non-sub/registers, is valid as a


Anatoly.
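
For readers without a GCC tree at hand, the logic of the hook can be
tried in isolation with the sketch below. The enum is a hypothetical
stand-in for the CRIS register classes (the real definitions live in
cris.h and the real hook also receives an rtx); only the selection rule
is taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the CRIS register classes.  */
enum reg_class_sketch
{
  NO_REGS, ACR_REGS, MOF_REGS, SRP_REGS, CC0_REGS,
  SPECIAL_REGS, GENERAL_REGS, ALL_REGS
};

/* Same rule as the removed macro and the new hook: classes that can be
   named by constraint letters are kept, anything else (including
   NO_REGS) is steered to GENERAL_REGS.  */
static enum reg_class_sketch
preferred_reload_class_sketch (enum reg_class_sketch rclass)
{
  if (rclass != ACR_REGS
      && rclass != MOF_REGS
      && rclass != SRP_REGS
      && rclass != CC0_REGS
      && rclass != SPECIAL_REGS)
    return GENERAL_REGS;

  return rclass;
}
```

The function form behaves like the macro it replaces: NO_REGS and
ALL_REGS map to GENERAL_REGS, while for instance MOF_REGS is returned
unchanged.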



[C++ Patch] Trailing comma in enum

2011-10-09 Thread Magnus Fromreide
Hi.

As I understand it C++11 allows trailing commas in enum definitions.
Thus I think the following little patch should be included.

On a side note I have to say that the effects of pedwarn_cxx98 are
unexpected, especially in light of the comment above the function body.

/MF
2011-10-09 Magnus Fromreide ma...@lysator.liu.se
* gcc/cp/parser.c (cp_parser_enumerator_list): Do not warn about
trailing commas in C++0x mode.
* gcc/testsuite/g++.dg/cpp0x/enum21a.C: Test that enum x { y, } does
generate a pedwarn in C++98 mode.
* gcc/testsuite/g++.dg/cpp0x/enum21b.C: Test that enum x { y, }
doesn't generate a pedwarn in C++0x mode.
Index: gcc/testsuite/g++.dg/cpp0x/enum21a.C
===
--- gcc/testsuite/g++.dg/cpp0x/enum21a.C	(revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/enum21a.C	(revision 0)
@@ -0,0 +1,4 @@
+// { dg-do compile }
+// { dg-options "-pedantic" }
+
+enum x { y, }; // { dg-warning "comma at end of enumerator list" }
Index: gcc/testsuite/g++.dg/cpp0x/enum21b.C
===
--- gcc/testsuite/g++.dg/cpp0x/enum21b.C	(revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/enum21b.C	(revision 0)
@@ -0,0 +1,4 @@
+// { dg-do compile }
+// { dg-options "-pedantic -std=c++0x" }
+
+enum x { y, };
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 179711)
+++ gcc/cp/parser.c	(working copy)
@@ -13444,6 +13444,7 @@ cp_parser_elaborated_type_specifier (cp_parser* pa
 
enum-specifier:
  enum-head { enumerator-list [opt] }
+ enum-head { enumerator-list , } [C++0x]
 
enum-head:
  enum-key identifier [opt] enum-base [opt]
@@ -13463,6 +13464,8 @@ cp_parser_elaborated_type_specifier (cp_parser* pa
GNU Extensions:
  enum-key attributes[opt] identifier [opt] enum-base [opt] 
{ enumerator-list [opt] }attributes[opt]
+ enum-key attributes[opt] identifier [opt] enum-base [opt]
+   { enumerator-list, }attributes[opt] [C++0x]
 
Returns an ENUM_TYPE representing the enumeration, or NULL_TREE
if the token stream isn't an enum-specifier after all.  */
@@ -13802,8 +13805,9 @@ cp_parser_enumerator_list (cp_parser* parser, tree
   /* If the next token is a `}', there is a trailing comma.  */
   if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_BRACE))
 	{
-	  if (!in_system_header)
-	    pedwarn (input_location, OPT_pedantic, "comma at end of enumerator list");
+	  if (cxx_dialect < cxx0x && !in_system_header)
+	    pedwarn (input_location, OPT_pedantic,
+		     "comma at end of enumerator list");
 	  break;
 	}
 }


[Patch] Don't ignore testsuite errors in Makefile

2011-10-09 Thread Mikael Morin
Hello,

currently, the testsuite return value is ignored by make. It is a little 
annoying if one wants to check automatically for regressions as we have to 
parse the testsuite output.
This patch reverts to the normal make behaviour, which is to not ignore 
commands' return values. Note: As a result the -k flag has to be added to the 
make command line if one wants the tests to continue after one failure.

OK for trunk?

Mikael

PS: Jakub, I CCed you as you are the author of the Makefile chunk.
2011-10-09  Mikael Morin  mikael.mo...@sfr.fr

* Makefile.in (check-parallel-%): Don't ignore testsuite errors.
Index: Makefile.in
===
--- Makefile.in	(revision 179710)
+++ Makefile.in	(working copy)
@@ -5116,10 +5124,10 @@ $(patsubst %,%-subtargets,$(lang_checks_paralleliz
 # Otherwise check-$lang isn't parallelized and runtest is invoked just with
 # the $(RUNTESTFLAGS) arguments.
 check-parallel-% : site.exp
-	-test -d plugin || mkdir plugin
-	-test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
+	test -d plugin || mkdir plugin
+	test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
 	test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir $(TESTSUITEDIR)/$(check_p_subdir)
-	-(rootme=`${PWD_COMMAND}`; export rootme; \
+	(rootme=`${PWD_COMMAND}`; export rootme; \
 	srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
 	cd $(TESTSUITEDIR)/$(check_p_subdir); \
 	rm -f tmp-site.exp; \


[patch] Fix PR tree-optimization/50635

2011-10-09 Thread Ira Rosen
Hi,

In vectorizer pattern recognition when a pattern def_stmt already
exists, we need to mark it properly for the current pattern.
Another problem is that we don't really have to check that TYPE_OUT is
a vector type. It is set by the pattern detection procedures, and if
the type is invalid we fail later in the operation analysis anyway.

Bootstrapped and tested on powerpc64-suse-linux.
Committed.

Ira

ChangeLog:

PR tree-optimization/50635
* tree-vect-patterns.c (vect_handle_widen_mult_by_const): Add
DEF_STMT to the list of statements to be replaced by the
pattern statements.
(vect_pattern_recog_1): Don't check TYPE_OUT.

testsuite/ChangeLog:

PR tree-optimization/50635
* gcc.dg/vect/pr50635.c: New test.


Index: testsuite/gcc.dg/vect/pr50635.c
===
--- testsuite/gcc.dg/vect/pr50635.c (revision 0)
+++ testsuite/gcc.dg/vect/pr50635.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+
+typedef signed long int32_t;
+typedef char int8_t;
+
+void f0a(int32_t * result, int32_t * arg1, int8_t * arg2, int32_t temp_3)
+{
+  int idx;
+  for (idx=0;idx<10;idx += 1)
+{
+  int32_t temp_4;
+  int32_t temp_12;
+
+  temp_4 = (-2  arg2[idx]) + temp_3;
+  temp_12 = -2 * arg2[idx] + temp_4;
+  result[idx] = temp_12;
+}
+}
+
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
Index: tree-vect-patterns.c
===
--- tree-vect-patterns.c(revision 179718)
+++ tree-vect-patterns.c(working copy)
@@ -388,6 +388,7 @@ vect_handle_widen_mult_by_const (gimple stmt, tree
   || TREE_TYPE (gimple_assign_lhs (new_stmt)) != new_type)
 return false;

+  VEC_safe_push (gimple, heap, *stmts, def_stmt);
   *oprnd = gimple_assign_lhs (new_stmt);
 }
   else
@@ -1424,8 +1425,6 @@ vect_pattern_recog_1 (vect_recog_func_ptr vect_rec
 {
   /* No need to check target support (already checked by the pattern
  recognition function).  */
-  if (type_out)
-   gcc_assert (VECTOR_MODE_P (TYPE_MODE (type_out)));
   pattern_vectype = type_out ? type_out : type_in;
 }
   else


[committed] small change (was: Re: [Patch, Fortran] PR 35831: [F95] Shape mismatch check missing for dummy procedure argument)

2011-10-09 Thread Mikael Morin
On Tuesday 04 October 2011 20:54:21 Janus Weil wrote:
  The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk?
  
  The patch is basically OK.
  
  Otherwise I'll just start by committing the
  patch as posted ...
 
 Just did so (r179520).
 

Hello, 

I've just committed the following amendment as revision 179726.

Mikael

Index: interface.c
===
--- interface.c	(revision 179725)
+++ interface.c	(revision 179726)
@@ -1098,7 +1098,7 @@ check_dummy_characteristics (gfc_symbol *s1, gfc_s
 	  case  1:
 	  case -3:
 		snprintf (errmsg, err_len, "Shape mismatch in dimension %i of "
-			  "argument '%s'", i, s1->name);
+			  "argument '%s'", i + 1, s1->name);
 		return FAILURE;
 
 	  case -2:
Index: ChangeLog
===
--- ChangeLog	(revision 179725)
+++ ChangeLog	(revision 179726)
@@ -1,3 +1,8 @@
+2011-10-09  Mikael Morin  mikael.mo...@sfr.fr
+
+	* interface.c (check_dummy_characteristics): Count dimensions starting
+	from one in diagnostic.
+
 2011-10-09  Tobias Burnus  bur...@net-b.de
 
 	* Make-lang.in (F95_PARSER_OBJS, GFORTRAN_TRANS_DEPS): Add


Re: [Patch, fortran] [00/14] PR fortran/50420 Support coarray subreferences

2011-10-09 Thread Tobias Burnus

On 07.10.2011 16:38, Mikael Morin wrote:

The full patchset has passed the fortran testsuite successfully.
OK for trunk?


OK for the whole patch set. Thanks for finding and fixing the issue!

Tobias


Patches layout

  01..04/14: Add support for non-full arrays in descriptor initialization code.

  05..09/14: Make walk_coarray initialize the scalarizer structs properly to
 accept expression with subreferences.

  10..11/14: Fix corank checking

  12/14: Accept coarray subreferences in simplify_cobound

  13/14: Fix gfc_build_array_type

  14/14: Fix gfc_build_array_ref




[committed] Fix bogus e-mail address in ChangeLogs

2011-10-09 Thread Mikael Morin
Hello, 

it seems that a bogus e-mail address (mistake of mine in the first place) has 
been promoted lately to being the main way to (miss-)communicate with me.

Committed as revision 179727.

Mikael
Index: ChangeLog
===
--- ChangeLog	(revision 179726)
+++ ChangeLog	(revision 179727)
@@ -380,7 +380,7 @@
 	* symbol.c (check_conflict): Allow threadprivate attribute with
 	FL_PROCEDURE if proc_pointer.
 
-2011-08-25  Mikael Morin  mikael.mo...@gcc.gnu.org
+2011-08-25  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/50050
 	* expr.c (gfc_free_shape): Do nothing if shape is NULL.
@@ -430,7 +430,7 @@
 	* cpp.c (gfc_cpp_init): Force BUILTINS_LOCATION for tokens
 	defined in cpp_define_builtins.
 
-2011-08-22  Mikael Morin  mikael.mo...@gcc.gnu.org
+2011-08-22  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/50050
 	* gfortran.h (gfc_clear_shape, gfc_free_shape): New prototypes.
Index: ChangeLog-2010
===
--- ChangeLog-2010	(revision 179726)
+++ ChangeLog-2010	(revision 179727)
@@ -71,7 +71,7 @@
 	substring references.
 	(gfc_check_same_strlen):  Use gfc_var_strlen.
 
-2010-12-23  Mikael Morin  mikael.mo...@gcc.gnu.org
+2010-12-23  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/46978
 	Revert part of revision 164112


Re: [Patch, Fortran, committed] PR 50585: [4.6/4.7 Regression] ICE with assumed length character array argument

2011-10-09 Thread Tobias Burnus

On 08.10.2011 11:51, Janus Weil wrote:

Thanks! What about the .texi change for -fwhole-file?

Will do. Should I include a note about deprecation? And if yes, do you
have a suggestion for the wording?


How about the following attachment?

Tobias
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 41fee67..cae114a 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -164,7 +164,7 @@ and warnings}.
 @item Code Generation Options
 @xref{Code Gen Options,,Options for code generation conventions}.
 @gccoptlist{-fno-automatic  -ff2c  -fno-underscoring @gol
--fwhole-file -fsecond-underscore @gol
+-fno-whole-file -fsecond-underscore @gol
 -fbounds-check -fcheck-array-temporaries  -fmax-array-constructor =@var{n} @gol
 -fcheck=@var{all|array-temps|bounds|do|mem|pointer|recursion} @gol
 -fcoarray=@var{none|single|lib} -fmax-stack-var-size=@var{n} @gol
@@ -1225,12 +1225,13 @@ in the source, even if the names as seen by the linker are mangled to
 prevent accidental linking between procedures with incompatible
 interfaces.
 
-@item -fwhole-file
-@opindex @code{fwhole-file}
-By default, GNU Fortran parses, resolves and translates each procedure
-in a file separately.  Using this option modifies this such that the
-whole file is parsed and placed in a single front-end tree.  During
-resolution, in addition to all the usual checks and fixups, references
+@item -fno-whole-file
+@opindex @code{fno-whole-file}
+This flag causes the compiler to resolve and translate each procedure in
+a file separately. 
+
+By default, the whole file is parsed and placed in a single front-end tree.
+During resolution, in addition to all the usual checks and fixups, references
 to external procedures that are in the same file effect resolution of
 that procedure, if not already done, and a check of the interfaces. The
 dependences are resolved by changing the order in which the file is
@@ -1238,6 +1239,8 @@ translated into the backend tree.  Thus, a procedure that is referenced
 is translated before the reference and the duplication of backend tree
 declarations eliminated.
 
+The @option{-fno-whole-file} option is deprecated and may lead to wrong code.
+
 @item -fsecond-underscore
 @opindex @code{fsecond-underscore}
 @cindex underscore


[committed] More e-mail address fixes in ChangeLogs: dead e-mail address

2011-10-09 Thread Mikael Morin
That address is long dead.
Committed as revision 179728.

Mikael
Index: ChangeLog-2008
===
--- ChangeLog-2008	(revision 179727)
+++ ChangeLog-2008	(revision 179728)
@@ -45,7 +45,7 @@
 	* trans-intrinsic.c (conv_same_strlen_check): New method.
 	(gfc_conv_intrinsic_merge): Call it here to actually do the check.
 
-2008-12-15  Mikael Morin  mikael.mo...@tele2.fr
+2008-12-15  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/38487
 	* dependency.c (gfc_is_data_pointer): New function.
@@ -53,7 +53,7 @@
 	in the pointer case.
 	(gfc_check_dependency): Use gfc_is_data_pointer.
 
-2008-12-15  Mikael Morin  mikael.mo...@tele2.fr
+2008-12-15  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/38113
 	* error.c (show_locus): Start counting columns at 0.
@@ -98,13 +98,13 @@
 	* invoke.texi (idirafter): New.
 	(no-range-check): Fixed entry in option-index.
 
-2008-12-09  Mikael Morin  mikael.mo...@tele2.fr
+2008-12-09  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/37469
 	* expr.c (find_array_element): Simplify array bounds.
 	Assert that both bounds are constant expressions.
 
-2008-12-09  Mikael Morin  mikael.mo...@tele2.fr
+2008-12-09  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/35983
 	* trans-expr.c (gfc_trans_subcomponent_assign):
@@ -158,7 +158,7 @@
 	* trans-types.c (gfc_sym_type,gfc_get_function_type): Support procedure
 	pointers as function result.
 
-2008-12-01  Mikael Morin  mikael.mo...@tele2.fr
+2008-12-01  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/38252
 	* parse.c (parse_spec): Skip statement order check in case
@@ -193,7 +193,7 @@
 	* module.c (gfc_dump_module): Report error on unlink only if
 	errno != ENOENT.
 
-2008-11-25  Mikael Morin  mikael.mo...@tele2.fr
+2008-11-25  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/36463
 	* expr.c (replace_symbol): Don't replace the symtree
@@ -218,7 +218,7 @@
 	* arith.c (gfc_check_real_range): Add mpfr_check_range.
 	* simplify.c (gfc_simplify_nearest): Add mpfr_check_range.
 
-2008-11-24  Mikael Morin  mikael.mo...@tele2.fr
+2008-11-24  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/38184
 	* simplify.c (is_constant_array_expr): Return true instead of false
@@ -308,7 +308,7 @@
 	* module.c (load_equiv): Regression fix; check that equivalence
 	members come from the same module only.
 
-2008-11-16  Mikael Morin mikael.mo...@tele2.fr
+2008-11-16  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/35681
 	* dependency.c (gfc_check_argument_var_dependency): Add
@@ -333,7 +333,7 @@
 	* dependency.h (enum gfc_dep_check): New enum.
 	(gfc_check_fncall_dependency): Update prototype.
 
-2008-11-16  Mikael Morin  mikael.mo...@tele2.fr
+2008-11-16  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/37992
 	* gfortran.h (gfc_namespace): Added member old_cl_list, 
@@ -518,7 +518,7 @@
 	* fortran/check.c (gfc_check_random_seed): Check PUT size
 	at compile time.
 
-2008-10-31  Mikael Morin  mikael.mo...@tele2.fr
+2008-10-31  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/35840
 	* expr.c (gfc_reduce_init_expr): New function, containing checking code
@@ -528,7 +528,7 @@
 	checking that the expression is a constant. 
 	* match.h (gfc_reduce_init_expr): Prototype added. 
 
-2008-10-31  Mikael Morin  mikael.mo...@tele2.fr
+2008-10-31  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/35820
 	* resolve.c (gfc_count_forall_iterators): New function.
@@ -548,7 +548,7 @@
 	gfc_simplify_ifix, gfc_simplify_idint, simplify_nint): Update function
 	calls to include locus.
 
-2008-10-30  Mikael Morin  mikael.mo...@tele2.fr
+2008-10-30  Mikael Morin  mik...@gcc.gnu.org
 
 PR fortran/37903
 * trans-array.c (gfc_trans_create_temp_array): If n is less
@@ -563,7 +563,7 @@
 	possible.  Calculate the translation from loop variables to
 	array indices if an array constructor.
 
-2008-10-30  Mikael Morin  mikael.mo...@tele2.fr
+2008-10-30  Mikael Morin  mik...@gcc.gnu.org
 
 PR fortran/37749
 * trans-array.c (gfc_trans_create_temp_array): If size is NULL
Index: ChangeLog-2009
===
--- ChangeLog-2009	(revision 179727)
+++ ChangeLog-2009	(revision 179728)
@@ -3519,7 +3519,7 @@
 
 	* intrinsic.texi (MALLOC): Make example more portable.
 
-2009-02-13  Mikael Morin  mikael.mo...@tele2.fr
+2009-02-13  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/38259
 	* module.c (gfc_dump_module,gfc_use_module): Add module
@@ -3566,7 +3566,7 @@
 
 	* invoke.texi (RANGE): RANGE also takes INTEGER arguments.
 
-2009-01-19  Mikael Morin  mikael.mo...@tele2.fr
+2009-01-19  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/38859
 	* simplify.c (simplify_bound): Don't use array specification
@@ -3656,7 +3656,7 @@
 	is substituted by a function. 
 	* resolve.c (check_host_association): Return if above is set.
 
-2009-01-04  Mikael Morin  mikael.mo...@tele2.fr
+2009-01-04  Mikael Morin  mik...@gcc.gnu.org
 
 	PR fortran/35681
 	* ChangeLog-2008: Fix function 

[PATCH] RFC: Cache LTO streamer mappings

2011-10-09 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

Currently the LTO file mappings use a simple one-off cache.
This doesn't match the access pattern very well.

This patch adds a real LRU of LTO mappings with a total limit.
Each file is completely mapped now instead of only specific
sections. This addresses one of the FIXME comments in LTO.

The limit is 256GB on 64bit and 256MB on 32bit. The limit
can be temporarily exceeded by a single file. The whole
file has to fit into the address space now. This may increase
the address space requirements a bit.

I originally wrote this in an attempt to minimize fragmentation
of the virtual memory map, but it didn't make too much difference
for that because it was all caused by GGC.

Also on my fairly large builds it didn't make a measurable
compile time difference, probably because it was shadowed
by other much slower passes. That is why I'm just sending it as a
RFC. It certainly complicates the code somewhat.

Maybe if people have other LTO builds they could try if it makes a
difference for them.

Is it still a good idea?

Passes a full LTO bootstrap plus test suite on x86-64-linux.

gcc/lto/:

2011-10-05   Andi Kleen a...@linux.intel.com

* lto.c (list_head, mapped_file): Add.
(page_mask): Rename to page_size.
(MB, GB, max_mapped, cur_mapped, mapped_lru, list_add, list_del): Add.
(mf_lru_enforce_limit, mf_hashtable, mf_lru_finish_cache, mf_eq): Add
(mf_hash, mf_lookup_or_create)
(lto_read_section_data): Split into two ifdef versions.
Implement version using LRU cache. Add more error checks.
(mf_lru_finish_cached): Add dummy in ifdef.
(free_section_data): Rewrite for LRU.
(read_cgraph_and_symbols): Call mf_lru_finish_cache.
---
 gcc/lto/lto.c |  292 ++---
 1 files changed, 238 insertions(+), 54 deletions(-)

diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index a77eeb4..29dc3b8 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -1141,6 +1141,30 @@ lto_file_finalize (struct lto_file_decl_data *file_data, 
lto_file *file)
   lto_free_section_data (file_data, LTO_section_decls, NULL, data, len);
 }
 
+/* A list node or head */
+struct list_head
+  {
+struct list_head *next;  /* Next node */
+struct list_head *prev;  /* Prev node */
+  };
+
+/* Cache of mapped files */
+struct mapped_file
+  {
+struct list_head lru; /* LRU list. Must be first. */
+char *map; /* Mapping of the file */
+size_t size;   /* Size of mapping (rounded up) */
+int refcnt;/* Number of users */
+const char *filename;  /* File name */
+  };
+
+struct lwstate
+{
+  lto_file *file;
+  struct lto_file_decl_data **file_data;
+  int *count;
+};
+
 /* Finalize FILE_DATA in FILE and increase COUNT. */
 
 static int 
@@ -1200,65 +1224,213 @@ lto_file_read (lto_file *file, FILE *resolution_file, 
int *count)
 #endif
 
 #if LTO_MMAP_IO
+
 /* Page size of machine is used for mmap and munmap calls.  */
-static size_t page_mask;
-#endif
+static size_t page_size;
+
+#define MB (1UL << 20)
+#define GB (1UL << 30)
+
+/* Limit of mapped files */
+static unsigned HOST_WIDE_INT max_mapped = sizeof(void *) > 4 ? 256*GB : 256*MB;
+
+/* Total size of currently mapped files */
+static unsigned HOST_WIDE_INT cur_mapped; 
+
+/* LRU of mapped files */
+static struct list_head mapped_lru = { &mapped_lru, &mapped_lru };
+
+/* Add NODE into list HEAD */
+
+static void 
+list_add(struct list_head *node, struct list_head *head)
+{
+  struct list_head *prev = head;
+  struct list_head *next = head->next;
+
+  next->prev = node;
+  node->next = next;
+  node->prev = prev;
+  prev->next = node;
+}
+
+/* Remove NODE from list. */
+
+static void 
+list_del(struct list_head *node)
+{
+  struct list_head *prev = node->prev;
+  struct list_head *next = node->next;
+
+  if (!next && !prev)
+    return;
+  next->prev = prev;
+  prev->next = next;
+  node->next = NULL;
+  node->prev = NULL;
+}
+
+/* Enforce the global LRU limit MAX when the commitment changes by INCREMENT. */
+
+static void
+mf_lru_enforce_limit (unsigned HOST_WIDE_INT increment, unsigned HOST_WIDE_INT max)
+{
+  struct mapped_file *mf;
+  unsigned HOST_WIDE_INT new_mapped = cur_mapped + increment;
+  struct list_head *node, *prev;
+
+  for (node = mapped_lru.prev; new_mapped > max && node != &mapped_lru; node = prev)
+    {
+      prev = node->prev;
+      mf = (struct mapped_file *) node;
+      if (mf->refcnt > 0)
+        continue;
+      munmap (mf->map, mf->size);
+      mf->map = NULL;
+      new_mapped -= mf->size;
+      list_del (node);
+    }
+
+  cur_mapped = new_mapped;
+}
+
+/* Hash table of mapped_files */
+static htab_t mf_hashtable;
+
+/* Free all mappings in the hash table. */
+
+static void
+mf_lru_finish_cache (void)
+{
+  mf_lru_enforce_limit (0, 0);
+  gcc_assert (mapped_lru.next == mapped_lru.prev);
+  htab_delete (mf_hashtable);
+  mf_hashtable = NULL;
+}
+
+/* Compare hash table entries A and B. */
+

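
The eviction policy in the hunks above can be exercised without mmap.
The following is a minimal sketch under stated assumptions: entry_sketch,
entry_map_sketch and enforce_limit_sketch are hypothetical names, a
"mapped" flag stands in for the real mapping, and sizes are plain
numbers. It keeps the two properties the patch relies on: new entries go
to the head of the list, and eviction walks from the tail and skips
entries that are still referenced.

```c
#include <assert.h>
#include <stddef.h>

/* Intrusive doubly linked list node, as in the patch.  */
struct list_head
{
  struct list_head *next;
  struct list_head *prev;
};

/* Hypothetical cache entry; the list node must come first so that a
   node pointer can be cast back to the entry.  */
struct entry_sketch
{
  struct list_head lru;
  size_t size;   /* bytes accounted to this entry */
  int refcnt;    /* number of current users */
  int mapped;    /* stands in for a live mapping */
};

static struct list_head lru_head = { &lru_head, &lru_head };
static size_t cur_mapped;

/* Insert NODE right after HEAD (most recently used position).  */
static void
list_add (struct list_head *node, struct list_head *head)
{
  struct list_head *next = head->next;

  next->prev = node;
  node->next = next;
  node->prev = head;
  head->next = node;
}

/* Unlink NODE from whatever list it is on.  */
static void
list_del (struct list_head *node)
{
  node->prev->next = node->next;
  node->next->prev = node->prev;
  node->next = NULL;
  node->prev = NULL;
}

/* Account for a new mapping and make it the most recently used.  */
static void
entry_map_sketch (struct entry_sketch *e)
{
  e->mapped = 1;
  cur_mapped += e->size;
  list_add (&e->lru, &lru_head);
}

/* Evict from the least recently used end until the total is at most
   MAX, skipping entries that still have users.  */
static void
enforce_limit_sketch (size_t max)
{
  struct list_head *node, *prev;

  for (node = lru_head.prev; cur_mapped > max && node != &lru_head;
       node = prev)
    {
      struct entry_sketch *e = (struct entry_sketch *) node;

      prev = node->prev;
      if (e->refcnt > 0)
        continue;
      e->mapped = 0;
      cur_mapped -= e->size;
      list_del (node);
    }
}
```

Because referenced entries are skipped, the total can stay above the
limit for a while, which matches the remark in the mail that the limit
can be temporarily exceeded.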
Re: [committed] More e-mail address fixes in ChangeLogs: dead e-mail address

2011-10-09 Thread Richard Guenther
On Sun, Oct 9, 2011 at 7:04 PM, Mikael Morin mikael.mo...@sfr.fr wrote:
 That address is long dead.
 Committed as revision 179728.

We usually don't retroactively change ChangeLogs this way.  Please refrain from
making further changes like this.

Thanks,
Richard.

 Mikael



[Patch, Fortran] Fix PR 50564

2011-10-09 Thread Thomas Koenig

Hello world,

the attached patch fixes the PR by removing common function elimination
in FORALL statements.

In the course of fixing this PR, I had originally fixed the ICE only to
find that the transformation (where f is a function)

forall (i=1:2)
  a(i) = f(i) + f(i)
end forall

to

forall (i=1:2)
  tmp = f(i)
  a(i) = tmp + tmp
end forall

did the Wrong Thing.  Oh well...
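
One plausible reading of why the transformed code misbehaves is that
FORALL executes each body statement for all index values before moving
on to the next statement, so a scalar temporary only keeps the value for
the last index. The C sketch below imitates that ordering with two
separate loops. It is an illustration under that assumption, not
compiler output, and f is a hypothetical function chosen so that the
results differ per index.

```c
#include <assert.h>

/* Hypothetical function whose value depends on the index.  */
static int
f (int i)
{
  return 10 * i;
}

/* What the source asks for: a(i) = f(i) + f(i) for every i.  */
static void
forall_reference (int a[2])
{
  for (int i = 1; i <= 2; i++)
    a[i - 1] = f (i) + f (i);
}

/* The naive elimination with a scalar temporary, executed with
   statement-by-statement FORALL ordering.  */
static void
forall_naive_cfe (int a[2])
{
  int tmp = 0;

  for (int i = 1; i <= 2; i++)   /* first statement, all indices */
    tmp = f (i);
  for (int i = 1; i <= 2; i++)   /* second statement, all indices */
    a[i - 1] = tmp + tmp;
}
```

With this f, the reference version stores 20 and 40, while the naive
version stores 40 in both elements.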

Regression-tested. OK for trunk?

Thomas

2011-10-09  Thomas Koenig  tkoe...@gcc.gnu.org

PR fortran/50564
* frontend-passes.c (forall_level):  New variable.
(cfe_register_funcs):  Don't register functions if we
are within a forall loop.
(optimize_namespace):  Set forall_level to 0 before entry.
(gfc_code_walker):  Increase/decrease forall_level.

2011-10-09  Thomas Koenig  tkoe...@gcc.gnu.org

PR fortran/50564
* gfortran.dg/forall_15.f90:  New test case.
Index: frontend-passes.c
===
--- frontend-passes.c	(revision 179709)
+++ frontend-passes.c	(working copy)
@@ -62,6 +62,10 @@ static gfc_code *inserted_block, **changed_stateme
 
 gfc_namespace *current_ns;
 
+/* If we are within any forall loop.  */
+
+static int forall_level;
+
 /* Entry point - run all passes for a namespace.  So far, only an
optimization pass is run.  */
 
@@ -165,6 +169,12 @@ cfe_register_funcs (gfc_expr **e, int *walk_subtre
 	  || (*e)->ts.u.cl->length->expr_type != EXPR_CONSTANT))
 return 0;
 
+  /* We don't do function elimination within FORALL statements, it can
+ lead to wrong-code in certain circumstances.  */
+
+  if (forall_level > 0)
+return 0;
+
   /* If we don't know the shape at compile time, we create an allocatable
  temporary variable to hold the intermediate result, but only if
  allocation on assignment is active.  */
@@ -493,6 +503,7 @@ optimize_namespace (gfc_namespace *ns)
 {
 
   current_ns = ns;
+  forall_level = 0;
 
   gfc_code_walker (&ns->code, convert_do_while, dummy_expr_callback, NULL);
   gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
@@ -1193,6 +1204,7 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t code
 		WALK_SUBEXPR (fa->end);
 		WALK_SUBEXPR (fa->stride);
 		  }
+		forall_level ++;
 		break;
 	  }
 
@@ -1335,6 +1347,10 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t code
 	  WALK_SUBEXPR (b->expr2);
 	  WALK_SUBCODE (b->next);
 	}
+
+	  if (co->op == EXEC_FORALL || co->op == EXEC_DO_CONCURRENT)
+	forall_level --;
+
 	}
 }
   return 0;
! { dg-do run }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 50564 - this used to ICE with front end optimization.
! Original test case by Andrew Benson.
program test
  implicit none
  double precision, dimension(2) :: timeSteps, control
  integer:: iTime
  double precision   :: ratio
  double precision   :: a

  ratio = 0.7d0
  control(1) = ratio**(dble(1)-0.5d0)-ratio**(dble(1)-1.5d0)
  control(2) = ratio**(dble(2)-0.5d0)-ratio**(dble(2)-1.5d0)
  forall(iTime=1:2)
 timeSteps(iTime)=ratio**(dble(iTime)-0.5d0)-ratio**(dble(iTime)-1.5d0)
  end forall
  if (any(abs(timesteps - control) > 1d-10)) call abort

  ! Make sure we still do the front-end optimization after a forall
  a = cos(ratio)*cos(ratio) + sin(ratio)*sin(ratio)
  if (abs(a-1.d0) > 1d-10) call abort
end program test
! { dg-final { scan-tree-dump-times "__builtin_cos" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sin" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }


Avoid double mangling at WHOPR

2011-10-09 Thread Jan Hubicka
Hi,
WHOPR currently produces symbols of the form local_static.1234.43124.  This is
because everything gets mangled at WPA time and then again at ltrans time.  This
patch simply avoids the second mangling.  This saves some space and makes
WHOPR/non-WHOPR symbol tables more directly comparable.
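
To illustrate the idea only: a small C sketch in which a name gets its uniquifying suffix once, and a later stage that already sees a privatized name leaves it alone. The helper privatize_name and its in_ltrans argument are invented for this example and are not the LTO front-end interfaces:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: make a file-local name unique by appending a
   numeric suffix, unless IN_LTRANS says an earlier stage already did.  */
static void
privatize_name (char *buf, size_t buflen, const char *name,
		unsigned id, int in_ltrans)
{
  if (in_ltrans)
    snprintf (buf, buflen, "%s", name);		/* keep the WPA name */
  else
    snprintf (buf, buflen, "%s.%u", name, id);	/* first mangling */
}
```

Running the helper once with in_ltrans clear and once with it set yields a single suffix instead of two.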

Bootstrapped/regtested x86_64-linux, also tested with Mozilla LTO, OK?

Honza

* lto.c (lto_register_var_decl_in_symtab,
lto_register_function_decl_in_symtab): Do not mangle at ltrans time.
* lto-lang.c (lto_set_decl_assembler_name): Likewise.
Index: lto/lto.c
===
--- lto/lto.c   (revision 179664)
+++ lto/lto.c   (working copy)
@@ -604,7 +604,7 @@ lto_register_var_decl_in_symtab (struct
 
   /* Variable has file scope, not local. Need to ensure static variables
  between different files don't clash unexpectedly.  */
-  if (!TREE_PUBLIC (decl)
+  if (!TREE_PUBLIC (decl) && !flag_ltrans
       && !((context = decl_function_context (decl))
 	   && auto_var_in_fn_p (decl, context)))
 {
@@ -646,7 +646,7 @@ lto_register_function_decl_in_symtab (st
 {
   /* Need to ensure static entities between different files
  don't clash unexpectedly.  */
-  if (!TREE_PUBLIC (decl))
+  if (!TREE_PUBLIC (decl) && !flag_ltrans)
 {
   /* We must not use the DECL_ASSEMBLER_NAME macro here, as it
 may set the assembler name where it was previously empty.  */
Index: lto/lto-lang.c
===
--- lto/lto-lang.c  (revision 179664)
+++ lto/lto-lang.c  (working copy)
@@ -954,7 +954,7 @@ lto_set_decl_assembler_name (tree decl)
  TREE_PUBLIC, to avoid conflicts between individual files.  */
   tree id;
 
-  if (TREE_PUBLIC (decl))
+  if (TREE_PUBLIC (decl) || flag_ltrans)
 id = targetm.mangle_decl_assembler_name (decl, DECL_NAME (decl));
   else
 {


Re: [committed] More e-mail address fixes in ChangeLogs: dead e-mail address

2011-10-09 Thread Mikael Morin
On Sunday 09 October 2011 19:30:20 Richard Guenther wrote:
 We usually don't retroactively change ChangeLogs this way.
On the other hand, ChangeLogs usually don't need to be changed.

 Please refrain from making further changes like this.
OK, I will. Is there a reason for such a policy?

Mikael


[patch bfd]: Some adjustments on coff-link.c

2011-10-09 Thread Kai Tietz
Hello,

this patch improves the COFF linker's handling of undefined weak symbols
and avoids writing symbols for discarded sections (if the linker says so)
and for IR-generated sections.

ChangeLog

2011-10-09  Kai Tietz  kti...@redhat.com

* cofflink.c (coff_link_check_ar_symbols): Allow
adding of archive-file if symbol was undefined weak.
(_bfd_coff_write_global_sym): Skip write for symbol
in discarded section, or if section is coming from IR, or
if input section has explicit SEC_EXCLUDE set.
(_bfd_coff_generic_relocate_section): For undefined weak
symbol and replacing it by another undefined weak, mark
section as absolute.

Regression tested for i686-w64-mingw32, x86_64-w64-mingw32, and
i686-pc-cygwin.  OK to apply?

Regards,
Kai


Index: src/bfd/cofflink.c
===
--- src.orig/bfd/cofflink.c
+++ src/bfd/cofflink.c
@@ -242,7 +242,8 @@ coff_link_check_ar_symbols (bfd *abfd,
 COFF linkers do not bring in an object file which defines
 it.  */
  if (h != (struct bfd_link_hash_entry *) NULL
-  && h->type == bfd_link_hash_undefined)
+  && (h->type == bfd_link_hash_undefined
+ || h->type == bfd_link_hash_undefweak))
{
  if (!(*info->callbacks
 ->add_archive_element) (info, abfd, name, &subsbfd))
@@ -2527,6 +2528,7 @@ _bfd_coff_write_global_sym (struct bfd_h
   bfd_size_type symesz;
   unsigned int i;
   file_ptr pos;
+  asection *input_sec;

   output_bfd = finfo->output_bfd;

@@ -2547,6 +2549,21 @@ _bfd_coff_write_global_sym (struct bfd_h
   h-root.root.string, FALSE, FALSE)
  == NULL
 return TRUE;
+  else if (h->indx != -2
+	   && (h->root.type == bfd_link_hash_defined
+	       || h->root.type == bfd_link_hash_defweak)
+	   && ((finfo->info->strip_discarded
+		&& !bfd_is_abs_section (h->root.u.def.section)
+		&& bfd_is_abs_section (h->root.u.def.section->output_section))
+	       || (h->root.u.def.section->owner != NULL
+		   && (h->root.u.def.section->owner->flags & BFD_PLUGIN) != 0)))
+    return TRUE;
+  else if (h->indx != -2
+	   && (h->root.type == bfd_link_hash_undefined
+	       || h->root.type == bfd_link_hash_undefweak)
+	   && h->root.u.undef.abfd != NULL
+	   && (h->root.u.undef.abfd->flags & BFD_PLUGIN) != 0)
+    return TRUE;

   switch (h-root.type)
 {
@@ -2560,26 +2577,37 @@ _bfd_coff_write_global_sym (struct bfd_h
 case bfd_link_hash_undefweak:
   isym.n_scnum = N_UNDEF;
   isym.n_value = 0;
+  input_sec = bfd_und_section_ptr;
   break;

 case bfd_link_hash_defined:
 case bfd_link_hash_defweak:
-  {
-   asection *sec;
+  input_sec = h->root.u.def.section;
+  if (input_sec->output_section != NULL)
+   {
+ asection *sec;

-   sec = h->root.u.def.section->output_section;
-   if (bfd_is_abs_section (sec))
- isym.n_scnum = N_ABS;
-   else
- isym.n_scnum = sec->target_index;
-   isym.n_value = (h->root.u.def.value
-   + h->root.u.def.section->output_offset);
-   if (! obj_pe (finfo->output_bfd))
- isym.n_value += sec->vma;
-  }
+ sec = h->root.u.def.section->output_section;
+ if (bfd_is_abs_section (sec))
+   isym.n_scnum = N_ABS;
+ else
+   isym.n_scnum = sec->target_index;
+ isym.n_value = (h->root.u.def.value
+ + h->root.u.def.section->output_offset);
+ if (! obj_pe (finfo->output_bfd))
+   isym.n_value += sec->vma;
+   }
+  else
+{
+ BFD_ASSERT (input_sec->owner == NULL);
+ isym.n_scnum = N_UNDEF;
+ isym.n_value = 0;
+ input_sec = bfd_und_section_ptr;
+   }
   break;

 case bfd_link_hash_common:
+  input_sec = h->root.u.c.p->section;
   isym.n_scnum = N_UNDEF;
   isym.n_value = h->root.u.c.size;
   break;
@@ -2589,6 +2617,9 @@ _bfd_coff_write_global_sym (struct bfd_h
   return TRUE;
 }

+  if ((input_sec->flags & SEC_EXCLUDE) != 0)
+return TRUE;
+
   if (strlen (h->root.root.string) <= SYMNMLEN)
 strncpy (isym._n._n_name, h->root.root.string, SYMNMLEN);
   else
@@ -3013,7 +3044,8 @@ _bfd_coff_generic_relocate_section (bfd
h->auxbfd->tdata.coff_obj_data->sym_hashes[
h->aux->x_sym.x_tagndx.l];

- if (!h2 || h2->root.type == bfd_link_hash_undefined)
+ if (!h2 || h2->root.type == bfd_link_hash_undefined
+ || h2->root.type == bfd_link_hash_undefweak)
{
  sec = bfd_abs_section_ptr;
  val = 0;


Re: [Patch] Don't ignore testsuite errors in Makefile

2011-10-09 Thread Jakub Jelinek
On Sun, Oct 09, 2011 at 04:32:12PM +0200, Mikael Morin wrote:
 currently, the testsuite return value is ignored by make. It is a little 
 annoying if one wants to check automatically for regressions as we have to 
 parse the testsuite output.
 This patch reverts to the normal make behaviour, which is to not ignore 
 commands' return values. Note: As a result the -k flag has to be added to the 
 make command line if one wants the tests to continue after one failure.
 
 OK for trunk?

Please no.  This is a very bad idea, most of the testsuites on many
architectures contain some FAILs and a failure from check-parallel-% would
mean the *.log/*.sum files would be never merged in that case.

If you really need to propagate the return value (I fail to see how it is
useful), then you should e.g. store the $? value from $(RUNTEST) in
check-parallel-% into some file in that directory and have the
parallelization goal after the merging collect those from the individual
files and or them all together into the final return value.

 2011-10-09  Mikael Morin  mikael.mo...@sfr.fr
 
   * Makefile.in (check-parallel-%): Don't ignore testsuite errors.

 Index: Makefile.in
 ===
 --- Makefile.in   (revision 179710)
 +++ Makefile.in   (working copy)
 @@ -5116,10 +5124,10 @@ $(patsubst %,%-subtargets,$(lang_checks_paralleliz
  # Otherwise check-$lang isn't parallelized and runtest is invoked just with
  # the $(RUNTESTFLAGS) arguments.
  check-parallel-% : site.exp
 - -test -d plugin || mkdir plugin
 - -test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
 + test -d plugin || mkdir plugin
 + test -d $(TESTSUITEDIR) || mkdir $(TESTSUITEDIR)
   test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir 
 $(TESTSUITEDIR)/$(check_p_subdir)
 - -(rootme=`${PWD_COMMAND}`; export rootme; \
 + (rootme=`${PWD_COMMAND}`; export rootme; \
   srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
   cd $(TESTSUITEDIR)/$(check_p_subdir); \
   rm -f tmp-site.exp; \


Jakub


RE: Intrinsics for N2965: Type traits and base classes

2011-10-09 Thread Michael Spertus
Here is a new diff that works for non-class types (fixing Benjamin's failing 
test), fixes some spacing and alphabetization, and doesn't inadvertently break 
the __underlying_type trait.

Index: libstdc++-v3/include/tr2/type_traits
===
--- libstdc++-v3/include/tr2/type_traits(revision 0)
+++ libstdc++-v3/include/tr2/type_traits(revision 0)
@@ -0,0 +1,96 @@
+// TR2 type_traits -*- C++ -*-
+
+// Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
+// Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// http://www.gnu.org/licenses/.
+
+/** @file tr2/type_traits
+ *  This is a TR2 C++ Library header. 
+ */
+
+#ifndef _GLIBCXX_TR2_TYPE_TRAITS
+#define _GLIBCXX_TR2_TYPE_TRAITS 1
+
+#pragma GCC system_header
+#include <type_traits>
+#include <bits/c++config.h>
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+namespace tr2
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  /**
+   * @defgroup metaprogramming Type Traits
+   * @ingroup utilities
+   *
+   * Compile time type transformation and information.
+   * @{
+   */
+
+  template<typename... _Elements> struct typelist;
+  template<>
+struct typelist<>
+{
+  typedef std::true_type empty;
+};
+
+  template<typename _First, typename... _Rest>
+struct typelist<_First, _Rest...>
+{
+  struct first
+  {
+typedef _First type;
+  };
+
+  struct rest
+  {
+typedef typelist<_Rest...> type;
+  };
+
+  typedef std::false_type empty;
+};
+
+  // Sequence abstraction metafunctions default to looking in the type
+  template<typename T> struct first : public T::first {};
+  template<typename T> struct rest : public T::rest {};
+  template<typename T> struct empty : public T::empty {};
+
+
+  template<typename T>
+struct bases
+{
+ typedef typelist<__bases(T)...> type;
+};
+
+  template<typename T>
+struct direct_bases
+{
+  typedef typelist<__direct_bases(T)...> type;
+};
+   
+_GLIBCXX_END_NAMESPACE_VERSION
+}
+}
+
+#endif // _GLIBCXX_TR2_TYPE_TRAITS
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 178892)
+++ gcc/c-family/c-common.c (working copy)
@@ -423,6 +423,7 @@
   { "__asm__", RID_ASM,0 },
   { "__attribute", RID_ATTRIBUTE,  0 },
   { "__attribute__",   RID_ATTRIBUTE,  0 },
+  { "__bases",  RID_BASES, D_CXXONLY },
   { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
   { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
   { "__builtin_offsetof", RID_OFFSETOF, 0 },
@@ -433,6 +434,7 @@
   { "__const", RID_CONST,  0 },
   { "__const__",   RID_CONST,  0 },
   { "__decltype",   RID_DECLTYPE,   D_CXXONLY },
+  { "__direct_bases",   RID_DIRECT_BASES, D_CXXONLY },
   { "__extension__",   RID_EXTENSION,  0 },
   { "__func__",RID_C99_FUNCTION_NAME, 0 },
   { "__has_nothrow_assign", RID_HAS_NOTHROW_ASSIGN, D_CXXONLY },
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h (revision 178892)
+++ gcc/c-family/c-common.h (working copy)
@@ -129,12 +129,13 @@
   RID_CONSTCAST, RID_DYNCAST, RID_REINTCAST, RID_STATCAST,
 
   /* C++ extensions */
+  RID_BASES,  RID_DIRECT_BASES,
   RID_HAS_NOTHROW_ASSIGN,  RID_HAS_NOTHROW_CONSTRUCTOR,
   RID_HAS_NOTHROW_COPY,RID_HAS_TRIVIAL_ASSIGN,
   RID_HAS_TRIVIAL_CONSTRUCTOR, RID_HAS_TRIVIAL_COPY,
   RID_HAS_TRIVIAL_DESTRUCTOR,  RID_HAS_VIRTUAL_DESTRUCTOR,
   RID_IS_ABSTRACT, RID_IS_BASE_OF,
-  RID_IS_CONVERTIBLE_TO,   RID_IS_CLASS,
+  RID_IS_CLASS,RID_IS_CONVERTIBLE_TO,  
   RID_IS_EMPTY,RID_IS_ENUM,
   RID_IS_LITERAL_TYPE, RID_IS_POD,
   RID_IS_POLYMORPHIC,  RID_IS_STD_LAYOUT,
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c (revision 178892)
+++ gcc/cp/pt.c (working copy)
@@ -2976,6 

Re: Intrinsics for N2965: Type traits and base classes

2011-10-09 Thread Jason Merrill

On 10/09/2011 08:13 PM, Michael Spertus wrote:

+dfs_calculate_bases_pre (tree binfo, void *data_)
+{
+  (void)data_;


You can use ATTRIBUTE_UNUSED to mark an unused parameter.
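
For illustration, a stand-alone C sketch of the pattern. ATTRIBUTE_UNUSED itself comes from GCC's ansidecl.h; the macro MY_ATTRIBUTE_UNUSED and the callback count_visit below are local stand-ins invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for GCC's ATTRIBUTE_UNUSED; the name is made up
   so the sketch builds outside the GCC tree.  */
#if defined(__GNUC__)
# define MY_ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#else
# define MY_ATTRIBUTE_UNUSED
#endif

/* Hypothetical walker callback that has no use for its data argument.
   Marking the parameter replaces the "(void) data;" statement.  */
static int
count_visit (int *counter, void *data MY_ATTRIBUTE_UNUSED)
{
  return ++*counter;
}
```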

I'd still like to see some testcases for the intrinsic, independent of 
the library.


Jason


Re: [Patch, Fortran] PR 50547 / cleanup in resolve_formal_arglist

2011-10-09 Thread Tobias Burnus

On 02.10.2011 01:43, Janus Weil wrote:

Hi all,

while working on PR50547, I noticed some strange things about
resolve_formal_arglist, so I decided to clean it up a little. The
attached patch does a couple of things:

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?


OK. Thanks for the cleanup.

Tobias


2011-10-02  Janus Weil  ja...@gcc.gnu.org

PR fortran/50547
* resolve.c (resolve_formal_arglist): Remove unneeded error message.
Some reshuffling.


2011-10-02  Janus Weil  ja...@gcc.gnu.org

PR fortran/50547
* gfortran.dg/elemental_args_check_4.f90: New.




[PATCH, testsuite]: Remove *.gdb files from testsuite dir

2011-10-09 Thread Uros Bizjak
Hello!

Attached patch removes *.gdb temporary files from testsuite directory.

2011-10-09  Uros Bizjak  ubiz...@gmail.com

* lib/gcc-gdb-test.exp (gdb-test): Delete $cmd_file before return.

Tested on x86_64-pc-linux-gnu {,-m32}. OK for mainline and branches?

Uros.
Index: gcc-gdb-test.exp
===
--- gcc-gdb-test.exp(revision 179718)
+++ gcc-gdb-test.exp(working copy)
@@ -56,6 +56,7 @@
     set res [remote_spawn target "$gdb_name -nx -nw -quiet -x $cmd_file ./$output_file"]
     if { $res < 0 || $res == "" } {
unsupported $testname
+   file delete $cmd_file
return
 }
 
@@ -64,6 +65,7 @@
	-re "Unhandled dwarf expression|Error in sourced command file" {
unsupported $testname
remote_close target
+   file delete $cmd_file
return
}
-re {[\n\r]\$1 = ([^\n\r]*)[\n\r]+\$2 = ([^\n\r]*)[\n\r]} {
@@ -76,16 +78,19 @@
fail $testname
}
remote_close target
+   file delete $cmd_file
return
}
timeout {
unsupported $testname
remote_close target
+   file delete $cmd_file
return
}
 }
 
+unsupported $testname
 remote_close target
-unsupported $testname
+file delete $cmd_file
 return
 }


Re: [PATCH, testsuite, i386] FMA3 testcases + typo fix in MD

2011-10-09 Thread Kirill Yukhin
Hi guys,
This is a ping. Could anybody with the appropriate rights commit that?

Thanks, K

On Thu, Oct 6, 2011 at 11:46 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Thu, Oct 6, 2011 at 3:48 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

 BTW, don't you also need -mfpmath=sse in dg-options?


 According to doc/invoke.texi
 ...
 @itemx -mfma
 ...
 These options will enable GCC to use these extended instructions in
 generated code, even without @option{-mfpmath=sse}.

 It seems -mfpmath=sse is unnecessary.
 Although, if this is wrong, we probably have to update doc as well.

 Well, OK then.

 Uros.



Improve ggc-page fragmentation

2011-10-09 Thread Andi Kleen
I ran into problems with virtual memory fragmentation in ggc-page on 
large LTO builds. The memory was so fragmented that builds
failed because the compiler would use more than the 64k mappings
Linux allows each process by default. 

For more details see PR 50636

This patchkit includes various improvements to the fragmentation
behaviour plus some optimizations to increase the use of 2MB
pages on modern Linux kernels. This fixes the fragmentation
problem for me and increases the use of huge pages significantly.

My simple benchmarks didn't show a lot of performance improvement
though.

On non-Linux kernels the fragmentation problem will still be
somewhat visible (the best fix is using the Linux-specific
MADV_DONTNEED), but the new threshold should still improve things
there.

All passed bootstrap and test suite run on x86-64.

-Andi



[PATCH 3/5] On a Linux kernel ask explicitly for a huge page in ggc

2011-10-09 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

Benchmarks show slightly faster build times on a kernel
build, near the measurement error unfortunately.

This will only work with a recent glibc that defines MADV_HUGEPAGE.
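
A minimal stand-alone sketch of the same hint outside ggc-page.c, assuming a POSIX system with mmap and madvise. The helper name map_with_hugepage_hint is invented here; the #ifdef mirrors the point above that MADV_HUGEPAGE is only available when the headers define it:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map BYTES of anonymous memory and, where the headers provide
   MADV_HUGEPAGE, hint that huge pages are welcome.  Returns NULL if
   the mapping itself fails; a failed hint is deliberately ignored.  */
static void *
map_with_hugepage_hint (size_t bytes)
{
  void *p = mmap (NULL, bytes, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
#ifdef MADV_HUGEPAGE
  (void) madvise (p, bytes, MADV_HUGEPAGE);
#endif
  return p;
}
```

The call is only advice, so the mapping stays usable whether or not the kernel honours it.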

2011-10-08   Andi Kleen a...@linux.intel.com

* ggc-page.c (alloc_page): Add madvise for hugepage
---
 gcc/ggc-page.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 1f52b56..6e08cda 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -779,6 +779,11 @@ alloc_page (unsigned order)
 
   page = alloc_anon (NULL, G.pagesize * GGC_QUIRE_SIZE);
 
+#if defined(HAVE_MADVISE) && defined(MADV_HUGEPAGE)
+  /* Kernel, I would like to have hugepages, please. */
+  madvise(page, G.pagesize * GGC_QUIRE_SIZE, MADV_HUGEPAGE);
+#endif
+
   /* This loop counts down so that the chain will be in ascending
 memory order.  */
   for (i = GGC_QUIRE_SIZE - 1; i >= 1; i--)
-- 
1.7.5.4



[PATCH 2/5] Increase the GGC quire size to 2MB

2011-10-09 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

Using 2MB allows modern kernels to use 2MB huge pages on x86.
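
The arithmetic behind the new value, as a tiny C sketch. The helper quire_bytes is made up for this example, and the 4K page size is the assumption stated in the patch comment:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by one quire of QUIRE_PAGES pages of PAGESIZE bytes.  */
static size_t
quire_bytes (size_t pagesize, size_t quire_pages)
{
  return pagesize * quire_pages;
}
```

With 4096-byte pages, 512 pages cover 2MB, while the previous value of 256 covers 1MB.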

gcc/:

2011-10-08   Andi Kleen a...@linux.intel.com

* ggc-page.c (GGC_QUIRE_SIZE): Increase to 512
---
 gcc/ggc-page.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index b0b3b3f..1f52b56 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -469,7 +469,7 @@ static struct globals
can override this by defining GGC_QUIRE_SIZE explicitly.  */
 #ifndef GGC_QUIRE_SIZE
 # ifdef USING_MMAP
-#  define GGC_QUIRE_SIZE 256
+#  define GGC_QUIRE_SIZE 512   /* 2MB for 4K pages */
 # else
 #  define GGC_QUIRE_SIZE 16
 # endif
-- 
1.7.5.4



[PATCH 5/5] Add error checking to lto_section_read

2011-10-09 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

Various callers of lto_read_section_data segfault on a NULL return
when the mmap fails. Add some internal_errors to give a better
message to the user.
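
A stand-alone sketch of the same idea, assuming a POSIX system: report which file failed at the point of failure instead of handing a bare NULL or error code to callers. The helper read_prefix and its error buffer are invented for this example; it does not use GCC's internal_error:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical reader: returns 0 on success; on failure returns -1
   and writes a message naming the file into ERR.  */
static int
read_prefix (const char *path, char *buf, size_t len,
	     char *err, size_t errlen)
{
  int fd = open (path, O_RDONLY);
  if (fd == -1)
    {
      snprintf (err, errlen, "Cannot open %s: %s", path, strerror (errno));
      return -1;
    }
  ssize_t got = read (fd, buf, len);
  close (fd);
  if (got != (ssize_t) len)
    {
      /* A short read does not set errno, so only name the file.  */
      snprintf (err, errlen, "Cannot read %s", path);
      return -1;
    }
  return 0;
}
```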

gcc/lto/:

2011-10-09   Andi Kleen a...@linux.intel.com

* lto.c (lto_read_section_data): Call internal_error on IO or mmap errors.
---
 gcc/lto/lto.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index a77eeb4..dc16db4 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -1237,7 +1237,10 @@ lto_read_section_data (struct lto_file_decl_data *file_data,
 {
   fd = open (file_data->file_name, O_RDONLY|O_BINARY);
   if (fd == -1)
-   return NULL;
+{
+ internal_error ("Cannot open %s", file_data->file_name);
+ return NULL;
+}
   fd_name = xstrdup (file_data-file_name);
 }
 
@@ -1255,7 +1258,10 @@ lto_read_section_data (struct lto_file_decl_data *file_data,
   result = (char *) mmap (NULL, computed_len, PROT_READ, MAP_PRIVATE,
  fd, computed_offset);
   if (result == MAP_FAILED)
-return NULL;
+{
+  internal_error ("Cannot map %s", file_data->file_name);
+  return NULL;
+}
 
   return result + diff;
 #else
@@ -1264,6 +1270,7 @@ lto_read_section_data (struct lto_file_decl_data *file_data,
   || read (fd, result, len) != (ssize_t) len)
 {
   free (result);
+  internal_error ("Cannot read %s", file_data->file_name);
   result = NULL;
 }
 #ifdef __MINGW32__
-- 
1.7.5.4



[PATCH 4/5] Add a freeing threshold for the garbage collector.

2011-10-09 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

Add a threshold to avoid freeing pages back to the OS too early.
This avoids virtual memory map fragmentation.

Based on an idea from Honza
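
The decision can be sketched as a stand-alone predicate. The function worth_releasing and its parameter names are invented for this example; the two conditions follow the ggc_collect hunk further down, and the defaults used in the note below (20 percent, 8MB) are the ones from the params.def hunk:

```c
#include <assert.h>
#include <stddef.h>

/* Return nonzero if the unused part of the mapping is at least
   THRESHOLD_PCT percent of what is mapped and at least MIN_KB
   kilobytes, i.e. releasing pages is worth the fragmentation.  */
static int
worth_releasing (size_t bytes_mapped, size_t bytes_allocated,
		 unsigned threshold_pct, size_t min_kb)
{
  if (bytes_allocated > bytes_mapped)
    return 0;
  size_t unused = bytes_mapped - bytes_allocated;
  return unused >= (threshold_pct / 100.0) * bytes_mapped
	 && unused >= min_kb * 1024;
}
```

With the defaults, 10MB unused out of 100MB mapped is kept, 30MB unused out of 100MB is released, and 5MB unused out of 10MB is kept because it is below the 8MB minimum.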

gcc/doc/:

2011-10-08   Andi Kleen a...@linux.intel.com

PR other/50636
* invoke.texi (ggc-free-threshold, ggc-free-min): Add.

gcc/:

2011-10-08   Andi Kleen a...@linux.intel.com

PR other/50636
* ggc-page.c (ggc_collect): Add free threshold.
* params.def (GGC_FREE_THRESHOLD, GGC_FREE_MIN): Add.
---
 gcc/doc/invoke.texi |   11 +++
 gcc/ggc-page.c  |   13 +
 gcc/params.def  |   10 ++
 3 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ef7ac68..6557f66 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8837,6 +8837,17 @@ very large effectively disables garbage collection.  
Setting this
 parameter and @option{ggc-min-expand} to zero causes a full collection
 to occur at every opportunity.
 
+@item ggc-free-threshold
+
+Only free memory back to the system when it would free more than this
+many percent of the total allocated memory. Default is 20 percent.
+This avoids memory fragmentation.
+
+@item ggc-free-min
+
+Only free memory back to the system when it would free more than this.
+Unit is kilobytes. 
+
 @item max-reload-search-insns
 The maximum number of instruction reload should look backward for equivalent
 register.  Increasing values mean more aggressive optimization, making the
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 6e08cda..cd1c41a 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -1968,14 +1968,19 @@ ggc_collect (void)
   if (GGC_DEBUG_LEVEL >= 2)
 fprintf (G.debug_file, "BEGIN COLLECTING\n");
 
+  /* Release the pages we freed the last time we collected, but didn't
+ reuse in the interim.  But only do this if this would free a 
+ reasonable number of pages. Otherwise hold on to them
+ to avoid virtual memory fragmentation. */
+  if (G.bytes_mapped - G.allocated >= 
+   (PARAM_VALUE (GGC_FREE_THRESHOLD) / 100.0) * G.bytes_mapped &&
+  G.bytes_mapped - G.allocated >= (size_t)PARAM_VALUE (GGC_FREE_MIN) * 1024)
+release_pages ();
+
   /* Zero the total allocated bytes.  This will be recalculated in the
  sweep phase.  */
   G.allocated = 0;
 
-  /* Release the pages we freed the last time we collected, but didn't
- reuse in the interim.  */
-  release_pages ();
-
   /* Indicate that we've seen collections at this context depth.  */
   G.context_depth_collections = ((unsigned long)1 << (G.context_depth + 1)) - 1;
 
diff --git a/gcc/params.def b/gcc/params.def
index 5e49c48..ca28715 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -561,6 +561,16 @@ DEFPARAM(GGC_MIN_HEAPSIZE,
 #undef GGC_MIN_EXPAND_DEFAULT
 #undef GGC_MIN_HEAPSIZE_DEFAULT
 
+DEFPARAM(GGC_FREE_THRESHOLD,
+   "ggc-free-threshold",
+   "Dont free memory back to system less this percent of the total memory",
+   20, 0, 100)
+
+DEFPARAM(GGC_FREE_MIN,
+"ggc-free-min",
+"Dont free less memory than this back to the system, in kilobytes",
+8 * 1024, 0, 0)
+
 DEFPARAM(PARAM_MAX_RELOAD_SEARCH_INSNS,
 "max-reload-search-insns",
 "The maximum number of instructions to search backward when looking for equivalent reload",
-- 
1.7.5.4



[PATCH 1/5] Use MADV_DONTNEED for freeing in garbage collector

2011-10-09 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

Use the Linux MADV_DONTNEED call to unmap free pages in the garbage
collector. Then keep the unmapped pages in the free list. This avoids
excessive memory fragmentation on large LTO builds, which can lead
to gcc bumping into the Linux vm.max_map_count limit per process.

Based on an idea from Jakub.
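
A small stand-alone sketch of the mechanism, assuming a system with mmap and MADV_DONTNEED. The helper reuse_after_dontneed is invented for this example; it only demonstrates that the address range stays mapped and can be touched again after the advice, which is the property the free list relies on:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Returns 0 if a region handed back with MADV_DONTNEED can still be
   written and read afterwards, -1 if mmap or madvise fails.  */
static int
reuse_after_dontneed (size_t len)
{
  char *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  p[0] = 1;			/* make the first page resident */
  if (madvise (p, len, MADV_DONTNEED) != 0)
    {
      munmap (p, len);
      return -1;
    }
  p[0] = 42;			/* the mapping is still there; just touch it */
  int ok = (p[0] == 42);
  munmap (p, len);
  return ok ? 0 : -1;
}
```

Because no munmap happens between uses, the number of entries in the process's memory map does not grow with each release.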

gcc/:

2011-10-08   Andi Kleen a...@linux.intel.com

PR other/50636
* config.in, configure: Regenerate.
* configure.ac (madvise): Add to AC_CHECK_FUNCS.
* ggc-page.c (USING_MADVISE): Add.
(page_entry): Add unmapped field.
(alloc_page): Check for unmapped pages.
(release_pages): Add USING_MADVISE branch.
---
 gcc/config.in|6 ++
 gcc/configure|2 +-
 gcc/configure.ac |2 +-
 gcc/ggc-page.c   |   48 +++-
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index f2847d8..e8148b6 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1276,6 +1276,12 @@
 #endif
 
 
+/* Define to 1 if you have the `madvise' function. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_MADVISE
+#endif
+
+
 /* Define to 1 if you have the malloc.h header file. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_MALLOC_H
diff --git a/gcc/configure b/gcc/configure
index cb55dda..4a54adf 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -9001,7 +9001,7 @@ fi
 for ac_func in times clock kill getrlimit setrlimit atoll atoq \
sysconf strsignal getrusage nl_langinfo \
gettimeofday mbstowcs wcswidth mmap setlocale \
-   clearerr_unlocked feof_unlocked   ferror_unlocked fflush_unlocked 
fgetc_unlocked fgets_unlocked   fileno_unlocked fprintf_unlocked fputc_unlocked 
fputs_unlocked   fread_unlocked fwrite_unlocked getchar_unlocked getc_unlocked  
 putchar_unlocked putc_unlocked
+   clearerr_unlocked feof_unlocked   ferror_unlocked fflush_unlocked 
fgetc_unlocked fgets_unlocked   fileno_unlocked fprintf_unlocked fputc_unlocked 
fputs_unlocked   fread_unlocked fwrite_unlocked getchar_unlocked getc_unlocked  
 putchar_unlocked putc_unlocked madvise
 do :
   as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh`
 ac_fn_c_check_func $LINENO $ac_func $as_ac_var
diff --git a/gcc/configure.ac b/gcc/configure.ac
index a7b94e6..357902e 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1027,7 +1027,7 @@ define(gcc_UNLOCKED_FUNCS, clearerr_unlocked 
feof_unlocked dnl
 AC_CHECK_FUNCS(times clock kill getrlimit setrlimit atoll atoq \
sysconf strsignal getrusage nl_langinfo \
gettimeofday mbstowcs wcswidth mmap setlocale \
-   gcc_UNLOCKED_FUNCS)
+   gcc_UNLOCKED_FUNCS madvise)
 
 if test x$ac_cv_func_mbstowcs = xyes; then
   AC_CACHE_CHECK(whether mbstowcs works, gcc_cv_func_mbstowcs_works,
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 624f029..b0b3b3f 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -50,6 +50,10 @@ along with GCC; see the file COPYING3.  If not see
 #define USING_MALLOC_PAGE_GROUPS
 #endif
 
+#if defined(HAVE_MADVISE) && defined(MADV_DONTNEED)
+# define USING_MADVISE
+#endif
+
 /* Strategy:
 
This garbage-collecting allocator allocates objects on one of a set
@@ -277,6 +281,9 @@ typedef struct page_entry
   /* The lg of size of objects allocated from this page.  */
   unsigned char order;
 
+  /* Unmapped page? */
+  bool unmapped;
+
   /* A bit vector indicating whether or not objects are in use.  The
  Nth bit is one if the Nth object on this page is allocated.  This
  array is dynamically sized.  */
@@ -740,6 +747,10 @@ alloc_page (unsigned order)
 
   if (p != NULL)
 {
+  if (p->unmapped)
+G.bytes_mapped += p->bytes;
+  p->unmapped = false;
+
   /* Recycle the allocated memory from this page ...  */
   *pp = p->next;
   page = p->page;
@@ -956,7 +967,42 @@ free_page (page_entry *entry)
 static void
 release_pages (void)
 {
-#ifdef USING_MMAP
+#ifdef USING_MADVISE
+  page_entry *p, *start_p;
+  char *start;
+  size_t len;
+
+  for (p = G.free_pages; p; )
+{
+  if (p->unmapped)
+{
+  p = p->next;
+  continue;
+}
+  start = p->page;
+  len = p->bytes;
+  start_p = p;
+  p = p->next;
+  while (p && p->page == start + len)
+{
+  len += p->bytes;
+  p = p->next;
+}
+  /* Give the page back to the kernel, but don't free the mapping.
+ This avoids fragmentation in the virtual memory map of the 
+process. Next time we can reuse it by just touching it. */
+  madvise (start, len, MADV_DONTNEED);
+  /* Don't count those pages as mapped to not touch the garbage collector
+ unnecessarily. */
+  G.bytes_mapped -= len;
+  while (start_p != p)
+{
+  start_p->unmapped = true;
+  start_p = start_p->next;
+}
+}
+#endif
+#if defined(USING_MMAP) && !defined(USING_MADVISE)
   page_entry *p, *next;
   char *start;
   size_t len;
-- 
1.7.5.4



Re: [Patch] Don't ignore testsuite errors in Makefile

2011-10-09 Thread Mikael Morin
On Sunday 09 October 2011 21:12:12 Jakub Jelinek wrote:
 On Sun, Oct 09, 2011 at 04:32:12PM +0200, Mikael Morin wrote:
  currently, the testsuite return value is ignored by make. It is a little
  annoying if one wants to check automatically for regressions as we have
  to parse the testsuite output.
  This patch reverts to the normal make behaviour, which is to not ignore
  commands' return values. Note: As a result the -k flag has to be added to
  the make command line if one wants the tests to continue after one
  failure.
  
  OK for trunk?
 
 Please no.  This is a very bad idea, most of the testsuites on many
 architectures contain some FAILs and a failure from check-parallel-% would
 mean the *.log/*.sum files would be never merged in that case.
 
 If you really need to propagate the return value (I fail to see how it is
 useful), then you should e.g. store the $? value from $(RUNTEST) in
 check-parallel-% into some file in that directory and have the
 parallelization goal after the merging collect those from the individual
 files and or them all together into the final return value.
Thanks for the tips. I will just keep the patch locally for now.
I don't use parallel testing anyway.

Mikael


Re: [C++ Patch] PR 38980

2011-10-09 Thread Jason Merrill

OK.

Jason


Re: [C++ Patch] PR 50660

2011-10-09 Thread Jason Merrill
Hmm, I guess it's unlikely that a conversion is going to hit both that 
warning and another one.  OK.


Jason


Re: [C++ Patch] PR 50660

2011-10-09 Thread Jason Merrill

On 10/09/2011 11:40 PM, Jason Merrill wrote:

Hmm, I guess it's unlikely that a conversion is going to hit both that
warning and another one. OK.


Wait...how about changing conversion_null_warnings to stop looking 
through references?  Does that break anything?


Jason



Re: [C++ Patch] PR 50660

2011-10-09 Thread Paolo Carlini

On 10/10/2011 12:41 AM, Jason Merrill wrote:

On 10/09/2011 11:40 PM, Jason Merrill wrote:

Hmm, I guess it's unlikely that a conversion is going to hit both that
warning and another one. OK.
Wait...how about changing conversion_null_warnings to stop looking 
through references?  Does that break anything?

Let me check...

Paolo.


Re: [wwwdocs] Re: [2/2] tree-ssa-strlen optimization pass

2011-10-09 Thread Gerald Pfeifer
Hi Jakub,

this is a minor update on top of yours that I just applied.  Thanks
for taking the time to write this up.

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.46
diff -u -r1.46 changes.html
--- changes.html4 Oct 2011 19:07:01 -   1.46
+++ changes.html9 Oct 2011 23:05:47 -
@@ -125,13 +125,13 @@
   growth.</li>
 </ul></li>
 
-<li>String length optimization pass has been added.  This pass attempts
+<li>A string length optimization pass has been added.  It attempts
   to track string lengths and optimize various standard C string functions
   like <code>strlen</code>, <code>strchr</code>, <code>strcpy</code>,
   <code>strcat</code>, <code>stpcpy</code> and their
   <code>_FORTIFY_SOURCE</code> counterparts into faster alternatives.
   This pass is enabled by default at <code>-O2</code> or above, unless
-  optimizing for size, and can be disabled by
+  optimizing for size, and can be disabled by the
   <code>-fno-optimize-strlen</code> option.  The pass can e.g. optimize
   <pre>
 char *bar (const char *a)


Re: [C++-11] User defined literals

2011-10-09 Thread Ed Smith-Rowland

On 10/08/2011 07:15 PM, Jason Merrill wrote:

On 10/08/2011 07:25 PM, Ed Smith-Rowland wrote:

Also, in spite of the documentation, cp_parser_template_parameter_list
returns a TREE_VEC not a TREE_LIST. This happens inside
end_template_parm_list called inside the former. So parameter_list is a
TREE_VEC, parm_list is a TREE_LIST, parm is a PARM_DECL, etc.


Ah, I was thinking of template arguments rather than parameters.  
You're right, except that INNERMOST_TEMPLATE_PARMS should be just 
TREE_VALUE; you are already starting from the innermost parm list if 
you use what end_template_parm_list returns.


Though it occurs to me that push_template_decl_real might be a better 
place for this check.



I'm still looking for a fix for duplicate errors/warnings coming from
cp_parser_operator. I tried cp_parser_error and lost the errors. I'll
look for different code paths for the two invocations and see if I can
either move something up or see if something is set differently between
the two that would be useful for a flag.


One approach would be changing the token stream after the first error 
to something that won't produce another error, e.g. changing
token->u.value to be an empty string after you complain about it being
non-empty.

Interesting.  That error is the one of the three that does *not* repeat.
One idea: the first error, about the non-empty string, is followed by a
consume_token (for the string).
Does cp_parser_identifier (parser) *not* consume the identifier token?
Is that token left on the stream for the second pass?

I'll try it and get back.


Jason





Re: [4/4] Make SMS schedule register moves

2011-10-09 Thread Ayal Zaks
On Wed, Sep 28, 2011 at 4:49 PM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
 Ayal Zaks <ayal.z...@gmail.com> writes:
  +  /* The cyclic lifetime of move->new_reg starts and ends at move->def
  +     (the instruction that defines move->old_reg).
 
  So instruction I_REG_MOVE (new_reg=reg) must be scheduled before the
  next I_MUST_FOLLOW move/original-def (due to anti-dependence: it
  overwrites reg), but after the previous instance of I_MUST_FOLLOW (due
  to true dependence; i.e. account for latency also). Why do moves,
  except for the one closest to move->def (which is directly dependent
  upon it, i.e. for whom move->def == I_MUST_FOLLOW), have to worry
  about move->def at all? (Or have their cyclic lifetimes start and end
  there?)

 Because the uses of new_reg belong to the same move->def based cycle.


 the cycle (overloaded term; rather iteration in this context) to
 which the uses belong, is inferred from the cycle (absolute schedule
 time) in which they are scheduled, regardless of move->def.

 Just to prove your point about cycle being an overloaded term: I wasn't
 actually meaning it in the sense of (loop) iteration.  I meant a circular
 window based on move->def.


Point proven ;-)

 So (I think this is the uncontroversial bit): [M1] must be scheduled
 cyclically before [B] and cyclically after [C], with the cycle
 based at [B]:

    row 3 after [B]:  empty
    row 4:            [C]
    row 5:            [D]
    row 0:            empty
    row 1:            empty
    row 2:            [A]
    row 3 before [B]: empty

 [M1] could therefore go in row 1.  This part is OK.


 Here's how I see it:
 [M1] feeds [C] which is scheduled at cycle 10, so it must be scheduled
 before cycle 10-M_latency and after cycle 10-ii. [M1] uses the result
 of [B] which is scheduled at cycle 3, so must be scheduled after cycle
 3+B_latency and before cycle 3+ii. Taking all latencies to be 1 and
 ii=6, this yields a scheduling window of cycles [4,9]\cap[4,9]=[4,9];
 if scheduled at cycle 4 it must_follow [C], if scheduled at cycle 9 it
 must_precede [B]. This is identical to the logic behind the
 sched_window of any instruction, based on its dependencies (as you've
 updated not too long ago..), if we do not allow reg_moves (and
 arguably, one should not allow reg_moves when scheduling
 reg_moves...).

 To address the potential erroneous scenario of Loop 2, suppose [A] is
 scheduled as in the beginning in cycle 20, and that [M1] is scheduled
 in cycle 7 (\in[4,9]). Then
 [M2] feeds [D] and [A] which are scheduled at cycles 17 and 20, so it
 must be scheduled before cycle 17-1 and after cycle 20-6. [M2] uses
 the result of [M1], so must be scheduled after cycle 7+1 and before
 cycle 7+6. This yields the desired [14,16]\cap[8,13]=\emptyset.
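
 The window arithmetic in the two paragraphs above can be checked with a small sketch. It is only an illustration in Python, not code from the SMS implementation: the function names are hypothetical, bounds are treated as inclusive to match the [lo,hi] notation used here, and all latencies are 1 with ii=6 as assumed in the text.

```python
def window_from_def(def_cycle, latency, ii):
    """Cycles allowed by the true dependence on the producer of the value."""
    return (def_cycle + latency, def_cycle + ii)


def window_from_uses(use_cycles, latency, ii):
    """Cycles allowed by the consumers that the move feeds."""
    # After the latest use minus ii, before the earliest use minus the latency.
    return (max(use_cycles) - ii, min(use_cycles) - latency)


def intersect(a, b):
    """Intersect two inclusive windows; None stands for the empty set."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


II, LATENCY = 6, 1

# [M1] feeds [C] (cycle 10) and uses the result of [B] (cycle 3).
m1_window = intersect(window_from_uses([10], LATENCY, II),
                      window_from_def(3, LATENCY, II))

# [M2] feeds [D] (cycle 17) and [A] (cycle 20) and uses [M1] (cycle 7).
m2_window = intersect(window_from_uses([17, 20], LATENCY, II),
                      window_from_def(7, LATENCY, II))
```

 With these inputs m1_window is (4, 9) and m2_window is None, matching [4,9] and the empty intersection of [14,16] with [8,13].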

 I agree it's natural to schedule moves for intra-iteration dependencies
 in the normal get_sched_window way.  But suppose we have a dependency:

   A --(T,N,1)--> B

 that requires two moves M1 and M2.  If we think in terms of cycles
 (in the SCHED_TIME sense), then this effectively becomes:

   A --(T,N1,1)--> M1 --(T,N2,0)--> M2 --(T,N3,0)--> B

 because it is now M1 that is fed by both the loop and the incoming edge.
 But if there is a second dependency:

   A --(T,M,0)--> C

 that also requires two moves, we end up with:

   A --(T,N1,1)--> M1 --(T,N2,0)--> M2 --(T,N3,0)--> B
                                       --(T,M3,-1)--> B

 and dependence distances of -1 feel a bit weird. :-)  Of course,
 what we really have are two parallel dependencies:

   A --(T,N1,1)--> M1 --(T,N2,0)--> M2 --(T,N3,0)--> B

   A --(T,M1,0)--> M1' --(T,M2,0)--> M2' --(T,N3,0)--> B

 where M1' and M2' occupy the same position as M1 and M2 in the schedule,
 but are one stage further along.  But we only schedule them once,
 so if we take the cycle/SCHED_TIME route, we have to introduce
 dependencies of distance -1.


Interesting; had to digest this distance 1 business, a result of
thinking in cycles instead of rows (or conversely), and mixing
dependences with scheduling; here's my understanding, based on your
explanations:

Suppose a Use is truly dependent on a Def, where both have been
scheduled at some absolute cycles; think of them as timing the first
iteration of the loop.
Assume first that Use appears originally after Def in the original
instruction sequence of the loop (dependence distance 0). In this
case, Use requires register moves if its distance D from Def according
to the schedule is more than ii cycles long -- by the time Use is
executed, the value it needs is no longer available in the def'd
register due to intervening occurrences of Def. So in this case, the
first reg-move (among D/ii) should appear after Def, recording its
value before the next occurrence of Def overwrites it, and feeding
subsequent moves as needed before each is overwritten. Thus the
scheduling window of this first reg-move is within (Def, Def+ii).

Now, suppose Use appears before Def, i.e., Use is upwards-exposed; if
it remains 

Re: [C++ Patch] PR 50660

2011-10-09 Thread Paolo Carlini

On 10/10/2011 12:41 AM, Jason Merrill wrote:

On 10/09/2011 11:40 PM, Jason Merrill wrote:

Hmm, I guess it's unlikely that a conversion is going to hit both that
warning and another one. OK.


Wait...how about changing conversion_null_warnings to stop looking 
through references?  Does that break anything?

If I just do this (I hope it's what you had in mind):

 static void
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
-  tree t = non_reference (totype);
+  tree t = totype; /*non_reference (totype); */

I see this failure, for sure: cpp0x/variadic111.C, that is:

// PR c++/48424
// { dg-options "-std=c++0x" }

template<typename... Args1>
struct S
{
  template<typename... Args2>
    void f(Args1... args1, Args2... args2)
    {
    }
};

int main()
{
  S<int, double> s;
  s.f(1,2.0,false,'a');
}

triggers:

variadic111.C:16:22: warning: converting ‘false’ to pointer type for
argument 3 of ‘void S<Args1>::f(Args1 ..., Args2 ...) [with Args2 =
{bool, char}; Args1 = {int, double}]’ [-Wconversion-null]


Also, tree-ssa/copyprop.C, for example.

Paolo.



Re: [wwwdocs] add libstdc++/1773 change to gcc-4.7/changes.html

2011-10-09 Thread Gerald Pfeifer
On Tue, 4 Oct 2011, Jonathan Wakely wrote:
 I've committed this, which documents the fix for
 http://gcc.gnu.org/PR1773 in gcc-4.7/changes.html, and also replaces
 some > characters with the &gt; entity.

Interesting that the latter was not caught by the validator?  Thanks
for addressing it, Jonathan!

There also is a minor change on top of yours that I just committed;
see below.

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.47
diff -u -r1.47 changes.html
--- changes.html9 Oct 2011 23:08:14 -   1.47
+++ changes.html10 Oct 2011 01:06:49 -
@@ -267,8 +267,8 @@
 } a; // initializes a.i to 42
   </pre></blockquote></li>
 
-  <li>G++ now sets the predefined macro <tt>__cplusplus</tt> to the
-correct value, <tt>199711L</tt>.
+  <li>G++ now sets the predefined macro <code>__cplusplus</code> to the
+correct value, <code>199711L</code>.
   </li>
 </ul>
   


Re: [google] record compiler options to .note sections

2011-10-09 Thread Dehao Chen
On Sun, Oct 9, 2011 at 5:28 PM, Jakub Jelinek <ja...@redhat.com> wrote:
 On Sun, Oct 09, 2011 at 09:18:25AM +0800, Dehao Chen wrote:
 Unfortunately -frecord-gcc-switches cannot serve our purpose because
 the recorded switches are mergeable, i.e. the linker will merge all
 options to a set of strings. However, object files may have distinct
 compile options. We want to preserve every object file's compile
 options when doing LIPO build.

 And -grecord-gcc-switches?  That one, although it is mergeable, still
 preserves every object file's compile options.

I tried -grecord-gcc-switches, but it looks like it's not recording
the options that I want.

E.g. the following two commands output the same assembly code, while
the former should record one more option.

gcc -g3 -grecord-gcc-switches a.c -Dabcdefgh -Dxy -I/usr/ -S
gcc -g3 -grecord-gcc-switches a.c -Dabcdefgh -Dxy -S

Thanks,
Dehao


        Jakub



Re: [CRIS] Hookize PREFERRED_RELOAD_CLASS

2011-10-09 Thread Hans-Peter Nilsson
 Date: Sun, 9 Oct 2011 17:47:22 +0400
 From: Anatoly Sokolov <ae...@post.ru>

   OK to install?
 
 * config/cris/cris.c (cris_preferred_reload_class): New function.
 (TARGET_PREFERRED_RELOAD_CLASS): Define.
 * config/cris/cris.h (OUTPUT_ADDR_CONST_EXTRA): Remove.
                       ^^^^^^^^^^^^^^^^^^^^^^^
With the macro name in the ChangeLog entry fixed, yes, thanks.

brgds, H-P