Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2014-01-17 Thread Alan Modra
This patch cures PR59828 by translating all the -mcpu options at once,
in order, to their equivalent assembler -m options by using a new spec
function.  In the process this removes some duplication.

All the rhs of -mcpu= options from the command line can be extracted
with %{mcpu=*:%*}, and then passed to a spec function.  The new
function was mostly already there in driver-rs6000.c to support
-mcpu=native.  However, the new spec function must be called for
non-native configurations, so it's necessary to split driver-rs6000.c
into two files, one for native support, the other always compiled in.
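
As a rough illustration of the idea described above, here is a minimal sketch of a lookup that takes every -mcpu= value in command-line order and returns one assembler option. The table contents, the function name example_translate_cpu_to_asm, the "-mppc" fallback and the "last option wins" rule are all assumptions made for illustration; they are not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative mapping only.  The real asm_names table in
   driver-rs6000.c is larger and may use different strings.  */
struct example_cpu_asm_map
{
  const char *cpu;
  const char *asm_opt;
};

static const struct example_cpu_asm_map example_map[] = {
  { "power7", "-mpower7" },
  { "power8", "-mpower8" },
  { "powerpc64", "-mppc64" },
};

/* Hypothetical helper: ARGV holds the right-hand side of every
   -mcpu= option in command-line order.  This sketch assumes the
   last one takes precedence and returns a single assembler option,
   falling back to "-mppc" for unknown or missing names.  */
const char *
example_translate_cpu_to_asm (int argc, const char **argv)
{
  const char *cpu;
  size_t i;

  if (argc < 1)
    return "-mppc";

  cpu = argv[argc - 1];
  for (i = 0; i < sizeof example_map / sizeof example_map[0]; i++)
    if (strcmp (cpu, example_map[i].cpu) == 0)
      return example_map[i].asm_opt;

  return "-mppc";
}
```

Looking at all values in one call means only one assembler option is produced, rather than one per -mcpu= option.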

I deliberately omitted converting over aix42.h, aix51.h and aix52.h
because ASM_CPU_SPEC in those files translates -mcpu to different
assembly options than the aix -mcpu=native support.  Presumably the
assembler on older versions of aix doesn't understand the newer
options.

Bootstrapped and regression tested powerpc64-linux natively, and
x86_64-linux to powerpc64-linux cross.  OK to apply?

PR target/59828
* config/rs6000/driver-rs6000.c (asm_names): Add entries for "native".
(translate_cpu_to_asm): New function.
Move everything else to..
* config/rs6000/driver-nat-rs6000.c: ..here.  New file.
(host_detect_local_cpu): Make use of translate_cpu_to_asm.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
(translate_cpu_to_asm): Declare.
(EXTRA_SPEC_FUNCTIONS): Add translate_cpu_to_asm.
* config/rs6000/x-rs6000: Adjust for renamed file.
* config/rs6000/t-rs6000: Add driver-rs6000.o rule.
* config/rs6000/aix53.h (ASM_CPU_SPEC): Use translate_cpu_to_asm.
* config/rs6000/aix61.h (ASM_CPU_SPEC): Likewise.
* config.gcc (extra_gcc_objs): Add driver-rs6000.o.
* config.host (host_extra_gcc_objs): Remove driver-rs6000.o, add
driver-nat-rs6000.o.

Index: gcc/config/rs6000/driver-rs6000.c
===
--- gcc/config/rs6000/driver-rs6000.c   (revision 206599)
+++ gcc/config/rs6000/driver-rs6000.c   (working copy)
@@ -21,325 +21,9 @@
 #include "system.h"
 #include "coretypes.h"
 #include "tm.h"
-#include 
 
-#ifdef _AIX
-# include 
-#endif
+/* Array to map -mcpu= names to the switches passed to the assembler.  */
 
-#ifdef __linux__
-# include 
-#endif
-
-#if defined (__APPLE__) || (__FreeBSD__)
-# include 
-# include 
-#endif
-
-const char *host_detect_local_cpu (int argc, const char **argv);
-
-#if GCC_VERSION >= 0
-
-/* Returns parameters that describe L1_ASSOC associative cache of size
-   L1_SIZEKB with lines of size L1_LINE, and L2_SIZEKB.  */
-
-static char *
-describe_cache (unsigned l1_sizekb, unsigned l1_line,
-   unsigned l1_assoc ATTRIBUTE_UNUSED, unsigned l2_sizekb)
-{
-  char l1size[1000], line[1000], l2size[1000];
-
-  /* At the moment, gcc middle-end does not use the information about the
- associativity of the cache.  */
-
-  sprintf (l1size, "--param l1-cache-size=%u", l1_sizekb);
-  sprintf (line, "--param l1-cache-line-size=%u", l1_line);
-  sprintf (l2size, "--param l2-cache-size=%u", l2_sizekb);
-
-  return concat (l1size, " ", line, " ", l2size, " ", NULL);
-}
-
-#ifdef __APPLE__
-
-/* Returns the description of caches on Darwin.  */
-
-static char *
-detect_caches_darwin (void)
-{
-  unsigned l1_sizekb, l1_line, l1_assoc, l2_sizekb;
-  size_t len = 4;
-  static int l1_size_name[2] = { CTL_HW, HW_L1DCACHESIZE };
-  static int l1_line_name[2] = { CTL_HW, HW_CACHELINE };
-  static int l2_size_name[2] = { CTL_HW, HW_L2CACHESIZE };
-
-  sysctl (l1_size_name, 2, &l1_sizekb, &len, NULL, 0);
-  sysctl (l1_line_name, 2, &l1_line, &len, NULL, 0);
-  sysctl (l2_size_name, 2, &l2_sizekb, &len, NULL, 0);
-  l1_assoc = 0;
-
-  return describe_cache (l1_sizekb / 1024, l1_line, l1_assoc,
-l2_sizekb / 1024);
-}
-
-static const char *
-detect_processor_darwin (void)
-{
-  unsigned int proc;
-  size_t len = 4;
-
-  sysctlbyname ("hw.cpusubtype", &proc, &len, NULL, 0);
-
-  if (len > 0)
-switch (proc)
-  {
-  case 1:
-   return "601";
-  case 2:
-   return "602";
-  case 3:
-   return "603";
-  case 4:
-  case 5:
-   return "603e";
-  case 6:
-   return "604";
-  case 7:
-   return "604e";
-  case 8:
-   return "620";
-  case 9:
-   return "750";
-  case 10:
-   return "7400";
-  case 11:
-   return "7450";
-  case 100:
-   return "970";
-  default:
-   return "powerpc";
-  }
-
-  return "powerpc";
-}
-
-#endif /* __APPLE__ */
-
-#ifdef __FreeBSD__
-
-/* Returns the description of caches on FreeBSD PPC.  */
-
-static char *
-detect_caches_freebsd (void)
-{
-  unsigned l1_sizekb, l1_line, l1_assoc, l2_sizekb;
-  size_t len = 4;
-
-  /* Currently, as of FreeBSD-7.0, there is only the cacheline_size
- available via sysctl.  */
-  sysctlbyname ("machdep.cacheline_size", &l1_line, &len, NULL, 0);
-

[committed] PATCH: Fix a comment typo in ix86_split_lea_for_addr

2014-01-17 Thread H.J. Lu
Hi,

I checked in this patch as an obvious fix to correct a comment typo
in ix86_split_lea_for_addr.  The line below the comments is

gcc_assert (regno2 != regno0);

There is no way for r1 = r1 + C * r1.  It should be r1 = r1 + C * r2.

H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 206744)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2014-01-17  H.J. Lu  
+
+   * config/i386/i386.c (ix86_split_lea_for_addr): Fix a comment
+   typo.
+
 2014-01-17  John David Anglin  
 
* config/pa/pa.c (pa_attr_length_indirect_call): Don't output a short
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 206744)
+++ config/i386/i386.c  (working copy)
@@ -18309,7 +18309,7 @@ ix86_split_lea_for_addr (rtx insn, rtx o
   /* Case r1 = r1 + ...  */
   if (regno1 == regno0)
{
- /* If we have a case r1 = r1 + C * r1 then we
+ /* If we have a case r1 = r1 + C * r2 then we
 should use multiplication which is very
 expensive.  Assume cost model is wrong if we
 have such case here.  */


Re: [patch] Pass -fuse-ld=gold to gccgo on targets supporting -fsplit-stack

2014-01-17 Thread Ian Lance Taylor
On Fri, Nov 29, 2013 at 5:29 AM, Matthias Klose  wrote:
> to get full advantage of the -fsplit-stack option, gccgo binaries have to be
> linked with gold, not the bfd linker.  When the system linker defaults to the
> bfd linker, then gccgo should explicitly use the gold linker, passing
> fuse-ld=gold, unless another -fuse-ld option is present.  Tested with and
> without having ld.gold on the system.

The change to libgo/configure.ac seems unrelated.

I don't think you can use "which" in a configure script.  You need to
use something like AC_PATH_PROG.

Ian


[committed] Fix long call support for indirect calls on hppa

2014-01-17 Thread John David Anglin
The attached change fixes a problem detected building gcl.  A short call
to $$dyncall was sometimes output when doing long call sequences.  This
caused a link error.

Tested on hppa2.0-hp-hpux11.11, hppa64-hp-hpux11.11 and
hppa-unknown-linux-gnu.  Committed to active branches.

Dave
--
John David Anglin   dave.ang...@bell.net


2014-01-17  John David Anglin  

* config/pa/pa.c (pa_attr_length_indirect_call): Don't output a short
call to $$dyncall when TARGET_LONG_CALLS is true.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 206593)
+++ config/pa/pa.c  (working copy)
@@ -8099,7 +8093,8 @@
 return 12;
 
   if (TARGET_FAST_INDIRECT_CALLS
-  || (!TARGET_PORTABLE_RUNTIME
+  || (!TARGET_LONG_CALLS
+ && !TARGET_PORTABLE_RUNTIME
  && ((TARGET_PA_20 && !TARGET_SOM && distance < 760)
  || distance < MAX_PCREL17F_OFFSET)))
 return 8;


Re: Allow passing arrays in registers on AArch64

2014-01-17 Thread Michael Hudson-Doyle
Ian Lance Taylor  writes:

> On Fri, Jan 17, 2014 at 11:32 AM, Michael Hudson-Doyle
>  wrote:
>>
>> On 18 Jan 2014 07:50, "Yufeng Zhang"  wrote:
>>>
>>> Also can you please try to add some new test(s)?  It may not be that
>>> straightforward to add non-C/C++ tests, but give it a try.
>>
>> Can you give some hints? Like at least where in the tree such a test would
>> go? I don't know this code at all.
>
> There is already a test in libgo, of course.
>
> I think it would be pretty hard to write a test that doesn't do something
> like what libgo does.  The problem is that GCC is entirely consistent
> with and without your patch.  You could add a Go test that passes an
> array in gcc/testsuite/go.go-torture/execute/ easily enough, but it
> would be quite hard to add a test that doesn't pass whether or not
> your patch is applied.

I think it would have to be a code generation test, i.e. that compiling
something like

func second(e [2]int64) int64 {
return e[1]
}

does not access memory or something along those lines.  I'll have a look
next week.

Cheers,
mwh


Re: [patch] fix doc header in contrib/mklog

2014-01-17 Thread Diego Novillo
The patch is OK. It qualifies as obvious, too. Thanks.

On Thu, Jan 16, 2014 at 6:44 AM, Jonathan Wakely  wrote:
> The mklog script claims to write to stdout, but it actually modifies
> the input file in-place.
>
> OK to commit this change, which also updates the copyright dates?


Re: [PING][PATCH]Improving mklog [was: Re: RFC Asan instrumentation control]

2014-01-17 Thread Diego Novillo
Apologies for the delay. The patch is OK.

On Thu, Jan 16, 2014 at 12:59 AM, Tatiana Udalova  wrote:
> Ping!
>
> Thank you,
> Tatiana Udalova
>
>
> --
>
> Hello,
>
> I have reproduced the problem with mklog mentioned by Jakub:
>
>> In my experience mklog is pretty much useless, e.g. if you add a new
>> function, it will list the previous function as being modified rather
>> than the new one, etc.
>
> My focus was on functions from headers of diff-log chunks.
>
> I hacked a simple addition to mklog which skips unchanged functions in
> diff-log while adding function names to the final ChangeLog.
>
> New mklog results were verified by testsuite which compares reference
> ChangeLogs of patches from gcc trunk with logs generated by mklog.
>
> Patched mklog considerably reduced the number of unchanged functions in
> ChangeLog.
>
> Is it OK for trunk?
>
> Thank you,
> Tatiana Udalova
>
>


Re: [GOOGLE] Restrict the count_scale to be no larger than 100%

2014-01-17 Thread Xinliang David Li
Can callgraph node count be fixed up properly instead of doing
individual fixups like this?

David

On Fri, Jan 17, 2014 at 2:38 PM, Dehao Chen  wrote:
> In AutoFDO, an edge count is sometimes propagated to be too large
> due to bad debug info. In these cases, we need to make sure the count
> scale is no larger than 100%; otherwise it will make really hot code cold.
>
> Bootstrapped and passed regression test. Performance test on-going.
>
> OK for google-4_8 if performance test is ok?
>
> Thanks,
> Dehao
>
> Index: gcc/tree-inline.c
> ===
> --- gcc/tree-inline.c (revision 206721)
> +++ gcc/tree-inline.c (working copy)
> @@ -2262,6 +2262,9 @@ copy_cfg_body (copy_body_data * id, gcov_type coun
>else
>  count_scale = REG_BR_PROB_BASE;
>
> +  if (flag_auto_profile && count_scale > REG_BR_PROB_BASE)
> +count_scale = REG_BR_PROB_BASE;
> +
>/* Register specific tree functions.  */
>gimple_register_cfg_hooks ();
>
> Index: gcc/cgraphclones.c
> ===
> --- gcc/cgraphclones.c (revision 206721)
> +++ gcc/cgraphclones.c (working copy)
> @@ -216,7 +216,10 @@ cgraph_clone_node (struct cgraph_node *n, tree dec
>   count, we will not update the original callee because it may
>   mistakenly mark some hot function as cold.  */
>if (flag_auto_profile && count >= n->count)
> -update_original = false;
> +{
> +  update_original = false;
> +  new_node->count = n->count;
> +}
>if (update_original)
>  {
>n->count -= count;


libgo patch committed: Align variable on 8-byte boundary

2014-01-17 Thread Ian Lance Taylor
This patch to libgo aligns the variable work in mgc0.c on an 8-byte
boundary.  The code expects that to be the case, and complains if it is
not aligned.  It would already be the case on a 64-bit system, but not
necessarily on a 32-bit system.  I haven't been able to recreate PR
59866, but I think this will fix it.  Bootstrapped and ran Go testsuite
on x86_64-unknown-linux-gnu.  Committed to 4.8 branch and mainline.
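
For readers unfamiliar with the attribute, here is a small sketch of the same technique on a stand-in variable. The names example_work and example_work_is_aligned are hypothetical, and the struct is not the one from mgc0.c. The members only require 4-byte alignment, so without the attribute a 32-bit target would be free to place the variable on a 4-byte boundary.

```c
#include <stdint.h>

/* Hypothetical stand-in for the libgo variable: every member needs
   only 4-byte alignment, so the attribute is what guarantees the
   8-byte boundary for the variable as a whole.  */
static struct
{
  uint32_t nroot;
  uint32_t rootcap;
} example_work __attribute__ ((aligned (8)));

/* Returns nonzero if the variable starts on an 8-byte boundary.  */
int
example_work_is_aligned (void)
{
  return ((uintptr_t) &example_work % 8) == 0;
}
```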

Ian

diff -r f1ef2f3189a2 libgo/runtime/mgc0.c
--- a/libgo/runtime/mgc0.c	Thu Jan 16 20:32:18 2014 -0800
+++ b/libgo/runtime/mgc0.c	Fri Jan 17 14:36:08 2014 -0800
@@ -180,7 +180,7 @@
 	Obj	*roots;
 	uint32	nroot;
 	uint32	rootcap;
-} work;
+} work __attribute__((aligned(8)));
 
 enum {
 	GC_DEFAULT_PTR = GC_NUM_INSTR,


[GOOGLE] Restrict the count_scale to be no larger than 100%

2014-01-17 Thread Dehao Chen
In AutoFDO, an edge count is sometimes propagated to be too large
due to bad debug info. In these cases, we need to make sure the count
scale is no larger than 100%; otherwise it will make really hot code cold.
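
The clamp can be sketched in isolation as follows. This is a simplified model, not the code in tree-inline.c: EXAMPLE_PROB_BASE stands in for REG_BR_PROB_BASE, example_count_scale is a hypothetical name, and the scale is computed with plain truncating division.

```c
#include <stdint.h>

/* Stand-in for REG_BR_PROB_BASE; a scale equal to this means 100%.  */
#define EXAMPLE_PROB_BASE 10000

/* Hypothetical helper: scale of COUNT relative to ENTRY_COUNT,
   clamped to 100% when AUTO_PROFILE is nonzero.  */
int64_t
example_count_scale (int64_t count, int64_t entry_count, int auto_profile)
{
  int64_t scale = EXAMPLE_PROB_BASE;

  if (entry_count > 0)
    scale = count * EXAMPLE_PROB_BASE / entry_count;

  /* With sampled profiles the propagated count can exceed the
     entry count; do not scale anything up in that case.  */
  if (auto_profile && scale > EXAMPLE_PROB_BASE)
    scale = EXAMPLE_PROB_BASE;

  return scale;
}
```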

Bootstrapped and passed regression test. Performance test on-going.

OK for google-4_8 if performance test is ok?

Thanks,
Dehao

Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c (revision 206721)
+++ gcc/tree-inline.c (working copy)
@@ -2262,6 +2262,9 @@ copy_cfg_body (copy_body_data * id, gcov_type coun
   else
 count_scale = REG_BR_PROB_BASE;

+  if (flag_auto_profile && count_scale > REG_BR_PROB_BASE)
+count_scale = REG_BR_PROB_BASE;
+
   /* Register specific tree functions.  */
   gimple_register_cfg_hooks ();

Index: gcc/cgraphclones.c
===
--- gcc/cgraphclones.c (revision 206721)
+++ gcc/cgraphclones.c (working copy)
@@ -216,7 +216,10 @@ cgraph_clone_node (struct cgraph_node *n, tree dec
  count, we will not update the original callee because it may
  mistakenly mark some hot function as cold.  */
   if (flag_auto_profile && count >= n->count)
-update_original = false;
+{
+  update_original = false;
+  new_node->count = n->count;
+}
   if (update_original)
 {
   n->count -= count;


Re: [RFA] [PATCH][PR tree-optimization/59749] Fix recently introduced ree bug

2014-01-17 Thread Jeff Law

On 01/17/14 14:51, Jeff Law wrote:
> Anyway, I clearly need to rethink that test.  Given this is something we
> haven't seen in the wild, I'm going to disable it over the
> weekend/monday so that enable-checking bugs pass and continue to ponder.

And the patch which disables the test.




jeff




commit 78f81be0894b38090cd6280f1e303610434d75c5
Author: Jeff Law 
Date:   Thu Jan 16 14:23:15 2014 -0700

   * ree.c (combine_set_extension): Temporarily disable test for
changing number of hard registers.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37023c8..fabe408 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2014-01-17  Jeff Law  
+
+   * ree.c (combine_set_extension): Temporarily disable test for
+   changing number of hard registers.
+
 2014-01-17  Jan Hubicka  
 
PR middle-end/58125
diff --git a/gcc/ree.c b/gcc/ree.c
index 19d821c..421eb6c 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -297,11 +297,15 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
   else
 new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
 
+#if 0
+  /* Rethinking test.  Temporarily disabled.  */
   /* We're going to be widening the result of DEF_INSN, ensure that doing so
  doesn't change the number of hard registers needed for the result.  */
   if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
-  != HARD_REGNO_NREGS (REGNO (orig_src), GET_MODE (SET_DEST (*orig_set))))
+  != HARD_REGNO_NREGS (REGNO (SET_DEST (*orig_set)),
+  GET_MODE (SET_DEST (*orig_set))))
return false;
+#endif
 
   /* Merge constants by directly moving the constant into the register under
  some conditions.  Recall that RTL constants are sign-extended.  */


Re: [RFA] [PATCH][PR tree-optimization/59749] Fix recently introduced ree bug

2014-01-17 Thread Jeff Law

On 01/16/14 15:07, Jakub Jelinek wrote:

On Thu, Jan 16, 2014 at 02:31:09PM -0700, Jeff Law wrote:

+2014-01-16  Jeff Law  
+
+   * ree.c (combine_set_extension): Correct test for changing number
+   of hard registers when widening a reaching definition.
+
  2014-01-16  Bernd Schmidt  

PR middle-end/56791
diff --git a/gcc/ree.c b/gcc/ree.c
index 19d821c..96cddd2 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -300,7 +300,8 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
/* We're going to be widening the result of DEF_INSN, ensure that doing so
   doesn't change the number of hard registers needed for the result.  */
if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
-  != HARD_REGNO_NREGS (REGNO (orig_src), GET_MODE (SET_DEST (*orig_set))))
+  != HARD_REGNO_NREGS (REGNO (SET_DEST (*orig_set)),
+  GET_MODE (SET_DEST (*orig_set))))


Shouldn't that be:
 if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
!= HARD_REGNO_NREGS (REGNO (new_reg), GET_MODE (SET_DEST (*orig_set))))
instead?

I mean, for the !copy_needed case it is obviously the same thing (and that
is what triggers in the testcase), but don't we generally want to check if
the same hard register in a wider mode will not occupy more registers, and
in particular the hard register we are considering to use on the lhs of the
defining insn (i.e. new_reg)?

I thought about using that conditional more than once.  But talked
myself out of it every time on the grounds that I wanted to test the
original destination REGNO of the reaching def.

Obviously that is REGNO (new_reg) if !copy_needed.  But it's something
completely different if copy_needed.

In the copy_needed case there's actually two destinations to consider.
The original destination as well as the new destination.  Both will be
set in a mode wider than the destination of the original reaching def.
(one will be set in the modified reaching def and the other in a copy insn).

ISTM we need the # hard reg checked on the original destination as the
other (upper) hard regs might be live across the sequence, but not
used/set in the sequence.  Then we need some kind of check on the upper
part of the new destination...  But I thought I covered that elsewhere...

Anyway, I clearly need to rethink that test.  Given this is something we
haven't seen in the wild, I'm going to disable it over the
weekend/monday so that enable-checking bugs pass and continue to ponder.
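
To make the kind of check under discussion concrete, here is a deliberately simplified model. It assumes every hard register has the same size in bytes, which the real target-defined HARD_REGNO_NREGS does not assume, and the helper names are hypothetical.

```c
/* Simplified model: number of hard registers needed to hold a value
   of MODE_BYTES bytes when each register holds REG_BYTES bytes.
   REG_BYTES must be nonzero.  */
static unsigned
example_nregs (unsigned mode_bytes, unsigned reg_bytes)
{
  return (mode_bytes + reg_bytes - 1) / reg_bytes;
}

/* Returns nonzero if widening a value from NARROW_BYTES to WIDE_BYTES
   leaves the number of hard registers unchanged, i.e. the widened
   value would not spill into a neighbouring register.  */
int
example_widening_keeps_nregs (unsigned narrow_bytes, unsigned wide_bytes,
			      unsigned reg_bytes)
{
  return example_nregs (narrow_bytes, reg_bytes)
	 == example_nregs (wide_bytes, reg_bytes);
}
```

In this model, widening a 4-byte value to 8 bytes is safe with 8-byte registers but not with 4-byte registers. The open question in the thread is which register number the real check should be evaluated for.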


jeff




Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-17 Thread Jan Hubicka
>   * combine-stack-adj.c (combine_stack_adjustments_for_block): Remove
>   ARG_SIZE note when adjustment was eliminated.

Ping...  This patch is blocking me from switching the accumulate-args default
for generic, and I am waiting for that with the inliner tuning, so there is
quite a dependency chain.

Thanks,
Honza


Re: [RFA] [PATCH][PR tree-optimization/59749] Fix recently introduced ree bug

2014-01-17 Thread Jeff Law

On 01/17/14 01:41, Eric Botcazou wrote:
>> Bootstrapped & regression tested on x86_64-unknown-linux.  Also
>> bootstrapped with --enable-checking=rtl.
>
> Note that you can do only one bootstrap with --enable-checking=yes,rtl.
>
>> Installed on the trunk.
>
> As far as I can see, no, it was not installed.

Pilot error, it's not installed.

jeff


Re: [google gcc-4_8] port gcov-tool to gcc-4_8

2014-01-17 Thread Xinliang David Li
Ok for google branch. It might be good to think about adding
regression tests (similar to those gcov tests under g++.dg/gcov, but
handling multiple files).

David

On Fri, Jan 17, 2014 at 12:59 PM, Rong Xu  wrote:
> The last attachment was not the complete patch. Resending.
>
> On Fri, Jan 17, 2014 at 12:51 PM, Rong Xu  wrote:
>> Do we have to split params.def?
>>
>> I can include params.h and link in params.o (a special version as we
>> don't have some global vars).
>>
>> As for lipo_cutoff, I think we don't need any special handling -- we
>> should use the default value of 100 and let the logic in dyn-ipa.c take
>> care of the rest.
>>
>> Please find the updated patch, which reads the default values from params.def.
>>
>> -Rong
>>
>> On Fri, Jan 17, 2014 at 12:05 PM, Xinliang David Li wrote:
>>> For LIPO parameters, you can do this
>>> 1) isolate all LIPO specific parameters into lipo_params.def, and
>>> include it from params.def.
>>> 2) include lipo_params.def (after proper definition of DEFPARAM) in
>>> the profile tool source dir.
>>>
>>> By so doing, the compiler setting and profile tool will be in sync.
>>> (Unfortunately, for historical reasons, the lipo_cutoff default value is
>>> still set in dyn-ipa.c even with the parameter, so a comment needs
>>> to be added in dyn-ipa.c to make sure the value is updated in both
>>> places).
>>>
>>> David
>>>
>>> On Fri, Jan 17, 2014 at 11:15 AM, Rong Xu  wrote:
>>>> Hi,
>>>>
>>>> This patch ports the gcov-tool work to google/gcc-4_8 branches.
>>>>
>>>> Tested with spec2006, profiledbootstrap and google internal benchmarks.
>>>>
>>>> -Rong


PR middle-end/58125

2014-01-17 Thread Jan Hubicka
Hi,
this patch fixes an ICE (which no longer reproduces) where we try to free the
summary of an alias even though we never allocate one.

Bootstrapped/regtested x86_64-linux, committed.
Index: ChangeLog
===
--- ChangeLog   (revision 206732)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2014-01-17  Jan Hubicka  
+
+   PR middle-end/58125
+   * ipa-inline-analysis.c (inline_free_summary):
+   Do not free summary of aliases.
+
 2014-01-17  Jakub Jelinek  
 
PR middle-end/59706
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 206684)
+++ ipa-inline-analysis.c   (working copy)
@@ -4146,7 +4146,8 @@
   if (!inline_edge_summary_vec.exists ())
 return;
   FOR_EACH_DEFINED_FUNCTION (node)
-reset_inline_summary (node);
+if (!node->alias)
+  reset_inline_summary (node);
   if (function_insertion_hook_holder)
 cgraph_remove_function_insertion_hook (function_insertion_hook_holder);
   function_insertion_hook_holder = NULL;


Re: [google gcc-4_8] port gcov-tool to gcc-4_8

2014-01-17 Thread Rong Xu
Do we have to split params.def?

I can include params.h and link in params.o (a special version as we
don't have some global vars).

As for lipo_cutoff, I think we don't need any special handling -- we
should use the default value of 100 and let the logic in dyn-ipa.c take
care of the rest.

Please find the updated patch, which reads the default values from params.def.

-Rong

On Fri, Jan 17, 2014 at 12:05 PM, Xinliang David Li  wrote:
> For LIPO parameters, you can do this
> 1) isolate all LIPO specific parameters into lipo_params.def, and
> include it from params.def.
> 2) include lipo_params.def (after proper definition of DEFPARAM) in
> the profile tool source dir.
>
> By so doing, the compiler setting and profile tool will be in sync.
> (Unfortunately, for historical reasons, the lipo_cutoff default value is
> still set in dyn-ipa.c even with the parameter, so a comment needs
> to be added in dyn-ipa.c to make sure the value is updated in both
> places).
>
> David
>
> On Fri, Jan 17, 2014 at 11:15 AM, Rong Xu  wrote:
>> Hi,
>>
>> This patch ports the gcov-tool work to google/gcc-4_8 branches.
>>
>> Tested with spec2006, profiledbootstrap and google internal benchmarks.
>>
>> -Rong
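
The DEFPARAM approach suggested in the quoted message can be sketched as an X-macro. To keep the example in one file, the parameter list is a macro rather than a separate lipo_params.def, and DEFPARAM takes fewer arguments than the one in params.def. All EXAMPLE_ and example_ names, and the second parameter with its value, are hypothetical; only the default of 100 for the cutoff comes from the thread.

```c
/* One list of parameters, expanded twice below with different
   definitions of DEFPARAM so the two tables cannot drift apart.  */
#define EXAMPLE_LIPO_PARAMS \
  DEFPARAM (EXAMPLE_PARAM_LIPO_CUTOFF, "example-lipo-cutoff", 100) \
  DEFPARAM (EXAMPLE_PARAM_LIPO_OTHER, "example-lipo-other", 1)

/* First expansion: one enumerator per parameter.  */
enum example_param
{
#define DEFPARAM(id, name, dflt) id,
  EXAMPLE_LIPO_PARAMS
#undef DEFPARAM
  EXAMPLE_PARAM_COUNT
};

/* Second expansion: name and default value, indexed by enumerator.  */
static const struct
{
  const char *name;
  int dflt;
} example_params[] = {
#define DEFPARAM(id, name, dflt) { name, dflt },
  EXAMPLE_LIPO_PARAMS
#undef DEFPARAM
};

/* Returns the default value recorded for parameter P.  */
int
example_param_default (enum example_param p)
{
  return example_params[p].dflt;
}
```

Both the compiler and a standalone tool could expand the same list this way, which is the "in sync" property the suggestion is after.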
Index: Makefile.in
===
--- Makefile.in (revision 206671)
+++ Makefile.in (working copy)
@@ -123,7 +123,8 @@ SUBDIRS =@subdirs@ build
 
 # Selection of languages to be made.
 CONFIG_LANGUAGES = @all_selected_languages@
-LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
+LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \
+$(CONFIG_LANGUAGES)
 
 # Default values for variables overridden in Makefile fragments.
 # CFLAGS is for the user to override to, e.g., do a cross build with -O2.
@@ -194,6 +195,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D)-warn) $(
 build/gengtype-lex.o-warn = -Wno-error
 gengtype-lex.o-warn = -Wno-error
 expmed.o-warn = -Wno-error
+libgcov-util.o-warn = -Wno-error
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
@@ -755,6 +757,7 @@ GCC_TARGET_INSTALL_NAME := $(target_noncanonical)-
 CPP_INSTALL_NAME := $(shell echo cpp|sed '$(program_transform_name)')
 GCOV_INSTALL_NAME := $(shell echo gcov|sed '$(program_transform_name)')
 PROFILE_TOOL_INSTALL_NAME := $(shell echo profile_tool|sed '$(program_transform_name)')
+GCOV_TOOL_INSTALL_NAME := $(shell echo gcov-tool|sed '$(program_transform_name)')
 
 # Setup the testing framework, if you have one
 EXPECT = `if [ -f $${rootme}/../expect/expect ] ; then \
@@ -1480,7 +1483,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(CONFIG_LANG
 
 ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \
   $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \
-  $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS)
+  $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
+  $(GCOV_TOOL_OBJS)
 
 # This lists all host object files, whether they are included in this
 # compilation or not.
@@ -1505,6 +1509,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h insn
  $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \
  gcc-ranlib$(exeext) \
  gcov-iov$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \
+ gcov-tool$(exeext) \
  gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c libbackend.a \
  libcommon-target.a libcommon.a libgcc.mk
 
@@ -4070,6 +4075,24 @@ GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o
 gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \
$(LIBS) -o $@
+
+libgcov-util.o: $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \
+  $(srcdir)/../libgcc/libgcov-driver.c $(srcdir)/../libgcc/libgcov-driver-system.c \
+  $(srcdir)/../libgcc/libgcov-merge.c $(srcdir)/../libgcc/libgcov.h \
+  $(SYSTEM_H) coretypes.h $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) -o $@ $<
+dyn-ipa.o: $(srcdir)/../libgcc/dyn-ipa.c gcov-io.c $(srcdir)/../libgcc/libgcov.h \
+   $(GCOV_IO_H) $(SYSTEM_H) coretypes.h \
+   $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
+ $(INCLUDES) -o $@ $<
+
+GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o params.o
+gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \
+   $(CONFIG_H) version.h $(DIAGNOSTIC_H)
+gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS)
+   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) \
+ $(LIBS) -o $@
 #
 # Build the include directories.  The stamp files are stmp-* rather than
 # s-* so that mostlyclean does not force the include directory to
@@ -4697,6 +4720,13 @@ install-common: native lang.install-common install
$(INSTALL_PROGRAM) $(srcdir)/../contrib/profile_tool \

Re: [wide-int] resolve bootstrap issue

2014-01-17 Thread Mike Stump
On Jan 16, 2014, at 2:55 AM, Richard Sandiford wrote:
>>> Why did you need the ?  It was supposed to work without.
>> 
>> The code in question needs something that is max int + max significand
>> real in size, we made the max int smaller (smaller than this quantity on
>> x86) so, this code needs a special wide int that is bigger.  The type is
>> free as vrp uses the same type.  As for why Kenny choose this method,
>> I'd defer to him.
> 
> To be clear, I was only talking about the  in
> "wi::lrshift".  Just "wi::lrshift" should be fine.
> 
> Tested on x86_64-linux-gnu.  OK to install?

Ah, yes, I was trying to get it to compile at one point and added that; I now 
see what you mean. Yes, this is fine.

> Index: gcc/real.c
> ===
> --- gcc/real.c2014-01-15 16:39:39.883276568 +
> +++ gcc/real.c2014-01-15 16:39:40.376274546 +
> @@ -1444,7 +1444,7 @@ real_to_integer (const REAL_VALUE_TYPE *
>   w = SIGSZ * HOST_BITS_PER_LONG + words * HOST_BITS_PER_WIDE_INT;
>   tmp = real_int::from_array
>   (val, (w + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT, w);
> -  tmp = wi::lrshift (tmp, (words * HOST_BITS_PER_WIDE_INT) - 
> exp);
> +  tmp = wi::lrshift (tmp, (words * HOST_BITS_PER_WIDE_INT) - exp);
>   result = wide_int::from (tmp, precision, UNSIGNED);
> 
>   if (r->sign)


Re: [PATCH] FIx up ANNOTATE_EXPR gimplification (PR middle-end/59706)

2014-01-17 Thread Richard Biener
Jakub Jelinek  wrote:
>Hi!
>
>When gimplifying ANNOTATE_EXPR, gimplify_expr used create_tmp_var_raw,
>which unfortunately (among tons of other desirable things) doesn't set
>DECL_CONTEXT on the temporary var and tree-nested.c then ICEs on it
>because of that.  The following patch fixes that.  Unfortunately,
>on the second (invalid) testcase this started to ICE during error
>recovery,
>so the patch emits the IFN_ANNOTATE internal call only if the cond
>doesn't
>have obviously bogus type.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

>2014-01-17  Jakub Jelinek  
>
>   PR middle-end/59706
>   * gimplify.c (gimplify_expr): Use create_tmp_var
>   instead of create_tmp_var_raw.  If cond doesn't have
>   integral type, don't add the IFN_ANNOTATE builtin at all.
>
>   * gfortran.dg/pr59706.f90: New test.
>   * g++.dg/ext/pr59706.C: New test.
>
>--- gcc/gimplify.c.jj  2014-01-08 10:23:24.0 +0100
>+++ gcc/gimplify.c 2014-01-17 16:51:12.324526084 +0100
>@@ -7491,7 +7491,14 @@ gimplify_expr (tree *expr_p, gimple_seq
> {
>   tree cond = TREE_OPERAND (*expr_p, 0);
>   tree id = TREE_OPERAND (*expr_p, 1);
>-  tree tmp = create_tmp_var_raw (TREE_TYPE(cond), NULL);
>+  tree type = TREE_TYPE (cond);
>+  if (!INTEGRAL_TYPE_P (type))
>+{
>+  *expr_p = cond;
>+  ret = GS_OK;
>+  break;
>+}
>+  tree tmp = create_tmp_var (type, NULL);
>   gimplify_arg (&cond, pre_p, EXPR_LOCATION (*expr_p));
>   gimple call = gimple_build_call_internal (IFN_ANNOTATE, 2,
> cond, id);
>--- gcc/testsuite/gfortran.dg/pr59706.f90.jj   2014-01-17 17:19:23.665900803 +0100
>+++ gcc/testsuite/gfortran.dg/pr59706.f90  2014-01-17 17:17:48.0 +0100
>@@ -0,0 +1,10 @@
>+! PR middle-end/59706
>+! { dg-do compile }
>+
>+  integer i
>+  do concurrent (i=1:2)
>+  end do
>+contains
>+  subroutine foo
>+  end 
>+end
>--- gcc/testsuite/g++.dg/ext/pr59706.C.jj  2014-01-17 17:23:46.999556115 +0100
>+++ gcc/testsuite/g++.dg/ext/pr59706.C 2014-01-17 17:20:53.0 +0100
>@@ -0,0 +1,21 @@
>+// PR middle-end/59706
>+// { dg-do compile }
>+
>+extern struct S s;
>+struct T { T (); ~T (); int t; } t;
>+
>+void
>+foo ()
>+{
>+  #pragma GCC ivdep
>+  while (s)   // { dg-error "could not convert" }
>+;
>+}
>+
>+void
>+bar ()
>+{
>+  #pragma GCC ivdep
>+  while (t)   // { dg-error "could not convert" }
>+;
>+}
>
>   Jakub




[gomp4] Generalize mapping functions for future OpenACC runtime library usage.

2014-01-17 Thread James Norris

Hi!

Here is a patch that changes the mapping functions to use functions defined
with the device descriptor. These changes generalize the mapping functions
so that they can be used by the soon to be added OpenACC Runtime functions.
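
As a generic illustration of routing mapping operations through a device descriptor, here is a minimal sketch using a table of function pointers. None of the names below come from the patch: struct example_device_descr, its members and the host-only "device" built on malloc and memcpy are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor: each device supplies its own handlers.  */
struct example_device_descr
{
  void *(*alloc_func) (size_t size);
  void (*free_func) (void *dev_ptr);
  void (*host2dev_func) (void *dev_ptr, const void *host_ptr, size_t size);
  void (*dev2host_func) (void *host_ptr, const void *dev_ptr, size_t size);
};

/* Host fallback handlers: "device" memory is ordinary heap memory.  */
static void
example_host_copy_in (void *dev_ptr, const void *host_ptr, size_t size)
{
  memcpy (dev_ptr, host_ptr, size);
}

static void
example_host_copy_out (void *host_ptr, const void *dev_ptr, size_t size)
{
  memcpy (host_ptr, dev_ptr, size);
}

const struct example_device_descr example_host_device = {
  malloc, free, example_host_copy_in, example_host_copy_out
};

/* Generic mapping code: it only calls through the descriptor, so the
   same routine could serve any runtime that provides a descriptor.
   Returns 0 on success, -1 if the device allocation fails.  */
int
example_roundtrip (const struct example_device_descr *dev,
		   const void *src, void *dst, size_t size)
{
  void *dev_ptr = dev->alloc_func (size);

  if (dev_ptr == NULL)
    return -1;
  dev->host2dev_func (dev_ptr, src, size);
  dev->dev2host_func (dst, dev_ptr, size);
  dev->free_func (dev_ptr);
  return 0;
}
```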

These changes assume that the previous set of changes described in:
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00527.html, have been
applied.

Would appreciate a review of the changes.

Thanks!


Change mapping functionality to use handlers defined by device
descriptor. These changes are required for the addition of the
OpenACC Runtime Library.

* ialias.h: New file
* libgomp.h: Remove definitions and include file from where definitions
  now reside.
* splay-tree.c: Add forward reference to splay_compare, replace call to
  abort with call to gomp_fatal, and fix comment formatting.
* splay-tree.h: Remove inclusion of header file, add definition for
  splay_tree_key_s, remove splay_compare definition, and fix a typo.
* target.c: Remove inclusion of unused header files, add inclusion of
  header file target.h, remove definition of structures: target_mem_desc
  and gomp_device_desc, and fix comment formatting.
  (gomp_map_vars, gomp_unmap_tgt, gomp_update): Replace calls with
  corresponding calls defined by device descriptor (gomp_device_descr).
  (gomp_find_available_plugins): Replace call to realloc with call to
  gomp_realloc and add assignments for new device descriptor operations.
* target.h: Remove structure definition splay_tree_key_s, add structure
  definition target_mem_desc and gomp_device_desc, add new function
  handlers to gomp_device_descr.

diff --git a/libgomp/ialias.h b/libgomp/ialias.h
new file mode 100644
index 000..167907b
--- /dev/null
+++ b/libgomp/ialias.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Jakub Jelinek .
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+/* Defines shared between OpenACC and OpenMP  */
+
+#ifndef IALIAS_H
+#define IALIAS_H 1
+
+#ifdef HAVE_ATTRIBUTE_VISIBILITY
+# define attribute_hidden __attribute__ ((visibility ("hidden")))
+#else
+# define attribute_hidden
+#endif
+
+#ifdef HAVE_ATTRIBUTE_ALIAS
+# define ialias_ulp ialias_str1(__USER_LABEL_PREFIX__)
+# define ialias_str1(x) ialias_str2(x)
+# define ialias_str2(x) #x
+# define ialias(fn) \
+  extern __typeof (fn) goacc_ialias_##fn \
+__attribute__ ((alias (#fn))) attribute_hidden;
+# define ialias_redirect(fn) \
+  extern __typeof (fn) fn __asm__ (ialias_ulp "goacc_ialias_" #fn) attribute_hidden;
+# define ialias_call(fn) goacc_ialias_ ## fn
+#else
+# define ialias(fn)
+# define ialias_redirect(fn)
+# define ialias_call(fn) fn
+#endif
+
+#endif /* IALIAS_H */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index bdc0486..933ed23 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -675,26 +675,6 @@ extern int gomp_test_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW;
 # define gomp_test_nest_lock_30 omp_test_nest_lock
 #endif
 
-#ifdef HAVE_ATTRIBUTE_VISIBILITY
-# define attribute_hidden __attribute__ ((visibility ("hidden")))
-#else
-# define attribute_hidden
-#endif
-
-#ifdef HAVE_ATTRIBUTE_ALIAS
-# define ialias_ulp ialias_str1(__USER_LABEL_PREFIX__)
-# define ialias_str1(x) ialias_str2(x)
-# define ialias_str2(x) #x
-# define ialias(fn) \
-  extern __typeof (fn) gomp_ialias_##fn \
-__attribute__ ((alias (#fn))) attribute_hidden;
-# define ialias_redirect(fn) \
-  extern __typeof (fn) fn __asm__ (ialias_ulp "gomp_ialias_" #fn) attribute_hidden;
-# define ialias_call(fn) gomp_ialias_ ## fn
-#else
-# define ialias(fn)
-# define ialias_redirect(fn)
-# define ialias_call(fn) fn
-#endif
+#include "ialias.h"
 
 #endif /* LIBGOMP_H */
diff --git a/libgomp/splay-tree.c b/libgomp/splay-tree.c
index fc929f9..fe5596d 100644
--- a/libgomp/splay-tree.c
+++ b/libgomp/splay-tree.c
@@ -43,11 +43,12 @@ typedef struct splay_tree_key_s *splay_tree_key;
The major featur

Re: [google gcc-4_8] port gcov-tool to gcc-4_8

2014-01-17 Thread Xinliang David Li
For LIPO parameters, you can do this
1) isolate all LIPO specific parameters into lipo_params.def, and
include it from params.def.
2) include lipo_params.def (after proper definition of DEFPARAM) in
the profile tool source dir.

By so doing, the compiler setting and profile tool will be in sync.
(Unfortunately, for historical reasons, the lipo_cutoff default value is
still set in dyn-ipa.c even with the parameter, so a comment needs
to be added in dyn-ipa.c to make sure the value is updated in both
places).
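
A minimal sketch of the shared-definition idea, in case it helps. The parameter names and the three-argument DEFPARAM below are hypothetical and much simpler than the real params.def entries, and the list is a macro here only to keep the example in one file; in the layout described above it would be lipo_params.def, pulled in with #include by both consumers.

```c
#include <stddef.h>

/* Stand-in for the contents of a shared .def file.  The entries are
   hypothetical; the real list would live in its own file.  */
#define EXAMPLE_LIPO_PARAMS \
  DEFPARAM (EXAMPLE_PARAM_CUTOFF, "example-cutoff", 100) \
  DEFPARAM (EXAMPLE_PARAM_MAX_GROUP, "example-max-group", 8)

/* Consumer 1: expand the list into an enum of parameter indices.  */
#define DEFPARAM(id, name, def) id,
enum example_param { EXAMPLE_LIPO_PARAMS EXAMPLE_PARAM_COUNT };
#undef DEFPARAM

/* Consumer 2: expand the same list into a table of names and defaults.  */
struct example_param_info
{
  const char *name;
  int default_value;
};

#define DEFPARAM(id, name, def) { name, def },
static const struct example_param_info example_param_table[] =
  { EXAMPLE_LIPO_PARAMS };
#undef DEFPARAM

/* Look up the default for a parameter via the generated table.  */
static int
example_param_default (enum example_param p)
{
  return example_param_table[p].default_value;
}
```

Since every consumer expands the same list, adding or changing an entry in one place updates both the enum and the table, which is what keeps the compiler and the tool in sync.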

David

On Fri, Jan 17, 2014 at 11:15 AM, Rong Xu  wrote:
> Hi,
>
> This patch ports the gcov-tool work to the google/gcc-4_8 branches.
>
> Tested with spec2006, profiledbootstrap and google internal benchmarks.
>
> -Rong


[PATCH] Fix up ANNOTATE_EXPR gimplification (PR middle-end/59706)

2014-01-17 Thread Jakub Jelinek
Hi!

When gimplifying ANNOTATE_EXPR, gimplify_expr used create_tmp_var_raw,
which unfortunately (among tons of other desirable things) doesn't set
DECL_CONTEXT on the temporary var and tree-nested.c then ICEs on it
because of that.  The following patch fixes that.  Unfortunately,
on the second (invalid) testcase this started to ICE during error recovery,
so the patch emits the IFN_ANNOTATE internal call only if the cond doesn't
have obviously bogus type.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-17  Jakub Jelinek  

PR middle-end/59706
* gimplify.c (gimplify_expr): Use create_tmp_var
instead of create_tmp_var_raw.  If cond doesn't have
integral type, don't add the IFN_ANNOTATE builtin at all.

* gfortran.dg/pr59706.f90: New test.
* g++.dg/ext/pr59706.C: New test.

--- gcc/gimplify.c.jj   2014-01-08 10:23:24.0 +0100
+++ gcc/gimplify.c  2014-01-17 16:51:12.324526084 +0100
@@ -7491,7 +7491,14 @@ gimplify_expr (tree *expr_p, gimple_seq
  {
tree cond = TREE_OPERAND (*expr_p, 0);
tree id = TREE_OPERAND (*expr_p, 1);
-   tree tmp = create_tmp_var_raw (TREE_TYPE(cond), NULL);
+   tree type = TREE_TYPE (cond);
+   if (!INTEGRAL_TYPE_P (type))
+ {
+   *expr_p = cond;
+   ret = GS_OK;
+   break;
+ }
+   tree tmp = create_tmp_var (type, NULL);
gimplify_arg (&cond, pre_p, EXPR_LOCATION (*expr_p));
gimple call = gimple_build_call_internal (IFN_ANNOTATE, 2,
  cond, id);
--- gcc/testsuite/gfortran.dg/pr59706.f90.jj 2014-01-17 17:19:23.665900803 +0100
+++ gcc/testsuite/gfortran.dg/pr59706.f90   2014-01-17 17:17:48.0 +0100
@@ -0,0 +1,10 @@
+! PR middle-end/59706
+! { dg-do compile }
+
+  integer i
+  do concurrent (i=1:2)
+  end do
+contains
+  subroutine foo
+  end 
+end
--- gcc/testsuite/g++.dg/ext/pr59706.C.jj   2014-01-17 17:23:46.999556115 +0100
+++ gcc/testsuite/g++.dg/ext/pr59706.C  2014-01-17 17:20:53.0 +0100
@@ -0,0 +1,21 @@
+// PR middle-end/59706
+// { dg-do compile }
+
+extern struct S s;
+struct T { T (); ~T (); int t; } t;
+
+void
+foo ()
+{
+  #pragma GCC ivdep
+  while (s)   // { dg-error "could not convert" }
+;
+}
+
+void
+bar ()
+{
+  #pragma GCC ivdep
+  while (t)   // { dg-error "could not convert" }
+;
+}

Jakub


[PATCH] Avoid -Wunused-macros warning for #pragma GCC target added macros (PR target/58944)

2014-01-17 Thread Jakub Jelinek
Hi!

It makes no sense to warn about unused macros that were not defined
by the user but injected by the compiler; the user has no control
over them.
For macros predefined at the beginning of the CU we don't get warnings
because they don't match MAIN_FILE_P, but macros added for #pragma GCC target
get location_t from the location of the pragma (and without libcpp hacks
that could possibly slow down preprocessing I don't see how to get around
it), so the following hack seems to be easiest.  All newly created macros
when -Wunused-macros isn't on are initialized with macro->used = true,
and never warned about for this warning, so this patch just arranges for
all the cpp_define calls from #pragma GCC target to set that.
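
A toy model of the save/clear/restore idea, for illustration only. The types and functions below (toy_options, toy_macro, toy_define, toy_define_internal) are invented for this sketch and are not libcpp's; the only behaviour borrowed from the description above is that a macro created while the warning is off starts out marked as used.

```c
#include <stdbool.h>

/* Invented stand-ins for the preprocessor's option block and macro node.  */
struct toy_options
{
  unsigned char warn_unused_macros;
};

struct toy_macro
{
  const char *name;
  bool used;
};

/* Toy model of macro creation: when the warning is off, the new macro
   is marked as used up front, so it can never be reported later.  */
static struct toy_macro
toy_define (const struct toy_options *opts, const char *name)
{
  struct toy_macro m = { name, !opts->warn_unused_macros };
  return m;
}

/* Define a compiler-injected macro with the warning temporarily off,
   then put the user's setting back.  */
static struct toy_macro
toy_define_internal (struct toy_options *opts, const char *name)
{
  unsigned char saved = opts->warn_unused_macros;
  opts->warn_unused_macros = 0;
  struct toy_macro m = toy_define (opts, name);
  opts->warn_unused_macros = saved;
  return m;
}
```

With the warning enabled, a macro created through toy_define starts out unused and remains a candidate for the warning, while one created through toy_define_internal does not, and the user's setting is unchanged afterwards.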

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-17  Jakub Jelinek  

PR target/58944
* config/i386/i386-c.c (ix86_pragma_target_parse): Temporarily
clear cpp_get_options (parse_in)->warn_unused_macros for
ix86_target_macros_internal with cpp_define.

* gcc.target/i386/pr58944.c: Drop -march=native from dg-options.
Remove dg-prune-output lines.

--- gcc/config/i386/i386-c.c.jj 2014-01-03 11:41:06.0 +0100
+++ gcc/config/i386/i386-c.c    2014-01-17 14:24:15.828447673 +0100
@@ -458,6 +458,13 @@ ix86_pragma_target_parse (tree args, tre
   (enum fpmath_unit) prev_opt->x_ix86_fpmath,
   cpp_undef);
 
+  /* For the definitions, ensure all newly defined macros are considered
+ as used for -Wunused-macros.  There is no point warning about the
+ compiler predefined macros.  */
+  cpp_options *cpp_opts = cpp_get_options (parse_in);
+  unsigned char saved_warn_unused_macros = cpp_opts->warn_unused_macros;
+  cpp_opts->warn_unused_macros = 0;
+
   /* Define all of the macros for new options that were just turned on.  */
   ix86_target_macros_internal (cur_isa & diff_isa,
   cur_arch,
@@ -465,6 +472,8 @@ ix86_pragma_target_parse (tree args, tre
   (enum fpmath_unit) cur_opt->x_ix86_fpmath,
   cpp_define);
 
+  cpp_opts->warn_unused_macros = saved_warn_unused_macros;
+
   return true;
 }
 
--- gcc/testsuite/gcc.target/i386/pr58944.c.jj  2013-12-03 08:27:22.0 +0100
+++ gcc/testsuite/gcc.target/i386/pr58944.c 2014-01-17 14:29:53.288756542 +0100
@@ -1,11 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-Wunused-macros -march=native" } */
+/* { dg-options "-Wunused-macros" } */
 
 #pragma GCC push_options
 #pragma GCC target("xsaveopt")
 void fn1(void) {}
 #pragma GCC pop_options
-
-/* { dg-prune-output "macro \"__code_model_" } */ 
-/* { dg-prune-output "macro \"__XSAVE__\" is not used" } */ 
-/* { dg-prune-output "macro \"__XSAVEOPT__\" is not used" } */ 

Jakub


Re: Allow passing arrays in registers on AArch64

2014-01-17 Thread Ian Lance Taylor
On Fri, Jan 17, 2014 at 11:32 AM, Michael Hudson-Doyle
 wrote:
>
> On 18 Jan 2014 07:50, "Yufeng Zhang"  wrote:
>>
>> Also can you please try to add some new test(s)?  It may not be that
>> straightforward to add non-C/C++ tests, but give it a try.
>
> Can you give some hints? Like at least where in the tree such a test would
> go? I don't know this code at all.

There is already a test in libgo, of course.

I think it would be pretty hard to write a test that doesn't do something
like what libgo does.  The problem is that GCC is entirely consistent
with and without your patch.  You could add a Go test that passes an
array in gcc/testsuite/go.go-torture/execute/ easily enough, but it
would be quite hard to add a test that doesn't pass whether or not
your patch is applied.

Ian


Re: PATCH: PR target/59794: [4.7/4.8/4.9 Regression] i386 backend fails to detect MMX/SSE/AVX ABI changes

2014-01-17 Thread H.J. Lu
On Tue, Jan 14, 2014 at 8:12 AM, Uros Bizjak  wrote:
> On Tue, Jan 14, 2014 at 3:18 PM, H.J. Lu  wrote:
>
>> There are several problems with i386 MMX/SSE/AVX ABI change detection:
>>
>> 1. MMX/SSE return value isn't checked for -m32 since revision 83533:
>>
>> http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=83533
>>
>> which added ix86_struct_value_rtx.  Since MMX/SSE condition is always
>> false, the MMX/SSE return value ABI change is disabled.
>> 2. For -m32, the same warning on MMX/SSE argument is issued twice, one from
>> type_natural_mode and one from function_arg_32.
>> 3. AVX return value ABI change isn't checked.
>>
>> This patch does the following:
>>
>> 1. Remove the ineffective ix86_struct_value_rtx.
>> 2. Add a bool parameter to indicate if type is used for function return
>> value.  Warn ABI change if the vector mode isn't available for function
>> return value.  Add AVX function return value ABI change warning.
>> 3. Consolidate ABI change warning into type_natural_mode.
>> 4. Update g++.dg/ext/vector23.C to prune ABI change for Linux/x86
>> added by the AVX function return value ABI change warning.
>> 5. Update gcc.target/i386/pr39162.c to avoid the AVX function return
>> value ABI change warning.
>> 6. Add testcases for warning MMX/SSE/AVX ABI changes in parameter
>> passing and function return.
>>
>> Tested on Linux/x86-64 with -m32/-m64 for "make check".  OK to install?
>>
>> Thanks.
>>
>> H.J.
>> ---
>> gcc/
>>
>> 2014-01-14  H.J. Lu  
>>
>> PR target/59794
>> * config/i386/i386.c (type_natural_mode): Add a bool parameter
>> to indicate if type is used for function return value.  Warn
>> ABI change if the vector mode isn't available for function
>> return value.
>> (ix86_function_arg_advance): Pass false to type_natural_mode.
>> (ix86_function_arg): Likewise.
>> (ix86_gimplify_va_arg): Likewise.
>> (function_arg_32): Don't warn ABI change.
>> (ix86_function_value): Pass true to type_natural_mode.
>> (ix86_return_in_memory): Likewise.
>> (ix86_struct_value_rtx): Removed.
>> (TARGET_STRUCT_VALUE_RTX): Likewise.
>>
>> gcc/testsuite/
>>
>> 2014-01-14  H.J. Lu  
>>
>> PR target/59794
>> * g++.dg/ext/vector23.C: Also prune ABI change for Linux/x86.
>> * gcc.target/i386/pr39162.c (y): New __m256i variable.
>> (bar): Change return type to void.  Set y to x.
>> * gcc.target/i386/pr59794-1.c: New testcase.
>> * gcc.target/i386/pr59794-2.c: Likewise.
>> * gcc.target/i386/pr59794-3.c: Likewise.
>> * gcc.target/i386/pr59794-4.c: Likewise.
>> * gcc.target/i386/pr59794-5.c: Likewise.
>> * gcc.target/i386/pr59794-6.c: Likewise.
>> * gcc.target/i386/pr59794-7.c: Likewise.
>
> OK for mainline and release branches after a couple of days.
>

I backported it to the 4.8 branch.  But type_natural_mode on the 4.7
branch is too different from trunk, so I stopped at the 4.8 branch.

-- 
H.J.


Re: Allow passing arrays in registers on AArch64

2014-01-17 Thread Yufeng Zhang

Hi Michael,

Thanks for the fix.  The patch looks OK to me in general, although I 
have some minor comments below.


On 01/17/14 08:22, Michael Hudson-Doyle wrote:

Hi, as discussed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59799
GCC currently gets a detail of the AArch64 ABI wrong: arrays are not
always passed by reference.  Fortunately the fix is rather easy...


Can you please indicate what kind of testing you have run, e.g. regtest 
on aarch64-none-abi?


Also can you please try to add some new test(s)?  It may not be that 
straightforward to add non-C/C++ tests, but give it a try.




I guess this is an ABI break, but my understanding is that there has been
no release of GCC which supports compiling a language that can pass arrays
by value on AArch64 yet.

Cheers,
mwh

   2014-01-17  Michael Hudson-Doyle

 PR target/59799

 * config/aarch64/aarch64.c (aarch64_pass_by_reference):
   The rules for passing arrays in registers are the same as
   for structs, so remove the special case for them.


aarch64-abi-fix.diff


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fa53c71..d63da95 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -987,10 +987,7 @@ aarch64_pass_by_reference (cumulative_args_t pcum ATTRIBUTE_UNUSED,

if (type)
  {
-  /* Arrays always passed by reference.  */
-  if (TREE_CODE (type) == ARRAY_TYPE)
-   return true;
-  /* Other aggregates based on their size.  */
+  /* Aggregates based on their size.  */
if (AGGREGATE_TYPE_P (type))
size = int_size_in_bytes (type);
  }



You can actually merge the two ifs to have something like:

  /* Aggregates are based on their size.  */
  if (type && AGGREGATE_TYPE_P (type))
size = int_size_in_bytes (type);

Thanks,
Yufeng



Re: [Patch, cilk, C++] Fix cilk testsuite failure

2014-01-17 Thread Steve Ellcey
On Fri, 2014-01-17 at 08:23 +, Richard Sandiford wrote:

> 
> I think it'd be more direct to check the register class, since we used
> to store CCmode in GPRs too.  I.e. ST_REGNO_P (XEXP (op, 0)).
> 
> OK with that change, thanks.  Please backport to 4.8 too.
> 
> Richard

I assume you meant ST_REG_P instead of ST_REGNO_P and it looks like
ST_REG_P wants a register number as an argument so the patch I tested
and checked in is:


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 617391c..ff28750 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -8184,7 +8184,7 @@ mips_print_operand (FILE *file, rtx op, int letter)
 case 't':
   {
int truth = (code == NE) == (letter == 'T');
-   fputc ("zfnt"[truth * 2 + (GET_MODE (op) == CCmode)], file);
+   fputc ("zfnt"[truth * 2 + ST_REG_P (REGNO (XEXP (op, 0)))], file);
   }
   break;
 

If no issues come up over the weekend with this patch I will backport it
to the 4.8 branch next week.
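
For anyone not used to the idiom in the patched line, here is a standalone sketch of selecting a character by indexing a string literal with two 0/1 inputs. The function name pick_letter and the !! normalisation are additions for this example and are not part of mips.c.

```c
/* Pick one of four letters from two boolean inputs by indexing a
   string literal: index = truth * 2 + is_status_reg.  Both inputs are
   normalised to 0 or 1 so the index always stays within "zfnt".  */
static char
pick_letter (int truth, int is_status_reg)
{
  return "zfnt"[!!truth * 2 + !!is_status_reg];
}
```

The second input only selects between the two letters of each pair, which is why swapping the mode test for a register-class test changes the chosen letter without touching the truth half of the index.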

Steve Ellcey
sell...@mips.com




Re: PATCH: PR middle-end/59789: [4.9 Regression] ICE in in convert_move, at expr.c:333

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 8:42 AM, Jan Hubicka  wrote:
>> For this testcase, we get CIF_TARGET_OPTION_MISMATCH.
>> Do you want to add a new flag so that inliner can use for
>> other errors?
>
> Just add flags parameter to DEFCIFCODE in cif-code.def
> and flag those that are final and should be output already
> in early inlining.
> This way we will not forget to include new codes as we introduce them.
>

Like this?  OK for trunk?

Thanks.


-- 
H.J.
From abb55ed09f4d493046b3dd6a27e7df0d1587fa72 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 13 Jan 2014 11:54:36 -0800
Subject: [PATCH] Update error handling during early_inlining

---
 gcc/ChangeLog   | 24 
 gcc/cgraph.c| 20 +-
 gcc/cgraph.h|  9 -
 gcc/cif-code.def| 66 -
 gcc/testsuite/ChangeLog |  5 +++
 gcc/testsuite/gcc.target/i386/pr59789.c | 22 +++
 gcc/tree-inline.c   |  3 +-
 7 files changed, 120 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr59789.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index df6e491..eb55a89 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,29 @@
 2014-01-17  H.J. Lu  
 
+	PR middle-end/59789
+	* cgraph.c (cgraph_inline_failed_string): Add flag to DEFCIFCODE.
+	(cgraph_inline_failed_flag): New function.
+	* cgraph.h (DEFCIFCODE): Add flag.
+	(cgraph_inline_failed_flag_t): New enum.
+	(cgraph_inline_failed_flag): New prototype.
+	* cif-code.def: Add CIF_FINAL_NORMAL to OK, FUNCTION_NOT_CONSIDERED,
+	FUNCTION_NOT_OPTIMIZED, REDEFINED_EXTERN_INLINE,
+	FUNCTION_NOT_INLINE_CANDIDATE, LARGE_FUNCTION_GROWTH_LIMIT,
+	LARGE_STACK_FRAME_GROWTH_LIMIT, MAX_INLINE_INSNS_SINGLE_LIMIT,
+	MAX_INLINE_INSNS_AUTO_LIMIT, INLINE_UNIT_GROWTH_LIMIT,
+	RECURSIVE_INLINING, UNLIKELY_CALL, NOT_DECLARED_INLINED,
+	OPTIMIZING_FOR_SIZE, ORIGINALLY_INDIRECT_CALL,
+	INDIRECT_UNKNOWN_CALL, USES_COMDAT_LOCAL. 
+	Add CIF_FINAL_ERROR to UNSPECIFIED, BODY_NOT_AVAILABLE,
+	FUNCTION_NOT_INLINABLE, OVERWRITABLE, MISMATCHED_ARGUMENTS,
+	EH_PERSONALITY, NON_CALL_EXCEPTIONS, TARGET_OPTION_MISMATCH,
+	OPTIMIZATION_MISMATCH.
+	* tree-inline.c (expand_call_inline): Emit errors during
+	early_inlining if cgraph_inline_failed_flag returns
+	CIF_FINAL_ERROR.
+
+2014-01-17  H.J. Lu  
+
 	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
 	PROCESSOR_INTEL.  Treat like PROCESSOR_GENERIC.
 	* config/i386/i386.c (intel_memcpy): New.  Duplicate slm_memcpy.
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 92b31b9..156b6ee 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1877,7 +1877,7 @@ const char*
 cgraph_inline_failed_string (cgraph_inline_failed_t reason)
 {
 #undef DEFCIFCODE
-#define DEFCIFCODE(code, string)	string,
+#define DEFCIFCODE(code, flag, string)	string,
 
   static const char *cif_string_table[CIF_N_REASONS] = {
 #include "cif-code.def"
@@ -1889,6 +1889,24 @@ cgraph_inline_failed_string (cgraph_inline_failed_t reason)
   return cif_string_table[reason];
 }
 
+/* Return a flag describing the failure REASON.  */
+
+cgraph_inline_failed_flag_t
+cgraph_inline_failed_flag (cgraph_inline_failed_t reason)
+{
+#undef DEFCIFCODE
+#define DEFCIFCODE(code, flag, string)	flag,
+
+  static cgraph_inline_failed_flag_t cif_flag_table[CIF_N_REASONS] = {
+#include "cif-code.def"
+  };
+
+  /* Signedness of an enum type is implementation defined, so cast it
+ to unsigned before testing. */
+  gcc_assert ((unsigned) reason < CIF_N_REASONS);
+  return cif_flag_table[reason];
+}
+
 /* Names used to print out the availability enum.  */
 const char * const cgraph_availability_names[] =
   {"unset", "not_available", "overwritable", "available", "local"};
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 7ce5401..6b5ae8e 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -518,13 +518,19 @@ struct varpool_node_set_iterator
   unsigned index;
 };
 
-#define DEFCIFCODE(code, string)	CIF_ ## code,
+#define DEFCIFCODE(code, flag, string)	CIF_ ## code,
 /* Reasons for inlining failures.  */
 enum cgraph_inline_failed_t {
 #include "cif-code.def"
   CIF_N_REASONS
 };
 
+enum cgraph_inline_failed_flag_t
+{
+  CIF_FINAL_NORMAL = 0,
+  CIF_FINAL_ERROR
+};
+
 /* Structure containing additional information about an indirect call.  */
 
 struct GTY(()) cgraph_indirect_call_info
@@ -774,6 +780,7 @@ void cgraph_unnest_node (struct cgraph_node *);
 enum availability cgraph_function_body_availability (struct cgraph_node *);
 void cgraph_add_new_function (tree, bool);
 const char* cgraph_inline_failed_string (cgraph_inline_failed_t);
+cgraph_inline_failed_flag_t cgraph_inline_failed_flag (cgraph_inline_failed_t);
 
 void cgraph_set_nothrow_flag (struct cgraph_node *, bool);
 void cgraph_set_const_flag (struct cgraph_node *, bool, bool);
diff --git a/gcc/cif-code.def b/gcc/cif-code.def
index f1df5a0..5591f9a 100644
--- a/gcc/cif-code.def
+++ b/gcc/cif-code.

Re: [GOMP4][PATCH] SIMD-enabled functions (formerly Elemental functions) for C++

2014-01-17 Thread Jakub Jelinek
On Thu, Dec 19, 2013 at 06:12:29PM +, Iyer, Balaji V wrote:
> 2013-12-19  Balaji V. Iyer  
> 
> * parser.c (cp_parser_direct_declarator): When Cilk Plus is enabled
> see if there is an attribute after function decl.  If so, then
> parse them now.
> (cp_parser_late_return_type_opt): Handle parsing of Cilk Plus SIMD
> enabled function late parsing.
> (cp_parser_gnu_attribute_list): Parse all the tokens for the vector
> attribute for a SIMD-enabled function.
> (cp_parser_omp_all_clauses): Skip parsing to the end of pragma when
> the function is used by SIMD-enabled function (indicated by NULL
> pragma token).   Added 3 new clauses: PRAGMA_CILK_CLAUSE_MASK,
> PRAGMA_CILK_CLAUSE_NOMASK and PRAGMA_CILK_CLAUSE_VECTORLENGTH
> (cp_parser_cilk_simd_vectorlength): Modified this function to handle
> vectorlength clause in SIMD-enabled function and #pragma SIMD's
> vectorlength clause.  Added a new bool parameter to differentiate
> between the two.
> (cp_parser_cilk_simd_fn_vector_attrs): New function.
> (is_cilkplus_vector_p): Likewise.
> (cp_parser_late_parsing_elem_fn_info): Likewise.
> (cp_parser_omp_clause_name): Added a check for "mask," "nomask"

The comma should have been after " .

> +   /* In here, we handle cases where attribute is used after
> +  the function declaration.  For example:
> +  void func (int x) __attribute__((vector(..)));  */
> +   if (flag_enable_cilkplus
> +   && cp_next_tokens_can_be_attribute_p (parser))

As you are just calling cp_parser_gnu_attributes_opt here and not
..._std_..., I'd say the above should be
cp_next_tokens_can_be_gnu_attribute_p rather than
cp_next_tokens_can_be_attribute_p.  I think [[...]] attributes at this
position are ignored, so no need to handle them, not sure about
whether we allow e.g. combination of GNU and std attributes or vice versa.

> + {
> +   cp_parser_parse_tentatively (parser);
> +   tree attr = cp_parser_gnu_attributes_opt (parser);
> +   if (cp_lexer_next_token_is_not (parser->lexer,
> +   CPP_SEMICOLON)
> +   && cp_lexer_next_token_is_not (parser->lexer,
> +  CPP_OPEN_BRACE))
> + cp_parser_abort_tentative_parse (parser);
> +   else if (!cp_parser_parse_definitely (parser))
> + ;
> +   else
> + attrs = chainon (attr, attrs);
> + }
> late_return = (cp_parser_late_return_type_opt
>(parser, declarator,
> memfn ? cv_quals : -1));

> @@ -17842,6 +17868,10 @@ cp_parser_late_return_type_opt (cp_parser* parser, cp_declarator *declarator,
>type = cp_parser_trailing_type_id (parser);
>  }
>  
> +  if (cilk_simd_fn_vector_p)
> +declarator->std_attributes
> +  = cp_parser_late_parsing_cilk_simd_fn_info (parser,
> +  declarator->std_attributes);

Please make sure declarator is aligned below parser.

> +  token->type = CPP_PRAGMA_EOL;
> +  parser->lexer->next_token = token;
> +  cp_lexer_consume_token (parser->lexer);
> +
> +  struct cp_token_cache *cp = 
> +cp_token_cache_new (v_token, cp_lexer_peek_token (parser->lexer));

The = should already go on the next line.

> +/* Handles the delayed parsing of the Cilk Plus SIMD-enabled function.  
> +   This function is modelled similar to the late parsing of omp declare 
> +   simd.  */
> +
> +static tree
> +cp_parser_late_parsing_cilk_simd_fn_info (cp_parser *parser, tree attrs)
> +{
> +  struct cp_token_cache *ce;
> +  cp_omp_declare_simd_data *info = parser->cilk_simd_fn_info;
> +  int ii = 0;
> +
> +  if (parser->omp_declare_simd != NULL)
> +{
> +  error ("%<#pragma omp declare simd%> cannot be used in the same function"
> +  " marked as a Cilk Plus SIMD-enabled function");
> +  parser->cilk_simd_fn_info = NULL;

This will leak parser->cilk_simd_fn_info memory.  Please XDELETE it first.

> +  return attrs;
> +}
> +  if (!info->error_seen && info->fndecl_seen)
> +{
> +  error ("vector attribute not immediately followed by a single function"
> +  " declaration or definition");
> +  info->error_seen = true;
> +}
> +  if (info->error_seen)
> +return attrs;
> +
> +  /* Vector attributes are converted to #pragma omp declare simd values and
> + so we need them enabled.  */
> +  flag_openmp = 1;

The C FE doesn't do this.  I thought all the omp-low.c spots are now guarded
by flag_openmp || flag_enable_cilkplus etc. conditions.

> +  c = build_tree_list (get_identifier ("cilk simd function"), cl);

Plea

Re: [PATCH, PR 59736] Fix an IPA-CP issue with de-speculation

2014-01-17 Thread Jan Hubicka
> Hi,
> 
> in PR 59736, IPA-CP stumbles on an already removed call graph edge.
> The reason is that it keeps an internal linked list of edge clones
> which can however now be corrupted by cgraph de-speculation machinery
> which can decide to remove an edge.
> 
> In order to fix this, I made the linked-list bi-directional and added
> a remove hook that fixes it up if need be.
> 
> Bootstrapped and tested on x86_64-linux.  Unfortunately, I don't have
> a simple testcase (the smallest I have is an 8.3K multidelta reduced
> mess).  OK for trunk anyway?
> 
> Thanks,
> 
> Martin
> 
> 
> 2014-01-17  Martin Jambor  
> 
>   PR ipa/59736
>   * ipa-cp.c (prev_edge_clone): New variable.
>   (grow_next_edge_clone_vector): Renamed to grow_edge_clone_vectors.
>   Also resize prev_edge_clone vector.
>   (ipcp_edge_duplication_hook): Also update prev_edge_clone.
>   (ipcp_edge_removal_hook): New function.
>   (ipcp_driver): Register ipcp_edge_removal_hook.

OK,
thanks!
Honza


Re: [GOOGLE] don't overwrite precomputed loop bound in AutoFDO

2014-01-17 Thread Xinliang David Li
Ok.

David

On Fri, Jan 17, 2014 at 9:08 AM, Dehao Chen  wrote:
> If a loop is cunrolled/vectorized, the AutoFDO computed trip count
> will be very small. This patch disallows overwriting of precomputed
> loop bound in AutoFDO mode.
>
> Bootstrapped and passed regression test. Performance test on-going.
>
> OK for Google branches?
>
> Thanks,
> Dehao
>
> Index: tree-ssa-loop-niter.c
> ===
> --- tree-ssa-loop-niter.c (revision 206674)
> +++ tree-ssa-loop-niter.c (working copy)
> @@ -2520,7 +2520,8 @@ record_niter_bound (struct loop *loop, double_int
>  }
>if (realistic
>&& (!loop->any_estimate
> -  || i_bound.ult (loop->nb_iterations_estimate)))
> +  || (!flag_auto_profile &&
> +  i_bound.ult (loop->nb_iterations_estimate))))
>  {
>loop->any_estimate = true;
>loop->nb_iterations_estimate = i_bound;


Re: [C++ Patch] PR 59270

2014-01-17 Thread Paolo Carlini

On 01/17/2014 06:02 PM, Jason Merrill wrote:

OK.

Thanks.
Does returning error_mark_node from build_value_init in the case of 
erroneous type not work?
Yeah, doesn't work in the sense that we regress to two errors instead of 
one (like in 4.8.x) on that testcase I mentioned at the beginning of 
this thread (sorry for not being more explicit)


Paolo.


[PATCH, PR 59736] Fix an IPA-CP issue with de-speculation

2014-01-17 Thread Martin Jambor
Hi,

in PR 59736, IPA-CP stumbles on an already removed call graph edge.
The reason is that it keeps an internal linked list of edge clones
which can however now be corrupted by cgraph de-speculation machinery
which can decide to remove an edge.

In order to fix this, I made the linked-list bi-directional and added
a remove hook that fixes it up if need be.
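
To illustrate the idea outside of GCC, here is a small self-contained sketch of a list threaded through two uid-indexed arrays, with an unlink step on removal. The toy_edge type, the fixed MAX_EDGES bound and the function names are made up for this example, and clearing the removed edge's own slots is an extra step that the patch below does not take.

```c
#include <stddef.h>

#define MAX_EDGES 16

/* Invented stand-in for a call graph edge; only the uid matters here.  */
struct toy_edge
{
  int uid;
};

/* The list is stored in two arrays indexed by edge uid.  */
static struct toy_edge *next_clone[MAX_EDGES];
static struct toy_edge *prev_clone[MAX_EDGES];

/* Duplication: insert DST right after SRC in SRC's clone list.  */
static void
toy_edge_duplicated (struct toy_edge *src, struct toy_edge *dst)
{
  struct toy_edge *old_next = next_clone[src->uid];
  if (old_next)
    prev_clone[old_next->uid] = dst;
  prev_clone[dst->uid] = src;
  next_clone[dst->uid] = old_next;
  next_clone[src->uid] = dst;
}

/* Removal: unlink CS so that its neighbours no longer point at it.  */
static void
toy_edge_removed (struct toy_edge *cs)
{
  struct toy_edge *prev = prev_clone[cs->uid];
  struct toy_edge *next = next_clone[cs->uid];
  if (prev)
    next_clone[prev->uid] = next;
  if (next)
    prev_clone[next->uid] = prev;
  /* Extra tidy-up for this sketch only.  */
  next_clone[cs->uid] = NULL;
  prev_clone[cs->uid] = NULL;
}
```

With only a next array, removing an edge from the middle would require walking the list from its head to find the predecessor; the prev array makes the fix-up a constant-time operation.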

Bootstrapped and tested on x86_64-linux.  Unfortunately, I don't have
a simple testcase (the smallest I have is an 8.3K multidelta reduced
mess).  OK for trunk anyway?

Thanks,

Martin


2014-01-17  Martin Jambor  

PR ipa/59736
* ipa-cp.c (prev_edge_clone): New variable.
(grow_next_edge_clone_vector): Renamed to grow_edge_clone_vectors.
Also resize prev_edge_clone vector.
(ipcp_edge_duplication_hook): Also update prev_edge_clone.
(ipcp_edge_removal_hook): New function.
(ipcp_driver): Register ipcp_edge_removal_hook.

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index a6a44e6..10fa4b6 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2321,13 +2321,17 @@ ipcp_discover_new_direct_edges (struct cgraph_node *node,
edge. */
 
 static vec next_edge_clone;
+static vec prev_edge_clone;
 
 static inline void
-grow_next_edge_clone_vector (void)
+grow_edge_clone_vectors (void)
 {
   if (next_edge_clone.length ()
   <=  (unsigned) cgraph_edge_max_uid)
 next_edge_clone.safe_grow_cleared (cgraph_edge_max_uid + 1);
+  if (prev_edge_clone.length ()
+  <=  (unsigned) cgraph_edge_max_uid)
+prev_edge_clone.safe_grow_cleared (cgraph_edge_max_uid + 1);
 }
 
 /* Edge duplication hook to grow the appropriate linked list in
@@ -2335,13 +2339,34 @@ grow_next_edge_clone_vector (void)
 
 static void
 ipcp_edge_duplication_hook (struct cgraph_edge *src, struct cgraph_edge *dst,
-   __attribute__((unused)) void *data)
+   void *)
 {
-  grow_next_edge_clone_vector ();
-  next_edge_clone[dst->uid] = next_edge_clone[src->uid];
+  grow_edge_clone_vectors ();
+
+  struct cgraph_edge *old_next = next_edge_clone[src->uid];
+  if (old_next)
+prev_edge_clone[old_next->uid] = dst;
+  prev_edge_clone[dst->uid] = src;
+
+  next_edge_clone[dst->uid] = old_next;
   next_edge_clone[src->uid] = dst;
 }
 
+/* Hook that is called by cgraph.c when an edge is removed.  */
+
+static void
+ipcp_edge_removal_hook (struct cgraph_edge *cs, void *)
+{
+  grow_edge_clone_vectors ();
+
+  struct cgraph_edge *prev = prev_edge_clone[cs->uid];
+  struct cgraph_edge *next = next_edge_clone[cs->uid];
+  if (prev)
+next_edge_clone[prev->uid] = next;
+  if (next)
+prev_edge_clone[next->uid] = prev;
+}
+
 /* See if NODE is a clone with a known aggregate value at a given OFFSET of a
parameter with the given INDEX.  */
 
@@ -3568,13 +3593,17 @@ static unsigned int
 ipcp_driver (void)
 {
   struct cgraph_2edge_hook_list *edge_duplication_hook_holder;
+  struct cgraph_edge_hook_list *edge_removal_hook_holder;
   struct topo_info topo;
 
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
-  grow_next_edge_clone_vector ();
+  grow_edge_clone_vectors ();
   edge_duplication_hook_holder =
 cgraph_add_edge_duplication_hook (&ipcp_edge_duplication_hook, NULL);
+  edge_removal_hook_holder =
+cgraph_add_edge_removal_hook (&ipcp_edge_removal_hook, NULL);
+
   ipcp_values_pool = create_alloc_pool ("IPA-CP values",
sizeof (struct ipcp_value), 32);
   ipcp_sources_pool = create_alloc_pool ("IPA-CP value sources",
@@ -3600,6 +3629,7 @@ ipcp_driver (void)
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
   next_edge_clone.release ();
+  cgraph_remove_edge_removal_hook (edge_removal_hook_holder);
   cgraph_remove_edge_duplication_hook (edge_duplication_hook_holder);
   ipa_free_all_structures_after_ipa_cp ();
   if (dump_file)


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 7:11 AM, Uros Bizjak  wrote:
> BTW: There are some ix86_tune == XXX conditions scattered throughout
> LEA handling code. Can these be substituted with appropriate TARGET_*
> defines?
>
> Uros.

This is the patch I checked in.

Thanks.

-- 
H.J.
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index df6e491..4af6ef1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,13 @@
 2014-01-17  H.J. Lu  

+ * config/i386/i386.c (ix86_lea_outperforms): Use TARGET_XXX.
+ (ix86_adjust_cost): Use !TARGET_XXX.
+ (do_reorder_for_imul): Likewise.
+ (swap_top_of_ready_list): Likewise.
+ (ix86_sched_reorder): Likewise.
+
+2014-01-17  H.J. Lu  
+
  * config/i386/i386-c.c (ix86_target_macros_internal): Handle
  PROCESSOR_INTEL.  Treat like PROCESSOR_GENERIC.
  * config/i386/i386.c (intel_memcpy): New.  Duplicate slm_memcpy.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8993331..7bfad8f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18020,7 +18020,7 @@ ix86_lea_outperforms (rtx insn, unsigned int
regno0, unsigned int regno1,
   /* For Silvermont if using a 2-source or 3-source LEA for
  non-destructive destination purposes, or due to wanting
  ability to use SCALE, the use of LEA is justified.  */
-  if (ix86_tune == PROCESSOR_SILVERMONT || ix86_tune == PROCESSOR_INTEL)
+  if (TARGET_SILVERMONT || TARGET_INTEL)
 {
   if (has_scale)
  return true;
@@ -25567,7 +25567,7 @@ ix86_adjust_cost (rtx insn, rtx link, rtx
dep_insn, int cost)
   /* Stack engine allows to execute push&pop instructions in parall.  */
   if (((insn_type == TYPE_PUSH || insn_type == TYPE_POP)
&& (dep_insn_type == TYPE_PUSH || dep_insn_type == TYPE_POP))
-  && (ix86_tune != PROCESSOR_ATHLON && ix86_tune != PROCESSOR_K8))
+  && (!TARGET_ATHLON && !TARGET_K8))
  return 0;

   /* Show ability of reorder buffer to hide latency of load by executing
@@ -25832,7 +25832,7 @@ do_reorder_for_imul (rtx *ready, int n_ready)
   int index = -1;
   int i;

-  if (ix86_tune != PROCESSOR_BONNELL)
+  if (!TARGET_BONNELL)
 return index;

   /* Check that IMUL instruction is on the top of ready list.  */
@@ -25912,7 +25912,7 @@ swap_top_of_ready_list (rtx *ready, int n_ready)
   int clock2 = -1;
   #define INSN_TICK(INSN) (HID (INSN)->tick)

-  if (ix86_tune != PROCESSOR_SILVERMONT && ix86_tune != PROCESSOR_INTEL)
+  if (!TARGET_SILVERMONT && !TARGET_INTEL)
 return false;

   if (!NONDEBUG_INSN_P (top))
@@ -25985,9 +25985,7 @@ ix86_sched_reorder (FILE *dump, int
sched_verbose, rtx *ready, int *pn_ready,
   issue_rate = ix86_issue_rate ();

   /* Do reodering for BONNELL/SILVERMONT only.  */
-  if (ix86_tune != PROCESSOR_BONNELL
-  && ix86_tune != PROCESSOR_SILVERMONT
-  && ix86_tune != PROCESSOR_INTEL)
+  if (!TARGET_BONNELL && !TARGET_SILVERMONT && !TARGET_INTEL)
 return issue_rate;

   /* Nothing to do if ready list contains only 1 instruction.  */


[GOOGLE] don't overwrite precomputed loop bound in AutoFDO

2014-01-17 Thread Dehao Chen
If a loop is cunrolled/vectorized, the AutoFDO computed trip count
will be very small. This patch disallows overwriting of a precomputed
loop bound in AutoFDO mode.

Bootstrapped and passed regression test. Performance test on-going.

OK for Google branches?

Thanks,
Dehao

Index: tree-ssa-loop-niter.c
===
--- tree-ssa-loop-niter.c (revision 206674)
+++ tree-ssa-loop-niter.c (working copy)
@@ -2520,7 +2520,8 @@ record_niter_bound (struct loop *loop, double_int
 }
   if (realistic
   && (!loop->any_estimate
-  || i_bound.ult (loop->nb_iterations_estimate)))
+  || (!flag_auto_profile &&
+  i_bound.ult (loop->nb_iterations_estimate))))
 {
   loop->any_estimate = true;
   loop->nb_iterations_estimate = i_bound;
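
The guard added by this hunk can be sketched in isolation as follows. This is only an illustration of the condition, not the code in tree-ssa-loop-niter.c: `loop_sketch`, `record_estimate_sketch`, and the use of `unsigned long` in place of `double_int` are assumptions made for the example.

```cpp
// Hypothetical, cut-down stand-in for the fields of struct loop used here.
struct loop_sketch {
  bool any_estimate;
  unsigned long nb_iterations_estimate;
};

// Mirrors the patched condition: a smaller "realistic" bound replaces the
// current estimate only when AutoFDO is off; with AutoFDO on, an estimate
// that is already recorded is left untouched.
void record_estimate_sketch(loop_sketch& loop, unsigned long i_bound,
                            bool realistic, bool flag_auto_profile) {
  if (realistic
      && (!loop.any_estimate
          || (!flag_auto_profile
              && i_bound < loop.nb_iterations_estimate))) {
    loop.any_estimate = true;
    loop.nb_iterations_estimate = i_bound;
  }
}
```

With `flag_auto_profile` set, a loop that already has an estimate of 100 keeps it even if a later call reports a bound of 4; without the flag the estimate drops to 4.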


Re: [C++ Patch] PR 59270

2014-01-17 Thread Jason Merrill

OK.

Does returning error_mark_node from build_value_init in the case of 
erroneous type not work?


Jason


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 7:55 AM, H.J. Lu  wrote:
> On Fri, Jan 17, 2014 at 7:36 AM, Uros Bizjak  wrote:
>> On Fri, Jan 17, 2014 at 4:17 PM, H.J. Lu  wrote:
>>
>>>> BTW: There are some ix86_tune == XXX conditions scattered throughout
>>>> LEA handling code. Can these be substituted with appropriate TARGET_*
>>>> defines?
>>>
>>> I have been looking at them closely to check their impacts on
>>> both Haswell and Silvermont.  I am planning to keep
>>> the simple LEA -> ADD transformation, but avoid
>>> the complex LEA -> ADD/MOV/SHL transformation.
>>
>> No, I didn't talk about functional change, but about equivalent
>> TARGET_* define that can be used instead of "(ix86_tune ==
>> PROCESSOR_SILVERMONT) || (ix86_tune == PROCESSOR_INTEL)".
>>
>> Uros.
>
> Something like
>
> #define TARGET_INTEL_SILVERMONT \
>   (ix86_tune == PROCESSOR_SILVERMONT || ix86_tune == PROCESSOR_INTEL)
>
>

I see what you meant.  I will submit a patch.


-- 
H.J.


Re: PATCH: PR middle-end/59789: [4.9 Regression] ICE in in convert_move, at expr.c:333

2014-01-17 Thread Jan Hubicka
> For this testcase, we get CIF_TARGET_OPTION_MISMATCH.
> Do you want to add a new flag so that inliner can use for
> other errors?

Just add a flags parameter to DEFCIFCODE in cif-code.def
and flag those codes that are final and should already be reported
in early inlining.
This way we will not forget to include new codes as we introduce them.

Honza
> 
> 
> -- 
> H.J.


Re: [PATCH, go]: Skip some go tests

2014-01-17 Thread Ian Lance Taylor
On Fri, Jan 17, 2014 at 8:32 AM, Matthias Klose  wrote:
> Am 09.01.2014 18:11, schrieb Uros Bizjak:
>> On Thu, Jan 9, 2014 at 4:01 PM, Ian Lance Taylor  wrote:
>>> On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak  wrote:

>>>> 2014-01-09  Uros Bizjak  
>>>>
>>>> * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems
>>>> which don't support -fsplit-stack.  Skip rotate[0123]-out.go.
>>>
>>> This is OK.  Thanks.
>>>
>>> You might want to tweak the comment just under where you added
>>> "peano.go".  Then go ahead and commit.
>>
>> Actually, we don't even have to compile/execute generator file, and
>> included rotate.go is skipped due to "// skip" in its test line.
>>
>> Attached patch was committed to mainline after re-test on 
>> x86_64-pc-linux-gnu.
>
> Ok for the 4.8 branch too?

Sure.

Ian


Re: [PATCH, go]: Skip some go tests

2014-01-17 Thread Matthias Klose
Am 09.01.2014 18:11, schrieb Uros Bizjak:
> On Thu, Jan 9, 2014 at 4:01 PM, Ian Lance Taylor  wrote:
>> On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak  wrote:
>>>
>>> 2014-01-09  Uros Bizjak  
>>>
>>> * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems
>>> which don't support -fsplit-stack.  Skip rotate[0123]-out.go.
>>
>> This is OK.  Thanks.
>>
>> You might want to tweak the comment just under where you added
>> "peano.go".  Then go ahead and commit.
> 
> Actually, we don't even have to compile/execute generator file, and
> included rotate.go is skipped due to "// skip" in its test line.
> 
> Attached patch was committed to mainline after re-test on x86_64-pc-linux-gnu.

Ok for the 4.8 branch too?

  Matthias



Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 4:26 PM, Jakub Jelinek  wrote:
> On Fri, Jan 17, 2014 at 04:24:50PM +0100, Uros Bizjak wrote:
>> On Fri, Jan 17, 2014 at 3:50 PM, H.J. Lu  wrote:
>> >
>> > Wrong example.  It should be
>> >
>> > lea 0x400(%edx, %ecx, 8), %edx
>> >
>> > we get
>> >
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add %ecx, %edx
>> > add $0x400, %edx
>>
>> Even for this example, the code can be substantially improved:
>>
>> shl $3, %ecx
>> add %ecx, %edx
>> add $0x400, %edx
>
> Only if ecx is dead after the statement.

True. Do we have this information at the point where the transformation is performed?

Uros.


Re: [Patch, AArch64] Relax CANNOT_CHANGE_MODE_CLASS.

2014-01-17 Thread Richard Earnshaw
On 16/01/14 18:22, Tejas Belagod wrote:
> Tejas Belagod wrote:
>> Tejas Belagod wrote:
>>> Hi,
>>>
>>> Currently, CANNOT_CHANGE_MODE_CLASS is too restrictive wrt the mode-changes
>>> it allows on FPREGs - it allows none at the moment. In fact, there are many
>>> mode changes that are safe and can be allowed. For example, in a pattern like:
>>>
>>>  (subreg:SF (reg:V4SF v0) 0)
>>>
>>> it is legal to reduce this to
>>>
>>>   (reg:SF v0)
>>>
>>> The attached patch helps parts of rtlanal.c make such decisions (e.g.
>>> simplify_subreg_regno).
>>>
>>> Tested on aarch64-none-elf and aarch64_be-none-elf. OK for trunk?
>>>
>>> Thanks,
>>> Tejas Belagod
>>> ARM.
>>>
>>> Changelog:
>>>
>>> 2013-11-28  Tejas Belagod  
>>>
>>> gcc/
>>> * config/aarch64/aarch64-protos.h (aarch64_cannot_change_mode_class):
>>> Declare.
>>> * config/aarch64/aarch64.c (aarch64_cannot_change_mode_class): New.
>>> * config/aarch64/aarch64.h (CANNOT_CHANGE_MODE_CLASS): Change to call
>>> backend function aarch64_cannot_change_mode_class.
>>>
>>
>> Hi,
>>
>> I'm testing a patch that is more general than the change presented here for 
>> CANNOT_CHANGE_MODE_CLASS. This patch is now defunct.
> 
> Ideally CANNOT_CHANGE_MODE_CLASS should be undefined, but that exposed a bug
> in LRA and subregs in general.
> 
> Until http://gcc.gnu.org/ml/gcc/2013-12/msg00086.html and
> http://gcc.gnu.org/ml/gcc/2014-01/msg00087.html are resolved, we have to run
> with the attached patch. It is a slightly modified version of the initial
> patch.
> 
> Tested on aarch64-none-elf and aarch64_be-none-elf. OK for trunk?
> 
> Thanks,
> Tejas.
> 
> 2014-01-16  Tejas Belagod  
> 
> gcc/
>   * config/aarch64/aarch64-protos.h
>   (aarch64_cannot_change_mode_class_ptr): Declare.
>   * config/aarch64/aarch64.c (aarch64_cannot_change_mode_class,
>   aarch64_cannot_change_mode_class_ptr): New.
>   * config/aarch64/aarch64.h (CANNOT_CHANGE_MODE_CLASS): Change to call
>   backend hook aarch64_cannot_change_mode_class.
> 
> 

OK.

R.




Re: [Patch AArch64] Implement Vector Permute Support

2014-01-17 Thread Richard Earnshaw
On 16/01/14 14:43, Alex Velenko wrote:
> On 14/01/14 15:51, pins...@gmail.com wrote:
>>
>>
>>> On Jan 14, 2014, at 7:19 AM, Alex Velenko  wrote:
>>>
>>> Hi,
>>>
>>> This patch turns off the vec_perm patterns for aarch64_be; this should
>>> resolve the issue highlighted here
>>> http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
>>> With this patch applied, the test case provided in that link compiles
>>> without an ICE.
>>>
>>> However, the Big-Endian port is still in development. This patch exposes
>>> another known but unrelated issue with Big-Endian Large-Int modes.
>>>
>>> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf,
>>> resulting in five further regressions due to the broken implementation of
>>> Big-Endian Large-Int modes.
>>>
>>> Kind regards,
>>> Alex Velenko
>>>
>>> gcc/
>>>
>>> 2014-01-14  Alex Velenko  
>>>
>>> * config/aarch64/aarch64-simd.md (vec_perm): Add BE check.
>>> * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>
>>> gcc/testsuite/
>>>
>>> 2014-01-14  Alex Velenko  
>>>
>>> * lib/target-supports.exp
>>> (check_effective_target_vect_perm): Exclude aarch64_be.
>>> (check_effective_target_vect_perm_byte): Likewise.
>>> (check_effective_target_vect_perm_short): Likewise.
>>
>> I think you want to use a function to check if the target is effectively 
>> big-endian instead.  Internally at Cavium, our elf compiler has big-endian 
>> multi-lib.
>>
>> Thanks,
>> Andrew
>>
>>>
>>> 
>>
> 
> Hi,
> Here is a vec-perm patch with changes proposed previously.
> Little and Big-Endian tested with no additional issues appearing.
> 
> Kind regards,
> Alex
> 
> gcc/
> 
> 2014-01-16  Alex Velenko  
> 
>   * config/aarch64/aarch64-simd.md (vec_perm): Add BE check.
>   * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
> 
> gcc/testsuite/
> 
> 2014-01-16  Alex Velenko  
> 
>   * lib/target-supports.exp
>   (check_effective_target_vect_perm): Exclude aarch64_be.
>   (check_effective_target_vect_perm_byte): Likewise.
>   (check_effective_target_vect_perm_short): Likewise.
> 

The patch is missing the hunk for aarch64.c.





Re: PATCH: PR middle-end/59789: [4.9 Regression] ICE in in convert_move, at expr.c:333

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 7:35 AM, Jan Hubicka  wrote:
>> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> > index 5c674bc..284bc66 100644
>> > --- a/gcc/ChangeLog
>> > +++ b/gcc/ChangeLog
>> > @@ -1,3 +1,12 @@
>> > +2014-01-13  Sriraman Tallam  
>> > +   H.J. Lu  
>> > +
>> > +   PR middle-end/59789
>> > +   * tree-inline.c (report_early_inliner_always_inline_failure): New
>> > +   function.
>> > +   (expand_call_inline): Emit errors during early_inlining if
>> > +   report_early_inliner_always_inline_failure returns true.
>> > +
>> >  2014-01-10  DJ Delorie  
>> >
>> > * config/msp430/msp430.md (call_internal): Don't allow memory
>> > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>> > index 459e365..2a7b3ca 100644
>> > --- a/gcc/testsuite/ChangeLog
>> > +++ b/gcc/testsuite/ChangeLog
>> > @@ -1,3 +1,8 @@
>> > +2014-01-13  H.J. Lu  
>> > +
>> > +   PR middle-end/59789
>> > +   * gcc.target/i386/pr59789.c: New testcase.
>> > +
>> >  2014-01-13  Jakub Jelinek  
>> >
>> > PR tree-optimization/59387
>> > diff --git a/gcc/testsuite/gcc.target/i386/pr59789.c 
>> > b/gcc/testsuite/gcc.target/i386/pr59789.c
>> > new file mode 100644
>> > index 000..b476d6c
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/i386/pr59789.c
>> > @@ -0,0 +1,22 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-require-effective-target ia32 } */
>> > +/* { dg-options "-O -march=i686" } */
>> > +
>> > +#pragma GCC push_options
>> > +#pragma GCC target("sse2")
>> > +typedef int __v4si __attribute__ ((__vector_size__ (16)));
>> > +typedef long long __m128i __attribute__ ((__vector_size__ (16), 
>> > __may_alias__));
>> > +
>> > +extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
>> > __artificial__))
>> > +_mm_set_epi32 (int __q3, int __q2, int __q1, int __q0) /* { dg-error 
>> > "target specific option mismatch" } */
>> > +{
>> > +  return __extension__ (__m128i)(__v4si){ __q0, __q1, __q2, __q3 };
>> > +}
>> > +#pragma GCC pop_options
>> > +
>> > +
>> > +__m128i
>> > +f1(void) /* { dg-message "warning: SSE vector return without SSE enabled 
>> > changes the ABI" } */
>> > +{
>> > +  return _mm_set_epi32 (0, 0, 0, 0); /* { dg-error "called from here" } */
>> > +}
>> > diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
>> > index 22521b1..ce1e3af 100644
>> > --- a/gcc/tree-inline.c
>> > +++ b/gcc/tree-inline.c
>> > @@ -4046,6 +4046,32 @@ add_local_variables (struct function *callee, 
>> > struct function *caller,
>> >}
>> >  }
>> >
>> > +/* Should an error be reported when early inliner fails to inline an
>> > +   always_inline function?  That depends on the REASON.  */
>> > +
>> > +static inline bool
>> > +report_early_inliner_always_inline_failure (cgraph_inline_failed_t reason)
>> > +{
>> > +  /* Only the following reasons need to be reported when the early inliner
>> > + fails to inline an always_inline function.  Called from
>> > + expand_call_inline.  */
>> > +  switch (reason)
>> > +{
>> > +case CIF_BODY_NOT_AVAILABLE:
>> > +case CIF_FUNCTION_NOT_INLINABLE:
>> > +case CIF_OVERWRITABLE:
>> > +case CIF_MISMATCHED_ARGUMENTS:
>> > +case CIF_EH_PERSONALITY:
>> > +case CIF_UNSPECIFIED:
>> > +case CIF_NON_CALL_EXCEPTIONS:
>> > +case CIF_TARGET_OPTION_MISMATCH:
>> > +case CIF_OPTIMIZATION_MISMATCH:
>> > +  return true;
>> > +default:
>> > +  return false;
>> > +}
>> > +}
>
> This looks reasonable, but since we have a .def file specifying the CIF codes,
> I would make this a flag in there. Perhaps something like CIF_FINAL_ERROR
> that marks those codes, so the inliner can report them.

For this testcase, we get CIF_TARGET_OPTION_MISMATCH.
Do you want to add a new flag so that inliner can use for
other errors?


-- 
H.J.


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 7:36 AM, Uros Bizjak  wrote:
> On Fri, Jan 17, 2014 at 4:17 PM, H.J. Lu  wrote:
>
>>> BTW: There are some ix86_tune == XXX conditions scattered throughout
>>> LEA handling code. Can these be substituted with appropriate TARGET_*
>>> defines?
>>
>> I have been looking at them closely to check their impacts on
>> both Haswell and Silvermont.  I am planning to keep
>> the simple LEA -> ADD transformation, but avoid
>> the complex LEA -> ADD/MOV/SHL transformation.
>
> No, I didn't talk about functional change, but about equivalent
> TARGET_* define that can be used instead of "(ix86_tune ==
> PROCESSOR_SILVERMONT) || (ix86_tune == PROCESSOR_INTEL)".
>
> Uros.

Something like

#define TARGET_INTEL_SILVERMONT \
  (ix86_tune == PROCESSOR_SILVERMONT || ix86_tune == PROCESSOR_INTEL)


-- 
H.J.
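
For readers following the macro discussion, here is a small self-contained sketch of the idea. The enum and the global are cut-down stand-ins, not the definitions from i386.h; `TARGET_INTEL_SILVERMONT` is the name floated in this message, and the two `reorder_applies_*` helpers are hypothetical names used only to compare the two spellings of the same test.

```cpp
// Cut-down stand-ins for GCC's processor enum and the ix86_tune global.
enum processor_type {
  PROCESSOR_GENERIC,
  PROCESSOR_BONNELL,
  PROCESSOR_SILVERMONT,
  PROCESSOR_INTEL,
  PROCESSOR_HASWELL
};
processor_type ix86_tune = PROCESSOR_GENERIC;

// One predicate macro per tuning target, in the style of TARGET_* defines.
#define TARGET_BONNELL    (ix86_tune == PROCESSOR_BONNELL)
#define TARGET_SILVERMONT (ix86_tune == PROCESSOR_SILVERMONT)
#define TARGET_INTEL      (ix86_tune == PROCESSOR_INTEL)

// The combined define proposed in this message.
#define TARGET_INTEL_SILVERMONT (TARGET_SILVERMONT || TARGET_INTEL)

// Hypothetical helper: the scheduling-reorder guard written with raw
// ix86_tune comparisons.
bool reorder_applies_raw() {
  return ix86_tune == PROCESSOR_BONNELL
         || ix86_tune == PROCESSOR_SILVERMONT
         || ix86_tune == PROCESSOR_INTEL;
}

// Hypothetical helper: the same guard written with the TARGET_* macros.
bool reorder_applies_macro() {
  return TARGET_BONNELL || TARGET_INTEL_SILVERMONT;
}
```

Both helpers return the same value for every tuning target, which is the sense in which the substitution is not a functional change.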


Re: [committed] Honza's alias/weakref fix (PR c++/57945)

2014-01-17 Thread Jan Hubicka
> Hi!
> 
> I've bootstrapped/regtested on x86_64-linux and i686-linux
> following patch from Honza from the PR comment (written a month
> ago, acked by Jason a fortnight ago) and committed it to trunk.
> 
> 2014-01-17  Jan Hubicka  
> 
>   PR c++/57945
>   * passes.c (rest_of_decl_compilation): Don't call varpool_finalize_decl
>   on decls for which assemble_alias has been called.

Thank you!  I definitely need to empty my patch queue (it took me a while
to get a new setup for gcc testing where I can do LTO of bigger apps again)

Honza


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 4:17 PM, H.J. Lu  wrote:

>> BTW: There are some ix86_tune == XXX conditions scattered throughout
>> LEA handling code. Can these be substituted with appropriate TARGET_*
>> defines?
>
> I have been looking at them closely to check their impacts on
> both Haswell and Silvermont.  I am planning to keep
> the simple LEA -> ADD transformation, but avoid
> the complex LEA -> ADD/MOV/SHL transformation.

No, I didn't talk about functional change, but about equivalent
TARGET_* define that can be used instead of "(ix86_tune ==
PROCESSOR_SILVERMONT) || (ix86_tune == PROCESSOR_INTEL)".

Uros.


Re: PATCH: PR middle-end/59789: [4.9 Regression] ICE in in convert_move, at expr.c:333

2014-01-17 Thread Jan Hubicka
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index 5c674bc..284bc66 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,12 @@
> > +2014-01-13  Sriraman Tallam  
> > +   H.J. Lu  
> > +
> > +   PR middle-end/59789
> > +   * tree-inline.c (report_early_inliner_always_inline_failure): New
> > +   function.
> > +   (expand_call_inline): Emit errors during early_inlining if
> > +   report_early_inliner_always_inline_failure returns true.
> > +
> >  2014-01-10  DJ Delorie  
> >
> > * config/msp430/msp430.md (call_internal): Don't allow memory
> > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> > index 459e365..2a7b3ca 100644
> > --- a/gcc/testsuite/ChangeLog
> > +++ b/gcc/testsuite/ChangeLog
> > @@ -1,3 +1,8 @@
> > +2014-01-13  H.J. Lu  
> > +
> > +   PR middle-end/59789
> > +   * gcc.target/i386/pr59789.c: New testcase.
> > +
> >  2014-01-13  Jakub Jelinek  
> >
> > PR tree-optimization/59387
> > diff --git a/gcc/testsuite/gcc.target/i386/pr59789.c 
> > b/gcc/testsuite/gcc.target/i386/pr59789.c
> > new file mode 100644
> > index 000..b476d6c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr59789.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target ia32 } */
> > +/* { dg-options "-O -march=i686" } */
> > +
> > +#pragma GCC push_options
> > +#pragma GCC target("sse2")
> > +typedef int __v4si __attribute__ ((__vector_size__ (16)));
> > +typedef long long __m128i __attribute__ ((__vector_size__ (16), 
> > __may_alias__));
> > +
> > +extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
> > __artificial__))
> > +_mm_set_epi32 (int __q3, int __q2, int __q1, int __q0) /* { dg-error 
> > "target specific option mismatch" } */
> > +{
> > +  return __extension__ (__m128i)(__v4si){ __q0, __q1, __q2, __q3 };
> > +}
> > +#pragma GCC pop_options
> > +
> > +
> > +__m128i
> > +f1(void) /* { dg-message "warning: SSE vector return without SSE enabled 
> > changes the ABI" } */
> > +{
> > +  return _mm_set_epi32 (0, 0, 0, 0); /* { dg-error "called from here" } */
> > +}
> > diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> > index 22521b1..ce1e3af 100644
> > --- a/gcc/tree-inline.c
> > +++ b/gcc/tree-inline.c
> > @@ -4046,6 +4046,32 @@ add_local_variables (struct function *callee, struct 
> > function *caller,
> >}
> >  }
> >
> > +/* Should an error be reported when early inliner fails to inline an
> > +   always_inline function?  That depends on the REASON.  */
> > +
> > +static inline bool
> > +report_early_inliner_always_inline_failure (cgraph_inline_failed_t reason)
> > +{
> > +  /* Only the following reasons need to be reported when the early inliner
> > + fails to inline an always_inline function.  Called from
> > + expand_call_inline.  */
> > +  switch (reason)
> > +{
> > +case CIF_BODY_NOT_AVAILABLE:
> > +case CIF_FUNCTION_NOT_INLINABLE:
> > +case CIF_OVERWRITABLE:
> > +case CIF_MISMATCHED_ARGUMENTS:
> > +case CIF_EH_PERSONALITY:
> > +case CIF_UNSPECIFIED:
> > +case CIF_NON_CALL_EXCEPTIONS:
> > +case CIF_TARGET_OPTION_MISMATCH:
> > +case CIF_OPTIMIZATION_MISMATCH:
> > +  return true;
> > +default:
> > +  return false;
> > +}
> > +}

This looks reasonable, but since we have a .def file specifying the CIF codes,
I would make this a flag in there. Perhaps something like CIF_FINAL_ERROR
that marks those codes, so the inliner can report them.

Honza
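
A rough sketch of what a per-code flag could look like, using the usual X-macro pattern. Everything here is illustrative: the inline list stands in for cif-code.def, `CIF_FINAL_NORMAL` is an assumed name for the non-error flag, the choice of which codes are non-final is arbitrary, and the `example_*` names are hypothetical.

```cpp
// Flag attached to every code: is the failure final (report it already in
// early inlining) or not?  CIF_FINAL_NORMAL is an assumed name.
enum cif_flag_sketch { CIF_FINAL_NORMAL, CIF_FINAL_ERROR };

// Stand-in for cif-code.def: one DEFCIFCODE entry per code, with its flag.
#define EXAMPLE_CIF_CODES(DEFCIFCODE)                     \
  DEFCIFCODE(FUNCTION_NOT_CONSIDERED, CIF_FINAL_NORMAL)   \
  DEFCIFCODE(BODY_NOT_AVAILABLE, CIF_FINAL_ERROR)         \
  DEFCIFCODE(TARGET_OPTION_MISMATCH, CIF_FINAL_ERROR)     \
  DEFCIFCODE(RECURSIVE_INLINING, CIF_FINAL_NORMAL)

// First expansion: build the enum of failure reasons.
enum example_inline_failed_t {
#define AS_ENUM(code, flag) CIF_##code,
  EXAMPLE_CIF_CODES(AS_ENUM)
#undef AS_ENUM
  CIF_N_REASONS
};

// Second expansion: build a parallel table of flags, indexed by reason.
static const cif_flag_sketch example_cif_flags[] = {
#define AS_FLAG(code, flag) flag,
  EXAMPLE_CIF_CODES(AS_FLAG)
#undef AS_FLAG
};

// Replaces a hand-maintained switch: the answer comes from the table.
bool example_cif_is_final_error(example_inline_failed_t reason) {
  return example_cif_flags[reason] == CIF_FINAL_ERROR;
}
```

Because the flag is a mandatory argument of every entry, a newly added code cannot be introduced without deciding whether it is final.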


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Jakub Jelinek
On Fri, Jan 17, 2014 at 04:24:50PM +0100, Uros Bizjak wrote:
> On Fri, Jan 17, 2014 at 3:50 PM, H.J. Lu  wrote:
> >
> > Wrong example.  It should be
> >
> > lea 0x400(%edx, %ecx, 8), %edx
> >
> > we get
> >
> > add %ecx, %edx
> > add %ecx, %edx
> > add %ecx, %edx
> > add %ecx, %edx
> > add %ecx, %edx
> > add %ecx, %edx
> > add %ecx, %edx
> > add %ecx, %edx
> > add $0x400, %edx
> 
> Even for this example, the code can be substantially improved:
> 
> shl $3, %ecx
> add %ecx, %edx
> add $0x400, %edx

Only if ecx is dead after the statement.

Jakub
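
The liveness point can be checked with a tiny register model. This is only a sketch: `Regs` and the three functions are hypothetical names, and 32-bit unsigned arithmetic stands in for the machine registers.

```cpp
#include <cstdint>

// The two registers involved in the destructive example.
struct Regs {
  std::uint32_t edx;
  std::uint32_t ecx;
};

// lea 0x400(%edx, %ecx, 8), %edx
Regs lea_form(Regs r) {
  r.edx = r.edx + r.ecx * 8u + 0x400u;
  return r;
}

// Eight times "add %ecx, %edx", then "add $0x400, %edx".
Regs repeated_add_form(Regs r) {
  for (int i = 0; i < 8; ++i)
    r.edx += r.ecx;
  r.edx += 0x400u;
  return r;
}

// shl $3, %ecx; add %ecx, %edx; add $0x400, %edx
Regs shift_form(Regs r) {
  r.ecx <<= 3;      // this overwrites the index register
  r.edx += r.ecx;
  r.edx += 0x400u;
  return r;
}
```

All three produce the same %edx, but only `shift_form` changes %ecx, so it is a valid replacement only when %ecx is dead afterwards.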


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 3:50 PM, H.J. Lu  wrote:
>
> Wrong example.  It should be
>
> lea 0x400(%edx, %ecx, 8), %edx
>
> we get
>
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add $0x400, %edx

Even for this example, the code can be substantially improved:

shl $3, %ecx
add %ecx, %edx
add $0x400, %edx

Uros.


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 7:11 AM, Uros Bizjak  wrote:
> On Fri, Jan 17, 2014 at 3:46 PM, H.J. Lu  wrote:
>
>>>> ix86_split_lea_for_addr transforms a single LEA instruction into a series
>>>> of MOV and ADD instructions.  For
>>>>
>>>> lea 0x400(%eax, %ecx, 8), %edx
>>>>
>>>> we get
>>>>
>>>> mov %eax, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add $0x400, %edx
>>>>
>>>> For -mtune=intel, we want to turn on X86_TUNE_OPT_AGU, but avoid
>>>> ix86_split_lea_for_addr.  This patch adds X86_TUNE_AVOID_LEA_FOR_ADDR
>>>> and PROCESSOR_INTEL.  We keep PROCESSOR_INTEL the same as
>>>> PROCESSOR_SILVERMONT, except that X86_TUNE_AVOID_LEA_FOR_ADDR isn't
>>>> turned on for PROCESSOR_INTEL.  OK for trunk?
>>>
>>> As said earlier, m_INTEL is not a processor, but equals a REAL
>>> processor, so the patch is not acceptable.
>>>
>>
>> -mtune=intel, similar to -mtune=generic,  isn't equal to a single processor.
>> From invoke.texi:
>>
>> ---
>> @item intel
>> Produce code optimized for the most current Intel processors, which are
>> Haswell and Silvermont for this version of GCC.
>> ---
>>
>> We don't want -mtune=intel to define __tune_silvermont__ and we
>> want to generate balanced codes for Haswell and Silvermont.
>> -mtune=intel started as -mtune=silvermont.  I am working on incremental
>> changes like this to better tune for Haswell without significantly impacting
>> Silvermont.
>
> OK, this clarifies the situation.
>
> So, -mtune=generic is too broad, and -mtune=intel is needed, as a
> generic tuning for latest Intel processors (note the plural). We want
> tuning options that cover Haswell and Silvermont for this version, but
> not something that degrades runtime too much (or unnecessarily
> increases code size too much).

Yes, that is correct.

> If this is the case, I agree with the approach.

I will check it in.

> BTW: There are some ix86_tune == XXX conditions scattered throughout
> LEA handling code. Can these be substituted with appropriate TARGET_*
> defines?

I have been looking at them closely to check their impacts on
both Haswell and Silvermont.  I am planning to keep
the simple LEA -> ADD transformation, but avoid
the complex LEA -> ADD/MOV/SHL transformation.

Thanks.

-- 
H.J.


Re: [C++ PATCH] Don't segv in cvt.c (PR c++/59838)

2014-01-17 Thread Jason Merrill

OK.

Jason


Re: [C++ Patch] PR 59269

2014-01-17 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 3:46 PM, H.J. Lu  wrote:

>>> ix86_split_lea_for_addr transforms a single LEA instruction into a series
>>> of MOV and ADD instructions.  For
>>>
>>> lea 0x400(%eax, %ecx, 8), %edx
>>>
>>> we get
>>>
>>> mov %eax, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add %ecx, %edx
>>> add $0x400, %edx
>>>
>>> For -mtune=intel, we want to turn on X86_TUNE_OPT_AGU, but avoid
>>> ix86_split_lea_for_addr.  This patch adds X86_TUNE_AVOID_LEA_FOR_ADDR
>>> and PROCESSOR_INTEL.  We keep PROCESSOR_INTEL the same as
>>> PROCESSOR_SILVERMONT, except that X86_TUNE_AVOID_LEA_FOR_ADDR isn't
>>> turned on for PROCESSOR_INTEL.  OK for trunk?
>>
>> As said earlier, m_INTEL is not a processor, but equals a REAL
>> processor, so the patch is not acceptable.
>>
>
> -mtune=intel, similar to -mtune=generic,  isn't equal to a single processor.
> From invoke.texi:
>
> ---
> @item intel
> Produce code optimized for the most current Intel processors, which are
> Haswell and Silvermont for this version of GCC.
> ---
>
> We don't want -mtune=intel to define __tune_silvermont__ and we
> want to generate balanced codes for Haswell and Silvermont.
> -mtune=intel started as -mtune=silvermont.  I am working on incremental
> changes like this to better tune for Haswell without significantly impacting
> Silvermont.

OK, this clarifies the situation.

So, -mtune=generic is too broad, and -mtune=intel is needed, as a
generic tuning for latest Intel processors (note the plural). We want
tuning options that cover Haswell and Silvermont for this version, but
not something that degrades runtime too much (or unnecessarily
increases code size too much).

If this is the case, I agree with the approach.

BTW: There are some ix86_tune == XXX conditions scattered throughout
LEA handling code. Can these be substituted with appropriate TARGET_*
defines?

Uros.


[patch] fix libstdc++/56267 - local iterator requirements

2014-01-17 Thread Jonathan Wakely
The issue in PR 56267 is that unordered_xxx::local_iterator sometimes
inherits from the user-defined hash function (via _Hash_code_base,
which inherits from the hash function to use the EBO), and
local_iterator must be DefaultConstructible and Assignable, even when
the hash function isn't.

My solution is to remove the inheritance from _Hash_code_base, and
instead construct/destroy the _Hash_code_base in a block of
uninitialized memory (via __gnu_cxx::__aligned_buffer). This would
mean we can't use the EBO and increase the size of local_iterator, and
past measurements have shown that the unordered containers'
performance is sensitive to such changes, so there's a partial
specialization that doesn't have the __aligned_buffer member for the
case where the _Hash_code_base is empty and needs no storage.
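
The storage technique described above can be sketched roughly as follows. This is not the libstdc++ code: `seeded_hash` and `local_iterator_sketch` are hypothetical, a plain aligned byte array stands in for `__gnu_cxx::__aligned_buffer`, an explicit `engaged_` flag is added for simplicity, and the space-saving specialization for empty functors is left out.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>

// A hash functor that is neither DefaultConstructible nor Assignable.
struct seeded_hash {
  explicit seeded_hash(std::size_t seed) : seed_(seed) {}
  seeded_hash(const seeded_hash&) = default;
  seeded_hash& operator=(const seeded_hash&) = delete;
  std::size_t operator()(int key) const {
    return seed_ + static_cast<std::size_t>(key);
  }
private:
  std::size_t seed_;
};

// Holds the functor in raw storage and manages its lifetime by hand, so the
// iterator itself stays default constructible and assignable.
template<typename Hash>
class local_iterator_sketch {
public:
  local_iterator_sketch() = default;   // no Hash object is created
  local_iterator_sketch(const Hash& h, std::size_t bucket_count)
    : bucket_count_(bucket_count) { init(h); }
  local_iterator_sketch(const local_iterator_sketch& o)
    : bucket_count_(o.bucket_count_) { if (o.engaged_) init(*o.hash()); }
  local_iterator_sketch& operator=(const local_iterator_sketch& o) {
    if (this != &o) {
      destroy();                        // end the old functor's lifetime
      bucket_count_ = o.bucket_count_;
      if (o.engaged_) init(*o.hash()); // copy-construct, never assign
    }
    return *this;
  }
  ~local_iterator_sketch() { destroy(); }

  bool valid() const { return engaged_; }
  // Precondition: valid() and a non-zero bucket count.
  std::size_t bucket_of(int key) const {
    return (*hash())(key) % bucket_count_;
  }

private:
  void init(const Hash& h) {
    ::new (static_cast<void*>(storage_)) Hash(h);
    engaged_ = true;
  }
  void destroy() {
    if (engaged_) { hash()->~Hash(); engaged_ = false; }
  }
  Hash* hash() { return reinterpret_cast<Hash*>(storage_); }
  const Hash* hash() const { return reinterpret_cast<const Hash*>(storage_); }

  alignas(Hash) unsigned char storage_[sizeof(Hash)];
  std::size_t bucket_count_ = 0;
  bool engaged_ = false;
};
```

Copy assignment is implemented as destroy followed by copy construction, which is why the functor only needs to be copy constructible.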

François, do you have any comments on this? Can you see a better solution?

While working on this I decided I didn't like everything in
_Local_iterator_base being public, so I added some accessors to the
only members that are needed by unrelated types.

2014-01-17  Jonathan Wakely  

PR libstdc++/56267
* include/bits/hashtable_policy.h (_Local_iterator_base): Give
protected access to all existing members.
(_Local_iterator_base::_M_curr()): New public accessor.
(_Local_iterator_base::_M_get_bucket()): New public accessor.
(_Local_iterator_base<..., false>::_M_init()): New function to manage
the lifetime of the _Hash_code_base explicitly.
(_Local_iterator_base<..., false>::_M_destroy()): Likewise.
(_Local_iterator_base<..., false>): Define copy constructor and copy
assignment operator that use new functions to manage _Hash_code_base.
(operator==(const _Local_iterator_base&, const _Local_iterator_base&),
operator==(const _Local_iterator_base&, const _Local_iterator_base&)):
Use public API for _Local_iterator_base.
* include/debug/safe_local_iterator.h (_Safe_local_iterator): Likewise.
* include/debug/unordered_map (__debug::unordered_map::erase(),
__debug::unordered_multimap::erase()): Likewise.
* include/debug/unordered_set (__debug::unordered_set::erase(),
__debug::unordered_multiset::erase()): Likewise.
* testsuite/23_containers/unordered_set/56267-2.cc: New test.
commit f9c852223230da4e4fc761f72905c7acacfa4399
Author: Jonathan Wakely 
Date:   Fri Jan 17 15:10:48 2014 +

PR libstdc++/56267
* include/bits/hashtable_policy.h (_Local_iterator_base): Give
protected access to all existing members.
(_Local_iterator_base::_M_curr()): New public accessor.
(_Local_iterator_base::_M_get_bucket()): New public accessor.
(operator==(const _Local_iterator_base&, const _Local_iterator_base&),
operator==(const _Local_iterator_base&, const _Local_iterator_base&)):
Use public API for _Local_iterator_base.
* include/debug/safe_local_iterator.h (_Safe_local_iterator): Likewise.
* include/debug/unordered_map (__debug::unordered_map::erase(),
__debug::unordered_multimap::erase()): Likewise.
* include/debug/unordered_set (__debug::unordered_set::erase(),
__debug::unordered_multiset::erase()): Likewise.
* testsuite/23_containers/unordered_set/56267-2.cc: New test.

diff --git a/libstdc++-v3/include/bits/hashtable_policy.h 
b/libstdc++-v3/include/bits/hashtable_policy.h
index f64d2d3..817b190 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -1145,6 +1145,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using __ebo_h1 = _Hashtable_ebo_helper<1, _H1>;
   using __ebo_h2 = _Hashtable_ebo_helper<2, _H2>;
 
+  // Gives the local iterator implementation access to _M_bucket_index().
+  friend struct _Local_iterator_base<_Key, _Value, _ExtractKey, _H1, _H2,
+_Default_ranged_hash, false>;
+
 public:
   typedef _H1  hasher;
 
@@ -1225,7 +1229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   private _Hashtable_ebo_helper<2, _H2>
 {
 private:
-  // Gives access to _M_h2() to the local iterator implementation.
+  // Gives the local iterator implementation access to _M_h2().
   friend struct _Local_iterator_base<_Key, _Value, _ExtractKey, _H1, _H2,
 _Default_ranged_hash, true>;
 
@@ -1331,7 +1335,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   };
 
 
-  /// Specialization.
+  /// Partial specialization used when nodes contain a cached hash code.
   template
 struct _Local_iterator_base<_Key, _Value, _ExtractKey,
@@ -1343,7 +1347,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using __hash_code_base = _Hash_code_base<_Key, _Value, _ExtractKey,
   _H1, _H2, _Hash, true>;
 
-public:
   _Local_iterator_base() = default;
   _Local_iterator_base(const __hash_code_base& __base,
   _Hash_

Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 6:30 AM, Jakub Jelinek  wrote:
> On Fri, Jan 17, 2014 at 06:19:37AM -0800, H.J. Lu wrote:
>> ix86_split_lea_for_addr transforms a single LEA instruction into a series
>> of MOV and ADD instructions.  For
>>
>> lea 0x400(%eax, %ecx, 8), %edx
>>
>> we get
>>
>> mov %eax, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add $0x400, %edx
>
> Ugh, is that really what you want for silvermont, as opposed to (at least
> if the output operand isn't equal to the base):
> mov %ecx, %edx  ! if base is equal to index this would go away
> add %ecx, %edx
> add %edx, %edx
> add %edx, %edx
> add %eax, %edx
> add $0x400, %edx
> ?

Wrong example.  It should be

lea 0x400(%edx, %ecx, 8), %edx

we get

add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add $0x400, %edx

For

lea 0x400(%eax, %ecx, 8), %edx

we get

mov %ecx, %edx
shl $3, %edx
add %eax, %edx
add $0x400, %edx

-- 
H.J.
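
The two corrected sequences above can be checked against the arithmetic the LEA performs. The following is a minimal sketch in plain C, not code from the patch: the function names are made up for illustration, and 32-bit unsigned variables stand in for the registers.

```c
#include <assert.h>
#include <stdint.h>

/* What `lea disp(base, index, 8), dst` computes, modulo 2^32. */
uint32_t lea_scale8(uint32_t base, uint32_t index, uint32_t disp)
{
    return base + index * 8u + disp;
}

/* Destination is also the base: eight ADDs of the index, then the
   displacement. */
uint32_t split_dst_is_base(uint32_t edx, uint32_t ecx)
{
    for (int i = 0; i < 8; i++)
        edx += ecx;          /* add %ecx, %edx */
    edx += 0x400u;           /* add $0x400, %edx */
    return edx;
}

/* Destination differs from the base: MOV, SHL, ADD, ADD. */
uint32_t split_dst_differs(uint32_t eax, uint32_t ecx)
{
    uint32_t edx = ecx;      /* mov %ecx, %edx */
    edx <<= 3;               /* shl $3, %edx */
    edx += eax;              /* add %eax, %edx */
    edx += 0x400u;           /* add $0x400, %edx */
    return edx;
}

/* Aborts if either split sequence disagrees with the LEA result. */
void check_split_sequences(uint32_t base, uint32_t index)
{
    uint32_t want = lea_scale8(base, index, 0x400u);
    assert(split_dst_is_base(base, index) == want);
    assert(split_dst_differs(base, index) == want);
}
```

Both forms give the same value; they differ only in instruction count, which is the cost the tuning flag is meant to control.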


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
On Fri, Jan 17, 2014 at 6:23 AM, Uros Bizjak  wrote:
> On Fri, Jan 17, 2014 at 3:19 PM, H.J. Lu  wrote:
>> ix86_split_lea_for_addr transforms a single LEA instruction into a series
>> of MOV and ADD instructions.  For
>>
>> lea 0x400(%eax, %ecx, 8), %edx
>>
>> we get
>>
>> mov %eax, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add %ecx, %edx
>> add $0x400, %edx
>>
>> For -mtune=intel, we want to turn on X86_TUNE_OPT_AGU, but avoid
>> ix86_split_lea_for_addr.  This patch adds X86_TUNE_AVOID_LEA_FOR_ADDR
>> and PROCESSOR_INTEL.  We keep PROCESSOR_INTEL the same as
>> PROCESSOR_SILVERMONT, except that X86_TUNE_AVOID_LEA_FOR_ADDR isn't
>> turned on for PROCESSOR_INTEL.  OK for trunk?
>
> As said earlier, m_INTEL is not a processor, but equals a REAL
> processor, so the patch is not acceptable.
>

-mtune=intel, similar to -mtune=generic,  isn't equal to a single processor.
From invoke.texi:

---
@item intel
Produce code optimized for the most current Intel processors, which are
Haswell and Silvermont for this version of GCC.
---

We don't want -mtune=intel to define __tune_silvermont__ and we
want to generate balanced code for Haswell and Silvermont.
-mtune=intel started as -mtune=silvermont.  I am working on incremental
changes like this to better tune for Haswell without significantly impacting
Silvermont.

-- 
H.J.


Re: [PATCH] Fix up gen-vect-32.c testcase (PR testsuite/58776)

2014-01-17 Thread Richard Biener
On Fri, 17 Jan 2014, Jakub Jelinek wrote:

> Hi!
> 
> For -O2 vectorization by default GCC 4.9 uses the cheap cost model,
> which disallows e.g. peeling for alignment, but on strict alignment
> targets the loop needs to be peeled for alignment, otherwise vectorization
> isn't performed.
> 
> Fixed by just adding -fno-vect-cost-model to dg-options,
> bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-01-17  Jakub Jelinek  
> 
>   PR testsuite/58776
>   * gcc.dg/tree-ssa-gen-vect-32.c: Add -fno-vect-cost-model to
>   dg-options, use dg-additional-options for i?86/x86_64 to avoid
>   option duplication.
> 
> --- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c.jj2012-10-03 
> 09:01:35.0 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c   2014-01-17 
> 11:03:27.039224079 +0100
> @@ -1,6 +1,6 @@
>  /* { dg-do run { target vect_cmdline_needed } } */
> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { 
> target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details 
> -fno-vect-cost-model" } */
> +/* { dg-additional-options "-mno-sse" { target { i?86-*-* x86_64-*-* } } } */
>  
>  #include 
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


Re: [Patch][AArch64] vneg floating point testcase BE fixed

2014-01-17 Thread Richard Earnshaw
On 17/01/14 14:22, Alex Velenko wrote:
> Hi,
> Here are some more improvements on changelog entry:
> 
> gcc/testsuite/
> 
> 2013-01-16  Alex Velenko  
> 
>   * gcc.target/aarch64/vneg_f.c (STORE_INST): New macro.
>   (RUN_TEST): Use new macro.
>   (INDEX64_32): Delete.
>   (INDEX64_64): Likewise.
>   (INDEX128_32): Likewise.
>   (INDEX128_64): Likewise.
>   (INDEX): Likewise.
>   (test_vneg_f32): Use fixed RUN_TEST.
>   (test_vneg_f64): Likewise.
>   (test_vnegq_f32): Likewise.
>   (test_vnegq_f64): Likewise.
> 
> 

OK.

R.




Re: [PATCH] Handle NAMELIST_DECLs in tree-nested (PR fortran/59440)

2014-01-17 Thread Richard Biener
On Fri, 17 Jan 2014, Jakub Jelinek wrote:

> Hi!
> 
> The following testcases ICE, because when tree-nested.c replaces
> various decls in the functions with local or nonlocal alternative decls
> and makes the old ones DECL_IGNORED_P, decls inside of NAMELIST_DECL
> NAMELIST_DECL_ASSOCIATED_DECL aren't replaced and thus debug info can't
> be generated for the namelist.
> 
> Fixed by adjusting them.  Bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-01-17  Jakub Jelinek  
> 
>   PR fortran/59440
>   * tree-nested.c (convert_nonlocal_reference_stmt,
>   convert_local_reference_stmt): For NAMELIST_DECLs in gimple_bind_vars
>   of GIMPLE_BIND stmts, adjust associated decls.
> 
>   * gfortran.dg/pr59440-1.f90: New test.
>   * gfortran.dg/pr59440-2.f90: New test.
>   * gfortran.dg/pr59440-3.f90: New test.
> 
> --- gcc/tree-nested.c.jj  2014-01-03 11:41:01.0 +0100
> +++ gcc/tree-nested.c 2014-01-17 10:17:58.287209744 +0100
> @@ -1331,6 +1331,25 @@ convert_nonlocal_reference_stmt (gimple_
>if (!optimize && gimple_bind_block (stmt))
>   note_nonlocal_block_vlas (info, gimple_bind_block (stmt));
>  
> +  for (tree var = gimple_bind_vars (stmt); var; var = DECL_CHAIN (var))
> + if (TREE_CODE (var) == NAMELIST_DECL)
> +   {
> + /* Adjust decls mentioned in NAMELIST_DECL.  */
> + tree decls = NAMELIST_DECL_ASSOCIATED_DECL (var);
> + tree decl;
> + unsigned int i;
> +
> + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (decls), i, decl)
> +   {
> + if (TREE_CODE (decl) == VAR_DECL
> + && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
> +   continue;
> + if (decl_function_context (decl) != info->context)
> +   CONSTRUCTOR_ELT (decls, i)->value
> + = get_nonlocal_debug_decl (info, decl);
> +   }
> +   }
> +
>*handled_ops_p = false;
>return NULL_TREE;
>  
> @@ -1787,6 +1806,36 @@ convert_local_reference_stmt (gimple_stm
>*handled_ops_p = false;
>return NULL_TREE;
>  
> +case GIMPLE_BIND:
> +  for (tree var = gimple_bind_vars (stmt); var; var = DECL_CHAIN (var))
> + if (TREE_CODE (var) == NAMELIST_DECL)
> +   {
> + /* Adjust decls mentioned in NAMELIST_DECL.  */
> + tree decls = NAMELIST_DECL_ASSOCIATED_DECL (var);
> + tree decl;
> + unsigned int i;
> +
> + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (decls), i, decl)
> +   {
> + if (TREE_CODE (decl) == VAR_DECL
> + && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
> +   continue;
> + if (decl_function_context (decl) == info->context
> + && !use_pointer_in_frame (decl))
> +   {
> + tree field = lookup_field_for_decl (info, decl, NO_INSERT);
> + if (field)
> +   {
> + CONSTRUCTOR_ELT (decls, i)->value
> +   = get_local_debug_decl (info, decl, field);
> +   }
> +   }
> +   }
> +   }
> +
> +  *handled_ops_p = false;
> +  return NULL_TREE;
> +
>  default:
>/* For every other statement that we are not interested in
>handling here, let the walker traverse the operands.  */
> --- gcc/testsuite/gfortran.dg/pr59440-1.f90.jj2014-01-17 
> 10:19:41.038685480 +0100
> +++ gcc/testsuite/gfortran.dg/pr59440-1.f90   2014-01-17 10:19:05.0 
> +0100
> @@ -0,0 +1,23 @@
> +! PR fortran/59440
> +! { dg-do compile }
> +! { dg-options "-O2 -g" }
> +
> +module pr59440
> +  implicit none
> +  type t
> + integer :: grid = 0
> +  end type t
> +contains
> +  subroutine read_nml (nnml, s)
> +integer, intent(in)  :: nnml
> +type(t), intent(out) :: s
> +integer  :: grid
> +namelist /N/ grid
> +call read_nml_type_2
> +s%grid = grid
> +  contains
> +subroutine read_nml_type_2
> +  read (nnml, nml=N)
> +end subroutine read_nml_type_2
> +  end subroutine read_nml
> +end module pr59440
> --- gcc/testsuite/gfortran.dg/pr59440-2.f90.jj2014-01-17 
> 10:19:44.725667756 +0100
> +++ gcc/testsuite/gfortran.dg/pr59440-2.f90   2014-01-17 10:19:20.0 
> +0100
> @@ -0,0 +1,16 @@
> +! PR fortran/59440
> +! { dg-do compile }
> +! { dg-options "-O2 -g" }
> +
> +subroutine foo (nnml, outv)
> +  integer, intent(in) :: nnml
> +  integer, intent(out) :: outv
> +  integer :: grid
> +  namelist /N/ grid
> +  read (nnml, nml=N)
> +  call bar
> +contains
> +  subroutine bar
> +outv = grid
> +  end subroutine bar
> +end subroutine foo
> --- gcc/testsuite/gfortran.dg/pr59440-3.f90.jj2014-01-17 
> 10:19:46.910654994 +0100
> +++ gcc/testsuite/gfortran.dg/pr59440-3.f90   2014-01-17 10:19:23.0 
> +0100
> @@ -0,0 +1,16 @@
> +! PR fortran/59440
> +! { dg-do compile }

Re: [PATCH] Fix up ivdep/do concurrent testcases (PR testsuite/59064)

2014-01-17 Thread Richard Biener
On Fri, 17 Jan 2014, Jakub Jelinek wrote:

> Hi!
> 
> I believe the intent of these testcases was to verify the loops
> are vectorized and don't need versioning for alias, which is the only thing
> these constructs tell the compiler about.  On some architectures, it is
> possible the loops will need versioning for alignment (or peeling for
> alignment or just using unaligned loads and/or stores in the vectorized
> loop).
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Ok.

Thanks,
Richard.
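
The effect of the combined pattern can be tried outside DejaGnu. This is a rough sketch using POSIX regcomp/regexec, not the Tcl matcher the testsuite actually uses; the function name and the sample diagnostic strings are assumptions chosen for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if one diagnostic line matches the combined dg-bogus pattern,
   0 if it does not, and -1 if the pattern fails to compile. */
int mentions_alias_versioning(const char *line)
{
    regex_t re;
    int hit;

    assert(line != NULL);
    /* Same pattern as the new directive; the Tcl escapes \n and \r become
       real characters inside the C string. */
    if (regcomp(&re, " version[^\n\r]* alias", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    hit = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}
```

A line that mentions versioning for alignment only no longer trips the check, while a line that mentions versioning and aliasing together still does.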

> 2014-01-17  Jakub Jelinek  
> 
>   PR testsuite/59064
>   * gcc.dg/vect/vect-ivdep-1.c: Replace two dg-bogus lines separately
>   testing for " version" and " alias" with one testing for
>   " version\[^\n\r]* alias".
>   * gcc.dg/vect/vect-ivdep-2.c: Likewise.
>   * gfortran.dg/vect/vect-do-concurrent-1.f90: Likewise.
>   * g++.dg/vect/pr33426-ivdep.cc: Likewise.
>   * g++.dg/vect/pr33426-ivdep-2.cc: Likewise.
>   * g++.dg/vect/pr33426-ivdep-3.cc: Likewise.
>   * g++.dg/vect/pr33426-ivdep-4.cc: Adjust comments similarly.
> 
> --- gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c.jj   2013-11-12 
> 11:31:19.0 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c  2014-01-17 08:38:37.919749235 
> +0100
> @@ -14,6 +14,5 @@ void foo(int n, int *a, int *b, int *c,
>  }
>  
>  /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
> -/* { dg-bogus " version" "" { target *-*-* } 0 } */
> -/* { dg-bogus " alias" "" { target *-*-* } 0 } */
> +/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
> --- gcc/testsuite/gcc.dg/vect/vect-ivdep-2.c.jj   2013-11-12 
> 11:31:19.0 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-ivdep-2.c  2014-01-17 08:38:55.854652038 
> +0100
> @@ -30,6 +30,5 @@ void bar(int n, int *a, int *b, int *c)
>  
>  
>  /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
> -/* { dg-bogus " version" "" { target *-*-* } 0 } */
> -/* { dg-bogus " alias" "" { target *-*-* } 0 } */
> +/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
> --- gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90.jj
> 2013-11-12 11:31:16.0 +0100
> +++ gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90   2014-01-17 
> 08:41:58.744714758 +0100
> @@ -12,6 +12,5 @@ subroutine test(n, a, b, c)
>  end subroutine test
>  
>  ! { dg-message "loop vectorized" "" { target *-*-* } 0 }
> -! { dg-bogus " version" "" { target *-*-* } 0 }
> -! { dg-bogus " alias" "" { target *-*-* } 0 }
> +! { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 }
>  ! { dg-final { cleanup-tree-dump "vect" } }
> --- gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc.jj 2013-11-12 
> 11:31:20.0 +0100
> +++ gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc2014-01-17 
> 08:40:08.534286245 +0100
> @@ -14,6 +14,5 @@ void foo(int n, int *a, int *b, int *c,
>  }
>  
>  /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
> -/* { dg-bogus " version" "" { target *-*-* } 0 } */
> -/* { dg-bogus " alias" "" { target *-*-* } 0 } */
> +/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
> --- gcc/testsuite/g++.dg/vect/pr33426-ivdep-2.cc.jj   2013-11-12 
> 11:31:20.0 +0100
> +++ gcc/testsuite/g++.dg/vect/pr33426-ivdep-2.cc  2014-01-17 
> 08:40:23.557207616 +0100
> @@ -29,8 +29,7 @@ void bar(int n, int *a, int *b, int *c)
>  }
>  
>  /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
> -/* { dg-bogus " version" "" { target *-*-* } 0 } */
> -/* { dg-bogus " alias" "" { target *-*-* } 0 } */
> +/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
>  
>  /* { dg-final { scan-tree-dump-times "ANNOTATE_EXPR " 2 "original" } } */
> --- gcc/testsuite/g++.dg/vect/pr33426-ivdep-3.cc.jj   2013-11-12 
> 11:31:20.0 +0100
> +++ gcc/testsuite/g++.dg/vect/pr33426-ivdep-3.cc  2014-01-17 
> 08:40:31.833150629 +0100
> @@ -15,8 +15,7 @@ void foo(int *a) {
>  }
>  
>  /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
> -/* { dg-bogus " version" "" { target *-*-* } 0 } */
> -/* { dg-bogus " alias" "" { target *-*-* } 0 } */
> +/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
>  /* { dg-final { cleanup-tree-dump "vect" } } */
>  
>  /* { dg-final { scan-tree-dump-times "ANNOTATE_EXPR " 1 "original" } } */
> --- gcc/testsuite/g++.dg/vect/pr33426-ivdep-4.cc.jj   2013-11-12 
> 11:31:20.0 +0100
> +++ gcc/testsuite/g++.dg/vect/pr33426-ivdep-4.cc  2014-01-17 
> 08:40:56.073057959 +0100
> @@ -20,8 +20,7 @@ void foo(std::vector *ar, int *b) {
>  }
>  
>  /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
> -/* FIXME: dg-bogus " version" "" { target *-*-* } 0  */
> -/* FIXME: dg-bogus " alias" "" { target *-*-* } 0  *

Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Jakub Jelinek
On Fri, Jan 17, 2014 at 06:19:37AM -0800, H.J. Lu wrote:
> ix86_split_lea_for_addr transforms a single LEA instruction into a series
> of MOV and ADD instructions.  For
> 
> lea 0x400(%eax, %ecx, 8), %edx
> 
> we get
> 
> mov %eax, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add $0x400, %edx

Ugh, is that really what you want for silvermont, as opposed to (at least
if the output operand isn't equal to the base):
mov %ecx, %edx  ! if base is equal to index this would go away
add %ecx, %edx
add %edx, %edx
add %edx, %edx
add %eax, %edx
add $0x400, %edx
?

Jakub
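
The doubling sequence suggested above reaches the factor of 8 in three ADDs instead of eight. A small sketch in C, with a hypothetical function name and unsigned 32-bit variables standing in for the registers, shows the intermediate values:

```c
#include <assert.h>
#include <stdint.h>

/* Models the suggested replacement for lea 0x400(%eax,%ecx,8), %edx. */
uint32_t lea_by_doubling(uint32_t eax, uint32_t ecx)
{
    uint32_t edx = ecx;   /* mov %ecx, %edx                 */
    edx += ecx;           /* add %ecx, %edx  -> 2 * index   */
    edx += edx;           /* add %edx, %edx  -> 4 * index   */
    edx += edx;           /* add %edx, %edx  -> 8 * index   */
    edx += eax;           /* add %eax, %edx  -> plus base   */
    edx += 0x400u;        /* add $0x400, %edx               */

    /* Same value the single LEA would have produced, modulo 2^32. */
    assert(edx == eax + ecx * 8u + 0x400u);
    return edx;
}
```

As the message notes, this only works as written when the destination register is not also the base, since %edx is overwritten in the first step.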


Re: [PATCH] Don't fold zero-sized elements (PR c/58346)

2014-01-17 Thread Richard Biener
On Fri, Jan 17, 2014 at 2:37 PM, Marek Polacek  wrote:
> This is the real fix for PR58346.  I'd say the easiest solution is
> just not to fold zero-sized elements.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-01-17  Marek Polacek  
>
> PR c/58346
> * gimple-fold.c (fold_array_ctor_reference): Don't fold if element
> size is zero.
> testsuite/
> * gcc.dg/pr58346.c: New test.
>
> --- gcc/gimple-fold.c.mp2   2014-01-17 12:03:56.149446880 +0100
> +++ gcc/gimple-fold.c   2014-01-17 12:04:00.450462677 +0100
> @@ -2940,7 +2940,8 @@ fold_array_ctor_reference (tree type, tr
>   be larger than size of array element.  */
>if (!TYPE_SIZE_UNIT (type)
>|| TREE_CODE (TYPE_SIZE_UNIT (type)) != INTEGER_CST
> -  || elt_size.slt (tree_to_double_int (TYPE_SIZE_UNIT (type))))
> +  || elt_size.slt (tree_to_double_int (TYPE_SIZE_UNIT (type)))
> +  || elt_size.is_zero ())
>  return NULL_TREE;
>
>/* Compute the array index we look for.  */
> --- gcc/testsuite/gcc.dg/pr58346.c.mp2  2014-01-17 12:27:26.180127058 +0100
> +++ gcc/testsuite/gcc.dg/pr58346.c  2014-01-17 12:28:20.466332046 +0100
> @@ -0,0 +1,19 @@
> +/* PR tree-optimization/58346 */
> +/* { dg-do compile } */
> +/* { dg-options "-O" } */
> +
> +struct U {};
> +static struct U b[1] = { };
> +extern void bar (struct U);
> +
> +void
> +foo (void)
> +{
> +  bar (b[0]);
> +}
> +
> +void
> +baz (void)
> +{
> +  foo ();
> +}
>
> Marek
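
The trailing context of the gimple-fold.c hunk shows that the next step computes an array index. Assuming that step divides an offset by the element size, a zero size has to be rejected first. The sketch below is a standalone model of that guard, not GCC code; the function name and the use of plain size_t values in place of double_int are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: map a byte offset to an element index, or return -1
   when no fold should be attempted. */
long element_index(size_t offset, size_t elt_size, size_t access_size)
{
    size_t idx;

    /* Mirror the patched guard: give up when the element is smaller than
       the accessed type, or when the element size is zero. */
    if (elt_size < access_size || elt_size == 0)
        return -1;

    idx = offset / elt_size;            /* safe: elt_size is nonzero here */
    assert(idx * elt_size <= offset);
    return (long)idx;
}
```

With GNU C's empty `struct U {}` from the testcase, the element size is 0, so the model takes the early return instead of reaching the division.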


Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 3:15 PM, Jakub Jelinek  wrote:
> On Tue, Jan 14, 2014 at 08:12:41PM +0100, Jakub Jelinek wrote:
>> For 4.9, if what you've added is what you want to do for performance
>> reasons, then I'd do something like:
>
> Ok, here it is in the form of a patch, bootstrapped/regtested on x86_64-linux
> and i686-linux, ok for trunk?
>
> 2014-01-17  Jakub Jelinek  
>
> * config/i386/i386.c (ix86_data_alignment): For compatibility with
> (incorrect) GCC 4.8 and earlier alignment assumptions ensure we align
> decls to at least the GCC 4.8 used alignments.

This is OK for mainline.

Thanks,
Uros.
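
The patch under review introduces `max_align_compat = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT)`. A simplified sketch of how such a floor might be applied follows. The first function mirrors that expression, with parameters standing in for the GCC globals and macros. The second function is an assumption about how the floor is used (the condition on object size in particular is a guess) and is not the actual ix86_data_alignment logic.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Mirrors the max_align_compat expression from the patch. */
int max_align_compat(int optimize_size, int bits_per_word,
                     int max_ofile_alignment)
{
    return optimize_size ? bits_per_word : MIN(256, max_ofile_alignment);
}

/* Assumed use: objects at least as large as the compat alignment never get
   less than that alignment; everything else keeps what it was given. */
int align_at_least_compat(int align, long size_in_bits, int compat)
{
    assert(align > 0 && compat > 0);
    if (size_in_bits >= compat && align < compat)
        return compat;
    return align;
}
```

For example, with 64-bit words and a large MAX_OFILE_ALIGNMENT the floor is 256 bits at -O2 and 64 bits at -Os.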


Re: [PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 3:19 PM, H.J. Lu  wrote:
> ix86_split_lea_for_addr transforms a single LEA instruction into a series
> of MOV and ADD instructions.  For
>
> lea 0x400(%eax, %ecx, 8), %edx
>
> we get
>
> mov %eax, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add %ecx, %edx
> add $0x400, %edx
>
> For -mtune=intel, we want to turn on X86_TUNE_OPT_AGU, but avoid
> ix86_split_lea_for_addr.  This patch adds X86_TUNE_AVOID_LEA_FOR_ADDR
> and PROCESSOR_INTEL.  We keep PROCESSOR_INTEL the same as
> PROCESSOR_SILVERMONT, except that X86_TUNE_AVOID_LEA_FOR_ADDR isn't
> turned on for PROCESSOR_INTEL.  OK for trunk?

As said earlier, m_INTEL is not a processor, but equals a REAL
processor, so the patch is not acceptable.

Uros.


[committed] Honza's alias/weakref fix (PR c++/57945)

2014-01-17 Thread Jakub Jelinek
Hi!

I've bootstrapped/regtested on x86_64-linux and i686-linux the
following patch from Honza, taken from the PR comment (written a month
ago, acked by Jason a fortnight ago), and committed it to trunk.

2014-01-17  Jan Hubicka  

PR c++/57945
* passes.c (rest_of_decl_compilation): Don't call varpool_finalize_decl
on decls for which assemble_alias has been called.

2014-01-17  Jakub Jelinek  

PR c++/57945
* c-c++-common/torture/pr57945.c: New test.

--- gcc/passes.c.jj 2014-01-03 11:41:01.0 +0100
+++ gcc/passes.c2014-01-17 11:51:15.767231295 +0100
@@ -187,6 +187,8 @@ rest_of_decl_compilation (tree decl,
  int top_level,
  int at_end)
 {
+  bool finalize = true;
+
   /* We deferred calling assemble_alias so that we could collect
  other attributes such as visibility.  Emit the alias now.  */
   if (!in_lto_p)
@@ -203,6 +205,7 @@ rest_of_decl_compilation (tree decl,
DECL_EXTERNAL (decl) = 0;
TREE_STATIC (decl) = 1;
assemble_alias (decl, alias);
+   finalize = false;
   }
   }
 
@@ -234,7 +237,7 @@ rest_of_decl_compilation (tree decl,
 rebuild it.  */
  if (in_lto_p && !at_end)
;
- else if (TREE_CODE (decl) != FUNCTION_DECL)
+ else if (finalize && TREE_CODE (decl) != FUNCTION_DECL)
varpool_finalize_decl (decl);
}
 
--- gcc/testsuite/c-c++-common/torture/pr57945.c.jj 2014-01-17 
11:52:02.384990286 +0100
+++ gcc/testsuite/c-c++-common/torture/pr57945.c2014-01-17 
11:50:03.0 +0100
@@ -0,0 +1,11 @@
+/* PR c++/57945 */
+/* { dg-do compile } */
+
+extern int j;
+static int i __attribute__((weakref("j")));
+
+int
+foo (void)
+{
+  return &i ? i : 0;
+}

Jakub


Re: [Patch][AArch64] vneg floating point testcase BE fixed

2014-01-17 Thread Alex Velenko

Hi,
Here are some more improvements on changelog entry:

gcc/testsuite/

2013-01-16  Alex Velenko  

* gcc.target/aarch64/vneg_f.c (STORE_INST): New macro.
(RUN_TEST): Use new macro.
(INDEX64_32): Delete.
(INDEX64_64): Likewise.
(INDEX128_32): Likewise.
(INDEX128_64): Likewise.
(INDEX): Likewise.
(test_vneg_f32): Use fixed RUN_TEST.
(test_vneg_f64): Likewise.
(test_vnegq_f32): Likewise.
(test_vnegq_f64): Likewise.


Kind regards,
Alex Velenko

On 16/01/14 16:58, Richard Earnshaw wrote:

On 16/01/14 12:23, Alex Velenko wrote:

Hi,
This patch fixes the vneg_f.c testcase, which was using an inconsistent
vector model that caused problems for big-endian compilers.

Now the testcase runs on both LE and BE without regressions.

Is it okay?

Kind regards,
Alex Velenko

gcc/testsuite/

2013-01-16  Alex Velenko  

   */gcc.target/aarch64/vneg_f.c (STORE_INST): ST1 macro added.

Just say: "New macro."


   (RUN_TEST): Macro updated to use STORE_INST.

"Use it."


   (test_vneg_f32): Changed to provide definitions for RUN_TEST.

"Use RUN_TEST."


   (test_vneg_f64): Likewise.
   (test_vnegq_f32): Likewise.
   (test_vnegq_f64): Likewise.



You also need to mention the INDEX* macros that you've removed.  Just
say "Delete."



Vneg_fix.patch


diff --git a/gcc/testsuite/gcc.target/aarch64/vneg_f.c 
b/gcc/testsuite/gcc.target/aarch64/vneg_f.c
index 
1eaf21d34eb57b4e7e5388a4686fe6341197447a..01503028547f320ab3d8ea725ff09ee5d0487f18
 100644
--- a/gcc/testsuite/gcc.target/aarch64/vneg_f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vneg_f.c
@@ -44,34 +44,27 @@ extern void abort (void);
  #define DATA_TYPE_64 double
  #define DATA_TYPE(data_len) DATA_TYPE_##data_len

-#define INDEX64_32 [i]
-#define INDEX64_64
-#define INDEX128_32 [i]
-#define INDEX128_64 [i]
-#define INDEX(reg_len, data_len) \
-  CONCAT1 (INDEX, reg_len##_##data_len)
-
+#define STORE_INST(reg_len, data_len) \
+  CONCAT1 (vst1, POSTFIX (reg_len, data_len))
  #define LOAD_INST(reg_len, data_len) \
CONCAT1 (vld1, POSTFIX (reg_len, data_len))
  #define NEG_INST(reg_len, data_len) \
CONCAT1 (vneg, POSTFIX (reg_len, data_len))

  #define INHIB_OPTIMIZATION asm volatile ("" : : : "memory")
-
-#define RUN_TEST(test_set, reg_len, data_len, n, a, b) \
+#define RUN_TEST(test_set, reg_len, data_len, n, a, b, c) \
{  \
  int i;   \
  (a) = LOAD_INST (reg_len, data_len) (test_set);\
  (b) = NEG_INST (reg_len, data_len) (a);  \
+STORE_INST (reg_len, data_len) (c, b);\
  for (i = 0; i < n; i++)   \
{  \
DATA_TYPE (data_len) diff; \
INHIB_OPTIMIZATION;\
-   diff   \
- = a INDEX (reg_len, data_len)\
-   + b INDEX (reg_len, data_len); \
+   diff = test_set[i] + c[i]; \
if (diff > EPSILON) \
- return 1;\
+   return 1;  \
}  \
}

@@ -84,28 +77,29 @@ extern void abort (void);
  int
  test_vneg_f32 ()
  {
-  float test_set0[2] = { TEST0, TEST1 };
-  float test_set1[2] = { TEST2, TEST3 };
-  float test_set2[2] = { VAR_MAX, VAR_MIN };
-  float test_set3[2] = { INFINITY, NAN };
-
float32x2_t a;
float32x2_t b;
+  float32_t c[2];

-  RUN_TEST (test_set0, 64, 32, 2, a, b);
-  RUN_TEST (test_set1, 64, 32, 2, a, b);
-  RUN_TEST (test_set2, 64, 32, 2, a, b);
-  RUN_TEST (test_set3, 64, 32, 0, a, b);
+  float32_t test_set0[2] = { TEST0, TEST1 };
+  float32_t test_set1[2] = { TEST2, TEST3 };
+  float32_t test_set2[2] = { VAR_MAX, VAR_MIN };
+  float32_t test_set3[2] = { INFINITY, NAN };
+
+  RUN_TEST (test_set0, 64, 32, 2, a, b, c);
+  RUN_TEST (test_set1, 64, 32, 2, a, b, c);
+  RUN_TEST (test_set2, 64, 32, 2, a, b, c);
+  RUN_TEST (test_set3, 64, 32, 0, a, b, c);

/* Since last test cannot be checked in a uniform way by adding
   negation result to original value, the number of lanes to be
   checked in RUN_TEST is 0 (last argument).  Instead, result
   will be checked manually.  */

-  if (b[0] != -INFINITY)
+  if (c[0] != -INFINITY)
  return 1;

-  if (!__builtin_isnan (b[1]))
+  if (!__builtin_isnan (c[1]))
  return 1;

return 0;
@@ -130,37 +124,38 @@ test_vneg_f64 ()
  {
float64x1_t a;
float64x1_t b;
-
-  double test_set0[1] = { TEST0 };
-  double test_set1[1] = { TEST1 };
-  double test_set2[1] = { TEST2 };
-  double test_set3[1] = { TEST3 };
-  double test_set4[1] = { VAR_MAX };
-  double test_set5[1] = { VAR_MIN };
-  double test_set6[1]

[build, libgcc] Ensure libgcc_s unwinder is always used on 64-bit Solaris 10+/x86 (PR target/59788)

2014-01-17 Thread Rainer Orth
As described in the PR, the 64-bit Solaris 10+/x86 libc contains an
implementation of those _Unwind_* functions required by the AMD64 ABI,
i.e. those contained in libgcc_s.so.1 at version GCC_3.0.

If by some circumstance (use of -Bdirect, -z lazyload, maybe others)
libc.so.1 happens to be searched by ld.so.1 before libgcc_s.so.1 and
some library (e.g. libstdc++.so.6) uses functions both from GCC_3.0
(then resolved from libc.so.1) and others (resolved from libgcc_s.so.1),
crashes result due to mixing those different implementations with
different internal data structures.

To avoid this, I suggest linking libgcc_s.so.1 with a mapfile that
enforces direct binding to the libgcc_s.so.1 implementation to avoid
that mixture.

The following patch does just that.  Initially, I tried to only use the
mapfile when -lgcc_s is used, but libtool often links shared objects
with -shared -nostdlib, adding -lgcc_s -lc -lgcc_s itself (for whatever
reason it deems appropriate to second-guess the compiler driver here).
Therefore I'm keying the mapfile use off -shared resp. -shared-libgcc
instead.

Unfortunately, the patch needs a change to the bundled ltmain.sh: by
default, libtool `optimizes' -lgcc_s -lc -lgcc_s into -lc -lgcc_s.
Combined with direct binding, this leads to exactly the failure the patch
intends to avoid.  The libtool bug has already been reported and a patch
proposed:

http://lists.gnu.org/archive/html/libtool-patches/2014-01/msg5.html

The patch has been tested on i386-pc-solaris2.{9,10,11} and
sparc-sun-solaris2.{9,10,11} with Sun as/ld and on i386-pc-solaris2.10
with Sun as/GNU ld.

I don't need approval for the Solaris specific parts, but another pair
of eyes would certainly be helpful.

One potential issue would be a version of gcc containing the patch used
with a libtool without the change.  The last libtool release was almost
two years ago, so this is quite a likely condition.  Fortunately,
problems would only ensue if some 64-bit Solaris/x86 program/library
uses the gcc extensions to the AMD64 unwinder.  According to a code
search, uses of those functions are very rare outside of gcc, and the
problem can be worked around by invoking libtool with
--preserve-dup-deps, so I consider this risk acceptable.

Rainer
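
To make the ordering problem concrete, here is a toy model of the de-duplication described above. It is not libtool's code: the function names are hypothetical, and "keep only the last occurrence of each flag" is an assumption that merely reproduces the one example given (-lgcc_s -lc -lgcc_s becoming -lc -lgcc_s).

```c
#include <assert.h>
#include <string.h>

/* Keeps only the last occurrence of each library flag, preserving the
   order of the survivors.  Writes them to out and returns their count. */
size_t drop_earlier_duplicates(const char *const libs[], size_t n,
                               const char *out[])
{
    size_t kept = 0;

    assert(libs != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int repeated_later = 0;
        for (size_t j = i + 1; j < n; j++)
            if (strcmp(libs[i], libs[j]) == 0)
                repeated_later = 1;
        if (!repeated_later)
            out[kept++] = libs[i];
    }
    return kept;
}

/* Returns the first library left after de-duplicating the link line
   discussed above. */
const char *first_after_dedup(void)
{
    const char *const libs[] = { "-lgcc_s", "-lc", "-lgcc_s" };
    const char *out[3];
    size_t kept = drop_earlier_duplicates(libs, 3, out);

    return kept > 0 ? out[0] : NULL;
}
```

In this model -lc ends up ahead of -lgcc_s, which is the search order that lets libc's _Unwind_* functions be picked up first.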


2014-01-14  Rainer Orth  

gcc:
PR target/59788
* config/sol2.h (LINK_LIBGCC_MAPFILE_SPEC): Define.
(LINK_SPEC): Use it for -shared, -shared-libgcc.

libgcc:
PR target/59788
* config/t-slibgcc-sld (libgcc-unwind.map): New target.
(install-libgcc-unwind-map-forbuild): New target.
(all): Depend on install-libgcc-unwind-map-forbuild.
(install-libgcc-unwind-map): New target.
(install): Depend on install-libgcc-unwind-map.

gcc/testsuite:
PR target/59788
* g++.dg/eh/unwind-direct.C: New test.

libgo:
PR target/59788
* config/ltmain.sh (opt_duplicate_compiler_generated_deps): Enable on
*solaris2*.

toplevel:
PR target/59788
* ltmain.sh (opt_duplicate_compiler_generated_deps): Enable on
*solaris2*.

# HG changeset patch
# Parent a6e6484e3cdf3a53d0e325f3faf34e291f8469fb
Ensure libgcc_s unwinder is always used on 64-bit Solaris 10+/x86 (PR target/59788)

diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -174,12 +174,21 @@ along with GCC; see the file COPYING3.  
 #define RDYNAMIC_SPEC "--export-dynamic"
 #endif
 
+#ifndef USE_GLD
+/* With Sun ld, use mapfile to enforce direct binding to libgcc_s unwinder.  */
+#define LINK_LIBGCC_MAPFILE_SPEC \
+  "%{shared|shared-libgcc:-M %slibgcc-unwind.map}"
+#else
+/* GNU ld doesn't support direct binding.  */
+#define LINK_LIBGCC_MAPFILE_SPEC ""
+#endif
+
 #undef  LINK_SPEC
 #define LINK_SPEC \
   "%{h*} %{v:-V} \
%{!shared:%{!static:%{rdynamic: " RDYNAMIC_SPEC "}}} \
%{static:-dn -Bstatic} \
-   %{shared:-G -dy %{!mimpure-text:-z text}} \
+   %{shared:-G -dy %{!mimpure-text:-z text}} " LINK_LIBGCC_MAPFILE_SPEC " \
%{symbolic:-Bsymbolic -G -dy -z text} \
%(link_arch) \
%{Qy:} %{!Qn:-Qy}"
diff --git a/gcc/testsuite/g++.dg/eh/unwind-direct.C b/gcc/testsuite/g++.dg/eh/unwind-direct.C
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/g++.dg/eh/unwind-direct.C
@@ -0,0 +1,15 @@
+// PR target/59788
+// { dg-do run { target { *-*-solaris2* && { ! gld } } } }
+// { dg-options "-Wl,-Bdirect" }
+
+#include 
+
+int
+main(void)
+{
+  try
+{ throw std::runtime_error( "Catch me if you can!"); }
+  catch(...)
+{ return 0; }
+  return 1;
+}
diff --git a/libgcc/config/t-slibgcc-sld b/libgcc/config/t-slibgcc-sld
--- a/libgcc/config/t-slibgcc-sld
+++ b/libgcc/config/t-slibgcc-sld
@@ -3,3 +3,26 @@
 
 SHLIB_LDFLAGS = -Wl,-h,$(SHLIB_SONAME) -Wl,-z,text -Wl,-z,defs \
 	-Wl,-M,$(SHLIB_MAP)
+
+# Linker mapfile to enforce direct binding to libgcc_s unwinder
+# (PR target/59788).
+libgcc-unwind.map: libgcc-std.ver
+

[PATCH] Add X86_TUNE_AVOID_LEA_FOR_ADDR

2014-01-17 Thread H.J. Lu
ix86_split_lea_for_addr transforms a single LEA instruction into a series
of MOV and ADD instructions.  For

lea 0x400(%eax, %ecx, 8), %edx

we get

mov %eax, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add %ecx, %edx
add $0x400, %edx

For -mtune=intel, we want to turn on X86_TUNE_OPT_AGU, but avoid
ix86_split_lea_for_addr.  This patch adds X86_TUNE_AVOID_LEA_FOR_ADDR
and PROCESSOR_INTEL.  We keep PROCESSOR_INTEL the same as
PROCESSOR_SILVERMONT, except that X86_TUNE_AVOID_LEA_FOR_ADDR isn't
turned on for PROCESSOR_INTEL.  OK for trunk?

Thanks.


H.J.
---
 gcc/config/i386/i386-c.c |  2 +
 gcc/config/i386/i386.c   | 93 +---
 gcc/config/i386/i386.h   |  4 ++
 gcc/config/i386/x86-tune.def | 68 +++-
 5 files changed, 162 insertions(+), 32 deletions(-)
 create mode 100644 ChangeLog.intel

gcc/

2014-01-17  H.J. Lu  

* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_INTEL.  Treat like PROCESSOR_GENERIC.
* config/i386/i386.c (intel_memcpy): New.  Duplicate slm_memcpy.
(intel_memset): New.  Duplicate slm_memset.
(intel_cost): New.  Duplicate slm_cost.
(m_INTEL): New macro.
(processor_target_table): Add "intel".
(ix86_option_override_internal): Replace PROCESSOR_SILVERMONT
with PROCESSOR_INTEL for "intel".
(ix86_lea_outperforms): Support PROCESSOR_INTEL.  Duplicate
PROCESSOR_SILVERMONT.
(ix86_avoid_lea_for_addr): Check TARGET_AVOID_LEA_FOR_ADDR
instead of TARGET_OPT_AGU.
(ix86_issue_rate): Likewise.
(ix86_adjust_cost): Likewise.
(ia32_multipass_dfa_lookahead): Likewise.
(swap_top_of_ready_list): Likewise.
(ix86_sched_reorder): Likewise.
* config/i386/i386.h (TARGET_INTEL): New.
(TARGET_AVOID_LEA_FOR_ADDR): Likewise.
(processor_type): Add PROCESSOR_INTEL.
* config/i386/x86-tune.def: Support m_INTEL. Duplicate
m_SILVERMONT.  Add X86_TUNE_AVOID_LEA_FOR_ADDR.
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 9686382..ce9ba95 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -174,6 +174,7 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 /* use PROCESSOR_max to not set/unset the arch macro.  */
 case PROCESSOR_max:
   break;
+case PROCESSOR_INTEL:
 case PROCESSOR_GENERIC:
   gcc_unreachable ();
 }
@@ -276,6 +277,7 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
   def_or_undef (parse_in, "__tune_slm__");
   def_or_undef (parse_in, "__tune_silvermont__");
   break;
+case PROCESSOR_INTEL:
 case PROCESSOR_GENERIC:
   break;
 /* use PROCESSOR_max to not set/unset the tune macro.  */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index df408ae..82753fd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1747,6 +1747,83 @@ struct processor_costs slm_cost = {
   1,   /* cond_not_taken_branch_cost.  */
 };
 
+static stringop_algs intel_memcpy[2] = {
+  {libcall, {{11, loop, false}, {-1, rep_prefix_4_byte, false}}},
+  {libcall, {{32, loop, false}, {64, rep_prefix_4_byte, false},
+ {8192, rep_prefix_8_byte, false}, {-1, libcall, false}}}};
+static stringop_algs intel_memset[2] = {
+  {libcall, {{8, loop, false}, {15, unrolled_loop, false},
+ {2048, rep_prefix_4_byte, false}, {-1, libcall, false}}},
+  {libcall, {{24, loop, false}, {32, unrolled_loop, false},
+ {8192, rep_prefix_8_byte, false}, {-1, libcall, false}}}};
+static const
+struct processor_costs intel_cost = {
+  COSTS_N_INSNS (1),   /* cost of an add instruction */
+  COSTS_N_INSNS (1) + 1,   /* cost of a lea instruction */
+  COSTS_N_INSNS (1),   /* variable shift costs */
+  COSTS_N_INSNS (1),   /* constant shift costs */
+  {COSTS_N_INSNS (3),  /* cost of starting multiply for QI */
+   COSTS_N_INSNS (3),  /*   HI */
+   COSTS_N_INSNS (3),  /*   SI */
+   COSTS_N_INSNS (4),  /*   DI */
+   COSTS_N_INSNS (2)}, /*other */
+  0,   /* cost of multiply per each bit set */
+  {COSTS_N_INSNS (18), /* cost of a divide/mod for QI */
+   COSTS_N_INSNS (26), /*  HI */
+   COSTS_N_INSNS (42), /*  SI */
+   COSTS_N_INSNS (74), /*  DI */
+   COSTS_N_INSNS (74)},/*  other */
+  COSTS_N_INSNS (1),   /* cost of movsx */
+  COSTS_N_INSNS (1),  

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-17 Thread Jakub Jelinek
On Tue, Jan 14, 2014 at 08:12:41PM +0100, Jakub Jelinek wrote:
> For 4.9, if what you've added is what you want to do for performance
> reasons, then I'd do something like:

Ok, here it is in the form of a patch, bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?

2014-01-17  Jakub Jelinek  

* config/i386/i386.c (ix86_data_alignment): For compatibility with
(incorrect) GCC 4.8 and earlier alignment assumptions ensure we align
decls to at least the GCC 4.8 used alignments.
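
The two-step alignment bump that this ChangeLog entry describes can be
sketched outside the compiler roughly as follows.  This is only an
illustration: data_alignment_sketch and its parameters are hypothetical
names, sizes and alignments are plain bit counts, and the real
ix86_data_alignment works on trees and has further rules that are not
modelled here.

```c
/* Hypothetical stand-in for the aggregate-size part of ix86_data_alignment.
   All quantities are in bits; size_bits models TYPE_SIZE of an aggregate.  */
static int
data_alignment_sketch (unsigned long size_bits, int align,
                       int max_align, int max_align_compat)
{
  /* First, never go below the alignment GCC 4.8 and earlier assumed.  */
  if (size_bits >= (unsigned long) max_align_compat
      && align < max_align_compat)
    align = max_align_compat;

  /* Then apply the current limit, which may be larger.  */
  if (size_bits >= (unsigned long) max_align
      && align < max_align)
    align = max_align;

  return align;
}
```

With the assumed values max_align = 512 and max_align_compat = 256, an
object of 256 bits is raised to 256-bit alignment by the first check only,
while an object of 512 bits or more is raised to 512 by the second.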

--- gcc/config/i386/i386.c.jj   2014-01-16 20:22:50.0 +0100
+++ gcc/config/i386/i386.c  2014-01-17 11:56:51.183501322 +0100
@@ -26433,6 +26433,15 @@ ix86_constant_alignment (tree exp, int a
 int
 ix86_data_alignment (tree type, int align, bool opt)
 {
+  /* GCC 4.8 and earlier used to incorrectly assume this alignment even
+ for symbols from other compilation units or symbols that don't need
+ to bind locally.  In order to preserve some ABI compatibility with
+ those compilers, ensure we don't decrease alignment from what we
+ used to assume.  */
+
+  int max_align_compat
+= optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
+
   /* A data structure, equal or greater than the size of a cache line
  (64 bytes in the Pentium 4 and other recent Intel processors, including
  processors based on Intel Core microarchitecture) should be aligned
@@ -26447,11 +26456,17 @@ ix86_data_alignment (tree type, int alig
   if (opt
   && AGGREGATE_TYPE_P (type)
   && TYPE_SIZE (type)
-  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
-  && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
- || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
-  && align < max_align)
-align = max_align;
+  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
+{
+  if ((TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align_compat
+  || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
+ && align < max_align_compat)
+   align = max_align_compat;
+  if ((TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
+  || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
+ && align < max_align)
+   align = max_align;
+}
 
   /* x86-64 ABI requires arrays greater than 16 bytes to be aligned
  to 16byte boundary.  */


Jakub


[PATCH] Fix up gen-vect-32.c testcase (PR testsuite/58776)

2014-01-17 Thread Jakub Jelinek
Hi!

For -O2 vectorization by default GCC 4.9 uses the cheap cost model,
which disallows e.g. peeling for alignment, but on strict alignment
targets the loop needs to be peeled for alignment, otherwise vectorization
isn't performed.

Fixed by just adding -fno-vect-cost-model to dg-options,
bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-17  Jakub Jelinek  

PR testsuite/58776
* gcc.dg/tree-ssa/gen-vect-32.c: Add -fno-vect-cost-model to
dg-options, use dg-additional-options for i?86/x86_64 to avoid
option duplication.

--- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c.jj  2012-10-03 09:01:35.0 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c 2014-01-17 11:03:27.039224079 +0100
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fno-vect-cost-model" } */
+/* { dg-additional-options "-mno-sse" { target { i?86-*-* x86_64-*-* } } } */
 
 #include 
 

Jakub


[PATCH] Handle NAMELIST_DECLs in tree-nested (PR fortran/59440)

2014-01-17 Thread Jakub Jelinek
Hi!

The following testcases ICE because, when tree-nested.c replaces
various decls in the functions with local or nonlocal alternative decls
and makes the old ones DECL_IGNORED_P, the decls listed in a NAMELIST_DECL's
NAMELIST_DECL_ASSOCIATED_DECL aren't replaced, and thus debug info can't
be generated for the namelist.

Fixed by adjusting them.  Bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2014-01-17  Jakub Jelinek  

PR fortran/59440
* tree-nested.c (convert_nonlocal_reference_stmt,
convert_local_reference_stmt): For NAMELIST_DECLs in gimple_bind_vars
of GIMPLE_BIND stmts, adjust associated decls.

* gfortran.dg/pr59440-1.f90: New test.
* gfortran.dg/pr59440-2.f90: New test.
* gfortran.dg/pr59440-3.f90: New test.

--- gcc/tree-nested.c.jj  2014-01-03 11:41:01.0 +0100
+++ gcc/tree-nested.c   2014-01-17 10:17:58.287209744 +0100
@@ -1331,6 +1331,25 @@ convert_nonlocal_reference_stmt (gimple_
   if (!optimize && gimple_bind_block (stmt))
note_nonlocal_block_vlas (info, gimple_bind_block (stmt));
 
+  for (tree var = gimple_bind_vars (stmt); var; var = DECL_CHAIN (var))
+   if (TREE_CODE (var) == NAMELIST_DECL)
+ {
+   /* Adjust decls mentioned in NAMELIST_DECL.  */
+   tree decls = NAMELIST_DECL_ASSOCIATED_DECL (var);
+   tree decl;
+   unsigned int i;
+
+   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (decls), i, decl)
+ {
+   if (TREE_CODE (decl) == VAR_DECL
+   && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
+ continue;
+   if (decl_function_context (decl) != info->context)
+ CONSTRUCTOR_ELT (decls, i)->value
+   = get_nonlocal_debug_decl (info, decl);
+ }
+ }
+
   *handled_ops_p = false;
   return NULL_TREE;
 
@@ -1787,6 +1806,36 @@ convert_local_reference_stmt (gimple_stm
   *handled_ops_p = false;
   return NULL_TREE;
 
+case GIMPLE_BIND:
+  for (tree var = gimple_bind_vars (stmt); var; var = DECL_CHAIN (var))
+   if (TREE_CODE (var) == NAMELIST_DECL)
+ {
+   /* Adjust decls mentioned in NAMELIST_DECL.  */
+   tree decls = NAMELIST_DECL_ASSOCIATED_DECL (var);
+   tree decl;
+   unsigned int i;
+
+   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (decls), i, decl)
+ {
+   if (TREE_CODE (decl) == VAR_DECL
+   && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
+ continue;
+   if (decl_function_context (decl) == info->context
+   && !use_pointer_in_frame (decl))
+ {
+   tree field = lookup_field_for_decl (info, decl, NO_INSERT);
+   if (field)
+ {
+   CONSTRUCTOR_ELT (decls, i)->value
+ = get_local_debug_decl (info, decl, field);
+ }
+ }
+ }
+ }
+
+  *handled_ops_p = false;
+  return NULL_TREE;
+
 default:
   /* For every other statement that we are not interested in
 handling here, let the walker traverse the operands.  */
--- gcc/testsuite/gfortran.dg/pr59440-1.f90.jj  2014-01-17 10:19:41.038685480 +0100
+++ gcc/testsuite/gfortran.dg/pr59440-1.f90 2014-01-17 10:19:05.0 +0100
@@ -0,0 +1,23 @@
+! PR fortran/59440
+! { dg-do compile }
+! { dg-options "-O2 -g" }
+
+module pr59440
+  implicit none
+  type t
+ integer :: grid = 0
+  end type t
+contains
+  subroutine read_nml (nnml, s)
+integer, intent(in)  :: nnml
+type(t), intent(out) :: s
+integer  :: grid
+namelist /N/ grid
+call read_nml_type_2
+s%grid = grid
+  contains
+subroutine read_nml_type_2
+  read (nnml, nml=N)
+end subroutine read_nml_type_2
+  end subroutine read_nml
+end module pr59440
--- gcc/testsuite/gfortran.dg/pr59440-2.f90.jj  2014-01-17 10:19:44.725667756 +0100
+++ gcc/testsuite/gfortran.dg/pr59440-2.f90 2014-01-17 10:19:20.0 +0100
@@ -0,0 +1,16 @@
+! PR fortran/59440
+! { dg-do compile }
+! { dg-options "-O2 -g" }
+
+subroutine foo (nnml, outv)
+  integer, intent(in) :: nnml
+  integer, intent(out) :: outv
+  integer :: grid
+  namelist /N/ grid
+  read (nnml, nml=N)
+  call bar
+contains
+  subroutine bar
+outv = grid
+  end subroutine bar
+end subroutine foo
--- gcc/testsuite/gfortran.dg/pr59440-3.f90.jj  2014-01-17 10:19:46.910654994 +0100
+++ gcc/testsuite/gfortran.dg/pr59440-3.f90 2014-01-17 10:19:23.0 +0100
@@ -0,0 +1,16 @@
+! PR fortran/59440
+! { dg-do compile }
+! { dg-options "-O2 -g" }
+
+subroutine foo (nnml, outv)
+  integer, intent(in) :: nnml
+  integer, intent(out) :: outv
+  integer :: grid
+  call bar
+  outv = grid
+contains
+  subroutine bar
+namelist /N/ grid
+read (nnml, nml=N)
+  end subroutine ba

[PATCH] Fix up ivdep/do concurrent testcases (PR testsuite/59064)

2014-01-17 Thread Jakub Jelinek
Hi!

I believe the intent of these testcases was to verify the loops
are vectorized and don't need versioning for alias, which is the only thing
these constructs tell the compiler about.  On some architectures, it is
possible the loops will need versioning for alignment (or peeling for
alignment or just using unaligned loads and/or stores in the vectorized
loop).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
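
To illustrate why a single combined pattern is more permissive than the two
separate dg-bogus checks, here is a small sketch using POSIX regcomp and
regexec.  It is an approximation only: DejaGnu matches with Tcl regular
expressions, and line_matches, combined_pattern and the sample messages
mentioned below are made up for the example.

```c
#include <regex.h>
#include <stddef.h>

/* The combined pattern: " version" and " alias" on the same line.  */
static const char combined_pattern[] = " version[^\n\r]* alias";

/* Return 1 if PATTERN matches somewhere in TEXT, 0 if it does not,
   and -1 if the pattern fails to compile.  */
static int
line_matches (const char *pattern, const char *text)
{
  regex_t re;
  int matched;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

A message such as "note: loop versioned for alias checks" matches the
combined pattern, while one that mentions only versioning for alignment
does not, so only versioning for alias would be flagged.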

2014-01-17  Jakub Jelinek  

PR testsuite/59064
* gcc.dg/vect/vect-ivdep-1.c: Replace two dg-bogus lines separately
testing for " version" and " alias" with one testing for
" version\[^\n\r]* alias".
* gcc.dg/vect/vect-ivdep-2.c: Likewise.
* gfortran.dg/vect/vect-do-concurrent-1.f90: Likewise.
* g++.dg/vect/pr33426-ivdep.cc: Likewise.
* g++.dg/vect/pr33426-ivdep-2.cc: Likewise.
* g++.dg/vect/pr33426-ivdep-3.cc: Likewise.
* g++.dg/vect/pr33426-ivdep-4.cc: Adjust comments similarly.

--- gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c.jj 2013-11-12 11:31:19.0 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c  2014-01-17 08:38:37.919749235 +0100
@@ -14,6 +14,5 @@ void foo(int n, int *a, int *b, int *c,
 }
 
 /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
-/* { dg-bogus " version" "" { target *-*-* } 0 } */
-/* { dg-bogus " alias" "" { target *-*-* } 0 } */
+/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
--- gcc/testsuite/gcc.dg/vect/vect-ivdep-2.c.jj 2013-11-12 11:31:19.0 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-ivdep-2.c  2014-01-17 08:38:55.854652038 +0100
@@ -30,6 +30,5 @@ void bar(int n, int *a, int *b, int *c)
 
 
 /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
-/* { dg-bogus " version" "" { target *-*-* } 0 } */
-/* { dg-bogus " alias" "" { target *-*-* } 0 } */
+/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
--- gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90.jj  2013-11-12 11:31:16.0 +0100
+++ gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90 2014-01-17 08:41:58.744714758 +0100
@@ -12,6 +12,5 @@ subroutine test(n, a, b, c)
 end subroutine test
 
 ! { dg-message "loop vectorized" "" { target *-*-* } 0 }
-! { dg-bogus " version" "" { target *-*-* } 0 }
-! { dg-bogus " alias" "" { target *-*-* } 0 }
+! { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 }
 ! { dg-final { cleanup-tree-dump "vect" } }
--- gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc.jj   2013-11-12 11:31:20.0 +0100
+++ gcc/testsuite/g++.dg/vect/pr33426-ivdep.cc  2014-01-17 08:40:08.534286245 +0100
@@ -14,6 +14,5 @@ void foo(int n, int *a, int *b, int *c,
 }
 
 /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
-/* { dg-bogus " version" "" { target *-*-* } 0 } */
-/* { dg-bogus " alias" "" { target *-*-* } 0 } */
+/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
--- gcc/testsuite/g++.dg/vect/pr33426-ivdep-2.cc.jj 2013-11-12 11:31:20.0 +0100
+++ gcc/testsuite/g++.dg/vect/pr33426-ivdep-2.cc  2014-01-17 08:40:23.557207616 +0100
@@ -29,8 +29,7 @@ void bar(int n, int *a, int *b, int *c)
 }
 
 /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
-/* { dg-bogus " version" "" { target *-*-* } 0 } */
-/* { dg-bogus " alias" "" { target *-*-* } 0 } */
+/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
 /* { dg-final { scan-tree-dump-times "ANNOTATE_EXPR " 2 "original" } } */
--- gcc/testsuite/g++.dg/vect/pr33426-ivdep-3.cc.jj 2013-11-12 11:31:20.0 +0100
+++ gcc/testsuite/g++.dg/vect/pr33426-ivdep-3.cc  2014-01-17 08:40:31.833150629 +0100
@@ -15,8 +15,7 @@ void foo(int *a) {
 }
 
 /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
-/* { dg-bogus " version" "" { target *-*-* } 0 } */
-/* { dg-bogus " alias" "" { target *-*-* } 0 } */
+/* { dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0 } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
 /* { dg-final { scan-tree-dump-times "ANNOTATE_EXPR " 1 "original" } } */
--- gcc/testsuite/g++.dg/vect/pr33426-ivdep-4.cc.jj 2013-11-12 11:31:20.0 +0100
+++ gcc/testsuite/g++.dg/vect/pr33426-ivdep-4.cc  2014-01-17 08:40:56.073057959 +0100
@@ -20,8 +20,7 @@ void foo(std::vector *ar, int *b) {
 }
 
 /* { dg-message "loop vectorized" "" { target *-*-* } 0 } */
-/* FIXME: dg-bogus " version" "" { target *-*-* } 0  */
-/* FIXME: dg-bogus " alias" "" { target *-*-* } 0  */
+/* FIXME: dg-bogus " version\[^\n\r]* alias" "" { target *-*-* } 0  */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
 /* { dg-final { scan-tree-dump-times "ANNOTATE_EXPR " 1 "original" } } */

Jakub


[C++ PATCH] Don't segv in cvt.c (PR c++/59838)

2014-01-17 Thread Marek Polacek
This prevents a segfault in ocp_convert.  If the ENUM_UNDERLYING_TYPE
is NULL, don't call int_fits_type_p on it.  We now give a meaningful error
message on the attached testcase; it's the same message that GCC
3.4/4.0.2/4.4 gave (those versions didn't ICE on it).

Bootstrapped/regtested on x86_64-linux, ok for trunk/4.8/4.7?

2014-01-17  Marek Polacek  

PR c++/59838
cp/
* cvt.c (ocp_convert): Don't segfault on non-existing
ENUM_UNDERLYING_TYPE.
testsuite/
* g++.dg/diagnostic/pr59838.C: New test.

--- gcc/cp/cvt.c.mp3  2014-01-17 12:01:20.926793491 +0100
+++ gcc/cp/cvt.c  2014-01-17 12:01:55.297920771 +0100
@@ -753,6 +753,7 @@ ocp_convert (tree type, tree expr, int c
 unspecified.  */
  if ((complain & tf_warning)
  && TREE_CODE (e) == INTEGER_CST
+ && ENUM_UNDERLYING_TYPE (type)
  && !int_fits_type_p (e, ENUM_UNDERLYING_TYPE (type)))
warning_at (loc, OPT_Wconversion, 
"the result of the conversion is unspecified because "
--- gcc/testsuite/g++.dg/diagnostic/pr59838.C.mp3   2014-01-17 12:21:08.111663754 +0100
+++ gcc/testsuite/g++.dg/diagnostic/pr59838.C   2014-01-17 12:22:08.192886710 +0100
@@ -0,0 +1,4 @@
+// PR c++/59838
+// { dg-do compile }
+
+enum E { a, b = (E) a }; // { dg-error "conversion to incomplete type" }

Marek


[PATCH] Don't fold zero-sized elements (PR c/58346)

2014-01-17 Thread Marek Polacek
This is the real fix for PR58346.  I'd say the easiest solution is
simply not to fold zero-sized elements.
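
As a rough illustration of the guard (not the actual gimple-fold.c code),
the sketch below refuses to compute an element index when the element size
is zero.  The function array_index_for_offset and its parameters are
hypothetical, sizes are plain byte counts rather than double_int values,
and the assumption that the index is derived by dividing the offset by the
element size is mine, not something stated in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the early-out: return false ("don't fold") when
   the access is larger than one element or the element is zero-sized;
   otherwise store the element index for OFFSET in *INDEX.  */
static bool
array_index_for_offset (size_t offset, size_t elt_size, size_t access_size,
                        size_t *index)
{
  if (elt_size < access_size || elt_size == 0)
    return false;

  /* Safe only because elt_size is known to be nonzero here.  */
  *index = offset / elt_size;
  return true;
}
```

For an array of empty structs, as in the new testcase, elt_size would be 0
and the sketch returns false without touching *index.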

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-01-17  Marek Polacek  

PR c/58346
* gimple-fold.c (fold_array_ctor_reference): Don't fold if element
size is zero.
testsuite/
* gcc.dg/pr58346.c: New test.

--- gcc/gimple-fold.c.mp2   2014-01-17 12:03:56.149446880 +0100
+++ gcc/gimple-fold.c   2014-01-17 12:04:00.450462677 +0100
@@ -2940,7 +2940,8 @@ fold_array_ctor_reference (tree type, tr
  be larger than size of array element.  */
   if (!TYPE_SIZE_UNIT (type)
   || TREE_CODE (TYPE_SIZE_UNIT (type)) != INTEGER_CST
-  || elt_size.slt (tree_to_double_int (TYPE_SIZE_UNIT (type))))
+  || elt_size.slt (tree_to_double_int (TYPE_SIZE_UNIT (type)))
+  || elt_size.is_zero ())
 return NULL_TREE;
 
   /* Compute the array index we look for.  */
--- gcc/testsuite/gcc.dg/pr58346.c.mp2  2014-01-17 12:27:26.180127058 +0100
+++ gcc/testsuite/gcc.dg/pr58346.c  2014-01-17 12:28:20.466332046 +0100
@@ -0,0 +1,19 @@
+/* PR tree-optimization/58346 */
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+struct U {};
+static struct U b[1] = { };
+extern void bar (struct U);
+
+void
+foo (void)
+{
+  bar (b[0]);
+}
+
+void
+baz (void)
+{
+  foo ();
+}

Marek


Re: PATCH: PR middle-end/59789: [4.9 Regression] ICE in in convert_move, at expr.c:333

2014-01-17 Thread H.J. Lu
On Mon, Jan 13, 2014 at 12:11 PM, H.J. Lu  wrote:
> Hi,
>
> We should report some early inlining errors.  This patch is based on
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57698#c7
>
> It adds report_early_inliner_always_inline_failure and uses it in
> expand_call_inline.  Tested on Linux/x86-64. OK to install?
>
> Thanks.
>
>
> H.J.
> 
> commit 7b18b53d308b2c25bef5664be3e6544249d86bdc
> Author: H.J. Lu 
> Date:   Mon Jan 13 11:54:36 2014 -0800
>
> Update error handling during early_inlining
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 5c674bc..284bc66 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,12 @@
> +2014-01-13  Sriraman Tallam  
> +   H.J. Lu  
> +
> +   PR middle-end/59789
> +   * tree-inline.c (report_early_inliner_always_inline_failure): New
> +   function.
> +   (expand_call_inline): Emit errors during early_inlining if
> +   report_early_inliner_always_inline_failure returns true.
> +
>  2014-01-10  DJ Delorie  
>
> * config/msp430/msp430.md (call_internal): Don't allow memory
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 459e365..2a7b3ca 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-01-13  H.J. Lu  
> +
> +   PR middle-end/59789
> +   * gcc.target/i386/pr59789.c: New testcase.
> +
>  2014-01-13  Jakub Jelinek  
>
> PR tree-optimization/59387
> diff --git a/gcc/testsuite/gcc.target/i386/pr59789.c b/gcc/testsuite/gcc.target/i386/pr59789.c
> new file mode 100644
> index 000..b476d6c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr59789.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target ia32 } */
> +/* { dg-options "-O -march=i686" } */
> +
> +#pragma GCC push_options
> +#pragma GCC target("sse2")
> +typedef int __v4si __attribute__ ((__vector_size__ (16)));
> +typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
> +
> +extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_set_epi32 (int __q3, int __q2, int __q1, int __q0) /* { dg-error "target specific option mismatch" } */
> +{
> +  return __extension__ (__m128i)(__v4si){ __q0, __q1, __q2, __q3 };
> +}
> +#pragma GCC pop_options
> +
> +
> +__m128i
> +f1(void) /* { dg-message "warning: SSE vector return without SSE enabled changes the ABI" } */
> +{
> +  return _mm_set_epi32 (0, 0, 0, 0); /* { dg-error "called from here" } */
> +}
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index 22521b1..ce1e3af 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -4046,6 +4046,32 @@ add_local_variables (struct function *callee, struct 
> function *caller,
>}
>  }
>
> +/* Should an error be reported when early inliner fails to inline an
> +   always_inline function?  That depends on the REASON.  */
> +
> +static inline bool
> +report_early_inliner_always_inline_failure (cgraph_inline_failed_t reason)
> +{
> +  /* Only the following reasons need to be reported when the early inliner
> + fails to inline an always_inline function.  Called from
> + expand_call_inline.  */
> +  switch (reason)
> +{
> +case CIF_BODY_NOT_AVAILABLE:
> +case CIF_FUNCTION_NOT_INLINABLE:
> +case CIF_OVERWRITABLE:
> +case CIF_MISMATCHED_ARGUMENTS:
> +case CIF_EH_PERSONALITY:
> +case CIF_UNSPECIFIED:
> +case CIF_NON_CALL_EXCEPTIONS:
> +case CIF_TARGET_OPTION_MISMATCH:
> +case CIF_OPTIMIZATION_MISMATCH:
> +  return true;
> +default:
> +  return false;
> +}
> +}
> +
>  /* If STMT is a GIMPLE_CALL, replace it with its inline expansion.  */
>
>  static bool
> @@ -4116,7 +4142,8 @@ expand_call_inline (basic_block bb, gimple stmt, 
> copy_body_data *id)
>   /* During early inline pass, report only when optimization is
>  not turned on.  */
>   && (cgraph_global_info_ready
> - || !optimize)
> + || !optimize
> + || report_early_inliner_always_inline_failure (reason))
>   /* PR 20090218-1_0.c. Body can be provided by another module. */
>   && (reason != CIF_BODY_NOT_AVAILABLE || !flag_generate_lto))
> {

Hi Honza,

Can you take a look at this patch?

Thanks.

-- 
H.J.
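
For readers following the thread, the part of the reporting condition that
is visible in the quoted hunk can be sketched in isolation as below.  The
enum values and function names are hypothetical stand-ins (the real code
uses cgraph_inline_failed_t and the CIF_* codes), only three reasons are
modelled, and the conditions preceding this part in expand_call_inline are
not shown.

```c
#include <stdbool.h>

/* Hypothetical subset of inline-failure reasons.  */
enum inline_failure
{
  FAIL_BODY_NOT_AVAILABLE,
  FAIL_TARGET_OPTION_MISMATCH,
  FAIL_OTHER
};

/* Models report_early_inliner_always_inline_failure for the subset.  */
static bool
reportable_during_early_inlining (enum inline_failure reason)
{
  switch (reason)
    {
    case FAIL_BODY_NOT_AVAILABLE:
    case FAIL_TARGET_OPTION_MISMATCH:
      return true;
    default:
      return false;
    }
}

/* Models the tail of the condition in the patched expand_call_inline.  */
static bool
should_report (enum inline_failure reason, bool global_info_ready,
               bool optimize, bool generate_lto)
{
  return (global_info_ready
          || !optimize
          || reportable_during_early_inlining (reason))
         /* The body can be provided by another module under LTO.  */
         && (reason != FAIL_BODY_NOT_AVAILABLE || !generate_lto);
}
```

Under these assumptions a target option mismatch is reported even during
early inlining with optimization on, which is the behaviour the new
testcase expects.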


[PATCH] Reduce compile-time for -Og

2014-01-17 Thread Richard Biener

This reduces -Og compile-time for the testcase in PR46590 from 116s
to 40s (still far above the -O0 compile-time, which is 17s).
It does so by disabling the rest of loop2 (RTL invariant motion
and doloop) as I originally intended and by disabling PTA which
can get quite expensive on larger testcases (especially for
fortran which makes heavy use of fnspecs).

Bootstrap with BOOT_CFLAGS="-Og -g" on x86_64-unknown-linux-gnu
in progress.
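
The effect of moving the three options into the defaults table can be
sketched as follows.  This is a simplified model with hypothetical names
(enum levels, level_enables); it assumes that -Og behaves like optimization
level 1 with a "debug" flag set, which is how the OPT_LEVELS_1_PLUS_NOT_DEBUG
entries in the patch are meant to exclude it.

```c
#include <stdbool.h>

/* Hypothetical subset of the level selectors used in the defaults table.  */
enum levels
{
  LEVELS_1_PLUS,           /* -O1 and above, including -Og.  */
  LEVELS_1_PLUS_NOT_DEBUG  /* -O1 and above, but not -Og.  */
};

/* Return true if an option listed under LEVELS is enabled by default for
   the given optimization level; DEBUG_OPT is true for -Og.  */
static bool
level_enables (enum levels levels, int opt_level, bool debug_opt)
{
  switch (levels)
    {
    case LEVELS_1_PLUS:
      return opt_level >= 1;
    case LEVELS_1_PLUS_NOT_DEBUG:
      return opt_level >= 1 && !debug_opt;
    }
  return false;
}
```

In this model, level_enables (LEVELS_1_PLUS_NOT_DEBUG, 1, true) is false,
so an option such as -ftree-pta stays off for -Og while remaining on for
plain -O1.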

Richard.

2014-01-17  Richard Biener  

PR tree-optimization/46590
* opts.c (default_options_table): Add entries for
OPT_fbranch_count_reg, OPT_fmove_loop_invariants and OPT_ftree_pta,
all enabled at -O1 but not for -Og.
* common.opt (fbranch-count-reg): Remove Init(1).
(fmove-loop-invariants): Likewise.
(ftree-pta): Likewise.

Index: gcc/opts.c
===
*** gcc/opts.c  (revision 206702)
--- gcc/opts.c  (working copy)
*** static const struct default_options defa
*** 454,459 
--- 454,462 
  { OPT_LEVELS_1_PLUS, OPT_fcombine_stack_adjustments, NULL, 1 },
  { OPT_LEVELS_1_PLUS, OPT_fcompare_elim, NULL, 1 },
  { OPT_LEVELS_1_PLUS, OPT_ftree_slsr, NULL, 1 },
+ { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fbranch_count_reg, NULL, 1 },
+ { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fmove_loop_invariants, NULL, 1 },
+ { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_pta, NULL, 1 },
  
  /* -O2 optimizations.  */
  { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 },
Index: gcc/common.opt
===
*** gcc/common.opt  (revision 206702)
--- gcc/common.opt  (working copy)
*** Common Report Var(flag_bounds_check)
*** 875,881 
  Generate code to check bounds before indexing arrays
  
  fbranch-count-reg
! Common Report Var(flag_branch_on_count_reg) Init(1) Optimization
  Replace add, compare, branch with branch on count register
  
  fbranch-probabilities
--- 875,881 
  Generate code to check bounds before indexing arrays
  
  fbranch-count-reg
! Common Report Var(flag_branch_on_count_reg) Optimization
  Replace add, compare, branch with branch on count register
  
  fbranch-probabilities
*** Common Report Var(flag_modulo_sched_allo
*** 1564,1570 
  Perform SMS based modulo scheduling with register moves allowed
  
  fmove-loop-invariants
! Common Report Var(flag_move_loop_invariants) Init(1) Optimization
  Move loop invariant computations out of loops
  
  fdce
--- 1564,1570 
  Perform SMS based modulo scheduling with register moves allowed
  
  fmove-loop-invariants
! Common Report Var(flag_move_loop_invariants) Optimization
  Move loop invariant computations out of loops
  
  fdce
*** Common Report Var(flag_tree_partial_pre)
*** 2170,2176 
  In SSA-PRE optimization on trees, enable partial-partial redundancy 
elimination
  
  ftree-pta
! Common Report Var(flag_tree_pta) Init(1) Optimization
  Perform function-local points-to analysis on trees.
  
  ftree-reassoc
--- 2170,2176 
  In SSA-PRE optimization on trees, enable partial-partial redundancy 
elimination
  
  ftree-pta
! Common Report Var(flag_tree_pta) Optimization
  Perform function-local points-to analysis on trees.
  
  ftree-reassoc


Re: [PATCH] Fix PR46590

2014-01-17 Thread Richard Biener
On Fri, 17 Jan 2014, Jakub Jelinek wrote:

> On Fri, Jan 17, 2014 at 12:32:34PM +0100, Richard Biener wrote:
> > ! /* Search the contents of the sorted vector with a binary search.
> > !CMP is the comparison function to pass to bsearch.  */
> 
> Can you please sed -i -e s/__//g in the whole method?
> I mean, this isn't in libstdc++ or glibc header, so there is no point
> in obfuscating the names, and the __ prefixed names are even reserved
> for implementation, which we are not at least in stage1 built compiler.

Will do.

Richard.

> > ! template
> > ! inline T *
> > ! vec::bsearch (const void *__key,
> > ! int (*__compar) (const void *, const void *))
> > ! {
> > !   const void *__base = this->address ();
> > !   size_t __nmemb = this->length ();
> > !   size_t __size = sizeof (T);
> > !   /* The following is a copy of glibc stdlib-bsearch.h.  */
> > !   size_t __l, __u, __idx;
> > !   const void *__p;
> > !   int __comparison;
> > ! 
> > !   __l = 0;
> > !   __u = __nmemb;
> > !   while (__l < __u)
> > ! {
> > !   __idx = (__l + __u) / 2;
> > !   __p = (const void *) (((const char *) __base) + (__idx * __size));
> > !   __comparison = (*__compar) (__key, __p);
> > !   if (__comparison < 0)
> > !   __u = __idx;
> > !   else if (__comparison > 0)
> > !   __l = __idx + 1;
> > !   else
> > !   return (T *)const_cast<void *>(__p);
> > ! }
> > ! 
> > !   return NULL;
> >   }


Re: [Patch][AArch64] vneg floating point testcase BE fixed

2014-01-17 Thread Alex Velenko

Hi,
I agree the correct changelog entry should be:

gcc/testsuite/

2013-01-16  Alex Velenko  
*/gcc.target/aarch64/vneg_f.c (STORE_INST): New macro.
(RUN_TEST): Use new macro.
(INDEX): Macro removed.
(test_vneg_f32): Use fixed RUN_TEST.
(test_vneg_f64): Likewise.
(test_vnegq_f32): Likewise.
(test_vnegq_f64): Likewise.

Kind regards,
Alex Velenko

On 16/01/14 16:58, Richard Earnshaw wrote:

On 16/01/14 12:23, Alex Velenko wrote:

Hi,
This patch fixes testcase vneg_f.c, which was using an inconsistent
vector model, causing problems for the big-endian compiler.

Now testcase runs on both LE and BE without regressions.

Is it okay?

Kind regards,
Alex Velenko

gcc/testsuite/

2013-01-16  Alex Velenko  

   */gcc.target/aarch64/vneg_f.c (STORE_INST): ST1 macro added.

Just say: "New macro."


   (RUN_TEST): Macro updated to use STORE_INST.

"Use it."


   (test_vneg_f32): Changed to provide definitions for RUN_TEST.

"Use RUN_TEST."


   (test_vneg_f64): Likewise.
   (test_vnegq_f32): Likewise.
   (test_vnegq_f64): Likewise.



You also need to mention the INDEX* macros that you've removed.  Just
say "Delete."



Vneg_fix.patch


diff --git a/gcc/testsuite/gcc.target/aarch64/vneg_f.c b/gcc/testsuite/gcc.target/aarch64/vneg_f.c
index 1eaf21d34eb57b4e7e5388a4686fe6341197447a..01503028547f320ab3d8ea725ff09ee5d0487f18 100644
--- a/gcc/testsuite/gcc.target/aarch64/vneg_f.c
+++ b/gcc/testsuite/gcc.target/aarch64/vneg_f.c
@@ -44,34 +44,27 @@ extern void abort (void);
  #define DATA_TYPE_64 double
  #define DATA_TYPE(data_len) DATA_TYPE_##data_len

-#define INDEX64_32 [i]
-#define INDEX64_64
-#define INDEX128_32 [i]
-#define INDEX128_64 [i]
-#define INDEX(reg_len, data_len) \
-  CONCAT1 (INDEX, reg_len##_##data_len)
-
+#define STORE_INST(reg_len, data_len) \
+  CONCAT1 (vst1, POSTFIX (reg_len, data_len))
  #define LOAD_INST(reg_len, data_len) \
CONCAT1 (vld1, POSTFIX (reg_len, data_len))
  #define NEG_INST(reg_len, data_len) \
CONCAT1 (vneg, POSTFIX (reg_len, data_len))

  #define INHIB_OPTIMIZATION asm volatile ("" : : : "memory")
-
-#define RUN_TEST(test_set, reg_len, data_len, n, a, b) \
+#define RUN_TEST(test_set, reg_len, data_len, n, a, b, c) \
{  \
  int i;   \
  (a) = LOAD_INST (reg_len, data_len) (test_set);\
  (b) = NEG_INST (reg_len, data_len) (a);  \
+STORE_INST (reg_len, data_len) (c, b);\
  for (i = 0; i < n; i++)   \
{  \
DATA_TYPE (data_len) diff; \
INHIB_OPTIMIZATION;\
-   diff   \
- = a INDEX (reg_len, data_len)\
-   + b INDEX (reg_len, data_len); \
+   diff = test_set[i] + c[i]; \
if (diff > EPSILON) \
- return 1;\
+   return 1;  \
}  \
}

@@ -84,28 +77,29 @@ extern void abort (void);
  int
  test_vneg_f32 ()
  {
-  float test_set0[2] = { TEST0, TEST1 };
-  float test_set1[2] = { TEST2, TEST3 };
-  float test_set2[2] = { VAR_MAX, VAR_MIN };
-  float test_set3[2] = { INFINITY, NAN };
-
float32x2_t a;
float32x2_t b;
+  float32_t c[2];

-  RUN_TEST (test_set0, 64, 32, 2, a, b);
-  RUN_TEST (test_set1, 64, 32, 2, a, b);
-  RUN_TEST (test_set2, 64, 32, 2, a, b);
-  RUN_TEST (test_set3, 64, 32, 0, a, b);
+  float32_t test_set0[2] = { TEST0, TEST1 };
+  float32_t test_set1[2] = { TEST2, TEST3 };
+  float32_t test_set2[2] = { VAR_MAX, VAR_MIN };
+  float32_t test_set3[2] = { INFINITY, NAN };
+
+  RUN_TEST (test_set0, 64, 32, 2, a, b, c);
+  RUN_TEST (test_set1, 64, 32, 2, a, b, c);
+  RUN_TEST (test_set2, 64, 32, 2, a, b, c);
+  RUN_TEST (test_set3, 64, 32, 0, a, b, c);

/* Since last test cannot be checked in a uniform way by adding
   negation result to original value, the number of lanes to be
   checked in RUN_TEST is 0 (last argument).  Instead, result
   will be checked manually.  */

-  if (b[0] != -INFINITY)
+  if (c[0] != -INFINITY)
  return 1;

-  if (!__builtin_isnan (b[1]))
+  if (!__builtin_isnan (c[1]))
  return 1;

return 0;
@@ -130,37 +124,38 @@ test_vneg_f64 ()
  {
float64x1_t a;
float64x1_t b;
-
-  double test_set0[1] = { TEST0 };
-  double test_set1[1] = { TEST1 };
-  double test_set2[1] = { TEST2 };
-  double test_set3[1] = { TEST3 };
-  double test_set4[1] = { VAR_MAX };
-  double test_set5[1] = { VAR_MIN };
-  double test_set6[1] = { INFINITY };
-  double test_set7[1] = { NAN };
-
-  RUN_TEST (test_set0, 64, 64, 1, a, b);
-  RUN_TEST (test_set1, 64, 64, 1, a, b);
-  RUN_TE

Re: [AArch64] Make -mcpu, -march and -mtune case-insensitive.

2014-01-17 Thread Richard Earnshaw
On 17/01/14 11:12, Alan Lawrence wrote:
> Small patch to make the -mcpu, -march and -mtune command-line options
> case-insensitive, allowing e.g. -mcpu=CortexA57 -march=ARMv8-A.
> 
> Tested on aarch64-none-elf with no regressions; options passed onto e.g.
> ld are always lowercase (as before).
> 
> OK for trunk?
> 
> --Alan
> 
> ChangeLog:
> 2014-01-17  Alan Lawrence  
>   * config/aarch64/aarch64.opt (mcpu, march, mtune): Make
> case-insensitive.
> 

OK.

R.




Re: [PATCH, ARM, v2] Fix PR target/59142: internal compiler error while compiling OpenCV 2.4.7

2014-01-17 Thread Christophe Lyon
Committed on Charles' behalf as:
r206706 for the 4.8 branch
r206707 for the 4.7 branch

Christophe.

On 17 January 2014 10:28, Richard Earnshaw  wrote:
> On 16/01/14 18:40, Charles Baylis wrote:
>> On 20 December 2013 13:26, Richard Earnshaw  wrote:
>>> On 19/12/13 17:40, Charles Baylis wrote:
>>>> Is it ok for 4.8, and should it be considered for 4.7?

>>>
>>> Yes, provided it passes testing on those releases.
>>
>> Results of testing 4.8:
>> All 3 patches:
>> 0001-PR-target-59142-vfp_hard_register_operand.patch
>> 0002-PR-target-59142-arm_hard_general_register_operand.patch
>> 0003-PR-target-59142-low_register_operand.patch
>> apply correctly, and I have verified that ldmstm.md is correctly
>> patched and does not need to be regenerated and have tested that the
>> compiler bootstraps and passes make check in an arm-linux-gnueabihf
>> configuration on a chromebook.
>>
>>
>> Results of testing 4.7:
>> Only the following 2 patches should be applied as patch 0001 modifies
>> a pattern which does not exist on the 4.7 branch.
>> 0002-PR-target-59142-arm_hard_general_register_operand.patch
>> 0003-PR-target-59142-low_register_operand.patch
>> I have verified that ldmstm.md is correctly patched and does not need
>> to be regenerated and have tested that the compiler bootstraps in an
>> arm-linux-gnueabi configuration on a chromebook.
>>
>> I think this is OK to be committed to both branches?
>>
>
> OK.
>
> R.
>


Re: [PATCH] Fix PR46590

2014-01-17 Thread Jakub Jelinek
On Fri, Jan 17, 2014 at 12:32:34PM +0100, Richard Biener wrote:
> ! /* Search the contents of the sorted vector with a binary search.
> !CMP is the comparison function to pass to bsearch.  */

Can you please sed -i -e s/__//g in the whole method?
I mean, this isn't in libstdc++ or glibc header, so there is no point
in obfuscating the names, and the __ prefixed names are even reserved
for implementation, which we are not at least in stage1 built compiler.
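
For illustration, this is roughly what the search loop looks like once the
__ prefixes are dropped.  It is a free-standing C sketch, not the vec.h
method: sorted_bsearch and compare_ints are hypothetical names, and the
base pointer, element count and element size are passed explicitly instead
of coming from the vector.

```c
#include <stddef.h>

/* Binary search over NMEMB sorted elements of SIZE bytes starting at BASE.
   COMPAR receives the key first and the candidate element second.  */
static void *
sorted_bsearch (const void *key, const void *base, size_t nmemb, size_t size,
                int (*compar) (const void *, const void *))
{
  size_t l = 0;
  size_t u = nmemb;

  while (l < u)
    {
      size_t idx = (l + u) / 2;
      const void *p = (const char *) base + idx * size;
      int comparison = compar (key, p);

      if (comparison < 0)
        u = idx;
      else if (comparison > 0)
        l = idx + 1;
      else
        return (void *) p;
    }

  return NULL;
}

/* Example comparison function for int elements.  */
static int
compare_ints (const void *a, const void *b)
{
  int x = *(const int *) a;
  int y = *(const int *) b;
  return (x > y) - (x < y);
}
```

Searching a sorted int array for a key that is present returns a pointer to
that element, and NULL otherwise.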

> ! template
> ! inline T *
> ! vec::bsearch (const void *__key,
> !   int (*__compar) (const void *, const void *))
> ! {
> !   const void *__base = this->address ();
> !   size_t __nmemb = this->length ();
> !   size_t __size = sizeof (T);
> !   /* The following is a copy of glibc stdlib-bsearch.h.  */
> !   size_t __l, __u, __idx;
> !   const void *__p;
> !   int __comparison;
> ! 
> !   __l = 0;
> !   __u = __nmemb;
> !   while (__l < __u)
> ! {
> !   __idx = (__l + __u) / 2;
> !   __p = (const void *) (((const char *) __base) + (__idx * __size));
> !   __comparison = (*__compar) (__key, __p);
> !   if (__comparison < 0)
> ! __u = __idx;
> !   else if (__comparison > 0)
> ! __l = __idx + 1;
> !   else
> ! return (T *)const_cast<void *>(__p);
> ! }
> ! 
> !   return NULL;
>   }

Jakub


[C++ Patch] PR 59269

2014-01-17 Thread Paolo Carlini

Hi,

I think we can handle this 4.9 Regression ICE in build_value_init_noctor 
the same way as cx_check_missing_mem_inits: only enforce 
!TYPE_HAS_COMPLEX_DFLT when errorcount == 0 (I also double checked that 
in the case at issue type_has_constexpr_default_constructor is true).
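
A toy model of the relaxed check, for illustration only: the global errorcount below merely stands in for the compiler's error counter, and invariant_ok is a hypothetical helper, not a GCC function.

```cpp
// Stand-in for the compiler's count of errors reported so far.
static int errorcount = 0;

// The invariant is only enforced while no error has been reported;
// after an error, inconsistent state is tolerated instead of ICEing.
inline bool
invariant_ok (bool has_complex_dflt)
{
  return !has_complex_dflt || errorcount != 0;
}
```

With errorcount == 0 a type with a complex default constructor still trips the check; once an error has been emitted the check passes.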


Tested x86_64-linux.

Thanks,
Paolo.


/cp
2014-01-17  Paolo Carlini  

PR c++/59269
* init.c (build_value_init_noctor): Assert !TYPE_HAS_COMPLEX_DFLT
only when errorcount == 0.

/testsuite
2014-01-17  Paolo Carlini  

PR c++/59269
* g++.dg/cpp0x/nsdmi-union4.C: New.
Index: cp/init.c
===
--- cp/init.c   (revision 206700)
+++ cp/init.c   (working copy)
@@ -382,7 +382,8 @@ build_value_init_noctor (tree type, tsubst_flags_t
  SFINAE-enabled.  */
   if (CLASS_TYPE_P (type))
 {
-  gcc_assert (!TYPE_HAS_COMPLEX_DFLT (type));
+  gcc_assert (!TYPE_HAS_COMPLEX_DFLT (type)
+ || errorcount != 0);

   if (TREE_CODE (type) != UNION_TYPE)
{
Index: testsuite/g++.dg/cpp0x/nsdmi-union4.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-union4.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-union4.C   (working copy)
@@ -0,0 +1,12 @@
+// PR c++/59269
+// { dg-require-effective-target c++11 }
+
+union U
+{
+  int& i = 0;  // { dg-error "reference" }
+};
+
+void foo()
+{
+  U();
+}


Commit: MSP430: Add -mcpu= option

2014-01-17 Thread Nick Clifton
Hi Guys,

  I am committing the attached patch to add a -mcpu= command line option
  to the MSP430 backend.  This allows the selection of the ISA to be
  used, and it operates independently of the -mmcu= command line option.

gcc/ChangeLog
2014-01-17  Nick Clifton  

* config/msp430/msp430.opt: (mcpu): New option.
* config/msp430/msp430.c (msp430_mcu_name): Use target_mcu.
(msp430_option_override): Parse target_cpu.  If the MCU name
matches a generic string, clear target_mcu.
(msp430_attr): Allow numeric interrupt values up to 63.
(msp430_expand_epilogue): No longer invert operand 1 of gen_popm.
* config/msp430/msp430.h (ASM_SPEC): Convert -mcpu into a -mmcu
option.
* config/msp430/t-msp430: (MULTILIB_MATCHES): Remove mcu matches.
Add mcpu matches.
* config/msp430/msp430.md (popm): Use %J rather than %I.
(addsi3): Use msp430_nonimmediate_operand for operand 2.
(addhi_cy_i): Use immediate_operand for operand 2.
* doc/invoke.texi: Document -mcpu option.



msp430.mcpu.patch.xz
Description: application/xz


[PATCH] Fix PR46590

2014-01-17 Thread Richard Biener

This fixes PR46590 - I worked on this at the beginning of last
year already, but apparently when deciding not to push the last
bits of the LIM reorg I failed to check the full testcase again.

So this fixes memory usage of that large testcase (many loops)
from requiring a peak memory usage of >3.5GB (kills my machine)
to a peak memory usage of ~1GB (peak happens during IRA,
the testcase completes compile in 122s at -O1 and in 160s at -O3).

The main issue with LIM was the accesses_in_loop array which
we keep for each distinct memory reference.  This was an
array [number-of-loops][number-of-accesses-in-that-loop]
to be able to easily walk over all accesses of a memory reference
in a loop and its children (see for_all_locs_in_loop).
Obviously allocating a vec of size number-of-loops of vecs
(even if those remain un-allocated) makes memory requirements
O(number-of-accesses * number-of-loops) - very bad for this
testcase.

The fix is to flatten this to a [number-of-accesses] array
and support easy walking over accesses in a loop and its children
by sorting that array by the postorder number of each location's loop.
All accesses of a loop and its children are then clustered together,
and you can bsearch for a member of that cluster.
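
A rough sketch of that idea, not the code from the patch: Access and LoopRange are hypothetical stand-ins for the LIM data structures, and the sketch assumes each loop knows the smallest postorder number in its subtree (subtree_min), which the real implementation does not need.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for LIM's data; not the real GCC types.
struct Access
{
  int loop_postorder;  // postorder number of the loop containing the access
  int id;              // identifies the access in this example
};

// A loop and its children cover the contiguous postorder range
// [subtree_min, postorder], because postorder numbers children first.
struct LoopRange
{
  int subtree_min;
  int postorder;
};

// Sort accesses by the postorder number of their loop.
inline void
sort_by_loop_postorder (std::vector<Access> &accs)
{
  std::stable_sort (accs.begin (), accs.end (),
                    [] (const Access &a, const Access &b)
                    { return a.loop_postorder < b.loop_postorder; });
}

// Collect the ids of all accesses in LOOP or its children: binary-search
// for any member of the cluster, then extend in both directions.
inline std::vector<int>
accesses_in_loop (const std::vector<Access> &accs, LoopRange loop)
{
  std::vector<int> ids;
  std::size_t l = 0, u = accs.size ();
  while (l < u)
    {
      std::size_t idx = (l + u) / 2;
      int po = accs[idx].loop_postorder;
      if (po > loop.postorder)
        u = idx;
      else if (po < loop.subtree_min)
        l = idx + 1;
      else
        {
          // Found one member; walk back to the start of the cluster.
          std::size_t first = idx;
          while (first > 0
                 && accs[first - 1].loop_postorder >= loop.subtree_min)
            --first;
          // Then walk forward until the cluster ends.
          for (std::size_t i = first;
               i < accs.size () && accs[i].loop_postorder <= loop.postorder;
               ++i)
            ids.push_back (accs[i].id);
          break;
        }
    }
  return ids;
}
```

Memory is proportional to the number of accesses only, and the walk over a loop and its children needs no recursion over the loop tree.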

Note that restoring this efficient walking isn't necessary
for the testcase in PR46590 but I've created an artificial
testcase where that improves compile-time from 13s to 0s (LIM time).

The patch below includes some more TLC and optimizations I applied
to LIM until I noticed this issue.

The patch also adds bsearch support to vec<>, alongside its support
for qsort, copying the glibc inline function implementation
(Jakub said we shouldn't rely on bsearch availability nor on
it being implemented with an inline function to make it fast).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I'll commit this later today (if testing works out ok).

Thanks,
Richard.

2014-01-17  Richard Biener  

PR tree-optimization/46590
* vec.h (vec<>::bsearch): New member function implementing
binary search according to C89 bsearch.
(vec<>::qsort): Avoid calling ::qsort for vectors with sizes 0 or 1.
* tree-ssa-loop-im.c (struct mem_ref): Make stored member a
bitmap pointer again.  Make accesses_in_loop a flat array.
(mem_ref_obstack): New global.
(outermost_indep_loop): Adjust for mem_ref->stored changes.
(mark_ref_stored): Likewise.
(ref_indep_loop_p_2): Likewise.
(set_ref_stored_in_loop): New helper function.
(mem_ref_alloc): Allocate mem_refs on the mem_ref_obstack obstack.
(memref_free): Adjust.
(record_mem_ref_loc): Simplify.
(gather_mem_refs_stmt): Adjust.
(sort_locs_in_loop_postorder_cmp): New function.
(analyze_memory_references): Sort accesses_in_loop after
loop postorder number.
(find_ref_loc_in_loop_cmp): New function.
(for_all_locs_in_loop): Find relevant cluster of locs in
accesses_in_loop and iterate without recursion.
(execute_sm): Avoid uninit warning.
(struct ref_always_accessed): Simplify.
(ref_always_accessed::operator ()): Likewise.
(ref_always_accessed_p): Likewise.
(tree_ssa_lim_initialize): Initialize mem_ref_obstack, compute
loop postorder numbers here.
(tree_ssa_lim_finalize): Free mem_ref_obstack and loop postorder
numbers.

Index: gcc/vec.h
===
*** gcc/vec.h.orig  2014-01-07 10:19:52.979454286 +0100
--- gcc/vec.h   2014-01-17 12:18:11.352085931 +0100
*** public:
*** 476,481 
--- 476,482 
void unordered_remove (unsigned);
void block_remove (unsigned, unsigned);
void qsort (int (*) (const void *, const void *));
+   T *bsearch (const void *key, int (*compar)(const void *, const void *));
unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
static size_t embedded_size (unsigned);
void embedded_init (unsigned, unsigned = 0);
*** template
*** 938,944 
  inline void
  vec<T, A, vl_embed>::qsort (int (*cmp) (const void *, const void *))
  {
!   ::qsort (address (), length (), sizeof (T), cmp);
  }
  
  
--- 939,981 
  inline void
  vec<T, A, vl_embed>::qsort (int (*cmp) (const void *, const void *))
  {
!   if (length () > 1)
! ::qsort (address (), length (), sizeof (T), cmp);
! }
! 
! 
! /* Search the contents of the sorted vector with a binary search.
!CMP is the comparison function to pass to bsearch.  */
! 
! template<typename T, typename A>
! inline T *
! vec<T, A, vl_embed>::bsearch (const void *__key,
! int (*__compar) (const void *, const void *))
! {
!   const void *__base = this->address ();
!   size_t __nmemb = this->length ();
!   size_t __size = sizeof (T);
!   /* The following is a copy of glibc stdlib-bsearch.h.  */
!   size_t __l, __u, __idx;
!   const void *__p;
!   int __comparison;
! 
!   __l = 0;
!   __u = __nmemb;
!   while (__l < __u)
! {
!   __idx

[AArch64] Make -mcpu, -march and -mtune case-insensitive.

2014-01-17 Thread Alan Lawrence
Small patch to make the -mcpu, -march and -mtune command-line options
case-insensitive, allowing e.g. -mcpu=CortexA57 -march=ARMv8-A.

Tested on aarch64-none-elf with no regressions; options passed onto e.g.
ld are always lowercase (as before).
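
Roughly what the ToLower property achieves, as a stand-alone sketch: the argument is canonicalized to lower case before it is matched. The helper names and the small CPU table below are hypothetical, not the driver's code.

```cpp
#include <cctype>
#include <string>

// Lowercase an option argument so that matching ignores case.
inline std::string
to_lower (std::string arg)
{
  for (char &c : arg)
    c = static_cast<char> (std::tolower (static_cast<unsigned char> (c)));
  return arg;
}

// Hypothetical table lookup; the CPU names are examples only.
inline bool
is_known_cpu (const std::string &arg)
{
  static const char *const known[] = { "cortex-a53", "cortex-a57" };
  const std::string lowered = to_lower (arg);
  for (const char *name : known)
    if (lowered == name)
      return true;
  return false;
}
```

Because the lowering happens once, up front, anything forwarded later only ever sees the lowercase spelling.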

OK for trunk?

--Alan

ChangeLog:
2014-01-17  Alan Lawrence  
* config/aarch64/aarch64.opt (mcpu, march, mtune): Make
case-insensitive.

diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 163f34b..f5a15b7 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -88,15 +88,15 @@ Target RejectNegative Joined Enum(tls_type) Var(aarch64_tls_dialect) Init(TLS_DE
 Specify TLS dialect
 
 march=
-Target RejectNegative Joined Var(aarch64_arch_string)
+Target RejectNegative ToLower Joined Var(aarch64_arch_string)
 -march=ARCH	Use features of architecture ARCH
 
 mcpu=
-Target RejectNegative Joined Var(aarch64_cpu_string)
+Target RejectNegative ToLower Joined Var(aarch64_cpu_string)
 -mcpu=CPU	Use features of and optimize for CPU
 
 mtune=
-Target RejectNegative Joined Var(aarch64_tune_string)
+Target RejectNegative ToLower Joined Var(aarch64_tune_string)
 -mtune=CPU	Optimize for CPU
 
 mabi=

Re: [Patch] Regex bracket matcher cache optimization

2014-01-17 Thread Jonathan Wakely
On 17 January 2014 10:33, Tim Shen wrote:
> I agree, especially after finding that an exported symbol can only
> be removed with a major version change.

Maybe never!

> Anyway, we'll win some run-time
> cases, in which we beat Boost ;)

That's excellent, well done :-)


Re: [RFA][PATCH][PR middle-end/57904][P1 regression] Improve cleanups after copyprop

2014-01-17 Thread Richard Biener
On Thu, Jan 16, 2014 at 7:30 PM, Jeff Law  wrote:
> On 01/16/14 04:49, Richard Biener wrote:
>>
>>
>> Well - the issue here is that inlining / IPA-CP propagates constant
>> arguments to direct uses which of course exposes constant propagation
>> opportunities.  Now, copyprop doesn't do "real" constant propagation,
>> it just also propagates constants as if they were registers.
>>
>> So it exactly works as designed, but you could argue that pass
>> ordering
>>
>>NEXT_PASS (pass_copy_prop);
>>NEXT_PASS (pass_complete_unrolli);
>>NEXT_PASS (pass_ccp);
>>
>> is wrong.  Of course complete unrolling exposes constant propagation
>> opportunities (though nowadays it has a cheap CCP machinery built-in).
>>
>> IIRC that copyprop pass was added to avoid spurious warnings just
>> as in the PR.  You could argue that with complete unrolling having
>> a cheap CCP built in (see propagate_constants_for_unrolling) we
>> should move CCP before unrolli (and copyprop!) as well.
>
> It's certainly possible that copyprop was added for that reason, I simply
> have no memory of it.
>
> I tend to be leery of juggling passes simply because it's often just pushing
> the bubble down in one spot and making another appear elsewhere.  However, I
> don't feel that strongly about it in this case.
>
>
>>
>> So - please try making pass order
>>
>>NEXT_PASS (pass_ccp);
>>NEXT_PASS (pass_copy_prop);
>>NEXT_PASS (pass_complete_unrolli);
>>
>> instead.
>
> That fixes things as well.  Bootstrapped and regression tested.  OK for the
> trunk?

Ok.

Thanks,
Richard.

>
>
> commit 4e47e40685c4480945783e77ebc9a123d15cfd24
> Author: Jeff Law 
> Date:   Thu Jan 16 11:20:42 2014 -0700
>
> PR middle-end/57904
> * passes.def: Reorder pass_copy_prop, pass_unrolli, pass_ccp
> sequence
> so that pass_ccp runs first.
>
> PR middle-end/57904
> * gfortran.dg/pr57904.f90: New test.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index d4f83f4..6669f26 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2014-01-16  Jeff Law  
> +
> +   PR middle-end/57904
> +   * passes.def: Reorder pass_copy_prop, pass_unrolli, pass_ccp
> sequence
> +   so that pass_ccp runs first.
> +
>  2014-01-16  Alan Lawrence  
>
> * config/arm/arm.opt: Make -mcpu, -march, -mtune case-insensitive.
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 95ea8ce..c98b048 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -132,11 +132,11 @@ along with GCC; see the file COPYING3.  If not see
>  They ensure memory accesses are not indirect wherever possible.  */
>NEXT_PASS (pass_strip_predict_hints);
>NEXT_PASS (pass_rename_ssa_copies);
> -  NEXT_PASS (pass_copy_prop);
> -  NEXT_PASS (pass_complete_unrolli);
>NEXT_PASS (pass_ccp);
>/* After CCP we rewrite no longer addressed locals into SSA
>  form if possible.  */
> +  NEXT_PASS (pass_copy_prop);
> +  NEXT_PASS (pass_complete_unrolli);
>NEXT_PASS (pass_phiprop);
>NEXT_PASS (pass_forwprop);
>NEXT_PASS (pass_object_sizes);
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 868593b..65a37b5 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-01-16  Jeff Law  
> +
> +PR middle-end/57904
> +   * gfortran.dg/pr57904.f90: New test.
> +
>  2014-01-16  Nick Clifton  
>
> PR middle-end/28865
> diff --git a/gcc/testsuite/gfortran.dg/pr57904.f90
> b/gcc/testsuite/gfortran.dg/pr57904.f90
> new file mode 100644
> index 000..69fa7ed
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr57904.f90
> @@ -0,0 +1,22 @@
> +! { dg-do compile }
> +! { dg-options "-O2" }
> +
> +program test
> +  call test2 ()
> +contains
> +  subroutine test2 ()
> +type t
> +  integer, allocatable :: x
> +end type t
> +
> +type t2
> +  class(t), allocatable :: a
> +end type t2
> +
> +type(t2) :: one, two
> +
> +allocate (two%a)
> +one = two
> +  end subroutine test2
> +end program test
> +
>


Re: [Patch] Regex bracket matcher cache optimization

2014-01-17 Thread Tim Shen
On Fri, Jan 17, 2014 at 4:39 AM, Jonathan Wakely  wrote:
> On 8 January 2014 22:47, Tim Shen wrote:
> I think we want to be cautious with exporting instantiations (and as
> Paolo noted we definitely don't want to do it for 4.9.0 now). Slow
> compile times are a problem, but only a minor annoyance. Exporting
> symbols that might change is a bigger problem, as we have to keep
> exporting them once they're in the library.  The current
> implementation is new for 4.9, so I think we can live with it being
> slow to compile for its first release. After 4.9.0 we will have more
> user feedback and more experience with it, and for the next major
> release will know what's stable enough to export "forever" from the
> library.

I agree, especially after finding that an exported symbol can only
be removed with a major version change. Anyway, we'll win some run-time
cases, in which we beat Boost ;)


-- 
Regards,
Tim Shen


Re: Extend -fstack-protector-strong to cover calls with return slot

2014-01-17 Thread Florian Weimer

On 01/08/2014 03:57 PM, Florian Weimer wrote:


What about the attached version?  It still does not exactly match your
original suggestion because gimple_call_lhs (stmt) can be NULL_TREE if
the result is ignored and this case needs instrumentation, as you
explained, so I use the function return type in the aggregate_value_p
check.

Testing is still under way, but looks good so far.  I'm bootstrapping
with BOOT_CFLAGS="-O2 -g -fstack-protector-strong" with Ada enabled, for
additional coverage.


Testing passed without new regressions.  Is this okay for trunk?

--
Florian Weimer / Red Hat Product Security Team


Re: [C++ Patch] PR 59270

2014-01-17 Thread Paolo Carlini

.. patchlet fixes c++/58811 too.

Paolo.


Re: [Patch] Regex bracket matcher cache optimization

2014-01-17 Thread Jonathan Wakely
On 8 January 2014 22:47, Tim Shen wrote:
>
> So my plan is to instantiate _Compiler and _Executor instead of user
> interfaces like basic_regex or regex_match, because the implementation
> may change (say add a new executor) later. Is that Ok?

I think we want to be cautious with exporting instantiations (and as
Paolo noted we definitely don't want to do it for 4.9.0 now). Slow
compile times are a problem, but only a minor annoyance. Exporting
symbols that might change is a bigger problem, as we have to keep
exporting them once they're in the library.  The current
implementation is new for 4.9, so I think we can live with it being
slow to compile for its first release. After 4.9.0 we will have more
user feedback and more experience with it, and for the next major
release will know what's stable enough to export "forever" from the
library.


Re: [PATCH, ARM, v2] Fix PR target/59142: internal compiler error while compiling OpenCV 2.4.7

2014-01-17 Thread Richard Earnshaw
On 16/01/14 18:40, Charles Baylis wrote:
> On 20 December 2013 13:26, Richard Earnshaw  wrote:
>> On 19/12/13 17:40, Charles Baylis wrote:
>>> Is it ok for 4.8, and should it be considered for 4.7?
>>>
>>
>> Yes, provided it passes testing on those releases.
> 
> Results of testing 4.8:
> All 3 patches:
> 0001-PR-target-59142-vfp_hard_register_operand.patch
> 0002-PR-target-59142-arm_hard_general_register_operand.patch
> 0003-PR-target-59142-low_register_operand.patch
> apply correctly, and I have verified that ldmstm.md is correctly
> patched and does not need to be regenerated and have tested that the
> compiler bootstraps and passes make check in an arm-linux-gnueabihf
> configuration on a chromebook.
> 
> 
> Results of testing 4.7:
> Only the following 2 patches should be applied as patch 0001 modifies
> a pattern which does not exist on the 4.7 branch.
> 0002-PR-target-59142-arm_hard_general_register_operand.patch
> 0003-PR-target-59142-low_register_operand.patch
> I have verified that ldmstm.md is correctly patched and does not need
> to be regenerated and have tested that the compiler bootstraps in an
> arm-linux-gnueabi configuration on a chromebook.
> 
> I think this is OK to be committed to both branches?
> 

OK.

R.



Re: [PATCH] _Cilk_for for C and C++

2014-01-17 Thread Marek Polacek
On Thu, Jan 16, 2014 at 01:18:59PM -0800, Aldy Hernandez wrote:
> I'm not a C++ expert, but my understanding was that in C++ you don't
> need a typedef to use the following structure by name
> (cilk_for_information).  So you can just declare "struct
> cilk_for_information {...}" and instantiate it with just
> "cilk_for_information some_instance".  If that's the case, get rid
> of typedef.

Yes.  That's what create_implicit_typedef does.
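
A small illustration of the point: in C++ the class name is usable as a type name on its own. The member below is hypothetical, since the real structure is not shown in this thread.

```cpp
// No typedef needed: declaring the struct makes its name a type name.
struct cilk_for_information
{
  int count;  // hypothetical member, for the example only
};

inline int
get_count ()
{
  cilk_for_information some_instance = { 4 };  // no "struct" keyword needed
  return some_instance.count;
}
```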

Marek


Re: [RFA] [PATCH][PR tree-optimization/59749] Fix recently introduced ree bug

2014-01-17 Thread Eric Botcazou
> Bootstrapped & regression tested on x86_64-unknown-linux.  Also
> bootstrapped with --enable-checking=rtl.

Note that you can do only one bootstrap with --enable-checking=yes,rtl.

> Installed on the trunk.

As far as I can see, no, it was not installed.

-- 
Eric Botcazou


[PATCH] Fix crossing jumps in functions with forced_labels (PR rtl-optimization/57763)

2014-01-17 Thread Jakub Jelinek
Hi!

As mentioned in the PR, on alpha (but I don't see a reason why it can't
occur on most other targets except for i?86/x86_64, cr16, m32c, moxie and
msp430) if a function has non-NULL forced_labels (contains computed goto),
we can get ICEs due to incorrect EDGE_CROSSING flags on some edges.

The problem is that if not HAS_LONG_UNCOND_BRANCH,
fix_crossing_unconditional_branches changes normal unconditional jumps into
indirect jumps that jump to a single label, because a normal jump might not
be able to reach a far-away section.  But turning the jump into an
indirect one means computed_jump_p is true for it, and such jumps are
supposed to have edges to all forced_labels.
fix_crossing_unconditional_branches doesn't add those, so if, say, a
post-reload splitter attempts to change something, we can end up calling
make_edges, and that might not give the right flags to the edges.
As the indirect jump jumps to a single label only, I don't see why we should
require edges to all forced_labels; we know the single label to which it can
jump.  So the following patch fixes that by setting JUMP_LABEL on the
indirect jump, which results e.g. in computed_jump_p no longer returning
true for it.

Bootstrapped/regtested on x86_64-linux and i686-linux (where it isn't
called, sure) and Uros has bootstrapped/regtested this on Alpha.  Ok for
trunk?

2014-01-17  Jakub Jelinek  

PR rtl-optimization/57763
* bb-reorder.c (fix_crossing_unconditional_branches): Set JUMP_LABEL
on the new indirect jump_insn.

--- gcc/bb-reorder.c.jj 2014-01-03 11:41:01.0 +0100
+++ gcc/bb-reorder.c 2014-01-16 12:49:27.69537 +0100
@@ -2183,6 +2183,8 @@ fix_crossing_unconditional_branches (voi
  emit_insn_before (indirect_jump_sequence, last_insn);
  delete_insn (last_insn);
 
+ JUMP_LABEL (jump_insn) = label;
+
  /* Make BB_END for cur_bb be the jump instruction (NOT the
 barrier instruction at the end of the sequence...).  */
 


Jakub


Re: [Patch, cilk, C++] Fix cilk testsuite failure

2014-01-17 Thread Richard Sandiford
Steve Ellcey  writes:
> 2014-01-15  Andrew Pinski 
>   Steve Ellcey  
>
>   PR target/59462
>   * config/mips/mips.c (mips_print_operand): Check operand mode instead
>   of operator mode.
>
>
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index 617391c..60cb8ee 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -8184,7 +8184,7 @@ mips_print_operand (FILE *file, rtx op, int letter)
>  case 't':
>{
>   int truth = (code == NE) == (letter == 'T');
> - fputc ("zfnt"[truth * 2 + (GET_MODE (op) == CCmode)], file);
> + fputc ("zfnt"[truth * 2 + (GET_MODE (XEXP (op, 0)) == CCmode)], file);
>}
>break;
>  

I think it'd be more direct to check the register class, since we used
to store CCmode in GPRs too.  I.e. ST_REGNO_P (XEXP (op, 0)).
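
For readers unfamiliar with the idiom in the quoted hunk, this sketch isolates the string indexing. The function name branch_suffix and the parameter is_cc are hypothetical; is_cc stands for whichever test is used to decide that the operand is a condition-code register.

```cpp
// Mirrors the indexing in the quoted mips_print_operand hunk:
// index = truth * 2 + is_cc selects one of 'z', 'f', 'n', 't'.
inline char
branch_suffix (bool truth, bool is_cc)
{
  return "zfnt"[truth * 2 + is_cc];
}
```

Only the low bit of the index depends on the operand, so a wrong test there swaps 'z' with 'f' or 'n' with 't'.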

OK with that change, thanks.  Please backport to 4.8 too.

Richard


Allow passing arrays in registers on AArch64

2014-01-17 Thread Michael Hudson-Doyle
Hi, as discussed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59799
GCC currently gets a detail of the AArch64 ABI wrong: arrays are not
always passed by reference.  Fortunately the fix is rather easy...

I guess this is an ABI break, but my understanding is that there has been no
release of GCC yet which supports compiling a language that can pass arrays
by value on AArch64.
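
A simplified model of the resulting rule, as I understand the AAPCS64: aggregates, arrays now included, go by reference only when they are variable sized or larger than two 8-byte registers, unless they qualify for the FP/SIMD registers. The function and parameter names are hypothetical, and the FP/SIMD candidate check is reduced to a flag.

```cpp
// Simplified sketch, not aarch64_pass_by_reference itself.
// size_in_bytes < 0 models a variable-sized type.
inline bool
pass_by_reference (long size_in_bytes, bool is_fp_simd_candidate)
{
  const long units_per_word = 8;  // assumed: 64-bit general registers
  if (size_in_bytes < 0)
    return true;                  // variable sized: always by reference
  if (is_fp_simd_candidate)
    return false;                 // e.g. a homogeneous FP aggregate
  return size_in_bytes > 2 * units_per_word;
}
```

Under this model a 16-byte array is passed like a 16-byte struct, in registers, rather than unconditionally by reference.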

Cheers,
mwh

  2014-01-17  Michael Hudson-Doyle  

PR target/59799

* config/aarch64/aarch64.c (aarch64_pass_by_reference):
  The rules for passing arrays in registers are the same as
  for structs, so remove the special case for them.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fa53c71..d63da95 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -987,10 +987,7 @@ aarch64_pass_by_reference (cumulative_args_t pcum ATTRIBUTE_UNUSED,
 
   if (type)
 {
-  /* Arrays always passed by reference.  */
-  if (TREE_CODE (type) == ARRAY_TYPE)
-	return true;
-  /* Other aggregates based on their size.  */
+  /* Aggregates based on their size.  */
   if (AGGREGATE_TYPE_P (type))
 	size = int_size_in_bytes (type);
 }


Re: [ARM] Make -mcpu, -march and -mtune case-insensitive

2014-01-17 Thread James Greenhalgh
On Thu, Jan 16, 2014 at 06:13:33PM +, James Greenhalgh wrote:
> On Thu, Jan 16, 2014 at 05:14:24PM +, Richard Earnshaw wrote:
> > On 16/01/14 16:15, Alan Lawrence wrote:
> > > This is a small patch that makes the -mcpu, -march and -mtune
> > > command-line options case-insensitive, allowing e.g. -mcpu=Cortex-A15
> > > -march=ARMv7.
> > > 
> > > Regression tested on arm-none-eabi with no issues; options passed onto
> > > e.g. ld are always lowercase (as previously).
> > > 
> > > OK for trunk?
> > > 
> > > --Alan
> > > 
> > > ChangeLog:
> > > * config/arm/arm.opt: Make -mcpu, -march, -mtune case-insensitive.
> > 
> > * config/arm/arm.opt (mcpu, march, mtune): Make case-insensitive.
> > 
> > Don't forget the leading tab; and since someone else will have to commit
> > the patch for you, you should also include the date/author part as well.
> >  Generally
> > 
> >   Alan Lawrence  
> > 
> > would be acceptable, since the commit date may not be the same as the
> > posting date.
> > 
> > Otherwise, this is OK.
> > 
> 
> I've committed this to trunk on Alan's behalf as revision 206673,
> with the following Changelog:
> 
> 2014-01-16  Alan Lawrence  
> 
>   * config/arm/arm.opt: Make -mcpu, -march, -mtune case-insensitive.
> 

A more careful reading of your review of Alan's patch shows that this
should have been:

2014-01-16  Alan Lawrence  

* config/arm/arm.opt (mcpu, march, mtune): Make case-insensitive.

I've fixed this up in revision 206700.

Sorry for the noise.
James