Re: [PATCH] Detect loops in find_comparison_args

2012-07-27 Thread Paolo Bonzini
Il 26/07/2012 22:22, Sandra Loosemore ha scritto:
 Aha, I honestly couldn't figure out that was what you were trying to
 catch with the version you posted previously.
 
 How about this one?  Tested as before.

Yeah, that's cleaner.

Paolo


Re: [PATCH] New fdo summary-based icache sensitive unrolling (issue6351086)

2012-07-27 Thread Steven Bosscher
On Fri, Jul 27, 2012 at 6:47 AM, Teresa Johnson tejohn...@google.com wrote:
 * gcc/gcov-io.h (GCOV_TAG_SUMMARY_LENGTH): Update for new summary 
 info.
 (struct gcov_ctr_summary): Add new summary info: num_hot_counters and
 hot_cutoff_value.

You should also update the description for the data file at the head
of gcov-io.h:

   The data file contains the following records.
data: {unit summary:object summary:program* function-data*}*
unit: header int32:checksum
function-data:  announce_function present counts
announce_function: header int32:ident
int32:lineno_checksum int32:cfg_checksum
present: header int32:present
counts: header int64:count*
summary: int32:checksum {count-summary}GCOV_COUNTERS_SUMMABLE
count-summary:  int32:num int32:runs int64:sum
int64:max int64:sum_max

You've added two fields in count-summary IIUC.
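
As a rough illustration of why both the header comment and GCOV_TAG_SUMMARY_LENGTH need to move together, here is a sketch of the extended record. The struct name, the order of the fields and the widths of the two new fields are assumptions (the patch body is not quoted here); only the five existing fields come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of count-summary after the change.  The last two fields are the
   ones named in the ChangeLog; their widths and position are assumed.  */
struct count_summary_sketch {
  uint32_t num;              /* int32:num */
  uint32_t runs;             /* int32:runs */
  int64_t  sum;              /* int64:sum */
  int64_t  max;              /* int64:max */
  int64_t  sum_max;          /* int64:sum_max */
  uint32_t num_hot_counters; /* new, assumed int32 */
  int64_t  hot_cutoff_value; /* new, assumed int64 */
};

/* Length of one count-summary in 32-bit words, the unit the data file
   uses: an int32 takes one word, an int64 takes two.  */
static unsigned count_summary_words(unsigned int32_fields, unsigned int64_fields) {
  return int32_fields + 2 * int64_fields;
}
```

With the layout documented above, count_summary_words(2, 3) gives 8 words per summary; under the assumed widths for the new fields it would be count_summary_words(3, 4), so the description line for count-summary would grow by the same two entries.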

Ciao!
Steven


Re: [Patch, PPC] extend TARGET_NO_LWSYNC to cover 440 and 603 processors.

2012-07-27 Thread Iain Sandoe

On 21 Jul 2012, at 18:04, Iain Sandoe wrote:
 On 21 Jul 2012, at 17:43, Andrew Pinski wrote:
 On Sat, Jul 21, 2012 at 9:12 AM, Iain Sandoe i...@codesourcery.com wrote:

 The following patch has been in use internally for some time, to handle 
 two further cases where the processor does not have lwsync.  Verified on a 
 cross from i686-linux-gnu to powerpc-linux-gnu.

 This was only done for e500 and not other processors, as the e500 was
 not fully compatible even with the older spec, in that lwsync would
 cause an illegal instruction exception.  This is NOT the case for 603
 and the 440, so I don't think this patch should be applied.
 
 The original report, for which our change was made, was of an instruction 
 exception caused by lwsync on 440.
 I'm not personally familiar with that processor, I'll try and get access to a 
 board/confirm in due course.

To follow up on this thread and ref[1].
This is not repeatable on a 440EP.  It was reported against a device using a 
440H6 core.
We have not been able to locate an erratum, so we're going to drop the patch 
unless/until there is a specific reproducible report of failure.
thanks
Iain

[1] http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01817.html


Re: Diagnostics from GCC_DRIVER_HOST_INITIALIZATION

2012-07-27 Thread Dodji Seketeli
Hello,

Ryan Mansfield rmansfi...@qnx.com a écrit:

 On 12-07-19 06:06 PM, Gabriel Dos Reis wrote:

[...]

 Would moving the GCC_DRIVER_HOST_INITIALIZATION after diagnostic_initialize
 be OK?

 yes, I think so.

 OK, then here's a changelog entry for the diff.

 2012-07-20  Ryan Mansfield  rmansfi...@qnx.com

 * gcc.c (main): Move GCC_DRIVER_HOST_INITIALIZATION after
 diagnostic_initialize.

 Could someone please apply the change?

The change seems small and obvious enough to not require copyright
assignment on file, but, just to be sure, Ryan, do you have copyright
assignment to the FSF on file (sorry if my question is stupid)?

Gaby, can I go ahead and apply this?

Thanks.

[...]

 Index: gcc.c
 ===
 --- gcc.c (revision 189716)
 +++ gcc.c (working copy)
 @@ -6189,17 +6189,18 @@
  CL_DRIVER,
  decoded_options, decoded_options_count);
  
 -#ifdef GCC_DRIVER_HOST_INITIALIZATION
 -  /* Perform host dependent initialization when needed.  */
 -  GCC_DRIVER_HOST_INITIALIZATION;
 -#endif
 -
/* Unlock the stdio streams.  */
unlock_std_streams ();
  
gcc_init_libintl ();
  
diagnostic_initialize (global_dc, 0);
 +
 +#ifdef GCC_DRIVER_HOST_INITIALIZATION
 +  /* Perform host dependent initialization when needed.  */
 +  GCC_DRIVER_HOST_INITIALIZATION;
 +#endif
 +
if (atexit (delete_temp_files) != 0)
   fatal_error ("atexit failed");
  

-- 
Dodji
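
A small sketch of the ordering issue behind this change, on the assumption (suggested by the thread subject) that the host hook may want to emit diagnostics. Every name below is a hypothetical stand-in, not driver code: the point is only that a report issued before the diagnostic machinery is set up has nowhere to go.

```c
#include <assert.h>
#include <stdio.h>

static int diagnostics_ready = 0;
static int reports_lost = 0;

/* Stand-in for diagnostic_initialize (global_dc, 0).  */
static void diagnostic_initialize_sketch(void) { diagnostics_ready = 1; }

/* Stand-in for a GCC_DRIVER_HOST_INITIALIZATION hook that may need to
   report a problem it finds on the host.  */
static void host_initialization_sketch(int host_problem) {
  if (!host_problem)
    return;
  if (diagnostics_ready)
    fprintf(stderr, "warning: host initialization problem\n");
  else
    reports_lost++;  /* diagnostics not set up yet */
}

/* Runs the two steps in either order and returns how many reports
   could not be delivered.  */
static int driver_startup_sketch(int host_init_first, int host_problem) {
  diagnostics_ready = 0;
  reports_lost = 0;
  if (host_init_first) {
    host_initialization_sketch(host_problem);
    diagnostic_initialize_sketch();
  } else {
    diagnostic_initialize_sketch();
    host_initialization_sketch(host_problem);
  }
  return reports_lost;
}
```

With the old order (host hook first) a problem found by the hook is lost in this model; with the order from the diff it is reported.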


Re: Add hot/cold attributes for labels

2012-07-27 Thread Richard Guenther
On Thu, Jul 26, 2012 at 11:58 PM, Richard Henderson r...@redhat.com wrote:
 On 07/26/2012 02:41 PM, Richard Henderson wrote:
 This is a patch...

 ... that I should have attached.  Bah.

Do we need to mark the labels so we preserve them?  Consider

 goto foo;

foo:
bar __attribute__((cold)):
  ...

so bar will be unused?  What about BB merging if we end up with

  BB 3:
  ..
  fallthru
  bar __attribute__((cold)):
 ...

should BB 3 inherit the coldness?  I think we no longer disable
BB merging if the destination has user labels.
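
For concreteness, a minimal sketch of a cold label in ordinary C. It uses the label-attribute syntax that later GCC releases document (label, colon, attribute, then a semicolon), which is not the notation used in the fragments above; whether it matches the patch under discussion is an assumption. The function name is made up for the example.

```c
#include <assert.h>

/* Divides NUM by DEN, storing the result in *OUT.  Returns 0 on success
   and -1 when DEN is zero.  */
static int checked_div(int num, int den, int *out) {
  if (den == 0)
    goto fail;
  *out = num / den;
  return 0;

fail: __attribute__((cold));  /* the semicolon after the attribute is required */
  return -1;
}
```

Here the label is reached only by an explicit jump. The questions above concern the other cases: a label that nothing jumps to, and a label whose block is entered by fallthru and then merged with its predecessor.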

Richard.


Re: Add hot/cold attributes for labels

2012-07-27 Thread Steven Bosscher
On Fri, Jul 27, 2012 at 10:57 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Thu, Jul 26, 2012 at 11:58 PM, Richard Henderson r...@redhat.com wrote:
 On 07/26/2012 02:41 PM, Richard Henderson wrote:
 This is a patch...

 ... that I should have attached.  Bah.

 Do we need to mark the labels so we preserve them?  Consider

  goto foo;

 foo:
 bar __attribute__((cold)):
   ...

 so bar will be unused?  What about BB merging if we end up with

   BB 3:
   ..
   fallthru
   bar __attribute__((cold)):
  ...

 should BB 3 inherit the coldness?  I think we no longer disable
 BB merging if the destination has user labels.

Right. I don't like the use of this attribute on labels at all, for
the reasons you list here. I think it would be much cleaner to add a
branch hint on the label in the asm goto, to contain this extension
and also to make it clear that it's not the label that is cold but
the jump that is unlikely to be executed (i.e. cause and effect: the
jump is unlikely and therefore the basic block is cold).

Something like this:

   asm-goto-operands:
     asm-goto-branch-hint identifier
     asm-goto-operands , asm-goto-branch-hint identifier

   asm-goto-branch-hint: empty | + | -
     where + means branch-likely and - means branch-unlikely

-

  asm goto (+l1);
  asm goto (+l1);
  asm goto (-l1);

Ciao!
Steven


Re: [rs6000 1/3] Remove RIOS, RSC and RIOS2 processor types

2012-07-27 Thread David Edelsohn
On Thu, Jul 26, 2012 at 3:38 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 Move those parts of rios.md that apply to 601 to a new file
 601.md, renaming everything from rios1* to ppc601*.

 2012-07-26  Segher Boessenkool  seg...@kernel.crashing.org

 gcc/
 * config/rs6000/601.md: New file.
 * config/rs6000/aix43.h (ASM_CPU_SPEC): Delete support for
 RIOS CPUs.
 * config/rs6000/aix51.h (ASM_CPU_SPEC): Likewise.
 * config/rs6000/driver-rs6000.c (detect_processor_aix,
 struct asm_names): Likewise.
 * config/rs6000/rios1.md: Delete file.
 * config/rs6000/rios2.md: Delete file.
 * config/rs6000/rs6000-cpus.def: Delete definitions for RIOS
 CPUs.
 * config/rs6000/rs6000-opts.h (enum processor_type): Delete
 PROCESSOR_RIOS1 and PROCESSOR_RIOS2.
 * config/rs6000/rs6000-tables.opt: Regenerated.
 * config/rs6000/rs6000.c (struct rios1_cost, struct rios2_cost):
 Delete.
 (rs6000_option_override_internal): Delete support for RIOS CPUs.
 (rs6000_conditional_register_usage): Adjust comment.
 (rs6000_issue_rate): Delete support for RIOS CPUs.
 * config/rs6000/rs6000.h (ASM_CPU_SPEC): Delete support for
 RIOS CPUs.
 (PROCESSOR_POWER): Change to PROCESSOR_PPC601.
 (PROCESSOR_DEFAULT): Change to PROCESSOR_PPC603.
 * config/rs6000/rs6000.md (define_attr cpu): Delete rios1
 and rios2.
 (include rios1.md, include rios2.md): Delete.
 (include 601.md): New.
 * config/rs6000/rs6000.opt (enum rs6000_cpu): Default to
 PROCESSOR_PPC603.
 * config/rs6000/t-aix43 (MULTILIB_MATCHES): Delete support
 for RIOS CPUs.
 * config/rs6000/t-rs6000 (MD_INCLUDES): Delete rios1.md and
 rios2.md .  Add 601.md .

This patch is okay.

Thanks a lot for helping with this cleanup!

Thanks, David


Re: Add hot/cold attributes for labels

2012-07-27 Thread Richard Guenther
On Fri, Jul 27, 2012 at 11:08 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Fri, Jul 27, 2012 at 10:57 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Thu, Jul 26, 2012 at 11:58 PM, Richard Henderson r...@redhat.com wrote:
 On 07/26/2012 02:41 PM, Richard Henderson wrote:
 This is a patch...

 ... that I should have attached.  Bah.

 Do we need to mark the labels so we preserve them?  Consider

  goto foo;

 foo:
 bar __attribute__((cold)):
   ...

 so bar will be unused?  What about BB merging if we end up with

   BB 3:
   ..
   fallthru
   bar __attribute__((cold)):
  ...

 should BB 3 inherit the coldness?  I think we no longer disable
 BB merging if the destination has user labels.

 Right. I don't like the use of this attribute on labels at all, for
 the reasons you list here. I think it would be much cleaner to add a
 branch hint on the label in the asm goto, to contain this extension
 and also to make it clear that it's not the label that is cold but
 the jump that is unlikely to be executed (i.e. cause and effect: the
 jump is unlikely and therefore the basic block is cold).

As in the case where you have both an unlikely and likely jump to a
basic-block.  But what I understand is that rth adds a way to mark
a basic-block as hot or cold, not a way to mark an edge as hot or cold
(that would be what the asm goto annotation would do).  Both cases
are of course useful.

Richard.

 Something like this:

 asm-goto-operands:
   asm-goto-branch-hint identifier
   asm-goto-operands , asm-goto-branch-hint identifier

 asm-goto-branch-hint: empty | + | -
   where + means branch-likely and - means branch-unlikely

 -

   asm goto (+l1);
   asm goto (+l1);
   asm goto (-l1);

 Ciao!
 Steven


Re: [rs6000 2/3] Remove support for old POWER

2012-07-27 Thread David Edelsohn
On Thu, Jul 26, 2012 at 3:38 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 That is, -mpower and friends, TARGET_POWER and friends.
 These are always disabled now.

 2012-07-26  Segher Boessenkool  seg...@kernel.crashing.org

 gcc/
 * common/config/rs6000/rs6000-common.c (rs6000_handle_option):
 Delete code for -mno-power, -mpower, and -mpower2.
 * config/rs6000/aix43.h (NON_POWERPC_MASKS): Delete.
 (SUBTARGET_OVERRIDE_OPTIONS): Delete check for POWER together
 with -maix64.
 (ASM_CPU_SPEC): Delete support for POWER and POWER2.
 * config/rs6000/aix51.h (NON_POWERPC_MASKS): Delete.
 (SUBTARGET_OVERRIDE_OPTIONS): Delete check for POWER together
 with -maix64.
 (ASM_CPU_SPEC): Delete support for POWER and POWER2.
 * config/rs6000/aix52.h (NON_POWERPC_MASKS): Delete.
 (SUBTARGET_OVERRIDE_OPTIONS): Delete check for POWER together
 with -maix64.
 (TARGET_POWER): Delete.
 * config/rs6000/aix53.h (NON_POWERPC_MASKS): Delete.
 (SUBTARGET_OVERRIDE_OPTIONS): Delete check for POWER together
 with -maix64.
 (TARGET_POWER): Delete.
 * config/rs6000/aix61.h (NON_POWERPC_MASKS): Delete.
 (SUBTARGET_OVERRIDE_OPTIONS): Delete check for POWER together
 with -maix64.
 (TARGET_POWER): Delete.
 * config/rs6000/darwin.h (TARGET_POWER): Delete.
 * config/rs6000/driver-rs6000.c (struct asm_names): Delete
 support for -mpower, -mpower2, and -mno-power.
 * config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
 Likewise.
 (rs6000_cpu_cpp_builtins): Likewise.
 * config/rs6000/rs6000-cpus.def: Likewise.
 * config/rs6000/rs6000-tables.opt: Regenerate.X FIXME
 * config/rs6000/rs6000.c (POWER_MASKS): Delete.
 (rs6000_option_override_internal): Adjust.
 (rs6000_conditional_register_usage): Adjust.
 (rs6000_emit_move): Adjust.
 (rs6000_common_init_builtins): Adjust.
 (rs6000_init_libfuncs): Adjust.
 (rs6000_output_function_prologue): Adjust.
 (rs6000_adjust_cost): Adjust.
 (struct rs6000_opt_masks): Delete MASK_POWER and MASK_POWER2.
 * config/rs6000/rs6000.h (ASM_CPU_SPEC): Delete support for
 POWER and POWER2.
 (TARGET_DEFAULT): Adjust.
 (PROCESSOR_POWER): Delete.
 (SHIFT_COUNT_TRUNCATED): Adjust.
 * config/rs6000/rs6000.md (extendqisi2): Delete POWER support.
 (extendqisi2_power): Delete.
 (extendqisi2_no_power): Adjust.
 (extendqihi2, extendqihi2_power, extendqihi2_no_power):
 Likewise.
 (sminsi3, smaxsi3, uminsi3, umaxsi3): Adjust.
 (anonymous doz insn patterns): Delete.
 (abssi2): Adjust.
 (abssi2_power): Delete.
 (abssi2_nopower): Adjust.
 (nabs_power, nabs_nopower): Likewise.
 (mulsi3, mulsi3_mq, mulsi3_no_mq, mulsi3_mq_internal1):
 Likewise.  Delete anonymous post-reload splitter.
 (mulsi3_no_mq_internal1): rename to...
 (mulsi3_internal1): New define_insn.
 (mulsi3_mq_internal2, mulsi3_no_mq_internal2, mulsi3_internal2):
 Likewise.
 (divmodsi4, divmodsi4_internal, udivmode3, udivsi3_mq,
 udivsi3_no_mq, udivsi3, divmode3, divsi3_mq, divmode3_no_mq,
 udivmodsi4_normal, udivmodsi4_tests, udivmodsi4): Likewise.
 (mulh_call, mull_call, divss_call, divus_call, quoss_call,
 quous_call): Likewise.
 (maskir_internal1, maskir_internal2, maskir_internal3,
 maskir_internal4, maskir_internal5, maskir_internal6,
 maskir_internal7, maskir_internal8): Delete.
 (ashlsi3, ashlsi3_power, ashlsi3_no_power): Adjust.
 (anonymous sl insn patterns): Delete.
 (lshrsi3, lshrsi3_power, lshrsi3_no_power): Adjust.
 (lshrsi3_64): Adjust.
 (anonymous sr insn patterns): Delete.
 (anonymous rrib insn patterns): Delete.
 (ashrsi3, ashrsi3_power, ashrsi3_no_power): Adjust.
 (anonymous sra insn patterns): Delete.
 (sqrtsf2, sqrtdf2, sqrtdf2_fpr): Adjust.
 (fix_truncmodesi2, fix_truncmodesi2_internal,
 fctiwz_mode): Adjust.
 (mulsidi3, mulsidi3_mq, mulsidi3_no_mq, umulsidi3, umulsidi3_mq,
 umulsidi3_no_mq, smulsi3_highpart, smulsi3_highpart_mq,
 smulsi3_highpart_no_mq, umulsi3_highpart, umulsi3_highpart_mq,
 umulsi3_highpart_no_mq): Adjust.
 (ashldi3_power, lshrdi3_power, ashrdi3_power): Delete.
 (ashrdi3_no_power, ashldi3, ashldi3_internal1,
 lshrdi3_internal1): Adjust.
 (fix_trunctfsi2, fix_trunctfsi2_fprs): Adjust.
 (movti_power): Delete.
 (movti_string): Adjust.
 (stmsi8, stmsi7, stmsi6, stmsi5, stmsi4, stmsi3): Adjust.
 (stmsi8_power, stmsi7_power, stmsi6_power, stmsi5_power,
 stmsi4_power, stmsi3_power): 

Re: [rs6000 2/3] Remove support for old POWER

2012-07-27 Thread Segher Boessenkool

This is okay, but why does the ChangeLog line

 * config/rs6000/rs6000-tables.opt: Regenerate.X FIXME

have X FIXME?


Just to check if you are paying attention :-)  It was a
reminder to myself to force a regenerate of the file, because
the timestamps get messed up during patch series rebasing.
I then remembered to regenerate the file, but not to remove my
reminder.  Can't have everything :-)


Segher



[PATCH, i386]: Handle zero extended addresses in ix86_avoid_lea_for_addr

2012-07-27 Thread Uros Bizjak
Hello!

Attached patch enables ix86_avoid_lea_for_addr to process
zero-extended addresses. This patch should help atom performance,
especially in x32 mode.

Please note the complication with insn re-recognition in
ix86_avoid_lea_for_addr, to solve the problem as described in the
comment:

  /* ix86_avoid_lea_for_addr re-recognizes insn and changes operands[]
 array behind our backs.  To make things worse, zero-extended operands
 (zero_extend:DI (addr:SI)) are re-recognized as (addr:DI), since they
 also satisfy operand constraints of one of many *lea<mode> insn patterns.

 However, at this point we are looking only if the original insn
 is performing inherent zero extension, and will emit
 split insn sequence in SImode for this case.  */
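
To see why the mode of the split sequence matters, the following sketch models the two RTL forms in plain C. The function names and the scale factor of 4 are made up for the example; the only fact relied on is that arithmetic on uint32_t wraps modulo 2^32 before the value is widened.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extend:DI (plus:SI base (mult:SI index 4))): the address
   arithmetic is done in 32 bits and the result is then widened.  */
static uint64_t addr_si_zext(uint32_t base, uint32_t index) {
  uint32_t sum = base + index * 4u;  /* wraps modulo 2^32 */
  return (uint64_t)sum;
}

/* Models the same expression re-recognized as a DImode address: the
   arithmetic is done in 64 bits, so nothing wraps.  */
static uint64_t addr_di(uint64_t base, uint64_t index) {
  return base + index * 4u;
}
```

For base 0xFFFFFFF0 and index 4 the first form yields 0 and the second yields 0x100000000, so a split sequence emitted in DImode for an insn that was originally a zero-extended SImode address could compute a different value.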

2012-07-27  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_avoid_lea_for_addr): Handle
zero-extended addresses.
(ix86_split_lea_for_addr): Unconditionally convert target and
all address operands to requested mode.
* config/i386/i386.md (*lea<mode>): Determine mode of split insn
sequence from the original insn pattern.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}, also when configured with --with-arch=core2 --with-cpu=atom

I will wait a day or two for possible comments, before the patch is
committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 189904)
+++ config/i386/i386.md (working copy)
@@ -3474,13 +3474,28 @@
          (match_operand:SI 1 "x86_64_zext_general_operand"
                        "rmWz,0,r   ,m  ,r   ,m")))]
   "TARGET_64BIT"
-  "@
-   mov{l}\t{%1, %k0|%k0, %1}
-   #
-   movd\t{%1, %0|%0, %1}
-   movd\t{%1, %0|%0, %1}
-   %vmovd\t{%1, %0|%0, %1}
-   %vmovd\t{%1, %0|%0, %1}"
+{
+  switch (get_attr_type (insn))
+    {
+    case TYPE_IMOVX:
+      if (ix86_use_lea_for_mov (insn, operands))
+       return "lea{l}\t{%E1, %k0|%k0, %E1}";
+      else
+       return "mov{l}\t{%1, %k0|%k0, %1}";
+
+    case TYPE_MULTI:
+      return "#";
+
+    case TYPE_MMXMOV:
+      return "movd\t{%1, %0|%0, %1}";
+
+    case TYPE_SSEMOV:
+      return "%vmovd\t{%1, %0|%0, %1}";
+
+    default:
+      gcc_unreachable ();
+    }
+}
   [(set_attr "type" "imovx,multi,mmxmov,mmxmov,ssemov,ssemov")
    (set_attr "prefix" "orig,*,orig,orig,maybe_vex,maybe_vex")
    (set_attr "prefix_0f" "0,*,*,*,*,*")
@@ -5479,7 +5494,26 @@
   "reload_completed && ix86_avoid_lea_for_addr (insn, operands)"
   [(const_int 0)]
 {
-  ix86_split_lea_for_addr (operands, <MODE>mode);
+  enum machine_mode mode = <MODE>mode;
+  rtx addr;
+
+  /* ix86_avoid_lea_for_addr re-recognizes insn and changes operands[]
+     array behind our backs.  To make things worse, zero-extended operands
+     (zero_extend:DI (addr:SI)) are re-recognized as (addr:DI), since they
+     also satisfy operand constraints of one of many *lea<mode> insn patterns.
+
+     However, at this point we are looking only if the original insn
+     is performing inherent zero extension, and will emit
+     split insn sequence in SImode for this case.  */
+  addr = SET_SRC (PATTERN (curr_insn));
+
+  /* Emit all operations in SImode for zero-extended addresses.  Recall
+     that x86_64 inherently zero-extends SImode operations to DImode.  */
+  if (GET_CODE (addr) == ZERO_EXTEND
+      || GET_CODE (addr) == AND)
+    mode = SImode;
+
+  ix86_split_lea_for_addr (operands, mode);
   DONE;
 }
   [(set_attr "type" "lea")
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 189904)
+++ config/i386/i386.c  (working copy)
@@ -17036,11 +17036,6 @@ ix86_avoid_lea_for_addr (rtx insn, rtx operands[])
   struct ix86_address parts;
   int ok;
 
-  /* FIXME: Handle zero-extended addresses.  */
-  if (GET_CODE (operands[1]) == ZERO_EXTEND
-  || GET_CODE (operands[1]) == AND)
-return false;
-
   /* Check we need to optimize.  */
   if (!TARGET_OPT_AGU || optimize_function_for_size_p (cfun))
 return false;
@@ -17124,7 +17119,7 @@ ix86_emit_binop (enum rtx_code code, enum machine_
It is assumed that it is allowed to clobber flags register
at lea position.  */
 
-extern void
+void
 ix86_split_lea_for_addr (rtx operands[], enum machine_mode mode)
 {
   unsigned int regno0, regno1, regno2;
@@ -17135,7 +17130,7 @@ ix86_split_lea_for_addr (rtx operands[], enum mach
   ok = ix86_decompose_address (operands[1], &parts);
   gcc_assert (ok);
 
-  target = operands[0];
+  target = gen_lowpart (mode, operands[0]);
 
   regno0 = true_regnum (target);
   regno1 = INVALID_REGNUM;
@@ -17143,18 +17138,19 @@ ix86_split_lea_for_addr (rtx operands[], enum mach
 
   if (parts.base)
 {
-  if (GET_MODE (parts.base) != mode)
-   parts.base = gen_lowpart (mode, parts.base);
+  parts.base = gen_lowpart (mode, parts.base);
   regno1 = true_regnum (parts.base);
 }
 
   if (parts.index)
 {
-  

Re: [rs6000 3/3] Remove MQ

2012-07-27 Thread David Edelsohn
On Thu, Jul 26, 2012 at 3:38 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 gcc/
 * config/rs6000/constraints.md: Delete q constraint.
 * config/rs6000/dfp.md (movsd_hardfloat, movsd_softfloat):
 Delete the q alternative.
 * config/rs6000/predicates.md (gpc_reg_operand): Replace
 MQ_REGNO with the literal 64.
 * config/rs6000/rs6000.c (rs6000_debug_reg_global,
 rs6000_init_hard_regno_mode_ok, rs6000_dbx_register_number):
 Adjust to MQ_REGNO removal.
 * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTER): Adjust
 comment.
 (REG_ALLOC_ORDER): Adjust comment.  Remove MQ from alloc order.
 (enum reg_class): Adjust comment.  Delete MQ_REGS.
 (REG_CLASS_CONTENTS): Adjust.
 (REGISTER_NAMES, ADDITIONAL_REGISTER_NAMES): Adjust comment.
 * config/rs6000/rs6000.md: Delete MQ_REGNO.
 (movsi_internal1, movsi_internal1_single, movhi_internal,
 movqi_internal, movcc_internal1, movsf_hardfloat,
 movsf_softfloat): Delete the q alternative.
 (ctrmode_internal1, ctrmode_internal2, ctrmode_internal5,
 ctrmode_internal6): Delete q constraint.

This is okay, but gpc_reg_operand in predicates.md needs something
better than a magic 64 constant in the test.  Should this be a
comment or a new symbolic value related to the max/last FPR or number of
general registers?

Thanks, David


Re: Add hot/cold attributes for labels

2012-07-27 Thread Steven Bosscher
On Fri, Jul 27, 2012 at 11:15 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 Right. I don't like the use of this attribute on labels at all, for
 the reasons you list here. I think it would be much cleaner to add a
 branch hint on the label in the asm goto, to contain this extension
 and also to make it clear that it's not the label that is cold but
 the jump that is unlikely to be executed (i.e. cause and effect: the
 jump is unlikely and therefore the basic block is cold).

 As in the case where you have both an unlikely and likely jump to a
 basic-block.  But what I understand is that rth adds a way to mark
 a basic-block as hot or cold, not a way to mark an edge as hot or cold
 (that would be what the asm goto annotation would do).  Both cases
 are of course useful.

I don't see why it is useful to be able to mark a basic block as hot
or cold. This is something that the compiler can figure out for itself
if you provide the branch hints (__builtin_expect is also a kind of
branch hint). Marking basic blocks as likely or unlikely seems just
redundant and confusing to me. A basic block being hot or cold is an
effect of its incoming edges being unlikely-taken, not an inherent
property of the basic block itself.
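
A minimal sketch of what I mean by a branch hint, using __builtin_expect. The unlikely macro and the function are names made up for the example; the hint sits on the condition, and any coldness of the block behind it is left for the compiler to derive.

```c
#include <assert.h>
#include <stddef.h>

/* Common wrapper: tells the compiler the condition is expected to be false.  */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sums the N elements of V, returning -1 as soon as a negative element
   is seen.  */
static long sum_checked(const int *v, size_t n) {
  long total = 0;
  for (size_t i = 0; i < n; i++) {
    if (unlikely(v[i] < 0))
      return -1;  /* reached only through the branch hinted as unlikely */
    total += v[i];
  }
  return total;
}
```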

Ciao!
Steven


Re: [rs6000 3/3] Remove MQ

2012-07-27 Thread Segher Boessenkool

This is okay, but gpc_reg_operand in predicates.md needs something
better than a magic 64 constant in the test.  Should this be a
comment or a new symbolic value related to the max/last FPR or number of
general registers?


The test assumes a lot about the relative ordering of the
various regnos (like most of the predicates).  In my opinion
a literal here is better than testing with some in fact
unrelated macro.  All these predicates could use a cleanup,
bring them into the 21st century.  I didn't really feel like
doing that though.

I'll add a comment.


Segher



Re: [Test] Fix for PRPR53981

2012-07-27 Thread Kirill Yukhin

 OK if you remove the declarations for abort, exit, rand, and srand;
 they're no longer needed.  Presumably an alternate fix would be to
 add extern before the declarations of rand and srand.

 Janis

Committed to trunk and 4.7 branch

http://gcc.gnu.org/ml/gcc-cvs/2012-07/msg00811.html
http://gcc.gnu.org/ml/gcc-cvs/2012-07/msg00810.html

Thanks, K


Re: Add hot/cold attributes for labels

2012-07-27 Thread Richard Guenther
On Fri, Jul 27, 2012 at 11:40 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Fri, Jul 27, 2012 at 11:15 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 Right. I don't like the use of this attribute on labels at all, for
 the reasons you list here. I think it would be much cleaner to add a
 branch hint on the label in the asm goto, to contain this extension
 and also to make it clear that it's not the label that is cold but
 the jump that is unlikely to be executed (i.e. cause and effect: the
 jump is unlikely and therefore the basic block is cold).

 As in the case where you have both an unlikely and likely jump to a
 basic-block.  But what I understand is that rth adds a way to mark
 a basic-block as hot or cold, not a way to mark an edge as hot or cold
 (that would be what the asm goto annotation would do).  Both cases
 are of course useful.

 I don't see why it is useful to be able to mark a basic block as hot
 or cold. This is something that the compiler can figure out for itself
 if you provide the branch hints (__builtin_expect is also a kind of
 branch hint). Marking basic blocks as likely or unlikely seems just
 redundant and confusing to me. A basic block being hot or cold is an
 effect of its incoming edges being unlikely-taken, not an inherent
 property of the basic block itself.

Well, I see it as a more fine-grained way to force the compiler to optimize
that block for size (which then leads to my question on basic-block merging
and propagating that info).

Richard.

 Ciao!
 Steven


[PATCH][1/n] into-SSA TLC

2012-07-27 Thread Richard Guenther

This removes one FOR_EACH_REFERENCED_VAR loop from into-SSA.
insert_phi_nodes, which is only called when we rewrite the whole
function into SSA, not from SSA updating, has an awkward way
of inserting PHI nodes for all relevant vars: it uses
three hashtable lookups for each var we insert PHIs for (and
some more).  The following patch gets that down to zero
at the cost of a qsort on a vector, which replaces the temporary
bitmap.  That sounds faster overall (not that into-SSA time
matters - only update-SSA time would), and brings me one step
closer to eventually not requiring the UID -> DECL mapping
(referenced-vars) in into-SSA (which is the major blocker for
getting rid of that mapping altogether).
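
The shape of the new loop, reduced to a self-contained sketch: walk the table once, keep pointers to the relevant entries, sort them by UID, then process them in that order. The struct and function names here are hypothetical stand-ins for def_blocks_d and friends, and a plain array stands in for the hashtable.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a def_blocks entry: a UID plus a "needs PHI" flag.  */
struct def_entry {
  unsigned uid;
  int needs_phi;
};

/* qsort comparator over an array of pointers to entries, ordering by UID.  */
static int compare_by_uid(const void *a, const void *b) {
  const struct def_entry *ea = *(const struct def_entry *const *)a;
  const struct def_entry *eb = *(const struct def_entry *const *)b;
  return (ea->uid > eb->uid) - (ea->uid < eb->uid);
}

/* Copies pointers to the relevant entries of TABLE into OUT, sorts them by
   UID and returns how many were kept.  OUT must have room for N pointers.  */
static size_t collect_sorted(struct def_entry *table, size_t n,
                             struct def_entry **out) {
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (table[i].needs_phi)
      out[count++] = &table[i];
  qsort(out, count, sizeof out[0], compare_by_uid);
  return count;
}
```

Sorting by UID keeps the processing order independent of the table's iteration order. The comparator in the patch below never returns 0, which is fine as long as UIDs are unique; the sketch returns 0 for equal keys only to be a general-purpose comparator.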

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2012-07-27  Richard Guenther  rguent...@suse.de

* tree-into-ssa.c (def_blocks_p): New typedef.
(insert_phi_nodes_compare_def_blocks): New function.
(insert_phi_nodes): Do not walk over referenced vars, instead
walk over recorded def_blocks, record relevant ones and sort
them to avoid repeated hashtable lookups.

Index: gcc/tree-into-ssa.c
===
*** gcc/tree-into-ssa.c (revision 189904)
--- gcc/tree-into-ssa.c (working copy)
*** struct def_blocks_d
*** 67,72 
--- 67,77 
bitmap livein_blocks;
  };
  
+ typedef struct def_blocks_d *def_blocks_p;
+ 
+ DEF_VEC_P(def_blocks_p);
+ DEF_VEC_ALLOC_P(def_blocks_p,heap);
+ 
  
  /* Each entry in DEF_BLOCKS contains an element of type STRUCT
 DEF_BLOCKS_D, mapping a variable VAR to a bitmap describing all the
*** insert_phi_nodes_for (tree var, bitmap p
*** 1142,1147 
--- 1147,1164 
  }
  }
  
+ /* Sort def_blocks after DECL_UID of their var.  */
+ 
+ static int
+ insert_phi_nodes_compare_def_blocks (const void *a, const void *b)
+ {
+   const struct def_blocks_d *defa = *(struct def_blocks_d * const *)a;
+   const struct def_blocks_d *defb = *(struct def_blocks_d * const *)b;
+   if (DECL_UID (defa->var) < DECL_UID (defb->var))
+     return -1;
+   else
+     return 1;
+ }
  
  /* Insert PHI nodes at the dominance frontier of blocks with variable
 definitions.  DFS contains the dominance frontier information for
*** insert_phi_nodes_for (tree var, bitmap p
*** 1150,1192 
  static void
  insert_phi_nodes (bitmap_head *dfs)
  {
!   referenced_var_iterator rvi;
!   bitmap_iterator bi;
!   tree var;
!   bitmap vars;
!   unsigned uid;
  
timevar_push (TV_TREE_INSERT_PHI_NODES);
  
/* Do two stages to avoid code generation differences for UID
   differences but no UID ordering differences.  */
  
!   vars = BITMAP_ALLOC (NULL);
!   FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
  {
!   struct def_blocks_d *def_map;
! 
!   def_map = find_def_blocks_for (var);
!   if (def_map == NULL)
!   continue;
! 
!   if (get_phi_state (var) != NEED_PHI_STATE_NO)
!   bitmap_set_bit (vars, DECL_UID (var));
! }
! 
!   EXECUTE_IF_SET_IN_BITMAP (vars, 0, uid, bi)
! {
!   tree var = referenced_var (uid);
!   struct def_blocks_d *def_map;
!   bitmap idf;
! 
!   def_map = find_def_blocks_for (var);
!       idf = compute_idf (def_map->def_blocks, dfs);
!   insert_phi_nodes_for (var, idf, false);
BITMAP_FREE (idf);
  }
  
!   BITMAP_FREE (vars);
  
timevar_pop (TV_TREE_INSERT_PHI_NODES);
  }
--- 1167,1196 
  static void
  insert_phi_nodes (bitmap_head *dfs)
  {
!   htab_iterator hi;
!   unsigned i;
!   struct def_blocks_d *def_map;
!   VEC(def_blocks_p,heap) *vars;
  
timevar_push (TV_TREE_INSERT_PHI_NODES);
  
+   vars = VEC_alloc (def_blocks_p, heap, htab_elements (def_blocks));
+   FOR_EACH_HTAB_ELEMENT (def_blocks, def_map, struct def_blocks_d *, hi)
+     if (get_phi_state (def_map->var) != NEED_PHI_STATE_NO)
+   VEC_quick_push (def_blocks_p, vars, def_map);
+ 
/* Do two stages to avoid code generation differences for UID
   differences but no UID ordering differences.  */
+   VEC_qsort (def_blocks_p, vars, insert_phi_nodes_compare_def_blocks);
  
!   FOR_EACH_VEC_ELT (def_blocks_p, vars, i, def_map)
  {
!       bitmap idf = compute_idf (def_map->def_blocks, dfs);
!       insert_phi_nodes_for (def_map->var, idf, false);
BITMAP_FREE (idf);
  }
  
!   VEC_free(def_blocks_p, heap, vars);
  
timevar_pop (TV_TREE_INSERT_PHI_NODES);
  }


[PATCH] convert target_expmed macro accessors into inline functions

2012-07-27 Thread Nathan Froyd
As suggested by rth here:

http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01281.html

this patch converts all the #define accessors in expmed.h to use inline
functions instead.

By itself, doing that conversion is not very exciting.  Followup patches
might:

* Move setters into expmed.c;
* Reduce space of fields by not using NUM_MACHINE_MODES, similar to the
  convert_cost case;
* Possibly move the getters into expmed.c, assuming that LTO will take
  care of the performance hit.  Doing so enables target_expmed to not be
  exposed everywhere.
* Lazily initialize the costs.
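
A reduced sketch of the conversion for one accessor, to show the pattern. The accessor names (add_cost_ptr, set_add_cost, add_cost) follow the ChangeLog below; the struct, its field and the array bound are simplified stand-ins, and modes are plain ints rather than enum machine_mode.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_MODES_SKETCH = 4 };  /* stand-in for NUM_MACHINE_MODES */

/* Stand-in for struct target_expmed, reduced to a single cost table
   indexed by [speed][mode].  */
struct target_expmed_sketch {
  int x_add_cost[2][NUM_MODES_SKETCH];
};

static struct target_expmed_sketch this_target_expmed_sketch;

/* Before: a macro exposing the array, used as add_cost[speed][mode].
   After: all access goes through these three functions.  */

/* Returns a pointer to the add cost for SPEED and MODE.  */
static inline int *add_cost_ptr(bool speed, int mode) {
  assert(mode >= 0 && mode < NUM_MODES_SKETCH);
  return &this_target_expmed_sketch.x_add_cost[speed][mode];
}

/* Sets the add cost for SPEED and MODE to COST.  */
static inline void set_add_cost(bool speed, int mode, int cost) {
  *add_cost_ptr(speed, mode) = cost;
}

/* Returns the add cost for SPEED and MODE.  */
static inline int add_cost(bool speed, int mode) {
  return *add_cost_ptr(speed, mode);
}
```

Once callers no longer index the array themselves, the storage behind the _ptr function can change (smaller tables, lazy initialization) without touching them, which is what the followups listed above would rely on.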

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

* expmed.h (alg_hash, alg_hash_used_p, sdiv_pow2_cheap,
smod_pow2_cheap, zero_cost, add_cost, neg_cost, shift_cost,
shiftadd_cost, shiftsub0_cost, shiftsub1_cost, mul_cost,
sdiv_cost, udiv_cost, mul_widen_cost, mul_highpart_cost): Delete
macro definitions and re-purpose as inline functions.
(alg_hash_entry_ptr, set_alg_hash_used_p, sdiv_pow2_cheap_ptr,
set_sdiv_pow2_cheap, smod_pow2_cheap_ptr, set_smod_pow2_cheap,
zero_cost_ptr, set_zero_cost, add_cost_ptr, set_add_cost,
neg_cost_ptr, set_neg_cost, shift_cost_ptr, set_shift_cost,
shiftadd_cost_ptr, set_shiftadd_cost, shiftsub0_cost_ptr,
set_shiftsub0_cost, shiftsub1_cost_ptr, set_shiftsub1_cost,
mul_cost_ptr, set_mul_cost, sdiv_cost_ptr, set_sdiv_cost,
udiv_cost_ptr, set_udiv_cost, mul_widen_cost_ptr,
set_mul_widen_cost, mul_highpart_cost_ptr, set_mul_highpart_cost):
New functions.
(convert_cost_ptr): New function, split out from...
(set_convert_cost, convert_cost): ...here.
* expmed.c, tree-ssa-loop-ivopts.c: Update for new functions.
* gimple-ssa-strength-reduction.c: Likewise.

---
 gcc/expmed.c|  230 ++-
 gcc/expmed.h|  441 +++
 gcc/gimple-ssa-strength-reduction.c |6 +-
 gcc/tree-ssa-loop-ivopts.c  |   24 +-
 4 files changed, 533 insertions(+), 168 deletions(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index e660a3f..9743fc0 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -143,20 +143,24 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
   PUT_MODE (all->shift_sub1, mode);
   PUT_MODE (all->convert, mode);
 
-  add_cost[speed][mode] = set_src_cost (all->plus, speed);
-  neg_cost[speed][mode] = set_src_cost (all->neg, speed);
-  mul_cost[speed][mode] = set_src_cost (all->mult, speed);
-  sdiv_cost[speed][mode] = set_src_cost (all->sdiv, speed);
-  udiv_cost[speed][mode] = set_src_cost (all->udiv, speed);
-
-  sdiv_pow2_cheap[speed][mode] = (set_src_cost (all->sdiv_32, speed)
-                                  <= 2 * add_cost[speed][mode]);
-  smod_pow2_cheap[speed][mode] = (set_src_cost (all->smod_32, speed)
-                                  <= 4 * add_cost[speed][mode]);
-
-  shift_cost[speed][mode][0] = 0;
-  shiftadd_cost[speed][mode][0] = shiftsub0_cost[speed][mode][0]
-    = shiftsub1_cost[speed][mode][0] = add_cost[speed][mode];
+  set_add_cost (speed, mode, set_src_cost (all->plus, speed));
+  set_neg_cost (speed, mode, set_src_cost (all->neg, speed));
+  set_mul_cost (speed, mode, set_src_cost (all->mult, speed));
+  set_sdiv_cost (speed, mode, set_src_cost (all->sdiv, speed));
+  set_udiv_cost (speed, mode, set_src_cost (all->udiv, speed));
+
+  set_sdiv_pow2_cheap (speed, mode, (set_src_cost (all->sdiv_32, speed)
+                                     <= 2 * add_cost (speed, mode)));
+  set_smod_pow2_cheap (speed, mode, (set_src_cost (all->smod_32, speed)
+                                     <= 4 * add_cost (speed, mode)));
+
+  set_shift_cost (speed, mode, 0, 0);
+  {
+    int cost = add_cost (speed, mode);
+    set_shiftadd_cost (speed, mode, 0, cost);
+    set_shiftsub0_cost (speed, mode, 0, cost);
+    set_shiftsub1_cost (speed, mode, 0, cost);
+  }
 
   n = MIN (MAX_BITS_PER_WORD, mode_bitsize);
   for (m = 1; m < n; m++)
@@ -164,10 +168,10 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
       XEXP (all->shift, 1) = all->cint[m];
       XEXP (all->shift_mult, 1) = all->pow2[m];
 
-      shift_cost[speed][mode][m] = set_src_cost (all->shift, speed);
-      shiftadd_cost[speed][mode][m] = set_src_cost (all->shift_add, speed);
-      shiftsub0_cost[speed][mode][m] = set_src_cost (all->shift_sub0, speed);
-      shiftsub1_cost[speed][mode][m] = set_src_cost (all->shift_sub1, speed);
+      set_shift_cost (speed, mode, m, set_src_cost (all->shift, speed));
+      set_shiftadd_cost (speed, mode, m, set_src_cost (all->shift_add, speed));
+      set_shiftsub0_cost (speed, mode, m, set_src_cost (all->shift_sub0, speed));
+      set_shiftsub1_cost (speed, mode, m, set_src_cost (all->shift_sub1, speed));
     }
 
   if (SCALAR_INT_MODE_P (mode))
@@ -181,10 +185,8 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
          PUT_MODE (all->wide_lshr, wider_mode);
  XEXP 

Re: [Test] Fix for PRPR53981

2012-07-27 Thread Anna Tikhonova
Kirill has already checked it in for me, thanks!
Btw, I have FSF copyright assignment.

Is it possible to commit this patch to the 4.6 branch as well, since the
current NDK is based on 4.6?

2012/7/25 Janis Johnson janis_john...@mentor.com:
 On 07/25/2012 03:58 AM, Anna Tikhonova wrote:
 Thanks!

 I've removed declarations. New patch attached.

 You're not listed as write after approval in the MAINTAINERS
 file; would you like me to check this in for you?

 I didn't check to see if you have an FSF copyright assignment
 because this is a very small patch.

 Janis


[PATCH, testsuite]: Fix gfortran.dg/bind_c_array_params_2.f90 scan failure on alpha

2012-07-27 Thread Uros Bizjak
Hello!

Without -mno-explicit-relocs, alpha generates:

ldq $27,myBindC($29)!literal!6
jsr $26,($27),myBindC   !lituse_jsr!6

which confuses scan-assembler-times ... 1.

Also, added appropriate cleanup-tree-dump while there.

Tested on alphaev68-pc-linux-gnu, committed to mainline SVN.

Uros.
Index: bind_c_array_params_2.f90
===
--- bind_c_array_params_2.f90   (revision 189904)
+++ bind_c_array_params_2.f90   (working copy)
@@ -1,5 +1,6 @@
 ! { dg-do compile }
 ! { dg-options "-std=f2008ts -fdump-tree-original" }
+! { dg-additional-options "-mno-explicit-relocs" { target alpha*-*-* } }
 !
 ! Check that assumed-shape variables are correctly passed to BIND(C)
 ! as defined in TS 29913
@@ -14,6 +15,6 @@
 call test(aa)
 end
 
+! { dg-final { scan-assembler-times "myBindC" 1 } }
 ! { dg-final { scan-tree-dump-times "test \\\(&parm\\." 1 "original" } }
-! { dg-final { scan-assembler-times "myBindC" 1 } }
-
+! { dg-final { cleanup-tree-dump "original" } }


[PATCH][2/n] into-SSA TLC

2012-07-27 Thread Richard Guenther

When Diego committed rev 119760 he basically disabled the codepath
that switched virtual symbols to full rewrite.  Only basically, because
you could still trigger it with --param virtual-mappings-ratio=0.
As the code has not been exercised in the last 6(!) years, I
opted to remove it: the underlying issue, very many virtual operands,
vanished with the merge of alias-improvements.
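
For anyone who never saw the heuristic fire, here is a rough sketch of the
decision as the param documentation describes it.  It is an illustration
only, not the code being removed; the function and parameter names are
made up for the example.

```c
#include <stdbool.h>

/* Hypothetical sketch of the documented heuristic, not GCC source.
   MIN_MAPPINGS stands in for --param min-virtual-mappings and RATIO
   for --param virtual-mappings-ratio.  */
static bool
want_full_virtual_rewrite (unsigned num_mappings, unsigned num_symbols,
                           unsigned min_mappings, unsigned ratio)
{
  /* Too few mappings registered: the heuristic is not considered.  */
  if (num_mappings < min_mappings)
    return false;

  /* Switch to a full rewrite once the mappings outnumber the symbols
     by more than RATIO.  */
  return num_mappings > ratio * num_symbols;
}
```

Under this reading, a ratio of 0 leaves only the min-virtual-mappings
threshold in play.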

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-07-27  Richard Guenther  rguent...@suse.de

* doc/invoke.texi (min-virtual-mappings, virtual-mappings-ratio):
Remove param documentation.
* params.def (PARAM_MIN_VIRTUAL_MAPPINGS,
PARAM_VIRTUAL_MAPPINGS_TO_SYMS_RATIO): Remove.
* tree-flow.h (mark_set_for_renaming): Remove.
* tree-into-ssa.c (struct update_ssa_stats_d): Remove.
(add_new_name_mapping): Remove bookkeeping code.
(dump_update_ssa): Remove stats dumping code.
(init_update_ssa): Remove stats allocation code.
(delete_update_ssa): Remove stats freeing code.
(mark_set_for_renaming): Remove.
(switch_virtuals_to_full_rewrite_p): Likewise.
(switch_virtuals_to_full_rewrite): Likewise.
(update_ssa): Do not call switch_virtuals_to_full_rewrite.

Index: trunk/gcc/doc/invoke.texi
===
*** trunk.orig/gcc/doc/invoke.texi  2012-07-26 10:46:37.0 +0200
--- trunk/gcc/doc/invoke.texi   2012-07-27 12:52:05.774891497 +0200
*** Small integer constants can use a shared
*** 9156,9173 
  compiler's memory usage and increasing its speed.  This sets the maximum
  value of a shared integer constant.  The default value is 256.
  
- @item min-virtual-mappings
- Specifies the minimum number of virtual mappings in the incremental
- SSA updater that should be registered to trigger the virtual mappings
- heuristic defined by virtual-mappings-ratio.  The default value is
- 100.
- 
- @item virtual-mappings-ratio
- If the number of virtual mappings is virtual-mappings-ratio bigger
- than the number of virtual symbols to be updated, then the incremental
- SSA updater switches to a full update for those symbols.  The default
- ratio is 3.
- 
  @item ssp-buffer-size
  The minimum size of buffers (i.e.@: arrays) that receive stack smashing
  protection when @option{-fstack-protection} is used.
--- 9156,9161 
Index: trunk/gcc/params.def
===
*** trunk.orig/gcc/params.def   2012-06-06 11:40:57.0 +0200
--- trunk/gcc/params.def2012-07-27 12:51:33.096892630 +0200
*** DEFPARAM (PARAM_INTEGER_SHARE_LIMIT,
*** 644,673 
            "The upper bound for sharing integer constants",
  256, 2, 2)
  
- /* Incremental SSA updates for virtual operands may be very slow if
-there is a large number of mappings to process.  In those cases, it
-is faster to rewrite the virtual symbols from scratch as if they
-had been recently introduced.  This heuristic cannot be applied to
-SSA mappings for real SSA names, only symbols kept in FUD chains.
- 
-PARAM_MIN_VIRTUAL_MAPPINGS specifies the minimum number of virtual
-mappings that should be registered to trigger the heuristic.
- 
-PARAM_VIRTUAL_MAPPINGS_TO_SYMS_RATIO specifies the ratio between
-mappings and symbols.  If the number of virtual mappings is
-PARAM_VIRTUAL_MAPPINGS_TO_SYMS_RATIO bigger than the number of
-virtual symbols to be updated, then the updater switches to a full
-update for those symbols.  */
- DEFPARAM (PARAM_MIN_VIRTUAL_MAPPINGS,
-           "min-virtual-mappings",
-           "Minimum number of virtual mappings to consider switching to full virtual renames",
-           100, 0, 0)
- 
- DEFPARAM (PARAM_VIRTUAL_MAPPINGS_TO_SYMS_RATIO,
-           "virtual-mappings-ratio",
-           "Ratio between virtual mappings and virtual symbols to do full virtual renames",
-           3, 0, 0)
- 
  DEFPARAM (PARAM_SSP_BUFFER_SIZE,
            "ssp-buffer-size",
            "The lower bound for a buffer to be considered for stack smashing protection",
--- 644,649 
Index: trunk/gcc/tree-flow.h
===
*** trunk.orig/gcc/tree-flow.h  2012-07-26 15:47:35.0 +0200
--- trunk/gcc/tree-flow.h   2012-07-27 12:46:26.098903253 +0200
*** bool name_registered_for_update_p (tree)
*** 575,581 
  void release_ssa_name_after_update_ssa (tree);
  void compute_global_livein (bitmap, bitmap);
  void mark_sym_for_renaming (tree);
- void mark_set_for_renaming (bitmap);
  bool symbol_marked_for_renaming (tree);
  tree get_current_def (tree);
  void set_current_def (tree, tree);
--- 575,580 
Index: trunk/gcc/tree-into-ssa.c
===
*** trunk.orig/gcc/tree-into-ssa.c  2012-07-27 12:44:02.0 +0200
--- trunk/gcc/tree-into-ssa.c   2012-07-27 

Re: [PATCH] Fix comment in cgraphunit.c

2012-07-27 Thread Richard Guenther
On Fri, Jul 27, 2012 at 1:44 PM, Marek Polacek pola...@redhat.com wrote:
 On Wed, Jul 25, 2012 at 01:43:59PM +0200, Richard Guenther wrote:
 On Tue, Jul 24, 2012 at 9:27 PM, Marek Polacek pola...@redhat.com wrote:
  Ping.

 Ok.

 Sorry, can't commit myself.  Thanks,

Committed.

Richard.

 Marek


[PATCH][3/n] into-SSA TLC

2012-07-27 Thread Richard Guenther

This tries to more clearly separate per-SSA name held information
from per-DECL held information during update-ssa.  We already have
a global array of SSA name information so it is pointless to
have a hashtable mapping SSA names to yet another piece of information
(a bitmap).  This patch simply puts the bitmap into that SSA name
auxiliary vector.  Lifetime is managed by using a separate obstack
and the aux vector age.
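
To illustrate the age-based lifetime idea, here is a minimal standalone
sketch.  It is not the tree-into-ssa.c code: the struct, the fixed-size
array and the function names are hypothetical, and a plain pointer stands
in for the bitmap.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NAMES 16

/* Hypothetical per-name record; the real ssa_name_info has more fields.  */
struct name_info
{
  unsigned age;      /* Age at which this record was last initialized.  */
  void *repl_set;    /* Stand-in for the replacement bitmap.  */
};

static struct name_info info_for_name[MAX_NAMES];
static unsigned current_age = 1;

/* Return the record for VERSION, lazily resetting it if it is stale.  */
static struct name_info *
get_name_info (unsigned version)
{
  struct name_info *info;

  assert (version < MAX_NAMES);
  info = &info_for_name[version];
  if (info->age < current_age)
    {
      info->repl_set = NULL;
      info->age = current_age;
    }
  return info;
}

/* Invalidate every record at once by bumping the age.  */
static void
clear_name_info (void)
{
  current_age++;
  assert (current_age != 0);   /* The age must not wrap.  */
}
```

Bumping the age makes every record stale without walking the array, which
is why a wrapped age has to be caught by an assert.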

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2012-07-27  Richard Guenther  rguent...@suse.de

* tree-cfg.c (gimple_can_merge_blocks_p): Do more fine-grained
check whether SSA form is not up-to-date.
* tree-flow.h (name_mappings_registered_p): Remove.
* tree-into-ssa.c (struct repl_map_d): Remove.
(repl_tbl): Likewise.
(struct ssa_name_info): Add repl_set member.
(update_ssa_obstack): New static global.
(get_ssa_name_ann): Initialize repl_set.
(clear_ssa_name_info): Assert age did not wrap.
(repl_map_hash, repl_map_eq, repl_map_free): Remove.
(names_replaced_by): Adjust.
(add_to_repl_tbl): Likewise.
(dump_tree_ssa_stats): Likewise.
(init_update_ssa): Initialize update_ssa_obstack.
(delete_update_ssa): Free update_ssa_obstack.
(name_mappings_registered_p): Remove.
(update_ssa): Adjust.

Index: trunk/gcc/tree-cfg.c
===
*** trunk.orig/gcc/tree-cfg.c   2012-07-26 10:46:43.0 +0200
--- trunk/gcc/tree-cfg.c2012-07-27 13:40:30.603790938 +0200
*** gimple_can_merge_blocks_p (basic_block a
*** 1449,1455 
  {
gimple stmt;
gimple_stmt_iterator gsi;
-   gimple_seq phis;
  
if (!single_succ_p (a))
  return false;
--- 1449,1454 
*** gimple_can_merge_blocks_p (basic_block a
*** 1499,1508 
/* It must be possible to eliminate all phi nodes in B.  If ssa form
   is not up-to-date and a name-mapping is registered, we cannot eliminate
   any phis.  Symbols marked for renaming are never a problem though.  */
!   phis = phi_nodes (b);
!   if (!gimple_seq_empty_p (phis)
!       && name_mappings_registered_p ())
! return false;
  
/* When not optimizing, don't merge if we'd lose goto_locus.  */
if (!optimize
--- 1498,1510 
/* It must be possible to eliminate all phi nodes in B.  If ssa form
   is not up-to-date and a name-mapping is registered, we cannot eliminate
   any phis.  Symbols marked for renaming are never a problem though.  */
!   for (gsi = gsi_start_phis (b); !gsi_end_p (gsi); gsi_next (&gsi))
! {
!   gimple phi = gsi_stmt (gsi);
!   /* Technically only new names matter.  */
!   if (name_registered_for_update_p (PHI_RESULT (phi)))
!   return false;
! }
  
/* When not optimizing, don't merge if we'd lose goto_locus.  */
if (!optimize
Index: trunk/gcc/tree-flow.h
===
*** trunk.orig/gcc/tree-flow.h  2012-07-27 12:46:26.0 +0200
--- trunk/gcc/tree-flow.h   2012-07-27 13:40:41.441790559 +0200
*** void delete_update_ssa (void);
*** 570,576 
  void register_new_name_mapping (tree, tree);
  tree create_new_def_for (tree, gimple, def_operand_p);
  bool need_ssa_update_p (struct function *);
- bool name_mappings_registered_p (void);
  bool name_registered_for_update_p (tree);
  void release_ssa_name_after_update_ssa (tree);
  void compute_global_livein (bitmap, bitmap);
--- 570,575 
Index: trunk/gcc/tree-into-ssa.c
===
*** trunk.orig/gcc/tree-into-ssa.c  2012-07-27 12:52:09.0 +0200
--- trunk/gcc/tree-into-ssa.c   2012-07-27 14:00:30.533749404 +0200
*** static bitmap blocks_with_phis_to_rewrit
*** 128,145 
 strategy.  */
  #define NAME_SETS_GROWTH_FACTOR   (MAX (3, num_ssa_names / 3))
  
- /* Tuple used to represent replacement mappings.  */
- struct repl_map_d
- {
-   tree name;
-   bitmap set;
- };
- 
- 
- /* NEW -> OLD_SET replacement table.  If we are replacing several
-existing SSA names O_1, O_2, ..., O_j with a new name N_i,
-then REPL_TBL[N_i] = { O_1, O_2, ..., O_j }.  */
- static htab_t repl_tbl;
  
  /* The function the SSA updating data structures have been initialized for.
 NULL if they need to be initialized by register_new_name_mapping.  */
--- 128,133 
*** struct mark_def_sites_global_data
*** 157,174 
  /* Information stored for SSA names.  */
  struct ssa_name_info
  {
!   /* The current reaching definition replacing this SSA name.  */
!   tree current_def;
  
/* This field indicates whether or not the variable may need PHI nodes.
   See the enum's definition for more detailed information about the
   states.  */
ENUM_BITFIELD (need_phi_state) need_phi_state : 2;
  
!   /* Age of this record (so that 

Re: Add hot/cold attributes for labels

2012-07-27 Thread Paolo Bonzini
Il 27/07/2012 11:40, Steven Bosscher ha scritto:
 
  As in the case where you have both an unlikely and likely jump to a
  basic-block.  But what I understand is that rth adds a way to mark
  a basic-block as hot or cold, not a way to mark an edge as hot or cold
  (that would be what the asm goto annotation would do).  Both cases
  are of course useful.
 I don't see why it is useful to be able to mark a basic block as hot
 or cold. This is something that the compiler can figure out for itself
 if you provide the branch hints (__builtin_expect is also a kind of
 branch hint). Marking basic blocks as likely or unlikely seems just
 redundant and confusing to me. A basic block being hot or cold is an
 effect of its incoming edges being unlikely-taken, not an inherent
 property of the basic block itself.

You could say the same of functions.  Sometimes it's easier to mark the
edges and sometimes it's easier to mark the targets.
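
A small sketch of the two styles with the existing function-level
attribute and __builtin_expect; the function names are invented for the
example, and it says nothing about how the proposed label attribute is
implemented.

```c
/* Illustrative only; the names are hypothetical.  */
static int failures;

/* Marking the target: every call to this function is assumed unlikely.  */
static void __attribute__ ((cold))
note_failure (void)
{
  failures++;
}

/* Marking the edge: only this particular branch is hinted.  */
static int
checked_double (int x)
{
  if (__builtin_expect (x < 0, 0))
    {
      note_failure ();
      return -1;
    }
  return 2 * x;
}
```

Here both hints happen to describe the same path; with many callers of
note_failure the attribute saves annotating each branch separately.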

Paolo



Re: [rs6000 3/3] Remove MQ

2012-07-27 Thread Segher Boessenkool

This is okay, but gpc_reg_operand in predicates.md needs something
better than a magic 64 constant in the test.  Should this be a
comment or a new symbolic value related to the max/last FPR or number of
general registers?


The test assumes a lot about the relative ordering of the
various regnos (like most of the predicates).  In my opinion
a literal here is better than testing with some in fact
unrelated macro.  All these predicates could use a cleanup,
bring them into the 21st century.  I didn't really feel like
doing that though.

I'll add a comment.


I changed the test to INT_REGNO_P || FP_REGNO_P; this isn't
totally obvious since INT_REGNO_P includes ap and fp, but those
are covered by the other arms of the conditional already: in
fact, it probably would be better to rewrite the whole thing
to simply disallow LR, CTR, CA (and MQ ;-) ); this would
express the purpose much better.  But that's not for this
patch.  Regtested again, all three patches committed.

So what's next?  Remove common mode, remove old-mnemonics?


Segher



[PATCH]: Use GTY atomic option for arrays, new atomic vector type

2012-07-27 Thread Laurynas Biveinis
This is a slightly expanded version of the patch in [1]. The main difference is 
that I expanded gengtype diagnostics to error on GTY length option applied to 
strings too, in addition to other arrays of atomic types. This in turn 
uncovered another place where a correct GTY option should be used.

Java and libcpp parts are already approved.

Tested on x86_64 linux, no regressions. OK for trunk?

[1] http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01204.html

gcc:
2012-07-27  Laurynas Biveinis  laurynas.bivei...@gmail.com
Steven Bosscher  ste...@gcc.gnu.org

* gengtype.c (adjust_field_type): Diagnose duplicate length
option applications and option being applied to arrays of atomic
types.
(walk_type): Allow atomic option on strings too.
* dwarf2out.h (struct dw_vec_struct): Use the atomic GTY option
for the array field.
* vec.h: Describe the atomic object A type of the macros in
the header comment.
(VEC_T_GTY_ATOMIC, DEF_VEC_A, DEF_VEC_ALLOC_A): Define.
* emit-rtl.c (locations_locators_vals): use the atomic object
vector.
* doc/gty.texi: Clarify that GTY option length is only for
arrays of non-atomic objects.  Fix typo in the description of the
atomic option.

gcc/java:
2012-07-24  Laurynas Biveinis  laurynas.bivei...@gmail.com

* jcf.h (CPool): Use the atomic GTY option for the tags field.
(bootstrap_method): Likewise for the bootstrap_arguments field.

libcpp:
2012-07-24  Laurynas Biveinis  laurynas.bivei...@gmail.com

* include/line-map.h (line_map_macro): Use the atomic GTY option
for the macro_locations field.


-- 
Laurynas

Index: gcc/gengtype.c
===
--- gcc/gengtype.c	(revision 189893)
+++ gcc/gengtype.c	(working copy)
@@ -1256,7 +1256,17 @@
 
   for (; opt; opt = opt->next)
     if (strcmp (opt->name, "length") == 0)
-      length_p = 1;
+      {
+        if (length_p)
+          error_at_line (&lexer_line, "duplicate `%s' option", opt->name);
+        if (t->u.p->kind == TYPE_SCALAR || t->u.p->kind == TYPE_STRING)
+          {
+            error_at_line (&lexer_line,
+                           "option `%s' may not be applied to "
+                           "arrays of atomic types", opt->name);
+          }
+        length_p = 1;
+      }
     else if ((strcmp (opt->name, "param_is") == 0
               || (strncmp (opt->name, "param", 5) == 0
                   && ISDIGIT (opt->name[5])
@@ -2495,7 +2505,7 @@
       return;
     }
 
-  if (atomic_p && (t->kind != TYPE_POINTER))
+  if (atomic_p && (t->kind != TYPE_POINTER) && (t->kind != TYPE_STRING))
     {
       error_at_line (d->line, "field `%s' has invalid option `atomic'\n", d->val);
   return;
Index: gcc/dwarf2out.h
===
--- gcc/dwarf2out.h	(revision 189893)
+++ gcc/dwarf2out.h	(working copy)
@@ -160,7 +160,7 @@
 /* Describe a floating point constant value, or a vector constant value.  */
 
 typedef struct GTY(()) dw_vec_struct {
-  unsigned char * GTY((length ("%h.length"))) array;
+  unsigned char * GTY((atomic)) array;
   unsigned length;
   unsigned elt_size;
 }
Index: gcc/vec.h
===
--- gcc/vec.h	(revision 189893)
+++ gcc/vec.h	(working copy)
@@ -95,24 +95,25 @@
the 'space' predicate will tell you whether there is spare capacity
in the vector.  You will not normally need to use these two functions.
 
-   Vector types are defined using a DEF_VEC_{O,P,I}(TYPEDEF) macro, to
+   Vector types are defined using a DEF_VEC_{O,A,P,I}(TYPEDEF) macro, to
get the non-memory allocation version, and then a
-   DEF_VEC_ALLOC_{O,P,I}(TYPEDEF,ALLOC) macro to get memory managed
+   DEF_VEC_ALLOC_{O,A,P,I}(TYPEDEF,ALLOC) macro to get memory managed
vectors.  Variables of vector type are declared using a
VEC(TYPEDEF,ALLOC) macro.  The ALLOC argument specifies the

[PATCH][4/n] into-SSA TLC

2012-07-27 Thread Richard Guenther

This avoids triggering update-ssa right after into-ssa just because
we didn't rename virtual operands yet.  Simply do that on-the-fly,
update_stmt will have added bare symbols as operands already.
Surprisingly simple ... no idea why I chose the simple route
when merging alias-improvements (originally the first 'alias' pass
enabled virtual operands).

Btw, we still have no virtual operands at -O0, it would now become
a tiny bit cheaper to add them (just to remove some !optimize checks).

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2012-07-27  Richard Guenther  rguent...@suse.de

* tree-into-ssa.c (mark_def_sites): Also process virtual operands.
(rewrite_stmt): Likewise.
(rewrite_enter_block): Likewise.
(pass_build_ssa): Do not update virtual SSA form during TODO.
(mark_symbol_for_renaming): Do nothing if we are not in SSA form.

Index: trunk/gcc/tree-into-ssa.c
===
*** trunk.orig/gcc/tree-into-ssa.c  2012-07-27 14:49:48.0 +0200
--- trunk/gcc/tree-into-ssa.c   2012-07-27 15:12:29.091599852 +0200
*** mark_def_sites (basic_block bb, gimple s
*** 675,681 
  
/* If a variable is used before being set, then the variable is live
   across a block boundary, so mark it live-on-entry to BB.  */
!   FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
  {
tree sym = USE_FROM_PTR (use_p);
gcc_assert (DECL_P (sym));
--- 675,681 
  
/* If a variable is used before being set, then the variable is live
   across a block boundary, so mark it live-on-entry to BB.  */
!   FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_ALL_USES)
  {
tree sym = USE_FROM_PTR (use_p);
gcc_assert (DECL_P (sym));
*** mark_def_sites (basic_block bb, gimple s
*** 686,692 
  
/* Now process the defs.  Mark BB as the definition block and add
   each def to the set of killed symbols.  */
!   FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
  {
gcc_assert (DECL_P (def));
set_def_block (def, bb, false);
--- 686,692 
  
/* Now process the defs.  Mark BB as the definition block and add
   each def to the set of killed symbols.  */
!   FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS)
  {
gcc_assert (DECL_P (def));
set_def_block (def, bb, false);
*** rewrite_stmt (gimple_stmt_iterator si)
*** 1336,1342 
if (is_gimple_debug (stmt))
rewrite_debug_stmt_uses (stmt);
else
!   FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
  {
tree var = USE_FROM_PTR (use_p);
gcc_assert (DECL_P (var));
--- 1336,1342 
if (is_gimple_debug (stmt))
rewrite_debug_stmt_uses (stmt);
else
!   FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_ALL_USES)
  {
tree var = USE_FROM_PTR (use_p);
gcc_assert (DECL_P (var));
*** rewrite_stmt (gimple_stmt_iterator si)
*** 1346,1352 
  
/* Step 2.  Register the statement's DEF operands.  */
if (register_defs_p (stmt))
! FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
{
tree var = DEF_FROM_PTR (def_p);
tree name = make_ssa_name (var, stmt);
--- 1346,1352 
  
/* Step 2.  Register the statement's DEF operands.  */
if (register_defs_p (stmt))
! FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_ALL_DEFS)
{
tree var = DEF_FROM_PTR (def_p);
tree name = make_ssa_name (var, stmt);
*** static void
*** 1404,1410 
  rewrite_enter_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED,
 basic_block bb)
  {
-   gimple phi;
gimple_stmt_iterator gsi;
  
    if (dump_file && (dump_flags & TDF_DETAILS))
--- 1404,1409 
*** rewrite_enter_block (struct dom_walk_dat
*** 1418,1428 
   node introduces a new version for the associated variable.  */
    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
  {
!   tree result;
! 
!   phi = gsi_stmt (gsi);
!   result = gimple_phi_result (phi);
!   gcc_assert (is_gimple_reg (result));
register_new_def (result, SSA_NAME_VAR (result));
  }
  
--- 1417,1423 
   node introduces a new version for the associated variable.  */
    for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
  {
!   tree result = gimple_phi_result (gsi_stmt (gsi));
register_new_def (result, SSA_NAME_VAR (result));
  }
  
*** struct gimple_opt_pass pass_build_ssa =
*** 2437,2444 
PROP_ssa,   /* properties_provided */
0,  /* properties_destroyed */
0,  /* todo_flags_start */
!   TODO_update_ssa_only_virtuals
! | TODO_verify_ssa
  

Re: [PATCH]: Use GTY atomic option for arrays, new atomic vector type

2012-07-27 Thread Richard Guenther
On Fri, Jul 27, 2012 at 2:59 PM, Laurynas Biveinis
laurynas.bivei...@gmail.com wrote:
 This is a slightly expanded version of the patch in [1]. The main difference 
 is that I expanded gengtype diagnostics to error on GTY length option applied 
 to strings too, in addition to other arrays of atomic types. This in turn 
 uncovered another place where a correct GTY option should be used.

 Java and libcpp parts are already approved.

 Tested on x86_64 linux, no regressions. OK for trunk?

Ok.

Thanks,
Richard.

 [1] http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01204.html

 gcc:
 2012-07-27  Laurynas Biveinis  laurynas.bivei...@gmail.com
 Steven Bosscher  ste...@gcc.gnu.org

 * gengtype.c (adjust_field_type): Diagnose duplicate length
 option applications and option being applied to arrays of atomic
 types.
 (walk_type): Allow atomic option on strings too.
 * dwarf2out.h (struct dw_vec_struct): Use the atomic GTY option
 for the array field.
 * vec.h: Describe the atomic object A type of the macros in
 the header comment.
 (VEC_T_GTY_ATOMIC, DEF_VEC_A, DEF_VEC_ALLOC_A): Define.
 * emit-rtl.c (locations_locators_vals): use the atomic object
 vector.
 * doc/gty.texi: Clarify that GTY option length is only for
 arrays of non-atomic objects.  Fix typo in the description of the
 atomic option.

 gcc/java:
 2012-07-24  Laurynas Biveinis  laurynas.bivei...@gmail.com

 * jcf.h (CPool): Use the atomic GTY option for the tags field.
 (bootstrap_method): Likewise for the bootstrap_arguments field.

 libcpp:
 2012-07-24  Laurynas Biveinis  laurynas.bivei...@gmail.com

 * include/line-map.h (line_map_macro): Use the atomic GTY option
 for the macro_locations field.


 --
 Laurynas


[RS6000] Fix PR54093, ICE due to 07-24 changes

2012-07-27 Thread Alan Modra
This fixes a thinko and typo in
http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01168.html that shows up
as an ICE on e500.  Two issues really.  One is that the secondary
reload insn emitted by rs6000_secondary_reload_gpr had better be
valid.  The other is that we only need these reloads when the
insn predicate says that the address is good.  Fixing the second
problem avoids the first.
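
As a worked check of the 32-bit window, here is a standalone sketch of the
unsigned-compare trick.  It assumes a 4-byte word and uses a made-up
function name; it only reproduces the arithmetic, not the reload logic.

```c
#include <stdbool.h>

/* Assumption for this sketch: a 32-bit target, so a word is 4 bytes.  */
#define UNITS_PER_WORD 4u

/* True when OFF lies in [0x8000 - extra, 0x8000 - extra + UNITS_PER_WORD),
   where extra is the mode size minus one word.  */
static bool
offset_in_reload_window (long off, unsigned mode_size)
{
  unsigned extra = mode_size - UNITS_PER_WORD;

  /* A single unsigned comparison checks both bounds: any offset below
     the lower bound wraps around to a huge value and fails the test.  */
  return (unsigned long) off - (0x8000u - extra) < UNITS_PER_WORD;
}
```

For mode sizes 8 and 16 this gives [0x7ffc,0x7fff] and [0x7ff4,0x7ff7],
the ranges quoted in the patch comment.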

Bootstrapped and regression tested powerpc-linux, and fixes the
testcases in the PR on e500.  OK to apply?

* config/rs6000/rs6000.c (rs6000_secondary_reload): Limit 32-bit
multi-gpr reload to cases where predicate passes.  Do the same for
64-bit multi-gpr reload.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 189801)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -13643,8 +13643,11 @@ rs6000_secondary_reload (bool in_p,
           && GET_MODE_SIZE (GET_MODE (x)) >= UNITS_PER_WORD)
     {
       rtx off = address_offset (XEXP (x, 0));
+      unsigned int extra = GET_MODE_SIZE (GET_MODE (x)) - UNITS_PER_WORD;
 
-      if (off != NULL_RTX && (INTVAL (off) & 3) != 0)
+      if (off != NULL_RTX
+          && (INTVAL (off) & 3) != 0
+          && (unsigned HOST_WIDE_INT) INTVAL (off) + 0x8000 < 0x10000 - extra)
         {
           if (in_p)
             sri->icode = CODE_FOR_reload_di_load;
@@ -13662,10 +13665,17 @@ rs6000_secondary_reload (bool in_p,
           && GET_MODE_SIZE (GET_MODE (x)) > UNITS_PER_WORD)
     {
       rtx off = address_offset (XEXP (x, 0));
+      unsigned int extra = GET_MODE_SIZE (GET_MODE (x)) - UNITS_PER_WORD;
 
+      /* We need a secondary reload only when our legitimate_address_p
+         says the address is good (as otherwise the entire address
+         will be reloaded).  So for mode sizes of 8 and 16 this will
+         be when the offset is in the ranges [0x7ffc,0x7fff] and
+         [0x7ff4,0x7ff7] respectively.  Note that the address we see
+         here may have been manipulated by legitimize_reload_address.  */
       if (off != NULL_RTX
-          && ((unsigned HOST_WIDE_INT) INTVAL (off) + 0x8000
-              >= 0x1000u - (GET_MODE_SIZE (GET_MODE (x)) - UNITS_PER_WORD)))
+          && ((unsigned HOST_WIDE_INT) INTVAL (off) - (0x8000 - extra)
+              < UNITS_PER_WORD))
         {
           if (in_p)
             sri->icode = CODE_FOR_reload_si_load;

-- 
Alan Modra
Australia Development Lab, IBM


Re: Diagnostics from GCC_DRIVER_HOST_INITIALIZATION

2012-07-27 Thread Ryan Mansfield

On 12-07-27 03:41 AM, Dodji Seketeli wrote:

2012-07-20  Ryan Mansfield  rmansfi...@qnx.com

 * gcc.c (main): Move GCC_DRIVER_HOST_INITIALIZATION after
 diagnostic_initialize.

Could someone please apply the change?


The change seems small and obvious enough to not require copyright
assignment on file, but, just to be sure, Ryan, do you have copyright
assignment to the FSF on file (sorry if my question is stupid)?


Yes, I do.

Regards,

Ryan Mansfield



Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset

2012-07-27 Thread Richard Guenther
On Thu, Jul 26, 2012 at 11:57 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Thu, Jul 26, 2012 at 11:23 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 Ok!  Thanks for adding this exhaustive documentation.

 There's more to come! I want to add some explanations to ebitmap,
 pointer-set, fibheap, and splay-tree as sets, and add a chapter in the
 gccint manual too.

 Now if only you'd document those loop changes... ;-)

Eh ...


 Btw, ebitmap is unused since it was added - maybe we should simply remove
 it ...?

 I wouldn't remove it just yet. I'm going to make sure that bitmap.[ch]
 and ebitmap.[ch] provide the same interface and see if there are
 places where ebitmap is a better choice than bitmap or sbitmap (cprop
 and gcse.c come to mind).

Btw, just looking over sparseset.h what needs to be documented is that
iterating over the set is faster than for an sbitmap but element ordering
is random!  Also it looks less efficient than sbitmap in the case when
your main operation is adding to the set and querying the set randomly.
Its space overhead is really huge - for smaller universes a smaller
SPARSESET_ELT_TYPE would be nice, templates to the rescue!  I
wonder in which cases an unsigned HOST_WIDEST_FAST_INT sized
universe is even useful (but a short instead of an int is probably too
small ...)
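
For reference, a minimal sparse set in the classic sparse/dense style,
to make the ordering and space points concrete.  This is a sketch written
for this mail, not sparseset.h: the names are hypothetical, and it zeroes
its arrays with calloc to keep the example well defined.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Minimal sparse set over the universe [0, size); not GCC's sparseset.h.  */
struct sset
{
  unsigned *sparse;   /* sparse[e] is the position of E in DENSE.  */
  unsigned *dense;    /* The members, in insertion order.  */
  unsigned members;
  unsigned size;
};

static struct sset *
sset_alloc (unsigned size)
{
  struct sset *s = malloc (sizeof *s);
  if (!s)
    return NULL;
  s->sparse = calloc (size, sizeof *s->sparse);
  s->dense = calloc (size, sizeof *s->dense);
  if (!s->sparse || !s->dense)
    {
      free (s->sparse);
      free (s->dense);
      free (s);
      return NULL;
    }
  s->members = 0;
  s->size = size;
  return s;
}

static bool
sset_test (const struct sset *s, unsigned e)
{
  unsigned idx;

  if (e >= s->size)
    return false;
  idx = s->sparse[e];
  return idx < s->members && s->dense[idx] == e;
}

static void
sset_add (struct sset *s, unsigned e)
{
  if (e >= s->size || sset_test (s, e))
    return;
  s->sparse[e] = s->members;
  s->dense[s->members++] = e;
}

/* Clearing is O(1): stale entries fail the cross-check in sset_test.  */
static void
sset_clear (struct sset *s)
{
  s->members = 0;
}

static void
sset_free (struct sset *s)
{
  free (s->sparse);
  free (s->dense);
  free (s);
}
```

Iteration walks dense[0..members), so it touches only the members but
sees them in insertion order rather than sorted, and the two
universe-sized arrays are where the space goes compared with one bit per
element.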

Richard.

 Ciao!
 Steven


Re: Add hot/cold attributes for labels

2012-07-27 Thread Richard Henderson
On 07/27/2012 01:57 AM, Richard Guenther wrote:
 On Thu, Jul 26, 2012 at 11:58 PM, Richard Henderson r...@redhat.com wrote:
 On 07/26/2012 02:41 PM, Richard Henderson wrote:
 This is a patch...

 ... that I should have attached.  Bah.
 
 Do we need to mark the labels so we preserve them?  Consider
 
  goto foo;
 
 foo:
 bar __attribute__((cold)):
   ...
 
 so bar will be unused?

We don't purge unused labels until rtl (at which point it becomes a deleted 
label note), and we only really need the label to survive until after the 
profile_estimate pass.  After which all the significance of the label has been 
transferred into the edge frequency.

The test case

void g(void);
void h(void);
void f(int x, int y)
{
  if (x) goto A;
  return;

 A:
 B: __attribute__((cold))
  g();
  return;
}

does in fact DTRT with the estimates.

 What about BB merging if we end up with
 
   BB 3:
   ..
   fallthru
   bar __attribute__((cold)):
  ...
 
 should BB 3 inherit the coldness?  I think we no longer disable
 BB merging if the destination has user labels.

The edge might be marked cold, but that should have no other effect.  No more 
than

  if (__builtin_expect (test, 0))

when it turns out that we can prove that test is false.


r~


Re: Add hot/cold attributes for labels

2012-07-27 Thread Richard Henderson
On 07/27/2012 02:08 AM, Steven Bosscher wrote:
 Right. I don't like the use of this attribute on labels at all, for
 the reasons you list here. I think it would be much cleaner to add a
 branch hint on the label in the asm goto, to contain this extension
 and to also to make it clear that it's not the label that is cold but
 the jump that is unlikely to be executed (i.e. cause and effect: the
 jump is unlikely and therefore the basic block is cold).

The label attribute is also usable for computed goto.  Are you going
to change the syntax for that one as well?


r~


Re: [PATCH] New fdo summary-based icache sensitive unrolling (issue6351086)

2012-07-27 Thread Teresa Johnson
On Fri, Jul 27, 2012 at 12:31 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Fri, Jul 27, 2012 at 6:47 AM, Teresa Johnson tejohn...@google.com wrote:
 * gcc/gcov-io.h (GCOV_TAG_SUMMARY_LENGTH): Update for new summary 
 info.
 (struct gcov_ctr_summary): Add new summary info: num_hot_counters and
 hot_cutoff_value.

 You should also update the description for the data file at the head
 of gcov-io.h:

The data file contains the following records.
 data: {unit summary:object summary:program* function-data*}*
 unit: header int32:checksum
 function-data:  announce_function present counts
 announce_function: header int32:ident
 int32:lineno_checksum int32:cfg_checksum
 present: header int32:present
 counts: header int64:count*
 summary: int32:checksum {count-summary}GCOV_COUNTERS_SUMMABLE
 count-summary:  int32:num int32:runs int64:sum
 int64:max int64:sum_max

 You've added two fields in count-summary IIUC.

Thanks, I will update this in my next version of the patch.

Teresa


 Ciao!
 Steven



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [RS6000] Fix PR54093, ICE due to 07-24 changes

2012-07-27 Thread David Edelsohn
On Fri, Jul 27, 2012 at 9:26 AM, Alan Modra amo...@gmail.com wrote:
 This fixes a thinko and typo in
 http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01168.html that shows up
 as an ICE on e500.  Two issues really.  One is that the secondary
 reload insn emitted by rs6000_secondary_reload_gpr had better be
 valid.  The other is that we only need these reloads when the
 insn predicate says that the address is good.  Fixing the second
 problem avoids the first.

 Bootstrapped and regression tested powerpc-linux, and fixes the
 testcases in the PR on e500.  OK to apply?

 * config/rs6000/rs6000.c (rs6000_secondary_reload): Limit 32-bit
 multi-gpr reload to cases where predicate passes.  Do the same for
 64-bit multi-gpr reload.

Okay.

Thanks, David


Re: [PATCH, testsuite]: Fix gfortran.dg/bind_c_array_params_2.f90 scan failure on alpha

2012-07-27 Thread Paul Richard Thomas
Dear Uros,

It looks good and clear to me!

Thanks for the patch and, in particular, for adding the cleanup line.

Paul

On 27 July 2012 13:53, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Without -mno-explicit-relocs, alpha generates:

 ldq $27,myBindC($29)!literal!6
 jsr $26,($27),myBindC   !lituse_jsr!6

 which confuses scan-assembler-times ... 1.

 Also, added appropriate cleanup-tree-dump while there.

 Tested on alphaev68-pc-linux-gnu, committed to mainline SVN.

 Uros.



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


Re: [rs6000 3/3] Remove MQ

2012-07-27 Thread David Edelsohn
On Fri, Jul 27, 2012 at 8:58 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:

 I changed the test to INT_REGNO_P || FP_REGNO_P; this isn't
 totally obvious since INT_REGNO_P includes ap and fp, but those
 are covered by the other arms of the conditional already: in
 fact, it probably would be better to rewrite the whole thing
 to simply disallow LR, CTR, CA (and MQ ;-) ); this would
 express the purpose much better.  But that's not for this
 patch.  Regtested again, all three patches committed.

Please undo that change.  You made the test for that heavily used
function even more expensive.

Thanks, David


Re: Fwd: [PATCH] New fdo summary-based icache sensitive unrolling (issue6351086)

2012-07-27 Thread Teresa Johnson
On Fri, Jul 27, 2012 at 3:14 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 Index: libgcc/libgcov.c
 ===
 --- libgcc/libgcov.c(revision 189893)
 +++ libgcc/libgcov.c(working copy)
 @@ -276,6 +276,120 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
return 1;
  }

 +/* Used by qsort to sort gcov values in descending order.  */
 +
 +static int
 +sort_by_reverse_gcov_value (const void *pa, const void *pb)
 +{
 +  const gcov_type a = *(gcov_type const *)pa;
 +  const gcov_type b = *(gcov_type const *)pb;
 +
 +  if (b > a)
 +return 1;
 +  else if (b == a)
 +return 0;
 +  else
 +return -1;
 +}
 +
 +/* Determines the number of counters required to cover a given percentage
 +   of the total sum of execution counts in the summary, which is then also
 +   recorded in SUM.  */
 +
 +static void
 +gcov_compute_cutoff_values (struct gcov_summary *sum)

 This looks like good idea to me to drive the hot/cold partitioning even if it
 is not quite accurate (you have no idea how many instructions given counter is
 guarding).

Thanks - right, I think it will be a good approximation.


 To reduce overhead on embedded systems, what about just doing a histogram
 with, say, 128 steps instead of dragging in qsort?  This also avoids the
 need to produce a copy of all counters.

I like that suggestion. I'll need 1024 buckets though to get the 99.9% cutoff
I am using right now (or use 128 buckets plus 1 to track what would be
bucket 1024 which is roughly 99.9%). I can use something like a binary
search to locate the right bucket without a linear search or divide. And
I can keep track of the min value for each bucket in order to identify the
correct value for hot_cutoff_value.
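
A rough sketch of the histogram idea, not the code that will go into
libgcov.  The bucketing scheme (one bucket per power of two, 64 buckets)
is an assumption picked to keep the example short; the names bucket,
hot_summary, bucket_index and compute_hot_summary are made up for
illustration.  Each bucket records its minimum value, so the cutoff value
can be read off once enough of the total has been covered.

```c
#include <stddef.h>
#include <stdint.h>

#define N_BUCKETS 64   /* one bucket per bit position (an assumption) */

struct bucket
{
  uint64_t num;   /* number of counters that fell into this bucket */
  uint64_t sum;   /* sum of their values */
  uint64_t min;   /* smallest value seen in this bucket */
};

struct hot_summary
{
  uint64_t num_hot_counters;
  uint64_t hot_cutoff_value;
};

/* Index of the highest set bit of V (V != 0), located by binary search
   rather than by a linear scan or a divide.  */
static unsigned
bucket_index (uint64_t v)
{
  unsigned ix = 0;
  unsigned shift;

  for (shift = 32; shift > 0; shift >>= 1)
    if (v >> shift)
      {
        v >>= shift;
        ix += shift;
      }
  return ix;
}

/* Walk the buckets from hottest to coldest until PER_MILLE / 1000 of the
   total count is covered (PER_MILLE <= 1000).  The result is exact only
   at bucket granularity.  Assumes the total fits in 64 bits.  */
struct hot_summary
compute_hot_summary (const uint64_t *counters, size_t n, unsigned per_mille)
{
  struct bucket hist[N_BUCKETS] = { { 0, 0, 0 } };
  struct hot_summary res = { 0, 0 };
  uint64_t total = 0, cum = 0, target;
  size_t i;
  int ix;

  for (i = 0; i < n; i++)
    {
      uint64_t v = counters[i];
      struct bucket *b;

      if (v == 0)
        continue;
      b = &hist[bucket_index (v)];
      if (b->num == 0 || v < b->min)
        b->min = v;
      b->num++;
      b->sum += v;
      total += v;
    }

  /* Divide first so the product cannot overflow, then add the share of
     the remainder so small totals are not rounded down to zero.  */
  target = total / 1000 * per_mille + total % 1000 * per_mille / 1000;

  for (ix = N_BUCKETS - 1; ix >= 0 && cum < target; ix--)
    {
      if (hist[ix].num == 0)
        continue;
      cum += hist[ix].sum;
      res.num_hot_counters += hist[ix].num;
      res.hot_cutoff_value = hist[ix].min;
    }
  return res;
}
```

With counters {1000, 900, 50, 30, 20} and a 90% cutoff, the two largest
counters already cover 1900 of 2000, so the sketch reports 2 hot counters
with a cutoff value of 900.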

 +
 +  /* Determine the cumulative counter value at the specified cutoff
 + percentage and record the percentage for use by gcov consumers.
 + Check for overflow when sum_all is multiplied by the cutoff_perc,
 + and if so, do the divide first.  */
 +  if ((cs_ptr->sum_all * cutoff_perc) / cutoff_perc != cs_ptr->sum_all)
 +/* Overflow, do the divide first.  */
 +cum_cutoff = cs_ptr->sum_all / 1000 * cutoff_perc;
 +  else
 +/* Otherwise multiply first to get the correct value for small
 +   values of sum_all.  */
 +cum_cutoff = (cs_ptr->sum_all * cutoff_perc) / 1000;

 To further keep embedded systems (at least a bit) happier, I guess one could
 do this without generic 64-bit divide operations.  I guess 1000 can be
 bumped up to 1024; a small error is harmless here.

 Actually it may be easier to simply embed the histogram into the gcov summary
 so one can control the cutoff with --param in the compiler at --profile-use
 time.  It seems reasonable to me to trade 128 values per file for the extra
 flexibility.
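
A small sketch of the "1024 instead of 1000" idea, assuming the cutoff is
expressed in units of 1/1024 (so 1023 is roughly 99.9%).  The function
name scale_per_1024 is hypothetical.  Splitting the sum into a multiple of
1024 and a remainder avoids both the 64-bit divide and the overflow check.

```c
#include <stdint.h>

/* Return floor (SUM * PER_1024 / 1024) for PER_1024 <= 1024, using only
   shifts, masks and multiplications that cannot overflow.  */
uint64_t
scale_per_1024 (uint64_t sum, unsigned per_1024)
{
  /* Contribution of the whole multiples of 1024 in SUM.  */
  uint64_t high = (sum >> 10) * per_1024;
  /* Contribution of the remainder; at most 1023 * 1024 before the shift.  */
  uint64_t low = ((sum & 1023) * per_1024) >> 10;

  return high + low;
}
```

Because the remainder is handled separately, small sums still get the
correct value, which is what the multiply-first branch in the patch is
there for.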

Both you and David have requested more values, so I will go ahead
and implement this suggestion in this patch. I can use the 128 + 1
bucket approach I described above to get the data for roughly every
1% plus the 99.9%. That should be enough granularity for the
optimizations (and the smallest bucket doesn't really need to be
fed back as it can be largely extrapolated from the others). This
will require feeding back 128 arrays of 2 values (num_hot_counters
and hot_cutoff_value).


 +  for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
 +{
 +  if (!gi_ptr->merge[t_ix])
 +continue;
 +
 +  /* Find the appropriate index into the gcov_ctr_info array
 + for the counter we are currently working on based on the
 + existence of the merge function pointer for this object.  */
 +  for (i = 0, ctr_info_ix = 0; i < t_ix; i++)
 +{
 +  if (gi_ptr->merge[i])
 +ctr_info_ix++;
 +}
 +  for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
 +{
 +  gfi_ptr = gi_ptr->functions[f_ix];
 +
 +  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
 +continue;
 +
 +  ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
 +  /* Sanity check that there are enough entries in value_array
 +for this function's counters. Gracefully handle the case when
 +there are not, in case something in the profile info is
 +corrupted.  */
 +  c_num = ci_ptr->num;
 +  if (index + c_num > cs_ptr->num)
 +c_num = cs_ptr->num - index;
 +  /* Copy over this function's counter values.  */
 +  memcpy (&value_array[index], ci_ptr->values,
 +  sizeof (gcov_type) * c_num);
 +  index += c_num;

 I wonder if the loop walking all counters can't be fused into one of the other
 loops we already have.

Not with the histogram approach as the preceding walk (in the caller, gcov_exit)
will be needed to find the min and max counter values first.

 +}
 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 189893)
 +++ gcc/doc/invoke.texi (working 

[PATCH v2] Target-specific limits on vector alignment

2012-07-27 Thread Ulrich Weigand
Richard Guenther wrote:
 On Mon, Jun 11, 2012 at 5:25 PM, Richard Earnshaw rearn...@arm.com wrote:
  On 11/06/12 15:53, Richard Guenther wrote:
  The type argument or the size argument looks redundant.
 
  Technically, yes, we could get rid of tree_low_cst (TYPE_SIZE (type)
  and calculate it inside the alignment function if it was needed.
  However, it seemed likely that most targets would need that number one
  way or another, such that passing it would be helpful.
 
 Well, you don't need it in stor-layout and targets might think the value
 may be completely unrelated to the type ...
 
  Note that we still can have such vector properly aligned, thus the
  vectorizer would need to use build_aligned_type also if it knows the
  type is aligned, not only when thinks it is misaligned.  You basically
  change the alignment of the default vector type.
 
  I'm not sure I follow...
 
 I say that a large vector may be still aligned, so the vectorizer when
 creating vector memory references has to use a non-default aligned vector
 type when the vector is aligned.  It won't do that at the moment.

Richard (Earnshaw) has asked me to take over working on this patch now.

I've now made the change requested above and removed the size argument.
The target is now simply asked to return the required alignment for the
given vector type.  I've also added a check for the case where the
target provides both an alignment and a mode for a vector type, but
the mode actually requires bigger alignment than the type.  This is
simply rejected (the target can fix this by reporting a different
type alignment or changing the mode alignment).

I've not made any attempts to have the vectorizer register larger
alignments than the one returned by the target hook.  It's not
clear to me when this would be useful (at least on ARM) ...

I've also run the testsuite, and this actually uncovered two bugs in
the vectorizer where it made an implicit assumption that vector types
must always be naturally aligned:

- In vect_update_misalignment_for_peel, the code used the vector size
  instead of the required alignment in order to bound misalignment
  values -- leading to a misalignment value bigger than the underlying
  alignment requirement of the vector type, causing an ICE later on

- In vect_do_peeling_for_loop_bound, the code divided the vector type
  alignment by the number of elements in order to arrive at the element
  size ... this returns a wrong value if the alignment is less than the
  vector size, causing incorrect code to be generated

  (This routine also had some confusion between size and alignment in
  comments and variable names, which I've fixed as well.)
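
To make the two problems concrete, here is a toy model with made-up
numbers; it is not the vectorizer code, which works on trees.  The struct
vec_type and the three functions are hypothetical.  Take a 16-byte vector
of four elements whose required alignment is only 8 bytes.

```c
#include <assert.h>

/* Hypothetical description of a vector type; all values in bytes.  */
struct vec_type
{
  unsigned size;    /* total size of the vector */
  unsigned align;   /* required alignment, possibly less than size */
  unsigned nelts;   /* number of elements */
};

/* Element size derived from the alignment: only right if align == size.  */
unsigned
elt_size_from_align (const struct vec_type *v)
{
  return v->align / v->nelts;
}

/* Element size derived from the vector size: right in both cases.  */
unsigned
elt_size_from_size (const struct vec_type *v)
{
  return v->size / v->nelts;
}

/* Misalignment after advancing by NPEEL elements, bounded by the alignment
   requirement rather than by the vector size.  The alignment must be a
   power of two.  */
unsigned
misalign_after_peel (const struct vec_type *v, unsigned misalign,
                     unsigned npeel)
{
  assert (v->align != 0 && (v->align & (v->align - 1)) == 0);
  return (misalign + npeel * elt_size_from_size (v)) & (v->align - 1);
}
```

For { size 16, align 8, nelts 4 } the alignment-based computation yields an
element size of 2 instead of 4, and bounding by the size instead of the
alignment would give a misalignment of 8, which is not below the 8-byte
requirement.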

Finally, two test cases still failed spuriously:

- gcc.dg/align-2.c actually checked that vector types are naturally
  aligned

- gcc.dg/vect/slp-25.c checked that we needed to perform peeling for
  alignment, which we actually don't need any more if vector types
  have a lesser alignment requirement in the first place

I've added a new effective target flag to check whether the target
requires natural alignment for vector types, and disabled those two
tests if it doesn't.

With those changes, I've completed testing with no regressions on
arm-linux-gnueabi.

OK for mainline?

Bye,
Ulrich


ChangeLog:

* target.def (vector_alignment): New target hook.
* doc/tm.texi.in (TARGET_VECTOR_ALIGNMENT): Document new hook.
* doc/tm.texi: Regenerate.
* targhooks.c (default_vector_alignment): New function.
* targhooks.h (default_vector_alignment): Add prototype.
* stor-layout.c (layout_type): Use targetm.vector_alignment.
* config/arm/arm.c (arm_vector_alignment): New function.
(TARGET_VECTOR_ALIGNMENT): Define.

* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use
vector type alignment instead of size.
* tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Use
element type size directly instead of computing it from alignment.
Fix variable naming and comment.

testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_vect_natural_alignment): New function.
* gcc.dg/align-2.c: Only run on targets with natural alignment
of vector types.
* gcc.dg/vect/slp-25.c: Adjust tests for targets without natural
alignment of vector types.


Index: gcc/target.def
===
*** gcc/target.def  (revision 189809)
--- gcc/target.def  (working copy)
*** DEFHOOK
*** 1659,1664 
--- 1659,1672 
   bool, (enum machine_mode mode),
   hook_bool_mode_false)
  
+ DEFHOOK
+ (vector_alignment,
+  This hook can be used to define the alignment for a vector of type\n\
+ @var{type}, in order to comply with a platform ABI.  The default is to\n\
+ require natural alignment for vector types.,
+  HOST_WIDE_INT, (const_tree type),
+  default_vector_alignment)
+ 
  /* 

Re: [rs6000 3/3] Remove MQ

2012-07-27 Thread Segher Boessenkool

Please undo that change.  You made the test for that heavily used
function even more expensive.


It generates machine code identical to the original.


Segher



Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support

2012-07-27 Thread Jason Merrill

On 07/26/2012 11:19 AM, Dodji Seketeli wrote:

+  struct scoped_attributes *ns = set_attributes_namespace (attrs,
+  attrs_len,
+  name_space);
+  if (ns == NULL)
+return NULL;
+
+  for (i = 0; i < attrs_len; ++i)
+register_scoped_attribute (attrs[i], ns);


This looks kind of funny; setting their namespace and then registering 
them sound like the same thing.  Let's rename set_attributes_namespace 
to register_scoped_attributes and call register_scoped_attribute from there.



+  if (TREE_STATIC (node))
+   {
+ /* For file scope variables and static members, the target
+supports alignments that are at most
+MAX_OFILE_ALIGNMENT.  */


I think this should check TREE_STATIC || DECL_EXTERNAL.

More later.

Jason



[Patch, fortran] Remove gfc_array_ref::offset field

2012-07-27 Thread Mikael Morin
The offset field is never set; this patch removes it.

Regression tested on x86_64-unknown-linux-gnu. OK for trunk?

Mikael

2012-07-27  Mikael Morin  mik...@gcc.gnu.org

	* array.c (gfc_copy_array_ref): Don't copy the offset field.
	* expr.c (find_array_section): Ignore the offset field.
	* trans-expr.c (gfc_apply_interface_mapping_to_ref): Don't apply
	interface mapping to the offset field.
	* gfortran.h (struct gfc_array_ref): Remove the offset field.

diff --git a/array.c b/array.c
index 76bd5c3..f23d0bc 100644
--- a/array.c
+++ b/array.c
@@ -50,8 +50,6 @@ gfc_copy_array_ref (gfc_array_ref *src)
   dest->stride[i] = gfc_copy_expr (src->stride[i]);
 }
 
-  dest->offset = gfc_copy_expr (src->offset);
-
   return dest;
 }
 
diff --git a/expr.c b/expr.c
index cb5e1c6..74f204b 100644
--- a/expr.c
+++ b/expr.c
@@ -1493,10 +1493,7 @@ find_array_section (gfc_expr *expr, gfc_ref *ref)
  constructor.  */  
   for (idx = 0; idx < (int) mpz_get_si (nelts); idx++)
 {
-  if (ref->u.ar.offset)
-	mpz_set (ptr, ref->u.ar.offset->value.integer);
-  else
-	mpz_init_set_ui (ptr, 0);
+  mpz_init_set_ui (ptr, 0);
 
   incr_ctr = true;
   for (d = 0; d < rank; d++)
diff --git a/gfortran.h b/gfortran.h
index e1f2e3c..0f96772 100644
--- a/gfortran.h
+++ b/gfortran.h
@@ -1515,8 +1515,6 @@ typedef struct gfc_array_ref
 *stride[GFC_MAX_DIMENSIONS];
 
   enum gfc_array_ref_dimen_type dimen_type[GFC_MAX_DIMENSIONS];
-
-  struct gfc_expr *offset;
 }
 gfc_array_ref;
 
diff --git a/trans-expr.c b/trans-expr.c
index 263605a..2603995 100644
--- a/trans-expr.c
+++ b/trans-expr.c
@@ -2806,7 +2806,6 @@ gfc_apply_interface_mapping_to_ref (gfc_interface_mapping * mapping,
 	gfc_apply_interface_mapping_to_expr (mapping, ref->u.ar.end[n]);
 	gfc_apply_interface_mapping_to_expr (mapping, ref->u.ar.stride[n]);
 	  }
-	gfc_apply_interface_mapping_to_expr (mapping, ref->u.ar.offset);
 	break;
 
   case REF_COMPONENT:



Re: [patch[ Add explanations to sbitmap, bitmap, and sparseset

2012-07-27 Thread William J. Schmidt
On Fri, 2012-07-27 at 15:40 +0200, Richard Guenther wrote:
 On Thu, Jul 26, 2012 at 11:57 AM, Steven Bosscher stevenb@gmail.com 
 wrote:
  On Thu, Jul 26, 2012 at 11:23 AM, Richard Guenther
  richard.guent...@gmail.com wrote:
  Ok!  Thanks for adding this exhaustive documentation.
 
  There's more to come! I want to add some explanations to ebitmap,
  pointer-set, fibheap, and splay-tree as sets, and add a chapter in the
  gccint manual too.
 
  Now if only you'd document those loop changes... ;-)
 
 Eh ...
 
 
  Btw, ebitmap is unused since it was added - maybe we should simply remove
  it ...?
 
  I wouldn't remove it just yet. I'm going to make sure that bitmap.[ch]
  and ebitmap.[ch] provide the same interface and see if there are
  places where ebitmap is a better choice than bitmap or sbitmap (cprop
  and gcse.c come to mind).
 
 Btw, just looking over sparseset.h what needs to be documented is that
 iterating over the set is faster than for an sbitmap but element ordering
 is random!  Also it looks less efficient than sbitmap in the case when
 your main operation is adding to the set and querying the set randomly.
 Its space overhead is really huge - for smaller universes a smaller
 SPARSESET_ELT_TYPE would be nice, templates to the rescue!  I
 wonder in which cases a unsigned HOST_WIDEST_FAST_INT sized
 universe is even useful (but a short instead of an int is probably too
 small ...)
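
For readers who have not looked at sparseset.h, a minimal sketch of the
sparse/dense representation behind these trade-offs.  It is not the GCC
implementation: the names are made up, and calloc is used so that nothing
is read uninitialized, whereas the classic formulation leaves the arrays
uncleared.  Two arrays of universe size are needed, which is where the
space overhead comes from.

```c
#include <assert.h>
#include <stdlib.h>

struct sparse_set
{
  unsigned *sparse;    /* sparse[e] is e's position in dense[], if a member */
  unsigned *dense;     /* the members, packed, in no particular order */
  unsigned members;    /* number of valid entries in dense[] */
  unsigned universe;   /* elements are 0 .. universe - 1 */
};

/* Returns 0 on success, -1 on allocation failure.  UNIVERSE must be > 0.  */
int
sparse_set_init (struct sparse_set *s, unsigned universe)
{
  s->sparse = calloc (universe, sizeof *s->sparse);
  s->dense = calloc (universe, sizeof *s->dense);
  s->members = 0;
  s->universe = universe;
  if (s->sparse == NULL || s->dense == NULL)
    {
      free (s->sparse);
      free (s->dense);
      return -1;
    }
  return 0;
}

void
sparse_set_free (struct sparse_set *s)
{
  free (s->sparse);
  free (s->dense);
}

/* Membership test: two array reads and two comparisons.  */
int
sparse_set_has (const struct sparse_set *s, unsigned e)
{
  unsigned ix;

  assert (e < s->universe);
  ix = s->sparse[e];
  return ix < s->members && s->dense[ix] == e;
}

void
sparse_set_add (struct sparse_set *s, unsigned e)
{
  if (sparse_set_has (s, e))
    return;
  s->sparse[e] = s->members;
  s->dense[s->members++] = e;
}

/* Removal moves the last member into the hole, which is why iterating
   over dense[] does not visit elements in sorted order.  */
void
sparse_set_remove (struct sparse_set *s, unsigned e)
{
  unsigned ix, last;

  if (!sparse_set_has (s, e))
    return;
  ix = s->sparse[e];
  last = s->dense[--s->members];
  s->dense[ix] = last;
  s->sparse[last] = ix;
}
```

Iteration is a plain walk over dense[0 .. members - 1], so it costs time
proportional to the number of members rather than to the universe size.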

Another option for sparse sets would be a templatized version of Pugh's
skip lists.  Iteration is the same as a linked list and random access is
logarithmic in the size of the set (not the universe).  Space overhead
is also logarithmic.  The potential downside is that it involves
pointers.

Bill

 
 Richard.
 
  Ciao!
  Steven



[PATCH v2, i386]: Handle zero extended addresses in ix86_avoid_lea_for_addr

2012-07-27 Thread Uros Bizjak
On Fri, Jul 27, 2012 at 11:29 AM, Uros Bizjak ubiz...@gmail.com wrote:

 Attached patch enables ix86_avoid_lea_for_addr to process
 zero-extended addresses. This patch should help atom performance,
 especially in x32 mode.

 Please note the complication with insn re-recognition in
 ix86_avoid_lea_for_addr, to solve the problem as described in the
 comment:

   /* ix86_avoid_lea_for_addr re-recognizes insn and changes operands[]
  array behind our backs.  To make things worse, zero-extended oeprands
  (zero_extend:DI (addr:SI)) are re-recognized as (addr:DI), since they
  also satisfy operand constraints of one of many *leamode insn patterns.

Actually, the instruction gets re-recognized as
*zero_extendsidi2_rex64, this is the reason why we got DImode
(addr:DI) operand. This fact further uncovers existing problem with
ix86_avoid_lea_for_addr. This function should not mark addresses
having less than two operands for splitting. These patterns are
re-recognized as MOV (and now as zero-extending MOVL) due to the
approach, described in the comment above, and due to the fact that we
define *mov{si,di} and *zero_extendsidi2_rex64 patterns before
*leamode in the i386.md.

However, there is no point messing with these patterns in splitters;
they are conditionally converted to LEAs at the insn emission phase
(see e.g. the *zero_extendsidi2_rex64 change in the attached patch). The
attached patch prevents splitting by a simple criteria function.
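
The criterion can be pictured with a stand-alone sketch.  struct addr_parts
below is a hypothetical stand-in for the decomposed address (the real code
works on rtx values), and the function names are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for a decomposed address:
   base + index * scale + disp.  */
struct addr_parts
{
  const void *base;    /* NULL if absent */
  const void *index;   /* NULL if absent */
  const void *disp;    /* NULL if absent */
  int scale;           /* 1 means no scaling */
};

/* Count how many components the address really has; a scale factor
   greater than one counts as a component of its own.  */
int
addr_component_count (const struct addr_parts *p)
{
  return (p->base != NULL) + (p->index != NULL) + (p->disp != NULL)
         + (p->scale > 1);
}

/* An address with fewer than two components is just a move, so it is
   not a candidate for splitting.  */
int
worth_splitting_p (const struct addr_parts *p)
{
  return addr_component_count (p) >= 2;
}
```

An address consisting of a base register alone has one component and is
left alone; base plus scaled index has three and remains a candidate.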

As a bonus, the patch also includes conditional splitter for
non-destructive zero-extended adds.

2012-07-27  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_avoid_lea_for_addr): Handle
zero-extended addresses.  Return false if the address has less
than two components.
(ix86_split_lea_for_addr): Unconditionally convert target and
all address operands to requested mode.
* config/i386/i386.md (*leamode): Pass SImode to
ix86_split_lea_for_addr when splitting zero-extended address.
(zero-extended add splitter): New splitter to conditionally split
non-destructive adds.
(*zero_extendsidi2_rex64): Conditionally emit leal instead of movl.

I am currently re-testing v2 patch.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 189915)
+++ config/i386/i386.c  (working copy)
@@ -17036,11 +17036,6 @@ ix86_avoid_lea_for_addr (rtx insn, rtx operands[])
   struct ix86_address parts;
   int ok;
 
-  /* FIXME: Handle zero-extended addresses.  */
-  if (GET_CODE (operands[1]) == ZERO_EXTEND
-  || GET_CODE (operands[1]) == AND)
-return false;
-
   /* Check we need to optimize.  */
   if (!TARGET_OPT_AGU || optimize_function_for_size_p (cfun))
 return false;
@@ -17052,6 +17047,11 @@ ix86_avoid_lea_for_addr (rtx insn, rtx operands[])
   ok = ix86_decompose_address (operands[1], &parts);
   gcc_assert (ok);
 
+  /* There should be at least two components in the address.  */
+  if ((parts.base != NULL_RTX) + (parts.index != NULL_RTX)
+  + (parts.disp != NULL_RTX) + (parts.scale > 1) < 2)
+return false;
+
   /* We should not split into add if non legitimate pic
  operand is used as displacement. */
   if (parts.disp && flag_pic && !LEGITIMATE_PIC_OPERAND_P (parts.disp))
@@ -17124,7 +17124,7 @@ ix86_emit_binop (enum rtx_code code, enum machine_
It is assumed that it is allowed to clobber flags register
at lea position.  */
 
-extern void
+void
 ix86_split_lea_for_addr (rtx operands[], enum machine_mode mode)
 {
   unsigned int regno0, regno1, regno2;
@@ -17135,7 +17135,7 @@ ix86_split_lea_for_addr (rtx operands[], enum mach
   ok = ix86_decompose_address (operands[1], &parts);
   gcc_assert (ok);
 
-  target = operands[0];
+  target = gen_lowpart (mode, operands[0]);
 
   regno0 = true_regnum (target);
   regno1 = INVALID_REGNUM;
@@ -17143,18 +17143,19 @@ ix86_split_lea_for_addr (rtx operands[], enum mach
 
   if (parts.base)
 {
-  if (GET_MODE (parts.base) != mode)
-   parts.base = gen_lowpart (mode, parts.base);
+  parts.base = gen_lowpart (mode, parts.base);
   regno1 = true_regnum (parts.base);
 }
 
   if (parts.index)
 {
-  if (GET_MODE (parts.index) != mode)
-   parts.index = gen_lowpart (mode, parts.index);
+  parts.index = gen_lowpart (mode, parts.index);
   regno2 = true_regnum (parts.index);
 }
 
+  if (parts.disp)
+parts.disp = gen_lowpart (mode, parts.disp);
+
   if (parts.scale > 1)
 {
   /* Case r1 = r1 + ...  */
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 189915)
+++ config/i386/i386.md (working copy)
@@ -3474,13 +3474,28 @@
 (match_operand:SI 1 x86_64_zext_general_operand
rmWz,0,r   ,m  ,r   ,m)))]
   TARGET_64BIT
-  @
-   mov{l}\t{%1, %k0|%k0, %1}
-   #
-   movd\t{%1, %0|%0, %1}
-   movd\t{%1, %0|%0, %1}
-   

Re: [Patch, Fortran] assumed-rank some bound intrinsics support, fix failures and improve diagnostcs

2012-07-27 Thread Mikael Morin
On 26/07/2012 17:32, Tobias Burnus wrote:
 On 07/26/2012 05:12 PM, Mikael Morin wrote:
 On 26/07/2012 16:53, Mikael Morin wrote:
 Here is a draft for those. Lightly tested with print *, ... 
 
 Looks rather nice. The output for test1 is also  good:
 
   integer :: a(1:3,-2:5)
 gives
   lbound(arg) == [1, 1]
   ubound(arg) == [3, 8]
   shape(arg) == [3, 8]
 
 However, if the dummy is allocatable or a pointer, the result should be:
 
   lbound(arg) == [1, -2]
   ubound(arg) == [3, 5]
   shape(arg) == [3, 8]
 
 which your second test case doesn't give.

Hello,

do you have a test case exhibiting the problem?
It seems fine to me.

$ ./test1
   1   1
   3   8
   3   8
   1   1
   3   8
   3   8
   1  -2
   3   5
   3   8
   1  -2
   3   5
   3   8

./test2
  11 101
  13 108
   3   8
  11  97
  12 106
   2  10
  13  99
  15 110
   3  12





program test

  integer :: a(1:3,-2:5)
  integer, allocatable :: b(:,:)

  call foo(a)

  allocate(b(1:3,-2:5))
  call foo(b)
  call bar(b)
  call baz(b)

contains
  subroutine foo(arg)
integer :: arg(..)

print *, lbound(arg)
print *, ubound(arg)
print *, shape(arg)
  end subroutine foo
  subroutine bar(arg)
integer, allocatable :: arg(:,:)

print *, lbound(arg)
print *, ubound(arg)
print *, shape(arg)
  end subroutine bar
  subroutine baz(arg)
integer, allocatable :: arg(..)

print *, lbound(arg)
print *, ubound(arg)
print *, shape(arg)
  end subroutine baz
end program test


program test

  integer  :: a(1:3,-2:5)
  integer, allocatable :: b(:,:)
  integer, allocatable :: c(:,:)
  integer, pointer :: d(:,:)

  b = foo(a)
  print *,b(:,1)
  print *,b(:,2)
  print *,b(:,3)

  allocate(c(1:2,-3:6))
  b = bar(c)
  print *,b(:,1)
  print *,b(:,2)
  print *,b(:,3)

  allocate(d(3:5,-1:10))
  b = baz(d)
  print *,b(:,1)
  print *,b(:,2)
  print *,b(:,3)

contains
  function foo(arg) result(res)
integer :: arg(..)
integer, allocatable :: res(:,:)

allocate(res(rank(arg), 3))

res(:,1) = lbound(arg) + (/ 10, 100 /)
res(:,2) = (/ 10, 100 /) + ubound(arg) 
res(:,3) = shape(arg)

  end function foo
  function bar(arg) result(res)
integer, allocatable :: arg(..)
integer, allocatable :: res(:,:)

allocate(res(rank(arg), 3))

res(:,1) = lbound(arg) + (/ 10, 100 /)
res(:,2) = (/ 10, 100 /) + ubound(arg) 
res(:,3) = shape(arg)

  end function bar
  function baz(arg) result(res)
integer, pointer :: arg(..)
integer, allocatable :: res(:,:)

allocate(res(rank(arg), 3))

res(:,1) = lbound(arg) + (/ 10, 100 /)
res(:,2) = (/ 10, 100 /) + ubound(arg) 
res(:,3) = shape(arg)

  end function baz
end program test


Re: [PATCH v2, i386]: Handle zero extended addresses in ix86_avoid_lea_for_addr

2012-07-27 Thread Uros Bizjak
On Fri, Jul 27, 2012 at 7:16 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Jul 27, 2012 at 11:29 AM, Uros Bizjak ubiz...@gmail.com wrote:

 Attached patch enables ix86_avoid_lea_for_addr to process
 zero-extended addresses. This patch should help atom performance,
 especially in x32 mode.

 Please note the complication with insn re-recognition in
 ix86_avoid_lea_for_addr, to solve the problem as described in the
 comment:

   /* ix86_avoid_lea_for_addr re-recognizes insn and changes operands[]
  array behind our backs.  To make things worse, zero-extended oeprands
  (zero_extend:DI (addr:SI)) are re-recognized as (addr:DI), since they
  also satisfy operand constraints of one of many *leamode insn 
 patterns.

 Actually, the instruction gets re-recognized as
 *zero_extendsidi2_rex64, this is the reason why we got DImode
 (addr:DI) operand. This fact further uncovers existing problem with
 ix86_avoid_lea_for_addr. This function should not mark addresses
 having less than two operands for splitting. These patterns are
 re-recognized as MOV (and now as zero-extending MOVL) due to the
 approach, described in the comment above, and due to the fact that we
 define *mov{si,di} and *zero_extendsidi2_rex64 patterns before
 *leamode in the i386.md.

 However, there is no point messing with these patterns in splitters;
 they are conditionally converted to LEAs at the insn emission phase
 (see e.g. the *zero_extendsidi2_rex64 change in the attached patch). The
 attached patch prevents splitting by a simple criteria function.

 As a bonus, the patch also includes conditional splitter for
 non-destructive zero-extended adds.

 2012-07-27  Uros Bizjak  ubiz...@gmail.com

 * config/i386/i386.c (ix86_avoid_lea_for_addr): Handle
 zero-extended addresses.  Return false if the address has less
 than two components.
 (ix86_split_lea_for_addr): Unconditionally convert target and
 all address operands to requested mode.
 * config/i386/i386.md (*leamode): Pass SImode to
 ix86_split_lea_for_addr when splitting zero-extended address.
 (zero-extended add splitter): New splitter to conditionally split
 non-destructive adds.
 (*zero_extendsidi2_rex64): Conditionally emit leal instead of movl.

 I am currently re-testing v2 patch.

... now with correct v2 patch attached.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 189915)
+++ config/i386/i386.md (working copy)
@@ -3474,13 +3474,28 @@
 (match_operand:SI 1 x86_64_zext_general_operand
rmWz,0,r   ,m  ,r   ,m)))]
   TARGET_64BIT
-  @
-   mov{l}\t{%1, %k0|%k0, %1}
-   #
-   movd\t{%1, %0|%0, %1}
-   movd\t{%1, %0|%0, %1}
-   %vmovd\t{%1, %0|%0, %1}
-   %vmovd\t{%1, %0|%0, %1}
+{
+  switch (get_attr_type (insn))
+{
+case TYPE_IMOVX:
+  if (ix86_use_lea_for_mov (insn, operands))
+   return lea{l}\t{%E1, %k0|%k0, %E1};
+  else
+   return mov{l}\t{%1, %k0|%k0, %1};
+
+case TYPE_MULTI:
+  return #;
+
+case TYPE_MMXMOV:
+  return movd\t{%1, %0|%0, %1};
+
+case TYPE_SSEMOV:
+  return %vmovd\t{%1, %0|%0, %1};
+
+default:
+  gcc_unreachable ();
+}
+}
   [(set_attr type imovx,multi,mmxmov,mmxmov,ssemov,ssemov)
(set_attr prefix orig,*,orig,orig,maybe_vex,maybe_vex)
(set_attr prefix_0f 0,*,*,*,*,*)
@@ -5479,7 +5494,26 @@
   reload_completed  ix86_avoid_lea_for_addr (insn, operands)
   [(const_int 0)]
 {
-  ix86_split_lea_for_addr (operands, MODEmode);
+  enum machine_mode mode = MODEmode;
+  rtx addr;
+
+  /* ix86_avoid_lea_for_addr re-recognizes insn and changes operands[]
+ array behind our backs.  To make things worse, zero-extended oeprands
+ (zero_extend:DI (addr:SI)) are re-recognized as (addr:DI), since they
+ also satisfy operand constraints of other insn patterns.
+
+ However, at this point we are looking only if the original insn
+ is performing inherent zero extension, and will emit
+ split insn sequence in SImode for this case.  */
+  addr = SET_SRC (PATTERN (curr_insn));
+
+  /* Emit all operations in SImode for zero-extended addresses.  Recall
+ that x86_64 inheretly zero-extends SImode operations to DImode.  */
+  if (GET_CODE (addr) == ZERO_EXTEND
+  || GET_CODE (addr) == AND)
+mode = SImode;
+
+  ix86_split_lea_for_addr (operands, mode);
   DONE;
 }
   [(set_attr type lea)
@@ -5807,11 +5841,11 @@
 (define_split
   [(set (match_operand:SWI48 0 register_operand)
(plus:SWI48 (match_operand:SWI48 1 register_operand)
-  (match_operand:SWI48 2 nonmemory_operand)))
+   (match_operand:SWI48 2 x86_64_nonmemory_operand)))
(clobber (reg:CC FLAGS_REG))]
   reload_completed  ix86_avoid_lea_for_add (insn, operands)
   [(set (match_dup 0) (match_dup 1))
-   (parallel [(set (match_dup 0) (plus:MODE (match_dup 0) (match_dup 2)))
+   (parallel [(set 

[PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?)

2012-07-27 Thread Ulrich Weigand
Tejas Belagod wrote:
 Tejas Belagod wrote:
  This is because offsettable_address_addr_space_p () gets as far as calling
  strict_memory_address_addr_space_p () with a QImode and (mode_sz - 1) which
  returns true. The only way I see offsettable_address_addr_space_p () 
  returning
  false would be mode_dependent_address_p () to return true for addr 
  expression
  (PLUS (reg) (16)) which partly makes sense to me because PLUS is a
  mode-dependent address in that it cannot be allowed for NEON addressing 
  modes,
  but it seems very generic for mode_dependent_address_p () to return true for
  PLUS in general instead of making a special case for vector modes. This
  distinction cannot be made in the target's mode_dependent_address_p() as 
  'mode'
  is not supplied to it.
 
 I dug a little deeper into offsettable_address_addr_space_p () and realized 
 that
 it only gets reg_equiv_mem () which is MEM:OI (reg sp) to work with which does
 not include the SUBREG_BYTE, therefore mode_dependent_address_p () does not 
 have
 PLUS to check for address tree-mode dependency.

Sorry for the late reply, it took me a while to understand what's
really going on here.

I now agree that this is definitely a bug in reload; it's clear that
offsettable_memref_p does not and cannot catch this case.  In fact,
it does not even have enough information to answer the question in
any sensible way (except for just about always returning no, which
wouldn't really be useful).

I also agree with the general approach in your patch.  The basic idea
is that it makes no sense to ask a generic question like "would it
be valid to add some (unknown) offset and change to some (unknown)
mode?", when instead we can ask quite specifically for the *known*
offset and mode we want to change to.  However, I'd prefer this to
go even further than what your patch does: I think we should not
be querying offsettable_memref_p *at all* here.

With your patch, you still call offsettable_memref_p on the address
that has already been offset -- asking for more offsets seems pointless.
Also, there is another call to offsettable_memref_p *within*
find_reloads_subreg_address --which gets used when FORCE_REPLACE is
false-- which suffers from the same problem as the call in
find_reloads_toplev, and likewise ought to be eliminated.


Taking a step back, let's look at the ways a (subreg (reg)) where
reg is a pseudo equivalent to a memory location can be handled:

- The default way would be to reload the inner reg in its proper
  mode from its equivalent memory location into a reload register,
  and use a subreg of that register as the operand.  This always
  works correctly, but sometimes causes unnecessary reloads, if the
  insn could actually handle a memory operand directly.

- To avoid that unnecessary reload, we can instead attempt to replace
  the whole subreg with a modified (usually narrowed or widened)
  memory reference.  This can then be either used directly in the insn,
  or itself be reloaded.

In the second case (outer reload), there can be a number of issues:

- We may not be allowed at all to change the memory access if:

  * we have a paradoxical subreg that is implicitly handled as a
    SIGN_EXTEND or ZERO_EXTEND due to LOAD_EXTEND_OP

  * we have a normal subreg that is implicitly acting on the full
    register due to WORD_REGISTER_OPERATIONS  (the check for this
    seems to be incomplete in current code!)

  * the equivalent memory address is mode-dependent

  * we have a paradoxical subreg, and we cannot prove the widened
    memory is properly aligned (so we may cause either a misaligned
    access, or access to unmapped memory)

- Even if we are in principle allowed to change the memory access,
  the modified address might not be valid (either because the
  original equivalent address is already invalid, or because it
  becomes invalid when adding an offset and/or changing its mode).
  We can still do the outer access in that case, but we will have
  to push address reloads (based on the modified address).

  Current code tries to be clever as to when to perform the
  substitution of the modified memory address: if it thinks no
  address reloads will be required in either case, it leaves the
  address as (subreg (reg)), allowing find_reloads to choose
  between (inner or outer) reloads or doing an (outer) access
  as memory operand directly.  In either case, the actual change
  to a mem happens in cleanup_subreg_operands at the end.

  On the other hand, if address reloads *are* required, it is
  find_reloads_toplev/find_reloads_subreg_address that will
  replace either the subreg or the reg with an explicit (outer
  or inner) memory access, and push the corresponding address
  reloads.  [ Note that find_reloads now no longer has the
  choice of switching between inner and outer access.  In the
  case of an outer access, there still is the choice between
  a direct memory access in the insn and a reload.  ]

- Even if we are allowed to change the 

Re: Diagnostics from GCC_DRIVER_HOST_INITIALIZATION

2012-07-27 Thread Dodji Seketeli
Gabriel Dos Reis g...@integrable-solutions.net a écrit:

  2012-07-20  Ryan Mansfield  rmansfi...@qnx.com
 
  * gcc.c (main): Move GCC_DRIVER_HOST_INITIALIZATION after
  diagnostic_initialize.
 
  Could someone please apply the change?

 The change seems small and obvious enough to not require copyright
 assignment on file, but, just to be sure, Ryan, do you have copyright
 assignment to the FSF on file (sorry if my question is stupid)?

 Gaby, can I go ahead and apply this?


 yes, thank you!

Committed at revision r189918.

Thanks.

-- 
Dodji


libiberty/md5: fix strict alias warnings

2012-07-27 Thread Mike Frysinger
Current libiberty md5 code triggers these warnings with gcc-4.7.1 for me:

libiberty/md5.c: In function ‘md5_finish_ctx’:
libiberty/md5.c:117:3: warning: dereferencing type-punned pointer will break 
strict-aliasing rules [-Wstrict-aliasing]
libiberty/md5.c:118:3: warning: dereferencing type-punned pointer will break 
strict-aliasing rules [-Wstrict-aliasing]

The change below fixes things for me.  The optimized output (-O2) is the same
before/after my change on x86_64-linux.  I imagine it'll be the same for most
targets.  It seems simpler than using a union on the md5_ctx buffer since these
are the only two locations in the code where this occurs.
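
A minimal sketch of the memcpy idiom described above, separate from the
actual patch below. The helper names store_u32, load_u32 and
store_bit_length are assumptions for illustration, and the byte-order
SWAP step is left out (as if on a host where it is the identity).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store VALUE at byte offset OFF in BUF without casting BUF to a
   uint32_t pointer, so no strict-aliasing rule is violated.  */
static void
store_u32 (unsigned char *buf, size_t off, uint32_t value)
{
  assert (buf != NULL);
  memcpy (buf + off, &value, sizeof value);
}

/* Read a uint32_t back from byte offset OFF in BUF the same way.  */
static uint32_t
load_u32 (const unsigned char *buf, size_t off)
{
  uint32_t value;
  assert (buf != NULL);
  memcpy (&value, buf + off, sizeof value);
  return value;
}

/* Write a 64-bit length in bits, given a byte count split into two
   32-bit halves TOTAL0 (low) and TOTAL1 (high), as two 32-bit words
   starting at OFF.  No byte swapping is done in this sketch.  */
static void
store_bit_length (unsigned char *buf, size_t off,
                  uint32_t total0, uint32_t total1)
{
  uint32_t low = (uint32_t) (total0 << 3);
  uint32_t high = (uint32_t) ((total1 << 3) | (total0 >> 29));
  store_u32 (buf, off, low);
  store_u32 (buf, off + 4, high);
}
```

Compilers commonly turn a fixed-size memcpy like this into a plain
store, which matches the observation that the -O2 output did not change.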

2012-07-27  Mike Frysinger  vap...@gentoo.org

* md5.c (md5_finish_ctx): Declare swap_bytes.  Assign SWAP() output
to swap_bytes, and then call memcpy to move it to ctx->buffer.

--- a/libiberty/md5.c
+++ b/libiberty/md5.c
@@ -102,7 +102,7 @@
 md5_finish_ctx (struct md5_ctx *ctx, void *resbuf)
 {
   /* Take yet unprocessed bytes into account.  */
-  md5_uint32 bytes = ctx->buflen;
+  md5_uint32 swap_bytes, bytes = ctx->buflen;
   size_t pad;
 
   /* Now count remaining bytes.  */
@@ -113,10 +113,13 @@
   pad = bytes >= 56 ? 64 + 56 - bytes : 56 - bytes;
   memcpy (&ctx->buffer[bytes], fillbuf, pad);
 
-  /* Put the 64-bit file length in *bits* at the end of the buffer.  */
-  *(md5_uint32 *) &ctx->buffer[bytes + pad] = SWAP (ctx->total[0] << 3);
-  *(md5_uint32 *) &ctx->buffer[bytes + pad + 4] = SWAP ((ctx->total[1] << 3) |
-                                                       (ctx->total[0] >> 29));
+  /* Put the 64-bit file length in *bits* at the end of the buffer.
+     Use memcpy to avoid aliasing problems.  On most systems, this
+     will be optimized away to the same code.  */
+  swap_bytes = SWAP (ctx->total[0] << 3);
+  memcpy (&ctx->buffer[bytes + pad], &swap_bytes, sizeof (swap_bytes));
+  swap_bytes = SWAP ((ctx->total[1] << 3) | (ctx->total[0] >> 29));
+  memcpy (&ctx->buffer[bytes + pad + 4], &swap_bytes, sizeof (swap_bytes));
 
   /* Process last bytes.  */
   md5_process_block (ctx->buffer, bytes + pad + 8, ctx);




Re: libiberty/md5: fix strict alias warnings

2012-07-27 Thread Ian Lance Taylor
On Fri, Jul 27, 2012 at 11:36 AM, Mike Frysinger vap...@gentoo.org wrote:

 2012-07-27  Mike Frysinger  vap...@gentoo.org

 * md5.c (md5_finish_ctx): Declare swap_bytes.  Assign SWAP() output
 to swap_bytes, and then call memcpy to move it to ctx->buffer.

/* Take yet unprocessed bytes into account.  */
 -  md5_uint32 bytes = ctx->buflen;
 +  md5_uint32 swap_bytes, bytes = ctx->buflen;

Please add a new line, rather than declaring one uninitialized and one
initialized variable on the same line.

This patch is OK with that change.

Thanks.

Ian


Re: [PATCH] convert target_expmed macro accessors into inline functions

2012-07-27 Thread Richard Henderson
On 07/27/2012 04:41 AM, Nathan Froyd wrote:
 Tested on x86_64-unknown-linux-gnu.  OK to commit?
 
 -Nathan
 
   * expmed.h (alg_hash, alg_hash_used_p, sdiv_pow2_cheap,
   smod_pow2_cheap, zero_cost, add_cost, neg_cost, shift_cost,
   shiftadd_cost, shiftsub0_cost, shiftsub1_cost, mul_cost,
   sdiv_cost, udiv_cost, mul_widen_cost, mul_highpart_cost): Delete
   macro definitions and re-purpose as inline functions.
   (alg_hash_entry_ptr, set_alg_hash_used_p, sdiv_pow2_cheap_ptr,
   set_sdiv_pow2_cheap, smod_pow2_cheap_ptr, set_smod_pow2_cheap,
   zero_cost_ptr, set_zero_cost, add_cost_ptr, set_add_cost,
   neg_cost_ptr, set_neg_cost, shift_cost_ptr, set_shift_cost,
   shiftadd_cost_ptr, set_shiftadd_cost, shiftsub0_cost_ptr,
   set_shiftsub0_cost, shiftsub1_cost_ptr, set_shiftsub1_cost,
   mul_cost_ptr, set_mul_cost, sdiv_cost_ptr, set_sdiv_cost,
   udiv_cost_ptr, set_udiv_cost, mul_widen_cost_ptr,
   set_mul_widen_cost, mul_highpart_cost_ptr, set_mul_highpart_cost):
   New functions.
   (convert_cost_ptr): New function, split out from...
   (set_convert_cost, convert_cost): ...here.
   * expmed.c, tree-ssa-loop-ivopts.c: Update for new functions.
   * gimple-ssa-strength-reduction.c: Likewise.

Ok.  And thanks!


r~


Re: [rs6000 3/3] Remove MQ

2012-07-27 Thread Segher Boessenkool

[Please don't top post.]


 Please undo that change.  You made the test for that heavily used
 function even more expensive.

It generates machine code identical to the original.



I suspect David is referring to execution time of the compiler itself


And I am talking about the machine code of the compiler itself,
generated from the GCC source code before and after my patches:
the gpc_reg_operand function machine code stays bitwise the same.
GCC is a smart enough compiler :-)

Segher



ping ancient MIPS: missed optimization patch

2012-07-27 Thread Sandra Loosemore

Richard,

This ancient patch to tweak mips_legitimize_address

http://gcc.gnu.org/ml/gcc/2008-11/msg00294.html

seems to never have been applied.  Do you have any idea whether this is 
still a useful change?  The test case given in the first message in that 
thread no longer reproduces with a recent mainline build.


-Sandra


RFA: implement C11 _Generic

2012-07-27 Thread Tom Tromey
This patch attempts to implement the C11 _Generic feature.
Based on the last comment in

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46073

I am not at all sure I've done it correctly.

There are a couple of other things that aren't clear to me.

First, should c_parser_generic_selection call mark_exp_read on each
expression in the generic association list?  I chose not to, on the
basis that those expressions are parsed but not evaluated or otherwise
used; but I am not sure that this is correct.  (I did choose to call
it for the controlling expression, by analogy with typeof).

Second, it isn't clear to me whether setting
c_inhibit_evaluation_warnings is sufficient here.  I welcome your
advice.

Finally, I'd appreciate advice on the content of the various error
messages.

I wrote some new tests.  I tried to test every constraint in the spec,
but I have never really been all that good at language lawyering.

Comments?

2012-07-27  Tom Tromey  tro...@redhat.com

* c-common.h (enum rid) RID_GENERIC: New constant.
* c-common.c (c_common_reswords): Add _Generic.

2012-07-27  Tom Tromey  tro...@redhat.com

* c-parser.c (struct c_generic_association): New.
(c_generic_association_d): New typedef.
(c_parser_generic_selection): New function.
(c_parser_postfix_expression): Handle RID_GENERIC.

2012-07-27  Tom Tromey  tro...@redhat.com

* gcc.dg/c11-generic-2.c: New file.
* gcc.dg/c11-generic-1.c: New file.

---
 gcc/c-family/ChangeLog   |5 +
 gcc/c-family/c-common.c  |1 +
 gcc/c-family/c-common.h  |4 +-
 gcc/c/ChangeLog  |7 ++
 gcc/c/c-parser.c |  193 ++
 gcc/testsuite/ChangeLog  |5 +
 gcc/testsuite/gcc.dg/c11-generic-1.c |   28 +
 gcc/testsuite/gcc.dg/c11-generic-2.c |   24 
 8 files changed, 265 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c11-generic-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-generic-2.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index b72506b..cc880de 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -418,6 +418,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Sat",             RID_SAT,       D_CONLY | D_EXT },
   { "_Static_assert",   RID_STATIC_ASSERT, D_CONLY },
   { "_Noreturn",        RID_NORETURN,  D_CONLY },
+  { "_Generic",         RID_GENERIC,   D_CONLY },
   { "__FUNCTION__",     RID_FUNCTION_NAME, 0 },
   { "__PRETTY_FUNCTION__", RID_PRETTY_FUNCTION_NAME, 0 },
   { "__alignof",        RID_ALIGNOF,    0 },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 050112e..415b6e0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1,6 +1,6 @@
 /* Definitions for c-common.c.
Copyright (C) 1987, 1993, 1994, 1995, 1997, 1998,
-   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
+   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012
Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -108,7 +108,7 @@ enum rid
   RID_FRACT, RID_ACCUM,
 
   /* C11 */
-  RID_ALIGNAS,
+  RID_ALIGNAS, RID_GENERIC,
 
   /* This means to warn that this is a C++ keyword, and then treat it
  as a normal identifier.  */
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 2237749..360cc58 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -6158,6 +6158,195 @@ c_parser_get_builtin_args (c_parser *parser, const char 
*bname,
   return true;
 }
 
+/* This represents a single generic-association.  */
+
+struct c_generic_association
+{
+  /* The location of the starting token of the type.  */
+  location_t type_location;
+  /* The association's type, or NULL_TREE for 'default'.  */
+  tree type;
+  /* The association's expression.  */
+  struct c_expr expression;
+};
+
+typedef struct c_generic_association c_generic_association_d;
+
+DEF_VEC_O (c_generic_association_d);
+DEF_VEC_ALLOC_O (c_generic_association_d, heap);
+
+/* Parse a generic-selection.  (C11 6.5.1.1).
+   
+   generic-selection:
+ _Generic ( assignment-expression , generic-assoc-list )
+ 
+   generic-assoc-list:
+ generic-association
+ generic-assoc-list , generic-association
+   
+   generic-association:
+ type-name : assignment-expression
+ default : assignment-expression
+*/
+
+static struct c_expr
+c_parser_generic_selection (c_parser *parser)
+{
+  VEC (c_generic_association_d, heap) *associations = NULL;
+  struct c_expr selector, error_expr;
+  tree selector_type;
+  struct c_generic_association matched_assoc;
+  int match_found = 0;
+  location_t generic_loc, selector_loc;
+
+  error_expr.original_code = ERROR_MARK;
+  error_expr.original_type = NULL;
+  error_expr.value = error_mark_node;
+
+  gcc_assert (c_parser_next_token_is_keyword (parser, RID_GENERIC));
+  generic_loc = c_parser_peek_token (parser)->location;
+  

Re: RFA: implement C11 _Generic

2012-07-27 Thread Joseph S. Myers
Could you explain the choices you have made for the issues raised on 
comp.std.c last month (regarding the handling of qualifiers on controlling 
expressions) and the rationale for those choices (and make sure there are 
appropriate testcases)?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch, fortran] Fix PR 54033, problems with -I

2012-07-27 Thread Thomas Koenig

Hi Janis,


On 07/26/2012 10:16 AM, Thomas Koenig wrote:


No test case because I couldn't figure out how to test for a
warning with no line number.


Try using line number 0.


That didn't work for me. Using

! { dg-do compile }
! { dg-options "-I include_6.f90 -I missing_dir" }
! { dg-warning "not a directory" "missing directory" target *-*-* 0 }
! { dg-warning "does not exist" "nonexisting directory" target *-*-* 0 }
end

got me

Warning: Include directory include_6.f90 does not exist^M
Warning: Include directory missing_dir does not exist^M
output is:
Warning: Include directory include_6.f90 does not exist^M
Warning: Include directory missing_dir does not exist^M

FAIL: gfortran.dg/include_6.f90  -O  (test for excess errors)
Excess errors:
:0:0: Warning: Include directory include_6.f90 does not exist
:0:0: Warning: Include directory missing_dir does not exist

and

! { dg-do compile }
! { dg-options "-I include_6.f90 -I missing_dir" }
! { dg-warning "not a directory" "missing directory" target *-*-* 0 }
! { dg-warning "does not exist" "nonexisting directory" target *-*-* 0 }
! { dg-excess-errors "Include directory" }
end

resulted in an XFAIL:

Warning: Include directory include_6.f90 does not exist^M
Warning: Include directory missing_dir does not exist^M
output is:
Warning: Include directory include_6.f90 does not exist^M
Warning: Include directory missing_dir does not exist^M

XFAIL: gfortran.dg/include_6.f90  -O  (test for excess errors)
Excess errors:
:0:0: Warning: Include directory include_6.f90 does not exist
:0:0: Warning: Include directory missing_dir does not exist

so dg-excess-errors seems to imply XFAIL.

The problem may be related to the fact that, when we process the
options, we do not yet have a file name, so dejagnu may have trouble
parsing the warning.

Any other ideas?

Thomas


Re: RFA: implement C11 _Generic

2012-07-27 Thread Tom Tromey
 Joseph == Joseph S Myers jos...@codesourcery.com writes:

Joseph Could you explain the choices you have made for the issues
Joseph raised on comp.std.c last month (regarding the handling of
Joseph qualifiers on controlling expressions) and the rationale for
Joseph those choices (and make sure there are appropriate testcases)?

I found this:

https://groups.google.com/forum/?fromgroups#!topic/comp.std.c/InNlRotSWTc

I wasn't really aware of 6.3.2.1, but after reading it and re-reading
6.5.1.1, I think I agree with his model 0 interpretation: no promotion
or conversion.

I don't have a standards-based reason for this, though; just my belief
that _Generic was omitted from 6.3.2.1 by mistake.

That's an amateur opinion though.  I think it would be better for you to
say what you think.


I thought I was implementing model 0, but the test program he posts
shows differently.

I'm happy to try to fix that up, but I'd like your guidance as to the
proper direction.

Tom


Re: ping ancient MIPS: missed optimization patch

2012-07-27 Thread Andrew Pinski
On Fri, Jul 27, 2012 at 12:15 PM, Sandra Loosemore
san...@codesourcery.com wrote:
 Richard,

 This ancient patch to tweak mips_legitimize_address

 http://gcc.gnu.org/ml/gcc/2008-11/msg00294.html

 seems to never have been applied.  Do you have any idea whether this is
 still a useful change?  The test case given in the first message in that
 thread no longer reproduces with a recent mainline build

I think http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33699#c9 was the
change which fixes the original testcase.

See that full bug report for more testcases and more information on
what is going on.

Thanks,
Andrew Pinski


Re: RFA: implement C11 _Generic

2012-07-27 Thread Joseph S. Myers
On Fri, 27 Jul 2012, Tom Tromey wrote:

 I found this:
 
 https://groups.google.com/forum/?fromgroups#!topic/comp.std.c/InNlRotSWTc
 
 I wasn't really aware of 6.3.2.1, but after reading it and re-reading
 6.5.1.1, I think I agree with his model 0 interpretation: no promotion
 or conversion.
 
 I don't have a standards-based reason for this, though; just my belief
 that _Generic was omitted from 6.3.2.1 by mistake.
 
 That's an amateur opinion though.  I think it would be better for you to
 say what you think.

I think there are three standards issues and two GCC issues:

* Integer promotions.  I think it's clear that those do not occur; a 
short value remains as short rather than int.

* Conversion of qualified to unqualified types (which normally would occur 
as part of lvalue-to-rvalue conversion, but might also be an issue with 
casts to qualified types).  The given example of a cbrt type-generic macro 
would only work as users expect (given that an argument might be a 
variable of type const long double, or a cast to const long double) if 
the expression (whether lvalue or rvalue) is converted from qualified to 
unqualified type.  That suggests qualifiers should be discarded on the 
controlling expression (and if one of the type names specifies a qualified 
type, that case can never match).

* Conversion of arrays and function designators to pointers.  Given the 
removal of qualifiers it would seem natural for this to apply as well (to 
the controlling expression - not of course to the selected expression, 
given that this may explicitly be a function designator).  But unlike the 
case of qualifiers, there's no real pointer in the standard to what's 
intended.

And the GCC issues:

* GCC doesn't have very well-defined semantics for whether an rvalue has a 
qualified type or not.  This only appears as an issue with typeof at 
present, but does require more care about eliminating qualifiers for 
_Generic.

* build_unary_op has code

  /* If the lvalue is const or volatile, merge that into the type
 to which the address will point.  This is only needed
 for function types.  */

that will give pointer-to-qualified type to expressions such as &abort -
the address of a noreturn function.  Again, at language level this is only
visible with typeof at present.  But in C11 terms, _Noreturn is not part
of a function's type.  This means that if the controlling expression is
something such as &abort, with pointer-to-qualified-function type, the
qualifiers on the pointer target type need to be removed for the
comparisons.
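
A minimal sketch of the three standards issues listed above. It assumes
a compiler that treats the controlling expression as if it had undergone
lvalue, array-to-pointer and function-to-pointer conversion but no
integer promotion, which is how current GCC and Clang behave as far as I
know; it was not settled when this thread was written. The TYPE_ID macro
and the demo_* helpers are made up for illustration.

```c
#include <assert.h>

/* Map the type of X, as _Generic sees it, to a small integer.  */
#define TYPE_ID(x) _Generic ((x),  \
    short: 1,                      \
    int: 2,                        \
    long double: 3,                \
    char *: 4,                     \
    void (*) (void): 5,            \
    default: 0)

static void noop (void) { }

/* No integer promotion: a short operand selects "short".  */
static int demo_short (void) { short s = 0; return TYPE_ID (s); }

/* An arithmetic expression is promoted as usual, so s + s is int.  */
static int demo_sum (void) { short s = 0; return TYPE_ID (s + s); }

/* Qualifiers dropped: a const long double lvalue selects "long double".  */
static int demo_const (void) { const long double x = 0; return TYPE_ID (x); }

/* Array-to-pointer conversion: char[4] selects "char *".  */
static int demo_array (void) { char a[4] = { 0 }; return TYPE_ID (a); }

/* Function designator converts to a pointer to function.  */
static int demo_func (void) { noop (); return TYPE_ID (noop); }

/* Self-check under the assumptions stated above.  */
static void
check_generic_model (void)
{
  assert (demo_short () == 1);
  assert (demo_sum () == 2);
  assert (demo_const () == 3);
  assert (demo_array () == 4);
  assert (demo_func () == 5);
}
```

A compiler that kept the qualifier on the controlling expression would
instead fall through to default for demo_const, which is exactly the
cbrt-style failure described in the second point.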

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch, fortran] Fix PR 54033, problems with -I

2012-07-27 Thread Janis Johnson
On 07/27/2012 01:06 PM, Thomas Koenig wrote:
 Hi Janis,
 
 On 07/26/2012 10:16 AM, Thomas Koenig wrote:

 No test case because I couldn't figure out how to test for a
 warning with no line number.

 Try using line number 0.
 
 That didn't work for me. Using
 
 ! { dg-do compile }
 ! { dg-options "-I include_6.f90 -I missing_dir" }
 ! { dg-warning "not a directory" "missing directory" target *-*-* 0 }
 ! { dg-warning "does not exist" "nonexisting directory" target *-*-* 0 }
 end
 
 got me
 
 Warning: Include directory include_6.f90 does not exist^M
 Warning: Include directory missing_dir does not exist^M
 output is:
 Warning: Include directory include_6.f90 does not exist^M
 Warning: Include directory missing_dir does not exist^M
 
 FAIL: gfortran.dg/include_6.f90  -O  (test for excess errors)
 Excess errors:
 :0:0: Warning: Include directory include_6.f90 does not exist
 :0:0: Warning: Include directory missing_dir does not exist

Use { target *-*-* } instead of target *-*-*.

Notice that the two warnings have the same text, so the directive
looking for "not a directory" will fail.

Janis


Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support

2012-07-27 Thread Jason Merrill

On 07/26/2012 11:19 AM, Dodji Seketeli wrote:

+// Example taken from dcl.attr.grammar:
+
+int p[10];
+void f()
+{
+int x = 42, y[5];
+/* Here, the '[[gnu::' should have introduced an attribute, not a
+   lambda invocation in an array subscripting expression.  */
+int(p[[gnu::x] { return x; }()]); // { dg-error "expected|consecutive" }
+/* Likewise, the '[[gnu::' is invalid here.  */
+y[[gnu::] { return 2; }()] = 2; // { dg-error "expected|consecutive" }


The example in the standard doesn't have gnu:: in it.  Search and 
replace error?



+  if (cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE) == NULL
+ || cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE) == 
NULL)


Let's use ! rather than == NULL so that these lines fit in 80 chars.


+  /* If E is a constant, fold it and return it right away.
+Otherwise, build an ALIGNOF_EXPR that will be substituted
+into, later at template instantiation time.  */
+  tree cst = TYPE_P (e) ? NULL_TREE : cxx_constant_value (e);


I think you want fold_non_dependent_expr_sfinae rather than 
cxx_constant_value.



+  /*
+[dcl.align]/3


Extra newline.


+int  [[gnu::format(printf, 1, 2)]] foo(const char *, ...);


This seems wrong; you can't apply the format attribute to a pointer type.

About your earlier question on IRC, the solution to the problem you're 
having with



struct A {int i;} [[gnu::aligned(16)]] a;


is that C++11 attributes after the closing brace should not be passed to 
finish_struct.  Instead, we should apply them to the type (without 
TYPE_IN_PLACE) after the type is complete.



+typedef union { int i; } U [[gnu::transparent_union]];


For the same reason, this should also be rejected; the testcase should 
put the attribute before the opening brace.  We accept this with 
GNU-style attributes for backward compatibility, but there's no reason 
to propagate that lossage into the new syntax.



+template<typename T> struct A
+{
+  void foo() const;
+} [[gnu::aligned(4)]];


Likewise.


+  typedef void ([[gnu::__stdcall__]] T::*F) (L*);


I don't think a C++11 attribute can appear in this position.  I think it 
should be


 typedef void (T::*F)(L*) [[gnu::__stdcall__]]


+virtual void [[gnu::__stdcall__]] A(L *listener) = 0;


Similarly, here the stdcall attribute appertains to void, which makes 
no sense.



+S [[gnu::__stdcall__]] getS();
+extern int * ([[gnu::stdcall]] *fooPtr)( void);
+int * [[gnu::stdcall]] myFn01( void) { return 0; }


Likewise.


+  typedef [[gnu::aligned (16)]] struct {


This also seems ill-formed, as there are no type-specifiers before the 
attribute, so there's nothing for it to appertain to.



+int
+[[gnu::noreturn]]
+[[gnu::unused]]
+one(void); // OK


noreturn doesn't make sense for int.  Nor does unused, really.  C++11 
attributes that appertain to the function must either come before any 
specifiers or after the declarator.



+template
+[[gnu::packed]]
+struct A<int>; // { dg-warning "attribute" }


Here the patch is giving the wrong warning; it complains about applying
the attribute to A<int>, but actually we should be warning about an
attribute that would appertain to all the declarators in a declaration
with no declarators for it to appertain to.



+[[gnu::deprecated]] enum E { E0 }; // { dg-warning "attribute ignored in declaration of" }
+// { dg-message "must follow the" "" { target *-*-* } 3 }


Same here.


+ if (strcmp (attr_name, IDENTIFIER_POINTER (get_attribute_name 
(list))) == 0)
+  gcc_checking_assert (TREE_CODE (get_attribute_name (list)) == 
IDENTIFIER_NODE);
+ for (a = lookup_ident_attribute (get_attribute_name (a2), 
attributes);
+  a = lookup_ident_attribute (get_attribute_name (a2), 
TREE_CHAIN (a)))


Line too long.


+  /* A given attribute has been parsed as a C++-11 generalized
+ attribute.  */


Let's drop the word "generalized" throughout the patch.  I don't see how
these attributes are any more generalized than GNU attributes; we should 
just describe them as C++11 attributes.



+cxx_11_attribute_p (const_tree attr)


Let's change cxx_11 to cxx11.

Jason



[PATCH] delete last traces of GO_IF_MODE_DEPENDENT_ADDRESS

2012-07-27 Thread Nathan Froyd
Subject says it all, really.  Two targets with redundant definitions, and
two targets with trivial definitions.  Time to remove this.
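
A minimal sketch of the shape of this conversion, using toy names rather
than the real GCC macros and hooks: TOY_GO_IF_MODE_DEPENDENT_ADDRESS,
the toy_code enum and both predicates are assumptions for illustration.
The old style jumps to a label from inside a macro, while the new style
is an ordinary function returning bool.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an address's top-level RTL code.  */
enum toy_code { TOY_REG, TOY_PLUS, TOY_AND };

/* Old style: a macro that jumps to LABEL when the address is
   mode-dependent and falls through otherwise.  */
#define TOY_GO_IF_MODE_DEPENDENT_ADDRESS(CODE, LABEL) \
  do { if ((CODE) == TOY_AND) goto LABEL; } while (0)

/* A caller of the old macro needs a label to receive the jump.  */
static bool
toy_old_mode_dependent_p (enum toy_code code)
{
  TOY_GO_IF_MODE_DEPENDENT_ADDRESS (code, win);
  return false;
 win:
  return true;
}

/* New style: the same test written as a plain predicate, which can
   be installed as a target hook.  */
static bool
toy_new_mode_dependent_p (enum toy_code code)
{
  return code == TOY_AND;
}

/* Check that both forms agree on every toy code.  */
static void
toy_check_equivalence (void)
{
  for (int c = TOY_REG; c <= TOY_AND; c++)
    assert (toy_old_mode_dependent_p ((enum toy_code) c)
            == toy_new_mode_dependent_p ((enum toy_code) c));
}
```

The predicate form needs no label at the call site and can take a
const-qualified argument, which is the direction the VAX change takes.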

Tested on x86_64-unknown-linux-gnu.  Crosses to {alpha,vax}-linux-gnu built
as well.  OK to commit?

-Nathan

* defaults.h (GO_IF_MODE_DEPENDENT_ADDRESS): Delete.
* targhooks.c (default_mode_dependent_address_p): Delete code
for GO_IF_MODE_DEPENDENT_ADDRESS.
* system.h (GO_IF_MODE_DEPENDENT_ADDRESS): Poison.
* doc/tm.texi.in (GO_IF_MODE_DEPENDENT_ADDRESS): Delete documentation.
* doc/tm.texi: Regenerate.
* config/alpha.h (GO_IF_MODE_DEPENDENT_ADDRESS): Move code to...
* config/alpha.c (alpha_mode_dependent_address_p): ...here.  New
function.
(TARGET_MODE_DEPENDENT_ADDRESS_P): Define.
* config/cr16/cr16.h (GO_IF_MODE_DEPENDENT_ADDRESS): Delete.
* config/mep/mep.h (GO_IF_MODE_DEPENDENT_ADDRESS): Delete.
* config/vax/vax-protos.h (vax_mode_dependent_address_p): Delete.
* config/vax/vax.h (GO_IF_MODE_DEPENDENT_ADDRESS): Delete.
* config/vax/vax.c (vax_mode_dependent_address_p): Make static.
Take a const_rtx.
(TARGET_MODE_DEPENDENT_ADDRESS_P): Define.

---
 gcc/config/alpha/alpha.c|   12 
 gcc/config/alpha/alpha.h|7 ---
 gcc/config/cr16/cr16.h  |4 
 gcc/config/mep/mep.h|2 --
 gcc/config/vax/vax-protos.h |1 -
 gcc/config/vax/vax.c|7 +--
 gcc/config/vax/vax.h|5 -
 gcc/defaults.h  |7 ---
 gcc/doc/tm.texi |   18 --
 gcc/doc/tm.texi.in  |   18 --
 gcc/system.h|3 ++-
 gcc/targhooks.c |   12 
 12 files changed, 19 insertions(+), 77 deletions(-)

diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 5617ea3..6d455ef 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -1038,6 +1038,16 @@ alpha_legitimize_address (rtx x, rtx oldx 
ATTRIBUTE_UNUSED,
   return new_x ? new_x : x;
 }
 
+/* Return true if ADDR has an effect that depends on the machine mode it
+   is used for.  On the Alpha this is true only for the unaligned modes.
+   We can simplify the test since we know that the address must be valid.  */
+
+static bool
+alpha_mode_dependent_address_p (const_rtx addr)
+{
+  return GET_CODE (addr) == AND;
+}
+
 /* Primarily this is required for TLS symbols, but given that our move
patterns *ought* to be able to handle any symbol at any time, we
should never be spilling symbolic operands to the constant pool, ever.  */
@@ -9709,6 +9719,8 @@ alpha_conditional_register_usage (void)
 
 #undef TARGET_LEGITIMIZE_ADDRESS
 #define TARGET_LEGITIMIZE_ADDRESS alpha_legitimize_address
+#undef TARGET_MODE_DEPENDENT_ADDRESS_P
+#define TARGET_MODE_DEPENDENT_ADDRESS_P alpha_mode_dependent_address_p
 
 #undef TARGET_ASM_FILE_START
 #define TARGET_ASM_FILE_START alpha_file_start
diff --git a/gcc/config/alpha/alpha.h b/gcc/config/alpha/alpha.h
index 8520ea8..cdb7c49 100644
--- a/gcc/config/alpha/alpha.h
+++ b/gcc/config/alpha/alpha.h
@@ -851,13 +851,6 @@ do {   
 \
 }   \
 } while (0)
 
-/* Go to LABEL if ADDR (a legitimate address expression)
-   has an effect that depends on the machine mode it is used for.
-   On the Alpha this is true only for the unaligned modes.   We can
-   simplify this test since we know that the address must be valid.  */
-
-#define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR,LABEL)  \
-{ if (GET_CODE (ADDR) == AND) goto LABEL; }
 
 /* Specify the machine mode that this machine uses
for the index in the tablejump instruction.  */
diff --git a/gcc/config/cr16/cr16.h b/gcc/config/cr16/cr16.h
index 54794e1..cf5bdf1 100644
--- a/gcc/config/cr16/cr16.h
+++ b/gcc/config/cr16/cr16.h
@@ -460,10 +460,6 @@ struct cumulative_args
 #define REG_OK_FOR_INDEX_P(X)   1
 #endif /* not REG_OK_STRICT.  */
 
-/* Go to LABEL if ADDR (a legitimate address expression) has 
-   an effect that depends on the machine mode it is used for.  */
-#define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL)
-
 /* Assume best case (branch predicted).  */
 #define BRANCH_COST(speed_p, predictable_p)   2
 
diff --git a/gcc/config/mep/mep.h b/gcc/config/mep/mep.h
index ad5b36d..920120c 100644
--- a/gcc/config/mep/mep.h
+++ b/gcc/config/mep/mep.h
@@ -561,8 +561,6 @@ typedef struct
   if (mep_legitimize_reload_address ((X), (MODE), (OPNUM), (TYPE), 
(IND_LEVELS))) \
 goto WIN
 
-#define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL)
-
 #define SELECT_CC_MODE(OP, X, Y)  CCmode
 
 
diff --git a/gcc/config/vax/vax-protos.h b/gcc/config/vax/vax-protos.h
index 3f24794..5363877 100644
--- a/gcc/config/vax/vax-protos.h
+++ b/gcc/config/vax/vax-protos.h
@@ -19,7 +19,6 @@ along with GCC; see the file COPYING3.  If not see

Re: [PATCH] delete last traces of GO_IF_MODE_DEPENDENT_ADDRESS

2012-07-27 Thread Jan-Benedict Glaw
On Fri, 2012-07-27 19:21:40 -0400, Nathan Froyd froy...@mozilla.com wrote:
   * config/vax/vax-protos.h (vax_mode_dependent_address_p): Delete.
   * config/vax/vax.h (GO_IF_MODE_DEPENDENT_ADDRESS): Delete.
   * config/vax/vax.c (vax_mode_dependent_address_p): Make static.
   Take a const_rtx.
   (TARGET_MODE_DEPENDENT_ADDRESS_P): Define.

The VAX parts look good to me, but Matt has to decide on that.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: "really soon now":  an unspecified period of time, likely to
the second  : be greater than any reasonable definition of "soon".

