Re: Also test -O0 for OpenACC C, C++ offloading test cases

2016-03-22 Thread Thomas Schwinge
Hi!

On Tue, 22 Mar 2016 23:52:11 +0100, Bernd Schmidt  wrote:
> On 03/22/2016 11:23 AM, Thomas Schwinge wrote:
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
> > @@ -1,5 +1,6 @@
> > -/* { dg-do run } */
> > -/* { dg-additional-options "-O2" } */
> > +/* Dead code elimination for blocks guarded by acc_on_device () only works with
> > +   optimizations enabled.
> > +   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
> 
> What exactly is going on with these? Do these tests fail with -O0, and 
> is that likely to be a problem in practice?

Want me to re-word that?  :-| I thought it would be obvious from looking
at the test case code; will not be a problem in practice.  It's because
of constructs used in the test cases, like the following, for example:

#pragma acc loop worker
  for (unsigned ix = 0; ix < N; ix++)
    {
      if (__builtin_acc_on_device (5))
        {
          int g = 0, w = 0, v = 0;

          __asm__ volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (g));
          __asm__ volatile ("mov.u32 %0,%%tid.y;" : "=r" (w));
          __asm__ volatile ("mov.u32 %0,%%tid.x;" : "=r" (v));
          ary[ix] = (g << 16) | (w << 8) | v;
        }
      else
        ary[ix] = ix;
    }

Without optimizations, the target (x86_64) assembler will bail out on seeing
the device (nvptx) inline assembly code, even though that code is always dead
because of the acc_on_device () conditional.
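
As a side note on what the device branch stores: the expression
ary[ix] = (g << 16) | (w << 8) | v packs the three IDs into one integer.
A minimal sketch of that packing and its inverse follows. The helper names
are hypothetical (they are not in the test suite), and the sketch assumes
the worker and vector IDs each fit in 8 bits.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the libgomp test suite.  They mirror
   the expression  ary[ix] = (g << 16) | (w << 8) | v;  from the excerpt.  */
static inline uint32_t
pack_ids (uint32_t gang, uint32_t worker, uint32_t vector)
{
  /* Assumption: worker and vector each fit in 8 bits.  */
  return (gang << 16) | (worker << 8) | vector;
}

/* Recover the individual IDs from a packed value.  */
static inline uint32_t
unpack_gang (uint32_t packed)
{
  return packed >> 16;
}

static inline uint32_t
unpack_worker (uint32_t packed)
{
  return (packed >> 8) & 0xff;
}

static inline uint32_t
unpack_vector (uint32_t packed)
{
  return packed & 0xff;
}
```

On the host branch the test stores ix instead, so a checker can tell from
the stored value which branch ran.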

Long ago, I suggested that GCC provide builtin functions
for users to retrieve the number of gangs, workers, and vectors, and the
current thread's IDs for each; not sure why Nathan didn't implement that?
(Should be easy to do -- want me to have a look at that, as a separate
patch?)


> Also, why remove the dg-do run?

Because that's the default anyway.


Grüße
 Thomas


Re: [DOC Patch] Add sample for @cc constraint

2016-03-22 Thread David Wohlferd
Ping?  (link to original post: 
https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00743.html )


This patch adds a sample for a new-to-v6 feature.  Is this not the right 
time for doc improvements?


I considered adding some assembler output.  Something like:


Before this feature, you had to write code like this:

   asm("bt $0, %1 ; setc %0" : "=q" (a) : "r" (value) : "cc");
   if (a)

This would generate code like this:

    bt      $0, %ebx
    setc    %al             <- Convert flags to byte
    testb   %al, %al        <- Convert byte back to flags
    jne     .L5

Using @cc, this code

   asm("bt $0, %1" : "=@ccc" (a) : "r" (value) );
   if (a)

produces this output:

    bt      $0, %ebx
    jc      .L5             <- Use the flags directly


While this helps show the benefit of the feature, it just seemed like 
too much detail.  Showing people the C code and reminding them to enable 
optimizations (what the current patch does) seems like it should be 
sufficient.
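
For anyone who wants to try the two variants side by side, here is a
minimal, self-contained sketch. The function name is made up for
illustration. It assumes an x86 target and a compiler that defines
__GCC_ASM_FLAG_OUTPUTS__ (GCC 6 or later), and falls back to plain C
otherwise.

```c
/* Hypothetical example function: returns 1 if bit 0 of VALUE is set,
   else 0.  */
static int
bit0_is_set (unsigned int value)
{
#if defined (__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined (__i386__) || defined (__x86_64__))
  int a;
  /* "=@ccc" asks the compiler to take the result from the carry flag,
     so neither a setc instruction nor a "cc" clobber is written here.  */
  __asm__ ("bt $0, %1" : "=@ccc" (a) : "r" (value));
  return a;
#else
  /* Fallback for targets or compilers without flag output operands.  */
  return value & 1;
#endif
}
```

Compiling a caller such as "if (bit0_is_set (x)) ..." with optimization
enabled is what lets the compiler branch on the flag directly.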


dw

On 3/12/2016 8:00 PM, David Wohlferd wrote:

The docs for the new(-ish) @cc constraint need an example. Attached.

ChangeLog:

2016-03-12  David Wohlferd  

* doc/extend.texi: Add sample for @cc constraint

Note that while I have a release on file with FSF, I don't have write 
access to SVN.


dw




Re: [PATCH 7/7] ira.c validate_equiv_mem

2016-03-22 Thread Bernd Schmidt

On 03/21/2016 02:43 AM, Alan Modra wrote:


+enum valid_equiv { valid_none, valid_combine, valid_reload };
+


Might be worth documenting that each step represents a superset of the 
previous one.
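
A minimal sketch of what such documentation could make explicit. This
assumes the "superset" reading above, i.e. that an equivalence rated
valid_reload is also acceptable wherever valid_combine is required. The
helper name is hypothetical and is not part of the patch.

```c
/* Each enumerator is assumed to permit everything the previous one
   permits, so the values can be compared with relational operators.  */
enum valid_equiv { valid_none, valid_combine, valid_reload };

/* Hypothetical helper: nonzero if an equivalence rated HAVE can be used
   where at least NEED is required.  Relies on the declaration order of
   the enumerators above.  */
static int
equiv_sufficient_p (enum valid_equiv have, enum valid_equiv need)
{
  return have >= need;
}
```

If the ordering is relied on like this, a comment on the enum saying so
guards against someone reordering the enumerators later.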



+ ret = valid_combine;
+ if (! MEM_READONLY_P (memref)
+ && ! RTL_CONST_OR_PURE_CALL_P (insn))
+   return valid_none;
+   }


The gcc style is actually not to have a space after unary "!". None of 
the code in this file follows that, but I think you may want to change 
that as you modify things in your patches, and have new code follow the 
recommended style.



@@ -3536,7 +3557,8 @@ update_equiv_regs (void)
{
  /* Note that the statement below does not affect the priority
 in local-alloc!  */
- REG_LIVE_LENGTH (regno) *= 2;
+ if (note)
+   REG_LIVE_LENGTH (regno) *= 2;


That's a very suspicious comment. It would be worth testing whether 
REG_LIVE_LENGTH has any effect on our current register allocation at 
all, and remove this code if not.


Otherwise looks good for stage 1.


Bernd


Re: [PATCH] Fix PR c++/70332 (ICE due to aggregate initialization of NSDMI)

2016-03-22 Thread Patrick Palka
On Tue, Mar 22, 2016 at 6:12 PM, Patrick Palka  wrote:
> On Tue, Mar 22, 2016 at 6:00 PM, Jason Merrill  wrote:
>> On 03/22/2016 05:35 PM, Patrick Palka wrote:
>>>
>>> + if (cp_unevaluated_operand == 0
>>
>>
>> Why check this here?
>
> Just so that the change doesn't affect the behavior of tsubst_decl()
> when cp_unevaluated_operand != 0.  Presumably the existing code (10
> lines below) handles that case just fine.

Turns out that without the check we can trigger the cxx_dialect >=
cxx14 assert because in c++11 mode we can reach the assert through
get_defaulted_eh_spec() which increments cp_unevaluated_operand and
then calls get_nsdmi (..., /*in_ctor=*/false) causing
current_class_ref to get set to a PLACEHOLDER_EXPR.

So for example g++.dg/cpp0x/nsdmi-template2.C regresses with an ICE.
So it seems the cp_unevaluated_operand != 0 check is necessary as long
as the assert stays.

There are no regressions if both the cp_unevaluated_operand check and
the assert are removed however.


Re: Also test -O0 for OpenACC C, C++ offloading test cases

2016-03-22 Thread Bernd Schmidt

On 03/22/2016 11:23 AM, Thomas Schwinge wrote:

diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
index 01d1dc8..5806cb3 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
@@ -1,5 +1,6 @@
-/* { dg-do run } */
-/* { dg-additional-options "-O2" } */
+/* Dead code elimination for blocks guarded by acc_on_device () only works with
+   optimizations enabled.
+   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */


What exactly is going on with these? Do these tests fail with -O0, and 
is that likely to be a problem in practice?


Also, why remove the dg-do run?


Bernd



Re: [PING**2] [PATCH, libstdc++] Add missing free-standing headers to install rule

2016-03-22 Thread Jonathan Wakely

On 22/03/16 20:38 +, Bernd Edlinger wrote:

On 22.03.2016 20:10, Jonathan Wakely wrote:

On 22/03/16 18:29 +, Bernd Edlinger wrote:

Yes. Maybe changing concept_check.h would be better, because
I see 3 different instances of bits/c++config.h:

$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/fpu/bits/c++config.h
$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/bits/c++config.h
$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/thumb/bits/c++config.h


But they're all generated from the same include/bits/c++config in the
source tree, so that shouldn't matter.


while I only see one use of _GLIBCXX_CONCEPT_CHECKS:
$prefix/arm-eabi/include/c++/6.0.0/bits/concept_check.h


I'm fine with changing it there. We should also document that the
macro doesn't do anything for freestanding implementations.



Done.  Attached is a new version of my patch with a small
documentation update.  I just used your wording if you don't mind.


Please say "has no effect" rather than "doesn't do anything".


Is it Ok for trunk when boot-strap and regression-testing completed?


OK, thanks.




Thanks
Bernd.



2016-03-22  Bernd Edlinger  

* include/Makefile.am (install-freestanding-headers): Add
concept_check.h and move.h to the installed headers.
* include/Makefile.in: Regenerated.
* include/bits/concept_check.h: Ignore _GLIBCXX_CONCEPT_CHECKS for
freestanding implementations.
* doc/html/manual/using_macros.html (_GLIBCXX_CONCEPT_CHECKS): Mention
that this macro doesn't do anything for freestanding implementations.


The HTML files are generated, so typically the changelog would say
it's regenerated. I assume you edited by hand, but it's still not
necessary to repeat the same thing for both the xml original and
generated html, one of them should be "Likewise".


* doc/xml/manual/using.xml (_GLIBCXX_CONCEPT_CHECKS): Mention
that this macro doesn't do anything for freestanding implementations.


Re: [PATCH 6/7] ira.c use DF infrastructure for combine_and_move_insns

2016-03-22 Thread Alan Modra
On Tue, Mar 22, 2016 at 05:29:08PM +0100, Bernd Schmidt wrote:
> On 03/21/2016 02:42 AM, Alan Modra wrote:
> > * ira.c (combine_and_move_insns): Rather than scanning insns,
> > use DF infrastructure to find use and def insns.
> >
> >-  remove_death (regno, insn);
> 
> This call appears to have gone missing. Is that intentional?

No, well spotted.  Reinstated.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix PR c++/70332 (ICE due to aggregate initialization of NSDMI)

2016-03-22 Thread Patrick Palka
On Tue, Mar 22, 2016 at 6:00 PM, Jason Merrill  wrote:
> On 03/22/2016 05:35 PM, Patrick Palka wrote:
>>
>> + if (cp_unevaluated_operand == 0
>
>
> Why check this here?

Just so that the change doesn't affect the behavior of tsubst_decl()
when cp_unevaluated_operand != 0.  Presumably the existing code (10
lines below) handles that case just fine.


Re: [PATCH] Fix PR c++/70332 (ICE due to aggregate initialization of NSDMI)

2016-03-22 Thread Jason Merrill

On 03/22/2016 05:35 PM, Patrick Palka wrote:

+ if (cp_unevaluated_operand == 0


Why check this here?

Jason



[PATCH] Fix *vector_shift_pattern (PR tree-optimization/70354)

2016-03-22 Thread Jakub Jelinek
Hi!

As the testcase shows, the C/C++ FEs narrow the shift counters from whatever
type they had originally to unsigned int (previously signed int).
Then the vect-patterns code, to be able to use vector by vector shifts,
attempts to narrow or widen them again to the right type.  If there is
already a cast from the right precision, it just uses the rhs1 of that cast,
otherwise it adds a cast to the pattern, which performs the needed widening
or narrowing.

Unfortunately, we have information loss during optimizations, we don't know
anymore if it was say:
long a[64], b[64];
void foo (void)
{
  for (int i = 0; i < 64; i++)
    a[i] <<= b[i]; // Here we don't need any masking, it would be UB
                   // if b isn't in range
}
void bar (void)
{
  for (int i = 0; i < 64; i++)
    a[i] <<= (unsigned) b[i]; // But here we can't just use b[i] as the
                              // shift count, because the upper bits are to be
                              // masked off.
}
void baz (void)
{
  for (int i = 0; i < 64; i++)
    a[i] <<= b[i] - 0x72ULL; // And here the optimizers will likely
                             // optimize away the subtraction, because there
                             // is implicit cast to (unsigned int).  We need
                             // to mask instead of using b[i] directly.
}
But, not casting say long long shift counters to unsigned int would penalize
other code, computing unneeded operations.  So I'm afraid we want the
following fix, which I've bootstrapped/regtested on x86_64-linux and
i686-linux.  For short/char shifts we don't need this of course.
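
To make the masking argument concrete, here is a small scalar sketch (not
part of the patch; the function names are made up). It assumes a 32-bit
unsigned int and shows that AND-ing a 64-bit value with a low-bits mask
gives the same shift count as the front end's narrowing cast, which is
what the patch emits as a BIT_AND_EXPR when it reuses the wider operand.

```c
#include <stdint.h>

/* Shift count as the front end sees it: the wide value converted to a
   32-bit unsigned type, which discards the upper bits.  */
static uint32_t
count_after_cast (uint64_t wide)
{
  return (uint32_t) wide;
}

/* Shift count computed in the wide type instead: keep only the low
   32 bits with an AND, leaving the value in the 64-bit type.  */
static uint64_t
count_after_mask (uint64_t wide)
{
  return wide & UINT64_C (0xffffffff);
}
```

For a value such as 0x100000003 both functions yield 3, whereas using the
unmasked 64-bit value as the shift count would be out of range.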

Ok for trunk?

2016-03-22  Jakub Jelinek  

PR tree-optimization/70354
* tree-vect-patterns.c (vect_recog_vector_vector_shift_pattern): If
oprnd0 is wider than oprnd1 and there is a cast from the wider
type to oprnd1, mask it with the mask of the narrower type.

* gcc.dg/vect/pr70354-1.c: New test.
* gcc.dg/vect/pr70354-2.c: New test.
* gcc.target/i386/avx2-pr70354-1.c: New test.
* gcc.target/i386/avx2-pr70354-2.c: New test.

--- gcc/tree-vect-patterns.c.jj 2016-03-04 15:42:12.0 +0100
+++ gcc/tree-vect-patterns.c 2016-03-22 15:28:24.403579426 +0100
@@ -2097,7 +2097,20 @@ vect_recog_vector_vector_shift_pattern (
   if (TYPE_MODE (TREE_TYPE (rhs1)) == TYPE_MODE (TREE_TYPE (oprnd0))
  && TYPE_PRECISION (TREE_TYPE (rhs1))
 == TYPE_PRECISION (TREE_TYPE (oprnd0)))
-   def = rhs1;
+   {
+ if (TYPE_PRECISION (TREE_TYPE (oprnd1))
+ >= TYPE_PRECISION (TREE_TYPE (rhs1)))
+   def = rhs1;
+ else
+   {
+ tree mask
+   = build_low_bits_mask (TREE_TYPE (rhs1),
+  TYPE_PRECISION (TREE_TYPE (oprnd1)));
+ def = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL);
+ def_stmt = gimple_build_assign (def, BIT_AND_EXPR, rhs1, mask);
+ new_pattern_def_seq (stmt_vinfo, def_stmt);
+   }
+   }
 }
 
   if (def == NULL_TREE)
--- gcc/testsuite/gcc.dg/vect/pr70354-1.c.jj 2016-03-22 15:36:42.210847707 +0100
+++ gcc/testsuite/gcc.dg/vect/pr70354-1.c 2016-03-22 15:47:45.448878909 +0100
@@ -0,0 +1,50 @@
+/* PR tree-optimization/70354 */
+/* { dg-do run } */
+
+#ifndef main
+#include "tree-vect.h"
+#endif
+
+long long int b[64], c[64], g[64];
+unsigned long long int a[64], d[64], e[64], f[64], h[64];
+
+__attribute__ ((noinline, noclone)) void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 64; i++)
+{
+  d[i] = h[i] << (((((unsigned long long int) b[i] * e[i])
+   << (-a[i] - 3752448776177690134ULL))
+  - 8214565720323784703ULL) - 1ULL);
+  e[i] = (_Bool) (f[i] + (unsigned long long int) g[i]);
+  g[i] = c[i];
+}
+}
+
+int
+main ()
+{
+  int i;
+#ifndef main
+  check_vect ();
+#endif
+  if (__CHAR_BIT__ != 8 || sizeof (long long int) != 8)
+return 0;
+  for (i = 0; i < 64; ++i)
+{
+  a[i] = 14694295297531861425ULL;
+  b[i] = -1725558902283030715LL;
+  c[i] = 4402992416302558097LL;
+  e[i] = 6297173129107286501ULL;
+  f[i] = 13865724171235650855ULL;
+  g[i] = 982871027473857427LL;
+  h[i] = 8193845517487445944ULL;
+}
+  foo ();
+  for (i = 0; i < 64; i++)
+if (d[i] != 8193845517487445944ULL || e[i] != 1
+   || g[i] != 4402992416302558097ULL)
+  abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/vect/pr70354-2.c.jj 2016-03-22 15:36:45.527802852 +0100
+++ gcc/testsuite/gcc.dg/vect/pr70354-2.c 2016-03-22 16:07:09.397164461 +0100
@@ -0,0 +1,37 @@
+/* PR tree-optimization/70354 */
+/* { dg-do run } */
+
+#ifndef main
+#include "tree-vect.h"
+#endif
+
+unsigned long long a[64], b[64];
+
+__attribute__((noinline, noclone)) void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 64; i++)
+a[i] <<= (b[i] - 0x12ULL);
+}
+
+int
+main ()
+{
+  int i;
+#ifndef main
+  check_vect ();
+#

[PATCH][PR target/70232] Correctly distinguish between FSM jump threads and old style jump threads

2016-03-22 Thread Jeff Law



Just a dumb oversight here.  An FSM jump thread requires threading 
through the loop latch and eliminating a multi-way branch at the end of 
the jump threading path.  We incorrectly tested for just the former and 
as a result applied the wrong clamp for the number of statements to copy.


As a result we would end up duplicating ~70 statements to eliminate a 
single, simple, branch.


Fixed by correctly distinguishing between FSM jump threads and old style 
jump threads.
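
For readers unfamiliar with the term, here is a minimal sketch, not taken
from the PR, of the loop shape the description refers to: a state variable
that is only ever assigned constants, and a switch on it that is reached
again through the loop back edge. The function is a made-up illustration
and says nothing about what GCC actually does with this particular code.

```c
/* Hypothetical example: counts space-separated words with a small
   state machine.  */
static int
count_words (const char *s)
{
  enum { OUTSIDE, INSIDE } state = OUTSIDE;
  int words = 0;

  for (; *s; s++)
    switch (state)              /* Branch on the state, once per iteration.  */
      {
      case OUTSIDE:
        if (*s != ' ')
          {
            state = INSIDE;     /* The next switch target is known here.  */
            words++;
          }
        break;
      case INSIDE:
        if (*s == ' ')
          state = OUTSIDE;      /* Likewise known on this path.  */
        break;
      }
  return words;
}
```

Threading such a path means copying the statements along it, which is why
the clamp on the number of statements to copy matters.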


Bootstrapped and regression tested on x86_64-linux-gnu and verified (by 
hand) the test passed on arm-linux-gnueabi.


Installed on the trunk.

Jeff
commit 6a4ed260a8e40fa6a05e373db7791f913a8e0da3
Author: law 
Date:   Tue Mar 22 21:32:34 2016 +

PR target/70232
tree-ssa-threadbackward.c
(fsm_find_control_statement_thread_paths): Correctly distinguish
between old style jump threads vs FSM jump threads.

PR target/70232
* gcc.dg/tree-ssa/pr70232.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234409 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6848496..a7f7933 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2016-03-22  Jeff Law  
+
+   PR target/70232
+   tree-ssa-threadbackward.c
+   (fsm_find_control_statement_thread_paths): Correctly distinguish
+   between old style jump threads vs FSM jump threads.
+
 2016-03-22  Ilya Enkovich  
 
PR target/70302
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 43217b8..eda8a9a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2016-03-22  Jeff Law  
+
+   PR target/70232
+   * gcc.dg/tree-ssa/pr70232.c: New test.
+
 2016-03-22  Ilya Enkovich  
 
PR target/70302
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c b/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c
new file mode 100644
index 000..6cc987a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c
@@ -0,0 +1,129 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -w -fdump-tree-vrp1-details -fdump-tree-vrp2-details -fdump-tree-dom2-details -fdump-tree-dom3-details" } */
+
+/* All the threads found by the FSM threader should have too
+   many statements to be profitable.  */
+/* { dg-final { scan-tree-dump-not "Registering FSM " "dom2"} } */
+/* { dg-final { scan-tree-dump-not "Registering FSM " "dom3"} } */
+/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp1"} } */
+/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp2"} } */
+
+typedef _Bool bool;
+typedef unsigned char uint8_t;
+typedef unsigned long uint32_t;
+typedef unsigned long long uint64_t;
+typedef unsigned int size_t;
+
+enum {
+ false = 0,
+ true = 1
+};
+
+struct list_head {
+ struct list_head *next, *prev;
+};
+
+
+extern void * memcpy(void *, const void *, size_t);
+extern int memcmp(const void *,const void *,size_t);
+extern void * memset(void *, int, size_t);
+extern void __memzero(void *ptr, size_t n);
+
+static inline uint64_t wwn_to_uint64_t(uint8_t *wwn)
+{
+ return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
+ (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
+ (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
+ (uint64_t)wwn[6] << 8 | (uint64_t)wwn[7];
+}
+
+struct lpfc_name {
+ union {
+  uint8_t wwn[8];
+ } u;
+};
+
+struct lpfc_hba {
+ uint32_t cfg_fof;
+ uint32_t cfg_oas_flags;
+ struct list_head luns;
+};
+
+struct lpfc_device_id {
+ struct lpfc_name vport_wwpn;
+ struct lpfc_name target_wwpn;
+ uint64_t lun;
+};
+
+struct lpfc_device_data {
+ struct list_head listentry;
+ struct lpfc_device_id device_id;
+ bool oas_enabled;
+ bool available;
+};
+
+bool
+lpfc_find_next_oas_lun(struct lpfc_hba *phba, struct lpfc_name *vport_wwpn,
+ struct lpfc_name *target_wwpn, uint64_t *starting_lun,
+ struct lpfc_name *found_vport_wwpn,
+ struct lpfc_name *found_target_wwpn,
+ uint64_t *found_lun,
+ uint32_t *found_lun_status)
+{
+
+ struct lpfc_device_data *lun_info;
+ struct lpfc_device_id *device_id;
+ uint64_t lun;
+ bool found = false;
+
+ if (__builtin_expect(!!(!phba), 0) || !vport_wwpn || !target_wwpn ||
+ !starting_lun || !found_vport_wwpn ||
+ !found_target_wwpn || !found_lun || !found_lun_status ||
+ (*starting_lun == -1u) ||
+ !phba->cfg_fof)
+  return false;
+
+ lun = *starting_lun;
+ *found_lun = -1;
+ *starting_lun = -1;
+
+
+
+ for (lun_info = ({ const typeof( ((typeof(*lun_info) *)0)->listentry ) 
*__mptr = ((&phba->luns)->next); (typeof(*lun_info) *)( (char *)__mptr - 
__builtin_offsetof(typeof(*lun_info), listentry) );}); &lun_info->listentry != 
(&phba->luns); lun_info = ({ const typeof( ((typeof(*(lun_info)) 
*)0)->listentry ) *__mptr = ((lun_info)->listentry.next); (typeof(*(lun_info)) 
*)( (char *)__mptr - __builtin_offsetof(typeof(*(lun_info)), listentry) );})) {
+  if (((wwn_to_uint64_t(vport_wwpn->u.wwn) == 0) ||
+   (memcmp(&lun_in

[PATCH] Slightly improve TARGET_STV splitters (PR target/70321)

2016-03-22 Thread Jakub Jelinek
Hi!

As the PR mentions, DImode AND/IOR/XOR patterns often result in overly ugly
code, a regression from before the patterns were there (before STV was
added).  This patch attempts to improve it a little bit by improving the
splitter for these: rather than always generating two SImode AND/IOR/XOR
instructions, if the last operand's subword is either 0 or -1, optimize
the corresponding instruction in the pair to nothing, or to clearing, or
negation.  More improvement can IMHO only be achieved by moving the STV
pass before the combiner and splitting the patterns we don't adjust into
vector patterns into corresponding SImode patterns, so that the combiner can
handle them, but that sounds like stage1 material.
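
As a scalar illustration of the AND case (a sketch only; the function
names are made up and this is ordinary C, not the splitter itself): each
32-bit half of the mask is inspected, a half of 0 becomes a clear, a half
of -1 becomes nothing at all, and only the general case needs a real AND.

```c
#include <stdint.h>

/* Hypothetical model of one half of the split 64-bit AND.  */
static uint32_t
and_half (uint32_t x, uint32_t mask_half)
{
  if (mask_half == 0)
    return 0;                   /* Clearing: no AND needed.  */
  if (mask_half == UINT32_MAX)
    return x;                   /* No-op: nothing needs to be emitted.  */
  return x & mask_half;         /* General case: one 32-bit AND.  */
}

/* Hypothetical model of the whole operation: split into low and high
   halves, handle each independently, and recombine.  */
static uint64_t
and64_split (uint64_t x, uint64_t mask)
{
  uint32_t lo = and_half ((uint32_t) x, (uint32_t) mask);
  uint32_t hi = and_half ((uint32_t) (x >> 32), (uint32_t) (mask >> 32));
  return ((uint64_t) hi << 32) | lo;
}
```

The IOR/XOR case in the patch follows the same idea with the roles of 0
and -1 swapped, and with a NOT for XOR with -1.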

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-22  Jakub Jelinek  

PR target/70321
* config/i386/i386.md (*anddi3_doubleword, *<code>di3_doubleword):
Optimize TARGET_STV splitters, if high or low word of last argument
is 0 or -1.

--- gcc/config/i386/i386.md.jj  2016-03-22 09:13:54.0 +0100
+++ gcc/config/i386/i386.md 2016-03-22 18:45:16.392316554 +0100
@@ -8141,16 +8141,31 @@
 (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
 (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm")))
(clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2 && ix86_binary_operator_ok (AND, DImode, operands)"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && ix86_binary_operator_ok (AND, DImode, operands)"
   "#"
   "&& reload_completed"
-  [(parallel [(set (match_dup 0)
-  (and:SI (match_dup 1) (match_dup 2)))
- (clobber (reg:CC FLAGS_REG))])
-   (parallel [(set (match_dup 3)
-  (and:SI (match_dup 4) (match_dup 5)))
- (clobber (reg:CC FLAGS_REG))])]
-  "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
+  [(const_int 0)]
+{
+  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
+  if (operands[2] == const0_rtx)
+{
+  operands[1] = const0_rtx;
+  ix86_expand_move (SImode, &operands[0]);
+}
+  else if (operands[2] != constm1_rtx)
+emit_insn (gen_andsi3 (operands[0], operands[1], operands[2]));
+  else if (operands[5] == constm1_rtx)
+emit_note (NOTE_INSN_DELETED);
+  if (operands[5] == const0_rtx)
+{
+  operands[4] = const0_rtx;
+  ix86_expand_move (SImode, &operands[3]);
+}
+  else if (operands[5] != constm1_rtx)
+emit_insn (gen_andsi3 (operands[3], operands[4], operands[5]));
+  DONE;
+})
 
 (define_insn "*andsi_1"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r,Ya,!k")
@@ -8665,16 +8680,41 @@
 (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
 (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm")))
(clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2 && ix86_binary_operator_ok (<CODE>, DImode, operands)"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && ix86_binary_operator_ok (<CODE>, DImode, operands)"
   "#"
   "&& reload_completed"
-  [(parallel [(set (match_dup 0)
-  (any_or:SI (match_dup 1) (match_dup 2)))
- (clobber (reg:CC FLAGS_REG))])
-   (parallel [(set (match_dup 3)
-  (any_or:SI (match_dup 4) (match_dup 5)))
- (clobber (reg:CC FLAGS_REG))])]
-  "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
+  [(const_int 0)]
+{
+  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
+  if (operands[2] == constm1_rtx)
+{
+  if (<CODE> == IOR)
+   {
+ operands[1] = constm1_rtx;
+ ix86_expand_move (SImode, &operands[0]);
+   }
+  else
+   ix86_expand_unary_operator (NOT, SImode, &operands[0]);
+}
+  else if (operands[2] != const0_rtx)
+ix86_expand_binary_operator (<CODE>, SImode, &operands[0]);
+  else if (operands[5] == const0_rtx)
+emit_note (NOTE_INSN_DELETED);
+  if (operands[5] == constm1_rtx)
+{
+  if (<CODE> == IOR)
+   {
+ operands[4] = constm1_rtx;
+ ix86_expand_move (SImode, &operands[3]);
+   }
+  else
+   ix86_expand_unary_operator (NOT, SImode, &operands[3]);
+}
+  else if (operands[5] != const0_rtx)
+ix86_expand_binary_operator (<CODE>, SImode, &operands[3]);
+  DONE;
+})
 
 (define_insn_and_split "*andndi3_doubleword"
   [(set (match_operand:DI 0 "register_operand" "=r,r")

Jakub


[PATCH] Fix PR c++/70332 (ICE due to aggregate initialization of NSDMI)

2016-03-22 Thread Patrick Palka
With c++14 an NSDMI no longer makes a class type non-aggregate so it's
possible to perform aggregate initialization on a class that has an
NSDMI, but tsubst_copy() currently ICEs on a use of 'this' in such
a situation.

This patch makes tsubst_copy() handle a use of 'this' in an NSDMI as
part of an aggregate initialization.  In that case current_class_ref
will be a PLACEHOLDER_EXPR (as set by get_nsdmi()) and this
PLACEHOLDER_EXPR will later get resolved to the true object by
replace_placeholders().

Does this patch look OK to commit after testing?

gcc/cp/ChangeLog:

PR c++/70332
* pt.c (tsubst_copy) [PARM_DECL]: Handle the use of 'this' in an
NSDMI that's part of an aggregate initialization.

gcc/testsuite/ChangeLog:

PR c++/70332
* g++.dg/cpp1y/nsdmi-aggr5.C: New test.
---
 gcc/cp/pt.c  | 20 
 gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr5.C | 24 
 2 files changed, 40 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ebfc45b..49ef9d3 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13878,10 +13878,22 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
   if (r == NULL_TREE)
{
  /* We get here for a use of 'this' in an NSDMI.  */
- if (DECL_NAME (t) == this_identifier
- && current_function_decl
- && DECL_CONSTRUCTOR_P (current_function_decl))
-   return current_class_ptr;
+ if (DECL_NAME (t) == this_identifier)
+   {
+ /* We're processing an NSDMI as part of a constructor call.  */
+ if (current_function_decl
+ && DECL_CONSTRUCTOR_P (current_function_decl))
+   return current_class_ptr;
+
+ /* Or as part of an aggregate initialization.  */
+ if (cp_unevaluated_operand == 0
+ && current_class_ref
+ && TREE_CODE (current_class_ref) == PLACEHOLDER_EXPR)
+   {
+ gcc_assert (cxx_dialect >= cxx14);
+ return current_class_ptr;
+   }
+   }
 
  /* This can happen for a parameter name used later in a function
 declaration (such as in a late-specified return type).  Just
diff --git a/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr5.C b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr5.C
new file mode 100644
index 000..fe377c3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr5.C
@@ -0,0 +1,24 @@
+// PR c++/70332
+// { dg-do run { target c++14 } }
+
+template 
+struct C
+{
+ T m;
+ T *n = &m;
+};
+
+C c { };
+
+int
+main ()
+{
+  *c.n = 5;
+  if (c.m != 5)
+__builtin_abort ();
+
+  C d { 10 };
+  *d.n = *d.n + 1;
+  if (d.m != 11)
+__builtin_abort ();
+}
-- 
2.8.0.rc3.27.gade0865



[C++ PATCH] Diagnose constexpr overflow (PR c++/70323)

2016-03-22 Thread Jakub Jelinek
Hi!

On the following testcase, the first function is cp_folded into
return i == 0 ? 2147483648(OVF): 2147483647;
The problem is that we then don't diagnose the overflow at all.
We already have code that sets *overflow_p under the right conditions,
there just wasn't any permerror call.
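
For reference, the arithmetic in question can be checked at run time with
a small C sketch. This is only an illustration of which calls overflow,
not how the C++ front end detects it. It assumes a compiler that provides
__builtin_add_overflow (GCC 5 or later), and the function names simply
mirror those in the new testcase.

```c
#include <limits.h>

/* Nonzero if INT_MAX + !i overflows int, which happens only for i == 0.  */
static int
overflow_if_0 (int i)
{
  int sum;
  return __builtin_add_overflow (INT_MAX, !i, &sum);
}

/* Nonzero if INT_MAX + i overflows int, which for i in {0, 1} happens
   only for i == 1.  */
static int
overflow_if_1 (int i)
{
  int sum;
  return __builtin_add_overflow (INT_MAX, i, &sum);
}
```

These are the two cases the testcase expects to be diagnosed:
overflow_if_0 (0) and overflow_if_1 (1).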

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-03-22  Jakub Jelinek  

PR c++/70323
* constexpr.c (cxx_eval_constant_expression): Diagnose overflow
on TREE_OVERFLOW constants.

* g++.dg/cpp0x/constexpr-70323.C: New test.

--- gcc/cp/constexpr.c.jj   2016-03-22 09:05:32.0 +0100
+++ gcc/cp/constexpr.c  2016-03-22 10:38:58.598077573 +0100
@@ -3306,8 +3306,13 @@ cxx_eval_constant_expression (const cons
 }
   if (CONSTANT_CLASS_P (t))
 {
-  if (TREE_OVERFLOW (t) && (!flag_permissive || ctx->quiet))
-   *overflow_p = true;
+  if (TREE_OVERFLOW (t))
+   {
+ if (!ctx->quiet)
+   permerror (input_location, "overflow in constant expression");
+ if (!flag_permissive || ctx->quiet)
+   *overflow_p = true;
+   }
   return t;
 }
 
--- gcc/testsuite/g++.dg/cpp0x/constexpr-70323.C.jj 2016-03-22 10:42:54.093884158 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-70323.C 2016-03-22 10:42:29.0 +0100
@@ -0,0 +1,10 @@
+// PR c++/70323
+// { dg-do compile { target c++11 } }
+
+constexpr int overflow_if_0 (int i) { return __INT_MAX__ + !i; }
+constexpr int overflow_if_1 (int i) { return __INT_MAX__ + i; }
+
+constexpr bool i0_0 = overflow_if_0 (0);   // { dg-error "overflow in constant expression" }
+constexpr bool i0_1 = overflow_if_0 (1);
+constexpr bool i1_0 = overflow_if_1 (0);
+constexpr bool i1_1 = overflow_if_1 (1);   // { dg-error "overflow in constant expression" }

Jakub


Re: [PR69315] enable finish_function to recurse for constexpr functions

2016-03-22 Thread Jason Merrill

On 01/26/2016 12:11 PM, Alexandre Oliva wrote:

We don't want finish_function to be called recursively from mark_used.
However, it's desirable and necessary to call itself recursively when
performing delayed folding, because that may have to instantiate and
evaluate constexpr template functions.


Hmm.  If recursion is problematic, we shouldn't do it.  If it isn't 
problematic, we can do away with defer_mark_used_calls.  It doesn't make 
sense to me to allow recursion some of the time.


I tried disabling defer_mark_used_calls, and the only new testsuite 
failure turned out to be fixing a bug on variadic122.C: we were 
deferring a mark_used call in unevaluated context and then calling it in 
evaluated context.


Jakub, you added defer_mark_used_calls for BZ 37189, do you think it's 
still needed?  The testcase passes without it now.


Jason



Re: [PATCH] PR libgcc/70363, fix __float128 problem with non ISA-3.0 assembler

2016-03-22 Thread David Edelsohn
On Tue, Mar 22, 2016 at 4:33 PM, Michael Meissner
 wrote:
> This patch fixes PR libgcc/70363, which is a configuration issue if you build
> GCC 6.x with an assembler that does not support the ISA 3.0 instructions.  I
> missed one emulation function that needed to be a different name if the IFUNC
> functions added for ISA 3.0 support are not being built.
>
> I built a trunk compiler with a stock assembler, and compiled a program with a
> conversion from __float128 to long double/__ibm128.  Without the
> patch, the linker reported:
>
> -genoa-> ~/fsf-install-ppc64le/trunk-at9x/bin/gcc -O2 
> test-float128-6.c -DDEBUG && a.out
> /tmp/ccbCLWdO.o: In function `print_hex.constprop.0':
> test-float128-6.c:(.text+0x84): undefined reference to `__extendkftf2'
> collect2: error: ld returned 1 exit status
>
> If I build a compiler with the patch, it succeeds.  Is this patch ok to install
> in the trunk?
>
> 2016-03-22  Michael Meissner  
>
> PR libgcc/70363
> * config/rs6000/extendkftf2-sw.c (__extendkftf2_sw): If libgcc was
> built with an assembler that does not support ISA 3.0
> instructions, rename __extendkftf2_sw to __extendkftf2.

Okay.

Thanks, David


Re: [PING**2] [PATCH, libstdc++] Add missing free-standing headers to install rule

2016-03-22 Thread Bernd Edlinger
On 22.03.2016 20:10, Jonathan Wakely wrote:
> On 22/03/16 18:29 +, Bernd Edlinger wrote:
>> Yes. Maybe changing concept_check.h would be better, because
>> I see 3 different instances of bits/c++config.h:
>>
>> $prefix/arm-eabi/include/c++/6.0.0/arm-eabi/fpu/bits/c++config.h
>> $prefix/arm-eabi/include/c++/6.0.0/arm-eabi/bits/c++config.h
>> $prefix/arm-eabi/include/c++/6.0.0/arm-eabi/thumb/bits/c++config.h
>
> But they're all generated from the same include/bits/c++config in the
> source tree, so that shouldn't matter.
>
>> while I only see one use of _GLIBCXX_CONCEPT_CHECKS:
>> $prefix/arm-eabi/include/c++/6.0.0/bits/concept_check.h
>
> I'm fine with changing it there. We should also document that the
> macro doesn't do anything for freestanding implementations.
>

Done.  Attached is a new version of my patch with a small
documentation update.  I just used your wording if you don't mind.

Is it Ok for trunk when boot-strap and regression-testing completed?


Thanks
Bernd.
2016-03-22  Bernd Edlinger  

	* include/Makefile.am (install-freestanding-headers): Add
	concept_check.h and move.h to the installed headers.
	* include/Makefile.in: Regenerated.
	* include/bits/concept_check.h: Ignore _GLIBCXX_CONCEPT_CHECKS for
	freestanding implementations.
	* doc/html/manual/using_macros.html (_GLIBCXX_CONCEPT_CHECKS): Mention
	that this macro doesn't do anything for freestanding implementations.
	* doc/xml/manual/using.xml (_GLIBCXX_CONCEPT_CHECKS): Mention
	that this macro doesn't do anything for freestanding implementations.

Index: libstdc++-v3/doc/html/manual/using_macros.html
===
--- libstdc++-v3/doc/html/manual/using_macros.html	(revision 234407)
+++ libstdc++-v3/doc/html/manual/using_macros.html	(working copy)
@@ -66,7 +66,8 @@
 	--enable-concept-checks.  When defined, performs
 	compile-time checking on certain template instantiations to
 	detect violations of the requirements of the standard.  This
-	is described in more detail in
+	macro doesn't do anything for freestanding implementaions.
+	This is described in more detail in
 	Compile Time Checks.
   _GLIBCXX_ASSERTIONS
 	Undefined by default. When defined, enables extra error checking in
@@ -91,4 +92,4 @@
 	mode.
   __STDCPP_WANT_MATH_SPEC_FUNCS__Undefined by default. When defined to a non-zero integer constant,
 	enables support for ISO/IEC 29124 Special Math Functions.
-  Prev Up NextHeaders Home Dual ABI
\ No newline at end of file
+  Prev Up NextHeaders Home Dual ABI
Index: libstdc++-v3/doc/xml/manual/using.xml
===
--- libstdc++-v3/doc/xml/manual/using.xml	(revision 234407)
+++ libstdc++-v3/doc/xml/manual/using.xml	(working copy)
@@ -908,7 +908,8 @@
 	--enable-concept-checks.  When defined, performs
 	compile-time checking on certain template instantiations to
 	detect violations of the requirements of the standard.  This
-	is described in more detail in
+	macro doesn't do anything for freestanding implementations.
+	This is described in more detail in
 	Compile Time Checks.
   
 
Index: libstdc++-v3/include/Makefile.am
===
--- libstdc++-v3/include/Makefile.am	(revision 234407)
+++ libstdc++-v3/include/Makefile.am	(working copy)
@@ -1331,7 +1331,7 @@
 # libsupc++, so only the others and the sub-includes are copied here.
 install-freestanding-headers:
 	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/bits
-	for file in c++0x_warning.h atomic_base.h; do \
+	for file in c++0x_warning.h atomic_base.h concept_check.h move.h; do \
 	  $(INSTALL_DATA) ${glibcxx_srcdir}/include/bits/$${file} $(DESTDIR)${gxx_include_dir}/bits; done
 	$(mkinstalldirs) $(DESTDIR)${host_installdir}
 	for file in ${host_srcdir}/os_defines.h ${host_builddir}/c++config.h \
Index: libstdc++-v3/include/Makefile.in
===
--- libstdc++-v3/include/Makefile.in	(revision 234407)
+++ libstdc++-v3/include/Makefile.in	(working copy)
@@ -1753,7 +1753,7 @@
 # libsupc++, so only the others and the sub-includes are copied here.
 install-freestanding-headers:
 	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/bits
-	for file in c++0x_warning.h atomic_base.h; do \
+	for file in c++0x_warning.h atomic_base.h concept_check.h move.h; do \
 	  $(INSTALL_DATA) ${glibcxx_srcdir}/include/bits/$${file} $(DESTDIR)${gxx_include_dir}/bits; done
 	$(mkinstalldirs) $(DESTDIR)${host_installdir}
 	for file in ${host_srcdir}/os_defines.h ${host_builddir}/c++config.h \
Index: libstdc++-v3/include/bits/concept_check.h
===
--- libstdc++-v3/include/bits/concept_check.h	(revision 234407)
+++ libstdc++-v3/include/bits/concept_check.h	(working copy)
@@ -41,8 +41,9 @@
 
 // Concept-checking code is off by default unless users turn it on via
 // configure options or editing c++con

[PATCH] PR libgcc/70363, fix __float128 problem with non ISA-3.0 assembler

2016-03-22 Thread Michael Meissner
This patch fixes PR libgcc/70363, a configuration issue that shows up if you
build GCC 6.x with an assembler that does not support the ISA 3.0
instructions.  I missed one emulation function that needs a different name
when the IFUNC functions added for ISA 3.0 support are not being built.

I built a trunk compiler with a stock assembler and compiled a program that
converts from __float128 to long double/__ibm128.  Without the patch, the
linker reported:

-genoa-> ~/fsf-install-ppc64le/trunk-at9x/bin/gcc -O2 test-float128-6.c -DDEBUG && a.out
/tmp/ccbCLWdO.o: In function `print_hex.constprop.0':
test-float128-6.c:(.text+0x84): undefined reference to `__extendkftf2'
collect2: error: ld returned 1 exit status

With a compiler built with the patch, the link succeeds.  Is this patch OK to
install in the trunk?
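
For readers unfamiliar with the trick, here is a minimal sketch of the
renaming in plain C.  It is not the libgcc source: DEMO_HW_INSNS, demo_extend
and demo_extend_sw are hypothetical stand-ins for FLOAT128_HW_INSNS,
__extendkftf2 and __extendkftf2_sw, and the body is a trivial float-to-double
conversion rather than a real soft-float routine.

```c
/* DEMO_HW_INSNS is a hypothetical stand-in for FLOAT128_HW_INSNS.
   When it is not defined, no hardware variant and no dispatcher are
   built, so the software routine takes over the public name.  */
#ifndef DEMO_HW_INSNS
#define demo_extend_sw demo_extend
#endif

/* Because of the #define above, this definition is emitted under the
   symbol name demo_extend when DEMO_HW_INSNS is undefined, and under
   demo_extend_sw otherwise.  */
double
demo_extend_sw (float value)
{
  return (double) value;
}
```

With DEMO_HW_INSNS undefined, a caller that references demo_extend links
against the software routine directly, which is the situation the undefined
reference above was missing.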

2016-03-22  Michael Meissner  

PR libgcc/70363
* config/rs6000/extendkftf2-sw.c (__extendkftf2_sw): If libgcc was
built with an assembler that does not support ISA 3.0
instructions, rename __extendkftf2_sw to __extendkftf2.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: libgcc/config/rs6000/extendkftf2-sw.c
===
--- libgcc/config/rs6000/extendkftf2-sw.c   (revision 234405)
+++ libgcc/config/rs6000/extendkftf2-sw.c   (working copy)
@@ -39,6 +39,10 @@
 #include "soft-fp.h"
 #include "quad-float128.h"
 
+#ifndef FLOAT128_HW_INSNS
+#define __extendkftf2_sw __extendkftf2
+#endif
+
 IBM128_TYPE
 __extendkftf2_sw (__float128 value)
 {


Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-22 Thread Martin Sebor

On 03/22/2016 12:52 PM, Jason Merrill wrote:

On 03/21/2016 06:09 PM, Jeff Law wrote:

On 03/21/2016 11:54 AM, Jason Merrill wrote:

Both b0 and b1 are invalid and should be diagnosed, but only b0
is.  b1 isn't because by the time we see its initializer
in constexpr.c it's been transformed into the equivalent of "b1
= (int*)ps" (though we don't see the cast which would also make
it invalid).

But if we can avoid these early simplifying transformations and
retain a more faithful representation of the original source then
doing the checking later will likely be simpler and result in
detecting more problems with greater consistency and less effort.

Do we know where the folding is happening for this case and is it
something we can reasonably defer?  I.e., is this just a case we missed
as part of the deferred folding work and hence should have its own
distinct BZ to track?


Yes, why is it already folded?



Let's pull that out into a separate BZ and tackle it for gcc-7.


I need to understand the issue before I agree to defer it.

It turns out that the problem is with how cp_build_binary_op calls
cp_pointer_int_sum and thus the c-common pointer_int_sum, which folds.

The POINTER_PLUS_EXPRs thus created have been a source of many issues
with constexpr evaluation, since it's impossible to reconstruct the
original expression, especially because POINTER_PLUS_EXPR uses an
unsigned second operand.  Deferring lowering to POINTER_PLUS_EXPR would
help a lot.  But it would indeed be a significant risk at this point.

I think let's defer the fix for c++/60760 (i.e. the nullptr_p bits)
until stage 1, when it can be combined with the POINTER_PLUS_EXPR fix,
and put the rest of this patch in now.


I can split up the patch into two and post the subset without
the fix for c++/60760, though I don't expect to be done with
it until after I get back (next week).

I'd like to understand your concern with the fix for c++/60760.
Is it that it's incomplete (doesn't reject taking the address
of the first member of a struct, as in &null->first_member),
or are you worried that the changes may not be stable enough?

Martin


Re: [PING**2] [PATCH, libstdc++] Add missing free-standing headers to install rule

2016-03-22 Thread Jonathan Wakely

On 22/03/16 18:29 +, Bernd Edlinger wrote:

Yes. Maybe changing concept_check.h would be better, because
I see 3 different instances of bits/c++config.h:

$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/fpu/bits/c++config.h
$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/bits/c++config.h
$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/thumb/bits/c++config.h


But they're all generated from the same include/bits/c++config in the
source tree, so that shouldn't matter.


while I only see one use of _GLIBCXX_CONCEPT_CHECKS:
$prefix/arm-eabi/include/c++/6.0.0/bits/concept_check.h


I'm fine with changing it there. We should also document that the
macro doesn't do anything for freestanding implementations.



Re: [PATCH, PR target/70302] STV: support unitialized register used in converted instructions

2016-03-22 Thread Jeff Law

On 03/22/2016 12:20 PM, Uros Bizjak wrote:

Hello!


2016-03-22  Ilya Enkovich  

PR target/70302
* config/i386/i386.c (scalar_chain::convert_op): Support
uninitialized register usage case.

gcc/testsuite/

2016-03-22  Ilya Enkovich  

PR target/70302
* gcc.target/i386/pr70302.c: New test.


OK.

I'm going to go ahead and commit this for Ilya.

Thanks!
Jeff


Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-22 Thread Jason Merrill

On 03/21/2016 06:09 PM, Jeff Law wrote:

On 03/21/2016 11:54 AM, Jason Merrill wrote:

Both b0 and b1 are invalid and should be diagnosed, but only b0
is.  b1 isn't because by the time we see its initializer
in constexpr.c it's been transformed into the equivalent of "b1
= (int*)ps" (though we don't see the cast which would also make
it invalid).

But if we can avoid these early simplifying transformations and
retain a more faithful representation of the original source then
doing the checking later will likely be simpler and result in
detecting more problems with greater consistency and less effort.

Do we know where the folding is happening for this case and is it
something we can reasonably defer?  I.e., is this just a case we missed
as part of the deferred folding work and hence should have its own
distinct BZ to track?


Yes, why is it already folded?



Let's pull that out into a separate BZ and tackle it for gcc-7.


I need to understand the issue before I agree to defer it.

It turns out that the problem is with how cp_build_binary_op calls 
cp_pointer_int_sum and thus the c-common pointer_int_sum, which folds.


The POINTER_PLUS_EXPRs thus created have been a source of many issues 
with constexpr evaluation, since it's impossible to reconstruct the 
original expression, especially because POINTER_PLUS_EXPR uses an 
unsigned second operand.  Deferring lowering to POINTER_PLUS_EXPR would 
help a lot.  But it would indeed be a significant risk at this point.
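
To illustrate the point about the unsigned second operand, here is a minimal
sketch in plain C.  It does not use GCC internals: demo_lowered_offset is a
hypothetical helper that only mimics how an index becomes an unsigned byte
offset, with size_t standing in for the unsigned offset type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of lowering "p + index" to "p + byte_offset",
   where the byte offset is carried in an unsigned type.  */
size_t
demo_lowered_offset (ptrdiff_t index, size_t elem_size)
{
  /* A negative index wraps around to a very large unsigned value.  */
  return (size_t) index * elem_size;
}
```

Under these assumptions an index of -1 on a 4-byte element becomes
SIZE_MAX - 3, which cannot be told apart from a huge forward offset once the
original signed expression is gone.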


I think let's defer the fix for c++/60760 (i.e. the nullptr_p bits) 
until stage 1, when it can be combined with the POINTER_PLUS_EXPR fix, 
and put the rest of this patch in now.


Jason


Re: [PING**2] [PATCH, libstdc++] Add missing free-standing headers to install rule

2016-03-22 Thread Bernd Edlinger
On 22.03.2016 15:36, Jonathan Wakely wrote:
> On 22/03/16 07:10 +, Bernd Edlinger wrote:
>> Hi,
>>
>> I am pinging for this patch, which addresses an admittedly minor
>> regression for free-standing libstdc++ due to changed c++11 default
>> settings.  The proposed patch only changes the free-standing install
>> rule, and therefore has no impact on other configurations.
>>
>> https://gcc.gnu.org/ml/libstdc++/2016-03/msg4.html
>
> Sorry for the delay, I'm testing the patch today.
>
> Looks like the patch doesn't add  to the
> freestanding
> headers, which means using -D_GLIBCXX_CONCEPT_CHECKS will give a fatal
> error.
>
> It also means --disable-libstdcxx-hosted --enable-concept-checks
> creates an unusable configuration (although it's possible that
> --enable-concept-checks is already broken due to the -std=gnu++14
> default).
>
> I think it's fine for the concept checking to be unsupported for
> freestanding installations, but we should degrade gracefully, via
> something like:
>
> --- a/libstdc++-v3/include/bits/concept_check.h
> +++ b/libstdc++-v3/include/bits/concept_check.h
> @@ -42,7 +42,7 @@
> // Concept-checking code is off by default unless users turn it on via
> // configure options or editing c++config.h.
>
> -#ifndef _GLIBCXX_CONCEPT_CHECKS
> +#if !defined(_GLIBCXX_CONCEPT_CHECKS) || !defined(_GLIBCXX_HOSTED)
>
> #define __glibcxx_function_requires(...)
> #define __glibcxx_class_requires(_a,_b)
>
>
> Or in c++config.h doing:
>
> #ifndef _GLIBCXX_HOSTED
> # undef _GLIBCXX_CONCEPT_CHECKS
> #endif
>
> That seems better than just giving an error.

Yes. Maybe changing concept_check.h would be better, because
I see 3 different instances of bits/c++config.h:

$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/fpu/bits/c++config.h
$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/bits/c++config.h
$prefix/arm-eabi/include/c++/6.0.0/arm-eabi/thumb/bits/c++config.h

while I only see one use of _GLIBCXX_CONCEPT_CHECKS:
$prefix/arm-eabi/include/c++/6.0.0/bits/concept_check.h


Thanks
Bernd.


Re: [PATCH, PR target/70302] STV: support unitialized register used in converted instructions

2016-03-22 Thread Uros Bizjak
Hello!

> 2016-03-22  Ilya Enkovich  
>
> PR target/70302
> * config/i386/i386.c (scalar_chain::convert_op): Support
> uninitialized register usage case.
>
> gcc/testsuite/
>
> 2016-03-22  Ilya Enkovich  
>
> PR target/70302
> * gcc.target/i386/pr70302.c: New test.

OK.

Thanks,
Uros.


[PATCH] Fix 69845

2016-03-22 Thread Richard Henderson
In PR68142 you added a check for overflow + __INT_MIN__.
I can't figure out why the check is restricted to __INT_MIN__,
except that it seems specific to the test case you examined.

And indeed, this test case shows how things go wrong
with other distributed folding leading to overflow.

I added two tests, one signed, one unsigned.  The second
verifies that we do still fold for the defined-overflow case.
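
As a rough illustration of why the unsigned case may still be folded, here is
a minimal sketch in plain C, assuming a 32-bit unsigned int.  The function
names are hypothetical; the constants are taken from the unsigned test case.

```c
#include <stdint.h>

/* The expression from the unsigned test, evaluated as written.  */
uint32_t
demo_as_written (uint32_t b)
{
  return 1207142401u * (((8u * b) + 9483541u) - 230968044u) + 469069442u;
}

/* The same expression with the outer multiplication distributed over
   the inner sum.  Unsigned arithmetic wraps modulo 2^32, so the two
   forms agree for every b.  */
uint32_t
demo_distributed (uint32_t b)
{
  uint32_t scale = 1207142401u * 8u;   /* wraps to 1067204616 */
  uint32_t rest = 1207142401u * (9483541u - 230968044u);
  return b * scale + rest + 469069442u;
}

/* The exact value of the combined factor, computed in 64 bits.  It does
   not fit in a signed 32-bit int, which is why the same distribution is
   not safe for the signed test.  */
int64_t
demo_exact_scale (void)
{
  return (int64_t) 1207142401 * 8;
}
```

For b = 27685563 (the value the test computes from a = 47) both forms give
1676211843, while the exact combined factor is larger than INT32_MAX.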

Ok?


r~
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 9d861c6..44fe2a2 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -6116,11 +6116,9 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, tree wide_type,
{
  tree tem = const_binop (code, fold_convert (ctype, t),
  fold_convert (ctype, c));
- /* If the multiplication overflowed to INT_MIN then we lost sign
-information on it and a subsequent multiplication might
-spuriously overflow.  See PR68142.  */
- if (TREE_OVERFLOW (tem)
- && wi::eq_p (tem, wi::min_value (TYPE_PRECISION (ctype), SIGNED)))
+ /* If the multiplication overflowed, we lost information on it.
+See PR68142 and PR69845.  */
+ if (TREE_OVERFLOW (tem))
return NULL_TREE;
  return tem;
}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69845-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69845-1.c
new file mode 100644
index 000..92927ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69845-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int32 } */
+/* { dg-options "-O -fdump-tree-gimple -fdump-tree-optimized" } */
+
+int
+main ()
+{
+  struct S { char s; } v;
+  v.s = 47;
+  int a = (int) v.s;
+  int b = (27005061 + (a + 680455));
+  int c = ((1207142401 * (((8 * b) + 9483541) - 230968044)) + 469069442);
+  if (c != 1676211843)
+__builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "b \\\* 8" 1 "gimple" } } */
+/* { dg-final { scan-tree-dump-not "abort" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69845-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69845-2.c
new file mode 100644
index 000..e0b38e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69845-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int32 } */
+/* { dg-options "-O -fdump-tree-gimple -fdump-tree-optimized" } */
+
+int
+main ()
+{
+  struct S { char s; } v;
+  v.s = 47;
+  unsigned int a = (unsigned int) v.s;
+  unsigned int b = (27005061 + (a + 680455));
+  unsigned int c
+= ((1207142401u * (((8u * b) + 9483541u) - 230968044u)) + 469069442u);
+  if (c != 1676211843u)
+__builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "b \\\* 1067204616" 1 "gimple" } } */
+/* { dg-final { scan-tree-dump-not "abort" "optimized" } } */


Re: [PATCH] Adjust PR70251 fix

2016-03-22 Thread Marc Glisse

On Tue, 22 Mar 2016, Richard Biener wrote:


On March 22, 2016 4:55:13 PM GMT+01:00, Marc Glisse wrote:

On Tue, 22 Mar 2016, Richard Biener wrote:



This adjusts the PR70251 fix as discussed in the PR audit trail
and fixes a bug in genmatch required (bah, stupid GENERIC comparisons in
GIMPLE operands...).


Thanks !

Hmm, the transformation is still disabled on AVX512:


! /* A + (B vcmp C ? 1 : 0) -> A - (B vcmp C ? -1 : 0), since vector comparisons
!return all -1 or all 0 results.  */
 /* ??? We could instead convert all instances of the vec_cond to negate,
but that isn't necessarily a win on its own.  */
 (simplify
!  (plus:c @3 (view_convert? (vec_cond:s @0 integer_each_onep@1 integer_zerop@2)))
  (if (VECTOR_TYPE_P (type)
   && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
   && (TYPE_MODE (TREE_TYPE (type))
   == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
!   (minus @3 (view_convert (vec_cond @0 (negate @1) @2)


It seems that the references to @0 in the "if" should use @1 instead (at
least the last one). I assume this test is to make sure that A has as many
integer elements of the same size as the result of the vec_cond_expr.


It looks like that is always guaranteed by the input form and instead these are 
now useless checks which were guarding the original view-convert transform.


The input form has a view_convert_expr in it, so I don't see what prevents
us from arriving here with


v4df + view_convert((v8si < v8si) ? v8si : v8si)

for instance. That seems to indicate that some test is still needed, it is 
just better on the second or third argument of the vec_cond_expr than on 
the condition.


Or maybe you mean we could drop the view_convert_expr from the input form. 
It should have been sunk in the 2 constant arguments of the vec_cond_expr 
anyway (I didn't check if that really happens). That sounds good.



I'll remove them in a followup.

Richard.


Sorry for giving you an incomplete change in the PR.


--
Marc Glisse


Re: [PATCH] Adjust PR70251 fix

2016-03-22 Thread Richard Biener
On March 22, 2016 4:55:13 PM GMT+01:00, Marc Glisse wrote:
>On Tue, 22 Mar 2016, Richard Biener wrote:
>
>>
>> This adjusts the PR70251 fix as discussed in the PR audit trail
>> and fixes a bug in genmatch required (bah, stupid GENERIC comparisons in
>> GIMPLE operands...).
>
>Thanks !
>
>Hmm, the transformation is still disabled on AVX512:
>
>> ! /* A + (B vcmp C ? 1 : 0) -> A - (B vcmp C ? -1 : 0), since vector comparisons
>> !return all -1 or all 0 results.  */
>>  /* ??? We could instead convert all instances of the vec_cond to negate,
>> but that isn't necessarily a win on its own.  */
>>  (simplify
>> !  (plus:c @3 (view_convert? (vec_cond:s @0 integer_each_onep@1 integer_zerop@2)))
>>   (if (VECTOR_TYPE_P (type)
>>    && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
>>    && (TYPE_MODE (TREE_TYPE (type))
>>    == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
>> !   (minus @3 (view_convert (vec_cond @0 (negate @1) @2)
>
>It seems that the references to @0 in the "if" should use @1 instead (at
>least the last one). I assume this test is to make sure that A has as many
>integer elements of the same size as the result of the vec_cond_expr.

It looks like that is always guaranteed by the input form and instead these are 
now useless checks which were guarding the original view-convert transform.

I'll remove them in a followup.

Richard.

>Sorry for giving you an incomplete change in the PR.

 


Re: [PATCH 6/7] ira.c use DF infrastructure for combine_and_move_insns

2016-03-22 Thread Bernd Schmidt

On 03/21/2016 02:42 AM, Alan Modra wrote:

* ira.c (combine_and_move_insns): Rather than scanning insns,
use DF infrastructure to find use and def insns.

- remove_death (regno, insn);


This call appears to have gone missing. Is that intentional?

Other than that it looks good for stage1.


Bernd


Re: [HSA, PATCH] Allocate memory for shadow arg (PR hsa/70337)

2016-03-22 Thread Martin Jambor
On Mon, Mar 21, 2016 at 09:51:27PM +0100, Martin Liska wrote:
> On 03/21/2016 07:23 PM, Martin Jambor wrote:
> >This is strange.  The pointer to the shadow data structure is, from
> >the HSA perspective, a normal kernel argument and therefore should
> >already be included in the kernel->kernarg_segment_size.  Have you
> >checked that the values are indeed off?
> 
> Hi Martin.
> 
> You are right that the size of the shadow argument pointer should be
> included in the kernel->kernarg_segment_size. I'm currently
> testing a proper patch which conditionally copies the shadow argument.
> 
> Thanks,
> Martin
> 

> From 413707c51bf4b0ac7f8dac6421be9955c18767dd Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 21 Mar 2016 21:40:03 +0100
> Subject: [PATCH] Copy shadow argument conditionally (PR hsa/70337)
> 
> libgomp/ChangeLog:
> 
> 2016-03-21  Martin Liska  
> 
>   PR hsa/70337
>   * plugin/plugin-hsa.c (GOMP_OFFLOAD_run): Copy shadow
>   argument just in case a dispatched kernel uses that argument.

This is OK, thanks,

Martin

> ---
>  libgomp/plugin/plugin-hsa.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/libgomp/plugin/plugin-hsa.c b/libgomp/plugin/plugin-hsa.c
> index d888493..f7ef600 100644
> --- a/libgomp/plugin/plugin-hsa.c
> +++ b/libgomp/plugin/plugin-hsa.c
> @@ -1255,8 +1255,16 @@ GOMP_OFFLOAD_run (int n, void *fn_ptr, void *vars, void **args)
>hsa_signal_store_relaxed (s, 1);
>memcpy (shadow->kernarg_address, &vars, sizeof (vars));
>  
> -  memcpy (shadow->kernarg_address + sizeof (vars), &shadow,
> -   sizeof (struct hsa_kernel_runtime *));
> +  /* PR hsa/70337.  */
> +  size_t vars_size = sizeof (vars);
> +  if (kernel->kernarg_segment_size > vars_size)
> +{
> +  if (kernel->kernarg_segment_size != vars_size
> +   + sizeof (struct hsa_kernel_runtime *))
> + GOMP_PLUGIN_fatal ("Kernel segment size has an unexpected value");
> +  memcpy (packet->kernarg_address + vars_size, &shadow,
> +   sizeof (struct hsa_kernel_runtime *));
> +}
>  
>HSA_DEBUG ("Copying kernel runtime pointer to kernarg_address\n");
>  
> -- 
> 2.7.1
> 
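
The conditional copy in the quoted patch can be sketched in isolation as
follows.  This is only an illustration under stated assumptions:
demo_copy_kernargs and its parameters are hypothetical, there is no HSA
runtime involved, and an error code is returned where the plugin would call
GOMP_PLUGIN_fatal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernarg setup.  KERNARG is the argument
   buffer, SEGMENT_SIZE the size the kernel expects, VARS the ordinary
   argument and RUNTIME the shadow (runtime) pointer.
   Returns 0 on success, -1 if the segment size is unexpected.  */
int
demo_copy_kernargs (unsigned char *kernarg, size_t segment_size,
                    void *vars, void *runtime)
{
  size_t vars_size = sizeof (vars);

  /* The ordinary argument is always copied.  */
  memcpy (kernarg, &vars, vars_size);

  /* Only copy the shadow pointer if the kernel has room for it.  */
  if (segment_size > vars_size)
    {
      if (segment_size != vars_size + sizeof (runtime))
        return -1;
      memcpy (kernarg + vars_size, &runtime, sizeof (runtime));
    }
  return 0;
}
```

A kernel whose segment holds only the ordinary argument leaves the rest of
the buffer untouched, which is the case the unconditional memcpy got wrong.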



Re: [PATCH] Adjust PR70251 fix

2016-03-22 Thread Marc Glisse

On Tue, 22 Mar 2016, Richard Biener wrote:



This adjusts the PR70251 fix as discussed in the PR audit trail
and fixes a bug in genmatch required (bah, stupid GENERIC comparisons in
GIMPLE operands...).


Thanks !

Hmm, the transformation is still disabled on AVX512:


! /* A + (B vcmp C ? 1 : 0) -> A - (B vcmp C ? -1 : 0), since vector comparisons
!return all -1 or all 0 results.  */
 /* ??? We could instead convert all instances of the vec_cond to negate,
but that isn't necessarily a win on its own.  */
 (simplify
!  (plus:c @3 (view_convert? (vec_cond:s @0 integer_each_onep@1 integer_zerop@2)))
  (if (VECTOR_TYPE_P (type)
   && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
   && (TYPE_MODE (TREE_TYPE (type))
   == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
!   (minus @3 (view_convert (vec_cond @0 (negate @1) @2)


It seems that the references to @0 in the "if" should use @1 instead (at 
least the last one). I assume this test is to make sure that A has as many 
integer elements of the same size as the result of the vec_cond_expr.
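
For reference, the identity behind the quoted pattern can be checked one lane
at a time with a minimal scalar sketch.  The helper names are hypothetical,
a single int32_t stands in for one vector element, and "less than" stands in
for the generic vcmp.

```c
#include <stdint.h>

/* Model of one lane of a vector comparison: all bits set (-1) when the
   comparison holds, 0 otherwise.  */
int32_t
demo_cmp_mask (int32_t b, int32_t c)
{
  return -(int32_t) (b < c);
}

/* A + (B vcmp C ? 1 : 0), one lane.  */
int32_t
demo_plus_form (int32_t a, int32_t b, int32_t c)
{
  return a + (demo_cmp_mask (b, c) ? 1 : 0);
}

/* A - (B vcmp C ? -1 : 0), one lane.  */
int32_t
demo_minus_form (int32_t a, int32_t b, int32_t c)
{
  return a - (demo_cmp_mask (b, c) ? -1 : 0);
}
```

The identity only holds lane by lane, so it relies on A and the vec_cond
result having the same number of elements of the same size, which is what the
"if" is presumably there to check.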


Sorry for giving you an incomplete change in the PR.

--
Marc Glisse


[committed] [wwwdocs] PR c/69993: update wording of -Wmisleading-indentation on website

2016-03-22 Thread David Malcolm
On Tue, 2016-03-22 at 10:28 -0400, David Malcolm wrote:
> On Tue, 2016-03-01 at 20:18 +0100, Richard Biener wrote:
> > On March 1, 2016 7:51:01 PM GMT+01:00, David Malcolm <
> > dmalc...@redhat.com> wrote:
> > > The wording of our output from -Wmisleading-indentation is rather
> > > confusing, as noted by Reddit user "sysop073" here:
> > > https://www.reddit.com/r/programming/comments/47pejg/gcc_6_wmisleadingindentation_vs_goto_fail/d0eonwd
> > > 
> > > > The way they split up the warning looks designed to trick you.
> > > > sslKeyExchange.c:631:8: warning: statement is indented as if it
> > > > were
> > > guarded by... [-Wmisleading-indentation]
> > > > goto fail;
> > > > ^~~~
> > > > sslKeyExchange.c:629:4: note: ...this 'if' clause, but it is
> > > > not
> > > > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) !=
> > > > 0)
> > > > ^~
> > > > You read the first half and it sounds like goto fail; is
> > > > guarding
> > > something. Why would it not be:
> > > > sslKeyExchange.c:631:8: warning: statement is wrongly
> > > > indented...
> > > [-Wmisleading-indentation]
> > > > goto fail;
> > > > ^~~~
> > > > sslKeyExchange.c:629:4: note: ...as if it were guarded by this
> > > > 'if'
> > > clause
> > > > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) !=
> > > > 0)
> > > > ^~
> > > 
> > > I agree that the current wording is suboptimal; certainly the
> > > wording
> > > would be much clearer if the wording of the "warning" only spoke
> > > about
> > > the
> > > statement in question, and the "note"/inform should then talk
> > > about
> > > the
> > > not-really-guarding guard.
> > > 
> > > One rewording could be:
> > > 
> > > sslKeyExchange.c:631:8: warning: statement is misleadingly
> > > indented...
> > > [-Wmisleading-indentation]
> > >goto fail;
> > >^~~~
> > > sslKeyExchange.c:629:4: note: ...as if it were guarded by this
> > > 'if'
> > > clause, but it is not
> > >if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
> > >^~
> > > 
> > > However another Reddit user ("ksion") noted here:
> > > https://www.reddit.com/r/programming/comments/47pejg/gcc_6_wmisleadingindentation_vs_goto_fail/d0eqyih
> > > that:
> > > > This is just passive voice, there is nothing tricky about it.
> > > > What I find more confusing -- and what your fix preserves -- is
> > > > the
> > > > reversed order of offending lines of code in the source file
> > > > and
> > > > the
> > > message.
> > > > 
> > > > I'd rather go with something like this:
> > > > sslKeyExchange.c:629:4: warning: indentation of a statement
> > > > below
> > > this 'if' clause... [-Wmisleading-indentation]
> > > > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) !=
> > > > 0)
> > > > ^~
> > > > sslKeyExchange.c:631:8: note: ...suggests it is guarded by the
> > > > 'if'
> > > clause, but it's not
> > > > goto fail;
> > > > ^~~~
> > > > You can even see how the indentation is wrong in the very error
> > > message.
> > > 
> > > which suggests reversing the order of the messages, so that they
> > > appear
> > > in "source" order.
> > > 
> > > I think this is a big improvement in the readability of the
> > > warning.
> > > 
> > > The attached patch implements such a change, so that the warning
> > > is
> > > issued on the supposed guard clause, followed by the note on the
> > > statement that isn't really guarded.
> > > 
> > > Some examples:
> > > 
> > > Wmisleading-indentation-3.c:18:3: warning: this 'for' clause does
> > > not
> > > guard... [-Wmisleading-indentation]
> > >   for (i = 0; i < 10; i++)
> > >   ^~~
> > > Wmisleading-indentation-3.c:20:5: note: ...this statement, but
> > > the
> > > latter is indented as if it does
> > > prod[i] = a[i] * b[i];
> > > ^~~~
> > > Wmisleading-indentation-3.c: In function 'fn_6':
> > > Wmisleading-indentation-3.c:39:2: warning: this 'if' clause does
> > > not
> > > guard... [-Wmisleading-indentation]
> > >  if ((err = foo (b)) != 0)
> > >  ^~
> > > Wmisleading-indentation-3.c:41:3: note: ...this statement, but
> > > the
> > > latter is indented as if it does
> > >   goto fail;
> > >   ^~~~
> > > 
> > > I'm not totally convinced by my new wording; maybe the note could
> > > also mention the kind of clause ('if'/'while'/'else'/'for') for
> > > clarity, maybe something like:
> > > 
> > > Wmisleading-indentation-3.c: In function 'fn_6':
> > > Wmisleading-indentation-3.c:39:2: warning: this 'if' clause does
> > > not
> > > guard... [-Wmisleading-indentation]
> > >  if ((err = foo (b)) != 0)
> > >  ^~
> > > Wmisleading-indentation-3.c:41:3: note: ...this statement, but
> > > the
> > > latter is misleadingly indented
> > > as if it is guarded by the 'if'
> > >   goto fail;
> > >   ^~~~
> > > 
> > > Also, it's slightly clunkier when it comes to macros, e.g.:
> > > 
> > > Wmisleading-indentation-3.c: In function 'fn_14':
> > > Wmisleading-indentation-3.c:60:3: warning: th

Re: [PING**2] [PATCH, libstdc++] Add missing free-standing headers to install rule

2016-03-22 Thread Jonathan Wakely

On 22/03/16 07:10 +, Bernd Edlinger wrote:

Hi,

I am pinging for this patch, which addresses an admittedly minor regression
for free-standing libstdc++ due to changed c++11 default settings.  The proposed
patch only changes the free-standing install rule, and therefore has no impact
on other configurations.

https://gcc.gnu.org/ml/libstdc++/2016-03/msg4.html


Sorry for the delay, I'm testing the patch today.

Looks like the patch doesn't add  to the 
freestanding
headers, which means using -D_GLIBCXX_CONCEPT_CHECKS will give a fatal
error.

It also means --disable-libstdcxx-hosted --enable-concept-checks
creates an unusable configuration (although it's possible that
--enable-concept-checks is already broken due to the -std=gnu++14
default).

I think it's fine for the concept checking to be unsupported for
freestanding installations, but we should degrade gracefully, via
something like:

--- a/libstdc++-v3/include/bits/concept_check.h
+++ b/libstdc++-v3/include/bits/concept_check.h
@@ -42,7 +42,7 @@
// Concept-checking code is off by default unless users turn it on via
// configure options or editing c++config.h.

-#ifndef _GLIBCXX_CONCEPT_CHECKS
+#if !defined(_GLIBCXX_CONCEPT_CHECKS) || !defined(_GLIBCXX_HOSTED)

#define __glibcxx_function_requires(...)
#define __glibcxx_class_requires(_a,_b)


Or in c++config.h doing:

#ifndef _GLIBCXX_HOSTED
# undef _GLIBCXX_CONCEPT_CHECKS
#endif

That seems better than just giving an error.
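
A minimal sketch of the "degrade gracefully" idea, using only the
preprocessor.  All DEMO_* names are hypothetical stand-ins for
_GLIBCXX_HOSTED, _GLIBCXX_CONCEPT_CHECKS and __glibcxx_function_requires.
The sketch assumes the hosted macro is always defined to 0 or 1, so it tests
the macro's value rather than whether it is defined; that is an assumption of
the sketch and differs from the "something like" snippet above.

```c
/* Pretend this is a freestanding build in which the user nevertheless
   asked for the checks.  Both values are assumptions for the demo.  */
#define DEMO_HOSTED 0
#define DEMO_CONCEPT_CHECKS 1

#if !defined(DEMO_CONCEPT_CHECKS) || !DEMO_HOSTED
/* Checks not requested or not available: expand to nothing instead of
   including a header that was never installed.  */
# define demo_function_requires(cond) ((void) 0)
# define DEMO_CHECKS_ACTIVE 0
#else
/* Hosted build with checks requested: a real implementation would pull
   in its checking header here.  */
# define demo_function_requires(cond) ((void) sizeof (char[(cond) ? 1 : -1]))
# define DEMO_CHECKS_ACTIVE 1
#endif

/* Reports which branch was selected at compile time.  */
int
demo_checks_active (void)
{
  return DEMO_CHECKS_ACTIVE;
}
```

With DEMO_HOSTED set to 0 the request for checks is silently ignored, so a
use such as demo_function_requires (0) compiles to nothing instead of
producing a fatal error.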


[PATCH] Adjust PR70251 fix

2016-03-22 Thread Richard Biener

This adjusts the PR70251 fix as discussed in the PR audit trail
and fixes a bug in genmatch required (bah, stupid GENERIC comparisons in
GIMPLE operands...).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-03-22  Richard Biener  

PR middle-end/70251
* genmatch.c (gen_transform): Adjust last parameter to a three-state
int...
(capture::gen_transform): ... to change behavior when substituting
a condition into cond or not-cond expr context.
(dt_simplify::gen_1): Adjust.
* gimple-match-head.c: Include gimplify.h for unshare_expr.
* match.pd (A + (B vcmp C ? 1 : 0) -> A - (B vcmp C)): Revert
last change and instead change to
A + (B vcmp C ? 1 : 0) -> A - (B vcmp C ? -1 : 0).
(A - (B vcmp C ? 1 : 0) -> A + (B vcmp C)): Likewise.

* g++.dg/torture/pr70251.C: New testcase.

Index: gcc/genmatch.c
===
*** gcc/genmatch.c  (revision 234394)
--- gcc/genmatch.c  (working copy)
*** struct operand {
*** 548,554 
virtual void gen_transform (FILE *, int, const char *, bool, int,
  const char *, capture_info *,
  dt_operand ** = 0,
! bool = true)
  { gcc_unreachable  (); }
  };
  
--- 548,554 
virtual void gen_transform (FILE *, int, const char *, bool, int,
  const char *, capture_info *,
  dt_operand ** = 0,
! int = 0)
  { gcc_unreachable  (); }
  };
  
*** struct expr : public operand
*** 590,596 
bool force_single_use;
virtual void gen_transform (FILE *f, int, const char *, bool, int,
  const char *, capture_info *,
! dt_operand ** = 0, bool = true);
  };
  
  /* An operator that is represented by native C code.  This is always
--- 590,596 
bool force_single_use;
virtual void gen_transform (FILE *f, int, const char *, bool, int,
  const char *, capture_info *,
! dt_operand ** = 0, int = 0);
  };
  
  /* An operator that is represented by native C code.  This is always
*** struct c_expr : public operand
*** 622,628 
vec ids;
virtual void gen_transform (FILE *f, int, const char *, bool, int,
  const char *, capture_info *,
! dt_operand ** = 0, bool = true);
  };
  
  /* A wrapper around another operand that captures its value.  */
--- 622,628 
vec ids;
virtual void gen_transform (FILE *f, int, const char *, bool, int,
  const char *, capture_info *,
! dt_operand ** = 0, int = 0);
  };
  
  /* A wrapper around another operand that captures its value.  */
*** struct capture : public operand
*** 637,643 
operand *what;
virtual void gen_transform (FILE *f, int, const char *, bool, int,
  const char *, capture_info *,
! dt_operand ** = 0, bool = true);
  };
  
  /* if expression.  */
--- 637,643 
operand *what;
virtual void gen_transform (FILE *f, int, const char *, bool, int,
  const char *, capture_info *,
! dt_operand ** = 0, int = 0);
  };
  
  /* if expression.  */
*** get_operand_type (id_base *op, const cha
*** 2149,2155 
  void
  expr::gen_transform (FILE *f, int indent, const char *dest, bool gimple,
 int depth, const char *in_type, capture_info *cinfo,
!dt_operand **indexes, bool)
  {
id_base *opr = operation;
/* When we delay operator substituting during lowering of fors we
--- 2149,2155 
  void
  expr::gen_transform (FILE *f, int indent, const char *dest, bool gimple,
 int depth, const char *in_type, capture_info *cinfo,
!dt_operand **indexes, int)
  {
id_base *opr = operation;
/* When we delay operator substituting during lowering of fors we
*** expr::gen_transform (FILE *f, int indent
*** 2213,2221 
i == 0 ? NULL : op0type);
ops[i]->gen_transform (f, indent, dest, gimple, depth + 1, optype,
 cinfo, indexes,
!((!(*opr == COND_EXPR)
!  && !(*opr == VEC_COND_EXPR))
! || i != 0));
  }
  
const char *opr_name;
--- 2213,2220 
i == 0 ? NULL : op0type);
ops[i]->gen_transform (f, indent, dest, gimple, depth + 1, optype,
 cinfo, indexes,
!(*opr == COND_EXPR
! || *opr == VEC_COND_EXPR) && 

[PATCH, moxie] Fix endianness issue for moxiebox configuration

2016-03-22 Thread Anthony Green
Hello,

The attached patch fixes an endianness issue for the moxiebox
configuration of the moxie target.  I've just committed it.

Thanks,

AG


2016-03-22  Anthony Green  

* config/moxie/moxiebox.h (CC1_SPEC): Define.  Fix endianness
issue for moxiebox targets.
(CC1PLUS_SPEC): Ditto.



Index: gcc/config/moxie/moxiebox.h
===
--- gcc/config/moxie/moxiebox.h(revision 234061)
+++ gcc/config/moxie/moxiebox.h(working copy)
@@ -39,6 +39,12 @@
 #undef  ASM_SPEC
 #define ASM_SPEC "-EL"

+#undef CC1_SPEC
+#define CC1_SPEC "-mel %{meb:%ethis target is little-endian}"
+
+#undef CC1PLUS_SPEC
+#define CC1PLUS_SPEC CC1_SPEC
+
 #undef MULTILIB_DEFAULTS

 #undef SIZE_TYPE


Re: [PATCH] PR c/69993: improvements to wording of -Wmisleading-indentation

2016-03-22 Thread David Malcolm
On Tue, 2016-03-01 at 20:18 +0100, Richard Biener wrote:
> On March 1, 2016 7:51:01 PM GMT+01:00, David Malcolm <
> dmalc...@redhat.com> wrote:
> > The wording of our output from -Wmisleading-indentation is rather
> > confusing, as noted by Reddit user "sysop073" here:
> > https://www.reddit.com/r/programming/comments/47pejg/gcc_6_wmislead
> > ingindentation_vs_goto_fail/d0eonwd
> > 
> > > The way they split up the warning looks designed to trick you.
> > > sslKeyExchange.c:631:8: warning: statement is indented as if it
> > > were
> > guarded by... [-Wmisleading-indentation]
> > > goto fail;
> > > ^~~~
> > > sslKeyExchange.c:629:4: note: ...this 'if' clause, but it is not
> > > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
> > > ^~
> > > You read the first half and it sounds like goto fail; is guarding
> > something. Why would it not be:
> > > sslKeyExchange.c:631:8: warning: statement is wrongly indented...
> > [-Wmisleading-indentation]
> > > goto fail;
> > > ^~~~
> > > sslKeyExchange.c:629:4: note: ...as if it were guarded by this
> > > 'if'
> > clause
> > > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
> > > ^~
> > 
> > I agree that the current wording is suboptimal; certainly the
> > wording
> > would be much clearer if the wording of the "warning" only spoke
> > about
> > the
> > statement in question, and the "note"/inform should then talk about
> > the
> > not-really-guarding guard.
> > 
> > One rewording could be:
> > 
> > sslKeyExchange.c:631:8: warning: statement is misleadingly
> > indented...
> > [-Wmisleading-indentation]
> >goto fail;
> >^~~~
> > sslKeyExchange.c:629:4: note: ...as if it were guarded by this 'if'
> > clause, but it is not
> >if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
> >^~
> > 
> > However another Reddit user ("ksion") noted here:
> > https://www.reddit.com/r/programming/comments/47pejg/gcc_6_wmislead
> > ingindentation_vs_goto_fail/d0eqyih
> > that:
> > > This is just passive voice, there is nothing tricky about it.
> > > What I find more confusing -- and what your fix preserves -- is
> > > the
> > > reversed order of offending lines of code in the source file and
> > > the
> > message.
> > > 
> > > I'd rather go with something like this:
> > > sslKeyExchange.c:629:4: warning: indentation of a statement below
> > this 'if' clause... [-Wmisleading-indentation]
> > > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
> > > ^~
> > > sslKeyExchange.c:631:8: note: ...suggests it is guarded by the
> > > 'if'
> > clause, but it's not
> > > goto fail;
> > > ^~~~
> > > You can even see how the indentation is wrong in the very error
> > message.
> > 
> > which suggests reversing the order of the messages, so that they
> > appear
> > in "source" order.
> > 
> > I think this is a big improvement in the readability of the
> > warning.
> > 
> > The attached patch implements such a change, so that the warning is
> > issued on the supposed guard clause, followed by the note on the
> > statement that isn't really guarded.
> > 
> > Some examples:
> > 
> > Wmisleading-indentation-3.c:18:3: warning: this 'for' clause does
> > not
> > guard... [-Wmisleading-indentation]
> >   for (i = 0; i < 10; i++)
> >   ^~~
> > Wmisleading-indentation-3.c:20:5: note: ...this statement, but the
> > latter is indented as if it does
> > prod[i] = a[i] * b[i];
> > ^~~~
> > Wmisleading-indentation-3.c: In function 'fn_6':
> > Wmisleading-indentation-3.c:39:2: warning: this 'if' clause does
> > not
> > guard... [-Wmisleading-indentation]
> >  if ((err = foo (b)) != 0)
> >  ^~
> > Wmisleading-indentation-3.c:41:3: note: ...this statement, but the
> > latter is indented as if it does
> >   goto fail;
> >   ^~~~
> > 
> > I'm not totally convinced by my new wording; maybe the note could
> > also mention the kind of clause ('if'/'while'/'else'/'for') for
> > clarity, maybe something like:
> > 
> > Wmisleading-indentation-3.c: In function 'fn_6':
> > Wmisleading-indentation-3.c:39:2: warning: this 'if' clause does
> > not
> > guard... [-Wmisleading-indentation]
> >  if ((err = foo (b)) != 0)
> >  ^~
> > Wmisleading-indentation-3.c:41:3: note: ...this statement, but the
> > latter is misleadingly indented
> > as if it is guarded by the 'if'
> >   goto fail;
> >   ^~~~
> > 
> > Also, it's slightly clunkier when it comes to macros, e.g.:
> > 
> > Wmisleading-indentation-3.c: In function 'fn_14':
> > Wmisleading-indentation-3.c:60:3: warning: this 'for' clause does
> > not
> > guard... [-Wmisleading-indentation]
> >   for ((VAR) = (START); (VAR) < (STOP); (VAR++))
> >   ^
> > Wmisleading-indentation-3.c:65:3: note: in expansion of macro
> > 'FOR_EACH'
> >   FOR_EACH (i, 0, 10)
> >   ^~~~
> > Wmisleading-indentation-3.c:67:5: note: ...this statement, but the
> > latter is indented as if it does
> > bar (i, i);
> > ^~~
> > 
> > That said, the reorderi
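
For readers who want to reproduce the pattern under discussion, here is a
minimal C sketch of the "goto fail" shape that -Wmisleading-indentation
targets.  The names check, verify_misleading and verify_braced are made up
for illustration; they are not taken from sslKeyExchange.c or from the GCC
testsuite.

```c
/* Hypothetical helper: returns 0 on success, 1 on failure.  */
static int check(int ok)
{
  return ok ? 0 : 1;
}

/* The second "goto fail" is indented as if the 'if' guarded it, but it
   is not: it always runs, so the second check is never reached.  */
int verify_misleading(int a_ok, int b_ok, int *reached_final)
{
  int err;
  *reached_final = 0;
  if ((err = check(a_ok)) != 0)
    goto fail;
    goto fail;
  if ((err = check(b_ok)) != 0)
    goto fail;
  *reached_final = 1;
fail:
  return err;
}

/* Same logic with explicit braces and without the stray statement.  */
int verify_braced(int a_ok, int b_ok, int *reached_final)
{
  int err;
  *reached_final = 0;
  if ((err = check(a_ok)) != 0)
    {
      goto fail;
    }
  if ((err = check(b_ok)) != 0)
    {
      goto fail;
    }
  *reached_final = 1;
fail:
  return err;
}
```

With a_ok true and b_ok false, verify_misleading returns 0 (success) even
though the second check would have failed, while verify_braced returns 1.
Compiling the first function with GCC 6 or later and -Wall should produce
the warning/note pair whose wording is being debated here.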

[PATCH] Add security_sensitive attribute to clean function stack and regs.

2016-03-22 Thread Marcos Díaz
Hi,
   the attached patch adds a new attribute 'security_sensitive' for functions.
The idea was discussed in PR middle-end/69976.
This attribute makes GCC emit clean-up code in the function's epilogue.
The clean-up code clears the stack that the function used and no longer
needs. It also clears the registers the function used. It only works on x86_64.
Please review the discussion here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69976
since we had some doubts about the implementation.

We also added some test-cases and ran all tests in x86_64.
We think this isn't a bug-fix but a new feature.

Changelog
2016-03-21  Marcos Diaz  
   Andres Tiraboschi  

PR tree-optimization/69820
* config/i386/i386-protos.h: Add ix86_clear_regs_emit and
ix86_sec_sensitive_attr_p
* config/i386/i386.c: (ix86_sec_sensitive_attr_p): New function
(ix86_using_red_zone): now take into account if the function has the new
attribute.
(is_preserved_reg): New function.
(is_integer_reg): New function.
(is_used_as_ret): New function.
(reg_to_string): New function.
(clear_reg_emit): New function.
(ix86_clear_regs_emit): New function.
(ix86_expand_epilogue): Added code to emit clean up code only when
security_sensitive attribute is set.
(ix86_handle_security_sensitive_attribute): New function.
(ix86_attribute_table): Added new attribute.
* config/i386/i386.md: (UNSPECV_CLRSTACK): New unspecv.
(UNSPECV_CLRREGS): New unspecv.
(return): Conditionally emit cleaning regs code.
(simple_return): Likewise
(clear_regs): New insn.
(clear_stack): New insn.
* doc/extend.texi: Added description for new security_sensitive attribute.
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index e4652f3..b69aa59 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -32,6 +32,8 @@ extern HOST_WIDE_INT ix86_initial_elimination_offset (int, 
int);
 extern void ix86_expand_prologue (void);
 extern void ix86_maybe_emit_epilogue_vzeroupper (void);
 extern void ix86_expand_epilogue (int);
+extern void ix86_clear_regs_emit (rtx*);
+extern bool ix86_sec_sensitive_attr_p(void);
 extern void ix86_expand_split_stack_prologue (void);
 
 extern void ix86_output_addr_vec_elt (FILE *, int);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3d8dbc4..7c58d6d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3690,12 +3690,19 @@ make_pass_stv (gcc::context *ctxt)
   return new pass_stv (ctxt);
 }
 
+bool ix86_sec_sensitive_attr_p(void)
+{
+  return lookup_attribute ("security_sensitive", DECL_ATTRIBUTES (cfun->decl));
+}
+
+
 /* Return true if a red-zone is in use.  */
 
 static inline bool
 ix86_using_red_zone (void)
 {
-  return TARGET_RED_ZONE && !TARGET_64BIT_MS_ABI;
+  return TARGET_RED_ZONE && !TARGET_64BIT_MS_ABI
+   && !ix86_sec_sensitive_attr_p();
 }
 
 /* Return a string that documents the current -m options.  The caller is
@@ -13325,6 +13332,144 @@ ix86_emit_restore_sse_regs_using_mov (HOST_WIDE_INT 
cfa_offset,
   }
 }
 
+/* Returns true iff regno is a register that must be preserved in a function.*/
+static inline bool
+is_preserved_reg(const unsigned int regno)
+{
+  return (regno == BX_REG)
+|| (regno == BP_REG)
+|| (regno == SP_REG)
+|| (regno == R12_REG)
+|| (regno == R13_REG)
+|| (regno == R14_REG)
+|| (regno == R15_REG);
+}
+
+/* Returns true iff regno is an integer register.*/
+static inline bool is_integer_reg(unsigned int regno)
+{
+  return (regno <= 7u) ||
+ ((FIRST_REX_INT_REG <= regno) && (regno <= LAST_REX_INT_REG));
+}
+
+/* Returns true iff regno is used to return in the current function.*/
+static inline bool
+is_used_as_ret(const unsigned int reg_number)
+{
+  bool is_ret = ix86_function_value_regno_p(reg_number);
+  if (is_ret)
+{
+  rtx ret = ix86_function_value
+   (TREE_TYPE (DECL_RESULT (cfun->decl)), cfun->decl, true);
+  if ((REG_P(ret))
+ && (is_integer_reg(REGNO(ret)))
+ && (is_integer_reg(reg_number))
+)
+   {
+ if ((GET_MODE(ret) == TImode) || (GET_MODE(ret) == CDImode))
+   is_ret = (reg_number == AX_REG) || (reg_number == DX_REG);
+ else
+   is_ret = (reg_number == AX_REG);
+   }
+  else if ((REG_P(ret)))
+   is_ret = REGNO(ret) == reg_number;
+  else
+   {
+ // Is parallel
+ const unsigned int len = XVECLEN(ret, 0);
+ unsigned int j = 0u;
+ bool found = false;
+ while (!found && j < len)
+   {
+ const rtx explist = XVECEXP(ret, 0, j);
+ const rtx ret_reg = XEXP(explist, 0);
+ found = REGNO(ret_reg) == reg_number;
+ ++j;
+   }
+ is_ret = found;
+   }
+
+}
+  return is_ret;
+}
+
+// Make this big enough to store any instruction
+#define MAX_CLEAR_STRING_SIZE 50
+
+/* Adds to str in pos position the neame of the register regno.*/
+static size_t
+reg_to_string(const unsigned in
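
As a point of comparison, this is roughly what a programmer has to write by
hand today to get part of the effect the proposed attribute aims to
automate.  It is only a sketch: the names scrub and sum_key_bytes are
hypothetical, the volatile-pointer loop is a common idiom rather than
anything from this patch, and it clears only one stack buffer, not the
registers.

```c
#include <stddef.h>
#include <string.h>

/* Zero N bytes at P through a volatile pointer so that the stores are
   not removed as dead stores.  */
static void scrub(void *p, size_t n)
{
  volatile unsigned char *vp = (volatile unsigned char *) p;
  while (n--)
    *vp++ = 0;
}

/* Hypothetical "sensitive" function: copies key material into a local
   buffer, uses it, and scrubs the buffer before returning.  */
unsigned sum_key_bytes(const unsigned char *key, size_t len)
{
  unsigned char local[32];
  unsigned sum = 0;
  size_t n = len < sizeof local ? len : sizeof local;

  memcpy(local, key, n);
  for (size_t i = 0; i < n; i++)
    sum += local[i];

  scrub(local, sizeof local);
  return sum;
}
```

The manual approach cannot reach spill slots or registers chosen by the
compiler, which is the gap the epilogue-based clean-up described above is
meant to cover.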

[PATCH] Fix PR70333

2016-03-22 Thread Richard Biener

The following fixes another wide-int merge fallout by reverting back
to what the code did before it (doing a wide multiplication).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-03-22  Richard Biener  

PR middle-end/70333
* fold-const.c (extract_muldiv_1): Properly perform multiplication
in the wide type.

* gcc.dg/torture/pr70333.c: New testcase.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 234394)
--- gcc/fold-const.c(working copy)
*** extract_muldiv_1 (tree t, tree c, enum t
*** 6376,6393 
  bool overflow_p = false;
  bool overflow_mul_p;
  signop sign = TYPE_SIGN (ctype);
! wide_int mul = wi::mul (op1, c, sign, &overflow_mul_p);
  overflow_p = TREE_OVERFLOW (c) | TREE_OVERFLOW (op1);
  if (overflow_mul_p
  && ((sign == UNSIGNED && tcode != MULT_EXPR) || sign == SIGNED))
overflow_p = true;
  if (!overflow_p)
!   {
! mul = wide_int::from (mul, TYPE_PRECISION (ctype),
!   TYPE_SIGN (TREE_TYPE (op1)));
! return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
! wide_int_to_tree (ctype, mul));
!   }
}
  
/* If these operations "cancel" each other, we have the main
--- 6376,6392 
  bool overflow_p = false;
  bool overflow_mul_p;
  signop sign = TYPE_SIGN (ctype);
! unsigned prec = TYPE_PRECISION (ctype);
! wide_int mul = wi::mul (wide_int::from (op1, prec, sign),
! wide_int::from (c, prec, sign),
! sign, &overflow_mul_p);
  overflow_p = TREE_OVERFLOW (c) | TREE_OVERFLOW (op1);
  if (overflow_mul_p
  && ((sign == UNSIGNED && tcode != MULT_EXPR) || sign == SIGNED))
overflow_p = true;
  if (!overflow_p)
!   return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
!   wide_int_to_tree (ctype, mul));
}
  
/* If these operations "cancel" each other, we have the main
Index: gcc/testsuite/gcc.dg/torture/pr70333.c
===
*** gcc/testsuite/gcc.dg/torture/pr70333.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr70333.c  (working copy)
***
*** 0 
--- 1,19 
+ /* { dg-do run } */
+ /* { dg-require-effective-target lp64 } */
+ 
+ unsigned long int
+ foo (signed char b, signed char e)
+ {
+   return ((2ULL * b) * (e * 13)) * (32 << 24);
+ }
+ 
+ int
+ main ()
+ {
+   if (__CHAR_BIT__ == 8
+   && sizeof (int) == 4
+   && sizeof (long long) == 8
+   && foo (-60, 1) != 0xffffff3d00000000ULL)
+ __builtin_abort ();
+   return 0;
+ }
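
To illustrate why the precision of the multiplication matters, here is a
small standalone C sketch.  It is not GCC code: mul_narrow, mul_wide and
foo_ref are names invented here, and the fixed-width integer types only
stand in for "the type of the operands" versus "the wide type" (ctype).

```c
#include <stdint.h>

/* Multiply in the narrow (32-bit) type: the product wraps modulo 2^32.  */
uint32_t mul_narrow(uint32_t a, uint32_t b)
{
  return a * b;
}

/* Convert both operands to the wide (64-bit) type first, then multiply:
   the full product is kept.  */
uint64_t mul_wide(uint32_t a, uint32_t b)
{
  return (uint64_t) a * (uint64_t) b;
}

/* Same expression as the testcase, using unsigned long long so the
   sketch does not depend on an LP64 target.  */
unsigned long long foo_ref(signed char b, signed char e)
{
  return ((2ULL * b) * (e * 13)) * (32 << 24);
}
```

The constants 13 and 32 << 24 (that is, 2^29) from the testcase show the
difference: their product is 0x1A0000000, which needs 33 bits, so the
narrow multiplication yields 0xA0000000 instead.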


Re: Fix 70278 (LRA split_regs followup patch)

2016-03-22 Thread Christophe Lyon
On 22 March 2016 at 13:14, Bernd Schmidt  wrote:
> On 03/22/2016 10:24 AM, Christophe Lyon wrote:
>>
>>
>> The ARM test isn't sufficiently protected against non-compliant
>> configurations,
>> and fails if GCC is configured for arm*linux-gnueabihf for instance
>> (see
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/234342/report-build-info.html)
>>
>> The attached small patch fixes that by requiring arm_arch_v4t_multilib
>> effective target.
>>
>> I used arm_arch_v4t_multilib instead of arm_arch_v4t because, as I
>> reported a long time ago
>> the latter does not complain in some unsupported configuration because
>> the sample effective
>> target test does not contain actual code. In particular it's not
>> sufficient to reject thumb-1 with
>> hard-float.
>>
>> OK?
>
>
> No objections from me, but I copied all this from the existing testcase
> ftest-armv4t-thumb.c, so I'm puzzled why that one doesn't fail.
>

It's similar to what I tried to explain above: ftest-armv4t-thumb.c contains
only preprocessor tests, no actual code.

When the program contains code (or even a single global variable definition),
the compiler complains that "Thumb-1 hard-float VFP ABI" is not
implemented.

A long time ago, I submitted a patch to add some code to the
arm_arch_FUNC_ok effective target, but it was not accepted.

Christophe.

>
> Bernd


[PATCH, PR target/70302] STV: support uninitialized register used in converted instructions

2016-03-22 Thread Ilya Enkovich
Hi,

This patch allows uninitialized registers usage in instructions
converted by STV pass.  Bootstrapped and tested on x86_64-pc-linux-gnu{-m32}.
OK for trunk?

Thanks,
Ilya
--
gcc/

2016-03-22  Ilya Enkovich  

PR target/70302
* config/i386/i386.c (scalar_chain::convert_op): Support
uninitialized register usage case.

gcc/testsuite/

2016-03-22  Ilya Enkovich  

PR target/70302
* gcc.target/i386/pr70302.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3d8dbc4..d25c5c4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3409,6 +3409,20 @@ scalar_chain::convert_op (rtx *op, rtx_insn *insn)
fprintf (dump_file, "  Preloading operand for insn %d into r%d\n",
 INSN_UID (insn), REGNO (tmp));
 }
+  else if (REG_P (*op))
+{
+  /* We may have not converted register usage in case
+this register has no definition.  Otherwise it
+should be converted in convert_reg.  */
+  df_ref ref;
+  FOR_EACH_INSN_USE (ref, insn)
+   if (DF_REF_REGNO (ref) == REGNO (*op))
+ {
+   gcc_assert (!DF_REF_CHAIN (ref));
+   break;
+ }
+  *op = gen_rtx_SUBREG (V2DImode, *op, 0);
+}
   else
 {
   gcc_assert (SUBREG_P (*op));
diff --git a/gcc/testsuite/gcc.target/i386/pr70302.c 
b/gcc/testsuite/gcc.target/i386/pr70302.c
new file mode 100644
index 0000000..9b82a0c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70302.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2" } */
+
+long a, c, e;
+int b, d;
+unsigned long long f;
+
+extern void fn2 (const char *, int, int, int);
+
+void
+fn1(long long p1)
+{
+  unsigned long long g;
+  int i;
+  for (; i;)
+if (e)
+  g = c;
+  if (a)
+f = p1;
+  if (!f && !g)
+fn2("", b, d, d);
+}


Re: Fix 70278 (LRA split_regs followup patch)

2016-03-22 Thread Bernd Schmidt

On 03/22/2016 10:24 AM, Christophe Lyon wrote:


> The ARM test isn't sufficiently protected against non-compliant configurations,
> and fails if GCC is configured for arm*linux-gnueabihf for instance
> (see
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/234342/report-build-info.html)
>
> The attached small patch fixes that by requiring arm_arch_v4t_multilib
> effective target.
>
> I used arm_arch_v4t_multilib instead of arm_arch_v4t because, as I
> reported a long time ago
> the latter does not complain in some unsupported configuration because
> the sample effective
> target test does not contain actual code. In particular it's not
> sufficient to reject thumb-1 with
> hard-float.
>
> OK?


No objections from me, but I copied all this from the existing testcase 
ftest-armv4t-thumb.c, so I'm puzzled why that one doesn't fail.



Bernd


Also test -O0 for OpenACC C, C++ offloading test cases

2016-03-22 Thread Thomas Schwinge
Hi!

As discussed in

(and similar to what we're already doing for Fortran, and similar to what
recently got committed to libgomp/testsuite/libgomp.hsa.c/c.exp), it has
been helpful to also run C, C++ offloading test cases with -O0 in
addition to the -O2 default.  Making my earlier gomp-4_0-branch patch
conceptually simpler, I came up with the following; OK for trunk?

commit 879c8f6dcb9dad514fb3bf11c721fed37b6574be
Author: Thomas Schwinge 
Date:   Tue Mar 22 10:26:19 2016 +0100

Also test -O0 for OpenACC C, C++ offloading test cases

libgomp/
* testsuite/libgomp.oacc-c++/c++.exp: Set up torture testing, use
gcc-dg-runtest.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Specify
-fno-builtin-acc_on_device instead of -O0.
* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Skip for
-O0.
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta-2.c:
Don't specify -O2.
* testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta.c:
Likewise.
---
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 29 +-
 .../libgomp.oacc-c-c++-common/acc-on-device-2.c|  5 ++--
 .../libgomp.oacc-c-c++-common/acc-on-device.c  |  3 ++-
 .../kernels-alias-ipa-pta-2.c  |  2 +-
 .../kernels-alias-ipa-pta-3.c  |  2 +-
 .../kernels-alias-ipa-pta.c|  2 +-
 .../libgomp.oacc-c-c++-common/loop-auto-1.c|  4 ++-
 .../libgomp.oacc-c-c++-common/loop-dim-default.c   |  6 +++--
 .../testsuite/libgomp.oacc-c-c++-common/loop-g-1.c |  5 ++--
 .../testsuite/libgomp.oacc-c-c++-common/loop-g-2.c |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-gwv-1.c |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-red-g-1.c   |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-red-gwv-1.c |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-red-v-1.c   |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-red-v-2.c   |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-red-w-1.c   |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-red-w-2.c   |  5 ++--
 .../testsuite/libgomp.oacc-c-c++-common/loop-v-1.c |  5 ++--
 .../testsuite/libgomp.oacc-c-c++-common/loop-w-1.c |  5 ++--
 .../libgomp.oacc-c-c++-common/loop-wv-1.c  |  5 ++--
 .../libgomp.oacc-c-c++-common/routine-g-1.c|  5 ++--
 .../libgomp.oacc-c-c++-common/routine-gwv-1.c  |  5 ++--
 .../libgomp.oacc-c-c++-common/routine-v-1.c|  5 ++--
 .../libgomp.oacc-c-c++-common/routine-w-1.c|  5 ++--
 .../libgomp.oacc-c-c++-common/routine-wv-1.c   |  5 ++--
 libgomp/testsuite/libgomp.oacc-c/c.exp | 29 +-
 26 files changed, 111 insertions(+), 56 deletions(-)

diff --git libgomp/testsuite/libgomp.oacc-c++/c++.exp 
libgomp/testsuite/libgomp.oacc-c++/c++.exp
index 88b0269..bbdbe2f 100644
--- libgomp/testsuite/libgomp.oacc-c++/c++.exp
+++ libgomp/testsuite/libgomp.oacc-c++/c++.exp
@@ -2,6 +2,7 @@
 
 load_lib libgomp-dg.exp
 load_gcc_lib gcc-dg.exp
+load_gcc_lib torture-options.exp
 
 global shlib_ext
 
@@ -13,13 +14,9 @@ if [info exists lang_include_flags] then {
 unset lang_include_flags
 }
 
-# If a testcase doesn't have special options, use these.
-if ![info exists DEFAULT_CFLAGS] then {
-set DEFAULT_CFLAGS "-O2"
-}
-
 # Initialize dg.
 dg-init
+torture-init
 
 # Turn on OpenACC.
 lappend AL

Re: [PATCH, C++, PR70290] Fix type checks for vector conditional expr

2016-03-22 Thread Richard Biener
On Mon, Mar 21, 2016 at 11:16 AM, Ilya Enkovich  wrote:
> Hi,
>
> This patch makes an integer vector type always be used for
> type checks when building a vector conditional expression.
> Without this patch we may get the type of a vector comparison,
> which may have a non-vector mode and a different size when
> scalar masks are used.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu.  OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Ilya
> --
> gcc/cp/
>
> 2016-03-21  Ilya Enkovich  
>
> * call.c (build_conditional_expr_1): Always use original
> condition type for vector type checks and build.
>
> gcc/testsuite/
>
> 2016-03-21  Ilya Enkovich  
>
> * g++.dg/ext/pr70290.C: New test.
>
>
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> index 1edbce8..d3a256c 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -4634,6 +4634,8 @@ build_conditional_expr_1 (location_t loc, tree arg1, 
> tree arg2, tree arg3,
>
>if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (arg1)))
>  {
> +  tree arg1_type = TREE_TYPE (arg1);
> +
>/* If arg1 is another cond_expr choosing between -1 and 0,
>  then we can use its comparison.  It may help to avoid
>  additional comparison, produce more accurate diagnostics
> @@ -4653,7 +4655,6 @@ build_conditional_expr_1 (location_t loc, tree arg1, 
> tree arg2, tree arg3,
>   || error_operand_p (arg3))
> return error_mark_node;
>
> -  tree arg1_type = TREE_TYPE (arg1);
>arg2_type = TREE_TYPE (arg2);
>arg3_type = TREE_TYPE (arg3);
>
> diff --git a/gcc/testsuite/g++.dg/ext/pr70290.C 
> b/gcc/testsuite/g++.dg/ext/pr70290.C
> new file mode 100644
> index 000..6de13ce
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/pr70290.C
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-mavx512vl" { target { i?86-*-* x86_64-*-* } } } 
> */
> +
> +typedef int vec __attribute__((vector_size(32)));
> +
> +vec
> +test1 (vec x,vec y)
> +{
> +  return (x < y) ? 1 : 0;
> +}
> +
> +vec
> +test2 (vec x,vec y)
> +{
> +  vec zero = { };
> +  vec one = zero + 1;
> +  return (x < y) ? one : zero;
> +}


Re: [PATCH PR69042/01]Add IV candidate for use with constant offset stripped in base.

2016-03-22 Thread Richard Biener
On Tue, Mar 22, 2016 at 11:22 AM, Bin.Cheng  wrote:
> On Wed, Mar 16, 2016 at 10:06 AM, Richard Biener
>  wrote:
>>
>> On Wed, Mar 16, 2016 at 10:48 AM, Bin Cheng  wrote:
>> > Hi,
>> > When I tried to decrease # of IV candidates, I removed code that adds IV 
>> > candidates for use with constant offset stripped in use->base.  This is 
>> > kind of too aggressive and triggers PR69042.  So here is a patch adding 
>> > back the missing candidates.  Honestly, this patch doesn't truly fix the 
>> > issue, it just brings back the original behavior in IVOPT part (Which is 
>> > still a right thing to do I think).  The issue still depends on PIC_OFFSET 
>> > register used on x86 target.  As discussed in 
>> > https://gcc.gnu.org/ml/gcc/2016-02/msg00040.html.  Furthermore, the real 
>> > problem could be in register pressure modeling about PIC_OFFSET symbol in 
>> > IVOPT.
>> >
>> > On AArch64, overall spec2k number isn't changed, though 173.applu is 
>> > regressed by ~4% because couple of loops' # of candidates now hits 
>> > "--param iv-consider-all-candidates-bound=30".  For spec2k6 data on 
>> > AArch64, INT is not affected; FP overall is not changed, as for specific 
>> > case: 459.GemsFDTD is regressed by 2%, 433.milc is improved by 2%.  To 
>> > address the regression, I will send another patch increasing the parameter 
>> > bound.
>> >
>> > Bootstrap&test on x86_64 and AArch64, is it OK?  In the meantime, I will 
>> > collect spec2k6 data on x86_64.
>>
>> Ok.
> Hi Richard,
> Hmm, I got spec2k6 data on my x86_64, it (along with patch increasing
> param iv-consider-all-candidates-bound) causes 1% regression for
> 436.cactusADM in my run.  I looked into the code, for function
> bench_staggeredleapfrog2_ (takes 99% running time after patching),
> IVOPT chooses one fewer candidates for outer loop, but it does result
> in couple of more instructions there.

You mean IVOPTs chooses one fewer IVs for the outer loop?

>  For this case, register
> pressure is a more interesting issue (36 candidates chosen in outer
> loop, many stack accesses), not sure if this 1% regression blocks the
> patch at this stage, or not?

Is this with or without the increase of the param?  What compiler options and
on what sub-architecture was this?

I think if the IVO choice looks optimal before the patch and not optimal after
then it's worth blocking but it sounds like the IVO choice is a mess anyway?
[can you maybe check IV choice by ICC?]

Thanks,
Richard.

> Thanks,
> bin


Re: [PATCH PR69489/02]Handle PHI which can be degenerated to two arguments node in tree ifcvt.

2016-03-22 Thread Richard Biener
On Mon, Mar 21, 2016 at 4:22 PM, Bin Cheng  wrote:
> Hi,
> The second issue revealed by PR69489 is tree ifcvt could not convert PHI 
> nodes with more than 2 arguments.  Among these nodes, there is a special kind 
> of PHI which can be handled.  Precisely, if the PHI node satisfies below two 
> conditions:
>  1) The number of PHI arguments with different values equals 2 and one 
> argument occurs only once.
>  2) The edge corresponding to the unique argument isn't critical edge.
>
>Such PHI can be degenerated and handled just like PHI node with only two 
> arguments.  For example:
>  res = PHI <A_1(e1), A_1(e2), A_2(e3)>;
>can be transformed into:
>  res = (predicate of e3) ? A_2 : A_1;
>
> This patch fixes the issue.  I know we may be able to further relax the check 
> and allow handling of general multiple args PHI node, this can be a starter 
> since the change is kind of trivial.
> Bootstrap & test on x86_64 & AArch64.  Though the first part patch at 
> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00888.html needs to be revised, 
> this one is quite independent apart from the test case itself.  So any 
> opinions?

Looks good to me.  Btw, see also PR56541 where jump threading can
introduce the case but with more than two distinct PHI args.
IMHO we "simply" want to force *any_mask_load_store to true if
if_convertible_phi_p runs into this case (so we perform
versioning to only expose the if-converted code to the vectorizer
which has a cost model to tell whether the result is profitable).
There is still the critical edge splitting only performed for
aggressive-if-conv but I think that's easily sth we can do for all
loop
bodies.

Richard.

> Thanks,
> bin
>
> 2016-03-21  Bin Cheng  
>
> PR tree-optimization/69489
> * tree-if-conv.c (phi_convertible_by_degenerating_args): New.
> (if_convertible_phi_p): Call phi_convertible_by_degenerating_args.
> Revise dump message.
> (if_convertible_bb_p): Remove check on edge count of basic block's
> predecessors.
>
> gcc/testsuite/ChangeLog
> 2016-03-21  Bin Cheng  
>
> PR tree-optimization/69489
> * gcc.dg/tree-ssa/ifc-pr69489-2.c: New test.
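
The degenerate PHI case described in the quoted message can be pictured at
the source level with the following C sketch.  It is only an analogy: the
function names and the edge labels in the comments are chosen here for
illustration, and the real transformation works on GIMPLE PHI nodes and
edge predicates, not on C source.

```c
/* Three control-flow paths merge; two of them carry the same value a1
   and exactly one carries a2.  */
int phi_three_way(int p, int q, int a1, int a2)
{
  int res;
  if (p)
    res = a1;      /* first incoming edge, value a1 */
  else if (q)
    res = a1;      /* second incoming edge, same value a1 */
  else
    res = a2;      /* third incoming edge, the unique value a2 */
  return res;
}

/* Because only the third edge carries a different value, the merge can
   be written as a single select on that edge's predicate.  */
int phi_degenerated(int p, int q, int a1, int a2)
{
  int third_edge_taken = !p && !q;
  return third_edge_taken ? a2 : a1;
}
```

Both functions return the same result for every combination of p and q,
which is the property that lets a three-argument PHI of this shape be
handled like a two-argument one.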


Re: [PATCH] Skip static ctors/dtors in IPA ICF (PR ipa/70306)

2016-03-22 Thread Jakub Jelinek
On Tue, Mar 22, 2016 at 11:24:34AM +0100, Martin Liška wrote:
> On 03/21/2016 07:20 PM, Jan Hubicka wrote:
> > OK, (it would make more sense to turn them into wrappers that can be easily
> > done, too, but we can do that next stage1)
> > thanks!
> > 
> > Honza
> 
> Sure, will do that in next stage1.
> I've just bootstrapped and regtested the same patch on GCC-5 branch
> w/o observing any regression.
> 
> May I install that to the branch?

Sure, thanks.

Jakub


Re: [PATCH] Skip static ctors/dtors in IPA ICF (PR ipa/70306)

2016-03-22 Thread Martin Liška
On 03/21/2016 07:20 PM, Jan Hubicka wrote:
> OK, (it would make more sense to turn them into wrappers that can be easily
> done, too, but we can do that next stage1)
> thanks!
> 
> Honza

Sure, will do that in next stage1.
I've just bootstrapped and regtested the same patch on GCC-5 branch
w/o observing any regression.

May I install that to the branch?
Martin


Re: [PATCH PR69042/01]Add IV candidate for use with constant offset stripped in base.

2016-03-22 Thread Bin.Cheng
On Wed, Mar 16, 2016 at 10:06 AM, Richard Biener
 wrote:
>
> On Wed, Mar 16, 2016 at 10:48 AM, Bin Cheng  wrote:
> > Hi,
> > When I tried to decrease # of IV candidates, I removed code that adds IV 
> > candidates for use with constant offset stripped in use->base.  This is 
> > kind of too aggressive and triggers PR69042.  So here is a patch adding 
> > back the missing candidates.  Honestly, this patch doesn't truly fix the 
> > issue, it just brings back the original behavior in IVOPT part (Which is 
> > still a right thing to do I think).  The issue still depends on PIC_OFFSET 
> > register used on x86 target.  As discussed in 
> > https://gcc.gnu.org/ml/gcc/2016-02/msg00040.html.  Furthermore, the real 
> > problem could be in register pressure modeling about PIC_OFFSET symbol in 
> > IVOPT.
> >
> > On AArch64, overall spec2k number isn't changed, though 173.applu is 
> > regressed by ~4% because couple of loops' # of candidates now hits "--param 
> > iv-consider-all-candidates-bound=30".  For spec2k6 data on AArch64, INT is 
> > not affected; FP overall is not changed, as for specific case: 459.GemsFDTD 
> > is regressed by 2%, 433.milc is improved by 2%.  To address the regression, 
> > I will send another patch increasing the parameter bound.
> >
> > Bootstrap&test on x86_64 and AArch64, is it OK?  In the meantime, I will 
> > collect spec2k6 data on x86_64.
>
> Ok.
Hi Richard,
Hmm, I got spec2k6 data on my x86_64; this patch (along with the patch
increasing param iv-consider-all-candidates-bound) causes a 1% regression
for 436.cactusADM in my run.  I looked into the code: for function
bench_staggeredleapfrog2_ (which takes 99% of the running time after
patching), IVOPT chooses one fewer candidate for the outer loop, but that
results in a couple more instructions there.  For this case, register
pressure is the more interesting issue (36 candidates chosen in the outer
loop, many stack accesses).  I'm not sure whether this 1% regression
blocks the patch at this stage or not?

Thanks,
bin


Re: [RFA][PATCH] Adding missing calls to bitmap_clear

2016-03-22 Thread Richard Biener
On Mon, Mar 21, 2016 at 9:32 PM, Jeff Law  wrote:
> On 03/21/2016 01:10 PM, Bernd Schmidt wrote:
>>
>> On 03/21/2016 08:06 PM, Jeff Law wrote:
>>>
>>>
>>> As noted last week, find_removable_extensions initializes several
>>> bitmaps, but doesn't clear them.
>>>
>>> This is not strictly a leak as the GC system should find dead data, but
>>> it's better to go ahead and clear the bitmaps.  That releases the
>>> elements back to the cache and presumably makes things easier for the GC
>>> system as well.
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu.
>>>
>>> OK for the trunk?
>>
>>
>> Looks like they don't leak anywhere, so ok. Probably ok even to install
>> it now but maybe stage1 would be better timing.
>
> I don't mind waiting for the next stage1, this is a pretty minor issue.

It's ok at this stage as it will also fix -fmem-report.  Please also move
the thing back to heap, see below.

Btw we should disallow bitmap_initialize (&x, NULL) as it does not do
the same thing as BITMAP_ALLOC (NULL), it does the same thing
as BITMAP_ALLOC_GC ().  Thus I'd rather have a bitmap_initialize_gc (&x)
and a bitmap_initialize (&x, NULL) that ends up using the global
bitmap obstack.  No idea where REE came from history wise.

A grep shows only

ira.c:  bitmap_initialize (&seen_insns, NULL);
ree.c:  bitmap_initialize (&init, NULL);
ree.c:  bitmap_initialize (&kill, NULL);
ree.c:  bitmap_initialize (&gen, NULL);
ree.c:  bitmap_initialize (&tmp, NULL);

btw, so please consider simply changing bitmap_initialize behavior.  The IRA
use should also use the global bitmap obstack, as the surrounding code
uses BITMAP_ALLOC (NULL).  [use a default arg for 'obstack' if possible;
you have to verify it works with/without --enable-gather-detailed-mem-stats]
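
To make the suggestion above concrete, here is a minimal sketch of the proposed split. It is a toy model under stated assumptions: toy_obstack, toy_bitmap, toy_default_obstack and the toy_* functions are hypothetical names for illustration, not the declarations in GCC's bitmap.h, and no real allocation is modeled.

```cpp
// Toy model only: these types and functions are hypothetical stand-ins,
// not the declarations in GCC's bitmap.h.
struct toy_obstack { const char *name; };

// Stand-in for the global bitmap obstack.
static toy_obstack toy_default_obstack = { "default" };

struct toy_bitmap
{
  toy_obstack *obstack;  // heap obstack backing the elements, or nullptr
  bool gc_allocated;     // true when elements would come from GC memory
};

// Proposed behavior: a null (or omitted) obstack selects the global
// obstack, mirroring what BITMAP_ALLOC (NULL) is described as doing.
inline void
toy_bitmap_initialize (toy_bitmap *b, toy_obstack *obstack = nullptr)
{
  b->obstack = obstack ? obstack : &toy_default_obstack;
  b->gc_allocated = false;
}

// GC-backed bitmaps get their own, explicitly named entry point.
inline void
toy_bitmap_initialize_gc (toy_bitmap *b)
{
  b->obstack = nullptr;
  b->gc_allocated = true;
}
```

With this shape, a call such as toy_bitmap_initialize (&x) can no longer silently pick GC memory; whether the real change interacts cleanly with --enable-gather-detailed-mem-stats would still need checking, as noted above.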

Thanks,
Richard.

> jeff


Re: Fix 70278 (LRA split_regs followup patch)

2016-03-22 Thread Christophe Lyon
On 18 March 2016 at 17:51, Jeff Law  wrote:
> On 03/18/2016 06:25 AM, Bernd Schmidt wrote:
>>
>> This fixes an oversight in my previous patch here. I used biggest_mode
>> in the assumption that if the reg was used in the function, it would be
>> set to something other than VOIDmode, but that fails if we have a
>> multiword access - only the first hard reg gets its biggest_mode
>> assigned in that case.
>>
>> Bootstrapped and tested on x86_64-linux, ran (just) the new arm testcase
>> manually with arm-eabi. Ok?
>>
>> (The testcase seems to be from glibc. Do we keep the copyright notices
>> on the reduced form)?
>
> I don't recall specific guidance on including the copyright notice on a
> reduced/derived test.
>
> Given the actual copyright on the original code, ISTM the safest thing to do
> is keep the notice intact.
>
> A long long time ago I received guidance from the FSF WRT what could be
> included in the testsuite -- unfortunately I didn't keep that message. I
> probably should have.
>
>>
>> Bernd
>>
>> 70278.diff
>>
>>
>> PR rtl-optimization/70278
>> * lra-constraints.c (split_reg): Handle the case where
>> biggest_mode is VOIDmode.
>>
>> testsuite/
>> * gcc.dg/torture/pr70278.c: New test.
>> * gcc.target/arm/pr70278.c: New test.
>

The ARM test isn't sufficiently protected against non-compliant configurations,
and fails if GCC is configured for arm*linux-gnueabihf for instance
(see 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/234342/report-build-info.html)

The attached small patch fixes that by requiring arm_arch_v4t_multilib
effective target.

I used arm_arch_v4t_multilib instead of arm_arch_v4t because, as I
reported a long time ago, the latter does not complain in some
unsupported configurations because the sample effective-target test
does not contain actual code. In particular, it's not sufficient to
reject thumb-1 with hard-float.

OK?

Thanks,

Christophe.

> OK.
> jeff
>
2016-03-22  Christophe Lyon  

* gcc.target/arm/pr70278.c: Require arm_arch_v4t_multilib
effective target.
diff --git a/gcc/testsuite/gcc.target/arm/pr70278.c b/gcc/testsuite/gcc.target/arm/pr70278.c
index c44c07b..889f626 100644
--- a/gcc/testsuite/gcc.target/arm/pr70278.c
+++ b/gcc/testsuite/gcc.target/arm/pr70278.c
@@ -2,6 +2,7 @@
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv4t" } } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
 /* { dg-options "-mthumb" } */
+/* { dg-require-effective-target arm_arch_v4t_multilib } */
 /* { dg-add-options arm_arch_v4t } */
 /*
  * 


Re: [PATCH, i386, AVX-512] Fix PR target/70325.

2016-03-22 Thread Kirill Yukhin
Hi Uroš.
On 22 Mar 09:19, Uros Bizjak wrote:
> OK with a suitable comment describing the reason for special handling.
Thanks!

> BTW: Looking through the builtins, I noticed that some builtin
> descriptions contain duplicated flags (please see attached
> pseudo-patch). Looks like typos to me, but please review this
> situation, if everything is OK.
This is definitely a typo. Will fix as obvious.
I did a few greps, and the cases you caught are the only ones.
> 
> Uros.

--
Thanks, K


Re: [PATCH, i386, AVX-512] Fix PR target/70325.

2016-03-22 Thread Uros Bizjak
On Mon, Mar 21, 2016 at 3:00 PM, Kirill Yukhin  wrote:
> Hello,
> Each 1 in the mask in i386.c/builtin_description enables the
> built-in for the corresponding ISA bit.
> So, if there are two 1s in it, either bit being set
> enables the built-in.
>
> AVX-512VL uses the mask in the opposite way.
> E.g.:
>   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL,
>     CODE_FOR_avx512vl_loaddquv16hi_mask, "__builtin_ia32_loaddquhi256_mask",
>     IX86_BUILTIN_LOADDQUHI256_MASK, UNKNOWN,
>     (int) V16HI_FTYPE_PCV16HI_V16HI_UHI },
>
> This means that the built-in is enabled only if *both* bits are set to 1.
>
> So, I've added special handling for OPTION_MASK_ISA_AVX512VL
> into i386.c/def_builtin.
>
> Bootstrapped and regtested.
>
> Richard,
> is it ok for main trunk?
>
> PR target/70325
> gcc/
> * config/i386/i386.c (def_builtin): Handle
> OPTION_MASK_ISA_AVX512VL to be and-ed with other
> bits.
> gcc/testsuite/
> * gcc.target/i386/pr70325.c: New test.

OK with a suitable comment describing the reason for special handling.

BTW: Looking through the builtins, I noticed that some builtin
descriptions contain duplicated flags (please see attached
pseudo-patch). Looks like typos to me, but please review this
situation, if everything is OK.
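
For readers following the mask discussion, here is a minimal sketch of the semantics described in the quoted mail: ordinarily any set bit enables a built-in, while an entry that includes AVX512VL needs that bit in addition to the others. This is an illustration under assumptions, not the code committed to def_builtin: the TOY_ISA_* values and toy_builtin_enabled are hypothetical, and the real OPTION_MASK_ISA_* constants have different values.

```cpp
#include <cstdint>

// Hypothetical bit values for illustration; the real OPTION_MASK_ISA_*
// constants are generated by GCC and differ from these.
constexpr std::uint64_t TOY_ISA_AVX512BW   = 1u << 0;
constexpr std::uint64_t TOY_ISA_AVX512VL   = 1u << 1;
constexpr std::uint64_t TOY_ISA_AVX512VBMI = 1u << 2;

// OR-ing the same flag in twice does not change the mask, which is why
// the duplicated flags in the table are harmless typos.
static_assert ((TOY_ISA_AVX512BW | TOY_ISA_AVX512BW | TOY_ISA_AVX512VL)
               == (TOY_ISA_AVX512BW | TOY_ISA_AVX512VL),
               "duplicate flags collapse under bitwise OR");

// Returns true if a built-in described by MASK would be enabled for the
// given ISA_FLAGS under the semantics sketched above.
inline bool
toy_builtin_enabled (std::uint64_t mask, std::uint64_t isa_flags)
{
  const std::uint64_t others = mask & ~TOY_ISA_AVX512VL;
  if (mask & TOY_ISA_AVX512VL)
    {
      // AVX512VL is and-ed with the rest: it must be set itself...
      if (!(isa_flags & TOY_ISA_AVX512VL))
        return false;
      // ...and, if other ISA bits are listed, at least one of them too.
      return others == 0 || (isa_flags & others) != 0;
    }
  // Ordinary entries: any listed bit being set is enough.
  return (isa_flags & mask) != 0;
}
```

Under this model, an entry with mask BW | VL is rejected when only BW (or only VL) is enabled, which is the case the special handling is meant to cover.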

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3d8dbc4..bce0c8b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -34094,9 +34094,9 @@ static const struct builtin_description bdesc_args[] =
   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_permvarv16hi_mask, "__builtin_ia32_permvarhi256_mask", 
IX86_BUILTIN_VPERMVARHI256_MASK, UNKNOWN, (int) 
V16HI_FTYPE_V16HI_V16HI_V16HI_UHI },
   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_permvarv8hi_mask, "__builtin_ia32_permvarhi128_mask", 
IX86_BUILTIN_VPERMVARHI128_MASK, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI },
   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermt2varv16hi3_mask, "__builtin_ia32_vpermt2varhi256_mask", 
IX86_BUILTIN_VPERMT2VARHI256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI 
},
-  { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512BW | 
OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_vpermt2varv16hi3_maskz, 
"__builtin_ia32_vpermt2varhi256_maskz", IX86_BUILTIN_VPERMT2VARHI256_MASKZ, 
UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI },
+  { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermt2varv16hi3_maskz, 
"__builtin_ia32_vpermt2varhi256_maskz", IX86_BUILTIN_VPERMT2VARHI256_MASKZ, 
UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI },
   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermt2varv8hi3_mask, "__builtin_ia32_vpermt2varhi128_mask", 
IX86_BUILTIN_VPERMT2VARHI128, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI },
-  { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512BW | 
OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_vpermt2varv8hi3_maskz, 
"__builtin_ia32_vpermt2varhi128_maskz", IX86_BUILTIN_VPERMT2VARHI128_MASKZ, 
UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI },
+  { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermt2varv8hi3_maskz, 
"__builtin_ia32_vpermt2varhi128_maskz", IX86_BUILTIN_VPERMT2VARHI128_MASKZ, 
UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI },
   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermi2varv16hi3_mask, "__builtin_ia32_vpermi2varhi256_mask", 
IX86_BUILTIN_VPERMI2VARHI256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI 
},
   { OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermi2varv8hi3_mask, "__builtin_ia32_vpermi2varhi128_mask", 
IX86_BUILTIN_VPERMI2VARHI128, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI },
   { OPTION_MASK_ISA_AVX512VL, CODE_FOR_rcp14v4df_mask, 
"__builtin_ia32_rcp14pd256_mask", IX86_BUILTIN_RCP14PD256, UNKNOWN, (int) 
V4DF_FTYPE_V4DF_V4DF_UQI },
@@ -34811,9 +34811,9 @@ static const struct builtin_description bdesc_args[] =
   { OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_permvarv32qi_mask, "__builtin_ia32_permvarqi256_mask", 
IX86_BUILTIN_VPERMVARQI256_MASK, UNKNOWN, (int) 
V32QI_FTYPE_V32QI_V32QI_V32QI_USI },
   { OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_permvarv16qi_mask, "__builtin_ia32_permvarqi128_mask", 
IX86_BUILTIN_VPERMVARQI128_MASK, UNKNOWN, (int) 
V16QI_FTYPE_V16QI_V16QI_V16QI_UHI },
   { OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl_vpermt2varv32qi3_mask, "__builtin_ia32_vpermt2varqi256_mask", 
IX86_BUILTIN_VPERMT2VARQI256, UNKNOWN, (int) V32QI_FTYPE_V32QI_V32QI_V32QI_USI 
},
-  { OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VBMI | 
OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_vpermt2varv32qi3_maskz, 
"__builtin_ia32_vpermt2varqi256_maskz", IX86_BUILTIN_VPERMT2VARQI256_MASKZ, 
UNKNOWN, (int) V32QI_FTYPE_V32QI_V32QI_V32QI_USI },
+  { OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VL, 
CODE_FOR_avx512vl

Re: [PATCH] Fix V64QImode multiplication with AVX512BW (PR target/70329)

2016-03-22 Thread Kirill Yukhin

Hi Jakub!
On 21 Mar 21:16, Jakub Jelinek wrote:
> The ix86_expand_vecop_qihi function has been adjusted for AVX512* just
> by changing i < 32 to i < 64 (where both were sometimes wasteful), but
> for !full_interleave that is even wrong, swapping the second and third
> quarter is something that works to undo AVX256 unpacks only,
> where we want
> 0,2,4,6,8,10,12,14,32,34,36,38,40,42,44,46,16,18,20,22,24,26,28,30,48,50,52,54,56,58,60,62,
> permutation.  But, for AVX512 we want
> 0,2,4,6,8,10,12,14,64,66,68,70,72,74,76,78,16,18,20,22,24,26,28,30,80,82,84,86,88,90,92,94,32,34,36,38,40,42,44,46,96,98,100,102,104,106,108,110,48,50,52,54,56,58,60,62,112,114,116,118,120,122,124,126
> where the current trunk code has been producing
> 0,2,4,6,8,10,12,14,32,34,36,38,40,42,44,46,16,18,20,22,24,26,28,30,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,96,98,100,102,104,106,108,110,80,82,84,86,88,90,92,94,112,114,116,118,120,122,124,126
> instead.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Your patch is OK.
I'd only suggest adding a comment to this calculation:
+   d.perm[i] = ((i * 2) & 14) + ((i & 8) ? d.nelt : 0) + (i & ~15);
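
To show what such a comment could explain, here is a small standalone sketch that evaluates the expression above for every index. The helper name toy_even_extract_perm is hypothetical, and nelt stands in for d.nelt; indices of nelt or more are assumed to select from the second permutation operand.

```cpp
#include <vector>

// Hypothetical helper: evaluates the quoted expression for each index,
// with nelt standing in for d.nelt.
inline std::vector<unsigned>
toy_even_extract_perm (unsigned nelt)
{
  std::vector<unsigned> perm (nelt);
  for (unsigned i = 0; i < nelt; ++i)
    perm[i] = ((i * 2) & 14)          // even position 0..14 within a group of 8
              + ((i & 8) ? nelt : 0)  // odd groups of 8 read the second operand
              + (i & ~15u);           // base offset of the 16-element block
  return perm;
}
```

For nelt == 64 this yields the per-block interleaving 0,2,...,14,64,66,...,78,16,... that Jakub lists as the wanted AVX512 permutation, and for nelt == 32 it yields the 0,2,...,14,32,34,...,46,16,... sequence given for the 256-bit case.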

--
Thanks, K


[PING**2] [PATCH, libstdc++] Add missing free-standing headers to install rule

2016-03-22 Thread Bernd Edlinger
Hi,

I am pinging for this patch, which addresses an admittedly minor regression
for free-standing libstdc++ due to changed c++11 default settings.  The proposed
patch only changes the free-standing install rule, and therefore has no impact
on other configurations.

https://gcc.gnu.org/ml/libstdc++/2016-03/msg4.html


Thanks
Bernd.