Re: [patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Janne Blomqvist
On Thu, Nov 17, 2016 at 9:41 AM, Thomas Koenig  wrote:
> Am 17.11.2016 um 00:20 schrieb Jakub Jelinek:
>>
>> On Thu, Nov 17, 2016 at 12:03:18AM +0100, Thomas Koenig wrote:

>>>> Don't you need to test in configure if the assembler supports AVX?
>>>> Otherwise if somebody is bootstrapping gcc with older assembler, it will
>>>> just fail to bootstrap.
>>>
>>>
>>> That's a good point.  The AVX instructions were added in binutils 2.19,
>>> which was released in 2011. This could be put in the prerequisites.
>>>
>>> What should the test do?  Fail with an error message "you need newer
>>> binutils" or simply (and silently) not compile the AVX version?
>>
>>
>> From what I understood, you want those functions just to be implementation
>> details, not exported from libgfortran.so*.  Thus the test would do
>> something similar to what gcc/testsuite/lib/target-supports.exp
>> (check_effective_target_avx)
>> does, but of course in autoconf way, not in tcl.
>
>
> OK, that looks straightforward enough. I'll give it a shot.
>
>> Also, from what I see, target_clones just use IFUNCs, so you probably also
>> need some configure test whether ifuncs are supported (the
>> gcc.target/i386/mvc* tests use dg-require-ifunc, so you'd need something
>> similar again in configure).  But if so, then I have no idea why you use
>> a wrapper around the function, instead of using it on the exported APIs.
>
>
> As you wrote above, I wanted this as an implementation detail. I also
> wanted the ability to be able to add new instruction sets without
> breaking the ABI.
>
> Because the caller generates the ifunc, using a wrapper function seemed
> like the best way to do it.  The overhead is negligible (the function
> is one simple jump), especially considering that we only call the
> library function for larger matrices.
>
>>>> For matmul_i*, wouldn't it make more sense to use avx2 instead of avx,
>>>> or both avx and avx2 and maybe avx512f?
>>>
>>>
>>> I did a vdiff of the disassembled code generated for avx and avx2, and
>>> (somewhat to my surprise) there was no difference.  Maybe, with more
>>> unrolling, something more might have happened. I didn't check for
>>> AVX512f, but I can do that.
>>
>>
>> For the float/double code it wouldn't surprise me (assuming you don't need
>> gather insns and similar stuff).  But for integers generally most of the
>> avx instructions can only handle 128-bit vectors, while avx2 has 256-bit
>> ones,
>
>
> You're right - integer multiplication looks different.
>
> Nobody I know cares about integer matrix multiplication
> speed, whereas real has gotten a _lot_ of attention over
> the decades.  So, putting in AVX will make the code run
> faster on more machines, while putting in AVX2 will
> (IMHO) bloat the library for no good reason.  However,
> I am willing to stand corrected on this. Putting in AVX512f
> makes sense.
>
> I have also been trying to get target_clones to work on POWER
> to get Altivec instructions, but to no avail. I also cannot
> find any examples in the testsuite.
>
> Since a lot of supercomputers use POWER nodes, that might also
> be attractive.
>
> Regards
>
> Thomas

Hi,

In order to reduce bloat, might it make sense to make the core blocked
gemm algorithm that Jerry committed a few days ago into a separate
static function, and then only do the target_clone stuff for that one?
The rest of the matmul function deals with all kinds of stuff like
setup, handling non-stride-1 cases, calling the external gemm function
for -fexternal-blas etc., none of which vectorizes anyway so
generating different versions of this code using different vector
instructions looks like a waste?

In that case I guess one could add the avx2 variant as well on the odd
chance that somebody for some reason cares about integer matmul.

-- 
Janne Blomqvist


Re: [patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Thomas Koenig

Am 17.11.2016 um 00:20 schrieb Jakub Jelinek:
>
> On Thu, Nov 17, 2016 at 12:03:18AM +0100, Thomas Koenig wrote:
>>>
>>> Don't you need to test in configure if the assembler supports AVX?
>>> Otherwise if somebody is bootstrapping gcc with older assembler, it will
>>> just fail to bootstrap.
>>
>>
>> That's a good point.  The AVX instructions were added in binutils 2.19,
>> which was released in 2011. This could be put in the prerequisites.
>>
>> What should the test do?  Fail with an error message "you need newer
>> binutils" or simply (and silently) not compile the AVX version?
>
>
> From what I understood, you want those functions just to be implementation
> details, not exported from libgfortran.so*.  Thus the test would do
> something similar to what gcc/testsuite/lib/target-supports.exp
> (check_effective_target_avx)
> does, but of course in autoconf way, not in tcl.


OK, that looks straightforward enough. I'll give it a shot.


> Also, from what I see, target_clones just use IFUNCs, so you probably also
> need some configure test whether ifuncs are supported (the
> gcc.target/i386/mvc* tests use dg-require-ifunc, so you'd need something
> similar again in configure).  But if so, then I have no idea why you use
> a wrapper around the function, instead of using it on the exported APIs.


As you wrote above, I wanted this as an implementation detail. I also
wanted the ability to be able to add new instruction sets without
breaking the ABI.

Because the caller generates the ifunc, using a wrapper function seemed
like the best way to do it.  The overhead is negligible (the function
is one simple jump), especially considering that we only call the
library function for larger matrices.
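
For illustration, a minimal sketch of the wrapper arrangement described
above: a static work function carrying target_clones, and a plain exported
wrapper that calls it.  This is not the code from the patch.  The names
matmul_work and matmul_demo are hypothetical, the kernel is a naive triple
loop rather than the library's algorithm, and the guard macro assumes GCC
on x86-64 with glibc (for ifunc support); elsewhere the attribute is
dropped and a single default version is built.

```c
#include <assert.h>   /* on glibc this also makes __GLIBC__ visible */
#include <stddef.h>

/* Assumption: request clones only where GCC, x86-64 and glibc ifuncs are
   all available; otherwise compile one ordinary version.  */
#if defined (__GNUC__) && !defined (__clang__) && defined (__x86_64__) \
    && defined (__GLIBC__)
# define MATMUL_CLONES __attribute__ ((target_clones ("avx", "default")))
#else
# define MATMUL_CLONES
#endif

/* Hypothetical work function: c = a * b for n x n row-major matrices.
   Being static, neither it nor its clones are exported.  */
MATMUL_CLONES
static void
matmul_work (double *c, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < n * n; i++)
    c[i] = 0.0;
  for (size_t i = 0; i < n; i++)
    for (size_t k = 0; k < n; k++)
      for (size_t j = 0; j < n; j++)
        c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Exported wrapper: the only symbol callers see, so further instruction
   sets can be added to the clone list without touching the ABI.  */
void
matmul_demo (double *c, const double *a, const double *b, size_t n)
{
  assert (c != NULL && a != NULL && b != NULL);
  matmul_work (c, a, b, n);
}
```

Callers only ever reference matmul_demo; which clone of matmul_work runs
is decided at load time by the resolver the compiler generates.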


>>> For matmul_i*, wouldn't it make more sense to use avx2 instead of avx,
>>> or both avx and avx2 and maybe avx512f?
>>
>>
>> I did a vdiff of the disassembled code generated for avx and avx2, and
>> (somewhat to my surprise) there was no difference.  Maybe, with more
>> unrolling, something more might have happened. I didn't check for
>> AVX512f, but I can do that.
>
>
> For the float/double code it wouldn't surprise me (assuming you don't need
> gather insns and similar stuff).  But for integers generally most of the
> avx instructions can only handle 128-bit vectors, while avx2 has 256-bit
> ones,


You're right - integer multiplication looks different.

Nobody I know cares about integer matrix multiplication
speed, whereas real has gotten a _lot_ of attention over
the decades.  So, putting in AVX will make the code run
faster on more machines, while putting in AVX2 will
(IMHO) bloat the library for no good reason.  However,
I am willing to stand corrected on this. Putting in AVX512f
makes sense.

I have also been trying to get target_clones to work on POWER
to get Altivec instructions, but to no avail. I also cannot
find any examples in the testsuite.

Since a lot of supercomputers use POWER nodes, that might also
be attractive.

Regards

Thomas


Re: PR78319

2016-11-16 Thread Prathamesh Kulkarni
On 17 November 2016 at 03:20, Jeff Law  wrote:
> On 11/16/2016 01:23 PM, Prathamesh Kulkarni wrote:
>>
>> Hi,
>> As discussed in PR, this patch marks the test-case to xfail on
>> arm-none-eabi.
>> OK to commit ?
>
> You might check if Aldy's change to the uninit code helps your case
> (approved earlier today, so hopefully in the tree very soon).  I quickly
> scanned the BZ.  There's some overlap, but it might be too complex for
> Aldy's enhancements to catch.
Hi Jeff,
I tried Aldy's patch [1], but it didn't catch the case in PR78319.

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00225.html

Thanks,
Prathamesh
>
> jeff


Re: Fix PR78154

2016-11-16 Thread Jeff Law

On 11/16/2016 05:17 PM, Martin Sebor wrote:



(I've heard some noise in C++-land about making memcpy(0,0,0) valid, but
that may have just been noise)


We may have read the same discussion.  It would make some things
a little easier in C++ (and remove what most people view as yet
another unnecessary gotcha in the language).

And that may be a reasonable thing to do.

While GCC does take advantage of the non-null attribute when trying to 
prove certain pointers must be non-null, it only does so when the magic 
flag is turned on.  There was a sense that it was too aggressive and 
that time may be necessary for folks to come to terms with what GCC was 
doing, particularly in the memcpy (*, *, 0) case -- but I've never 
gotten the sense that happened and we've never turned that flag on by 
default.


jeff



Re: [PATCH] Fix PR77848

2016-11-16 Thread Bill Schmidt
On 11/16/16 9:08 AM, Richard Biener wrote:

> On Tue, Nov 15, 2016 at 9:03 PM, Bill Schmidt
>  wrote:
>> -  if ((any_pred_load_store || any_complicated_phi)
>> -  && !version_loop_for_if_conversion (loop))
>> +  /* Since we have no cost model, always version loops if vectorization
>> + is enabled.  Either version this loop, or if the pattern is right
>> + for outer-loop vectorization, version the outer loop.  In the
>> + latter case we will still if-convert the original inner loop.  */
>> +  /* FIXME: When SLP vectorization can handle if-conversion on its own,
>> + predicate all of if-conversion on flag_tree_loop_vectorize.  */
>> +  if ((any_pred_load_store || any_complicated_phi || 
>> flag_tree_loop_vectorize)
> I'd say given fun->has_force_vectorize_loops this should be
>
>if (flag_tree_loop_if_convert != 1
>> +  && !version_loop_for_if_conversion
>> +  (versionable_outer_loop_p (loop_outer (loop))
>> +   ? loop_outer (loop) : loop))
>>  goto cleanup;
> and thus always version if the user didn't specify -ftree-loop-if-convert
> (-ftree-loop-if-convert-stores is dead, I forgot to remove uses).
>
> Can you as a first patch (after fixing the minor things above) commit
> the patch w/o changing the condition under which we version
> (but _do_ version the outer loop if possible?).  This should be a strict
> improvement enabling more outer loop vectorization.

Done and committed.  Thanks!

>
> The 2nd patch can then fix the PR and change the condition.
>
> Thus, ok with the nits fixed and the condition unchanged.
>
> Can you re-test the 2nd part with my suggested changed predicate?

Yes, the new predicate works fine.  New patch below, bootstrapped and tested
on powerpc64le-unknown-linux-gnu, with only the bb-slp-cond-1.c regressions
previously discussed.  Is this ok for trunk?

Thanks,
Bill

>
> Thanks,
> Richard.
>
[gcc]

2016-11-16  Bill Schmidt  
Richard Biener  

PR tree-optimization/77848
* tree-if-conv.c (tree_if_conversion): Always version loops unless
the user specified -ftree-loop-if-convert.

[gcc/testsuite]

2016-11-16  Bill Schmidt  
Richard Biener  

PR tree-optimization/77848
* gfortran.dg/vect/pr77848.f: New test.


Index: gcc/testsuite/gfortran.dg/vect/pr77848.f
===
--- gcc/testsuite/gfortran.dg/vect/pr77848.f(revision 0)
+++ gcc/testsuite/gfortran.dg/vect/pr77848.f(working copy)
@@ -0,0 +1,24 @@
+! PR 77848: Verify versioning is on when vectorization fails
+! { dg-do compile }
+! { dg-options "-O3 -ffast-math -fdump-tree-ifcvt -fdump-tree-vect-details" }
+
+  subroutine sub(x,a,n,m)
+  implicit none
+  real*8 x(*),a(*),atemp
+  integer i,j,k,m,n
+  real*8 s,t,u,v
+  do j=1,m
+ atemp=0.d0
+ do i=1,n
+if (abs(a(i)).gt.atemp) then
+   atemp=a(i)
+   k = i
+end if
+ enddo
+ call dummy(atemp,k)
+  enddo
+  return
+  end
+
+! { dg-final { scan-tree-dump "LOOP_VECTORIZED" "ifcvt" } }
+! { dg-final { scan-tree-dump "vectorized 0 loops in function" "vect" } }
Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 242521)
+++ gcc/tree-if-conv.c  (working copy)
@@ -2803,10 +2803,12 @@ tree_if_conversion (struct loop *loop)
  || loop->dont_vectorize))
 goto cleanup;
 
-  /* Either version this loop, or if the pattern is right for outer-loop
- vectorization, version the outer loop.  In the latter case we will
- still if-convert the original inner loop.  */
-  if ((any_pred_load_store || any_complicated_phi)
+  /* Since we have no cost model, always version loops unless the user
+ specified -ftree-loop-if-convert.  Either version this loop, or if
+ the pattern is right for outer-loop vectorization, version the
+ outer loop.  In the latter case we will still if-convert the
+ original inner loop.  */
+  if (flag_tree_loop_if_convert != 1
   && !version_loop_for_if_conversion
   (versionable_outer_loop_p (loop_outer (loop))
? loop_outer (loop) : loop))



Re: [PATCH,gcc/MIPS] Make loongson3a use fused madd.d

2016-11-16 Thread Paul Hua
ping...

On Thu, Nov 3, 2016 at 7:58 PM, Paul Hua  wrote:
> Hi Matthew,
>
> Thanks for your comments; I have updated the patch.
>
> *** gcc/ChangeLog ***
>
> 2016-11-03 Chenghua Xu 
>
> * config/mips/mips.h (ISA_HAS_FUSED_MADD4): Enable for
> TARGET_LOONGSON_3A.
> (ISA_HAS_UNFUSED_MADD4): Exclude TARGET_LOONGSON_3A.
>
> Thanks,
> Paul
>
> On Thu, Nov 3, 2016 at 6:31 PM, Matthew Fortune
>  wrote:
>> Paul Hua  writes:
>>> Loongson3a has a 4 operand fused madd instruction. This patch makes
>>> loongson3a use fused madd.d.
>>
>> Hi Paul,
>>
>> Thanks for the fix. I was vaguely aware that this was wrong for
>> loongson-3a but never confirmed it.
>>
>> I suspect this change is mechanical enough that it can bypass
>> copyright assignment but I'd need a global maintainer to comment.
>>
>> I've sent you copyright assignment paperwork separately.
>>
>> Two comments on the patch:
>>
>>> ChangeLog :
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2016-11-03 Chenghua Xu 
>>>
>>> config/mips/
>>> * mips.h: Set loongson3a use fused madd.d.
>>
>> The changelog needs to reference what was changed rather than the
>> effect of the change:
>>
>> * config/mips/mips.h (ISA_HAS_FUSED_MADD4): Enable for
>> TARGET_LOONGSON_3A.
>> (ISA_HAS_UNFUSED_MADD4): Exclude TARGET_LOONGSON_3A.
>>
>>
>>>diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
>>>index 81862a9..5076a2b 100644
>>>--- a/gcc/config/mips/mips.h
>>>+++ b/gcc/config/mips/mips.h
>>>@@ -1056,11 +1056,11 @@ struct mips_cpu_info {
>>>
>>> /* ISA has 4 operand fused madd instructions of the form
>>>'d = [+-] (a * b [+-] c)'.  */
>>>-#define ISA_HAS_FUSED_MADD4   TARGET_MIPS8000
>>>+#define ISA_HAS_FUSED_MADD4   (TARGET_MIPS8000 || TARGET_LOONGSON_3A)
>>>
>>> /* ISA has 4 operand unfused madd instructions of the form
>>>'d = [+-] (a * b [+-] c)'.  */
>>>-#define ISA_HAS_UNFUSED_MADD4 (ISA_HAS_FP4 && !TARGET_MIPS8000)
>>>+#define ISA_HAS_UNFUSED_MADD4 (ISA_HAS_FP4 && !TARGET_MIPS8000 && 
>>>!TARGET_LOONGSON_3A)
>>
>> Please split this line and move && !TARGET_LOONGSON_3A to the next line
>> under ISA_HAS_FP4.
>>
>>>
>>> /* ISA has 3 operand r6 fused madd instructions of the form
>>>'c = c [+-] (a * b)'.  */
>>
>> Thanks,
>> Matthew
>>


Re: c-family PATCH to tidy switch diagnostics (PR c/78285)

2016-11-16 Thread Joseph Myers
On Wed, 16 Nov 2016, Marek Polacek wrote:

> As pointed out in Bug 78285, some error calls should actually be inform calls.
> I'm not adding any new test; existing switch-5.c covers all the cases so I
> didn't see much value in duplicating that part of the test.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix NetBSD bootstrap

2016-11-16 Thread Joseph Myers
I'll presume you know best about the choices of stdint.h types.  You may 
wish to consider what the correct value of use_gcc_stdint is - the 
default "none" (rely on the system's header), or "wrap" (use GCC's header 
in freestanding mode) or "provide" (always use GCC's header).

Note that GCC's header includes support for TS 18661-1 integer width 
macros, and the testsuite verifies these work in freestanding mode.  So if 
you use "none" but your system's header lacks support for these macros, 
you'll have test failures.
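
As a rough way to see whether a given header provides the integer width
macros mentioned above, something like the following sketch can be used.
It assumes the TS 18661-1 feature-test macro
__STDC_WANT_IEC_60559_BFP_EXT__ is defined before the first header is
included; the helper name int32_width_bits is hypothetical.

```c
/* Request the TS 18661-1 additions before including any header.  */
#define __STDC_WANT_IEC_60559_BFP_EXT__ 1
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: width of int32_t in bits.  Uses INT32_WIDTH when
   the header in use provides it, and falls back to sizeof otherwise.  */
int
int32_width_bits (void)
{
#ifdef INT32_WIDTH
  /* int32_t has no padding bits, so macro and sizeof must agree.  */
  assert (INT32_WIDTH == (int) (sizeof (int32_t) * CHAR_BIT));
  return INT32_WIDTH;
#else
  return (int) (sizeof (int32_t) * CHAR_BIT);
#endif
}
```

If the #else branch is the one that gets compiled, the header in use does
not provide the width macros, which is the situation that would show up as
testsuite failures with use_gcc_stdint set to "none".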

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Fix PR78154

2016-11-16 Thread Martin Sebor

On 11/16/2016 05:17 PM, Martin Sebor wrote:

On 11/16/2016 02:21 PM, Marc Glisse wrote:

On Wed, 16 Nov 2016, Martin Sebor wrote:


On 11/16/2016 11:49 AM, Prathamesh Kulkarni wrote:

Hi Richard,
Following your suggestion in PR78154, the patch checks if stmt
contains call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
and returns true in that case.


Nice.  I think the list should also include mempcpy, stpcpy, and
stpncpy, and probably also the corresponding checking built-ins
such as __builtin___memcpy_chk.

FWIW, a more general solution to consider (possibly for GCC 8)
might be to extend attribute nonnull to apply to a function's return
value as well (e.g., use zero as the index for that), to indicate
that a pointer returned from it is not null.  That would let library
implementers annotate other functions (such as strerror)


We already have that, under the name returns_nonnull. IIRC, people found
a new name clearer than using position 0, when I posted the patch. Note
also that memcpy already has both an attribute that says that it returns
its first argument, and an attribute that says that said first argument
is nonnull.


Ah, right!  Thanks for the reminder!

__builtin_memcpy and __builtin_strcpy are both declared with
the same attribute list ATTR_RET1_NOTHROW_NONNULL_LEAF (as are
their checking versions) and that's defined like so:

  DEF_ATTR_TREE_LIST (ATTR_RET1_NOTHROW_NONNULL_LEAF, ATTR_FNSPEC,
ATTR_LIST_STR1, ATTR_NOTHROW_NONNULL_LEAF)

ATTR_NOTHROW_NONNULL_LEAF doesn't include returns_nonnull and neither
does ATTR_FNSPEC ATTR_LIST_STR1 (unless it's meant to imply that).

In any event, lookup_attribute ("returns_nonnull") fails for both
of these functions so I think the fix might be as "simple" as adding
the attribute.  Alternatively, if attribute fn spec str1 should imply
returns_nonnull when nonnull is set because it says (IIUC) that
the function returns the first (non-null) argument, the changes will
be a bit more involved and require adjusting other places besides
VRP (I see it used in fold-const.c as well similarly as in VRP).


I should have mentioned: when fully implemented, the test case
will pass even without VRP or without optimization.  A test for
the VRP bits will need to save the return value in a variable
and then use it (otherwise the check for memcpy(...) == 0 will
have already been optimized away by fold-const.c).
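
A sketch of the test shape described above, where the result of memcpy is
first stored in a variable and only then compared.  The function name
copy_returns_nonnull is hypothetical, and this is not the test case from
the PR; a real testsuite entry would also need dg- directives to check
that the comparison is actually removed.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the pointer returned by memcpy compares non-null.  */
int
copy_returns_nonnull (char *dst, const char *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  void *p = memcpy (dst, src, n);   /* keep the result in a variable ...  */
  return p != NULL;                 /* ... and only test it afterwards.  */
}
```

Written this way the comparison is not a direct memcpy(...) == 0
expression, so it is left for the later passes to reason about.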



(FWIW, I quoted "simple" above because it recently took me the better
part of an afternoon to figure out how to add attribute alloc_size to
malloc.)



(I've heard some noise in C++-land about making memcpy(0,0,0) valid, but
that may have just been noise)


We may have read the same discussion.  It would make some things
a little easier in C++ (and remove what most people view as yet
another unnecessary gotcha in the language).

Martin




Re: ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)

2016-11-16 Thread Marek Polacek
On Fri, Nov 04, 2016 at 05:16:00PM +0100, Jakub Jelinek wrote:
> On Fri, Nov 04, 2016 at 05:05:51PM +0100, Marek Polacek wrote:
> > This is a similar case to PR sanitizer/70342.  Here, we were generating 
> > expression
> > in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, 
> > then 
> > UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on.
> > 
> > On this testcase we were instrumenting CALL_EXPR that is in fact 
> > operator<<.  I
> > think those always return a reference, so it cannot be NULL, so there's no
> > point in instrumenting those?
> 
> How do you know what is the return type of a user defined overloaded
> operator?
> Even when it returns a reference, I thought the point of -fsanitize=null was
> for all references to verify their addresses are non-null.
> 
> I must say I don't understand the issue, if it is the same SAVE_EXPR in both
> lhs of COMPOUND_EXPR and UBSAN_NULL argument, shouldn't cp_genericize_r use
> of hash table to avoid walking the same tree multiple times avoid the
> exponential compile time/memory?

Sorry.  So consider the following:

class S
{
  virtual void foo () = 0;
};

struct T {
  T &operator<< (const char *s);
};

T t;

void
S::foo ()
{
  t << "a" << "b" << "c";
}

Before
1498   if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))
1499 ubsan_maybe_instrument_member_call (stmt, is_ctor);

the stmt will be

T::operator<< (T::operator<< (T::operator<< (&t, "a"), "b"), "c")

after ubsan_maybe_instrument_member_call it will be

T::operator<< (UBSAN_NULL (SAVE_EXPR , 4B, 0);, SAVE_EXPR ;, "c")

and that's what is saved into the hash table.  Then another stmt will be the
inner

T::operator<< (T::operator<< (&t, "a"), "b")

which we instrument and put into the hash table, and so on.  But those
SAVE_EXPRs aren't the same.  So we have a T::operator<< call that has nested
T::operator<< calls and we kind of recursively instrument all of them.

Not sure if I made this any clearer nor if this can be avoided... :(

Marek


Re: Fix PR78154

2016-11-16 Thread Martin Sebor

On 11/16/2016 02:21 PM, Marc Glisse wrote:

On Wed, 16 Nov 2016, Martin Sebor wrote:


On 11/16/2016 11:49 AM, Prathamesh Kulkarni wrote:

Hi Richard,
Following your suggestion in PR78154, the patch checks if stmt
contains call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
and returns true in that case.


Nice.  I think the list should also include mempcpy, stpcpy, and
stpncpy, and probably also the corresponding checking built-ins
such as __builtin___memcpy_chk.

FWIW, a more general solution to consider (possibly for GCC 8)
might be to extend attribute nonnull to apply to a function's return
value as well (e.g., use zero as the index for that), to indicate
that a pointer returned from it is not null.  That would let library
implementers annotate other functions (such as strerror)


We already have that, under the name returns_nonnull. IIRC, people found
a new name clearer than using position 0, when I posted the patch. Note
also that memcpy already has both an attribute that says that it returns
its first argument, and an attribute that says that said first argument
is nonnull.


Ah, right!  Thanks for the reminder!

__builtin_memcpy and __builtin_strcpy are both declared with
the same attribute list ATTR_RET1_NOTHROW_NONNULL_LEAF (as are
their checking versions) and that's defined like so:

  DEF_ATTR_TREE_LIST (ATTR_RET1_NOTHROW_NONNULL_LEAF, ATTR_FNSPEC,
ATTR_LIST_STR1, ATTR_NOTHROW_NONNULL_LEAF)

ATTR_NOTHROW_NONNULL_LEAF doesn't include returns_nonnull and neither
does ATTR_FNSPEC ATTR_LIST_STR1 (unless it's meant to imply that).

In any event, lookup_attribute ("returns_nonnull") fails for both
of these functions so I think the fix might be as "simple" as adding
the attribute.  Alternatively, if attribute fn spec str1 should imply
returns_nonnull when nonnull is set because it says (IIUC) that
the function returns the first (non-null) argument, the changes will
be a bit more involved and require adjusting other places besides
VRP (I see it used in fold-const.c as well similarly as in VRP).
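
For reference, this is roughly how the two attributes discussed above look
on an ordinary user function.  It is a sketch only: the helper fill and
the macro names are hypothetical, and nothing here reflects how the
built-in attribute lists in GCC are actually defined.

```c
#include <assert.h>
#include <stddef.h>

#if defined (__GNUC__)
# define RETURNS_NONNULL __attribute__ ((returns_nonnull))
# define NONNULL_ARG1 __attribute__ ((nonnull (1)))
#else
# define RETURNS_NONNULL
# define NONNULL_ARG1
#endif

/* Hypothetical memcpy-like helper: it returns its first argument, which
   callers must not pass as null, so the result is never null either.  */
RETURNS_NONNULL NONNULL_ARG1
char *
fill (char *dst, char value, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = value;
  return dst;
}

/* With returns_nonnull visible, the compiler may treat this comparison
   as always false.  */
int
fill_result_is_null (char *dst)
{
  return fill (dst, 'x', 1) == NULL;
}
```

The annotation only states a promise to the compiler; whether a given
comparison is actually removed depends on which passes consult it.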

(FWIW, I quoted "simple" above because it recently took me the better
part of an afternoon to figure out how to add attribute alloc_size to
malloc.)



(I've heard some noise in C++-land about making memcpy(0,0,0) valid, but
that may have just been noise)


We may have read the same discussion.  It would make some things
a little easier in C++ (and remove what most people view as yet
another unnecessary gotcha in the language).

Martin


Re: [patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Jakub Jelinek
On Thu, Nov 17, 2016 at 12:03:18AM +0100, Thomas Koenig wrote:
> >Don't you need to test in configure if the assembler supports AVX?
> >Otherwise if somebody is bootstrapping gcc with older assembler, it will
> >just fail to bootstrap.
> 
> That's a good point.  The AVX instructions were added in binutils 2.19,
> which was released in 2011. This could be put in the prerequisites.
> 
> What should the test do?  Fail with an error message "you need newer
> binutils" or simply (and silently) not compile the AVX version?

From what I understood, you want those functions just to be implementation
details, not exported from libgfortran.so*.  Thus the test would do
something similar to what gcc/testsuite/lib/target-supports.exp 
(check_effective_target_avx)
does, but of course in autoconf way, not in tcl.
Also, from what I see, target_clones just use IFUNCs, so you probably also
need some configure test whether ifuncs are supported (the
gcc.target/i386/mvc* tests use dg-require-ifunc, so you'd need something
similar again in configure).  But if so, then I have no idea why you use
a wrapper around the function, instead of using it on the exported APIs.

> >For matmul_i*, wouldn't it make more sense to use avx2 instead of avx,
> >or both avx and avx2 and maybe avx512f?
> 
> I did a vdiff of the disassembled code generated for avx and avx2, and
> (somewhat to my surprise) there was no difference.  Maybe, with more
> unrolling, something more might have happened. I didn't check for
> AVX512f, but I can do that.

For the float/double code it wouldn't surprise me (assuming you don't need
gather insns and similar stuff).  But for integers generally most of the
avx instructions can only handle 128-bit vectors, while avx2 has 256-bit
ones.

Jakub


Re: [PATCH] libiberty: Add Rust symbol demangling.

2016-11-16 Thread Mark Wielaard
On Wed, Nov 16, 2016 at 02:56:03PM -0800, Ian Lance Taylor wrote:
> On Wed, Nov 16, 2016 at 2:18 PM, David Tolnay  wrote:
> > FSF just confirmed that my assignment/disclaimer process has been
> > completed. Ian can you take a look at your list again?
> 
> Yes, you are good.  Thanks.

I rebased the patch on top of master (which trivially applied), rebuilt,
retested and pushed.
https://gcc.gnu.org/ml/gcc-cvs/2016-11/msg00798.html

I am really happy this is part of libiberty now. I will also sync it
with binutils so c++filt will be able to demangle rust symbols too.

Thanks,

Mark


Re: [PATCH] Fix combine's make_extraction (PR rtl-optimization/78378)

2016-11-16 Thread Segher Boessenkool
On Wed, Nov 16, 2016 at 10:07:23PM +0100, Jakub Jelinek wrote:
> If inner is a MEM, make_extraction requires that pos is a multiple of bytes
> and deals with offsetting it.  Or otherwise requires that pos is a multiple
> of BITS_PER_WORD and for REG inner it handles that too.  But if inner
> is something different, it calls just force_to_mode to the target mode,
> which only really works if pos is 0.
> 
> Thus the following patch restricts it to that case.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Yes, thanks!


Segher


Re: [patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Jerry DeLisle

On 11/16/2016 01:30 PM, Thomas Koenig wrote:
>
> Hello world,
>
> the attached patch adds an AVX-specific version of the matmul
> intrinsic to the Fortran library.  This works by using the target_clones
> attribute.
>
> For testing, I compiled this on powerpc64-unknown-linux-gnu,
> without any ill effects.
>
> Also, a resulting binary reached around 15 GFlops for larger matrices
> on a 3.4 GHz i7-2600 CPU.  I am currently building/regtesting on
> that machine. This can give another 40% speed increase for large
> matrices on AVX.
>
> OK for trunk?


Did you intend to name it avx_matmul and not aux_matmul?

Are the compiler flags for avx handled automatically by the gcc attributes so
there is no need to edit the Makefile.am?


Fix the first and if yes to the second question, OK

Jerry







Re: [patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Thomas Koenig

Am 16.11.2016 um 23:01 schrieb Jakub Jelinek:
>
> On Wed, Nov 16, 2016 at 10:30:03PM +0100, Thomas Koenig wrote:
>>
>> the attached patch adds an AVX-specific version of the matmul
>> intrinsic to the Fortran library.  This works by using the target_clones
>> attribute.
>
>
> Don't you need to test in configure if the assembler supports AVX?
> Otherwise if somebody is bootstrapping gcc with older assembler, it will
> just fail to bootstrap.


That's a good point.  The AVX instructions were added in binutils 2.19,
which was released in 2011. This could be put in the prerequisites.

What should the test do?  Fail with an error message "you need newer
binutils" or simply (and silently) not compile the AVX version?


> For matmul_i*, wouldn't it make more sense to use avx2 instead of avx,
> or both avx and avx2 and maybe avx512f?


I did a vdiff of the disassembled code generated for avx and avx2, and
(somewhat to my surprise) there was no difference.  Maybe, with more
unrolling, something more might have happened. I didn't check for
AVX512f, but I can do that.


>> 2016-11-16  Thomas Koenig  
>>
>> PR fortran/78379
>> * m4/matmul.m4:  For x86_64, make the work function for matmul
>
>
> Why the extra space before For?


Will be removed.


>> static with target_clones for AVX and default, and create
>> a wrapper function to call it.
>> * generated/matmul_c10.c
>
>
> Missing : Regenerated.


Will be added.

Regards

Thomas


Re: [PATCH] libiberty: Add Rust symbol demangling.

2016-11-16 Thread Ian Lance Taylor
On Wed, Nov 16, 2016 at 2:18 PM, David Tolnay  wrote:
> FSF just confirmed that my assignment/disclaimer process has been
> completed. Ian can you take a look at your list again?

Yes, you are good.  Thanks.

Ian


Re: RFA: PATCH to gengtype to avoid putting tree_node support in front end objects

2016-11-16 Thread Jason Merrill
On Wed, Nov 16, 2016 at 1:45 PM, Moore, Catherine
 wrote:
> /scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/hash-map.h:62:12: 
> error: no matching function for call to 'gt_ggc_mx(rtx_def*&)'
>
> I configured with --target=mips-sde-elf, but I do have some local multilib 
> definitions for that target.  This ought to reproduce with mti-elf as well.
> Will you please fix or revert?

Fixed thus, parallel to the declarations in tree.h.
commit 764bcd5c87000e3ddc85607674e68122fe986e51
Author: Jason Merrill 
Date:   Wed Nov 16 17:22:19 2016 -0500

* rtl.h: Declare gt_ggc_mx and gt_pch_nx.

diff --git a/gcc/rtl.h b/gcc/rtl.h
index df5172b..6a4cf36 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3771,5 +3771,9 @@ struct GTY(()) cgraph_rtl_info {
   unsigned function_used_regs_valid: 1;
 };
 
+/* gtype-desc.c.  */
+extern void gt_ggc_mx (rtx &);
+extern void gt_pch_nx (rtx &);
+extern void gt_pch_nx (rtx &, gt_pointer_operator, void *);
 
 #endif /* ! GCC_RTL_H */


Re: [PATCH] libiberty: Add Rust symbol demangling.

2016-11-16 Thread David Tolnay
FSF just confirmed that my assignment/disclaimer process has been
completed. Ian can you take a look at your list again?

David

On Fri, Nov 11, 2016 at 1:48 PM, David Tolnay  wrote:
> On Fri, Nov 11, 2016 at 10:46 AM, Ian Lance Taylor  wrote:
>> The patch is OK when the copyright assignment is clear.
>
> Ian I sent out the paperwork Monday night to ass...@gnu.org.
>
> David


Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-16 Thread Andrew Burgess
* Mike Stump  [2016-11-16 12:59:53 -0800]:

> On Nov 16, 2016, at 12:09 PM, Andrew Burgess  
> wrote:
> > My only remaining concern is the new tests, I've tried to restrict
> > them to targets that I suspect they'll pass on with:
> > 
> >/* { dg-final-use { scan-assembler "\.section\[\t 
> > \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target 
> > *-*-linux* *-*-gnu* } } } */
> > 
> > but I'm still nervous that I'm going to introduce test failures.  Is
> > there any advice / guidance I should follow before I commit, or are
> > folk pretty relaxed so long as I've made a reasonable effort?
> 
> So, if you are worried about the way the line is constructed, I usually test 
> it by misspelling the *-*-linux* *-*-gnu* part as *-*-linNOTux* *-*-gnNOTu* 
> and see if the test then doesn't run on your machine.  If it doesn't then you 
> can be pretty confident that only machines that match the target triplet can 
> be impacted.  I usually do this type of testing by running the test case in 
> isolation (not the full tests suite).  Anyway, do the best you can, and don't 
> worry about it too much, learn from the experience, even if it goes wrong 
> in some way.  If it did go wrong, just be responsive (don't check it in just 
> before a 6 week vacation) about fixing it, if you can.
> 

Thanks for the feedback.

Change committed as revision 242519.  If anyone sees any issues just
let me know.

Thanks,
Andrew


c-family PATCH to tidy switch diagnostics (PR c/78285)

2016-11-16 Thread Marek Polacek
As pointed out in Bug 78285, some error calls should actually be inform calls.
I'm not adding any new test; existing switch-5.c covers all the cases so I
didn't see much value in duplicating that part of the test.
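
To illustrate the distinction the patch relies on, here is a minimal
stand-alone sketch in C.  It does not use GCC's real diagnostic machinery:
sketch_error_at, sketch_inform, error_count and report_duplicate_case are
hypothetical stand-ins.  The only point is that a note adds context to the
preceding error without being counted as a second error.

```c
#include <stdio.h>

/* Hypothetical stand-in for the compiler's error counter.  */
static int error_count;

/* Report a real error: printed and counted.  */
static void
sketch_error_at (int line, const char *msg)
{
  fprintf (stderr, "%d: error: %s\n", line, msg);
  error_count++;
}

/* Report a note explaining the preceding error: printed, not counted.  */
static void
sketch_inform (int line, const char *msg)
{
  fprintf (stderr, "%d: note: %s\n", line, msg);
}

/* Diagnose a duplicate case label with one error at the duplicate and
   one note at the first use.  Returns how many errors were counted.  */
static int
report_duplicate_case (int first_line, int dup_line)
{
  int before = error_count;
  sketch_error_at (dup_line, "duplicate case value");
  sketch_inform (first_line, "previously used here");
  return error_count - before;
}
```

In this model a single duplicate label yields one counted error plus one
note, which is why the testsuite directives for the first-use lines change
from dg-error to dg-message.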

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-11-16  Marek Polacek  

PR c/78285
* c-common.c (c_add_case_label): Turn error_at calls into inform.

* gcc.dg/switch-5.c: Turn several dg-errors into dg-messages.
* g++.dg/ext/case-range2.C: Likewise.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 2997c83..3eb7f45 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -4968,19 +4968,19 @@ c_add_case_label (location_t loc, splay_tree cases, 
tree cond, tree orig_type,
   if (high_value)
{
  error_at (loc, "duplicate (or overlapping) case value");
- error_at (DECL_SOURCE_LOCATION (duplicate),
-   "this is the first entry overlapping that value");
+ inform (DECL_SOURCE_LOCATION (duplicate),
+ "this is the first entry overlapping that value");
}
   else if (low_value)
{
  error_at (loc, "duplicate case value") ;
- error_at (DECL_SOURCE_LOCATION (duplicate), "previously used here");
+ inform (DECL_SOURCE_LOCATION (duplicate), "previously used here");
}
   else
{
  error_at (loc, "multiple default labels in one switch");
- error_at (DECL_SOURCE_LOCATION (duplicate),
-   "this is the first default label");
+ inform (DECL_SOURCE_LOCATION (duplicate),
+ "this is the first default label");
}
   goto error_out;
 }
diff --git gcc/testsuite/g++.dg/ext/case-range2.C 
gcc/testsuite/g++.dg/ext/case-range2.C
index 985ded3..f1165ad 100644
--- gcc/testsuite/g++.dg/ext/case-range2.C
+++ gcc/testsuite/g++.dg/ext/case-range2.C
@@ -11,7 +11,7 @@ T f2 (T i)
 {
   switch (i)
   {
-case low ... high : return i + 1;  // { dg-error "previously" }
+case low ... high : return i + 1;  // { dg-message "previously" }
 case 5 : return i + 2; // { dg-error "duplicate" }
 default : return 0;
   }
@@ -20,7 +20,7 @@ T f2 (T i)
 int f (int i)
 {
   switch (i) {
-case 1 ... 10: return i + 1;   // { dg-error "first entry" }
+case 1 ... 10: return i + 1;   // { dg-message "first entry" }
 case 3 ... 5 : return i + 3;   // { dg-error "duplicate" }
 default: return f2 (i);// { dg-message "required" }
   }
diff --git gcc/testsuite/gcc.dg/switch-5.c gcc/testsuite/gcc.dg/switch-5.c
index 5a58490..a097d44 100644
--- gcc/testsuite/gcc.dg/switch-5.c
+++ gcc/testsuite/gcc.dg/switch-5.c
@@ -40,13 +40,13 @@ f (int a, double d, void *p)
   switch (a)
 {
 case 0:
-default: /* { dg-error "this is the first default label" } */
+default: /* { dg-message "this is the first default label" } */
 case 1:
 default: ; /* { dg-error "multiple default labels in one switch" } */
 }
   switch (a)
 {
-case 0: /* { dg-error "previously used here" } */
+case 0: /* { dg-message "previously used here" } */
 case 1:
 case 0: ; /* { dg-error "duplicate case value" } */
 }
@@ -60,11 +60,11 @@ f (int a, double d, void *p)
  }
switch (a)
  {
- case 0: /* { dg-error "this is the first entry overlapping that value" } 
*/
+ case 0: /* { dg-message "this is the first entry overlapping that value" 
} */
  case -1 ... 1: /* { dg-error "duplicate \\(or overlapping\\) case value" 
} */
- case 2 ... 3: /* { dg-error "previously used here" } */
+ case 2 ... 3: /* { dg-message "previously used here" } */
  case 2: /* { dg-error "duplicate case value" } */
- case 4 ... 7: /* { dg-error "this is the first entry overlapping that 
value" } */
+ case 4 ... 7: /* { dg-message "this is the first entry overlapping that 
value" } */
  case 6 ... 9: ; /* { dg-error "duplicate \\(or overlapping\\) case value" 
} */
  }
switch (a)

Marek


Re: [patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Jakub Jelinek
On Wed, Nov 16, 2016 at 10:30:03PM +0100, Thomas Koenig wrote:
> the attached patch adds an AVX-specific version of the matmul
> intrinsic to the Fortran library.  This works by using the target_clones
> attribute.

Don't you need to test in configure if the assembler supports AVX?
Otherwise if somebody is bootstrapping gcc with older assembler, it will
just fail to bootstrap.
For matmul_i*, wouldn't it make more sense to use avx2 instead of avx,
or both avx and avx2 and maybe avx512f?

> 2016-11-16  Thomas Koenig  
> 
> PR fortran/78379
> * m4/matmul.m4:  For x86_64, make the work function for matmul

Why the extra space before For?

> static with target_clones for AVX and default, and create
> a wrapper function to call it.
> * generated/matmul_c10.c

Missing : Regenerated.

Jakub


Re: [patch,libgfortran] PR51119 - MATMUL slow for large matrices

2016-11-16 Thread Jerry DeLisle

Committed after approval on bugzilla to eliminate warnings.

2016-11-16  Jerry DeLisle  

PR libgfortran/51119
* Makefile.am: Remove -fno-protect-parens -fstack-arrays.
* Makefile.in: Regenerate.


r242517 = 026291bdda18395d7c746856dd7e4ed384856a1b (refs/remotes/svn/trunk)
M   libgfortran/Makefile.in
M   libgfortran/ChangeLog
M   libgfortran/Makefile.am

Regards,

Jerry


RE: [PATCH 1/4] MIPS16/GCC: Fix DImode `casesi_internal_mips16_' assembly instructions

2016-11-16 Thread Maciej W. Rozycki
On Wed, 16 Nov 2016, Matthew Fortune wrote:

> OK. I have no idea what system supports 64-bit MIPS16 but given it costs
> little to improve consistency here then it is at least doing no harm.

 No recent real hardware I believe.  Among older implementations there 
were the NEC Vr4111 and Vr4121 processors, both at the MIPS III ISA level, 
sans atomics and FPU.  Which is likely why we only support the soft-float 
model with NewABI MIPS16 compilations.  They had the usual TLB MMU though, 
so in principle they could run Linux and have the FPU emulator kick in for 
hard-float operations, except of course their glory was well before MIPS16 
support was even considered for Linux.

 Then there's always QEMU of course too.

 Patch applied (with another obvious whitespace fix), along with the rest 
from this set.  Thanks for your review!

  Maciej


Re: [PATCH] Fix up ICEs with TREE_CONSTANT references (PR c++/78373)

2016-11-16 Thread Jakub Jelinek
On Wed, Nov 16, 2016 at 04:26:36PM -0500, Jason Merrill wrote:
> On Wed, Nov 16, 2016 at 4:00 PM, Jakub Jelinek  wrote:
> > Jason's recent patch to turn reference vars initialized with invariant
> > addresses broke the first testcase below, because >singleton
> > is considered TREE_CONSTANT (because self is TREE_CONSTANT VAR_DECL and
> > singleton field has constant offset), but after going into SSA form
> > it is not supposed to be TREE_CONSTANT anymore (_2->singleton),
> > because SSA_NAMEs don't have TREE_CONSTANT set on them.
> >
> > The following patch fixes it by gimplifying such vars into their
> > DECL_INITIAL unless in OpenMP regions, where such folding is deferred
> > until omplower pass finishes.
> 
> Hmm, this seems like a workaround; why don't we see the same problem
> with constant pointer variables?

Dunno, tried to construct a testcase, but it doesn't fail.  If it is e.g.
TREE_CONSTANT because of being constexpr, then the FE already replaces it
by its initializer.

> A simpler workaround would be to not set TREE_CONSTANT on references
> in the first place, since the constexpr code doesn't need it.  What do
> you think?

If the FE doesn't need it, indeed it would be simpler this way.

> commit 6cdd28bb152fcb07a7eb6c9f053cd435cf719a20
> Author: Jason Merrill 
> Date:   Wed Nov 16 16:13:25 2016 -0500
> 
> ref
> 
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index c54a2de..87db589 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -6839,7 +6839,8 @@ cp_finish_decl (tree decl, tree init, bool 
> init_const_expr_p,
> /* Set these flags now for templates.  We'll update the flags in
>store_init_value for instantiations.  */
> DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = 1;
> -   if (decl_maybe_constant_var_p (decl))
> +   if (decl_maybe_constant_var_p (decl)
> +   && TREE_CODE (type) != REFERENCE_TYPE)
>   TREE_CONSTANT (decl) = 1;
>   }
>  }
> diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
> index 022a478..dcdb710 100644
> --- a/gcc/cp/typeck2.c
> +++ b/gcc/cp/typeck2.c
> @@ -824,7 +824,8 @@ store_init_value (tree decl, tree init, vec<tree, va_gc>** cleanups, int flags)
>const_init = (reduced_constant_expression_p (value)
>   || error_operand_p (value));
>DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = const_init;
> -  TREE_CONSTANT (decl) = const_init && decl_maybe_constant_var_p (decl);
> +  if (TREE_CODE (type) != REFERENCE_TYPE)
> + TREE_CONSTANT (decl) = const_init && decl_maybe_constant_var_p (decl);
>  }
>value = cp_fully_fold (value);
>  

Jakub


Re: PR78319

2016-11-16 Thread Jeff Law

On 11/16/2016 01:23 PM, Prathamesh Kulkarni wrote:

> Hi,
> As discussed in PR, this patch marks the test-case to xfail on arm-none-eabi.
> OK to commit ?

You might check if Aldy's change to the uninit code helps your case 
(approved earlier today, so hopefully in the tree very soon).  I quickly 
scanned the BZ.  There's some overlap, but it might be too complex for 
Aldy's enhancements to catch.


jeff


Re: [PATCH] Fix NetBSD bootstrap

2016-11-16 Thread Mike Stump

> On Nov 16, 2016, at 11:23 AM, Krister Walfridsson 
>  wrote:
> 
> On Wed, 16 Nov 2016, Mike Stump wrote:
> 
>> Looks reasonable.  The biggest issue would be if any of those values changed 
>> through time, and the current version works for older netbsd releases, the 
>> patch could break them.  Of course, I don't have any visibility into how any 
>> of those values might have changed through time.
> 
> This should not be an issue in this case, so I'll commit the patch. Thanks!

Oh, I don't know if you are tracking release branches so that previous releases 
just work, but if you are, you can keep track of patches that would need to be 
back ported for a release branch to work nicely.  As you finish with getting 
things in shape, you can then go back and see about back porting that work, if 
you are interested in that.



[PATCH] PR fortran/58001 -- Handle tab in FORMAT

2016-11-16 Thread Steve Kargl
An earlier version of the attached patch lingered in bugzilla
for over 3 years.  I've updated the patch to include Manuel's
comment #12.  Regression tested on x86_64-*-freebsd.  OK to
commit?
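
As a rough illustration of the behaviour change, here is a simplified,
stand-alone model in C of a scanner that skips blanks and treats a tab as a
non-fatal warning rather than an error.  It is only a sketch: next_nonblank
and its parameters are hypothetical, and the warning counter merely stands
in for the gfc_warning (OPT_Wtabs, ...) call in the patch below.

```c
#include <stddef.h>

/* Hypothetical, simplified model of the patched scanner: skip blanks and
   count each tab as a warning instead of failing.  FMT is the format
   text, *POS the current offset, *TAB_WARNINGS the warning counter.  */
static char
next_nonblank (const char *fmt, size_t *pos, int *tab_warnings)
{
  char c;
  do
    {
      c = fmt[*pos];
      if (c == '\0')
        return c;               /* End of input: do not advance past it.  */
      (*pos)++;
      if (c == '\t')
        (*tab_warnings)++;      /* Stands in for the -Wtabs warning.  */
    }
  while (c == ' ' || c == '\t');
  return c;
}
```

Because a tab no longer aborts the scan, the caller does not need an error
flag, which matches the removal of the bool *error parameter in the patch.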

2016-11-16  Steven G. Kargl  

PR fortran/58001
* io.c (next_char_not_space): Update handling of a 'tab' in a FORMAT.
(format_lex): Adjust invocations of next_char_not_space().
 
2016-11-16  Steven G. Kargl  

PR fortran/58001
* gfortran.dg/fmt_tab_1.f90: Adjust testcase.
* gfortran.dg/fmt_tab_2.f90: Ditto.

-- 
Steve
Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c	(revision 242512)
+++ gcc/fortran/io.c	(working copy)
@@ -200,23 +200,14 @@ unget_char (void)
 /* Eat up the spaces and return a character.  */
 
 static char
-next_char_not_space (bool *error)
+next_char_not_space ()
 {
   char c;
   do
 {
   error_element = c = next_char (NONSTRING);
   if (c == '\t')
-	{
-	  if (gfc_option.allow_std & GFC_STD_GNU)
-	gfc_warning (0, "Extension: Tab character in format at %C");
-	  else
-	{
-	  gfc_error ("Extension: Tab character in format at %C");
-	  *error = true;
-	  return c;
-	}
-	}
+	gfc_warning (OPT_Wtabs, "Nonconforming tab character in format at %C");
 }
   while (gfc_is_whitespace (c));
   return c;
@@ -234,7 +225,6 @@ format_lex (void)
   char c, delim;
   int zflag;
   int negative_flag;
-  bool error = false;
 
   if (saved_token != FMT_NONE)
 {
@@ -243,7 +233,7 @@ format_lex (void)
   return token;
 }
 
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   
   negative_flag = 0;
   switch (c)
@@ -253,7 +243,7 @@ format_lex (void)
   /* Falls through.  */
 
 case '+':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   if (!ISDIGIT (c))
 	{
 	  token = FMT_UNKNOWN;
@@ -264,7 +254,7 @@ format_lex (void)
 
   do
 	{
-	  c = next_char_not_space (&error);
+	  c = next_char_not_space ();
 	  if (ISDIGIT (c))
 	value = 10 * value + c - '0';
 	}
@@ -294,7 +284,7 @@ format_lex (void)
 
   do
 	{
-	  c = next_char_not_space (&error);
+	  c = next_char_not_space ();
 	  if (ISDIGIT (c))
 	{
 	  value = 10 * value + c - '0';
@@ -329,7 +319,7 @@ format_lex (void)
   break;
 
 case 'T':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   switch (c)
 	{
 	case 'L':
@@ -357,7 +347,7 @@ format_lex (void)
   break;
 
 case 'S':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   if (c != 'P' && c != 'S')
 	unget_char ();
 
@@ -365,7 +355,7 @@ format_lex (void)
   break;
 
 case 'B':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   if (c == 'N' || c == 'Z')
 	token = FMT_BLANK;
   else
@@ -427,7 +417,7 @@ format_lex (void)
   break;
 
 case 'E':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   if (c == 'N' )
 	token = FMT_EN;
   else if (c == 'S')
@@ -457,7 +447,7 @@ format_lex (void)
   break;
 
 case 'D':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   if (c == 'P')
 	{
 	  if (!gfc_notify_std (GFC_STD_F2003, "DP format "
@@ -478,7 +468,7 @@ format_lex (void)
 	  "specifier not allowed at %C"))
 	return FMT_ERROR;
 	  token = FMT_DT;
-	  c = next_char_not_space (&error);
+	  c = next_char_not_space ();
 	  if (c == '\'' || c == '"')
 	{
 	  delim = c;
@@ -518,7 +508,7 @@ format_lex (void)
   break;
 
 case 'R':
-  c = next_char_not_space (&error);
+  c = next_char_not_space ();
   switch (c)
 	{
 	case 'C':
@@ -559,9 +549,6 @@ format_lex (void)
   break;
 }
 
-  if (error)
-return FMT_ERROR;
-
   return token;
 }
 
Index: gcc/testsuite/gfortran.dg/fmt_tab_1.f90
===
--- gcc/testsuite/gfortran.dg/fmt_tab_1.f90	(revision 242512)
+++ gcc/testsuite/gfortran.dg/fmt_tab_1.f90	(working copy)
@@ -1,7 +1,12 @@
 ! { dg-do compile }
-! { dg-options -Wno-error=tabs }
+! { dg-options -Wtabs }
 ! PR fortran/32987
+! PR fortran/58001
   program TestFormat
 write (*, 10)
- 10 format ('Hello ',	'bug!') ! { dg-warning "Extension: Tab character in format" }
+! There is a tab character before 'bug!'.  This is accepted without
+! the -Wno-tabs option or a -std= option.
+ 10 format ('Hello ',	'bug!') ! { dg-warning "tab character in format" }
+
   end
+! { dg-excess-errors "tab character in format" }
Index: gcc/testsuite/gfortran.dg/fmt_tab_2.f90
===
--- gcc/testsuite/gfortran.dg/fmt_tab_2.f90	(revision 242512)
+++ gcc/testsuite/gfortran.dg/fmt_tab_2.f90	(working copy)
@@ -1,7 +1,9 @@
 ! { dg-do compile }
 ! { dg-options "-std=f2003" }
 ! PR fortran/32987
+! PR fortran/58001
   program TestFormat
-write 

[patch, libfortran] Add AVX-specific matmul

2016-11-16 Thread Thomas Koenig

Hello world,

the attached patch adds an AVX-specific version of the matmul
intrinsic to the Fortran library.  This works by using the target_clones
attribute.
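
For readers unfamiliar with the attribute, here is a small stand-alone
sketch of the same pattern: a worker function built as several clones,
called through a plain wrapper.  It is not the libgfortran code; the names
aux_scale_add, scale_add and SKETCH_CLONES are made up for illustration,
and the preprocessor guard (x86_64, glibc, GCC 6 or later) is my assumption
about where target_clones and the IFUNC support it needs are available.
Elsewhere the macro expands to nothing and only the default version is
built.

```c
#include <stdio.h>   /* Also pulls in the libc feature macros.  */
#include <stddef.h>

#if defined(__x86_64__) && defined(__GLIBC__) && defined(__GNUC__) \
    && !defined(__clang__) && __GNUC__ >= 6
/* Ask GCC for an AVX clone and a default clone of the function.  */
# define SKETCH_CLONES __attribute__ ((target_clones ("avx,default")))
#else
# define SKETCH_CLONES
#endif

/* Worker: y[i] += a * x[i].  With SKETCH_CLONES active, the compiler
   emits both versions and selects one at load time.  */
SKETCH_CLONES void
aux_scale_add (double *restrict y, const double *restrict x,
               double a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}

/* Wrapper with an ordinary signature; callers never see the clones.  */
void
scale_add (double *restrict y, const double *restrict x,
           double a, size_t n)
{
  aux_scale_add (y, x, a, n);
}
```

The wrapper keeps the exported name and ABI unchanged, so further clones
could be added to the worker later without affecting callers.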

For testing, I compiled this on powerpc64-unknown-linux-gnu,
without any ill effects.

Also, a resulting binary reached around 15 GFlops for larger matrices
on a 3.4 GHz i7-2600 CPU.  I am currently building/regtesting on
that machine.  This can give another 40% speed increase for large
matrices on AVX.

OK for trunk?

Regards

Thomas

2016-11-16  Thomas Koenig  

PR fortran/78379
* m4/matmul.m4:  For x86_64, make the work function for matmul
static with target_clones for AVX and default, and create
a wrapper function to call it.
* generated/matmul_c10.c
* generated/matmul_c16.c: Regenerated.
* generated/matmul_c4.c: Regenerated.
* generated/matmul_c8.c: Regenerated.
* generated/matmul_i1.c: Regenerated.
* generated/matmul_i16.c: Regenerated.
* generated/matmul_i2.c: Regenerated.
* generated/matmul_i4.c: Regenerated.
* generated/matmul_i8.c: Regenerated.
* generated/matmul_r10.c: Regenerated.
* generated/matmul_r16.c: Regenerated.
* generated/matmul_r4.c: Regenerated.
* generated/matmul_r8.c: Regenerated.
Index: generated/matmul_c10.c
===
--- generated/matmul_c10.c	(Revision 242477)
+++ generated/matmul_c10.c	(Arbeitskopie)
@@ -75,11 +75,37 @@ extern void matmul_c10 (gfc_array_c10 * const rest
 	int blas_limit, blas_call gemm);
 export_proto(matmul_c10);
 
+#ifdef __x86_64__
+
+/* For x86_64, we switch to AVX if that is available.  For this, we
+   let the actual work be done by the static aux_matmul - function.
+   The user-callable function will then automagically contain the
+   selection code for the right architecture.  This is done to avoid
+   knowledge of architecture details in the front end.  */
+
+static void aux_matmul_c10 (gfc_array_c10 * const restrict retarray, 
+	gfc_array_c10 * const restrict a, gfc_array_c10 * const restrict b, int try_blas,
+	int blas_limit, blas_call gemm)
+	__attribute__ ((target_clones("avx,default")));
+
 void
 matmul_c10 (gfc_array_c10 * const restrict retarray, 
 	gfc_array_c10 * const restrict a, gfc_array_c10 * const restrict b, int try_blas,
 	int blas_limit, blas_call gemm)
 {
+  aux_matmul_c10 (retarray, a, b, try_blas, blas_limit, gemm);
+}
+
+static void
+aux_matmul_c10 (gfc_array_c10 * const restrict retarray, 
+	gfc_array_c10 * const restrict a, gfc_array_c10 * const restrict b, int try_blas,
+	int blas_limit, blas_call gemm)
+#else
+matmul_c10 (gfc_array_c10 * const restrict retarray, 
+	gfc_array_c10 * const restrict a, gfc_array_c10 * const restrict b, int try_blas,
+	int blas_limit, blas_call gemm)
+#endif
+{
   const GFC_COMPLEX_10 * restrict abase;
   const GFC_COMPLEX_10 * restrict bbase;
   GFC_COMPLEX_10 * restrict dest;
Index: generated/matmul_c16.c
===
--- generated/matmul_c16.c	(Revision 242477)
+++ generated/matmul_c16.c	(Arbeitskopie)
@@ -75,11 +75,37 @@ extern void matmul_c16 (gfc_array_c16 * const rest
 	int blas_limit, blas_call gemm);
 export_proto(matmul_c16);
 
+#ifdef __x86_64__
+
+/* For x86_64, we switch to AVX if that is available.  For this, we
+   let the actual work be done by the static aux_matmul - function.
+   The user-callable function will then automagically contain the
+   selection code for the right architecture.  This is done to avoid
+   knowledge of architecture details in the front end.  */
+
+static void aux_matmul_c16 (gfc_array_c16 * const restrict retarray, 
+	gfc_array_c16 * const restrict a, gfc_array_c16 * const restrict b, int try_blas,
+	int blas_limit, blas_call gemm)
+	__attribute__ ((target_clones("avx,default")));
+
 void
 matmul_c16 (gfc_array_c16 * const restrict retarray, 
 	gfc_array_c16 * const restrict a, gfc_array_c16 * const restrict b, int try_blas,
 	int blas_limit, blas_call gemm)
 {
+  aux_matmul_c16 (retarray, a, b, try_blas, blas_limit, gemm);
+}
+
+static void
+aux_matmul_c16 (gfc_array_c16 * const restrict retarray, 
+	gfc_array_c16 * const restrict a, gfc_array_c16 * const restrict b, int try_blas,
+	int blas_limit, blas_call gemm)
+#else
+matmul_c16 (gfc_array_c16 * const restrict retarray, 
+	gfc_array_c16 * const restrict a, gfc_array_c16 * const restrict b, int try_blas,
+	int blas_limit, blas_call gemm)
+#endif
+{
   const GFC_COMPLEX_16 * restrict abase;
   const GFC_COMPLEX_16 * restrict bbase;
   GFC_COMPLEX_16 * restrict dest;
Index: generated/matmul_c4.c
===
--- generated/matmul_c4.c	(Revision 242477)
+++ generated/matmul_c4.c	(Arbeitskopie)
@@ -75,11 +75,37 @@ extern void matmul_c4 (gfc_array_c4 * const restri
 	int blas_limit, blas_call gemm);
 

Re: C++ PATCH for c++/78358 (decltype and decomposition)

2016-11-16 Thread Jason Merrill
On Tue, Nov 15, 2016 at 11:31 AM, Jason Merrill  wrote:
> OK, (hopefully) one more patch for decltype and C++17 decomposition
> declarations.  I hadn't been thinking that "referenced type" meant to
> look through references in the tuple case, since other parts of
> [dcl.decomp] define "the referenced type" directly, but that does seem
> to be how it's used elsewhere in the standard.

Nope, that wasn't right, either.  The tuple section of [dcl.decomp]
also defines "the referenced type" in a way that makes it impossible
to determine from the actual type of the variable, so we need to store
it somewhere.
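
The "store it somewhere" idea is a side table keyed by the variable.  A
minimal sketch of that idea in plain C follows; it is not the GCC
implementation (the patch below uses a GC-allocated hash_map), and
store_type, lookup_type and the fixed-size array are hypothetical
simplifications.  Unlike the patch's lookup, which dereferences the result
directly, this sketch returns NULL for a key that was never stored.

```c
#include <stddef.h>

#define SIDE_TABLE_MAX 16   /* Fixed capacity, chosen only for the sketch.  */

struct side_entry
{
  const void *key;          /* Identity of the variable.  */
  const char *type_name;    /* The type recorded for it.  */
};

static struct side_entry side_table[SIDE_TABLE_MAX];
static size_t side_count;

/* Record TYPE_NAME for KEY.  Returns 0 on success, -1 if the table is
   full.  An existing entry for KEY is overwritten.  */
static int
store_type (const void *key, const char *type_name)
{
  for (size_t i = 0; i < side_count; i++)
    if (side_table[i].key == key)
      {
        side_table[i].type_name = type_name;
        return 0;
      }
  if (side_count == SIDE_TABLE_MAX)
    return -1;
  side_table[side_count].key = key;
  side_table[side_count].type_name = type_name;
  side_count++;
  return 0;
}

/* Return the type recorded for KEY, or NULL if none was stored.  */
static const char *
lookup_type (const void *key)
{
  for (size_t i = 0; i < side_count; i++)
    if (side_table[i].key == key)
      return side_table[i].type_name;
  return NULL;
}
```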

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 235535a2ee99c30292f8f964a3759fbeb6067e45
Author: Jason Merrill 
Date:   Tue Nov 15 22:22:34 2016 -0500

Fix tuple decomposition decltype.

* decl.c (store_decomp_type, lookup_decomp_type): New.
(cp_finish_decomp): Call store_decomp_type.
* semantics.c (finish_decltype_type): Call lookup_decomp_type.
* cp-tree.h: Declare lookup_decomp_type.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 0dcb897..cb1b9fa 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5808,6 +5808,7 @@ extern tree start_decl(const 
cp_declarator *, cp_decl_specifier_seq *, int,
 extern void start_decl_1   (tree, bool);
 extern bool check_array_initializer(tree, tree, tree);
 extern void cp_finish_decl (tree, tree, bool, tree, int);
+extern tree lookup_decomp_type (tree);
 extern void cp_finish_decomp   (tree, tree, unsigned int);
 extern int cp_complete_array_type  (tree *, tree, bool);
 extern int cp_complete_array_type_or_error (tree *, tree, bool, 
tsubst_flags_t);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 23ba087..c54a2de 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7293,6 +7293,22 @@ get_tuple_decomp_init (tree decl, unsigned i)
 }
 }
 
+/* It's impossible to recover the decltype of a tuple decomposition variable
+   based on the actual type of the variable, so store it in a hash table.  */
+static GTY(()) hash_map<tree,tree> *decomp_type_table;
+static void
+store_decomp_type (tree v, tree t)
+{
+  if (!decomp_type_table)
+decomp_type_table = hash_map<tree,tree>::create_ggc (13);
+  decomp_type_table->put (v, t);
+}
+tree
+lookup_decomp_type (tree v)
+{
+  return *decomp_type_table->get (v);
+}
+
 /* Finish a decomposition declaration.  DECL is the underlying declaration
"e", FIRST is the head of a chain of decls for the individual identifiers
chained through DECL_CHAIN in reverse order and COUNT is the number of
@@ -7467,6 +7483,8 @@ cp_finish_decomp (tree decl, tree first, unsigned int 
count)
  v[i]);
  goto error_out;
}
+ /* Save the decltype away before reference collapse.  */
+ store_decomp_type (v[i], eltype);
  eltype = cp_build_reference_type (eltype, !lvalue_p (init));
  TREE_TYPE (v[i]) = eltype;
  layout_decl (v[i], 0);
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index dc5ad13..96c67a5 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -8902,7 +8902,7 @@ finish_decltype_type (tree expr, bool 
id_expression_or_member_access_p,
return unlowered_expr_type (expr);
  else
/* Expr is a reference variable for the tuple case.  */
-   return non_reference (TREE_TYPE (expr));
+   return lookup_decomp_type (expr);
}
 
   switch (TREE_CODE (expr))
diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp17.C 
b/gcc/testsuite/g++.dg/cpp1z/decomp17.C
new file mode 100644
index 0000000..484094b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/decomp17.C
@@ -0,0 +1,15 @@
+// { dg-options -std=c++1z }
+
+#include <tuple>
+
+template  struct same_type;
+template  struct same_type {};
+
+int main() {
+  int i;
+  std::tuple tuple = { 1, i, 1 };
+  auto &[v, r, rr] = tuple;
+  same_type{};
+  same_type{};
+  same_type{};
+}
diff --git a/gcc/tree.h b/gcc/tree.h
index 6a98b6e..0a82a4a 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -2457,8 +2457,7 @@ extern void decl_value_expr_insert (tree, tree);
 
 /* In a VAR_DECL or PARM_DECL, the location at which the value may be found,
if transformations have made this more complicated than evaluating the
-   decl itself.  This should only be used for debugging; once this field has
-   been set, the decl itself may not legitimately appear in the function.  */
+   decl itself.  */
 #define DECL_HAS_VALUE_EXPR_P(NODE) \
   (TREE_CHECK3 (NODE, VAR_DECL, PARM_DECL, RESULT_DECL) \
->decl_common.decl_flag_2)


Re: [PATCH] Fix up ICEs with TREE_CONSTANT references (PR c++/78373)

2016-11-16 Thread Jason Merrill
On Wed, Nov 16, 2016 at 4:00 PM, Jakub Jelinek  wrote:
> Jason's recent patch to turn reference vars initialized with invariant
> addresses broke the first testcase below, because >singleton
> is considered TREE_CONSTANT (because self is TREE_CONSTANT VAR_DECL and
> singleton field has constant offset), but after going into SSA form
> it is not supposed to be TREE_CONSTANT anymore (_2->singleton),
> because SSA_NAMEs don't have TREE_CONSTANT set on them.
>
> The following patch fixes it by gimplifying such vars into their
> DECL_INITIAL unless in OpenMP regions, where such folding is deferred
> until omplower pass finishes.

Hmm, this seems like a workaround; why don't we see the same problem
with constant pointer variables?

A simpler workaround would be to not set TREE_CONSTANT on references
in the first place, since the constexpr code doesn't need it.  What do
you think?
commit 6cdd28bb152fcb07a7eb6c9f053cd435cf719a20
Author: Jason Merrill 
Date:   Wed Nov 16 16:13:25 2016 -0500

ref

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index c54a2de..87db589 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6839,7 +6839,8 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
  /* Set these flags now for templates.  We'll update the flags in
 store_init_value for instantiations.  */
  DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = 1;
- if (decl_maybe_constant_var_p (decl))
+ if (decl_maybe_constant_var_p (decl)
+ && TREE_CODE (type) != REFERENCE_TYPE)
TREE_CONSTANT (decl) = 1;
}
 }
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 022a478..dcdb710 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -824,7 +824,8 @@ store_init_value (tree decl, tree init, vec<tree, va_gc>** 
cleanups, int flags)
   const_init = (reduced_constant_expression_p (value)
|| error_operand_p (value));
   DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = const_init;
-  TREE_CONSTANT (decl) = const_init && decl_maybe_constant_var_p (decl);
+  if (TREE_CODE (type) != REFERENCE_TYPE)
+   TREE_CONSTANT (decl) = const_init && decl_maybe_constant_var_p (decl);
 }
   value = cp_fully_fold (value);
 


Re: Fix PR78154

2016-11-16 Thread Marc Glisse

On Wed, 16 Nov 2016, Martin Sebor wrote:


> On 11/16/2016 11:49 AM, Prathamesh Kulkarni wrote:
>>
>> Hi Richard,
>> Following your suggestion in PR78154, the patch checks if stmt
>> contains call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
>> and returns true in that case.
>
> Nice.  I think the list should also include mempcpy, stpcpy, and
> stpncpy, and probably also the corresponding checking built-ins
> such as __builtin___memcpy_chk.
>
> FWIW, a more general solution to consider (possibly for GCC 8)
> might be to extend attribute nonnull to apply to a function's return
> value as well (e.g., use zero as the index for that), to indicate
> that a pointer returned from it is not null.  That would let library
> implementers annotate other functions (such as strerror)


We already have that, under the name returns_nonnull. IIRC, people found a 
new name clearer than using position 0, when I posted the patch. Note also 
that memcpy already has both an attribute that says that it returns its 
first argument, and an attribute that says that said first argument is 
nonnull.
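
As a small illustration of the two attributes mentioned above, here is a
sketch of a memcpy-like helper annotated with returns_nonnull and nonnull.
The function copy_bytes and the SKETCH_* macros are hypothetical; only the
attribute names themselves are GCC's.  The guard simply drops the
annotations on compilers that do not define __GNUC__.

```c
#include <stddef.h>

#if defined(__GNUC__)
# define SKETCH_RETURNS_NONNULL __attribute__ ((returns_nonnull))
# define SKETCH_NONNULL_ARG1 __attribute__ ((nonnull (1)))
#else
# define SKETCH_RETURNS_NONNULL
# define SKETCH_NONNULL_ARG1
#endif

/* Hypothetical helper: copies N bytes from SRC to DST and returns DST.
   The attributes tell the compiler that DST is never null and that the
   returned pointer is never null.  */
SKETCH_RETURNS_NONNULL SKETCH_NONNULL_ARG1
void *
copy_bytes (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}
```

With such annotations a caller's null check on the result can be folded
away, which is the effect the patch under discussion wants for memmove and
friends.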


(I've heard some noise in C++-land about making memcpy(0,0,0) valid, but 
that may have just been noise)


--
Marc Glisse


Re: [libstdc++, testsuite] Add dg-require-thread-fence

2016-11-16 Thread Christophe Lyon
On 15 November 2016 at 12:50, Jonathan Wakely  wrote:
> On 14/11/16 14:32 +0100, Christophe Lyon wrote:
>>
>> On 20 October 2016 at 19:40, Jonathan Wakely  wrote:
>>>
>>> On 20/10/16 10:33 -0700, Mike Stump wrote:


>>>> On Oct 20, 2016, at 9:34 AM, Jonathan Wakely  wrote:
>>>>>
>>>>> On 20/10/16 09:26 -0700, Mike Stump wrote:
>>>>>>
>>>>>> On Oct 20, 2016, at 5:20 AM, Jonathan Wakely 
>>>>>> wrote:
>>>>>>>
>>>>>>> I am considering leaving this in the ARM backend to force people to
>>>>>>> think what they want to do about thread safety with statics and C++
>>>>>>> on bare-metal systems.
>>>>>
>>>>> The quoting makes it look like those are my words, but I was quoting
>>>>> Ramana from https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02751.html
>>>>>
>>>>>> Not quite in the GNU spirit?  The port people should decide the best
>>>>>> way to get as much functionality as possible and everything should
>>>>>> just work, no sharp edges.
>>>>>>
>>>>>> Forcing people to think sounds like a sharp edge?
>>>>>
>>>>> I'm inclined to agree, but we are talking about bare metal systems,
>>>>
>>>> So?  gcc has been doing bare metal systems for more than 2 years now.
>>>> It is pretty good at it.  All my primary targets today are themselves
>>>> bare metal systems (I test with newlib).
>>>>
>>>>> where there is no one-size-fits-all solution.
>>>>
>>>> Configurations are like ice cream cones.  Everyone gets their flavor no
>>>> matter how weird or strange.  Putting nails in a cone because you don't
>>>> know if they like vanilla or chocolate isn't reasonable.  If you want,
>>>> make two flavors, and vend two, if you want to just do one, pick the
>>>> flavor and vend it.  Put an enum #define default_flavor vanilla, and you
>>>> then have support for any flavor you want.  Want to add a configure
>>>> option for the flavor select, add it.  You want to make a
>>>> -mflavor=chocolate option, add it.  gcc is literally littered with these
>>>> things.
>>>
>>>
>>>
>>> Like I said, you can either build the library with
>>> -fno-threadsafe-statics or you can provide a definition of the missing
>>> symbol.
>>>
>> I gave this a try (using CXXFLAGS_FOR_TARGET=-fno-threadsafe-statics).
>> It seems to do the trick indeed: almost all tests now pass, the flag is
>> added to testcase compilation.
>>
>> Among the 6 remaining failures, I noticed these two:
>> - experimental/type_erased_allocator/2.cc: still complains about the
>> missing __sync_synchronize. Does it need dg-require-thread-fence?
>
>
> Yes, I think that test actually uses atomics directly, so does depend
> on the fence.
>
I've attached the patch to achieve this.
Is it OK?

>> - abi/header_cxxabi.c complains because the option is not valid for C.
>> I can see the test is already skipped for other C++-only options: it is OK
>> if I submit a patch to skip it if -fno-threadsafe-statics is used?
>
>
> Yes, it makes sense there too.

This one is not as obvious as I hoped. I tried:
-// { dg-skip-if "invalid options for C" { *-*-* } { "-std=c++??"
"-std=gnu++??" } }
+// { dg-skip-if "invalid options for C" { *-*-* } { "-std=c++??"
"-std=gnu++??" "-fno-threadsafe-statics" } }

but it does not work.

I set CXXFLAGS_FOR_TARGET=-fno-threadsafe-statics
before running GCC's configure.

This results in -fno-threadsafe-statics being used when compiling the tests,
but dg-skip-if does not consider it: it would if I passed it via
runtestflags/target-board, but then it would mean passing this flag
to all tests, not only the c++ ones, leading to errors everywhere.

Am I missing something?

Thanks,

Christophe

>> I think I'm going to use this flag in validations from now on (target
>> arm-none-eabi only, with default mode/cpu/fpu).
>
>
> Thanks for the update on this.
>
libstdc++-v3/ChangeLog:

2016-11-16  Christophe Lyon  

* testsuite/experimental/type_erased_allocator/2.cc: Add
  dg-require-thread-fence.

diff --git a/libstdc++-v3/testsuite/experimental/type_erased_allocator/2.cc 
b/libstdc++-v3/testsuite/experimental/type_erased_allocator/2.cc
index 216a88c..0b73359 100644
--- a/libstdc++-v3/testsuite/experimental/type_erased_allocator/2.cc
+++ b/libstdc++-v3/testsuite/experimental/type_erased_allocator/2.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++14 } }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2015-2016 Free Software Foundation, Inc.
 //


RE: [PATCH] microMIPS/GCC: Fix PIC call relaxation

2016-11-16 Thread Maciej W. Rozycki
On Wed, 16 Nov 2016, Matthew Fortune wrote:

> > Fix `-mrelax-pic-calls' support for microMIPS code where the relocation
> > produced is supposed to be R_MICROMIPS_JALR rather than R_MIPS_JALR.
> > The lack of short delay support comes from a missed update to this code
> > for microMIPS support and can be relieved as JALRS and JRS instructions
> > can be relaxed to BALS and B instructions respectively, so do that as
> > well.
> 
> Thanks. I didn't follow the background to this optimisation when I added
> the compact branch support so opted to retain the pre-existing behaviour.

 In any case there was no deliberate choice made here, but just a missed 
update along the original microMIPS change.

> > By doing so complement commit r196828 ("microMIPS gcc support"),
> > , which is the
> > original change that introduced microMIPS support, in particular to
> > MIPS_CALL, which is where this code previously resided.
> > 
> > Adjust the test suite accordingly, limiting R_MICROMIPS_JALR cases to
> 
> Typo fwiw: Should say R_MIPS_JALR for MIPS code.

 Indeed.  We don't use GIT style commit logs though so I can't correct it 
and your spotting will just stay here for posterity.

> >  NB the use of this feature for microMIPS is limited because short
> > encodings of register jump instructions usually do not have their branch
> > counterparts and long encodings typically are not used.  However at
> > least
> > tail calls can be converted if the jump target is in range, as can calls
> > in `-minsn32' code.  Perhaps we could switch to producing `j[al]r[s].32'
> > in the `-mrelax-pic-calls' mode like GAS does with the `jal' and `j'
> > microMIPS macros in PIC code.
> 
> Does the linker do anything for R_MICROMIPS_JALR currently? From memory
> I seem to think it was mostly ignored. Is there any risk with older linkers
> by introducing R_MICROMIPS_JALR in GCC generated code?

 For the o32 ABI the R_MICROMIPS_JALR reloc is currently effectively 
ignored by the BFD linker, although some preparations for handling have 
been made and therefore there are two switch cases where it appears.

 Actual handling for PIC call relaxation is only included for R_MIPS_JALR 
in `mips_elf_perform_relocation' though, where the actual JALR and JR 
instruction encoding is checked, which is different between the regular 
MIPS and the microMIPS ISAs.

 The presence of R_MIPS_JALR in microMIPS code will lead to an incorrect 
internal `cross_mode_jump_p' setting in `mips_elf_calculate_relocation', 
however this does not cause harm because `mips_elf_perform_relocation' 
code will check instruction encoding.

 For n32 and n64 ABIs the BFD linker handling of R_MIPS_JALR is done in 
`_bfd_mips_relax_section' instead and does not take `cross_mode_jump_p' 
into account, however the instruction encoding is likewise checked, so no 
harm there either.  I plan to remove this function, perhaps before 2.28 
even, as it is duplicated by the corresponding o32 handler now, but 
suffers from considerable bitrot.

 I plan to implement the missing R_MICROMIPS_JALR handling in the BFD 
linker sometime as well, although it may take a little bit yet.  As I 
noted in the original change description, the usable cases are limited, so 
there isn't as much incentive to have this implemented as there was with 
the regular MIPS ISA, which is also probably the reason why this was 
missed from the original microMIPS support patches.

> >  Let me know if the lack of microMIPS results would be a problem for
> > this
> > patch's acceptance.
> 
> Not a problem. We will pick this up as part of testing for the release.
> 
> There is a bit of coding style fuzz in the testcases but it is
> pre-existing from the code you duplicated so I don't think it needs
> fixing.

 I missed that indeed, having copied the test cases verbatim.  I wonder if 
we shouldn't actually factor out these test case bodies to shared .h files 
included from corresponding regular MIPS and microMIPS test cases, and 
then have the formatting fixes applied there instead.

> >  Otherwise, OK to apply?
> 
> OK, as long as you can say there is no risk with the new reloc and older
> linkers.

 As double-checked and documented above I can say there is no risk, so I 
have committed this change now.  Thanks for your review.

  Maciej


[PATCH] Fix combine's make_extraction (PR rtl-optimization/78378)

2016-11-16 Thread Jakub Jelinek
Hi!

If inner is a MEM, make_extraction requires that pos is a multiple of bytes
and deals with offsetting it.  Or otherwise requires that pos is a multiple
of BITS_PER_WORD and for REG inner it handles that too.  But if inner
is something different, it calls just force_to_mode to the target mode,
which only really works if pos is 0.

Thus the following patch restricts it to that case.
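
To illustrate why a plain narrowing is only right for pos == 0, here is a
minimal sketch in ordinary C.  The helper names are hypothetical and this is
not the combine.c code; it only models "extract len bits at pos" against
"keep the low len bits", assuming pos + len <= 64.

```c
#include <stdint.h>

/* Hypothetical helper: the LEN-bit field of X that starts at bit POS.
   Assumes len > 0 and pos + len <= 64.  */
uint64_t
extract_field (uint64_t x, unsigned pos, unsigned len)
{
  uint64_t mask = len >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  return (x >> pos) & mask;
}

/* Hypothetical helper: keep only the low LEN bits of X, with no shift.
   This models narrowing a value to a smaller mode.  */
uint64_t
low_bits (uint64_t x, unsigned len)
{
  uint64_t mask = len >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  return x & mask;
}
```

The two helpers agree whenever pos is 0; for any other pos the narrowing
returns the wrong field, which is the case the patch excludes.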

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-16  Jakub Jelinek  

PR rtl-optimization/78378
* combine.c (make_extraction): Use force_to_mode for non-{REG,MEM}
inner only if pos is 0.  Fix up formatting.

* gcc.c-torture/execute/pr78378.c: New test.

--- gcc/combine.c.jj  2016-11-16 18:51:58.0 +0100
+++ gcc/combine.c   2016-11-16 19:52:12.735674000 +0100
@@ -7382,6 +7382,7 @@ make_extraction (machine_mode mode, rtx
   if (tmode != BLKmode
   && ((pos_rtx == 0 && (pos % BITS_PER_WORD) == 0
   && !MEM_P (inner)
+  && (pos == 0 || REG_P (inner))
   && (inner_mode == tmode
   || !REG_P (inner)
   || TRULY_NOOP_TRUNCATION_MODES_P (tmode, inner_mode)
@@ -7458,10 +7459,9 @@ make_extraction (machine_mode mode, rtx
}
   else
new_rtx = force_to_mode (inner, tmode,
-len >= HOST_BITS_PER_WIDE_INT
-? HOST_WIDE_INT_M1U
-: (HOST_WIDE_INT_1U << len) - 1,
-0);
+len >= HOST_BITS_PER_WIDE_INT
+? HOST_WIDE_INT_M1U
+: (HOST_WIDE_INT_1U << len) - 1, 0);
 
   /* If this extraction is going into the destination of a SET,
 make a STRICT_LOW_PART unless we made a MEM.  */
--- gcc/testsuite/gcc.c-torture/execute/pr78378.c.jj  2016-11-16 20:02:16.975248597 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr78378.c   2016-11-16 20:02:03.0 +0100
@@ -0,0 +1,18 @@
+/* PR rtl-optimization/78378 */
+
+unsigned long long __attribute__ ((noinline, noclone))
+foo (unsigned long long x)
+{
+  x <<= 41;
+  x /= 232;
+  return 1 + (unsigned short) x;
+}
+
+int
+main ()
+{
+  unsigned long long x = foo (1);
+  if (x != 0x2c24)
+__builtin_abort();
+  return 0;
+}

Jakub


Re: PR61409: -Wmaybe-uninitialized false-positive with -O2

2016-11-16 Thread Jeff Law

On 11/02/2016 11:16 AM, Aldy Hernandez wrote:

Hi Jeff.

As discussed in the PR, here is a patch exploring your idea of ignoring
unguarded uses if we can prove that the guards for such uses are
invalidated by the uninitialized operand paths being executed.

This is an updated patch from my suggestion in the PR.  It bootstraps
with no regression on x86-64 Linux, and fixes the PR in question.

As the "NOTE:" in the code states, we could be much smarter when
invalidating predicates, but for now let's do straight negation which
works for the simple case.  We could expand on this in the future.

OK for trunk?


curr


commit 8375d7e28c1a798dd0cc0f487d7fa1068d9eb124
Author: Aldy Hernandez 
Date:   Thu Aug 25 10:44:29 2016 -0400

PR middle-end/61409
* tree-ssa-uninit.c (use_pred_not_overlap_with_undef_path_pred):
Remove reference to missing NUM_PREDS in function comment.
(can_one_predicate_be_invalidated_p): New.
(can_chain_union_be_invalidated_p): New.
(flatten_out_predicate_chains): New.
(uninit_ops_invalidate_phi_use): New.
(is_use_properly_guarded): Call uninit_ops_invalidate_phi_use.

[ snip ]



+static bool
+can_one_predicate_be_invalidated_p (pred_info predicate,
+   vec worklist)
+{
+  for (size_t i = 0; i < worklist.length (); ++i)
+{
+  pred_info *p = worklist[i];
+
+  /* NOTE: This is a very simple check, and only understands an
+exact opposite.  So, [i == 0] is currently only invalidated
+by [.NOT. i == 0] or [i != 0].  Ideally we should also
+invalidate with say [i > 5] or [i == 8].  There is certainly
+room for improvement here.  */
+  if (pred_neg_p (predicate, *p))
It's good enough for now.  I saw some other routines that might allow us 
to handle more cases.  I'm OK with faulting those in if/when we see such 
cases in real code.




+
+/* Return TRUE if executing the path to some uninitialized operands in
+   a PHI will invalidate the use of the PHI result later on.
+
+   UNINIT_OPNDS is a bit vector specifying which PHI arguments have
+   arguments which are considered uninitialized.
+
+   USE_PREDS is the pred_chain_union specifying the guard conditions
+   for the use of the PHI result.
+
+   What we want to do is disprove each of the guards in the factors of
+   the USE_PREDS.  So if we have:
+
+   # USE_PREDS guards of:
+   #   1. i > 5 && i < 100
+   #   2. j > 10 && j < 88
Are USE_PREDS joined by an AND or IOR?  I guess based on their type it 
must be IOR.   Thus to get to a use  #1 or #2 must be true.  So to prove 
we can't reach a use, we have to prove that #1 and #2 are both not true. 
 Right?
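
A minimal sketch of that reading, assuming the union is an IOR of AND chains
and that only exact-opposite predicates are recognised.  All types and names
here are simplified stand-ins, not the tree-ssa-uninit.c data structures.

```c
#include <stdbool.h>
#include <stddef.h>

enum cmp_code { CMP_EQ, CMP_NE, CMP_LT, CMP_GE, CMP_GT, CMP_LE };

/* Simplified stand-in for pred_info: "var CODE value", optionally negated.  */
struct pred { int var; enum cmp_code code; long value; bool invert; };

/* Predicates in a chain are ANDed; chains in a union are IORed.  */
struct pred_chain { const struct pred *preds; size_t n; };
struct pred_union { const struct pred_chain *chains; size_t n; };

enum cmp_code
opposite (enum cmp_code c)
{
  switch (c)
    {
    case CMP_EQ: return CMP_NE;
    case CMP_NE: return CMP_EQ;
    case CMP_LT: return CMP_GE;
    case CMP_GE: return CMP_LT;
    case CMP_GT: return CMP_LE;
    default: return CMP_GT;	/* CMP_LE */
    }
}

/* True if A is the exact opposite of B (same variable and constant).  */
bool
pred_is_negation (const struct pred *a, const struct pred *b)
{
  if (a->var != b->var || a->value != b->value)
    return false;
  enum cmp_code ac = a->invert ? opposite (a->code) : a->code;
  enum cmp_code bc = b->invert ? opposite (b->code) : b->code;
  return ac == opposite (bc);
}

/* An AND chain is false as soon as one conjunct is negated by a known
   predicate.  */
bool
chain_invalidated (const struct pred_chain *chain,
		   const struct pred *known, size_t n_known)
{
  for (size_t i = 0; i < chain->n; i++)
    for (size_t j = 0; j < n_known; j++)
      if (pred_is_negation (&chain->preds[i], &known[j]))
	return true;
  return false;
}

/* An IOR of chains is false only if every chain is invalidated.  An empty
   union is conservatively treated as not invalidated.  */
bool
union_invalidated (const struct pred_union *u,
		   const struct pred *known, size_t n_known)
{
  if (u->n == 0)
    return false;
  for (size_t i = 0; i < u->n; i++)
    if (!chain_invalidated (&u->chains[i], known, n_known))
      return false;
  return true;
}
```

With guards #1 and #2 from the comment, knowing only [i <= 5] is not enough;
the use is unreachable only once something in #2 is disproved as well.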




+
+static bool
+uninit_ops_invalidate_phi_use (gphi *phi, unsigned uninit_opnds,
+  pred_chain_union use_preds)
+{
+  /* Look for the control dependencies of all the uninitialized
+ operands and build predicates describing them.  */
+  unsigned i;
+  pred_chain_union uninit_preds[32];
+  memset (uninit_preds, 0, sizeof (pred_chain_union) * 32);
+  for (i = 0; i < MIN (32, gimple_phi_num_args (phi)); i++)
Can you replace the magic "32" with a file scoped const or #define?  I 
believe there's 2 existing uses of a magic "32" elsewhere in 
tree-ssa-uninit.c as well.
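
Something along these lines is what I have in mind.  This is only a sketch:
MAX_PHI_ARGS is a hypothetical name, and the functions are simplified
stand-ins rather than the code in the patch.

```c
/* Hypothetical name for the limit that currently appears as a bare 32.
   It matches the width of the UNINIT_OPNDS bit vector.  */
#define MAX_PHI_ARGS 32

/* Number of PHI arguments that will actually be examined.  */
unsigned
phi_args_to_scan (unsigned num_args)
{
  return num_args < MAX_PHI_ARGS ? num_args : MAX_PHI_ARGS;
}

/* Count the arguments flagged in UNINIT_OPNDS, looking at no more than
   MAX_PHI_ARGS of them.  */
unsigned
count_uninit_args (unsigned uninit_opnds, unsigned num_args)
{
  unsigned count = 0;
  for (unsigned i = 0; i < phi_args_to_scan (num_args); i++)
    if (uninit_opnds & (1u << i))
      count++;
  return count;
}
```

The array size, the memset length and the loop bound would then all refer to
the one name.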



+
+  /* Build the control dependency chain for `i'...  */
+  if (compute_control_dep_chain (find_dom (e->src),
+e->src,
+dep_chains,
+&num_chains,
+&cur_chain,
+&num_calls))

Does this miss the control dependency carried by E itself?

ie, if e->src ends in a conditional, shouldn't that conditional be 
reflected in the control dependency chain as well?  I guess we'd have to 
have the notion of computing the control dependency for an edge rather 
than a block.  It doesn't look like compute_control_dep_chain is ready 
for that.  I'm willing to put that into a "future work" bucket.


So I think just confirming my question about how USE_PREDS are joined at 
the call to uninit_ops_invalidate_phi_use and fixing the magic 32 to be 
a file scoped const or a #define and this is good to go on the trunk.


jeff




[PATCH] Fix up ICEs with TREE_CONSTANT references (PR c++/78373)

2016-11-16 Thread Jakub Jelinek
Hi!

Jason's recent patch to turn reference vars initialized with invariant
addresses broke the first testcase below, because &self->singleton
is considered TREE_CONSTANT (because self is TREE_CONSTANT VAR_DECL and
singleton field has constant offset), but after going into SSA form
it is not supposed to be TREE_CONSTANT anymore (_2->singleton),
because SSA_NAMEs don't have TREE_CONSTANT set on them.

The following patch fixes it by gimplifying such vars into their
DECL_INITIAL unless in OpenMP regions, where such folding is deferred
until omplower pass finishes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-16  Jakub Jelinek  

PR c++/78373
* gimplify.c (gimplify_decl_expr): For TREE_CONSTANT reference
vars with is_gimple_min_invariant initializer after stripping
useless conversions keep non-NULL DECL_INITIAL.
(gimplify_var_or_parm_decl): Add fallback argument.  Gimplify
TREE_CONSTANT reference vars with is_gimple_min_invariant initializer
outside of OpenMP contexts to the initializer if fb_rvalue is
allowed.
(gimplify_compound_lval, gimplify_expr): Pass through fallback
argument to gimplify_var_or_parm_decl.
* omp-low.c (lower_omp_regimplify_p): Return non-zero for
TREE_CONSTANT reference vars with is_gimple_min_invariant
initializer.

* g++.dg/opt/pr78373.C: New test.
* g++.dg/gomp/pr78373.C: New test.

--- gcc/gimplify.c.jj   2016-11-10 12:34:12.0 +0100
+++ gcc/gimplify.c  2016-11-16 16:15:13.496510476 +0100
@@ -1645,10 +1645,23 @@ gimplify_decl_expr (tree *stmt_p, gimple
{
  if (!TREE_STATIC (decl))
{
+ tree new_init = NULL_TREE;
+ if (TREE_CONSTANT (decl)
+ && TREE_CODE (TREE_TYPE (decl)) == REFERENCE_TYPE)
+   {
+ new_init = init;
+ STRIP_USELESS_TYPE_CONVERSION (new_init);
+ if (is_gimple_min_invariant (new_init))
+   new_init = unshare_expr (new_init);
+ else
+   new_init = NULL_TREE;
+   }
  DECL_INITIAL (decl) = NULL_TREE;
  init = build2 (INIT_EXPR, void_type_node, decl, init);
  gimplify_and_add (init, seq_p);
  ggc_free (init);
+ if (new_init && DECL_INITIAL (decl) == NULL_TREE)
+   DECL_INITIAL (decl) = new_init;
}
  else
/* We must still examine initializers for static variables
@@ -2546,7 +2559,7 @@ static tree nonlocal_vla_vars;
DECL_VALUE_EXPR, and it's worth re-examining things.  */
 
 static enum gimplify_status
-gimplify_var_or_parm_decl (tree *expr_p)
+gimplify_var_or_parm_decl (tree *expr_p, fallback_t fallback)
 {
   tree decl = *expr_p;
 
@@ -2607,6 +2620,29 @@ gimplify_var_or_parm_decl (tree *expr_p)
   return GS_OK;
 }
 
+  /* Gimplify TREE_CONSTANT C++ reference vars as their DECL_INITIAL.  */
+  if (VAR_P (decl)
+  && TREE_CONSTANT (decl)
+  && TREE_CODE (TREE_TYPE (decl)) == REFERENCE_TYPE
+  && DECL_INITIAL (decl)
+  && (fallback & fb_rvalue) != 0
+  && is_gimple_min_invariant (DECL_INITIAL (decl)))
+{
+  if (gimplify_omp_ctxp)
+   {
+ /* Don't do this in OpenMP regions, see e.g. libgomp.c++/target-14.C
+testcase.  We'll do that in the omplower pass later instead.  */
+ if ((gimplify_omp_ctxp->region_type != ORT_TARGET
+  || gimplify_omp_ctxp->outer_context != NULL
+  || !lookup_attribute ("omp declare target",
+DECL_ATTRIBUTES (cfun->decl)))
+ && gimplify_omp_ctxp->region_type != ORT_NONE)
+   return GS_ALL_DONE;
+   }
+  *expr_p = unshare_expr (DECL_INITIAL (decl));
+  return GS_OK;
+}
+
   return GS_ALL_DONE;
 }
 
@@ -2712,7 +2748,7 @@ gimplify_compound_lval (tree *expr_p, gi
   /* Expand DECL_VALUE_EXPR now.  In some cases that may expose
 additional COMPONENT_REFs.  */
   else if ((VAR_P (*p) || TREE_CODE (*p) == PARM_DECL)
-  && gimplify_var_or_parm_decl (p) == GS_OK)
+  && gimplify_var_or_parm_decl (p, fallback) == GS_OK)
goto restart;
   else
break;
@@ -11624,7 +11660,7 @@ gimplify_expr (tree *expr_p, gimple_seq
 
case VAR_DECL:
case PARM_DECL:
- ret = gimplify_var_or_parm_decl (expr_p);
+ ret = gimplify_var_or_parm_decl (expr_p, fallback);
  break;
 
case RESULT_DECL:
--- gcc/omp-low.c.jj  2016-11-16 09:23:46.0 +0100
+++ gcc/omp-low.c   2016-11-16 15:44:35.071617469 +0100
@@ -16963,6 +16963,14 @@ lower_omp_regimplify_p (tree *tp, int *w
   && bitmap_bit_p (task_shared_vars, DECL_UID (t)))
 return t;
 
+  /* And C++ references with TREE_CONSTANT set too.  */
+  if (VAR_P (t)
+  && TREE_CONSTANT (t)
+  && 

Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-16 Thread Mike Stump
On Nov 16, 2016, at 12:09 PM, Andrew Burgess  
wrote:
> My only remaining concern is the new tests, I've tried to restrict
> them to targets that I suspect they'll pass on with:
> 
>/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target *-*-linux* *-*-gnu* } } } */
> 
> but I'm still nervous that I'm going to introduce test failures.  Is
> there any advice / guidance I should follow before I commit, or are
> folk pretty relaxed so long as I've made a reasonable effort?

So, if you are worried about the way the line is constructed, I usually test it 
by misspelling the *-*-linux* *-*-gnu* part as *-*-linNOTux* *-*-gnNOTu* and 
see if the test then doesn't run on your machine.  If it doesn't then you can 
be pretty confident that only machines that match the target triplet can be 
impacted.  I usually do this type of testing by running the test case in 
isolation (not the full tests suite).  Anyway, do the best you can, and don't 
worry about t it too much, learn from the experience, even if it goes wrong in 
some way.  If it did go wrong, just be responsive (don't check it in just 
before a 6 week vacation) about fixing it, if you can.



Re: [PATCH] Follow-up patch on enabling new AVX512 instructions

2016-11-16 Thread Uros Bizjak
On Tue, Nov 15, 2016 at 10:50 PM, Andrew Senkevich
 wrote:
> Hi,
>
> this is follow-up with tests for new __target__ attributes and
> __builtin_cpu_supports update.
>
> gcc/
> * config/i386/i386.c (processor_features): Add
> F_AVX5124VNNIW, F_AVX5124FMAPS.
> (isa_names_table): Handle new features.
> libgcc/
> * config/i386/cpuinfo.c (processor_features): Add
> FEATURE_AVX5124VNNIW, FEATURE_AVX5124FMAPS.
> gcc/testsuite/
> * gcc.target/i386/builtin_target.c: Handle new "avx5124vnniw",
> "avx5124fmaps".
> * gcc.target/i386/funcspec-56.inc: Test new attributes.

OK.

Thanks,
Uros.

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 1da1abc..823930d
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -33205,6 +33205,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
>  F_AVX512PF,
>  F_AVX512VBMI,
>  F_AVX512IFMA,
> +F_AVX5124VNNIW,
> +F_AVX5124FMAPS,
>  F_MAX
>};
>
> @@ -33317,6 +33319,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
>{"avx512pf",F_AVX512PF},
>{"avx512vbmi",F_AVX512VBMI},
>{"avx512ifma",F_AVX512IFMA},
> +  {"avx5124vnniw",F_AVX5124VNNIW},
> +  {"avx5124fmaps",F_AVX5124FMAPS},
>  };
>
>tree __processor_model_type = build_processor_model_struct ();
> diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c
> b/gcc/testsuite/gcc.target/i386/builtin_target.c
> index 8d45d83..c620a74
> --- a/gcc/testsuite/gcc.target/i386/builtin_target.c
> +++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
> @@ -213,6 +213,10 @@ check_features (unsigned int ecx, unsigned int edx,
> assert (__builtin_cpu_supports ("avx512ifma"));
>if (ecx & bit_AVX512VBMI)
> assert (__builtin_cpu_supports ("avx512vbmi"));
> +  if (edx & bit_AVX5124VNNIW)
> +   assert (__builtin_cpu_supports ("avx5124vnniw"));
> +  if (edx & bit_AVX5124FMAPS)
> +   assert (__builtin_cpu_supports ("avx5124fmaps"));
>  }
>  }
>
> @@ -311,6 +315,10 @@ quick_check ()
>
>assert (__builtin_cpu_supports ("avx512f") >= 0);
>
> +  assert (__builtin_cpu_supports ("avx5124vnniw") >= 0);
> +
> +  assert (__builtin_cpu_supports ("avx5124fmaps") >= 0);
> +
>/* Check CPU type.  */
>assert (__builtin_cpu_is ("amd") >= 0);
>
> diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
> index af203f2..4a0ad25
> --- a/libgcc/config/i386/cpuinfo.c
> +++ b/libgcc/config/i386/cpuinfo.c
> @@ -115,7 +115,9 @@ enum processor_features
>FEATURE_AVX512ER,
>FEATURE_AVX512PF,
>FEATURE_AVX512VBMI,
> -  FEATURE_AVX512IFMA
> +  FEATURE_AVX512IFMA,
> +  FEATURE_AVX5124VNNIW,
> +  FEATURE_AVX5124FMAPS
>  };
>
>  struct __processor_model
> @@ -359,6 +361,10 @@ get_available_features (unsigned int ecx, unsigned int 
> edx,
> features |= (1 << FEATURE_AVX512IFMA);
>if (ecx & bit_AVX512VBMI)
> features |= (1 << FEATURE_AVX512VBMI);
> +  if (edx & bit_AVX5124VNNIW)
> +   features |= (1 << FEATURE_AVX5124VNNIW);
> +  if (edx & bit_AVX5124FMAPS)
> +   features |= (1 << FEATURE_AVX5124FMAPS);
>  }
>
>unsigned int ext_level;
> diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> index 521ac8a..9334e9e 100644
> --- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> +++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> @@ -28,6 +28,8 @@ extern void test_avx512dq(void)
>  __attribute__((__target__("avx512dq")));
>  extern void test_avx512er(void)
> __attribute__((__target__("avx512er")));
>  extern void test_avx512pf(void)
> __attribute__((__target__("avx512pf")));
>  extern void test_avx512cd(void)
> __attribute__((__target__("avx512cd")));
> +extern void test_avx5124fmaps(void)
> __attribute__((__target__("avx5124fmaps")));
> +extern void test_avx5124vnniw(void)
> __attribute__((__target__("avx5124vnniw")));
>  extern void test_bmi (void)
> __attribute__((__target__("bmi")));
>  extern void test_bmi2 (void)
> __attribute__((__target__("bmi2")));
>
> @@ -59,6 +61,8 @@ extern void test_no_avx512dq(void)
> __attribute__((__target__("no-avx512dq")));
>  extern void test_no_avx512er(void)
> __attribute__((__target__("no-avx512er")));
>  extern void test_bo_avx512pf(void)
> __attribute__((__target__("no-avx512pf")));
>  extern void test_no_avx512cd(void)
> __attribute__((__target__("no-avx512cd")));
> +extern void test_no_avx5124fmaps(void)
> __attribute__((__target__("no-avx5124fmaps")));
> +extern void test_no_avx5124vnniw(void)
> __attribute__((__target__("no-avx5124vnniw")));
>  extern void test_no_bmi (void)
> __attribute__((__target__("no-bmi")));
>  extern void test_no_bmi2 (void)
> __attribute__((__target__("no-bmi2")));
>
>
> --
> WBR,
> Andrew


Re: [2/9] Encoding support for AArch64 DWARF operations

2016-11-16 Thread Jason Merrill
On Wed, Nov 16, 2016 at 12:50 PM, Jason Merrill  wrote:
> On Fri, Nov 11, 2016 at 1:33 PM, Jiong Wang  wrote:
>> The encoding for new added AARCH64 DWARF operations.
>
> This patch seems rather incomplete; I only see a change to
> dwarf2out.c, which won't compile since the opcodes aren't defined
> anywhere.

Sorry, now I see the rest of the patchset.

Jason


Re: Fix PR78154

2016-11-16 Thread Martin Sebor

On 11/16/2016 11:49 AM, Prathamesh Kulkarni wrote:

Hi Richard,
Following your suggestion in PR78154, the patch checks if stmt
contains call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
and returns true in that case.


Nice.  I think the list should also include mempcpy, stpcpy, and
stpncpy, and probably also the corresponding checking built-ins
such as __builtin___memcpy_chk.

FWIW, a more general solution to consider (possibly for GCC 8)
might be to extend attribute nonnull to apply to a function's return
value as well (e.g., use zero as the index for that), to indicate
that a pointer returned from it is not null.  That would let library
implementers annotate other functions (such as strerror).
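
For comparison, GCC and Clang already accept a separate returns_nonnull
function attribute that expresses the same guarantee.  A minimal sketch of
how an annotated function might look; describe_error and
description_is_missing are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* returns_nonnull is an existing GCC/Clang function attribute; fall back
   to nothing on other compilers.  */
#if defined (__GNUC__)
# define RETURNS_NONNULL __attribute__ ((returns_nonnull))
#else
# define RETURNS_NONNULL
#endif

/* Hypothetical library function that never returns a null pointer.  */
RETURNS_NONNULL const char *describe_error (int code);

const char *
describe_error (int code)
{
  switch (code)
    {
    case 0: return "success";
    case 1: return "not found";
    default: return "unknown error";
    }
}

/* With the annotation visible, the compiler is allowed to fold this
   null check to 0.  */
int
description_is_missing (int code)
{
  return describe_error (code) == NULL;
}
```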

Martin



PR78319

2016-11-16 Thread Prathamesh Kulkarni
Hi,
As discussed in PR, this patch marks the test-case to xfail on arm-none-eabi.
OK to commit ?

Thanks,
Prathamesh
2016-11-17  Prathamesh Kulkarni  

PR tree-optimization/78319

testsuite/
* gcc.dg/uninit-pred-8_a.c (foo): Mark dg-bogus test to xfail on
arm-none-eabi.

diff --git a/gcc/testsuite/gcc.dg/uninit-pred-8_a.c b/gcc/testsuite/gcc.dg/uninit-pred-8_a.c
index 1b7c472..c45fba0 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-8_a.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-8_a.c
@@ -16,8 +16,9 @@ int foo (int n, int l, int m, int r)
   if (m) g++;
   else   bar();
 
+  /* marking this test as xfail on arm-none-eabi, see PR78319.  */
   if ( n ||  m || r || l)
-  blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
+  blah(v); /* { dg-bogus "uninitialized" "bogus warning" { xfail arm-none-eabi } } */
 
   if ( n )
   blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */


Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-16 Thread Andrew Burgess
* Bernd Schmidt  [2016-11-03 13:01:32 +0100]:

> On 09/14/2016 03:00 PM, Andrew Burgess wrote:
> > In an attempt to get this patch merged (as I still think that it's
> > correct) I've investigated, and documented a little more about how I
> > think things currently work.  I'm sure most people reading this will
> > already know this, but hopefully, if my understanding is wrong someone
> > can point it out.
> > gcc/ChangeLog:
> > 
> > * gcc/bb-reorder.c: Remove 'toplev.h' include.
> > (pass_partition_blocks::gate): No longer check
> > user_defined_section_attribute, instead check the function decl
> > for a section attribute.
> > * gcc/c-family/c-common.c (handle_section_attribute): No longer
> > set user_defined_section_attribute.
> > * gcc/final.c (rest_of_handle_final): Likewise.
> > * gcc/toplev.c: Remove definition of user_defined_section_attribute.
> > * gcc/toplev.h: Remove declaration of
> > user_defined_section_attribute.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.dg/tree-prof/section-attr-1.c: New file.
> > * gcc.dg/tree-prof/section-attr-2.c: New file.
> > * gcc.dg/tree-prof/section-attr-3.c: New file.
> 
> I think the explanation is perfectly reasonable and the patch looks good,
> except:
> 
> > +__attribute__((noinline))
> 
> Add noclone to all of these as well.

Thanks.  Considering Jeff said, I'm thinking about it, and you've said
yes, and given Jeff's not got back, I'm considering this patch
approved (with the fix you suggest).

My only remaining concern is the new tests, I've tried to restrict
them to targets that I suspect they'll pass on with:

/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target *-*-linux* *-*-gnu* } } } */

but I'm still nervous that I'm going to introduce test failures.  Is
there any advice / guidance I should follow before I commit, or are
folk pretty relaxed so long as I've made a reasonable effort?

Thanks,
Andrew


Re: [PATCH] Handle --enable-checking={yes,assert,release} in libcpp (PR bootstrap/72823)

2016-11-16 Thread Richard Biener
On November 16, 2016 7:22:51 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>As mentioned in the PR, libcpp uses gcc_assert in a couple of places,
>but guards it with ENABLE_ASSERT_CHECKING macro that is never defined
>in libcpp.
>
>This patch arranges for it to be defined if ENABLE_ASSERT_CHECKING
>is going to be defined in gcc subdir.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-11-16  Jakub Jelinek  
>
>   PR bootstrap/72823
>   * configure.ac (ENABLE_ASSERT_CHECKING): Define if gcc configure
>   would define that macro.
>   * configure: Regenerated.
>   * config.in: Regenerated.
>
>--- libcpp/configure.ac.jj  2016-05-20 12:44:36.0 +0200
>+++ libcpp/configure.ac  2016-11-15 16:40:16.753842880 +0100
>@@ -152,9 +152,11 @@ for check in release $ac_checking_flags
> do
>   case $check in
>   # these set all the flags to specific states
>-  yes|all) ac_checking=1 ; ac_valgrind_checking= ;;
>-  no|none|release) ac_checking= ; ac_valgrind_checking= ;;
>+  yes|all) ac_checking=1 ; ac_assert_checking=1 ; ac_valgrind_checking=
>;;
>+  no|none) ac_checking= ; ac_assert_checking= ; ac_valgrind_checking=
>;;
>+  release) ac_checking= ; ac_assert_checking=1 ; ac_valgrind_checking=
>;;
>   # these enable particular checks
>+  assert) ac_assert_checking=1 ;;
>   misc) ac_checking=1 ;;
>   valgrind) ac_valgrind_checking=1 ;;
>   # accept
>@@ -170,6 +172,11 @@ else
>   AC_DEFINE(CHECKING_P, 0)
> fi
> 
>+if test x$ac_assert_checking != x ; then
>+  AC_DEFINE(ENABLE_ASSERT_CHECKING, 1,
>+[Define if you want assertions enabled.  This is a cheap check.])
>+fi
>+
> if test x$ac_valgrind_checking != x ; then
>   AC_DEFINE(ENABLE_VALGRIND_CHECKING, 1,
>[Define if you want to workaround valgrind (a memory checker) warnings
>about
>--- libcpp/configure.jj  2016-05-20 12:44:36.0 +0200
>+++ libcpp/configure   2016-11-15 16:40:35.331607679 +0100
>@@ -7288,9 +7288,11 @@ for check in release $ac_checking_flags
> do
>   case $check in
>   # these set all the flags to specific states
>-  yes|all) ac_checking=1 ; ac_valgrind_checking= ;;
>-  no|none|release) ac_checking= ; ac_valgrind_checking= ;;
>+  yes|all) ac_checking=1 ; ac_assert_checking=1 ; ac_valgrind_checking=
>;;
>+  no|none) ac_checking= ; ac_assert_checking= ; ac_valgrind_checking=
>;;
>+  release) ac_checking= ; ac_assert_checking=1 ; ac_valgrind_checking=
>;;
>   # these enable particular checks
>+  assert) ac_assert_checking=1 ;;
>   misc) ac_checking=1 ;;
>   valgrind) ac_valgrind_checking=1 ;;
>   # accept
>@@ -7308,6 +7310,12 @@ else
> 
> fi
> 
>+if test x$ac_assert_checking != x ; then
>+
>+$as_echo "#define ENABLE_ASSERT_CHECKING 1" >>confdefs.h
>+
>+fi
>+
> if test x$ac_valgrind_checking != x ; then
> 
> $as_echo "#define ENABLE_VALGRIND_CHECKING 1" >>confdefs.h
>--- libcpp/config.in.jj  2016-05-20 12:44:36.0 +0200
>+++ libcpp/config.in   2016-11-15 16:40:38.0 +0100
>@@ -14,6 +14,9 @@
> /* Define to 1 if using `alloca.c'. */
> #undef C_ALLOCA
> 
>+/* Define if you want assertions enabled. This is a cheap check. */
>+#undef ENABLE_ASSERT_CHECKING
>+
> /* Define to enable system headers canonicalization. */
> #undef ENABLE_CANONICAL_SYSTEM_HEADERS
> 
>
>   Jakub




[PING^2][PATCH][aarch64] Improve Logical And Immediate Expressions

2016-11-16 Thread Michael Collison
Ping^2. Link to original post:

https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02305.html


Re: [PATCH] Fix NetBSD bootstrap

2016-11-16 Thread Krister Walfridsson

On Wed, 16 Nov 2016, Mike Stump wrote:

Looks reasonable.  The biggest issue would be if any of those values 
changed through time, and the current version works for older netbsd 
releases, the patch could break them.  Of course, I don't have any 
visibility into how any of those values might have changed through time.


This should not be an issue in this case, so I'll commit the patch. 
Thanks!


   /Krister


[PATCH v2 1/2, i386] cmpstrnsi needs string length

2016-11-16 Thread Aaron Sawdey
This patch adds a test to the cmpstrnsi pattern in i386.md so that it
will bail out (FAIL) if neither of the strings is a constant string. It
can only work as a proper strncmp if the length is not longer than both
of the strings. This change is required if expand_builtin_strncmp is
going to try expansion of strncmp when neither string argument is
constant. I've also changed the pattern to indicate that operand 3 may
be clobbered (if it happens to be in cx already).
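
To make the failure mode concrete, here is a minimal sketch in plain C.  The
helper names are hypothetical: compare_bytes models the byte-by-byte
behaviour described above (stop only at the first difference or after N
bytes), and compare_with_known_length models clamping the length to that of
a known constant string, terminator included.

```c
#include <stddef.h>
#include <string.h>

/* Model of a raw byte compare: look at up to N bytes and stop only at the
   first difference, never at a 0 byte.  */
int
compare_bytes (const char *a, const char *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return (unsigned char) a[i] < (unsigned char) b[i] ? -1 : 1;
  return 0;
}

/* Same compare, but with the length clamped to the length of the known
   string KONST including its terminating 0 byte.  Nothing past the
   terminator is ever examined.  */
int
compare_with_known_length (const char *a, const char *konst, size_t n)
{
  size_t limit = strlen (konst) + 1;
  return compare_bytes (a, konst, n < limit ? n : limit);
}
```

For two buffers holding "ab" followed by different bytes after the
terminator, strncmp with a length of 4 reports equality, the unclamped
compare does not, and the clamped one agrees with strncmp again.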

2016-11-16  Aaron Sawdey  

* config/i386/i386.md (cmpstrnsi): New test to bail out if neither
string input is a string constant.  Clobber length argument.


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md	(revision 242428)
+++ gcc/config/i386/i386.md	(working copy)
@@ -16898,7 +16898,7 @@
   [(set (match_operand:SI 0 "register_operand")
 	(compare:SI (match_operand:BLK 1 "general_operand")
 		(match_operand:BLK 2 "general_operand")))
-   (use (match_operand 3 "general_operand"))
+   (clobber (match_operand 3 "general_operand"))
(use (match_operand 4 "immediate_operand"))]
   ""
 {
@@ -16911,6 +16911,21 @@
   if (fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])
 FAIL;
 
+  /* One of the strings must be a constant.  If so, expand_builtin_strncmp()
+ will have rewritten the length arg to be the minimum of the const string
+ length and the actual length arg.  If both strings are the same and
+ shorter than the length arg, repz cmpsb will not stop at the 0 byte and
+ will incorrectly base the results on chars past the 0 byte.  */
+  tree t1 = MEM_EXPR (operands[1]);
+  tree t2 = MEM_EXPR (operands[2]);
+  if (!((t1 && TREE_CODE (t1) == MEM_REF
+ && TREE_CODE (TREE_OPERAND (t1, 0)) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (TREE_OPERAND (t1, 0), 0)) == STRING_CST)
+  || (t2 && TREE_CODE (t2) == MEM_REF
+  && TREE_CODE (TREE_OPERAND (t2, 0)) == ADDR_EXPR
+  && TREE_CODE (TREE_OPERAND (TREE_OPERAND (t2, 0), 0)) == STRING_CST)))
+FAIL;
+
   out = operands[0];
   if (!REG_P (out))
 out = gen_reg_rtx (SImode);


[PATCH v2 2/2, expand] make expand_builtin_strncmp more general

2016-11-16 Thread Aaron Sawdey
This patch makes expand_builtin_strncmp attempt to expand via cmpstrnsi
even if neither of the string arguments are string constants.

2016-11-16  Aaron Sawdey  

* builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp
via cmpstrnsi even if neither string is constant.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 242428)
+++ gcc/builtins.c	(working copy)
@@ -3918,7 +3918,7 @@
   insn_code cmpstrn_icode = direct_optab_handler (cmpstrn_optab, SImode);
   if (cmpstrn_icode != CODE_FOR_nothing)
   {
-tree len, len1, len2;
+tree len, len1, len2, len3;
 rtx arg1_rtx, arg2_rtx, arg3_rtx;
 rtx result;
 tree fndecl, fn;
@@ -3937,14 +3937,19 @@
 if (len2)
   len2 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len2);
 
+len3 = fold_convert_loc (loc, sizetype, arg3);
+
 /* If we don't have a constant length for the first, use the length
-   of the second, if we know it.  We don't require a constant for
+   of the second, if we know it.  If neither string is constant length,
+   use the given length argument.  We don't require a constant for
this case; some cost analysis could be done if both are available
but neither is constant.  For now, assume they're equally cheap,
unless one has side effects.  If both strings have constant lengths,
use the smaller.  */
 
-if (!len1)
+if (!len1 && !len2)
+  len = len3;
+else if (!len1)
   len = len2;
 else if (!len2)
   len = len1;
@@ -3961,23 +3966,10 @@
 else
   len = len2;
 
-/* If both arguments have side effects, we cannot optimize.  */
-if (!len || TREE_SIDE_EFFECTS (len))
-  return NULL_RTX;
-
-/* The actual new length parameter is MIN(len,arg3).  */
-len = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (len), len,
-		   fold_convert_loc (loc, TREE_TYPE (len), arg3));
-
-/* If we don't have POINTER_TYPE, call the function.  */
-if (arg1_align == 0 || arg2_align == 0)
-  return NULL_RTX;
-
-/* Stabilize the arguments in case gen_cmpstrnsi fails.  */
-arg1 = builtin_save_expr (arg1);
-arg2 = builtin_save_expr (arg2);
-len = builtin_save_expr (len);
-
+/* If we are not using the given length, we must incorporate it here.
+   The actual new length parameter will be MIN(len,arg3) in this case.  */
+if (len != len3)
+  len = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (len), len, len3);
 arg1_rtx = get_memory_rtx (arg1, len);
 arg2_rtx = get_memory_rtx (arg2, len);
 arg3_rtx = expand_normal (len);
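
The length selection described in the comment of this hunk can be sketched in plain C. This is only an illustration under assumptions: `choose_cmp_len` and `UNKNOWN_LEN` are hypothetical names, `size_t` values stand in for GCC trees, and the side-effect checks of the real code are left out.

```c
#include <stddef.h>

/* Hypothetical marker meaning "c_strlen could not determine the length".  */
#define UNKNOWN_LEN ((size_t) -1)

/* Sketch only: len1 and len2 are the known string lengths plus one
   (for the terminating NUL) or UNKNOWN_LEN; n is the strncmp length
   argument.  */
static size_t
choose_cmp_len (size_t len1, size_t len2, size_t n)
{
  size_t len;

  if (len1 == UNKNOWN_LEN && len2 == UNKNOWN_LEN)
    return n;                   /* Neither length known: use n as given.  */
  else if (len1 == UNKNOWN_LEN)
    len = len2;
  else if (len2 == UNKNOWN_LEN)
    len = len1;
  else
    len = len1 < len2 ? len1 : len2;

  /* A known length was picked, so clamp it: MIN (len, n).  */
  return len < n ? len : n;
}
```

With both lengths unknown the function simply returns the length argument, which corresponds to the new `len = len3` branch in the patch.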


[PATCH v2 0/2] strncmp builtin expansion improvement

2016-11-16 Thread Aaron Sawdey
Builtin expansion of strncmp currently only happens when at least one
of the string arguments is a constant string. Two pieces are needed to
enable it when neither argument is constant:

1) Fix the i386.md cmpstrnsi pattern. It uses repz cmpsb, which does not
actually test for the zero byte ending the string, so the pattern is
only valid when the length of one of the strings is known. The patch
adds a test that one of the string args is a string constant, in
which case expand_builtin_strncmp will have made sure the length arg is
no larger than this known length. I've also changed the pattern to
reflect the fact that the generated code can clobber operand 3 if it
happens to be in cx.

2) If c_strlen () was unable to determine the length of either string,
expand_builtin_strncmp will now use only the length argument and will
proceed anyway. Also I've removed a couple pieces that Richard
indicated are not needed any more.

Bootstrap & regtest passed on x86_64 with svn 242454, ok for trunk?

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
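
The difference between a raw repz cmpsb style loop and strncmp semantics, as described in point 1, can be illustrated with a small C sketch. The function names are made up for this example and the loops only model the behaviour; they are not the code GCC generates.

```c
#include <stddef.h>

/* Compares exactly n bytes, like a raw "repz cmpsb" loop: it never
   checks for a terminating NUL byte.  */
static int
cmp_bytes_no_nul (const char *a, const char *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return (unsigned char) a[i] < (unsigned char) b[i] ? -1 : 1;
  return 0;
}

/* strncmp-style comparison: additionally stops once both strings
   have reached their terminating NUL.  */
static int
cmp_strncmp_like (const char *a, const char *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if (a[i] != b[i])
        return (unsigned char) a[i] < (unsigned char) b[i] ? -1 : 1;
      if (a[i] == '\0')
        return 0;
    }
  return 0;
}
```

For two buffers holding "ab" followed by different bytes after the NUL, the second function reports equality while the first one does not, unless n is clamped to the known string length first.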



Re: [PATCH] Fix NetBSD bootstrap

2016-11-16 Thread Mike Stump

> On Nov 16, 2016, at 9:12 AM, Krister Walfridsson 
>  wrote:
> 
> NetBSD fails bootstrap with
>  stdatomic.h:55:17: error: unknown type name '__INT_LEAST8_TYPE__'
> This is fixed by the following patch (only i386 and x86_64 for now. I'll
> do the other ports after fixing some more issues -- the NetBSD support is
> rather broken at the moment...)
> 
> I'm the NetBSD maintainer, so I believe I don't need approval to commit this. 
> But I have been absent for a long time, so it makes sense for someone to 
> review at least this first patch.

Looks reasonable.  The biggest risk is that if any of those values changed 
over time, and the current version works for older NetBSD releases, the 
patch could break them.  Of course, I don't have any visibility into how any of 
those values might have changed over time.



Re: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

2016-11-16 Thread Mike Stump
On Nov 16, 2016, at 10:58 AM, Mike Stump  wrote:
> 
> Yeah, I easily could have approved it as well, so no worries.

Oh.  I see I did approve the original patch, sorry for not catching it.  Thanks 
for all your work.



Re: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

2016-11-16 Thread Mike Stump
On Nov 16, 2016, at 7:57 AM, Tamar Christina  wrote:
> 
> Forgot to include the committed patch.

>>> This is causing all test names to depend on $srcdir.  A test name
>>> should never include the value of $srcdir.
>> 
>> Sorry about that, committed a fix as r242500 under the obvious rule.

Yeah, I easily could have approved it as well, so no worries.

The patch is Ok.

I usually catch this by reviewing the output of ./contrib/compare_tests: 
it complains about a ton of new tests not passing, which is a dead 
giveaway.



Re: Fix PR78154

2016-11-16 Thread Jakub Jelinek
On Thu, Nov 17, 2016 at 12:19:37AM +0530, Prathamesh Kulkarni wrote:
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -1069,6 +1069,34 @@ gimple_assign_nonzero_warnv_p (gimple *stmt, bool 
> *strict_overflow_p)
>  }
>  }
>  
> +/* Return true if STMT is known to contain call to a string-builtin function
> +   that is known to return nonnull.  */
> +
> +static bool
> +gimple_str_nonzero_warnv_p (gimple *stmt)
> +{
> +  if (!is_gimple_call (stmt))
> +return false;

Shouldn't this be:
  if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
return false;

> +
> +  tree fndecl = gimple_call_fndecl (stmt);

> +  if (!fndecl || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
> +return false;

And drop the above 2 lines?

That way you also verify the arguments for sanity.

Otherwise I'll defer to Richard.

Jakub


Re: [RFC PATCH] avoid printing type suffix with %E

2016-11-16 Thread Martin Sebor

On 11/16/2016 11:34 AM, Jeff Law wrote:

On 10/26/2016 10:37 AM, Martin Sebor wrote:

When formatting an integer constant using the %E directive GCC
includes a suffix that indicates its type.  This can perhaps be
useful in some situations but in my experience it's distracting
and gets in the way when writing tests.

Here's an example:

  $ cat b.c && gcc b.c
  constexpr __SIZE_TYPE__ x = 2;

  enum E: bool { e = x };
  b.c:3:20: error: enumerator value 2ul is outside the range of
underlying type ‘bool’
   enum E: bool { e = x };
  ^

Notice the "2ul" in the error message.

As far as I can tell, Clang avoids printing the suffix and I think
it would be nice if the GCC pretty printer made it possible to avoid
it as well.

The attached patch implements one such approach by having the pretty
printer recognize the space format flag to suppress the type suffix,
so "%E" still prints the suffix but "% E" does not.  I did this to
preserve the existing output but I think it would be nicer to avoid
printing the suffix with %E and treat (for instance) the pound sign
as a request to add the suffix.  I have tested the attached patch
but not the alternative.

Does anyone have any comments/suggestions for which of the two
approaches would be preferable (or what I may have missed here)?
I CC David as the diagnostic maintainer.

I'm having a hard time seeing how this is a significant issue, even when
writing tests.

It also seems to me that relaying the type of the constant as a suffix
would help in cases that aren't so obvious.

What am I missing?


I don't think it's terribly important, more of a nuisance.  Tests
that check the value printed by the %E directive (I've been writing
lots of those lately -- see for example (*)) have to consistently
use this pattern:

\[0-9\]+\[lu\]*

When the type of the %E argument is a type like size_t or similar
that can be an alias for unsigned long or unsigned int, it's easy
to make a mistake and hardcode either

\[0-9\]+lu

or

\[0-9\]+u

based on the target where the test is being developed and end
up with failures on targets where the actual type is the other.
Copying test cases exercising one type to those exercising the
other (say from int to long) is also more tedious than it would
be without the suffix.

Beyond tests, I have never found the suffix helpful in warnings
or errors, but I also haven't seen too many of them in released
versions of GCC.  With the work I've been doing on buffer
overflow where size expressions are routinely included in
the diagnostics, there are lots more of them.  In some (e.g.,
in all the -Wformat-length) I've taken care to avoid printing
the suffix by converting tree nodes to HOST_WIDE_INT.  But that's
cumbersome and error-prone, and leads to inconsistent output from
GCC for different diagnostics that don't do the same.

Martin

[*] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01672.html
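
To illustrate why the expected suffix differs between targets, here is a small C11 sketch that picks the literal suffix from the type of an expression. `INT_SUFFIX` and `size_t_suffix` are hypothetical helpers written for this note, not part of GCC's pretty printer.

```c
#include <stddef.h>

/* Sketch: yields the integer-literal suffix matching the type of the
   argument, similar in spirit to what %E appends to a constant.  */
#define INT_SUFFIX(x) _Generic ((x),            \
    unsigned int:       "u",                    \
    unsigned long:      "ul",                   \
    unsigned long long: "ull",                  \
    default:            "")

/* The suffix a size_t constant would get on the current target.  */
static const char *
size_t_suffix (void)
{
  return INT_SUFFIX ((size_t) 0);
}
```

The result of `size_t_suffix ()` depends on how the target defines size_t, which is the reason a test that hardcodes one suffix can fail elsewhere.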


Re: [PATCH v2] bb-reorder: Improve compgotos pass (PR71785)

2016-11-16 Thread Jeff Law

On 11/01/2016 10:27 AM, Segher Boessenkool wrote:

For code like the testcase in PR71785 GCC factors all the indirect branches
to a single dispatcher that then everything jumps to.  This is because
having many indirect branches, each with many jump targets, does not scale
in large parts of the compiler.  Very late in the pass pipeline (right
before peephole2) the indirect branches are then unfactored again, by
the duplicate_computed_gotos pass.

This pass works by replacing branches to such a common dispatcher by a
copy of the dispatcher.  For code like this testcase this does not work
so well: most cases do a single addition instruction right before the
dispatcher, but not all, and we end up with only two indirect jumps: the
one without the addition, and the one with the addition in its own basic
block, and now everything else jumps _there_.

This patch solves this problem by simply running the core of the
duplicate_computed_gotos pass again, as long as it does any work.  The
patch looks much bigger than it is, because I factored out two routines
to simplify the control flow.

Tested on powerpc64-linux {-m32,-m64}, and on the testcase, and on a version
of the testcase that has 2000 cases instead of 4.  Is this okay for trunk?


Segher


2016-10-30  Segher Boessenkool  

PR rtl-optimization/71785
* bb-reorder.c (duplicate_computed_gotos_find_candidates): New
function, factored out from pass_duplicate_computed_gotos::execute.
(duplicate_computed_gotos_do_duplicate): Ditto.  Don't use BB_VISITED.
(pass_duplicate_computed_gotos::execute): Rewrite.  Rerun the pass as
long as it makes changes.

OK.  I'm just going to note for the record here that while we iterate 
until nothing changes, the statement and block clamps should in practice 
ensure we hit a point where nothing changes.


Ideally I'd like to see testcases with this kind of change.  It should 
be standard operating procedure at this point.


Jeff
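
For context, the kind of source that triggers the factoring discussed in this thread is an interpreter loop written with GNU C computed gotos. The following is a minimal sketch, not the PR71785 testcase; the opcode set and the function name `run` are invented for illustration, and it relies on the GNU labels-as-values extension.

```c
/* Sketch of a threaded interpreter using GNU C computed gotos.
   Opcodes: 0 = halt, 1 = increment, 2 = double.  */
static int
run (const unsigned char *code)
{
  static void *const dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
  int acc = 0;

  goto *dispatch[*code++];

op_inc:
  acc += 1;
  goto *dispatch[*code++];      /* Each handler ends in its own indirect jump.  */

op_dbl:
  acc *= 2;
  goto *dispatch[*code++];

op_halt:
  return acc;
}
```

Each handler ends with its own `goto *`; these are the indirect branches that get factored into one dispatcher early on and that duplicate_computed_gotos tries to restore late in the pipeline.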




Fix PR78154

2016-11-16 Thread Prathamesh Kulkarni
Hi Richard,
Following your suggestion in PR78154, the patch checks if stmt
contains a call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
and returns true in that case.

Bootstrapped+tested on x86_64-unknown-linux-gnu.
Cross-testing on arm*-*-*, aarch64*-*-* in progress.
Would it be OK to commit this patch in stage-3 ?

Thanks,
Prathamesh

2016-11-17  Prathamesh Kulkarni  

* tree-vrp.c (gimple_str_nonzero_warnv_p): New function.
(gimple_stmt_nonzero_warnv_p): Call gimple_str_nonzero_warnv_p.

testsuite/
* gcc.dg/tree-ssa/pr78154.c: New test-case.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c b/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c
new file mode 100644
index 0000000..d3463f4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr78154.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp-slim" } */
+
+void f (void *d, const void *s, __SIZE_TYPE__ n)
+{
+  if (__builtin_memcpy (d, s, n) == 0)
+__builtin_abort ();
+
+  if (__builtin_memmove (d, s, n) == 0)
+__builtin_abort ();
+
+  if (__builtin_memset (d, 0, n) == 0)
+__builtin_abort ();
+
+  if (__builtin_strcpy (d, s) == 0)
+__builtin_abort ();
+
+  if (__builtin_strcat (d, s) == 0)
+__builtin_abort ();
+
+  if (__builtin_strncpy (d, s, n) == 0)
+__builtin_abort ();
+
+  if (__builtin_strncat (d, s, n) == 0)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_abort" "evrp" } } */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index c2a4133..b563a7f 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1069,6 +1069,34 @@ gimple_assign_nonzero_warnv_p (gimple *stmt, bool *strict_overflow_p)
 }
 }
 
+/* Return true if STMT is known to contain call to a string-builtin function
+   that is known to return nonnull.  */
+
+static bool
+gimple_str_nonzero_warnv_p (gimple *stmt)
+{
+  if (!is_gimple_call (stmt))
+return false;
+
+  tree fndecl = gimple_call_fndecl (stmt);
+  if (!fndecl || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+return false;
+
+  switch (DECL_FUNCTION_CODE (fndecl))
+{
+  case BUILT_IN_MEMMOVE:
+  case BUILT_IN_MEMCPY:
+  case BUILT_IN_MEMSET:
+  case BUILT_IN_STRCPY:
+  case BUILT_IN_STRNCPY:
+  case BUILT_IN_STRCAT:
+  case BUILT_IN_STRNCAT:
+   return true;
+  default:
+   return false;
+}
+}
+
 /* Return true if STMT is known to compute a non-zero value.
If the return value is based on the assumption that signed overflow is
undefined, set *STRICT_OVERFLOW_P to true; otherwise, don't change
@@ -1097,7 +1125,7 @@ gimple_stmt_nonzero_warnv_p (gimple *stmt, bool *strict_overflow_p)
lookup_attribute ("returns_nonnull",
  TYPE_ATTRIBUTES (gimple_call_fntype (stmt))))
  return true;
-   return gimple_alloca_call_p (stmt);
+   return gimple_alloca_call_p (stmt) || gimple_str_nonzero_warnv_p (stmt);
   }
 default:
   gcc_unreachable ();
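
As background for the builtins listed in the patch: the C library specifies that each of them returns its destination argument, and calling them with a null destination is undefined, which is what makes the result known to be nonnull. A small sketch that checks the return values at run time (the helper name `all_return_dest` is made up for this example):

```c
#include <string.h>

/* Returns 1 if every call below returns its destination pointer,
   which is why a null check on the result can be folded away.  */
static int
all_return_dest (void)
{
  char buf[32] = "";
  const char *src = "abc";

  return memcpy (buf, src, 4) == buf
         && memmove (buf, src, 4) == buf
         && memset (buf, 0, sizeof buf) == buf
         && strcpy (buf, src) == buf
         && strcat (buf, src) == buf
         && strncpy (buf, src, 4) == buf
         && strncat (buf, src, 3) == buf;
}
```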


Re: [PATCH][2/2] GIMPLE Frontend, middle-end changes

2016-11-16 Thread Jeff Law

On 10/28/2016 05:51 AM, Richard Biener wrote:


These are the middle-end changes and additions to the testsuite.

They are pretty self-contained, I've organized the changelog
entries below in areas of changes:

 1) dump changes - we add a -gimple dump modifier that allows most
 function dumps to be directly fed back into the GIMPLE FE

 2) pass manager changes to implement the startwith("pass-name")
 feature which implements unit-testing for GIMPLE passes

 3) support for "SSA" input, a __PHI stmt that is lowered once the
 CFG is built, a facility to allow a specific SSA name to be allocated
 plus a small required change in the SSA rewriter to allow for
 pre-existing PHI arguments

Bootstrapped and tested on x86_64-unknown-linux-gnu (together with [1/2]).

I can approve all these changes myself but any comments are welcome.

My only worry would be ensuring that in the case where we're asking for 
a particular SSA_NAME in make_ssa_name_fn that we assert the requested 
name is available.


ISTM that if it's > the highest current version or in the freelist, then 
we ought to be safe.   If it isn't safe then we should either issue an 
error, or renumber the preexisting SSA_NAME (and determining if it's 
safe to renumber the preexisting SSA_NAME may require more context than 
we have).


jeff
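
The availability check suggested above could look roughly like the following. This is a simplified model with hypothetical names (`struct ssa_state`, `version_available_p`); it does not use GCC's real SSA name data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the SSA name allocator state.  */
struct ssa_state
{
  unsigned highest_version;     /* Highest version handed out so far.  */
  const unsigned *freelist;     /* Versions released for reuse.  */
  size_t freelist_len;
};

/* Returns true if VERSION can be handed out without clashing with a
   live name: it is either beyond every allocated version or free.  */
static bool
version_available_p (const struct ssa_state *s, unsigned version)
{
  if (version > s->highest_version)
    return true;
  for (size_t i = 0; i < s->freelist_len; i++)
    if (s->freelist[i] == version)
      return true;
  return false;
}
```

A caller requesting a specific version would assert on this predicate, or issue an error when it returns false.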



RE: RFA: PATCH to gengtype to avoid putting tree_node support in front end objects

2016-11-16 Thread Moore, Catherine


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> Sent: Thursday, November 10, 2016 7:53 AM
> To: Jason Merrill 
> Cc: gcc-patches List 
> Subject: Re: RFA: PATCH to gengtype to avoid putting tree_node
> support in front end objects
> 
> On Thu, Oct 27, 2016 at 09:36:09AM -0400, Jason Merrill wrote:
> > Currently, the way gengtype works it scans the list of source files
> > with front end files at the end, and pushes data structures onto a
> > stack.  It then processes the stack in LIFO order, so that data
> > structures from front ends are handled first.  As a result, if a GTY
> > data structure in a front end depends on tree_node, gengtype
> happily
> > puts gt_ggc_mx(tree_node*&) in a front end file, leading to link
> > errors on all other front ends.
> >
> > This patch avoids this problem by appending to the list of data
> > structures so that they are processed in FIFO order, and so
> tree_node
> > gets handled in gtype-desc.o.
> >
> > Tested x86_64-pc-linux-gnu, OK for trunk?
> 
> > commit 487a1c95c0d3169b2041942ff4f8d71c9ff689eb
> > Author: Jason Merrill 
> > Date:   Wed Oct 26 23:12:23 2016 -0400
> >
> > * gengtype.c (new_structure): Append to structures list.
> >
> > (find_structure): Likewise.
> 
> Please remove the blank line in the ChangeLog.
> 
> When looking at the differences it creates, it is hard, because
> all the generated files have all the functions emitted in reverse order
> now
> from what it used to be, so I only looked at files where the size
> changed,
> and that is beyond gtype.state only in my case gt-tree-phinodes.h
> which lost
> void
> gt_ggc_mx (struct gimple *& x)
> {
>   if (x)
> gt_ggc_mx_gimple ((void *) x);
> }
> and
> void
> gt_pch_nx (struct gimple *& x)
> {
>   if (x)
> gt_pch_nx_gimple ((void *) x);
> }
> and gtype-desc.c which didn't contain those but now it does (for
> gtype-desc.c it is hard to find out due to the reordering what else
> has changed, but as gt-tree-phinodes.h shrunk by 170 characters and
> gtype-desc.c grew by 170 characters, I'd think it is all that changed).
> I believe those routines belong to gtype-desc.c, that is where similar
> ones for tree_node, etc. are, tree-phinodes.h certainly isn't the header
> that defines gimple.
> 
> So I think this patch is ok for trunk.  Thanks.
> 

Hi -- This patch caused breakage for the MIPS port while compiling 
gcc/config/mips.c:

In file included from 
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/hash-table.h:561:0,
 from 
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/coretypes.h:351,
 from 
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/config/mips/mips.c:26:
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/hash-map.h: In 
instantiation of 'static void hash_map< , 
,  
>::hash_entry::ggc_mx(hash_map< , 
,  >::hash_entry&) [with KeyId 
= nofree_string_hash; Value = rtx_def*; Traits = 
simple_hashmap_traits]':
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/hash-table.h:1029:17: 
  required from 'void gt_ggc_mx(hash_table*) [with E = 
hash_map::hash_entry]'
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/hash-map.h:251:13:   
required from 'void gt_ggc_mx(hash_map*) [with K = nofree_string_hash; 
V = rtx_def*; H = 
simple_hashmap_traits]'
./gt-mips.h:38:19:   required from here
/scratch/cmoore/mips-sde-elf-upstream/src/gcc-trunk-6/gcc/hash-map.h:62:12: 
error: no matching function for call to 'gt_ggc_mx(rtx_def*&)'
  gt_ggc_mx (e.m_value);
^
... etc

I configured with --target=mips-sde-elf, but I do have some local multilib 
definitions for that target.  This ought to reproduce with mti-elf as well.
Will you please fix or revert?

Thanks,
Catherine
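
For readers following the thread, the ordering change in the quoted patch (push onto a stack versus append to a list) can be sketched with a plain linked list. The types and function names here are illustrative only and are not taken from gengtype.c.

```c
#include <stddef.h>

struct node
{
  int id;
  struct node *next;
};

struct list
{
  struct node *head;
  struct node *tail;
};

/* Old behaviour: push on the front, so traversal from head is LIFO.  */
static void
push_front (struct list *l, struct node *n)
{
  n->next = l->head;
  l->head = n;
  if (!l->tail)
    l->tail = n;
}

/* New behaviour: append at the end, so traversal from head is FIFO.  */
static void
append (struct list *l, struct node *n)
{
  n->next = NULL;
  if (l->tail)
    l->tail->next = n;
  else
    l->head = n;
  l->tail = n;
}
```

With `push_front` the structure registered last is visited first; with `append` structures are visited in the order they were registered, so whichever file is scanned first gets to emit the shared support routines.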



libgo patch committed: remove runtime1.goc

2016-11-16 Thread Ian Lance Taylor
This patch to libgo replaces runtime/runtime1.goc with Go and C code.
This is a step toward eliminating goc2c.

This drops the exported parfor code; it was needed for tests in the
past, but no longer is. The Go 1.7 runtime no longer uses parfor.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 242494)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-31ff8c31d33c3e77cae4fd55445f12825eb92af5
+d9189ebc139ff739af956094626ccc5eb92c3091
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 242060)
+++ libgo/Makefile.am   (working copy)
@@ -485,7 +485,6 @@ runtime_files = \
runtime/yield.c \
$(rtems_task_variable_add_file) \
malloc.c \
-   runtime1.c \
$(runtime_getncpu_file)
 
 goc2c.$(OBJEXT): runtime/goc2c.c
@@ -498,10 +497,6 @@ malloc.c: $(srcdir)/runtime/malloc.goc g
./goc2c $< > $@.tmp
mv -f $@.tmp $@
 
-runtime1.c: $(srcdir)/runtime/runtime1.goc goc2c
-   ./goc2c $< > $@.tmp
-   mv -f $@.tmp $@
-
 %.c: $(srcdir)/runtime/%.goc goc2c
./goc2c $< > $@.tmp
mv -f $@.tmp $@
Index: libgo/go/runtime/debug.go
===
--- libgo/go/runtime/debug.go   (revision 241341)
+++ libgo/go/runtime/debug.go   (working copy)
@@ -4,6 +4,11 @@
 
 package runtime
 
+import (
+   "runtime/internal/atomic"
+   "unsafe"
+)
+
 // GOMAXPROCS sets the maximum number of CPUs that can be executing
 // simultaneously and returns the previous setting. If n < 1, it does not
 // change the current setting.
@@ -19,10 +24,18 @@ func GOMAXPROCS(n int) int
 func NumCPU() int
 
 // NumCgoCall returns the number of cgo calls made by the current process.
-func NumCgoCall() int64
+func NumCgoCall() int64 {
+   var n int64
+   for mp := (*m)(atomic.Loadp(unsafe.Pointer(&allm))); mp != nil; mp = mp.alllink {
+   n += int64(mp.ncgocall)
+   }
+   return n
+}
 
 // NumGoroutine returns the number of goroutines that currently exist.
-func NumGoroutine() int
+func NumGoroutine() int {
+   return int(gcount())
+}
 
 // Get field tracking information.  Only fields with a tag go:"track"
 // are tracked.  This function will add every such field that is
Index: libgo/go/runtime/error.go
===
--- libgo/go/runtime/error.go   (revision 241341)
+++ libgo/go/runtime/error.go   (working copy)
@@ -133,7 +133,10 @@ type stringer interface {
String() string
 }
 
-func typestring(interface{}) string
+func typestring(x interface{}) string {
+   e := efaceOf(&x)
+   return *e._type.string
+}
 
 // For calling from C.
 // Prints an argument passed to panic.
Index: libgo/go/runtime/export_test.go
===
--- libgo/go/runtime/export_test.go (revision 241427)
+++ libgo/go/runtime/export_test.go (working copy)
@@ -21,11 +21,10 @@ import (
 //var F64toint = f64toint
 //var Sqrt = sqrt
 
-func golockedOSThread() bool
-
 var Entersyscall = entersyscall
 var Exitsyscall = exitsyscall
-var LockedOSThread = golockedOSThread
+
+// var LockedOSThread = lockedOSThread
 
 // var Xadduintptr = xadduintptr
 
@@ -44,29 +43,6 @@ func LFStackPop(head *uint64) *LFNode {
return (*LFNode)(unsafe.Pointer(lfstackpop(head)))
 }
 
-type ParFor struct {
-   body   func(*ParFor, uint32)
-   done   uint32
-   Nthr   uint32
-   thrseq uint32
-   Cnt    uint32
-   wait   bool
-}
-
-func newParFor(nthrmax uint32) *ParFor
-func parForSetup(desc *ParFor, nthr, n uint32, wait bool, body func(*ParFor, uint32))
-func parForDo(desc *ParFor)
-func parForIters(desc *ParFor, tid uintptr) (uintptr, uintptr)
-
-var NewParFor = newParFor
-var ParForSetup = parForSetup
-var ParForDo = parForDo
-
-func ParForIters(desc *ParFor, tid uint32) (uint32, uint32) {
-   begin, end := parForIters(desc, uintptr(tid))
-   return uint32(begin), uint32(end)
-}
-
 func GCMask(x interface{}) (ret []byte) {
return nil
 }
Index: libgo/go/runtime/extern.go
===
--- libgo/go/runtime/extern.go  (revision 241341)
+++ libgo/go/runtime/extern.go  (working copy)
@@ -274,13 +274,11 @@ func SetFinalizer(obj interface{}, final
 // the actual system call.
 func KeepAlive(interface{})
 
-func getgoroot() string
-
 // GOROOT returns the root of the Go tree.
 // It uses the GOROOT environment variable, if set,
 // or else the root used during the Go build.
 func GOROOT() string {
-   s := getgoroot()
+   s := gogetenv("GOROOT")
if 

Re: [RFC PATCH] avoid printing type suffix with %E

2016-11-16 Thread Jeff Law

On 10/26/2016 10:37 AM, Martin Sebor wrote:

When formatting an integer constant using the %E directive GCC
includes a suffix that indicates its type.  This can perhaps be
useful in some situations but in my experience it's distracting
and gets in the way when writing tests.

Here's an example:

  $ cat b.c && gcc b.c
  constexpr __SIZE_TYPE__ x = 2;

  enum E: bool { e = x };
  b.c:3:20: error: enumerator value 2ul is outside the range of
underlying type ‘bool’
   enum E: bool { e = x };
  ^

Notice the "2ul" in the error message.

As far as I can tell, Clang avoids printing the suffix and I think
it would be nice if the GCC pretty printer made it possible to avoid
it as well.

The attached patch implements one such approach by having the pretty
printer recognize the space format flag to suppress the type suffix,
so "%E" still prints the suffix but "% E" does not.  I did this to
preserve the existing output but I think it would be nicer to avoid
printing the suffix with %E and treat (for instance) the pound sign
as a request to add the suffix.  I have tested the attached patch
but not the alternative.

Does anyone have any comments/suggestions for which of the two
approaches would be preferable (or what I may have missed here)?
I CC David as the diagnostic maintainer.

I'm having a hard time seeing how this is a significant issue, even when 
writing tests.


It also seems to me that relaying the type of the constant as a suffix 
would help in cases that aren't so obvious.


What am I missing?

Jeff



Re: New option -flimit-function-alignment

2016-11-16 Thread Jeff Law

On 10/14/2016 12:28 PM, Bernd Schmidt wrote:

On 10/12/2016 09:27 PM, Denys Vlasenko wrote:

Yes, something like "if max_skip >= func_size, temporarily lower
max_skip to func_size-1" (because otherwise we can create padding
bigger-or-equal to the entire function in size, which is stupid
- it's better to just put the function in that space).

This would be nice.


That would be this patch. Bootstrapped and tested on x86_64-linux, ok?


Bernd

limit-align-v2b.diff


gcc/
* common.opt (flimit-function-alignment): New.
* doc/invoke.texi (-flimit-function-alignment): Document.
* emit-rtl.h (struct rtl_data): Add max_insn_address field.
* final.c (shorten_branches): Set it.
* varasm.c (assemble_start_function): Limit alignment if
requested.

gcc/testsuite/
* gcc.target/i386/align-limit.c: New test.

OK.  Sorry for the long delay.

jeff
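
The rule quoted at the top of this message ("if max_skip >= func_size, lower max_skip to func_size-1") can be written as a one-function C sketch. `limit_max_skip` is a hypothetical helper; the actual change in varasm.c may be structured differently.

```c
/* Sketch: clamp the maximum number of padding bytes so that padding
   never reaches the size of the function being aligned.  */
static unsigned
limit_max_skip (unsigned max_skip, unsigned func_size)
{
  if (func_size > 0 && max_skip >= func_size)
    return func_size - 1;
  return max_skip;
}
```

For a 4-byte function and a requested max_skip of 15 this yields 3, so the padding can never be as large as the function itself.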



[PATCH] Handle --enable-checking={yes,assert,release} in libcpp (PR bootstrap/72823)

2016-11-16 Thread Jakub Jelinek
Hi!

As mentioned in the PR, libcpp uses gcc_assert in a couple of places,
but guards it with ENABLE_ASSERT_CHECKING macro that is never defined
in libcpp.

This patch arranges for it to be defined if ENABLE_ASSERT_CHECKING
is going to be defined in gcc subdir.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-16  Jakub Jelinek  

PR bootstrap/72823
* configure.ac (ENABLE_ASSERT_CHECKING): Define if gcc configure
would define that macro.
* configure: Regenerated.
* config.in: Regenerated.

--- libcpp/configure.ac.jj  2016-05-20 12:44:36.0 +0200
+++ libcpp/configure.ac 2016-11-15 16:40:16.753842880 +0100
@@ -152,9 +152,11 @@ for check in release $ac_checking_flags
 do
case $check in
# these set all the flags to specific states
-   yes|all) ac_checking=1 ; ac_valgrind_checking= ;;
-   no|none|release) ac_checking= ; ac_valgrind_checking= ;;
+   yes|all) ac_checking=1 ; ac_assert_checking=1 ; ac_valgrind_checking= ;;
+   no|none) ac_checking= ; ac_assert_checking= ; ac_valgrind_checking= ;;
+   release) ac_checking= ; ac_assert_checking=1 ; ac_valgrind_checking= ;;
# these enable particular checks
+   assert) ac_assert_checking=1 ;;
misc) ac_checking=1 ;;
valgrind) ac_valgrind_checking=1 ;;
# accept
@@ -170,6 +172,11 @@ else
   AC_DEFINE(CHECKING_P, 0)
 fi
 
+if test x$ac_assert_checking != x ; then
+  AC_DEFINE(ENABLE_ASSERT_CHECKING, 1,
+[Define if you want assertions enabled.  This is a cheap check.])
+fi
+
 if test x$ac_valgrind_checking != x ; then
   AC_DEFINE(ENABLE_VALGRIND_CHECKING, 1,
 [Define if you want to workaround valgrind (a memory checker) warnings about
--- libcpp/configure.jj 2016-05-20 12:44:36.0 +0200
+++ libcpp/configure    2016-11-15 16:40:35.331607679 +0100
@@ -7288,9 +7288,11 @@ for check in release $ac_checking_flags
 do
case $check in
# these set all the flags to specific states
-   yes|all) ac_checking=1 ; ac_valgrind_checking= ;;
-   no|none|release) ac_checking= ; ac_valgrind_checking= ;;
+   yes|all) ac_checking=1 ; ac_assert_checking=1 ; ac_valgrind_checking= ;;
+   no|none) ac_checking= ; ac_assert_checking= ; ac_valgrind_checking= ;;
+   release) ac_checking= ; ac_assert_checking=1 ; ac_valgrind_checking= ;;
# these enable particular checks
+   assert) ac_assert_checking=1 ;;
misc) ac_checking=1 ;;
valgrind) ac_valgrind_checking=1 ;;
# accept
@@ -7308,6 +7310,12 @@ else
 
 fi
 
+if test x$ac_assert_checking != x ; then
+
+$as_echo "#define ENABLE_ASSERT_CHECKING 1" >>confdefs.h
+
+fi
+
 if test x$ac_valgrind_checking != x ; then
 
 $as_echo "#define ENABLE_VALGRIND_CHECKING 1" >>confdefs.h
--- libcpp/config.in.jj 2016-05-20 12:44:36.0 +0200
+++ libcpp/config.in    2016-11-15 16:40:38.0 +0100
@@ -14,6 +14,9 @@
 /* Define to 1 if using `alloca.c'. */
 #undef C_ALLOCA
 
+/* Define if you want assertions enabled. This is a cheap check. */
+#undef ENABLE_ASSERT_CHECKING
+
 /* Define to enable system headers canonicalization. */
 #undef ENABLE_CANONICAL_SYSTEM_HEADERS
 

Jakub
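
To show what the new define controls, here is a self-contained sketch of an assertion macro guarded by ENABLE_ASSERT_CHECKING. The names `my_assert`, `my_assert_fail` and `checked_div` are hypothetical, and the macro is only modelled on the general pattern; it is not copied from libcpp.

```c
#include <stdio.h>
#include <stdlib.h>

/* Normally provided by config.h when configure enables it; defined
   here only so that the sketch is self-contained.  */
#ifndef ENABLE_ASSERT_CHECKING
#define ENABLE_ASSERT_CHECKING 1
#endif

#if ENABLE_ASSERT_CHECKING
#define my_assert(EXPR) \
  ((void) ((EXPR) ? 0 : (my_assert_fail (#EXPR, __FILE__, __LINE__), 0)))
#else
/* Still type-checks EXPR but never evaluates it.  */
#define my_assert(EXPR) ((void) (0 && (EXPR)))
#endif

/* Reports the failed assertion and aborts.  */
static void
my_assert_fail (const char *expr, const char *file, int line)
{
  fprintf (stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
  abort ();
}

static int
checked_div (int a, int b)
{
  my_assert (b != 0);
  return a / b;
}
```

If the macro is never defined, as was the case in libcpp before this patch, the second branch is always selected and the assertions silently do nothing.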


[committed] Fix ICE with omp for broken_loop (PR fortran/78299)

2016-11-16 Thread Jakub Jelinek
Hi!

When broken_loop is true (i.e. the OMP_FOR body doesn't return),
loop->header doesn't have to be equal to body_bb, but it makes no sense to
verify it.  We aren't adding any loop in that case anyway.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2016-11-16  Jakub Jelinek  

PR fortran/78299
* omp-low.c (expand_omp_for_static_nochunk): Don't assert
that loop->header == body_bb if broken_loop.

* gfortran.dg/gomp/pr78299.f90: New test.

--- gcc/omp-low.c.jj    2016-11-10 12:34:12.0 +0100
+++ gcc/omp-low.c   2016-11-16 09:10:18.938969535 +0100
@@ -9685,7 +9685,7 @@ expand_omp_for_static_nochunk (struct om
   struct loop *loop = body_bb->loop_father;
   if (loop != entry_bb->loop_father)
 {
-  gcc_assert (loop->header == body_bb);
+  gcc_assert (broken_loop || loop->header == body_bb);
   gcc_assert (broken_loop
  || loop->latch == region->cont
  || single_pred (loop->latch) == region->cont);
--- gcc/testsuite/gfortran.dg/gomp/pr78299.f90.jj   2016-11-16 09:15:46.282848093 +0100
+++ gcc/testsuite/gfortran.dg/gomp/pr78299.f90  2016-11-16 09:15:15.0 +0100
@@ -0,0 +1,55 @@
+! PR fortran/78299
+! { dg-do compile }
+! { dg-additional-options "-fcheck=bounds" }
+
+program pr78299
+  integer, parameter :: n = 8
+  integer :: i, j
+  real :: x(n), y(n)
+  x = 1.0
+  y = 2.0
+  do j = 1, 9
+!$omp parallel workshare
+!$omp parallel default(shared)
+!$omp do
+do i = 1, n
+  x(i) = x(i) * y(9)   ! { dg-warning "is out of bounds" }
+end do
+!$omp end do
+!$omp end parallel
+!$omp end parallel workshare
+  end do
+  do j = 1, 9
+!$omp parallel workshare
+!$omp parallel default(shared)
+!$omp do schedule(static)
+do i = 1, n
+  x(i) = x(i) * y(9)   ! { dg-warning "is out of bounds" }
+end do
+!$omp end do
+!$omp end parallel
+!$omp end parallel workshare
+  end do
+  do j = 1, 9
+!$omp parallel workshare
+!$omp parallel default(shared)
+!$omp do schedule(static, 2)
+do i = 1, n
+  x(i) = x(i) * y(9)   ! { dg-warning "is out of bounds" }
+end do
+!$omp end do
+!$omp end parallel
+!$omp end parallel workshare
+  end do
+  do j = 1, 9
+!$omp parallel workshare
+!$omp parallel default(shared)
+!$omp do schedule(dynamic, 3)
+do i = 1, n
+  x(i) = x(i) * y(9)   ! { dg-warning "is out of bounds" }
+end do
+!$omp end do
+!$omp end parallel
+!$omp end parallel workshare
+  end do
+end

Jakub


Re: [PATCH, GCC/ARM] Make arm_feature_set agree with type of FL_* macros

2016-11-16 Thread Thomas Preudhomme

Hi,

I've rebased the patch to make arm_feature_set agree with the type of the FL_* 
macros on top of trunk rather than on top of the optional -mthumb patch. That 
involved making the changes in gcc/config/arm/arm-protos.h rather than 
gcc/config/arm/arm-flags.h. Since every line is changed anyway, I also took the 
opportunity to switch the indentation to tabs and to add the final dot to 
comments missing one.


For reference, please find below the original patch description:

Currently arm_feature_set is defined in gcc/config/arm/arm-flags as an array of 
2 unsigned long. However, the flags stored in these two entries are (signed) 
int, being combinations of bits set via expression of the form 1 << bitno. This 
creates 3 issues:


1) undefined behavior when setting the msb (1 << 31)
2) undefined behavior when storing a flag with msb set (negative int) into one 
of the unsigned array entries (positive int)

3) waste of space since the top 32 bits of each entry is not used

This patch changes the definition of FL_* macro to be unsigned int by using the 
form 1U << bitno instead and changes the definition of arm_feature_set to be an 
array of 2 unsigned (int) entries.
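
As a rough illustration of the unsigned form described above, here is a
minimal C sketch.  The DEMO_* macros and the demo_feature_set type are
hypothetical stand-ins, not the definitions from arm-protos.h, and the sketch
assumes unsigned int is at least 32 bits wide.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the FL_* macros; not the real definitions.
   With a 32-bit int, (1 << 31) shifts into the sign bit, which is
   undefined behavior, while (1U << 31) is a well-defined unsigned value.  */
#define DEMO_FL_LOW (1U << 0)
#define DEMO_FL_TOP (1U << 31)

/* Two unsigned int entries, mirroring the shape described above.  */
typedef struct
{
  unsigned int bits[2];
} demo_feature_set;

/* Set FLAG in entry WHICH (0 or 1) of SET.  */
static void
demo_set_flag (demo_feature_set *set, int which, unsigned int flag)
{
  set->bits[which] |= flag;
}

/* Return true if FLAG is present in entry WHICH of SET.  */
static bool
demo_has_flag (const demo_feature_set *set, int which, unsigned int flag)
{
  return (set->bits[which] & flag) != 0;
}
```

Because both the flags and the array entries are unsigned, setting the top
bit involves no signed shift and no signed-to-unsigned conversion.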


*** gcc/ChangeLog ***

2016-10-15  Thomas Preud'homme  

* config/arm/arm-protos.h (FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M,
FL_MODE26, FL_MODE32, FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED,
FL_STRONG, FL_ARCH5E, FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF,
FL_ARCH6K, FL_THUMB2, FL_NOTM, FL_THUMB_DIV, FL_VFPV3, FL_NEON,
FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV, FL_ARCH8, FL_CRC32, FL_SMALLMUL,
FL_NO_VOLATILE_CE, FL_IWMMXT, FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1,
FL2_ARCH8_2, FL2_FP16INST): Reindent comment, add final dot when
missing and make value unsigned.
(arm_feature_set): Use unsigned entries instead of unsigned long.


Bootstrapped on arm-linux-gnueabihf targeting Thumb-2 state.

Is this ok for trunk?

Best regards,

Thomas

On 14/11/16 18:56, Thomas Preudhomme wrote:

My apologies, I realized when trying to apply the patch that I wrote it on top
of the optional -mthumb patch instead of the reverse. I'll rebase it to not
screw up bisect.

Best regards,

Thomas

On 14/11/16 14:47, Kyrill Tkachov wrote:


On 14/11/16 14:07, Thomas Preudhomme wrote:

Hi,

Currently arm_feature_set is defined in gcc/config/arm/arm-flags as an array
of 2 unsigned long. However, the flags stored in these two entries are
(signed) int, being combinations of bits set via expression of the form 1 <<
bitno. This creates 3 issues:

1) undefined behavior when setting the msb (1 << 31)
2) undefined behavior when storing a flag with msb set (negative int) into one
of the unsigned array entries (positive int)
3) waste of space since the top 32 bits of each entry is not used

This patch changes the definition of FL_* macro to be unsigned int by using
the form 1U << bitno instead and changes the definition of arm_feature_set to
be an array of 2 unsigned (int) entries.

Bootstrapped on arm-linux-gnueabihf targeting Thumb-2 state.

Is this ok for trunk?



Ok.
Thanks,
Kyrill


Best regards,

Thomas


diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 95bae5ef57ba4c433c0cce8e0c197959abdf887b..5cee7718554886982f535da2e9baa5015da609e4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -351,50 +351,51 @@ extern bool arm_is_constant_pool_ref (rtx);
 /* Flags used to identify the presence of processor capabilities.  */
 
 /* Bit values used to identify processor capabilities.  */
-#define FL_NONE	  (0)	  /* No flags.  */
-#define FL_ANY	  (0x)/* All flags.  */
-#define FL_CO_PROC    (1 << 0)    /* Has external co-processor bus */
-#define FL_ARCH3M     (1 << 1)    /* Extended multiply */
-#define FL_MODE26     (1 << 2)    /* 26-bit mode support */
-#define FL_MODE32     (1 << 3)    /* 32-bit mode support */
-#define FL_ARCH4      (1 << 4)    /* Architecture rel 4 */
-#define FL_ARCH5      (1 << 5)    /* Architecture rel 5 */
-#define FL_THUMB      (1 << 6)    /* Thumb aware */
-#define FL_LDSCHED    (1 << 7)    /* Load scheduling necessary */
-#define FL_STRONG     (1 << 8)    /* StrongARM */
-#define FL_ARCH5E     (1 << 9)    /* DSP extensions to v5 */
-#define FL_XSCALE     (1 << 10)   /* XScale */
-/* spare	  (1 << 11)	*/
-#define FL_ARCH6  (1 << 12)   /* Architecture rel 6.  Adds
-	 media instructions.  */
-#define FL_VFPV2  (1 << 13)   /* Vector Floating Point V2.  */
-#define FL_WBUF	  (1 << 14)	  /* Schedule for write buffer ops.
-	 Note: ARM6 & 7 derivatives only.  */
-#define FL_ARCH6K (1 << 15)   /* Architecture rel 6 K extensions.  */
-#define FL_THUMB2 (1 << 16)	  /* Thumb-2.  */
-#define FL_NOTM	  (1 << 17)	  /* Instructions not present in the 'M'
-	 profile.  */
-#define FL_THUMB_DIV  (1 << 18)	  /* Hardware divide (Thumb 

Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

2016-11-16 Thread Andrew Senkevich
2016-11-16 19:21 GMT+03:00 Bernd Schmidt :
> On 11/15/2016 05:31 PM, Andrew Senkevich wrote:
>>
>> 2016-11-15 17:56 GMT+03:00 Jeff Law :
>>>
>>> On 11/15/2016 05:55 AM, Andrew Senkevich wrote:


 2016-11-11 14:16 GMT+03:00 Uros Bizjak :
>
>
> --- a/gcc/genmodes.c
> +++ b/gcc/genmodes.c
> --- a/gcc/init-regs.c
> +++ b/gcc/init-regs.c
> --- a/gcc/machmode.h
> +++ b/gcc/machmode.h
>
> These are middle-end changes, you will need a separate review for
> these.



 Who could review these changes?
>>>
>>>
>>> I can.  I likely dropped the message because it looked x86 specific, so
>>> if
>>> you could resend it'd be appreciated.
>>
>>
>> Attached (diff with previous only in fixed comments typos).
>
>
> Next time please split middle-end changes out from target-related stuff and
> send them separately.

Ok.

> These ones are OK.
>
>
> Bernd

Thanks!

Who could commit it?


--
WBR,
Andrew


Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-16 Thread Jeff Law

[ I'm catching up on a variety of things...  So apologies if y'all
  have settled these issues. ]

On 11/02/2016 01:32 PM, Jakub Jelinek wrote:

But obviously not all levels of the warning can/should be enabled
with -Wall/-Werror.  There are cases which are worth warning by default
(the case where we want to inform the user if you reach this stmt,
you'll get your program killed (will call __chk_fail)) is something
that ought like before be enabled by default; can have a warning
switch users can disable.
Then there is the case where there is a sure buffer overflow (not using
-D_FORTIFY_SOURCE, but still __bos (, 0) tells the buffer is too short,
and it is unconditional (no tricks with PHIs where one path has short
and another part has long size).  This is something that is useful
in -Wall.
The rest I'm very doubtful about even for -Wextra.

I would hesitate on distinguishing between something that flows via a 
PHI vs something that is explicit in the IL.


It is entirely possible that an unrelated path isolation might take a 
PHI where one path is short and one long and split it into two paths. 
At that point they're both going to be explicit in the IL.  You'd then 
have to use something like global anticipability analysis to determine 
if they're executed unconditionally.


Jeff


Re: [PATCH] Fix NetBSD bootstrap

2016-11-16 Thread Bernd Schmidt

On 11/16/2016 06:12 PM, Krister Walfridsson wrote:

I'm the NetBSD maintainer, so I believe I don't need approval to commit
this. But I have been absent for a long time, so it makes sense for
someone to review at least this first patch.

Bootstrapped and tested on i386-unknown-netbsdelf6.1 and
x86_64-unknown-netbsd6.1.

OK to commit?


I'll take the position that you know best and do not need approval from 
someone else.



Bernd



Re: [2/9] Encoding support for AArch64 DWARF operations

2016-11-16 Thread Jason Merrill
On Fri, Nov 11, 2016 at 1:33 PM, Jiong Wang  wrote:
> The encoding for new added AARCH64 DWARF operations.

This patch seems rather incomplete; I only see a change to
dwarf2out.c, which won't compile since the opcodes aren't defined
anywhere.

Jason


Re: Add SET_DECL_MODE

2016-11-16 Thread Jeff Law

On 11/16/2016 09:44 AM, Richard Sandiford wrote:

This may no longer be necessary with the current version
of the SVE patches, but it does at least make things consistent
with the TYPE_MODE/SET_TYPE_MODE split.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


[ This patch is part of the SVE series posted here:
  https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]

gcc/ada/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* gcc-interface/utils.c (create_label_decl): Use SET_DECL_MODE.

gcc/c/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* c-decl.c (merge_decls): Use SET_DECL_MODE.
(make_label, finish_struct): Likewise.

gcc/cp/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* class.c (finish_struct_bits): Use SET_DECL_MODE.
(build_base_field_1, layout_class_type, finish_struct_1): Likewise.
* decl.c (make_label_decl): Likewise.
* pt.c (tsubst_decl): Likewise.

gcc/fortran/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* trans-common.c (build_common_decl): Use SET_DECL_MODE.
* trans-decl.c (gfc_build_label_decl): Likewise.
* trans-types.c (gfc_get_array_descr_info): Likewise.

gcc/lto/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* lto.c (offload_handle_link_vars): Use SET_DECL_MODE.

gcc/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* tree.h (SET_DECL_MODE): New macro.
* cfgexpand.c (avoid_deep_ter_for_debug): Use SET_DECL_MODE.
(expand_gimple_basic_block): Likewise.
* function.c (split_complex_args): Likewise.
* ipa-prop.c (ipa_modify_call_arguments): Likewise.
* omp-simd-clone.c (ipa_simd_modify_stmt_ops): Likewise.
* stor-layout.c (layout_decl, relayout_decl): Likewise.
(finish_bitfield_representative): Likewise.
* tree.c (make_node_stat): Likewise.
* tree-inline.c (remap_ssa_name): Likewise.
(tree_function_versioning): Likewise.
* tree-into-ssa.c (rewrite_debug_stmt_uses): Likewise.
* tree-sra.c (sra_ipa_reset_debug_stmts): Likewise.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-loop-ivopts.c (remove_unused_ivs): Likewise.
* tree-ssa.c (insert_debug_temp_for_var_def): Likewise.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* varasm.c (make_debug_expr_from_rtl): Likewise.

libcc1/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* plugin.cc (plugin_build_add_field): Use SET_DECL_MODE.

I just lightly spot-checked this.  Seems like it should fit under the 
obvious rule.  OK for the trunk.


jeff



Re: [PATCH] Fix PR78305

2016-11-16 Thread Richard Biener
On November 16, 2016 5:22:17 PM GMT+01:00, Marc Glisse  
wrote:
>On Wed, 16 Nov 2016, Michael Matz wrote:
>
>> Hi,
>>
>> On Wed, 16 Nov 2016, Marc Glisse wrote:
>>
> The first sentence about ORing the sign bit sounds strange (except
>for a
> sign-magnitude representation). With 2's complement, INT_MIN is
>-2^31, the
> divisors are the 2^k and -(2^k). -2 * 2^30 yields INT_MIN, but
>your test
> misses -2 as a possible divisor. On the other hand, 0b100...001
>(aka
> -INT_MAX)
> is not a divisor of INT_MIN but your test says the reverse.

 Yeah, but it handled the testcase ;)  So I guess the easiest would
>be
 to check integer_pow2p (abs (TREE_OPERAND (t, 0)) then, thus
 wi::popcount (wi::abs (TREE_OPERAND (t, 0))) == 1?
>>>
>>> Looks good to me, thanks.
>>
>> An integer X is a power of two if and only if
>>  X & -X == 0  (&& X != 0 if you want to exclude zero)
>> which also nicely handles positive and negative numbers at the same
>time.
>> No need for popcounts or abs.
>
>There are bit tricks to test for powers of 2, but X & -X == 0 doesn't 
>quite work (X & -X == X is closer, but needs a tweak for negative 
>numbers). We could use
>wi::pow2_p (wi::abs (TREE_OPERAND (t, 0)))
>adding a new function pow2_p so it remains readable and we reduce the
>risk 
>of using the wrong bit trick...

Tree_pow2p uses wi::popcount 

Richard.
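
For readers following the bit tricks discussed in this thread, here is a
minimal C sketch of a "magnitude is a power of two" test.  It is only an
illustration on 64-bit integers with hypothetical demo_* names; it is not the
wide-int code under discussion.  Note that for any nonzero X, X & -X isolates
the lowest set bit and so is never zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Magnitude of X as an unsigned value.  Negating in unsigned arithmetic
   avoids the signed overflow that negating INT64_MIN would trigger.  */
static uint64_t
demo_magnitude (int64_t x)
{
  uint64_t u = (uint64_t) x;
  return x < 0 ? (uint64_t) 0 - u : u;
}

/* True if |X| is a power of two, i.e. X is 2^k or -(2^k).  */
static bool
demo_abs_pow2_p (int64_t x)
{
  uint64_t m = demo_magnitude (x);
  return m != 0 && (m & (m - 1)) == 0;
}

/* The "X & -X == X" form on an unsigned value: true for zero and for
   powers of two, false for a negative number reinterpreted as unsigned,
   which is why the signed case needs the magnitude step above.  */
static bool
demo_lowbit_is_self (uint64_t u)
{
  return (u & ((uint64_t) 0 - u)) == u;
}
```

With these helpers, -2 and INT64_MIN are accepted while -INT64_MAX and 0 are
rejected, matching the divisor cases raised earlier in the thread.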




PING [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)

2016-11-16 Thread Martin Sebor

I'm looking for a review of the patch below:

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00779.html

Thanks

On 11/08/2016 05:09 PM, Martin Sebor wrote:

The -Wformat-length checker relies on the compute_builtin_object_size
function to determine the size of the buffer it checks for overflow.
The function returns either a size computed by the tree-object-size
pass for objects referenced by the __builtin_object_size intrinsic
(if it's used in the program) or it tries to compute it for a small
subset of expressions otherwise.  This subset doesn't include objects
allocated by either malloc or alloca, and so for those the function
returns "unknown" or (size_t)-1 in the case of -Wformat-length.  As
a consequence, -Wformat-length is unable to detect overflows
involving such objects.

The attached patch adds a new function, compute_object_size, that
uses the existing algorithms to compute and return the sizes of
allocated objects as well, as if they were referenced by
__builtin_object_size in the program source, enabling the
-Wformat-length checker to detect more buffer overflows.
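
To make the scenario concrete, here is a small C sketch of formatted output
into a malloc'd buffer whose size may be too small.  It uses snprintf so it is
safe to run, and the demo_would_overflow name is hypothetical; it illustrates
the kind of call involved, not the checker's implementation or its
diagnostics.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Return true if formatting VALUE with "%d" needs more than SIZE bytes,
   counting the terminating NUL.  An equivalent sprintf call into the same
   buffer would overflow it in exactly those cases.  */
static bool
demo_would_overflow (size_t size, int value)
{
  char *buf = malloc (size);
  if (buf == NULL)
    return true;

  /* snprintf returns the length the full output would have had.  */
  int needed = snprintf (buf, size, "%d", value);
  free (buf);
  return needed < 0 || (size_t) needed >= size;
}
```

For example, the value 123456 needs seven bytes, so a six-byte allocation is
reported as too small while an eight-byte one is not.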

Martin

PS The function makes use of the init_function_sizes API that is
otherwise unused outside the tree-object-size pass to initialize
the internal structures, but then calls fini_object_sizes to
release them before returning.  That seems wasteful because
the size of the same object or one related to it might need
to be computed again in the context of the same function.  I
experimented with allocating and releasing the structures only
when current_function_decl changes but that led to crashes.
I suspect I'm missing something about the management of memory
allocated for these structures.  Does anyone have any suggestions
how to make this work?  (Do I perhaps need to allocate them using
a special allocator so they don't get garbage collected?)




[PING] [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2016-11-16 Thread Bernd Edlinger
Hi,

I'd like to ping for these two patches:

[PATCH, ARM] Further improve stack usage on sha512 (PR 77308)
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00523.html

[PATCH, ARM] Enable ldrd/strd peephole rules unconditionally
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00830.html


Thanks
Bernd.

[PATCH PR78114]Refine gfortran.dg/vect/fast-math-mgrid-resid.f

2016-11-16 Thread Bin Cheng
Hi,
Currently the test gfortran.dg/vect/fast-math-mgrid-resid.f checks all
predictive commoning opportunities for all possible loops.  This makes it
fragile because the vectorizer may peel the loop differently and may choose
different vector factors.  For example, on x86-solaris the vectorizer doesn't
peel for a prologue loop; for -march=haswell the test has been failing for a
long time because the vector factor is 4, while the iteration distance of the
predictive commoning opportunity is smaller than 4.  This patch refines the
test by only checking whether a predictive commoning variable is created when
the vector factor is 2, or whether a vectorization variable is created when
the factor is 4.  This works since we have only one main loop, and only one
vector factor can be used.
Test results checked for various x64 targets.  Is it OK?

Thanks,
bin

gcc/testsuite/ChangeLog
2016-11-16  Bin Cheng  

PR testsuite/78114
* gfortran.dg/vect/fast-math-mgrid-resid.f: Refine test by
checking predictive commoning variables in vectorized loop
wrt vector factor.

diff --git a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
index 88238f9..3e5c4a4 100644
--- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
+++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
@@ -1,6 +1,6 @@
 ! { dg-do compile }
 ! { dg-require-effective-target vect_double }
-! { dg-options "-O3 -fpredictive-commoning -fdump-tree-pcom-details" }
+! { dg-options "-O3 -fpredictive-commoning -fdump-tree-pcom" }
 
 
 *** RESID COMPUTES THE RESIDUAL:  R = V - AU
@@ -38,8 +38,9 @@ C
   RETURN
   END
 ! we want to check that predictive commoning did something on the
-! vectorized loop.
-! { dg-final { scan-tree-dump-times "Executing predictive commoning without unrolling" 1 "pcom" { target lp64 } } }
-! { dg-final { scan-tree-dump-times "Executing predictive commoning without unrolling" 2 "pcom" { target ia32 } } }
-! { dg-final { scan-tree-dump-times "Predictive commoning failed: no suitable chains" 0 "pcom" } }
-! { dg-final { scan-tree-dump-times "Loop iterates only 1 time, nothing to do" 1 "pcom" } }
+! vectorized loop.  If vector factor is 2, the vectorized loop can
+! be predictive commoned, we check if predictive commoning variable
+! is created with vector(2) type;  if vector factor is 4, there is
+! no predictive commoning opportunity, we check if vector(4) variable
+! is created.  This works because only one vector factor can be used.
+! { dg-final { scan-tree-dump-times "vector\\(2\\) real\\(.*\\) vectp_u.*__lsm|vector\\(4\\) real\\(.*\\)" 1 "pcom" } }


Re: [PATCH] warn on overflow in calls to allocation functions (bugs 77531 and 78284)

2016-11-16 Thread Martin Sebor

Attached is an updated version of the patch that also adds attribute
alloc_size to the standard allocation built-ins (aligned_alloc,
alloca, malloc, calloc, and realloc) and handles alloca.

Besides that, I've renamed the option to -Walloc-size-larger-than
to make it less similar to -Walloca-larger-than.  I think the new
name works because the option works with the alloc_size attribute.
Other suggestions are of course welcome.

I've left the alloc_max_size function in place until I receive some
feedback on it.

I've regression-tested the patch on x86_64 with a few issues.  The
biggest is that the -Walloc-zero option enabled by -Wextra causes
a number of errors during bootstrap due to invoking the XALLOCAVEC
macro with a zero argument.  The errors look valid to me (and I
got past them by temporarily changing the XALLOCAVEC macro to
always allocate at least one byte) but I haven't fixed the errors
yet.  I'll post a separate patch for those.   The other open issue
is that the new warning duplicates a small subset of the
-Walloca-larger-than warnings.  I expect removing the duplicates
to be straightforward.  I post this updated patch for review while
I work on the remaining issues.

Martin

On 11/13/2016 08:19 PM, Martin Sebor wrote:

Bug 77531 requests a new warning for calls to allocation functions
(those declared with attribute alloc_size(X, Y)) that overflow the
computation X * Y of the size of the allocated object.

Bug 78284 suggests that detecting and diagnosing other common errors
in calls to allocation functions, such as allocating more space than
SIZE_MAX / 2 bytes, would help prevent subsequent buffer overflows.

The attached patch adds two new warning options, -Walloc-zero and
-Walloc-larger-than=bytes that implement these two enhancements.
The patch is not 100% finished because, as it turns out, the GCC
allocation built-ins (malloc et al.) do not make use of the
attribute and so don't benefit from the warnings.  The tests are
also incomplete, and there's at least one bug in the implementation
I know about.
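
As an illustration of the kind of function these warnings are aimed at, here
is a minimal C sketch of an allocator declared with attribute alloc_size that
also guards the size multiplication at run time.  The demo_alloc_array
function and the DEMO_ALLOC_SIZE macro are hypothetical and are not part of
the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Apply alloc_size only where the GNU attribute syntax is available.  */
#if defined (__GNUC__)
# define DEMO_ALLOC_SIZE(x, y) __attribute__ ((alloc_size (x, y)))
#else
# define DEMO_ALLOC_SIZE(x, y)
#endif

/* Hypothetical allocator: storage for N objects of SIZE bytes each.
   Arguments 1 and 2 are the ones whose product gives the object size.  */
static void *demo_alloc_array (size_t n, size_t size) DEMO_ALLOC_SIZE (1, 2);

static void *
demo_alloc_array (size_t n, size_t size)
{
  /* Reject requests whose byte count N * SIZE would wrap around.  */
  if (size != 0 && n > SIZE_MAX / size)
    return NULL;
  return malloc (n * size);
}
```

The run-time check catches overflow for values known only during execution;
the proposed warnings target the cases a compiler can see statically.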

I'm posting the patch while stage 1 is still open and to give
a heads up on it and to get early feedback.  I expect completing
it will be straightforward.

Martin

PS The alloc_max_size function added in the patch handles sizes
specified using suffixes like KB, MB, etc.  I added that to make
it possible to specify sizes in excess of the maximum of INT_MAX
that (AFAIK) options that take integer arguments handle out of
the box.  It only belatedly occurred to me that the suffixes
are unnecessary if the option argument is handled using strtoull.
I can remove the suffix (as I suspect it will raise objections)
but I think that a general solution along these lines would be
useful to let users specify large byte sizes in other options
as well (such as -Walloca-larger-than, -Wvla-larger-than).  Are
there any suggestions or preferences?



PR c/77531 - __attribute__((alloc_size(1,2))) could also warn on multiplication overflow
PR c/78284 - warn on malloc with very large arguments

include/ChangeLog:
	* libiberty.h (XALLOCAVEC): Make sure alloca argument is non-zero.

gcc/c-family/ChangeLog:

	PR c/77531
	PR c/78284
	* c.opt (-Walloc-zero, -Walloc-larger-than): New options.

gcc/ChangeLog:

	PR c/77531
	PR c/78284
	* builtin-attrs.def (ATTR_ALLOC_SIZE): New identifier tree.
	(ATTR_MALLOC_SIZE_1_NOTHROW_LIST): New attribute list.
	(ATTR_MALLOC_SIZE_1_NOTHROW_LEAF_LIST): Same.
	(ATTR_MALLOC_SIZE_1_2_NOTHROW_LEAF_LIST): Same.
	(ATTR_ALLOC_SIZE_2_NOTHROW_LEAF_LIST): Same.
	* builtins.c (expand_builtin_alloca): Call
	maybe_warn_alloc_args_overflow.
	* builtins.def (aligned_alloc, alloca, calloc, malloc, realloc):
	Add attribute alloc_size.
	* calls.h (maybe_warn_alloc_args_overflow): Declare.
	* calls.c (alloc_max_size): New function.
	(maybe_warn_alloc_args_overflow): Define.
	(initialize_argument_information): Diagnose overflow in functions
	declared with attribute alloc_size.
	* doc/invoke.texi (Warning Options): Document -Walloc-zero and
	-Walloc-larger-than.

gcc/testsuite/ChangeLog:

	PR c/77531
	PR c/78284
	* gcc.dg/attr-alloc_size-3.c: New test.
	* gcc.dg/attr-alloc_size-4.c: New test.
	* gcc.dg/attr-alloc_size-5.c: New test.
	* gcc.dg/attr-alloc_size-6.c: New test.
	* gcc.dg/attr-alloc_size-7.c: New test.

diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
index 8dc59c9..2a58b31 100644
--- a/gcc/builtin-attrs.def
+++ b/gcc/builtin-attrs.def
@@ -83,6 +83,7 @@ DEF_LIST_INT_INT (5,6)
 #undef DEF_LIST_INT_INT
 
 /* Construct trees for identifiers.  */
+DEF_ATTR_IDENT (ATTR_ALLOC_SIZE, "alloc_size")
 DEF_ATTR_IDENT (ATTR_COLD, "cold")
 DEF_ATTR_IDENT (ATTR_CONST, "const")
 DEF_ATTR_IDENT (ATTR_FORMAT, "format")
@@ -150,6 +151,23 @@ DEF_ATTR_TREE_LIST (ATTR_SENTINEL_NOTHROW_LEAF_LIST, ATTR_SENTINEL,	\
 DEF_ATTR_TREE_LIST (ATTR_COLD_CONST_NORETURN_NOTHROW_LEAF_LIST, ATTR_CONST,\
 			ATTR_NULL, ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
 
+/* Allocation functions like alloca and malloc whose first 

[PATCH Obvious]Adjust test string wrto update dump info for gcc.target/arm/ivopts-orig_biv-inc.c

2016-11-16 Thread Bin Cheng
Hi,
The IVOPT dump information has been updated, but the test string in 
gcc.target/arm/ivopts-orig_biv-inc.c has not.  This patch updates it.  Test 
result checked on arm-none-eabi.  Commit as obvious?

Thanks,
bin

gcc/testsuite/ChangeLog
2016-11-16  Bin Cheng  

* gcc.target/arm/ivopts-orig_biv-inc.c: Adjust test string
according to updated dump info.

diff --git a/gcc/testsuite/gcc.target/arm/ivopts-orig_biv-inc.c b/gcc/testsuite/gcc.target/arm/ivopts-orig_biv-inc.c
index f7129d3..94c7e5f 100644
--- a/gcc/testsuite/gcc.target/arm/ivopts-orig_biv-inc.c
+++ b/gcc/testsuite/gcc.target/arm/ivopts-orig_biv-inc.c
@@ -15,4 +15,4 @@ unsigned char * foo(unsigned char *ReadPtr)
  return ReadPtr;
 }
 
-/* { dg-final { scan-tree-dump-times "original biv" 2 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "Incr POS: orig biv" 2 "ivopts"} } */


Re: [PATCH v2][PR libgfortran/78314] Fix ieee_support_halting

2016-11-16 Thread FX
> gcc/testsuite/
> 2016-11-16  Szabolcs Nagy  
> 
>   PR libgfortran/78314
>   * gfortran.dg/ieee/ieee_6.f90: Use ieee_support_halting.
> 
> libgfortran/
> 2016-11-16  Szabolcs Nagy  
> 
>   PR libgfortran/78314
>   * config/fpu-glibc.h (support_fpu_trap): Use feenableexcept.

OK to commit.

FX


RE: [PATCH] MIPS/GCC: Mark text contents as code or data

2016-11-16 Thread Maciej W. Rozycki
On Tue, 15 Nov 2016, Matthew Fortune wrote:

> I'm a little concerned the expected output tests may be fragile over
> time but let's wait and see.

 Indeed, but I'd rather see false negatives than false positives or no 
coverage at all.  And I hope the pieces of expected assembly quoted will 
help telling any false negatives and actual regressions apart very easily.

> OK to commit.

 Applied now, thanks for your review.

  Maciej


[PATCH] Fix NetBSD bootstrap

2016-11-16 Thread Krister Walfridsson

NetBSD fails bootstrap with
  stdatomic.h:55:17: error: unknown type name '__INT_LEAST8_TYPE__'
This is fixed by the following patch (only i386 and x86_64 for now. I'll
do the other ports after fixing some more issues -- the NetBSD support is
rather broken at the moment...)

I'm the NetBSD maintainer, so I believe I don't need approval to commit 
this. But I have been absent for a long time, so it makes sense for 
someone to review at least this first patch.


Bootstrapped and tested on i386-unknown-netbsdelf6.1 and
x86_64-unknown-netbsd6.1.

OK to commit?

   /Krister


2016-11-16  Krister Walfridsson  

* config/netbsd-stdint.h: New.
* config.gcc (i[34567]86-*-netbsd): Add netbsd-stdint.h to tm_file.
(x86_64-*-netbsd*): Likewise.

Index: gcc/config/netbsd-stdint.h
===
--- gcc/config/netbsd-stdint.h  (nonexistent)
+++ gcc/config/netbsd-stdint.h  (working copy)
@@ -0,0 +1,55 @@
+/* Definitions for <stdint.h> types for NetBSD systems.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+#define SIG_ATOMIC_TYPE   "int"
+
+#define INT8_TYPE         "signed char"
+#define INT16_TYPE        "short int"
+#define INT32_TYPE        "int"
+#define INT64_TYPE        (LONG_TYPE_SIZE == 64 ? "long int" : "long long int")
+#define UINT8_TYPE        "unsigned char"
+#define UINT16_TYPE       "short unsigned int"
+#define UINT32_TYPE       "unsigned int"
+#define UINT64_TYPE       (LONG_TYPE_SIZE == 64 ? "long unsigned int" : "long long unsigned int")
+
+#define INT_LEAST8_TYPE   INT8_TYPE
+#define INT_LEAST16_TYPE  INT16_TYPE
+#define INT_LEAST32_TYPE  INT32_TYPE
+#define INT_LEAST64_TYPE  INT64_TYPE
+#define UINT_LEAST8_TYPE  UINT8_TYPE
+#define UINT_LEAST16_TYPE UINT16_TYPE
+#define UINT_LEAST32_TYPE UINT32_TYPE
+#define UINT_LEAST64_TYPE UINT64_TYPE
+
+#define INT_FAST8_TYPE    INT32_TYPE
+#define INT_FAST16_TYPE   INT32_TYPE
+#define INT_FAST32_TYPE   INT32_TYPE
+#define INT_FAST64_TYPE   INT64_TYPE
+#define UINT_FAST8_TYPE   UINT32_TYPE
+#define UINT_FAST16_TYPE  UINT32_TYPE
+#define UINT_FAST32_TYPE  UINT32_TYPE
+#define UINT_FAST64_TYPE  UINT64_TYPE
+
+#define INTPTR_TYPE       (LONG_TYPE_SIZE == 64 ?  INT64_TYPE :  INT32_TYPE)
+#define UINTPTR_TYPE      (LONG_TYPE_SIZE == 64 ? UINT64_TYPE : UINT32_TYPE)
Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 242350)
+++ gcc/config.gcc  (working copy)
@@ -1455,11 +1455,11 @@
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h ${fbsd_tm_file} i386/x86-64.h i386/freebsd.h i386/freebsd64.h"
;;
 i[34567]86-*-netbsdelf*)
-   tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h netbsd.h netbsd-elf.h i386/netbsd-elf.h"
+   tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h netbsd.h netbsd-stdint.h netbsd-elf.h i386/netbsd-elf.h"
extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
;;
 x86_64-*-netbsd*)
-   tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h netbsd.h netbsd-elf.h i386/x86-64.h i386/netbsd64.h"
+   tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h netbsd.h netbsd-stdint.h netbsd-elf.h i386/x86-64.h i386/netbsd64.h"
extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
;;
 i[34567]86-*-openbsd*)


Re: [PATCH] warn on overflow in calls to allocation functions (bugs 77531 and 78284)

2016-11-16 Thread Martin Sebor

On 11/14/2016 01:34 PM, Eric Gallager wrote:

On 11/13/16, Martin Sebor  wrote:

Bug 77531 requests a new warning for calls to allocation functions
(those declared with attribute alloc_size(X, Y)) that overflow the
computation X * Y of the size of the allocated object.

Bug 78284 suggests that detecting and diagnosing other common errors
in calls to allocation functions, such as allocating more space than
SIZE_MAX / 2 bytes, would help prevent subsequent buffer overflows.

The attached patch adds two new warning options, -Walloc-zero and
-Walloc-larger-than=bytes that implement these two enhancements.
The patch is not 100% finished because, as it turns out, the GCC
allocation built-ins (malloc et al.) do not make use of the
attribute and so don't benefit from the warnings.  The tests are
also incomplete, and there's at least one bug in the implementation
I know about.

I'm posting the patch while stage 1 is still open and to give
a heads up on it and to get early feedback.  I expect completing
it will be straightforward.

Martin

PS The alloc_max_size function added in the patch handles sizes
specified using suffixes like KB, MB, etc.  I added that to make
it possible to specify sizes in excess of the maximum of INT_MAX
that (AFAIK) options that take integer arguments handle out of
the box.  It only belatedly occurred to me that the suffixes
are unnecessary if the option argument is handled using strtoull.
I can remove the suffix (as I suspect it will raise objections)
but I think that a general solution along these lines would be
useful to let users specify large byte sizes in other options
as well (such as -Walloca-larger-than, -Wvla-larger-than).  Are
there any suggestions or preferences?




-Walloc-larger-than looks way too similar to -Walloca-larger-than; at
first I was confused as to why you were adding the same flag again
until I spotted the one letter difference. Maybe come up with a name
that looks more distinct? Just something to bikeshed about.


I agree.  I've renamed the option to -Walloc-size-larger-than.
I think that works because it goes along with attribute alloc_size.
I'm about to post an updated patch with that change (among others).

Thanks
Martin



Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-16 Thread Michael Meissner
On Wed, Nov 16, 2016 at 04:15:10PM +, Andrew Stubbs wrote:
> On 16/11/16 13:10, Michael Meissner wrote:
> >Yeah, SFmode and DFmode should not have the TARGET_{S,D}F_FPR checks.
> 
> So, I can safely resolve my initial problem by simply removing them?
> And that wouldn't break the other use of that predicate?
> 
> >But a secondary problem is the early clobber in the match_scratch.
> 
> So, the FPR_FUSION insn works because operands 1 and 2 cannot
> conflict, which means the early-clobber is not necessary, but the
> GPR_FUSION insn cannot work because there's no way to ensure that
> operands 1 and 2 don't conflict without also specifying that
> operands 0 and 2 don't conflict, which they commonly do.
> 
> We could fix it, for now, by adding new patterns that fit both cases
> (given that the register numbers are known at peephole time).
> 
> Or, we could disable the peephole in the case where this would occur
> (as my original patch does, albeit bluntly).

I'm starting to test this patch right now (it's on LE power8 stage3 right now,
and I need to build BE power8 and BE power7 versions when I get into the office
shortly, and build spec 2017 with it for PR 78101):

[gcc]
2016-11-16  Michael Meissner  

PR target/78101
* config/rs6000/predicates.md (fusion_addis_mem_combo_load): Add
the appropriate checks for SFmode/DFmode load/stores in GPR
registers.
(fusion_addis_mem_combo_store): Likewise.
* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Rename
fusion_fpr_* to fusion_vsx_* and add in support for ISA 3.0 scalar
d-form instructions for traditional Altivec registers.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/rs6000/rs6000.md (p9 fusion store peephole2): Remove
early clobber from scratch register.  Do not match if the register
being stored is the scratch register.
(fusion_vsx___load): Rename fusion_fpr_*
to fusion_vsx_* and add in support for ISA 3.0 scalar d-form
instructions for traditional Altivec registers.
(fusion_fpr___load): Likewise.
(fusion_vsx___store): Likewise.
(fusion_fpr___store): Likewise.

[gcc/testsuite]
2016-11-16  Michael Meissner  

PR target/78101
* gcc.target/powerpc/fusion4.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 242456)
+++ gcc/config/rs6000/predicates.md (.../gcc/config/rs6000) (working copy)
@@ -1844,7 +1844,7 @@ (define_predicate "fusion_gpr_mem_load"
 ;; Match a GPR load (lbz, lhz, lwz, ld) that uses a combined address in the
 ;; memory field with both the addis and the memory offset.  Sign extension
 ;; is not handled here, since lha and lwa are not fused.
-;; With extended fusion, also match a FPR load (lfd, lfs) and float_extend
+;; With P9 fusion, also match a fpr/vector load and float_extend
 (define_predicate "fusion_addis_mem_combo_load"
   (match_code "mem,zero_extend,float_extend")
 {
@@ -1873,11 +1873,15 @@ (define_predicate "fusion_addis_mem_comb
   break;
 
 case SFmode:
-case DFmode:
   if (!TARGET_P9_FUSION)
return 0;
   break;
 
+case DFmode:
+  if ((!TARGET_POWERPC64 && !TARGET_DF_FPR) || !TARGET_P9_FUSION)
+   return 0;
+  break;
+
 default:
   return 0;
 }
@@ -1920,6 +1924,7 @@ (define_predicate "fusion_addis_mem_comb
 case QImode:
 case HImode:
 case SImode:
+case SFmode:
   break;
 
 case DImode:
@@ -1927,13 +1932,8 @@ (define_predicate "fusion_addis_mem_comb
return 0;
   break;
 
-case SFmode:
-  if (!TARGET_SF_FPR)
-   return 0;
-  break;
-
 case DFmode:
-  if (!TARGET_DF_FPR)
+  if (!TARGET_POWERPC64 && !TARGET_DF_FPR)
return 0;
   break;
 
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 242456)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
@@ -3441,28 +3441,28 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   static const struct fuse_insns addis_insns[] = {
{ SFmode, DImode, RELOAD_REG_FPR,
- CODE_FOR_fusion_fpr_di_sf_load,
- CODE_FOR_fusion_fpr_di_sf_store },
+ CODE_FOR_fusion_vsx_di_sf_load,
+ CODE_FOR_fusion_vsx_di_sf_store },
 
{ SFmode, SImode, RELOAD_REG_FPR,
- CODE_FOR_fusion_fpr_si_sf_load,
- CODE_FOR_fusion_fpr_si_sf_store },
+ 

Re: Use rtx_mode_t instead of std::make_pair

2016-11-16 Thread Bernd Schmidt

On 11/16/2016 05:52 PM, Richard Sandiford wrote:


Using rtx_mode_t also abstracts away the representation.  The fact that
it's a std::pair rather than a custom class isn't important to users of
the interface.


Looks borderline obvious to me. OK.


Bernd



[PATCH][AArch64] PR target/78362: Make sure to only take REGNO of a register

2016-11-16 Thread Kyrill Tkachov

Hi all,

As the PR says we have an RTL checking failure that occurs when building libgcc
for aarch64.
The expander code for addsi3 takes the REGNO of a SUBREG in operands[1]. The
three operands in the failing case are:
{(reg:SI 78), (subreg:SI (reg:DI 77) 0), (subreg:SI (reg:DI 73 [ ivtmp.9 ]) 0)}

According to the documentation of register_operand (which is the predicate for
operands[1]), operands[1] can be a REG or a SUBREG. If it's a subreg it may
also contain a MEM before reload (because it is guaranteed to be reloaded into
a register later). Anyway, the bottom line is that we have to be careful when
taking REGNO of expressions during expand-time.

This patch extracts the inner rtx in case we have a SUBREG and checks that it's
a REG before checking its REGNO.

Bootstrapped and tested on aarch64-none-linux-gnu. Tested aarch64-none-elf with
RTL checking enabled (without this patch that doesn't build).
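
To make the failure mode concrete, here is a minimal standalone sketch of the
"look through the SUBREG, then check for a REG before asking for its number"
idea. Everything in it (rtx_sketch, code_sketch, regno_checked,
frame_based_reg_p) is a hypothetical stand-in and not GCC's real rtx API; the
assert in regno_checked only plays the role that RTL checking plays for REGNO.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for GCC's rtx codes; not the real definitions.
enum code_sketch { REG_SKETCH, SUBREG_SKETCH, MEM_SKETCH };

struct rtx_sketch
{
  code_sketch code;
  unsigned regno;           // meaningful only when code == REG_SKETCH
  const rtx_sketch *inner;  // meaningful only when code == SUBREG_SKETCH
};

// Checked accessor: like REGNO under RTL checking, it refuses non-REGs.
static unsigned
regno_checked (const rtx_sketch *x)
{
  assert (x->code == REG_SKETCH);
  return x->regno;
}

// True if OP (a REG, or a SUBREG of anything) is the REG numbered
// FRAME_REGNO.  The REG test comes first, so regno_checked is never
// reached for a SUBREG of a MEM.
static bool
frame_based_reg_p (const rtx_sketch *op, unsigned frame_regno)
{
  const rtx_sketch *inner = op->code == SUBREG_SKETCH ? op->inner : op;
  return inner->code == REG_SKETCH && regno_checked (inner) == frame_regno;
}
```

In terms of the patch, the expander would force operands[2] into a register
whenever the equivalent of frame_based_reg_p is false for operands[1].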

Ok for trunk?
Thanks,
Kyrill

2016-11-16  Kyrylo Tkachov  

PR target/78362
* config/aarch64/aarch64.md (add3): Extract inner expression
from a subreg in operands[1] and don't call REGNO on a non-reg expression
when deciding to force operands[2] into a reg.

2016-11-16  Kyrylo Tkachov  

PR target/78362
* gcc.c-torture/compile/pr78362.c: New test.
commit 068224c568d6f06f68512f12ecebea8bfc873fe9
Author: Kyrylo Tkachov 
Date:   Tue Nov 15 14:52:33 2016 +

[AArch64] PR target/78362: Make sure to only take REGNO of a register

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 9e5eee9..1dcb6b2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1611,11 +1611,15 @@ (define_expand "add3"
 	  (match_operand:GPI 2 "aarch64_pluslong_operand" "")))]
   ""
 {
+  /* If operands[1] is a subreg extract the inner RTX.  */
+  rtx op1 = REG_P (operands[1]) ? operands[1] : SUBREG_REG (operands[1]);
+
   /* If the constant is too large for a single instruction and isn't frame
  based, split off the immediate so it is available for CSE.  */
   if (!aarch64_plus_immediate (operands[2], mode)
   && can_create_pseudo_p ()
-  && !REGNO_PTR_FRAME_P (REGNO (operands[1])))
+  && (!REG_P (op1)
+	 || !REGNO_PTR_FRAME_P (REGNO (op1))))
 operands[2] = force_reg (mode, operands[2]);
 })
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr78362.c b/gcc/testsuite/gcc.c-torture/compile/pr78362.c
new file mode 100644
index 000..66eea7d
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr78362.c
@@ -0,0 +1,11 @@
+/* PR target/78362.  */
+
+long a;
+
+void
+foo (void)
+{
+  for (;; a--)
+if ((int) a)
+  break;
+}


[PATCH v2][PR libgfortran/78314] Fix ieee_support_halting

2016-11-16 Thread Szabolcs Nagy
ieee_support_halting only checked the availability of status
flags, not trapping support.  On some targets the latter can
only be checked at runtime: feenableexcept reports if
enabling traps failed.

So check trapping support by enabling/disabling it.

Updated the test that enabled trapping to check if it is
supported.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.

gcc/testsuite/
2016-11-16  Szabolcs Nagy  

PR libgfortran/78314
* gfortran.dg/ieee/ieee_6.f90: Use ieee_support_halting.

libgfortran/
2016-11-16  Szabolcs Nagy  

PR libgfortran/78314
* config/fpu-glibc.h (support_fpu_trap): Use feenableexcept.

diff --git a/gcc/testsuite/gfortran.dg/ieee/ieee_6.f90 b/gcc/testsuite/gfortran.dg/ieee/ieee_6.f90
index 8fb4f6f..43aa3bf 100644
--- a/gcc/testsuite/gfortran.dg/ieee/ieee_6.f90
+++ b/gcc/testsuite/gfortran.dg/ieee/ieee_6.f90
@@ -9,7 +9,7 @@
   implicit none
 
   type(ieee_status_type) :: s1, s2
-  logical :: flags(5), halt(5)
+  logical :: flags(5), halt(5), haltworks
   type(ieee_round_type) :: mode
   real :: x
 
@@ -18,6 +18,7 @@
   call ieee_set_flag(ieee_all, .false.)
   call ieee_set_rounding_mode(ieee_down)
   call ieee_set_halting_mode(ieee_all, .false.)
+  haltworks = ieee_support_halting(ieee_overflow)
 
   call ieee_get_status(s1)
   call ieee_set_status(s1)
@@ -46,7 +47,7 @@
   call ieee_get_rounding_mode(mode)
   if (mode /= ieee_to_zero) call abort
   call ieee_get_halting_mode(ieee_all, halt)
-  if ((.not. halt(1)) .or. any(halt(2:))) call abort
+  if ((haltworks .and. .not. halt(1)) .or. any(halt(2:))) call abort
 
   call ieee_set_status(s2)
 
@@ -58,7 +59,7 @@
   call ieee_get_rounding_mode(mode)
   if (mode /= ieee_to_zero) call abort
   call ieee_get_halting_mode(ieee_all, halt)
-  if ((.not. halt(1)) .or. any(halt(2:))) call abort
+  if ((haltworks .and. .not. halt(1)) .or. any(halt(2:))) call abort
 
   call ieee_set_status(s1)
 
@@ -79,6 +80,6 @@
   call ieee_get_rounding_mode(mode)
   if (mode /= ieee_to_zero) call abort
   call ieee_get_halting_mode(ieee_all, halt)
-  if ((.not. halt(1)) .or. any(halt(2:))) call abort
+  if ((haltworks .and. .not. halt(1)) .or. any(halt(2:))) call abort
 
 end
diff --git a/libgfortran/config/fpu-glibc.h b/libgfortran/config/fpu-glibc.h
index 6e505da..8b29a76 100644
--- a/libgfortran/config/fpu-glibc.h
+++ b/libgfortran/config/fpu-glibc.h
@@ -121,7 +121,41 @@ get_fpu_trap_exceptions (void)
 int
 support_fpu_trap (int flag)
 {
-  return support_fpu_flag (flag);
+  int exceptions = 0;
+  int old;
+
+  if (!support_fpu_flag (flag))
+return 0;
+
+#ifdef FE_INVALID
+  if (flag & GFC_FPE_INVALID) exceptions |= FE_INVALID;
+#endif
+
+#ifdef FE_DIVBYZERO
+  if (flag & GFC_FPE_ZERO) exceptions |= FE_DIVBYZERO;
+#endif
+
+#ifdef FE_OVERFLOW
+  if (flag & GFC_FPE_OVERFLOW) exceptions |= FE_OVERFLOW;
+#endif
+
+#ifdef FE_UNDERFLOW
+  if (flag & GFC_FPE_UNDERFLOW) exceptions |= FE_UNDERFLOW;
+#endif
+
+#ifdef FE_DENORMAL
+  if (flag & GFC_FPE_DENORMAL) exceptions |= FE_DENORMAL;
+#endif
+
+#ifdef FE_INEXACT
+  if (flag & GFC_FPE_INEXACT) exceptions |= FE_INEXACT;
+#endif
+
+  old = feenableexcept (exceptions);
+  if (old == -1)
+return 0;
+  fedisableexcept (exceptions & ~old);
+  return 1;
 }
 
 


Use rtx_mode_t instead of std::make_pair

2016-11-16 Thread Richard Sandiford
This change makes the code less sensitive to the exact type of the mode,
i.e. it forces a conversion where necessary.  This becomes important
when wrappers like scalar_int_mode and scalar_mode can also be used
instead of machine_mode.

Using rtx_mode_t also abstracts away the representation.  The fact that
it's a std::pair rather than a custom class isn't important to users of
the interface.
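
A small standalone sketch of why the exact type can matter. All names ending
in _sketch are hypothetical stand-ins, and the traits template is only an
assumption about the kind of code that keys on the exact pair type: with
std::make_pair the wrapper type is deduced into the pair, while naming the
typedef forces the conversion to the plain mode enum.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins; not the real GCC declarations.
enum machine_mode_sketch { VOIDmode_sketch, SImode_sketch, DImode_sketch };
struct rtx_def_sketch { long value; };
typedef rtx_def_sketch *rtx_sketch;

// A wrapper class that converts implicitly to the mode enum.
class scalar_int_mode_sketch
{
public:
  explicit scalar_int_mode_sketch (machine_mode_sketch m) : m_mode (m) {}
  operator machine_mode_sketch () const { return m_mode; }
private:
  machine_mode_sketch m_mode;
};

typedef std::pair<rtx_sketch, machine_mode_sketch> rtx_mode_t_sketch;

// Traits that exist only for the canonical pair type.
template <typename T>
struct int_traits_sketch { static const bool supported = false; };
template <>
struct int_traits_sketch<rtx_mode_t_sketch> { static const bool supported = true; };

template <typename T>
static bool
has_traits (const T &)
{
  return int_traits_sketch<T>::supported;
}

// make_pair deduces pair<rtx, wrapper>, which is a different type.
static_assert (!std::is_same<decltype (std::make_pair
					(rtx_sketch (),
					 scalar_int_mode_sketch (SImode_sketch))),
			     rtx_mode_t_sketch>::value,
	       "make_pair keeps the wrapper type");
```

With a wrapper-typed mode m, has_traits (std::make_pair (x, m)) is false in
this sketch, whereas has_traits (rtx_mode_t_sketch (x, m)) is true because the
constructor converts m to the enum.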

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


[ This patch is part of the SVE series posted here:
  https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]

gcc/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* combine.c (try_combine): Use rtx_mode_t instead of std::make_pair.
* dwarf2out.c (mem_loc_descriptor, loc_descriptor): Likewise.
(add_const_value_attribute): Likewise.
* explow.c (plus_constant): Likewise.
* expmed.c (expand_mult, make_tree): Likewise.
* expr.c (convert_modes): Likewise.
* loop-doloop.c (doloop_optimize): Likewise.
* postreload.c (reload_cse_simplify_set): Likewise.
* simplify-rtx.c (simplify_const_unary_operation): Likewise.
(simplify_binary_operation_1, simplify_const_binary_operation):
(simplify_const_relational_operation, simplify_immed_subreg): Likewise.
* wide-int.h: Update documentation to recommend rtx_mode_t
instead of std::make_pair.

diff --git a/gcc/combine.c b/gcc/combine.c
index ca5ddae..0f3b292 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2870,8 +2870,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
  rtx outer = SET_SRC (temp_expr);
 
  wide_int o
-   = wi::insert (std::make_pair (outer, GET_MODE (SET_DEST (temp_expr))),
- std::make_pair (inner, GET_MODE (dest)),
+   = wi::insert (rtx_mode_t (outer, GET_MODE (SET_DEST (temp_expr))),
+ rtx_mode_t (inner, GET_MODE (dest)),
  offset, width);
 
  combine_merges++;
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index a7344ca..e468a4c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -15127,7 +15127,7 @@ mem_loc_descriptor (rtx rtl, machine_mode mode,
  mem_loc_result->dw_loc_oprnd2.val_class
= dw_val_class_wide_int;
  mem_loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc ();
- *mem_loc_result->dw_loc_oprnd2.v.val_wide = std::make_pair (rtl, mode);
+ *mem_loc_result->dw_loc_oprnd2.v.val_wide = rtx_mode_t (rtl, mode);
}
   break;
 
@@ -15670,7 +15670,7 @@ loc_descriptor (rtx rtl, machine_mode mode,
  GET_MODE_SIZE (mode), 0);
  loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
  loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc ();
- *loc_result->dw_loc_oprnd2.v.val_wide = std::make_pair (rtl, mode);
+ *loc_result->dw_loc_oprnd2.v.val_wide = rtx_mode_t (rtl, mode);
}
   break;
 
@@ -15695,7 +15695,7 @@ loc_descriptor (rtx rtl, machine_mode mode,
  for (i = 0, p = array; i < length; i++, p += elt_size)
{
  rtx elt = CONST_VECTOR_ELT (rtl, i);
- insert_wide_int (std::make_pair (elt, imode), p, elt_size);
+ insert_wide_int (rtx_mode_t (elt, imode), p, elt_size);
}
  break;
 
@@ -18357,7 +18357,7 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
 
 case CONST_WIDE_INT:
   {
-   wide_int w1 = std::make_pair (rtl, MAX_MODE_INT);
+   wide_int w1 = rtx_mode_t (rtl, MAX_MODE_INT);
unsigned int prec = MIN (wi::min_precision (w1, UNSIGNED),
 (unsigned int)CONST_WIDE_INT_NUNITS (rtl) * 
HOST_BITS_PER_WIDE_INT);
wide_int w = wi::zext (w1, prec);
@@ -18404,7 +18404,7 @@ add_const_value_attribute (dw_die_ref die, rtx rtl)
for (i = 0, p = array; i < length; i++, p += elt_size)
  {
rtx elt = CONST_VECTOR_ELT (rtl, i);
-   insert_wide_int (std::make_pair (elt, imode), p, elt_size);
+   insert_wide_int (rtx_mode_t (elt, imode), p, elt_size);
  }
break;
 
diff --git a/gcc/explow.c b/gcc/explow.c
index b65eee6..75af333 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -98,8 +98,7 @@ plus_constant (machine_mode mode, rtx x, HOST_WIDE_INT c,
   switch (code)
 {
 CASE_CONST_SCALAR_INT:
-  return immed_wide_int_const (wi::add (std::make_pair (x, mode), c),
-  mode);
+  return immed_wide_int_const (wi::add (rtx_mode_t (x, mode), c), mode);
 case MEM:
   /* If this is a reference to the constant pool, try replacing it with
 a reference to a new constant.  If the resulting address isn't
diff --git 

PING [PATCH] enable -fprintf-return-value by default

2016-11-16 Thread Martin Sebor

I'm looking for an approval of the attached patch.

I've adjusted the documentation based on Sandra's input (i.e.,
documented the negative of the option rather than the positive;
thank you for the review, btw.)

On 11/08/2016 08:13 PM, Martin Sebor wrote:

The -fprintf-return-value optimization has been disabled since
the last time it caused a bootstrap failure on powerpc64le.  With
the underlying problems fixed GCC has bootstrapped fine on all of
powerpc64, powerpc64le and x86_64 and tested with no regressions.
I'd like to re-enable the option.  The attached patch does that.
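
For readers unfamiliar with the option, a minimal sketch of the kind of call
it concerns, written with the standard three-argument snprintf. The function
name hex8_length is made up for illustration, and the "always 8" property
assumes unsigned int is 32 bits wide.

```cpp
#include <cassert>
#include <cstdio>

// Formats I as eight zero-padded hex digits and returns what snprintf
// reports.  With a 32-bit unsigned int, "%08x" always produces exactly
// eight characters, so a compiler that knows this can treat the result
// as a constant and drop any truncation branch in the caller.
static int
hex8_length (unsigned int i)
{
  char buf[9];
  int n = std::snprintf (buf, sizeof buf, "%08x", i);
  assert (n >= 0);
  return n;
}
```

A caller testing hex8_length (i) >= 9 is the sort of branch the documentation
describes as removable; whether GCC actually removes it depends on the
optimization level and other passes.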

Thanks
Martin



gcc/c-family/ChangeLog:

	* c.opt (-fprintf-return-value): Enable by default.

gcc/ChangeLog:

	* doc/invoke.texi (-fprintf-return-value): Document that option
	is enabled by default.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 242500)
+++ gcc/c-family/c.opt	(working copy)
@@ -1550,7 +1550,7 @@ C++ ObjC++ Var(flag_pretty_templates) Init(1)
 -fno-pretty-templates Do not pretty-print template specializations as the template signature followed by the arguments.
 
 fprintf-return-value
-C ObjC C++ ObjC++ LTO Optimization Var(flag_printf_return_value) Init(0)
+C ObjC C++ ObjC++ LTO Optimization Var(flag_printf_return_value) Init(1)
 Treat known sprintf return values as constants.
 
 freplace-objc-classes
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 242500)
+++ gcc/doc/invoke.texi	(working copy)
@@ -384,7 +384,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss @gol
 -fomit-frame-pointer -foptimize-sibling-calls @gol
 -fpartial-inlining -fpeel-loops -fpredictive-commoning @gol
--fprefetch-loop-arrays -fprintf-return-value @gol
+-fprefetch-loop-arrays -fno-printf-return-value @gol
 -fprofile-correction @gol
 -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol
 -fprofile-reorder-functions @gol
@@ -8286,18 +8286,19 @@ dependent on the structure of loops within the sou
 
 Disabled at level @option{-Os}.
 
-@item -fprintf-return-value
-@opindex fprintf-return-value
-Substitute constants for known return value of formatted output functions
-such as @code{sprintf}, @code{snprintf}, @code{vsprintf}, and @code{vsnprintf}
-(but not @code{printf} of @code{fprintf}).  This transformation allows GCC
-to optimize or even eliminate branches based on the known return value of
-these functions called with arguments that are either constant, or whose
-values are known to be in a range that makes determining the exact return
-value possible.  For example, both the branch and the body of the @code{if}
-statement (but not the call to @code{snprint}) can be optimized away when
-@code{i} is a 32-bit or smaller integer because the return value is guaranteed
-to be at most 8.
+@item -fno-printf-return-value
+@opindex fno-printf-return-value
+Do not substitute constants for known return value of formatted output
+functions such as @code{sprintf}, @code{snprintf}, @code{vsprintf}, and
+@code{vsnprintf} (but not @code{printf} or @code{fprintf}).  This
+transformation allows GCC to optimize or even eliminate branches based
+on the known return value of these functions called with arguments that
+are either constant, or whose values are known to be in a range that
+makes determining the exact return value possible.  For example, when
+@option{-fprintf-return-value} is in effect, both the branch and the
+body of the @code{if} statement (but not the call to @code{snprintf})
+can be optimized away when @code{i} is a 32-bit or smaller integer
+because the return value is guaranteed to be at most 8.
 
 @smallexample
 char buf[9];
@@ -8308,7 +8309,7 @@ if (snprintf (buf, "%08x", i) >= sizeof buf)
 The @option{-fprintf-return-value} option relies on other optimizations
 and yields best results with @option{-O2}.  It works in tandem with the
 @option{-Wformat-length} option.  The @option{-fprintf-return-value}
-option is disabled by default.
+option is enabled by default.
 
 @item -fno-peephole
 @itemx -fno-peephole2


Add SET_DECL_MODE

2016-11-16 Thread Richard Sandiford
This may no longer be necessary with the current version
of the SVE patches, but it does at least make things consistent
with the TYPE_MODE/SET_TYPE_MODE split.
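
As a rough illustration of what a getter/setter split buys, here is a
standalone sketch. The struct and both macros are hypothetical and are not
the definitions added to tree.h; the point is only that once reads go through
a value-producing accessor, writes need their own macro, and the storage
format is then known in one place.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real tree.h definitions.
enum mode_sketch { VOIDmode_sketch, SImode_sketch, DImode_sketch };

struct decl_sketch
{
  unsigned mode : 8;  // stored compactly, not as the enum type
};

// Read accessor: yields a value of the enum type, so it cannot be
// assigned to.
#define DECL_MODE_SKETCH(NODE) ((mode_sketch) (NODE)->mode)

// Write accessor: the only place that needs to know how the mode is
// stored.
#define SET_DECL_MODE_SKETCH(NODE, MODE) ((NODE)->mode = (MODE))
```

With this shape, "DECL_MODE_SKETCH (d) = VOIDmode_sketch" no longer compiles,
which is why every assignment in the patch becomes a SET_DECL_MODE call.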

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


[ This patch is part of the SVE series posted here:
  https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]

gcc/ada/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* gcc-interface/utils.c (create_label_decl): Use SET_DECL_MODE.

gcc/c/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* c-decl.c (merge_decls): Use SET_DECL_MODE.
(make_label, finish_struct): Likewise.

gcc/cp/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* class.c (finish_struct_bits): Use SET_DECL_MODE.
(build_base_field_1, layout_class_type, finish_struct_1): Likewise.
* decl.c (make_label_decl): Likewise.
* pt.c (tsubst_decl): Likewise.

gcc/fortran/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* trans-common.c (build_common_decl): Use SET_DECL_MODE.
* trans-decl.c (gfc_build_label_decl): Likewise.
* trans-types.c (gfc_get_array_descr_info): Likewise.

gcc/lto/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* lto.c (offload_handle_link_vars): Use SET_DECL_MODE.

gcc/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* tree.h (SET_DECL_MODE): New macro.
* cfgexpand.c (avoid_deep_ter_for_debug): Use SET_DECL_MODE.
(expand_gimple_basic_block): Likewise.
* function.c (split_complex_args): Likewise.
* ipa-prop.c (ipa_modify_call_arguments): Likewise.
* omp-simd-clone.c (ipa_simd_modify_stmt_ops): Likewise.
* stor-layout.c (layout_decl, relayout_decl): Likewise.
(finish_bitfield_representative): Likewise.
* tree.c (make_node_stat): Likewise.
* tree-inline.c (remap_ssa_name): Likewise.
(tree_function_versioning): Likewise.
* tree-into-ssa.c (rewrite_debug_stmt_uses): Likewise.
* tree-sra.c (sra_ipa_reset_debug_stmts): Likewise.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-loop-ivopts.c (remove_unused_ivs): Likewise.
* tree-ssa.c (insert_debug_temp_for_var_def): Likewise.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* varasm.c (make_debug_expr_from_rtl): Likewise.

libcc1/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* plugin.cc (plugin_build_add_field): Use SET_DECL_MODE.

diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index c06721f..fd6c202 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -3111,7 +3111,7 @@ create_label_decl (tree name, Node_Id gnat_node)
   tree label_decl
 = build_decl (input_location, LABEL_DECL, name, void_type_node);
 
-  DECL_MODE (label_decl) = VOIDmode;
+  SET_DECL_MODE (label_decl, VOIDmode);
 
   /* Add this decl to the current binding level.  */
   gnat_pushdecl (label_decl, gnat_node);
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 3e1b7a4..2358144 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -2373,7 +2373,7 @@ merge_decls (tree newdecl, tree olddecl, tree newtype, tree oldtype)
   /* Since the type is OLDDECL's, make OLDDECL's size go with.  */
   DECL_SIZE (newdecl) = DECL_SIZE (olddecl);
   DECL_SIZE_UNIT (newdecl) = DECL_SIZE_UNIT (olddecl);
-  DECL_MODE (newdecl) = DECL_MODE (olddecl);
+  SET_DECL_MODE (newdecl, DECL_MODE (olddecl));
   if (DECL_ALIGN (olddecl) > DECL_ALIGN (newdecl))
{
  SET_DECL_ALIGN (newdecl, DECL_ALIGN (olddecl));
@@ -3521,7 +3521,7 @@ make_label (location_t location, tree name, bool defining,
 {
   tree label = build_decl (location, LABEL_DECL, name, void_type_node);
   DECL_CONTEXT (label) = current_function_decl;
-  DECL_MODE (label) = VOIDmode;
+  SET_DECL_MODE (label, VOIDmode);
 
   c_label_vars *label_vars = ggc_alloc ();
   label_vars->shadowed = NULL;
@@ -7995,7 +7995,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
{
  TREE_TYPE (field)
= c_build_bitfield_integer_type (width, 

Re: [PATCH] Fix PR77848

2016-11-16 Thread Bill Schmidt
Thanks, Richard!  I'll follow up with these changes over the next day or
two.  Appreciate all the help!

Bill

On Wed, 2016-11-16 at 16:08 +0100, Richard Biener wrote:
> On Tue, Nov 15, 2016 at 9:03 PM, Bill Schmidt
>  wrote:
> > Hi,
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77848 identifies a situation
> > where if-conversion causes degradation when the if-converted loop is not
> > subsequently vectorized.  The if-conversion pass does not have a cost
> > model to avoid such degradations.  However, it does have a capability to
> > version the if-converted loop, so that the vectorizer can choose the
> > if-converted version if vectorization occurs, or the unmodified version
> > if vectorization does not occur.  Currently versioning is only done under
> > special circumstances.
> >
> > This patch does two things:  It requires loop versioning whenever loop
> > vectorization is enabled so that such degradations can't occur; and it
> > extends loop versioning to outer loops when such loops are of the right
> > form for outer loop vectorization.  The latter is needed to avoid
> > introducing degradations with versioning of inner loops, which disturbs
> > the pattern that outer loop vectorization expects.
> >
> > This is an embarrassingly simple patch, given how much time I spent going
> > down other paths.  The most surprising thing is that versioning the outer
> > loop doesn't require any additional handshaking with the vectorizer.  It
> > just works.  I've verified this on some examples, and we end up with the
> > correct vectorization and with the unused loop nest discarded.
> >
> > The one remaining problem with this bug is that it precludes SLP from
> > seeing if-converted loops to work on.  With this patch, if the vectorizer
> > can't vectorize an if-converted loop, the original version survives.  We
> > have one test case that fails when that happens, because it expected to
> > do SLP vectorization on the if-converted statements:
> >
> >> FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  
> >> scan-tree-dump-times slp1 "basic block vectorized" 1
> >> FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times slp1 "basic block 
> >> vectorized" 1
> >
> > Arguably, this shows a deficiency in SLP vectorization, since it won't
> > see if-converted statements in non-loop code in any event.  Eventually
> > SLP should learn to handle these kinds of PHI statements itself.
> >
> > Bootstrapped and tested on powerpc64le-unknown-linux-gnu, with only the
> > specified regression.  Is this ok for trunk?
> 
> Thanks for working on this.  Comments below.
> 
> > Thanks,
> > Bill
> >
> >
> > [gcc]
> >
> > 2016-11-15  Bill Schmidt  
> >
> > PR tree-optimization/77848
> > * tree-if-conv.c (version_loop_for_if_conversion): When versioning
> > an outer loop, only save basic block aux information for the inner
> > loop.
> > (versionable_outer_loop_p): New function.
> > (tree_if_conversion): Always version a loop when vectorization
> > is enabled; version the outer loop instead of the inner one
> > if the pattern will be recognized for outer-loop vectorization.
> >
> > [gcc/testsuite]
> >
> > 2016-11-15  Bill Schmidt  
> >
> > PR tree-optimization/77848
> > * gfortran.dg/vect/pr78848.f: New test.
> >
> >
> > Index: gcc/testsuite/gfortran.dg/vect/pr77848.f
> > ===
> > --- gcc/testsuite/gfortran.dg/vect/pr77848.f(revision 0)
> > +++ gcc/testsuite/gfortran.dg/vect/pr77848.f(working copy)
> > @@ -0,0 +1,24 @@
> > +! PR 77848: Verify versioning is on when vectorization fails
> > +! { dg-do compile }
> > +! { dg-options "-O3 -ffast-math -fdump-tree-ifcvt 
> > -fdump-tree-vect-details" }
> > +
> > +  subroutine sub(x,a,n,m)
> > +  implicit none
> > +  real*8 x(*),a(*),atemp
> > +  integer i,j,k,m,n
> > +  real*8 s,t,u,v
> > +  do j=1,m
> > + atemp=0.d0
> > + do i=1,n
> > +if (abs(a(i)).gt.atemp) then
> > +   atemp=a(i)
> > +   k = i
> > +end if
> > + enddo
> > + call dummy(atemp,k)
> > +  enddo
> > +  return
> > +  end
> > +
> > +! { dg-final { scan-tree-dump "LOOP_VECTORIZED" "ifcvt" } }
> > +! { dg-final { scan-tree-dump "vectorized 0 loops in function" "vect" } }
> > Index: gcc/tree-if-conv.c
> > ===
> > --- gcc/tree-if-conv.c  (revision 242412)
> > +++ gcc/tree-if-conv.c  (working copy)
> > @@ -2533,6 +2533,7 @@ version_loop_for_if_conversion (struct loop *loop)
> >struct loop *new_loop;
> >gimple *g;
> >gimple_stmt_iterator gsi;
> > +  unsigned int save_length;
> >
> >g = gimple_build_call_internal (IFN_LOOP_VECTORIZED, 2,
> >   build_int_cst 

Re: [Patch 14/17] [libgcc, ARM] Generalise float-to-half conversion function.

2016-11-16 Thread Kyrill Tkachov

Hi James,

On 11/11/16 15:42, James Greenhalgh wrote:

Hi,

I'm adapting this patch from work started by Matthew Wahab.

Conversions from double precision floats to the ARM __fp16 are required
to round only once. A conversion function for double to __fp16 is needed
to support this on soft-fp targets. This and the following patch add this
conversion function by reusing the existing float to __fp16 function
config/arm/fp16.c:__gnu_f2h_internal.

This patch generalizes __gnu_f2h_internal by adding a specification of
the source format and reworking the code to make use of it. Initially,
only the binary32 format is supported.
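
A standalone sketch of how a format description like this can drive field
extraction. The field names follow the struct format in the patch below, but
the helper functions and the binary64 entry are assumptions added for
illustration (the binary64 numbers are the usual IEEE 754 parameters and are
not part of this patch).

```cpp
#include <cassert>

// Same four fields as the patch's struct format.
struct format_sketch
{
  unsigned size;         // total number of bits
  unsigned bias;         // exponent bias
  unsigned exponent;     // exponent width in bits
  unsigned significand;  // explicitly stored significand bits
};

static const format_sketch binary32_sketch = { 32, 127, 8, 23 };
// Assumption: IEEE 754 binary64 parameters, not taken from this patch.
static const format_sketch binary64_sketch = { 64, 1023, 11, 52 };

// The sign is the top bit of the encoding.
static unsigned long long
sign_bit (const format_sketch *fmt, unsigned long long a)
{
  assert (fmt->size >= 1 && fmt->size <= 64);
  return (a >> (fmt->size - 1)) & 1;
}

// The exponent field sits just above the stored significand; removing
// the bias gives the power of two.
static long long
unbiased_exponent (const format_sketch *fmt, unsigned long long a)
{
  unsigned long long raw
    = (a >> fmt->significand) & ((1ULL << fmt->exponent) - 1);
  return (long long) raw - (long long) fmt->bias;
}

// The low bits hold the significand without its implicit leading 1.
static unsigned long long
stored_significand (const format_sketch *fmt, unsigned long long a)
{
  return a & ((1ULL << fmt->significand) - 1);
}
```

For example, -2.5f is encoded as 0xc0200000, which these helpers split into
sign 1, exponent 1 and stored significand 0x200000.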

A previous version of this patch had a bug in its handling of rounding.
The update in this patch should be sufficient to fix it,

replacing:


   else
 mask = 0x1fff;

With:

  mask = (point - 1) >> 10;

I've tested that fix throwing semi-random bit-patterns at the conversion
function to confirm that the software implementation now matches the
hardware behaviour for this routine.

Additionally, bootstrapped again, and cross-tested with no issues.

OK?

Thanks,
James



libgcc/

2016-11-09  James Greenhalgh  
Matthew Wahab  

* config/arm/fp16.c (struct format): New.
(binary32): New.
(__gnu_float2h_internal): New.  Body moved from
__gnu_f2h_internal and generalize.
(__gnu_f2h_internal): Move body to function __gnu_float2h_internal.
Call it with binary32.



diff --git a/libgcc/config/arm/fp16.c b/libgcc/config/arm/fp16.c
index 39c863c..ba89796 100644
--- a/libgcc/config/arm/fp16.c
+++ b/libgcc/config/arm/fp16.c
@@ -22,40 +22,74 @@
see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
.  */
 
+struct format

+{
+  /* Number of bits.  */
+  unsigned long long size;
+  /* Exponent bias.  */
+  unsigned long long bias;
+  /* Exponent width in bits.  */
+  unsigned long long exponent;
+  /* Significand precision in explicitly stored bits.  */
+  unsigned long long significand;
+};
+
+static const struct format
+binary32 =
+{
+  32,   /* size.  */
+  127,  /* bias.  */
+  8,/* exponent.  */
+  23/* significand.  */
+};
+
 static inline unsigned short
-__gnu_f2h_internal(unsigned int a, int ieee)
+__gnu_float2h_internal (const struct format* fmt,
+   unsigned long long a, int ieee)
 {
-  unsigned short sign = (a >> 16) & 0x8000;
-  int aexp = (a >> 23) & 0xff;
-  unsigned int mantissa = a & 0x007f;
-  unsigned int mask;
-  unsigned int increment;
+  unsigned long long point = 1ULL << fmt->significand;;

Trailing ';'.

<...>
@@ -93,7 +127,13 @@ __gnu_f2h_internal(unsigned int a, int ieee)
 
   /* We leave the leading 1 in the mantissa, and subtract one

  from the exponent bias to compensate.  */
-  return sign | (((aexp + 14) << 10) + (mantissa >> 13));
+  return sign | (((aexp + 14) << 10) + (mantissa >> (fmt->significand - 10)));
+}

I suppose I'm not very familiar with the soft-fp code but I don't see at a
glance how the comment relates to the operation it sits above (where is the
'one' being subtracted from the bias?). If you want to improve that comment or
give me a quick explanation of why the code does what it says it does, it would
be appreciated.
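
One possible reading, sketched as standalone code for the binary32 case. It
assumes that aexp is the unbiased exponent and that mantissa still carries
the implicit leading 1 at bit 23 at this point; pack_half_sketch is a made-up
name, and rounding, overflow and denormals are ignored. Under those
assumptions the leading 1 lands on bit 10 after the shift, i.e. on the lowest
exponent bit, so it contributes 1 to the exponent field and the code adds 14
rather than the binary16 bias of 15.

```cpp
#include <cassert>

// Packs a normal value into binary16 fields.  SIGN is already in bit 15,
// AEXP is the unbiased exponent, MANTISSA is the 24-bit significand with
// the implicit leading 1 at bit 23.
static unsigned short
pack_half_sketch (unsigned short sign, int aexp, unsigned int mantissa)
{
  assert (mantissa & 0x800000u);
  // (mantissa >> 13) has its leading 1 at bit 10, which adds one to the
  // exponent field; aexp + 14 plus that carry gives aexp + 15.
  return sign | (((aexp + 14) << 10) + (mantissa >> 13));
}
```

For 1.0 (aexp 0, mantissa 0x800000) this gives 0x3800 + 0x400 = 0x3c00, the
binary16 encoding of 1.0.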

I've gone through the generalisation and it looks correct to me.
So, given that you have put this through the testing you describe, this is ok
with the nits above addressed.

Thanks,
Kyrill




Reorganise machmode.h headers

2016-11-16 Thread Richard Sandiford
Later patches will make machmode.h rely on wide-int.h and the
new poly-int.h, so it needs to appear later in the coretypes.h
include list.

Previously machmode.h included insn-modes.h, which as well as
the main mode enum contains configuration information like
MAX_BITSIZE_MODE_ANY_INT.  This still needs to come first,
since files like wide-int.h depend on the configuration
information.

Similarly, later patches will make the auto-generated inline
mode size functions use poly-int.h, so the patch splits them
out into their own header file and includes it after the
integer utilities.

The patch also makes the generator files include machmode.h
via coretypes.h.  Previously they did it by more indirect means.

Finally, the patch makes wide-int-print.h available via coretypes.h
too.  There didn't seem to be any reason to force only the print
routines to be included directly, and it would be painful to extend
that approach to the new polynomial integer classes.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


[ This patch is part of the SVE series posted here:
  https://gcc.gnu.org/ml/gcc/2016-11/msg00030.html ]

gcc/
2016-11-16  Richard Sandiford  
Alan Hayward  
David Sherwood  

* Makefile.in (MACHMODE_H): Remove insn-modes.h
(CORETYPES_H): New define.
(MOSTLYCLEANFILES): Add insn-modes-inline.h.
(insn-modes-inline.h, s-modes-inline-h): New rules.
(generated_files): Add insn-modes-inline.h.
(RTL_BASE_H, TREE_CORE_H): Use CORETYPES_H instead of coretypes.h.
(build/gensupport.o, build/print-rtl.o, build/read-md.o): Likewise.
(build/read-rtl.o, build/rtl.o, build/vec.o, build/hash-table.o)
(build/inchash.o, build/gencondmd.o, build/genattr.o): Likewise.
(build/genattr-common.o, build/genattrtab.o, build/genautomata.o)
(build/gencheck.o, build/gencodes.o, build/genconditions.o): Likewise.
(build/genconfig.o, build/genconstants.o, build/genemit.o): Likewise.
(build/genenums.o, build/genextract.o, build/genflags.o): Likewise.
(build/gentarget-def.o, build/genmddeps.o, build/genopinit.o)
(build/genoutput.o, build/genpeep.o, build/genpreds.o): Likewise.
(build/genrecog.o, build/genmddump.o, build/genmatch.o): Likewise.
(build/gencfn-macros.o, build/gcov-iov.o): Likewise.
* coretypes.h: Include everything up to real.h for generators.
Include insn-modes.h first.  Include wide-int-print.h after
wide-int.h.  Include insn-modes-inline.h and then machmode.h.
* machmode.h: Don't include insn-modes.h here.
* function-tests.c: Remove includes of signop.h, machmode.h,
double-int.h and wide-int.h.
* rtl.h: Likewise.
* gcc-rich-location.c: Remove includes of machmode.h, double-int.h
and wide-int.h.
* optc-save-gen.awk: Likewise.
* gencheck.c (BITS_PER_UNIT): Delete dummy definition.
* godump.c: Remove include of wide-int-print.h.
* pretty-print.h: Likewise.
* wide-int-print.cc: Likewise.
* wide-int.cc: Likewise.
* hash-map-tests.c: Remove include of signop.h.
* hash-set-tests.c: Likewise.
* rtl-tests.c: Likewise.
* mkconfig.sh: Remove include of machmode.h.
* genmodes.c (emit_insn_modes_h): Split emission of inline functions
into...
(emit_insn_modes_inline_h): ...this new function.  Emit the code
into an insn-modes-inline.h header file, adding appropriate
include guards and end comments.
(emit_insn_modes_c_header): Remove include of machmode.h.
(emit_min_insn_modes_c_header): Include coretypes.h rather than
machmode.h.
(main): Handle -i flag and call emit_insn_modes_inline_h when
it is passed.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7ecd1e4..2daa6a6 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -896,14 +896,15 @@ COMMON_TARGET_DEF = common/common-target.def target-hooks-macros.h
 TARGET_H = $(TM_H) target.h $(TARGET_DEF) insn-modes.h insn-codes.h
 C_TARGET_H = c-family/c-target.h $(C_TARGET_DEF)
 COMMON_TARGET_H = common/common-target.h $(INPUT_H) $(COMMON_TARGET_DEF)
-MACHMODE_H = machmode.h mode-classes.def insn-modes.h
+MACHMODE_H = machmode.h mode-classes.def
 HOOKS_H = hooks.h $(MACHMODE_H)
 HOSTHOOKS_DEF_H = hosthooks-def.h $(HOOKS_H)
 LANGHOOKS_DEF_H = langhooks-def.h $(HOOKS_H)
 TARGET_DEF_H = target-def.h target-hooks-def.h $(HOOKS_H) targhooks.h
 C_TARGET_DEF_H = c-family/c-target-def.h c-family/c-target-hooks-def.h \
   $(TREE_H) $(C_COMMON_H) $(HOOKS_H) common/common-targhooks.h
-RTL_BASE_H = coretypes.h rtl.h rtl.def $(MACHMODE_H) reg-notes.def \
+CORETYPES_H = coretypes.h insn-modes.h insn-modes-inline.h
+RTL_BASE_H = $(CORETYPES_H) rtl.h rtl.def $(MACHMODE_H) reg-notes.def \
   insn-notes.def 

Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)

2016-11-16 Thread Jakub Jelinek
On Wed, Nov 16, 2016 at 05:01:31PM +0100, Martin Liška wrote:
> +  use_operand_p use_p;
> +  imm_use_iterator imm_iter;
> +  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, poisoned_var)
> +{
> +  gimple *use = USE_STMT (use_p);
> +  if (is_gimple_debug (use))
> + continue;
> +
> +  built_in_function b = (recover_p
> +  ? BUILT_IN_ASAN_REPORT_USE_AFTER_SCOPE_NOABORT
> +  : BUILT_IN_ASAN_REPORT_USE_AFTER_SCOPE);
> +  tree fun = builtin_decl_implicit (b);
> +  pretty_printer pp;
> +  pp_tree_identifier (&pp, DECL_NAME (var_decl));
> +
> +  gcall *call = gimple_build_call (fun, 2, asan_pp_string (&pp),
> +DECL_SIZE_UNIT (var_decl));
> +  gimple_set_location (call, gimple_location (use));
> +
> +  /* The USE can be a gimple PHI node.  If so, insert the call on
> +  all edges leading to the PHI node.  */
> +  if (is_a <gphi *> (use))
> + {
> +   gphi * phi = dyn_cast<gphi *> (use);

No space after *.

> +   for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
> + if (gimple_phi_arg_def (phi, i) == poisoned_var)
> +   {
> + edge e = gimple_phi_arg_edge (phi, i);
> + gsi_insert_seq_on_edge (e, call);
> + *need_commit_edge_insert = true;

You clearly don't have a sufficient testsuite coverage for this,
because this won't really work if you have more than one phi
argument equal to poisoned_var.  Inserting the same gimple stmt
into multiple places can't really work.  I bet you want to set
call to NULL after the gsi_insert_seq_on_edge and before that
call if (call == NULL) { call = gimple_build_call (...);
gimple_set_location (...); }
Or maybe gimple_copy for the 2nd etc. would work too, dunno.

> +   }
> + }
> +  else
> + {
> +   gimple_stmt_iterator gsi = gsi_for_stmt (use);
> +   gsi_insert_before (&gsi, call, GSI_NEW_STMT);
> + }
> +}
> +
> +  gimple *nop = gimple_build_nop ();
> +  SSA_NAME_IS_DEFAULT_DEF (poisoned_var) = true;
> +  SSA_NAME_DEF_STMT (poisoned_var) = nop;
> +  gsi_replace (iter, nop, GSI_NEW_STMT);

The last argument of gsi_replace is a bool, not GSI_*.
But not sure how this will work anyway, I think SSA_NAME_IS_DEFAULT_DEF
are supposed to have SSA_NAME_DEF_STMT a GIMPLE_NOP that doesn't
have bb set, while you are putting it into the stmt sequence.
Shouldn't you just gsi_remove iter instead?

Otherwise LGTM, but please post the asan patch to llvm-commits
or through their web review interface.

Jakub
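
The point about the PHI arguments can be illustrated with a toy model. This is
only a sketch under stated assumptions: Stmt, Edge, insert_on_edge and
insert_fresh_calls are hypothetical stand-ins and not GCC's gimple or CFG API.
The model assumes a statement can sit in at most one sequence, so a second
insertion of the same object is rejected, while building a fresh statement per
matching edge works.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for a statement and a CFG edge (hypothetical, not GCC types).
struct Stmt {
  std::string text;
  const void *owner = nullptr;   // sequence that currently holds the statement
};

struct Edge {
  std::vector<Stmt *> pending;   // statements queued for insertion on the edge
};

// Queue a statement on an edge.  Returns false if the statement already
// belongs to some sequence, which models "the same stmt in multiple places".
bool insert_on_edge(Edge &e, Stmt *s) {
  if (s->owner != nullptr)
    return false;
  s->owner = &e;
  e.pending.push_back(s);
  return true;
}

// Build one fresh statement per edge whose PHI argument is the poisoned
// variable.  The pool owns the statements.  Returns the number inserted.
std::size_t insert_fresh_calls(std::vector<Edge> &edges,
                               const std::vector<bool> &arg_is_poisoned,
                               std::vector<std::unique_ptr<Stmt>> &pool) {
  std::size_t inserted = 0;
  for (std::size_t i = 0; i < edges.size() && i < arg_is_poisoned.size(); ++i) {
    if (!arg_is_poisoned[i])
      continue;
    pool.push_back(std::make_unique<Stmt>());
    pool.back()->text = "report_use_after_scope";
    if (insert_on_edge(edges[i], pool.back().get()))
      ++inserted;
  }
  return inserted;
}
```

In this model, reusing one Stmt for two edges fails on the second insertion,
which is the situation that arises when more than one PHI argument equals
poisoned_var.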


Re: [PATCH] Fix PR78305

2016-11-16 Thread Marc Glisse

On Wed, 16 Nov 2016, Michael Matz wrote:

> Hi,
>
> On Wed, 16 Nov 2016, Marc Glisse wrote:
>
> > > > The first sentence about ORing the sign bit sounds strange (except for a
> > > > sign-magnitude representation). With 2's complement, INT_MIN is -2^31, the
> > > > divisors are the 2^k and -(2^k). -2 * 2^30 yields INT_MIN, but your test
> > > > misses -2 as a possible divisor. On the other hand, 0b100...001 (aka
> > > > -INT_MAX)
> > > > is not a divisor of INT_MIN but your test says the reverse.
> > >
> > > Yeah, but it handled the testcase ;)  So I guess the easiest would be
> > > to check integer_pow2p (abs (TREE_OPERAND (t, 0)) then, thus
> > > wi::popcount (wi::abs (TREE_OPERAND (t, 0))) == 1?
> >
> > Looks good to me, thanks.
>
> An integer X is a power of two if and only if
>   X & -X == 0  (&& X != 0 if you want to exclude zero)
> which also nicely handles positive and negative numbers at the same time.
> No need for popcounts or abs.

There are bit tricks to test for powers of 2, but X & -X == 0 doesn't
quite work (X & -X == X is closer, but needs a tweak for negative
numbers). We could use

wi::pow2_p (wi::abs (TREE_OPERAND (t, 0)))

adding a new function pow2_p so it remains readable and we reduce the risk
of using the wrong bit trick...


--
Marc Glisse
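
The bit tricks discussed in this thread can be compared on plain 32-bit
integers. This is a minimal sketch, not the wide-int code under review: the
function names are made up for illustration, and the arithmetic is done on
unsigned values so that negating INT_MIN stays well defined.

```cpp
#include <cstdint>

// "X & -X == 0": X & -X isolates the lowest set bit, so the result is zero
// only when X itself is zero.  This is not a power-of-two test.
bool lowest_bit_is_zero(std::int32_t x) {
  std::uint32_t u = static_cast<std::uint32_t>(x);
  return (u & (0u - u)) == 0u;
}

// "X & -X == X" (zero excluded): true when X has exactly one bit set.
// Negative values such as -2 have many bits set and are rejected.
bool equals_lowest_bit(std::int32_t x) {
  std::uint32_t u = static_cast<std::uint32_t>(x);
  return u != 0u && (u & (0u - u)) == u;
}

// Power-of-two test on the magnitude, in the spirit of pow2_p (abs (X)).
bool abs_is_pow2(std::int32_t x) {
  std::uint32_t u = static_cast<std::uint32_t>(x);
  std::uint32_t mag = x < 0 ? 0u - u : u;   // |x|, also correct for INT32_MIN
  return mag != 0u && (mag & (mag - 1u)) == 0u;
}
```

With these definitions, abs_is_pow2 accepts -2 and INT32_MIN and rejects
-INT32_MAX, which matches the divisors of INT_MIN described above, while
equals_lowest_bit rejects -2.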


Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

2016-11-16 Thread Bernd Schmidt

On 11/15/2016 05:31 PM, Andrew Senkevich wrote:

2016-11-15 17:56 GMT+03:00 Jeff Law :

On 11/15/2016 05:55 AM, Andrew Senkevich wrote:


2016-11-11 14:16 GMT+03:00 Uros Bizjak :


--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
--- a/gcc/init-regs.c
+++ b/gcc/init-regs.c
--- a/gcc/machmode.h
+++ b/gcc/machmode.h

These are middle-end changes, you will need a separate review for these.



Who could review these changes?


I can.  I likely dropped the message because it looked x86 specific, so if
you could resend it'd be appreciated.


Attached (diff with previous only in fixed comments typos).


Next time please split middle-end changes out from target-related stuff 
and send them separately.


These ones are OK.


Bernd


Re: [PATCH][PPC] Fix ICE using power9 with soft-float

2016-11-16 Thread Andrew Stubbs

On 16/11/16 13:10, Michael Meissner wrote:

> Yeah, SFmode and DFmode should not have the TARGET_{S,D}F_FPR checks.


So, I can safely resolve my initial problem by simply removing them? And 
that wouldn't break the other use of that predicate?



> But a secondary problem is the early clobber in the match_scratch.


So, the FPR_FUSION insn works because operands 1 and 2 cannot conflict, 
which means the early-clobber is not necessary, but the GPR_FUSION insn 
cannot work because there's no way to ensure that operands 1 and 2 don't 
conflict without also specifying that operands 0 and 2 don't conflict, 
which they commonly do.


We could fix it, for now, by adding new patterns that fit both cases 
(given that the register numbers are known at peephole time).


Or, we could disable the peephole in the case where this would occur (as 
my original patch does, albeit bluntly).


Or, something else?

Andrew


Re: [PATCH] Fix PR78305

2016-11-16 Thread Michael Matz
Hi,

On Wed, 16 Nov 2016, Michael Matz wrote:

> > Looks good to me, thanks.
> 
> An integer X is a power of two if and only if
>   X & -X == 0  (&& X != 0 if you want to exclude zero)

Nonsense.  It's X & -X == X (or X & (X-1) == 0) of course, and doesn't 
handle negative numbers.  Still, no popcount needed.


Ciao,
Michael.


Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)

2016-11-16 Thread Martin Liška
As the patch quite significantly slowed down tramp3d, here is an analysis
of the number of variables which are poisoned by the sanitizer:

== normal variables ==
   24 B:  348x (5.80%)
   16 B:  273x (4.55%)
    8 B:  237x (3.95%)
    1 B:  177x (2.95%)
    4 B:  119x (1.98%)
   40 B:   89x (1.48%)
  144 B:   83x (1.38%)

== C++ artificial variables ==
    1 B: 1325x (22.08%)
    8 B:  983x (16.38%)
   24 B:  586x (9.77%)
  144 B:  415x (6.92%)
    4 B:  310x (5.17%)
   12 B:  274x (4.57%)
   16 B:  119x (1.98%)

A sample of the C++ artificial variables can be seen here:

  struct iterator D.608813;
  struct iterator D.369241;

  try
{
  ASAN_MARK (2, , 8);
  _1 = >D.110510._M_impl._M_start;
  __gnu_cxx::__normal_iterator >::__normal_iterator (, _1);
  try
{
  D.608813 = D.369241;
  return D.608813;
}
  finally
{
  ASAN_MARK (1, , 8);
}
}
  catch
{
  <<>>
}

The problem is that these artificial variables (>70% of all in tramp3d) are
often passed by reference, and many functions in tramp3d either mark the
argument as unused or just dereference it. In situations where the reference
is not saved, these variables should not have to live in memory. However,
do we have any machinery that can help with that?

My next step would be to adapt the sanopt algorithm to catch use-after-scope
{un}poisoning; however, this is a different story that has a significant
impact on the number of poisoned variables.

Thoughts?
Martin
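
The pattern described above can be sketched in a few lines of source. This is
an illustration only: Iterator, peek and first_element are hypothetical names,
not code from tramp3d. The temporary plays the role of an artificial variable
such as D.369241 in the dump, and the callee only dereferences the reference
without saving it.

```cpp
// Hypothetical stand-in for the iterator type seen in the dump.
struct Iterator {
  const int *pos;
};

// The callee reads through the reference and never stores its address.
int peek(const Iterator &it) {
  return *it.pos;
}

// Iterator{data} is a compiler-generated temporary.  Its address is taken
// only to bind the reference parameter of peek.
int first_element(const int *data) {
  return peek(Iterator{data});
}
```

Here the temporary is addressable only because of the by-reference call, which
is the case where one would like the variable not to live in memory.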


Re: [PATCH] Fix PR78305

2016-11-16 Thread Michael Matz
Hi,

On Wed, 16 Nov 2016, Marc Glisse wrote:

> > > The first sentence about ORing the sign bit sounds strange (except for a
> > > sign-magnitude representation). With 2's complement, INT_MIN is -2^31, the
> > > divisors are the 2^k and -(2^k). -2 * 2^30 yields INT_MIN, but your test
> > > misses -2 as a possible divisor. On the other hand, 0b100...001 (aka
> > > -INT_MAX)
> > > is not a divisor of INT_MIN but your test says the reverse.
> > 
> > Yeah, but it handled the testcase ;)  So I guess the easiest would be
> > to check integer_pow2p (abs (TREE_OPERAND (t, 0)) then, thus
> > wi::popcount (wi::abs (TREE_OPERAND (t, 0))) == 1?
> 
> Looks good to me, thanks.

An integer X is a power of two if and only if
  X & -X == 0  (&& X != 0 if you want to exclude zero)
which also nicely handles positive and negative numbers at the same time.  
No need for popcounts or abs.


Ciao,
Michael.


Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)

2016-11-16 Thread Martin Liška
On 11/16/2016 02:07 PM, Jakub Jelinek wrote:
> On Wed, Nov 16, 2016 at 01:25:04PM +0100, Martin Liška wrote:
>>  
>> +
>> +/* Expand the ASAN_{LOAD,STORE} builtins.  */
> 
> Stale comment.

Fixed.

> 
>> +
>> +bool
>> +asan_expand_poison_ifn (gimple_stmt_iterator *iter,
>> +bool *need_commit_edge_insert)
>> +{
> ...
>> +  use_operand_p use_p;
>> +  imm_use_iterator imm_iter;
>> +  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, poisoned_var)
>> +{
>> +  gimple *use = USE_STMT (use_p);
>> +
> 
> You want to ignore debug stmts uses here (or reset them).

Likewise.

> 
>> +  built_in_function b = (recover_p
>> + ? BUILT_IN_ASAN_REPORT_USE_AFTER_SCOPE_NOABORT
>> + : BUILT_IN_ASAN_REPORT_USE_AFTER_SCOPE);
>> +  tree fun = builtin_decl_implicit (b);
>> +  pretty_printer pp;
>> +  pp_tree_identifier (&pp, DECL_NAME (var_decl));
>> +
>> +  gcall *call = gimple_build_call (fun, 2, asan_pp_string (&pp),
>> +   DECL_SIZE_UNIT (var_decl));
>> +  gimple_set_location (call, gimple_location (g));
> 
> Is that the location you want?  I mean shouldn't it use gimple_location (use)
> instead?  The bug is on the use, not on the spot where it went out of scope.
> Though the question is what to use if gimple_location (use) is
> UNKNOWN_LOCATION.

I changed the location to gimple_location(use).

> 
>> +
>> +  /* If ASAN_POISON is used in a PHI node, let's insert the call on
>> + the leading to the PHI node BB.  */
> 
> The comment doesn't make sense grammatically to me.

Modified.

> 
>> +  if (is_a <gphi *> (use))
>> +{
>> +  gphi * phi = dyn_cast<gphi *> (use);
>> +  for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
>> +if (gimple_phi_arg_def (phi, i) == poisoned_var)
>> +  {
>> +edge e = gimple_phi_arg_edge (phi, i);
>> +gsi_insert_seq_on_edge (e, call);
>> +*need_commit_edge_insert = true;
> 
> What if there are multiple PHI args with that use?
> Shouldn't you use just FOR_EACH_USE_ON_STMT or what macros we have?

Well, as I read the macro, I still have to iterate over gphi arguments
to find the proper edge.

> 
>> --- a/libsanitizer/asan/asan_errors.cc
>> +++ b/libsanitizer/asan/asan_errors.cc
>> @@ -279,6 +279,27 @@ void ErrorInvalidPointerPair::Print() {
>>ReportErrorSummary(bug_type, );
>>  }
> 
> As I wrote on IRC, we have to submit this to compiler-rt and only
> if it is accepted, cherry-pick it together with the gcc changes.

Sure, if we are fine with the GCC part, I can suggest the sanitizer changes.

> 
>> --- a/libsanitizer/asan/asan_errors.h
>> +++ b/libsanitizer/asan/asan_errors.h
>> @@ -294,6 +294,24 @@ struct ErrorInvalidPointerPair : ErrorBase {
>>void Print();
>>  };
>>  
>> +struct ErrorUseAfterScope : ErrorBase {
>> +  uptr pc, bp, sp;
>> +  const char *variable_name;
>> +  u32 variable_size;
> 
> Shouldn't this be uptr?

Yep, changed in all places.
I'm attaching the second version of the patch. I've tested the patch on the
Linux kernel and I can see >10K places where ASAN_POISON is removed
(apparently there's no place where we would expand to the new API entry
point).

Martin

> 
>> +  ErrorUseAfterScope(u32 tid, uptr pc_, uptr bp_, uptr sp_,
>> + const char *variable_name_, u32 variable_size_)
> 
> And here.
> 
>> +// --- ReportUseAfterScope --- {{{1
>> +void ReportUseAfterScope(const char *variable_name, u32 variable_size,
> 
> And here?
> 
>> +void ReportUseAfterScope(const char *variable_name, u32 variable_size,
>> + bool fatal);
> 
> And here?
> 
>   Jakub
> 

From 9505c31813f224b855c5b2fab6c157e99ce54e59 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 14 Nov 2016 16:49:05 +0100
Subject: [PATCH] use-after-scope: introduce ASAN_POISON internal fn

gcc/ChangeLog:

2016-11-16  Martin Liska  

	* asan.c (asan_expand_poison_ifn): New function.
	* asan.h (asan_expand_poison_ifn): Declare the function.
	* internal-fn.c (expand_ASAN_POISON): New function.
	* internal-fn.def (ASAN_POISON): New internal fn.
	* sanitizer.def (BUILT_IN_ASAN_REPORT_USE_AFTER_SCOPE_NOABORT):
	New built-in.
	(BUILT_IN_ASAN_REPORT_USE_AFTER_SCOPE): Likewise.
	* sanopt.c (pass_sanopt::execute): Expand IFN_ASAN_POISON.
	* tree-ssa.c (is_asan_mark_p): New function.
	(execute_update_addresses_taken): Make local variables as not
	addressable if address of these variables is just taken by
	ASAN_MARK.

gcc/testsuite/ChangeLog:

2016-11-16  Martin Liska  

	* gcc.dg/asan/use-after-scope-3.c: Run just with -O0.
	* gcc.dg/asan/use-after-scope-9.c: Run just with -O2 and
	change expected output.
---
 gcc/asan.c| 72 ++-
 gcc/asan.h|  1 +
 gcc/internal-fn.c |  7 +++
 gcc/internal-fn.def   |  1 

Re: [Patch 16/17 libgcc ARM] Half to double precision conversions

2016-11-16 Thread Kyrill Tkachov


On 11/11/16 15:42, James Greenhalgh wrote:

Hi,

This patch adds the half-to-double conversions, both as library functions,
or when supported in hardware, using the appropriate instructions.

That means adding support for the __gnu_d2h_{ieee/alternative} library calls
added in patch 2/4, and providing a more aggressive truncdfhf2 where we can.

This also lets us remove the implementation of TARGET_CONVERT_TO_TYPE.

Bootstrapped on an ARMv8-A machine, and cross-tested with no issues.

OK?

Thanks,
James

---
gcc/

2016-11-09  James Greenhalgh  

* config/arm/arm.c (arm_convert_to_type): Delete.
(TARGET_CONVERT_TO_TYPE): Delete.
(arm_init_libfuncs): Enable trunc_optab from DFmode to HFmode.
(arm_libcall_uses_aapcs_base): Add trunc_optab from DF- to HFmode.
* config/arm/arm.h (TARGET_FP16_TO_DOUBLE): New.
* config/arm/arm.md (truncdfhf2): Only convert through SFmode if we
are in fast math mode, and have no single step hardware instruction.
(extendhfdf2): Only expand through SFmode if we don't have a
single-step hardware instruction.
* config/arm/vfp.md (*truncdfhf2): New.
(extendhfdf2): Likewise.

gcc/testsuite/

2016-11-09  James Greenhalgh  

* gcc.target/arm/fp16-rounding-alt-1.c (ROUNDED): Change expected
result.
* gcc.target/arm/fp16-rounding-ieee-1.c (ROUNDED): Change expected
result.


<...>

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8393f65..4074773 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5177,20 +5177,35 @@
   ""
 )
 
-;; DFmode to HFmode conversions have to go through SFmode.

+;; DFmode to HFmode conversions on targets without a single-step hardware
+;; instruction for it would have to go through SFmode.  This is dangerous
+;; as it introduces double rounding.
+;;
+;; Disable this pattern unless we are in an unsafe math mode, or we have
+;; a single-step instruction.
+
 (define_expand "truncdfhf2"
-  [(set (match_operand:HF  0 "general_operand" "")
+  [(set (match_operand:HF  0 "s_register_operand" "")
(float_truncate:HF
-(match_operand:DF 1 "general_operand" "")))]
-  "TARGET_EITHER"
-  "
-  {
-rtx op1;
-op1 = convert_to_mode (SFmode, operands[1], 0);
-op1 = convert_to_mode (HFmode, op1, 0);
-emit_move_insn (operands[0], op1);
-DONE;
-  }"
+(match_operand:DF 1 "s_register_operand" "")))]
+  "(TARGET_EITHER && flag_unsafe_math_optimizations)
+   || (TARGET_32BIT && TARGET_FP16_TO_DOUBLE)"
+{
+  /* We don't have a direct instruction for this, so we must be in
+ an unsafe math mode, and going via SFmode.  */
+
+  if (!(TARGET_32BIT && TARGET_FP16_TO_DOUBLE))
+{
+  rtx op1;
+  gcc_assert (flag_unsafe_math_optimizations);

I'd remove this assert. From the condition of the expander it is obvious
that if !(TARGET_32BIT && TARGET_FP16_TO_DOUBLE) then 
flag_unsafe_math_optimizations is true.


Ok with this change.
Thanks,
Kyrill
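
The double rounding mentioned in the new arm.md comment can be reproduced
numerically. This is a sketch under stated assumptions: round_to_precision is
a hypothetical helper that models only the significand width (24 bits for
SFmode, 11 bits for HFmode) with round-to-nearest-even, and it ignores
exponent range, overflow and subnormals.

```cpp
#include <cmath>

// Round x to a binary significand of `precision` bits, ties to even.
// Exponent limits are deliberately not modelled.
double round_to_precision(double x, int precision) {
  if (x == 0.0 || !std::isfinite(x))
    return x;
  int exponent = 0;
  double fraction = std::frexp(x, &exponent);       // |fraction| in [0.5, 1)
  double scaled = std::ldexp(fraction, precision);  // |scaled| in [2^(p-1), 2^p)
  double rounded = std::nearbyint(scaled);          // default mode: ties to even
  return std::ldexp(rounded, exponent - precision);
}

// One rounding step: double precision straight to an 11-bit significand.
double direct_to_half(double x) {
  return round_to_precision(x, 11);
}

// Two rounding steps: first to 24 bits (single), then to 11 bits (half).
double via_single_to_half(double x) {
  return round_to_precision(round_to_precision(x, 24), 11);
}

// 1 + 2^-11 + 2^-30: just above the midpoint between 1 and 1 + 2^-10.
double double_rounding_example() {
  return 1.0 + std::ldexp(1.0, -11) + std::ldexp(1.0, -30);
}
```

For this input the direct conversion rounds up to 1 + 2^-10. Going through
single precision first drops the 2^-30 term, leaving an exact tie that then
rounds to even, giving 1.0.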



Re: [PATCH 2/2] [ARC] Update target specific tests.

2016-11-16 Thread Andrew Burgess
* Claudiu Zissulescu [2016-05-30 14:32:38 +0200]:

> Update the ARC specific tests.
> 
> OK to apply?
> Claudiu
> 
> gcc/
> 2016-05-26  Claudiu Zissulescu  
> 
>   * testsuite/gcc.target/arc/abitest.S: New file.
>   * testsuite/gcc.target/arc/va_args-1.c: Likewise.
>   * testsuite/gcc.target/arc/va_args-2.c: Likewise.
>   * testsuite/gcc.target/arc/va_args-3.c: Likewise.
>   * testsuite/gcc.target/arc/mcrc.c: Deleted.
>   * testsuite/gcc.target/arc/mdsp-packa.c: Likewise.
>   * testsuite/gcc.target/arc/mdvbf.c: Likewise.
>   * testsuite/gcc.target/arc/mmac-24.c: Likewise.
>   * testsuite/gcc.target/arc/mmac-d16.c: Likewise.
>   * testsuite/gcc.target/arc/mno-crc.c: Likewise.
>   * testsuite/gcc.target/arc/mno-dsp-packa.c: Likewise.
>   * testsuite/gcc.target/arc/mno-dvbf.c: Likewise.
>   * testsuite/gcc.target/arc/mno-mac-24.c: Likewise.
>   * testsuite/gcc.target/arc/mno-mac-d16.c: Likewise.
>   * testsuite/gcc.target/arc/mno-rtsc.c: Likewise.
>   * testsuite/gcc.target/arc/mno-xy.c: Likewise.
>   * testsuite/gcc.target/arc/mrtsc.c: Likewise.
>   * testsuite/gcc.target/arc/arc.exp (check_effective_target_arcem):
>   New function.
>   (check_effective_target_arc700): Likewise.
>   (check_effective_target_arc6xx): Likewise.
>   (check_effective_target_arcmpy): Likewise.
>   (check_effective_target_archs): Likewise.
>   (check_effective_target_clmcpu): Likewise.
>   * testsuite/gcc.target/arc/barrel-shifter-1.c: Changed.
>   * testsuite/gcc.target/arc/builtin_simd.c: Test only for ARC700
>   cpus.
>   * testsuite/gcc.target/arc/cmem-1.c: Changed.
>   * testsuite/gcc.target/arc/cmem-2.c: Likewise.
>   * testsuite/gcc.target/arc/cmem-3.c: Likewise.
>   * testsuite/gcc.target/arc/cmem-4.c: Likewise.
>   * testsuite/gcc.target/arc/cmem-5.c: Likewise.
>   * testsuite/gcc.target/arc/cmem-6.c: Likewise.
>   * testsuite/gcc.target/arc/cmem-7.c: Likewise.
>   * testsuite/gcc.target/arc/interrupt-1.c: Test for RTIE as well.
>   * testsuite/gcc.target/arc/interrupt-2.c: Skip it for ARCv2 cores.
>   * testsuite/gcc.target/arc/interrupt-3.c: Match also ARCv2
>   warnings.
>   * testsuite/gcc.target/arc/jump-around-jump.c: Update options.
>   * testsuite/gcc.target/arc/mARC601.c: Changed.
>   * testsuite/gcc.target/arc/mcpu-arc600.c: Changed.
>   * testsuite/gcc.target/arc/mcpu-arc601.c: Changed.
>   * testsuite/gcc.target/arc/mcpu-arc700.c: Changed.
>   * testsuite/gcc.target/arc/mdpfp.c: Skip for ARCv2 cores.
>   * testsuite/gcc.target/arc/movb-1.c: Changed.
>   * testsuite/gcc.target/arc/movb-2.c: Likewise.
>   * testsuite/gcc.target/arc/movb-3.c: Likewise.
>   * testsuite/gcc.target/arc/movb-4.c: Likewise.
>   * testsuite/gcc.target/arc/movb-5.c: Likewise.
>   * testsuite/gcc.target/arc/movb_cl-1.c: Likewise.
>   * testsuite/gcc.target/arc/movb_cl-2.c: Likewise.
>   * testsuite/gcc.target/arc/movbi_cl-1.c: Likewise.
>   * testsuite/gcc.target/arc/movh_cl-1.c: Likewise.
>   * testsuite/gcc.target/arc/mspfp.c: Skip for ARC HS cores.
>   * testsuite/gcc.target/arc/mul64.c: Enable it only for ARC600.
>   * testsuite/gcc.target/arc/mulsi3_highpart-1.c: Scan for ARCv2
>   instructions.
>   * testsuite/gcc.target/arc/mulsi3_highpart-2.c: Skip it for ARCv1
>   cores.
>   * testsuite/gcc.target/arc/no-dpfp-lrsr.c: Skip it for ARC HS.
>   * testsuite/gcc.target/arc/trsub.c: Only for ARC EM cores.
>   * testsuite/gcc.target/arc/builtin_simdarc.c: Changed.
>   * testsuite/gcc.target/arc/extzv-1.c: Likewise.
>   * testsuite/gcc.target/arc/insv-1.c: Likewise.
>   * testsuite/gcc.target/arc/insv-2.c: Likewise.
>   * testsuite/gcc.target/arc/mA6.c: Likewise.
>   * testsuite/gcc.target/arc/mA7.c: Likewise.
>   * testsuite/gcc.target/arc/mARC600.c: Likewise.
>   * testsuite/gcc.target/arc/mARC700.c: Likewise.
>   * testsuite/gcc.target/arc/mcpu-arc600.c: Likewise.
>   * testsuite/gcc.target/arc/mcpu-arc700.c: Likewise.
>   * testsuite/gcc.target/arc/movl-1.c: Likewise.
>   * testsuite/gcc.target/arc/nps400-1.c: Likewise.
>   * testsuite/gcc.target/arc/trsub.c: Likewise.


These entries should be going into the gcc/testsuite/ChangeLog file,
and so don't need the "testsuite/" prefix.

Otherwise I'm happy for this to be merged.  I've only skimmed the
change, but assuming you've run the tests this all seems good.

Thanks,
Andrew



> ---
>  gcc/testsuite/gcc.target/arc/abitest.S   | 31 +++
>  gcc/testsuite/gcc.target/arc/arc.exp | 66 +++-
>  gcc/testsuite/gcc.target/arc/barrel-shifter-1.c  |  2 +-
>  gcc/testsuite/gcc.target/arc/builtin_simd.c  |  1 +
>  gcc/testsuite/gcc.target/arc/builtin_simdarc.c   |  1 +
>  

Re: [PING 2] [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-16 Thread Martin Sebor

I'm still looking for a review of the patch below, first posted
on 10/28 and last updated/pinged last Wednesday:

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00896.html

Thanks

On 11/09/2016 03:49 PM, Martin Sebor wrote:

The attached minor update to the patch also resolves bug 77784 that
points out that -Wformat-length issues a warning also issued during
the expansion of some of the __builtin___sprintf_chk intrinsics.

Martin

On 11/04/2016 02:16 PM, Martin Sebor wrote:

Attached is an update to the patch that takes into consideration
the feedback I got.  It goes back to adding just one option,
-Wstringop-overflow, as in the original, while keeping the Object
Size type as an argument.  It uses type-1 as the default setting
for string functions (strcpy et al.) and, unconditionally, type-0
for raw memory functions (memcpy, etc.)

I retested Binutils 2.27 and the Linux kernel again with this patch
and also added Glibc, and it doesn't complain about anything (both
Binutils and the kernel also build cleanly with an unpatched GCC
with _FORTIFY_SOURCE=2 or its rough equivalent for the kernel).
The emit-rtl.c warning (bug 78174) has also been suppressed by
the change to bos type-0 for memcpy.

While the patch doesn't trigger any false positives (AFAIK) it is
subject to a fair number of false negatives due to the limitations
of the tree-object-size pass, and due to transformations done by
other passes that prevent it from detecting some otherwise obvious
overflows.  Although unfortunate, I believe the warnings that are
emitted are useful as the first line of defense in software that
doesn't use _FORTIFY_SOURCE (such as GCC itself).   And this can
of course be improved if some of the limitations are removed over
time.

Martin
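
The difference between Object Size type-0 and type-1 mentioned above can be
seen with GCC's __builtin_object_size. This is a small sketch, not part of the
patch: the Record struct and the two helper functions are hypothetical, and
the sizes assume the compiler can see the object at the call site.

```cpp
#include <cstddef>

// Hypothetical layout: an 8-byte member followed by 24 more bytes.
struct Record {
  char name[8];
  char payload[24];
};

static Record record;

// Type 0: bytes from the pointer to the end of the whole enclosing object.
std::size_t whole_object_bytes() {
  return __builtin_object_size(record.name, 0);
}

// Type 1: bytes from the pointer to the end of the closest subobject.
std::size_t subobject_bytes() {
  return __builtin_object_size(record.name, 1);
}
```

Under these assumptions type 0 sees 32 bytes and type 1 sees 8, so a check
based on type 1 would flag a 9-byte strcpy into name, while a check based on
type 0 would accept a memcpy that spans both members.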





