date:20161028

Re: [PATCH, LIBGCC] Avoid count_leading_zeros with undefined result (PR 78067)

2016-10-28 Thread Bernd Edlinger

On 10/28/16 16:05, Bernd Edlinger wrote:
> On 10/27/16 22:23, Joseph Myers wrote:
>> On Thu, 27 Oct 2016, Bernd Edlinger wrote:
>>
>>> Hi,
>>>
>>> by code reading I became aware that libgcc can call count_leading_zeros
>>> in certain cases which can give undefined results.  This happens on
>>> signed int128 -> float or double conversions, when the int128 is in
>>> the range
>>> INT64_MAX+1 to UINT64_MAX.
>>
>> I'd expect testcases added to the testsuite that exercise this case at
>> runtime, if not already present.
>>
>
> Yes, thanks.  I somehow expected there were already test cases,
> somewhere, but now when you ask that, I begin to doubt as well...
>
> I will try to add an asm("int 3") and see if that gets hit at all.
>

The breakpoint got hit only once, in the libgo testsuite: runtime/pprof.

I see there are some int to float conversion tests at
gcc.dg/torture/fp-int-convert*.c, where it is easy to add a
test case that hits the breakpoint too.

However, the test case does not fail before the patch: the undefined
result of count_leading_zeros does not cause any visible
problem (at least on x86_64).
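
For readers of the archive, a minimal sketch of the guard, not the libgcc
code itself. It assumes GCC's __builtin_clzll as a stand-in for
count_leading_zeros, and the helper name clz_or_width is hypothetical. The
builtin's result is undefined for a zero argument, so zero is handled
separately, in the same spirit as the hi == 0 check in the patch.

```c
#include <assert.h>
#include <limits.h>

/* Width in bits of the word being scanned (W_TYPE_SIZE in the patch).  */
#define WORD_BITS ((int) (sizeof (unsigned long long) * CHAR_BIT))

/* Hypothetical helper: a leading-zero count that is defined for zero.
   __builtin_clzll (0) has an undefined result, so the full word width
   is returned for that input instead.  */
static int
clz_or_width (unsigned long long hi)
{
  if (hi == 0)
    return WORD_BITS;

  int count = __builtin_clzll (hi);
  assert (count >= 0 && count < WORD_BITS);
  return count;
}
```

With this shape, a zero high word yields the word size, which is what the
patched code assigns to count when hi == 0.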

Find attached a new patch with test case.


Boot-strapped on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
2016-10-27  Bernd Edlinger  

	PR libgcc/78067
	* libgcc2.c (__floatdisf, __floatdidf): Avoid undefined results from
	count_leading_zeros.

testsuite:
2016-10-27  Bernd Edlinger  

	PR libgcc/78067
	* gcc.dg/torture/fp-int-convert.h: Add more conversion tests.
	

Index: libgcc2.c
===
--- libgcc2.c	(revision 241400)
+++ libgcc2.c	(working copy)
@@ -1643,6 +1643,11 @@
 hi = -(UWtype) hi;
 
   UWtype count, shift;
+#if !defined (COUNT_LEADING_ZEROS_0) || COUNT_LEADING_ZEROS_0 != W_TYPE_SIZE
+  if (hi == 0)
+count = W_TYPE_SIZE;
+  else
+#endif
   count_leading_zeros (count, hi);
 
   /* No leading bits means u == minimum.  */
Index: gcc/testsuite/gcc.dg/torture/fp-int-convert.h
===
--- gcc/testsuite/gcc.dg/torture/fp-int-convert.h	(revision 241647)
+++ gcc/testsuite/gcc.dg/torture/fp-int-convert.h	(working copy)
@@ -53,6 +53,8 @@ do {\
   TEST_I_F_VAL (U, F, HVAL1U (P, U), P_OK (P, U));		\
   TEST_I_F_VAL (U, F, HVAL1U (P, U) + 1, P_OK (P, U));		\
   TEST_I_F_VAL (U, F, HVAL1U (P, U) - 1, P_OK (P, U));		\
+  TEST_I_F_VAL (I, F, WVAL0S (I), 1);\
+  TEST_I_F_VAL (I, F, -WVAL0S (I), 1);\
 } while (0)
 
 #define P_OK(P, T) ((P) >= sizeof(T) * CHAR_BIT)
@@ -74,6 +76,7 @@ do {\
 			 ? (S)1		 \
 			 : (((S)1 << (sizeof(S) * CHAR_BIT - 2))	 \
 			+ ((S)3 << (sizeof(S) * CHAR_BIT - 2 - P
+#define WVAL0S(S) (S)((S)1 << (sizeof(S) * CHAR_BIT / 2 - 1))
 
 #define TEST_I_F_VAL(IT, FT, VAL, PREC_OK)		\
 do {			\

Re: [PATCH v3] gcc/config/tilegx/tilegx.c (tilegx_function_profiler): Save r10 to stack before call mcount

2016-10-28 Thread Chen Gang

First, sorry for the late reply (I have been working overtime every
workday lately, so I can only reply on weekends).

On 10/24/16 23:27, Jeff Law wrote:
> On 10/23/2016 12:11 PM, Bernd Edlinger wrote:
>> Hi,
>>
>> I don't know much about tilegx, but
>> I think the patch should work as is.
>>
>> This is because the
>> Save r10 code is a bundle
>>
>>   {
>>   addi sp, sp, -8
>>   st sp, r10
>>   }
>>
>> which stores r10 at [sp] and subtracts 8 from sp.
>>
>> The restore r10 code is actually two bundles:
> Thanks for pointing that out!  I totally missed the restore was two bundles.
> 
> 
>>
>>   addi sp, sp, 8
>>   ld r10, sp
>>
>> This just adds 8 to sp, and loads r10 from there.
> Right.  And with the restore as two bundles the semantics of the save/restore 
> seem consistent/correct.
> 

Oh, really. Sorry, I had almost forgotten the history of this patch.

Originally, patch v1 used 2 bundles for both the save and the restore.
In patch v2, I put "saving r10" into a single bundle as an optimization
(I mentioned this when replying to patch v1 on 2016-05-31).

>>
>> I don't know how __mcount is implemented, it must
>> be some asm code, almost all functions save the
>> lr at [sp] when invoked, but I don't know if __mcount
>> does that at all, if it doesn't do that, then the
>> adjusting of sp might be unnecessary.
>>
>> The only thing that might be a problem is that
>> the stack is always adjusted in multiples of 16
>> on the tilegx platform, see tilegx.h:
>>
>> #define STACK_BOUNDARY 128
>>
>> That is counted in bits, and means 16 bytes.
>> But your patch adjusts the stack only by 8.
> Missed that.  Without knowing the tile ports, I can't say with any degree of 
> confidence that it's safe to only adjust by 8 bytes. Adjusting by 16 seems 
> safer.
> 

Oh, really! After checking all the output code, every "addi sp" operation
adjusts by a multiple of 16!! So I guess I should adjust sp by 16, too (I will
send patch v4 for it if there are no additional replies within a week).
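
As an illustration of the arithmetic only: a minimal sketch, assuming the
STACK_BOUNDARY value of 128 bits quoted above. The macro and helper names
are hypothetical and are not taken from the tilegx port.

```c
#include <assert.h>

/* STACK_BOUNDARY is counted in bits; 128 bits is 16 bytes.  */
#define STACK_BOUNDARY_BITS 128
#define STACK_ALIGN_BYTES (STACK_BOUNDARY_BITS / 8)

/* Hypothetical helper: round a stack adjustment in bytes up to the
   next multiple of the stack boundary.  */
static long
round_up_stack_adjust (long bytes)
{
  assert (bytes >= 0);
  return (bytes + STACK_ALIGN_BYTES - 1)
	 / STACK_ALIGN_BYTES * STACK_ALIGN_BYTES;
}
```

An 8-byte save slot rounds up to a 16-byte adjustment, which matches the
suggestion to adjust sp by 16 rather than 8.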

>>
>> Furthermore, I don't see how the stack unwinding will work
>> with this stack adjustment when no .cfi directives
>> are emitted, but that is probably not a big problem.
> I wouldn't be surprised if the tilegx isn't the only port with this problem.  
>  I don't think we've ever been good about making sure the unwinders are 
> correct for targets where we profile before the prologue and which emit the 
> profiling bits directly rather than representing them as RTL.
> 

Excuse me, I have no idea about this (honestly, I am still not very
familiar with the details of gcc development).

At present, what I know is: with this patch, gcc passes the various
related unwinding tests (including nested functions) under qemu tilegx
linux-user (originally, I traced the related insns, and they looked ok).

:-)

Thanks.
-- 
Chen Gang (陈刚)

Managing Natural Environments is the Duty of Human Beings.

One more issue with vax and spu ports with current trunk

2016-10-28 Thread Jeff Law




REGNO_REG_CLASS is defined as:

#define REGNO_REG_CLASS(REGNO) ALL_REGS

For the vax port and in a similar manner on the spu port.


Note how it doesn't use the REGNO argument.  This causes problems for 
the new noop set code:


  /* Detect noop sets and remove them before processing side
     effects.  */
  if (set && REG_P (SET_DEST (set)) && REG_P (SET_SRC (set)))
    {
      unsigned int regno = REGNO (SET_SRC (set));
      rtx r1 = find_oldest_value_reg (REGNO_REG_CLASS (regno),
                                      SET_DEST (set), vd);
      rtx r2 = find_oldest_value_reg (REGNO_REG_CLASS (regno),
                                      SET_SRC (set), vd);
      if (rtx_equal_p (r1 ? r1 : SET_DEST (set), r2 ? r2 : SET_SRC (set)))
        {
          bool last = insn == BB_END (bb);
          delete_insn (insn);
          if (last)
            break;
          continue;
        }
    }

"regno" will not be used on the vax port because of the definition of 
REGNO_REG_CLASS, triggering a build failure using config-list.mk.


I'd originally hacked in a fix in regcprop.c.  But then realized it's 
probably cleaner to just twiddle REGNO_REG_CLASS to reference its 
argument.  The PTX port already works in this manner.
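
A standalone sketch of the idiom, outside of GCC. The macro names with the
_OLD and _NEW suffixes and the function class_of_source are hypothetical.
With the old form, a variable whose only use is the macro argument vanishes
after preprocessing, which presumably is what trips an unused-variable
warning in a -Werror build. The comma expression keeps a reference to the
argument without changing the resulting value.

```c
#include <assert.h>

enum reg_class { NO_REGS, ALL_REGS, LIM_REG_CLASSES };

/* Old form: the argument is dropped, so a variable used only here
   looks unused once the macro is expanded.  */
#define REGNO_REG_CLASS_OLD(REGNO) ALL_REGS

/* New form: the comma expression evaluates REGNO, discards the
   result, and yields the class.  */
#define REGNO_REG_CLASS_NEW(REGNO) ((void) (REGNO), ALL_REGS)

static enum reg_class
class_of_source (unsigned int source_regno)
{
  unsigned int regno = source_regno;	/* Only use is the macro below.  */
  return REGNO_REG_CLASS_NEW (regno);
}
```

Both forms produce the same class. Only the new one counts as a use of
regno.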


Installing on the trunk.

Jeff
diff --git a/gcc/config/spu/spu.h b/gcc/config/spu/spu.h
index c2c31e7..7b6bad1 100644
--- a/gcc/config/spu/spu.h
+++ b/gcc/config/spu/spu.h
@@ -205,7 +205,8 @@ enum reg_class {
 {0x, 0x, 0x, 0x, 0x3}, /* general regs */ \
 {0x, 0x, 0x, 0x, 0x3}} /* all regs */
 
-#define REGNO_REG_CLASS(REGNO) (GENERAL_REGS)
+#define REGNO_REG_CLASS(REGNO) ((void)(REGNO), GENERAL_REGS)
+
 
 #define BASE_REG_CLASS GENERAL_REGS
 
diff --git a/gcc/config/vax/vax.h b/gcc/config/vax/vax.h
index 427c352..dc77aa9 100644
--- a/gcc/config/vax/vax.h
+++ b/gcc/config/vax/vax.h
@@ -226,7 +226,7 @@ enum reg_class { NO_REGS, ALL_REGS, LIM_REG_CLASSES };
reg number REGNO.  This could be a conditional expression
or could index an array.  */
 
-#define REGNO_REG_CLASS(REGNO) ALL_REGS
+#define REGNO_REG_CLASS(REGNO) ((void)(REGNO), ALL_REGS)
 
 /* The class value for index registers, and the one for base regs.  */

RFD: Buffer handling for ASM_GENERATE_INTERNAL_LABEL

2016-10-28 Thread Jeff Law



Consider this definition of ASM_GENERATE_INTERNAL_LABEL (from sp64-elf.h):

#undef  ASM_GENERATE_INTERNAL_LABEL
#define ASM_GENERATE_INTERNAL_LABEL(LABEL,PREFIX,NUM)   \
  sprintf ((LABEL), "*.L%s%ld", (PREFIX), (long)(NUM))

And a use from assemble_static_space:

   ASM_GENERATE_INTERNAL_LABEL (name, "LF", const_labelno);


For this case we can generate up to 16 bytes of data + a nul terminator.

Sadly, we only allocate 16 bytes in assemble_static_space.

Obviously it's unlikely we'll ever have a labelno that will overflow to 
-2147483648 without something else breaking.  So in practice this isn't 
likely to ever cause a problem.  But we still need to address it.
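
The byte count can be checked with a small sketch. It reuses the format
string from the sp64-elf.h definition above, and the helper name
internal_label_length is hypothetical. Calling snprintf with a null buffer
and size zero returns the number of characters the output would need,
excluding the nul terminator.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: number of characters, excluding the nul, that
   the sp64-elf.h format produces for PREFIX and NUM.  */
static int
internal_label_length (const char *prefix, int num)
{
  int len = snprintf (NULL, 0, "*.L%s%ld", prefix, (long) num);
  assert (len >= 0);
  return len;
}
```

With a 32-bit int, internal_label_length ("LF", INT_MIN) is 16: three
characters for "*.L", two for "LF", and eleven for "-2147483648". The
buffer therefore needs 17 bytes.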


This causes 8 sparc configurations from config-list.mk to fail to build 
using the trunk compiler to build the crosses.




We can obviously fix the array to be bigger here and it's a trivial 
change.  If we get a situation where it's out of range again, we can 
detect it with the existing sprintf warnings.  It's also consistent in 
the sense that most callers to ASM_GENERATE_INTERNAL_LABEL use a 
significantly larger buffer than assemble_static_space.


Sadly, there's a bigger issue here.  Namely that the caller and the 
definition of ASM_GENERATE_INTERNAL_LABEL both can include arbitrary 
length text into the label name.  Furthermore, the buffer is allocated 
in the caller's context. It's a terrible API.


ISTM the way "out" is to change every ASM_GENERATE_INTERNAL_LABEL
implementation to use snprintf to first determine the length of the 
resulting string, then allocate an appropriate amount of memory 
(returning it to the caller).  The caller is then changed to use the 
buffer allocated by ASM_GENERATE_INTERNAL_LABEL, free-ing it when 
appropriate.  Major ick.  We'd probably want to hook-ize the damn thing 
while we're at it.
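
A minimal sketch of that measure-then-allocate shape, assuming the
sp64-elf.h format string. The function name generate_internal_label and its
ownership convention are hypothetical, not an existing GCC interface.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical replacement: return a heap-allocated label that the
   caller must free, or NULL on failure.  */
static char *
generate_internal_label (const char *prefix, long num)
{
  /* First pass: measure the output without writing anything.  */
  int len = snprintf (NULL, 0, "*.L%s%ld", prefix, num);
  if (len < 0)
    return NULL;

  char *label = malloc ((size_t) len + 1);
  if (label == NULL)
    return NULL;

  /* Second pass: format into a buffer of exactly the right size.  */
  int written = snprintf (label, (size_t) len + 1, "*.L%s%ld", prefix, num);
  assert (written == len);
  return label;
}
```

The caller no longer has to guess a buffer size, at the cost of one
allocation per label and an explicit free at each call site.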


Other thoughts?

Jeff

Contents of PO file 'cpplib-6.1.0.eo.po'

2016-10-28 Thread Translation Project Robot



cpplib-6.1.0.eo.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New Esperanto PO file for 'cpplib' (version 6.1.0)

2016-10-28 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Esperanto team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/eo.po

(This file, 'cpplib-6.1.0.eo.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[Committed] PR fortran/71891 -- Fix obvious typo

2016-10-28 Thread Steve Kargl

I've committed the following to both 6-branch and trunk.

2016-10-28  Steven G. Kargl 

PR fortran/71891
* symbol.c (gfc_type_compatible): Fix typo.

Index: symbol.c
===
--- symbol.c(revision 241668)
+++ symbol.c(working copy)
@@ -4861,7 +4861,7 @@ gfc_type_compatible (gfc_typespec *ts1, 
   && !is_union1 && !is_union2)
 return (ts1->type == ts2->type);
 
-  if ((is_derived1 && is_derived2) || (is_union1 && is_union1))
+  if ((is_derived1 && is_derived2) || (is_union1 && is_union2))
 return gfc_compare_derived_types (ts1->u.derived, ts2->u.derived);
 
   if (is_derived1 && is_class2)

-- 
Steve

Go patch committed: copy slice from Go 1.7 runtime

2016-10-28 Thread Ian Lance Taylor

This patch to the compiler and libgo copies the slice support from the
Go 1.7 runtime.

This changes the Go frontend to handle append as the gc compiler does:
call a function to grow the slice, but otherwise assign the new
elements directly to the final slice.
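
The shape of that lowering can be modelled in C. This is only a sketch under
stated assumptions: the struct, the function names, and the doubling growth
policy are all hypothetical and are not the libgo implementation. The old
buffer is not freed here because the Go runtime would leave that to the
garbage collector.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C model of a Go []int slice header.  */
struct int_slice
{
  int *values;
  size_t len;
  size_t cap;
};

/* Model of the grow step: return a slice with room for at least
   NEEDED elements, copying the old contents.  The length is left
   unchanged; the caller stores the new elements.  */
static struct int_slice
grow_slice (struct int_slice old, size_t needed)
{
  size_t cap = old.cap ? old.cap : 1;
  while (cap < needed)
    cap *= 2;

  struct int_slice grown = { malloc (cap * sizeof (int)), old.len, cap };
  assert (grown.values != NULL);
  if (old.len > 0)
    memcpy (grown.values, old.values, old.len * sizeof (int));
  return grown;
}

/* Model of the code emitted for s = append (s, a, b).  */
static struct int_slice
append_two (struct int_slice s, int a, int b)
{
  size_t new_len = s.len + 2;
  if (new_len > s.cap)
    s = grow_slice (s, new_len);	/* The only call into the runtime.  */
  s.values[s.len] = a;			/* Direct stores, no call.  */
  s.values[s.len + 1] = b;
  s.len = new_len;
  return s;
}
```

The point of the split is that the runtime function only deals with
capacity, while the element stores stay in the compiled code.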

For the current gccgo memory allocator the slice code has to call
runtime_newarray, not mallocgc directly, so that the allocator sets
the TypeInfo_Array bit in the type pointer.

This renames the static function cnew to runtime_docnew, so that the
stack trace ignores it when ignoring runtime functions.  This was
needed to fix the runtime/pprof tests on 386.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241661)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5ddcdfb0b2bb992a70b391ab34bf15291a514e48
+fe38baff61b9b9426a4f60ff078cf3c8722bf94d
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 241384)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -284,20 +284,19 @@ Node::op_format() const
  op << "panic";
  break;
 
-   case Runtime::APPEND:
+   case Runtime::GROWSLICE:
  op << "append";
  break;
 
-   case Runtime::COPY:
+   case Runtime::SLICECOPY:
+   case Runtime::SLICESTRINGCOPY:
+   case Runtime::TYPEDSLICECOPY:
  op << "copy";
  break;
 
case Runtime::MAKECHAN:
case Runtime::MAKEMAP:
-   case Runtime::MAKESLICE1:
-   case Runtime::MAKESLICE2:
-   case Runtime::MAKESLICE1BIG:
-   case Runtime::MAKESLICE2BIG:
+   case Runtime::MAKESLICE:
  op << "make";
  break;
 
@@ -419,10 +418,7 @@ Node::is_big(Escape_context* context) co
  Func_expression* fn = call->fn()->func_expression();
  if (fn != NULL
  && fn->is_runtime_function()
- && (fn->runtime_code() == Runtime::MAKESLICE1
- || fn->runtime_code() == Runtime::MAKESLICE2
- || fn->runtime_code() == Runtime::MAKESLICE1BIG
- || fn->runtime_code() == Runtime::MAKESLICE2BIG))
+ && fn->runtime_code() == Runtime::MAKESLICE)
{
  // Second argument is length.
  Expression_list::iterator p = call->args()->begin();
@@ -1201,13 +1197,25 @@ Escape_analysis_assign::expression(Expre
}
break;
 
- case Runtime::APPEND:
+ case Runtime::GROWSLICE:
{
- // Unlike gc/esc.go, a call to append has already had its
- // varargs lowered into a slice of arguments.
- // The content of the appended slice leaks.
- Node* appended = Node::make_node(call->args()->back());
- this->assign_deref(this->context_->sink(), appended);
+ // The contents being appended leak.
+ if (call->is_varargs())
+   {
+ Node* appended = Node::make_node(call->args()->back());
+ this->assign_deref(this->context_->sink(), appended);
+   }
+ else
+   {
+ for (Expression_list::const_iterator pa =
+call->args()->begin();
+  pa != call->args()->end();
+  ++pa)
+   {
+ Node* arg = Node::make_node(*pa);
+ this->assign(this->context_->sink(), arg);
+   }
+   }
 
  if (debug_level > 2)
go_error_at((*pexpr)->location(),
@@ -1219,7 +1227,9 @@ Escape_analysis_assign::expression(Expre
}
break;
 
- case Runtime::COPY:
+ case Runtime::SLICECOPY:
+ case Runtime::SLICESTRINGCOPY:
+ case Runtime::TYPEDSLICECOPY:
{
  // Lose track of the copied content.
  Node* copied = Node::make_node(call->args()->back());
@@ -1229,10 +1239,7 @@ Escape_analysis_assign::expression(Expre
 
  case Runtime::MAKECHAN:
  case Runtime::MAKEMAP:
- case Runtime::MAKESLICE1:
- case Runtime::MAKESLICE2:
- case Runtime::MAKESLICE1BIG:
- case Runtime::MAKESLICE2BIG:
+ case Runtime::MAKESLICE:
  case Runtime::SLICEBYTETOSTRING:

Re: [PATCH] fix linker name for uClibc

2016-10-28 Thread Michael Eager


On 10/28/2016 11:14 AM, Waldemar Brodkorb wrote:

Hi,

uClibc-ng can be used for Microblaze architecture.
It is regularly tested with qemu-system-microblaze in little and
big endian mode.

2016-10-28  Waldemar Brodkorb  

 gcc/
 * config/microblaze/linux.h: add UCLIBC_DYNAMIC_LINKER

diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
index ae8523c..b3bf43a 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -29,6 +29,7 @@
  #define TLS_NEEDS_GOT 1

  #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
+#define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0"

  #if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */
  #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}"


best regards
  Waldemar


OK to apply.


--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [Patch 1/11] Add a new target hook for describing excess precision intentions

2016-10-28 Thread Joseph Myers

On Fri, 14 Oct 2016, James Greenhalgh wrote:

> + value set for @code{-fexcess-precision=[standard|fast]}.",

I think the correct markup for the option here is:

@option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}

(that is, using @option not @code, and with the [ | ] not in a fixed-width 
font because they aren't part of the option name).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

2016-10-28 Thread Joseph Myers

On Fri, 14 Oct 2016, James Greenhalgh wrote:

> +/* If the join of the implicit precision in which the target will compute
> +   floating-point values and the standard precision in which the target will
> +   compute values is not equal to the standard precision, then the target
> +   is either unpredictable, or is a broken configuration in which it claims
> +   standards compliance, but doesn't honor that.
> +
> +   Effective predictability for __GCC_IEC_559 in flag_iso_mode, means that
> +   the implicit precision is not wider, or less predictable than the
> +   standard precision.
> +
> +   Return TRUE if we have been asked to compile with
> +   -fexcess-precision=standard, and following the rules above we are able
> +   to guarantee the standards mode.  */
> +
> +static bool
> +c_cpp_flt_eval_method_iec_559 (void)
> +{
> +  enum flt_eval_method implicit
> += targetm.c.excess_precision (EXCESS_PRECISION_TYPE_IMPLICIT);
> +  enum flt_eval_method standard
> += targetm.c.excess_precision (EXCESS_PRECISION_TYPE_STANDARD);
> +
> +  return (excess_precision_mode_join (implicit, standard) == standard
> +   && flag_excess_precision_cmdline == EXCESS_PRECISION_STANDARD);
> +}
> +
>  /* Return the value for __GCC_IEC_559.  */
>  static int
>  cpp_iec_559_value (void)
> @@ -775,11 +801,12 @@ cpp_iec_559_value (void)
>   applies to unpredictable contraction.  For C++, and outside
>   strict conformance mode, do not consider these options to mean
>   lack of IEEE 754 support.  */
> +
>if (flag_iso
>&& !c_dialect_cxx ()
> -  && TARGET_FLT_EVAL_METHOD != 0
> -  && flag_excess_precision_cmdline != EXCESS_PRECISION_STANDARD)
> +  && !c_cpp_flt_eval_method_iec_559 ())
>  ret = 0;

I'm not convinced by the logic you have here.  At least, it seems 
different from what we have at present, where -std=c11 
-fexcess-precision=fast is not considered unpredictable if the target 
doesn't have any implicit excess precision.

That is: I think the right question is whether the combination (front-end 
excess precision, implicit back-end excess precision) does the same thing 
as just front-end excess precision, regardless of the -fexcess-precision= 
option.
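
One way to read that question, as a simplified sketch. The enum, the join
function, and the predicate below are hypothetical models and are not the
patch's excess_precision_mode_join or flt_eval_method. The join is assumed
to be "the wider of the two, or unpredictable if either side is".

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of an evaluation method.  */
enum eval_method
{
  EVAL_UNPREDICTABLE = -1,
  EVAL_AS_FLOAT = 0,
  EVAL_AS_DOUBLE = 1,
  EVAL_AS_LONG_DOUBLE = 2
};

/* Assumed join: unpredictable wins, otherwise the wider method.  */
static enum eval_method
eval_method_join (enum eval_method a, enum eval_method b)
{
  if (a == EVAL_UNPREDICTABLE || b == EVAL_UNPREDICTABLE)
    return EVAL_UNPREDICTABLE;
  return a > b ? a : b;
}

/* True if adding the back end's implicit excess precision on top of
   the front end's choice changes nothing.  */
static bool
excess_precision_is_predictable (enum eval_method front_end,
				 enum eval_method implicit)
{
  assert (front_end >= EVAL_UNPREDICTABLE
	  && front_end <= EVAL_AS_LONG_DOUBLE);
  return front_end != EVAL_UNPREDICTABLE
	 && eval_method_join (front_end, implicit) == front_end;
}
```

Under this model, a target with no implicit excess precision is predictable
whichever -fexcess-precision= value selected the front-end method, which is
the behaviour described above as the current one.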

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Delete GCJ

2016-10-28 Thread Eric Botcazou

> Things we may want to remove:
> 
> - references to java in contrib (download_ecj, gcc_update,
>   patch_tester.sh, update-copyright.py)
> - GCJ, GCJ_FOR_BUILD, GCJ_FOR_TARGET in Makefiles.tpl and configure.ac
> - LIBGCJ_SONAME in config/i386/{cygwin.h,mingw32.h}
> - references to java in install.texi

There are more references in sourcebuild.texi and install.texi.

-- 
Eric Botcazou

libgo patch committed: Fix time test for recent timezone data update

2016-10-28 Thread Ian Lance Taylor

This libgo patch fixes the time test for systems that use the recent
tzdata-2016g update, which added a new zone abbreviation.
Bootstrapped and ran time test on x86_64-pc-linux-gnu.  Committed to
mainline and GCC 6 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241659)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-4a8df8f8622c140777996786866395448622ac3f
+5ddcdfb0b2bb992a70b391ab34bf15291a514e48
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/time/time_test.go
===
--- libgo/go/time/time_test.go  (revision 241341)
+++ libgo/go/time/time_test.go  (working copy)
@@ -939,8 +939,11 @@ func TestLoadFixed(t *testing.T) {
// but Go and most other systems use "east is positive".
// So GMT+1 corresponds to -3600 in the Go zone, not +3600.
name, offset := Now().In(loc).Zone()
-   if name != "GMT+1" || offset != -1*60*60 {
-   t.Errorf("Now().In(loc).Zone() = %q, %d, want %q, %d", name, 
offset, "GMT+1", -1*60*60)
+   // The zone abbreviation is "-01" since tzdata-2016g, and "GMT+1"
+   // on earlier versions; we accept both. (Issue #17276).
+   if !(name == "GMT+1" || name == "-01") || offset != -1*60*60 {
+   t.Errorf("Now().In(loc).Zone() = %q, %d, want %q or %q, %d",
+   name, offset, "GMT+1", "-01", -1*60*60)
}
 }

Re: [PATCH, libgo]: Fix FAIL: time testsuite failure

2016-10-28 Thread Ian Lance Taylor

On Tue, Oct 18, 2016 at 2:19 AM, Uros Bizjak  wrote:
> The name of Etc/GMT+1 timezone is "-01", as evident from:
>
> $ TZ=Etc/GMT+1 date +%Z
> -01
>
> Attached patch fixes the testsuite failure.

Thanks--I'm going to copy the change made to the master library
instead.  Will commit shortly (Than did the actual patch).

For the record this was filed as GCC PR 78144.

Ian

Default associative containers constructors/destructor/assignment

2016-10-28 Thread François Dumont


Hi

Here is the patch to default all other associative containers 
operations that can be defaulted.


To do so I introduce a _Rb_tree_key_compare type that takes care of
value initialization of the compare functor. It also makes sure that the
functor is copied rather than moved in the move constructor, with the
necessary noexcept qualification.


I also introduce _Rb_tree_header to take care of the initialization
of the _Rb_tree_node_base used in the container header and of
_M_node_count. I also use it to implement the move semantics and so
default the _Rb_tree_impl move constructor as well.


I also propose a solution for the FIXME regarding documentation of the
container destructor: I used a C++11 defaulted declaration. I don't have
the necessary tools to generate the Doxygen doc but I am confident that it
should work fine. I had to simplify the doc for operations that are now
defaulted.



* include/bits/stl_map.h (map(const map&)): Make default.
(map(map&&)): Likewise.
(~map()): Likewise.
(operator=(const map&)): Likewise.
* include/bits/stl_multimap.h (multimap(const multimap&)): Make
default.
(multimap(multimap&&)): Likewise.
(~multimap()): Likewise.
(operator=(const multimap&)): Likewise.
* include/bits/stl_set.h (set(const set&)): Make default.
(set(set&&)): Likewise.
(~set()): Likewise.
(operator=(const set&)): Likewise.
* include/bits/stl_multiset.h (multiset(const multiset&)): Make
default.
(multiset(multiset&&)): Likewise.
(~multiset()): Likewise.
(operator=(const multiset&)): Likewise.
* include/bits/stl_tree.h (_Rb_tree_key_compare<>): New.
(_Rb_tree_header): New.
(_Rb_tree_impl): Inherit from latter.
(_Rb_tree_impl()): Make default.
(_Rb_tree_impl(const _Rb_tree_impl&)): New.
(_Rb_tree_impl(_Rb_tree_impl&&)): New, default.
(_Rb_tree_impl::_M_reset): Move...
(_Rb_tree_header::_M_reset): ...here.
(_Rb_tree_impl::_M_initialize): Move...
(_Rb_tree_header::_M_initialize): ...here.
(_Rb_tree(_Rb_tree&&)): Make default.
(_Rb_tree_header::_M_move_data(_Rb_tree_header&)): New.
(_Rb_tree<>::_M_move_data(_Rb_tree&, true_type)): Use latter.

Tested under Linux x86_64, ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/stl_map.h b/libstdc++-v3/include/bits/stl_map.h
index dea7d5b..bbd0a97 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -185,25 +185,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   /**
*  @brief  %Map copy constructor.
-   *  @param  __x  A %map of identical element and allocator types.
*
-   *  The newly-created %map uses a copy of the allocator object used
-   *  by @a __x (unless the allocator traits dictate a different object).
+   *  Whether the allocator is copied depends on the allocator traits.
*/
+#if __cplusplus < 201103L
   map(const map& __x)
   : _M_t(__x._M_t) { }
+#else
+  map(const map&) = default;
 
-#if __cplusplus >= 201103L
   /**
*  @brief  %Map move constructor.
-   *  @param  __x  A %map of identical element and allocator types.
*
-   *  The newly-created %map contains the exact contents of @a __x.
-   *  The contents of @a __x are a valid, but unspecified %map.
+   *  The newly-created %map contains the exact contents of the moved
+   *  instance. The moved instance is a valid, but unspecified, %map.
*/
-  map(map&& __x)
-  noexcept(is_nothrow_copy_constructible<_Compare>::value)
-  : _M_t(std::move(__x._M_t)) { }
+  map(map&&) = default;
 
   /**
*  @brief  Builds a %map from an initializer_list.
@@ -284,31 +281,31 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	: _M_t(__comp, _Pair_alloc_type(__a))
 { _M_t._M_insert_unique(__first, __last); }
 
-  // FIXME There is no dtor declared, but we should have something
-  // generated by Doxygen.  I don't know what tags to add to this
-  // paragraph to make that happen:
+#if __cplusplus >= 201103L
   /**
*  The dtor only erases the elements, and note that if the elements
*  themselves are pointers, the pointed-to memory is not touched in any
*  way.  Managing the pointer is the user's responsibility.
*/
+  ~map() = default;
+#endif
 
   /**
*  @brief  %Map assignment operator.
-   *  @param  __x  A %map of identical element and allocator types.
-   *
-   *  All the elements of @a __x are copied.
*
*  Whether the allocator is copied depends on the allocator traits.
*/
+#if __cplusplus < 201103L
   map&
   operator=(const map& __x)
   {
 	_M_t = __x._M_t;
 	return *this;
   }
+#else
+  map&
+  operator=(const map&) = default;
 
-#if __cplusplus >= 201103L
   /// Move assignment operator.
   map&
   operator=(map&&) = default;
diff --git a/libstdc++-v3/include/bits/stl_multimap.h

Re: [PATCH] Fix filesystem::path for iterators with const value_type

2016-10-28 Thread Tim Song

On Fri, Oct 28, 2016 at 1:47 PM, Jonathan Wakely  wrote:
> For some reason the Filesystem library says that you can construct
> paths from iterators with value_type that is a possibly const encoded
> character type. I don't know why we support const value_type in this
> place, when normally that is bogus (even const_iterators have a
> non-const value_type, and various algorithms won't compile with const
> value_type).

It doesn't say that. [path.req]/1 says that "the value type shall be
an encoded character type". [path.req]/2 says that the relevant
overloads need not be SFINAE'd away (or equivalent) if the value_type
is a const encoded character type, but doesn't actually say that they
are required to work.

Re: RFC [1/3] divmod transform v2

2016-10-28 Thread Prathamesh Kulkarni

On 26 October 2016 at 16:17, Richard Biener  wrote:
> On Wed, 26 Oct 2016, Prathamesh Kulkarni wrote:
>
>> On 25 October 2016 at 18:47, Richard Biener  wrote:
>> > On Tue, 25 Oct 2016, Prathamesh Kulkarni wrote:
>> >
>> >> On 25 October 2016 at 16:17, Richard Biener  wrote:
>> >> > On Tue, 25 Oct 2016, Prathamesh Kulkarni wrote:
>> >> >
>> >> >> On 25 October 2016 at 13:43, Richard Biener 
>> >> >>  wrote:
>> >> >> > On Sun, Oct 16, 2016 at 7:59 AM, Prathamesh Kulkarni
>> >> >> >  wrote:
>> >> >> >> Hi,
>> >> >> >> After approval from Bernd Schmidt, I committed the patch to remove
>> >> >> >> optab functions for
>> >> >> >> sdivmod_optab and udivmod_optab in optabs.def, which removes the 
>> >> >> >> block
>> >> >> >> for divmod patch.
>> >> >> >>
>> >> >> >> This patch is mostly the same as previous one, except it drops
>> >> >> >> targeting __udivmoddi4() because
>> >> >> >> it gave undefined reference link error for calling __udivmoddi4() on
>> >> >> >> aarch64-linux-gnu.
>> >> >> >> It appears aarch64 has hardware insn for DImode div, so 
>> >> >> >> __udivmoddi4()
>> >> >> >> isn't needed for the target
>> >> >> >> (it was a bug in my patch that called __udivmoddi4() even though
>> >> >> >> aarch64 supported hardware div).
>> >> >> >>
>> >> >> >> However this makes me wonder if it's guaranteed that __udivmoddi4()
>> >> >> >> will be available for a target if it doesn't have hardware div and
>> >> >> >> divmod insn and doesn't have target-specific libfunc for
>> >> >> >> DImode divmod ? To be conservative, the attached patch doesn't
>> >> >> >> generate call to __udivmoddi4.
>> >> >> >>
>> >> >> >> Passes bootstrap+test on x86_64-unknown-linux.
>> >> >> >> Cross-tested on arm*-*-*, aarch64*-*-*.
>> >> >> >> Verified that there are no regressions with SPEC2006 on
>> >> >> >> x86_64-unknown-linux-gnu.
>> >> >> >> OK to commit ?
>> >> >> >
>> >> >> > I think the searching is still somewhat wrong - it's been some time
>> >> >> > since my last look at the
>> >> >> > patch so maybe I've said this already.  Please bail out early for
>> >> >> > stmt_can_throw_internal (stmt),
>> >> >> > otherwise the top stmt search might end up not working.  So
>> >> >> >
>> >> >> > +
>> >> >> > +  if (top_stmt == stmt && stmt_can_throw_internal (top_stmt))
>> >> >> > +return false;
>> >> >> >
>> >> >> > can go.
>> >> >> >
>> >> >> > top_stmt may end up as a TRUNC_DIV_EXPR so it's pointless to only 
>> >> >> > look
>> >> >> > for another
>> >> >> > TRUNC_DIV_EXPR later ... you may end up without a single 
>> >> >> > TRUNC_MOD_EXPR.
>> >> >> > Which means you want a div_seen and a mod_seen, or simply record the 
>> >> >> > top_stmt
>> >> >> > code and look for the opposite in the 2nd loop.
>> >> >> Um sorry I don't quite understand how we could end up without a 
>> >> >> trunc_mod stmt ?
>> >> >> The 2nd loop adds both trunc_div and trunc_mod to stmts vector, and
>> >> >> checks if we have
>> >> >> come across at least a single trunc_div stmt (and we bail out if no
>> >> >> div is seen).
>> >> >>
>> >> >> At 2nd loop I suppose we don't need mod_seen, because stmt is
>> >> >> guaranteed to be trunc_mod_expr.
>> >> >> In the 2nd loop the following condition will never trigger for stmt:
>> >> >>   if (stmt_can_throw_internal (use_stmt))
>> >> >> continue;
>> >> >> since we checked before hand if stmt could throw and chose to bail out
>> >> >> in that case.
>> >> >>
>> >> >> and the following condition would also not trigger for stmt:
>> >> >> if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb))
>> >> >>   {
>> >> >> end_imm_use_stmt_traverse (_iter);
>> >> >> return false;
>> >> >>   }
>> >> >> since gimple_bb (stmt) is always dominated by gimple_bb (top_stmt).
>> >> >>
>> >> >> The case where top_stmt == stmt, we wouldn't reach the above
>> >> >> condition, since we have above it:
>> >> >> if (top_stmt == stmt)
>> >> >>   continue;
>> >> >>
>> >> >> So IIUC, top_stmt and stmt would always get added to stmts vector.
>> >> >> Am I missing something ?
>> >> >
>> >> > Ah, indeed.  Maybe add a comment then, it wasn't really obvious ;)
>> >> >
>> >> > Please still move the stmt_can_throw_internal (stmt) check up.
>> >> Sure, I will move that up and do the other suggested changes.
>> >>
>> >> I was wondering if this condition in 2nd loop is too restrictive ?
>> >> if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb))
>> >>   {
>> >> end_imm_use_stmt_traverse (_iter);
>> >> return false;
>> >>   }
>> >>
>> >> Should we rather "continue" in this case by not adding use_stmt to
>> >> stmts vector rather than dropping
>> >> the transform all-together if gimple_bb (use_stmt) is not dominated by
>> >> gimple_bb (top_stmt) ?
>> >
>> > Ah, yes - didn't spot that.
>> Hi,
>> Is this version OK ?
>
> Yes.
Committed as r241660.
Thanks a lot!

Regards,
Prathamesh
>
>

[PATCH 1/3] use rtx_insn * in various places where it is obvious

2016-10-28 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2016-10-27  Trevor Saunders  

* config/arc/arc.c (arc_emit_call_tls_get_addr): Make the type
of variables rtx_insn *.
* config/arm/arm.c (arm_call_tls_get_addr): Likewise.
(legitimize_tls_address): Likewise.
* config/bfin/bfin.c (hwloop_optimize): Likewise.
(bfin_gen_bundles): Likewise.
* config/c6x/c6x.c (reorg_split_calls): Likewise.
(c6x_reorg): Likewise.
* config/frv/frv.c (frv_reorder_packet): Likewise.
* config/i386/i386.c (ix86_split_idivmod): Likewise.
* config/ia64/ia64.c (ia64_expand_compare): Likewise.
* config/m32c/m32c.c (m32c_prepare_shift): Likewise.
* config/mn10300/mn10300.c: Likewise.
* config/rl78/rl78.c: Likewise.
* config/s390/s390.c (s390_fix_long_loop_prediction): Likewise.
* config/sh/sh-mem.cc (sh_expand_cmpstr): Likewise.
(sh_expand_cmpnstr): Likewise.
(sh_expand_strlen): Likewise.
(sh_expand_setmem): Likewise.
* config/sh/sh.md: Likewise.
* emit-rtl.c (emit_pattern_before): Likewise.
* except.c: Likewise.
* final.c: Likewise.
* jump.c: Likewise.
---
 gcc/config/arc/arc.c |  3 +--
 gcc/config/arm/arm.c |  9 +
 gcc/config/bfin/bfin.c   |  7 ---
 gcc/config/c6x/c6x.c |  9 -
 gcc/config/frv/frv.c |  2 +-
 gcc/config/i386/i386.c   |  3 ++-
 gcc/config/ia64/ia64.c   |  4 ++--
 gcc/config/m32c/m32c.c   |  4 ++--
 gcc/config/mn10300/mn10300.c |  2 +-
 gcc/config/rl78/rl78.c   |  2 +-
 gcc/config/s390/s390.c   |  7 ---
 gcc/config/sh/sh-mem.cc  |  8 
 gcc/config/sh/sh.md  | 18 +-
 gcc/emit-rtl.c   |  2 +-
 gcc/except.c |  2 +-
 gcc/final.c  |  4 ++--
 gcc/jump.c   |  4 ++--
 17 files changed, 46 insertions(+), 44 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 21bba0c..8e8fff4 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -4829,7 +4829,6 @@ static rtx
 arc_emit_call_tls_get_addr (rtx sym, int reloc, rtx eqv)
 {
   rtx r0 = gen_rtx_REG (Pmode, R0_REG);
-  rtx insns;
   rtx call_fusage = NULL_RTX;
 
   start_sequence ();
@@ -4846,7 +4845,7 @@ arc_emit_call_tls_get_addr (rtx sym, int reloc, rtx eqv)
   RTL_PURE_CALL_P (call_insn) = 1;
   add_function_usage_to (call_insn, call_fusage);
 
-  insns = get_insns ();
+  rtx_insn *insns = get_insns ();
   end_sequence ();
 
   rtx dest = gen_reg_rtx (Pmode);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 022c1d7..6351987 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7900,10 +7900,10 @@ load_tls_operand (rtx x, rtx reg)
   return reg;
 }
 
-static rtx
+static rtx_insn *
 arm_call_tls_get_addr (rtx x, rtx reg, rtx *valuep, int reloc)
 {
-  rtx insns, label, labelno, sum;
+  rtx label, labelno, sum;
 
   gcc_assert (reloc != TLS_DESCSEQ);
   start_sequence ();
@@ -7927,7 +7927,7 @@ arm_call_tls_get_addr (rtx x, rtx reg, rtx *valuep, int 
reloc)
 LCT_PURE, /* LCT_CONST?  */
 Pmode, 1, reg, Pmode);
 
-  insns = get_insns ();
+  rtx_insn *insns = get_insns ();
   end_sequence ();
 
   return insns;
@@ -7959,7 +7959,8 @@ arm_tls_descseq_addr (rtx x, rtx reg)
 rtx
 legitimize_tls_address (rtx x, rtx reg)
 {
-  rtx dest, tp, label, labelno, sum, insns, ret, eqv, addend;
+  rtx dest, tp, label, labelno, sum, ret, eqv, addend;
+  rtx_insn *insns;
   unsigned int model = SYMBOL_REF_TLS_MODEL (x);
 
   switch (model)
diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index 9b81868..957f1ae 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -3431,7 +3431,8 @@ hwloop_optimize (hwloop_info loop)
   basic_block bb;
   rtx_insn *insn, *last_insn;
   rtx loop_init, start_label, end_label;
-  rtx iter_reg, scratchreg, scratch_init, scratch_init_insn;
+  rtx iter_reg, scratchreg, scratch_init;
+  rtx_insn *scratch_init_insn;
   rtx lc_reg, lt_reg, lb_reg;
   rtx seq_end;
   rtx_insn *seq;
@@ -3452,7 +3453,7 @@ hwloop_optimize (hwloop_info loop)
 
   scratchreg = NULL_RTX;
   scratch_init = iter_reg;
-  scratch_init_insn = NULL_RTX;
+  scratch_init_insn = NULL;
   if (!PREG_P (iter_reg) && loop->incoming_src)
 {
   basic_block bb_in = loop->incoming_src;
@@ -3976,7 +3977,7 @@ bfin_gen_bundles (void)
   for (insn = BB_HEAD (bb);; insn = next)
{
  int at_end;
- rtx delete_this = NULL_RTX;
+ rtx_insn *delete_this = NULL;
 
  if (NONDEBUG_INSN_P (insn))
{
diff --git a/gcc/config/c6x/c6x.c b/gcc/config/c6x/c6x.c
index f8c3d66..6cb9185 100644
--- a/gcc/config/c6x/c6x.c
+++ b/gcc/config/c6x/c6x.c
@@ -4856,7 +4856,7 @@ find_last_same_clock (rtx_insn *insn)
the SEQUENCEs that

[PATCH 3/3] split up some variables to use rtx_insn * more

2016-10-28 Thread tbsaunde+gcc

From: Trevor Saunders 

Note to readers: a -b diff is below the whitespace-sensitive one and should be
much easier to read.

gcc/ChangeLog:

2016-10-27  Trevor Saunders  

* config/alpha/alpha.c (alpha_legitimize_address_1): Split up
variables so some can be rtx_insn *.
(alpha_emit_xfloating_libcall): Likewise.
* config/mips/mips.c (mips_call_tls_get_addr): Likewise.
(mips_legitimize_tls_address): Likewise.
* optabs.c (expand_binop): Likewise.
* reload1.c (gen_reload): Likewise.
---
 gcc/config/alpha/alpha.c | 117 ---
 gcc/config/mips/mips.c   |  61 
 gcc/optabs.c |   5 +-
 gcc/reload1.c|   9 ++--
 4 files changed, 101 insertions(+), 91 deletions(-)

diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 7f53967..6d390ae 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -1017,7 +1017,8 @@ alpha_legitimize_address_1 (rtx x, rtx scratch, 
machine_mode mode)
   && GET_MODE_SIZE (mode) <= UNITS_PER_WORD
   && symbolic_operand (x, Pmode))
 {
-  rtx r0, r16, eqv, tga, tp, insn, dest, seq;
+  rtx r0, r16, eqv, tga, tp, dest, seq;
+  rtx_insn *insn;
 
   switch (tls_symbolic_operand_type (x))
{
@@ -1025,66 +1026,70 @@ alpha_legitimize_address_1 (rtx x, rtx scratch, 
machine_mode mode)
  break;
 
case TLS_MODEL_GLOBAL_DYNAMIC:
- start_sequence ();
+ {
+   start_sequence ();
 
- r0 = gen_rtx_REG (Pmode, 0);
- r16 = gen_rtx_REG (Pmode, 16);
- tga = get_tls_get_addr ();
- dest = gen_reg_rtx (Pmode);
- seq = GEN_INT (alpha_next_sequence_number++);
+   r0 = gen_rtx_REG (Pmode, 0);
+   r16 = gen_rtx_REG (Pmode, 16);
+   tga = get_tls_get_addr ();
+   dest = gen_reg_rtx (Pmode);
+   seq = GEN_INT (alpha_next_sequence_number++);
 
- emit_insn (gen_movdi_er_tlsgd (r16, pic_offset_table_rtx, x, seq));
- insn = gen_call_value_osf_tlsgd (r0, tga, seq);
- insn = emit_call_insn (insn);
- RTL_CONST_CALL_P (insn) = 1;
- use_reg (&CALL_INSN_FUNCTION_USAGE (insn), r16);
+   emit_insn (gen_movdi_er_tlsgd (r16, pic_offset_table_rtx, x, seq));
+   rtx val = gen_call_value_osf_tlsgd (r0, tga, seq);
+   insn = emit_call_insn (val);
+   RTL_CONST_CALL_P (insn) = 1;
+   use_reg (&CALL_INSN_FUNCTION_USAGE (insn), r16);
 
-  insn = get_insns ();
- end_sequence ();
+   insn = get_insns ();
+   end_sequence ();
 
- emit_libcall_block (insn, dest, r0, x);
- return dest;
+   emit_libcall_block (insn, dest, r0, x);
+   return dest;
+ }
 
case TLS_MODEL_LOCAL_DYNAMIC:
- start_sequence ();
+ {
+   start_sequence ();
 
- r0 = gen_rtx_REG (Pmode, 0);
- r16 = gen_rtx_REG (Pmode, 16);
- tga = get_tls_get_addr ();
- scratch = gen_reg_rtx (Pmode);
- seq = GEN_INT (alpha_next_sequence_number++);
+   r0 = gen_rtx_REG (Pmode, 0);
+   r16 = gen_rtx_REG (Pmode, 16);
+   tga = get_tls_get_addr ();
+   scratch = gen_reg_rtx (Pmode);
+   seq = GEN_INT (alpha_next_sequence_number++);
 
- emit_insn (gen_movdi_er_tlsldm (r16, pic_offset_table_rtx, seq));
- insn = gen_call_value_osf_tlsldm (r0, tga, seq);
- insn = emit_call_insn (insn);
- RTL_CONST_CALL_P (insn) = 1;
- use_reg (&CALL_INSN_FUNCTION_USAGE (insn), r16);
+   emit_insn (gen_movdi_er_tlsldm (r16, pic_offset_table_rtx, seq));
+   rtx val = gen_call_value_osf_tlsldm (r0, tga, seq);
+   insn = emit_call_insn (val);
+   RTL_CONST_CALL_P (insn) = 1;
+   use_reg (&CALL_INSN_FUNCTION_USAGE (insn), r16);
 
-  insn = get_insns ();
- end_sequence ();
+   insn = get_insns ();
+   end_sequence ();
 
- eqv = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx),
-   UNSPEC_TLSLDM_CALL);
- emit_libcall_block (insn, scratch, r0, eqv);
+   eqv = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx),
+ UNSPEC_TLSLDM_CALL);
+   emit_libcall_block (insn, scratch, r0, eqv);
 
- eqv = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, x), UNSPEC_DTPREL);
- eqv = gen_rtx_CONST (Pmode, eqv);
+   eqv = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, x), UNSPEC_DTPREL);
+   eqv = gen_rtx_CONST (Pmode, eqv);
 
- if (alpha_tls_size == 64)
-   {
- dest = gen_reg_rtx (Pmode);
- emit_insn (gen_rtx_SET (dest, eqv));
- emit_insn (gen_adddi3 (dest, dest, scratch));
- return dest;
-   }
- if (alpha_tls_size == 32)
-   {
-

[PATCH 0/3] use rtx_insn * more

2016-10-28 Thread tbsaunde+gcc

From: Trevor Saunders 

Hi,

This series changes the type of various variables from rtx to rtx_insn * so that
the remaining patches in this series
http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01353.html can be applied.

Patches bootstrapped and regtested on x86_64-linux-gnu, and run through
config-list.mk.  OK?
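
As a rough illustration of why narrowing a variable's type is useful, here is a minimal sketch. The types node and insn_node and both functions are hypothetical stand-ins, not GCC's actual rtx or rtx_insn definitions. The point is only that an interface declared on the derived type can be called without a checked cast once the variable itself has the derived type.

```cpp
#include <cassert>

// Hypothetical stand-ins for a generic RTL object and an instruction.
enum class kind { reg, insn };

struct node
{
  kind k;
};

struct insn_node : node
{
  insn_node *prev = nullptr;
  insn_node () : node{kind::insn} {}
};

// Checked downcast, needed whenever only a generic pointer is at hand.
insn_node *as_insn (node *n)
{
  assert (n->k == kind::insn);
  return static_cast<insn_node *> (n);
}

// An interface that only makes sense for instructions.
insn_node *prev_insn (insn_node *i)
{
  return i->prev;
}
```

A caller holding a node * has to go through as_insn () first, while a caller whose variable is already an insn_node * can call prev_insn () directly.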

Thanks!

Trev

Trevor Saunders (3):
  use rtx_insn * in various places where it is obvious
  split up the trial variable in reorg.c:relax_delay_slots to use
rtx_insn * more
  split up some variables to use rtx_insn * more

 gcc/config/alpha/alpha.c | 117 ++-
 gcc/config/arc/arc.c |   3 +-
 gcc/config/arm/arm.c |   9 ++--
 gcc/config/bfin/bfin.c   |   7 +--
 gcc/config/c6x/c6x.c |   9 ++--
 gcc/config/frv/frv.c |   2 +-
 gcc/config/i386/i386.c   |   3 +-
 gcc/config/ia64/ia64.c   |   4 +-
 gcc/config/m32c/m32c.c   |   4 +-
 gcc/config/mips/mips.c   |  61 +++---
 gcc/config/mn10300/mn10300.c |   2 +-
 gcc/config/rl78/rl78.c   |   2 +-
 gcc/config/s390/s390.c   |   7 +--
 gcc/config/sh/sh-mem.cc  |   8 +--
 gcc/config/sh/sh.md  |  18 +++
 gcc/emit-rtl.c   |   2 +-
 gcc/except.c |   2 +-
 gcc/final.c  |   4 +-
 gcc/jump.c   |   4 +-
 gcc/optabs.c |   5 +-
 gcc/reload1.c|   9 ++--
 gcc/reorg.c  |  19 ---
 22 files changed, 156 insertions(+), 145 deletions(-)

-- 
2.9.3.dirty

[PATCH 2/3] split up the trial variable in reorg.c:relax_delay_slots to use rtx_insn * more

2016-10-28 Thread tbsaunde+gcc

From: Trevor Saunders 

gcc/ChangeLog:

2016-10-27  Trevor Saunders  

* reorg.c (relax_delay_slots): Split up the trial variable.
---
 gcc/reorg.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/gcc/reorg.c b/gcc/reorg.c
index 799d27b..da4d7c6 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -222,7 +222,7 @@ static void steal_delay_list_from_fallthrough (rtx_insn *, 
rtx, rtx_sequence *,
 static void try_merge_delay_insns (rtx_insn *, rtx_insn *);
 static rtx_insn *redundant_insn (rtx, rtx_insn *, const vec &);
 static int own_thread_p (rtx, rtx, int);
-static void update_block (rtx_insn *, rtx);
+static void update_block (rtx_insn *, rtx_insn *);
 static int reorg_redirect_jump (rtx_jump_insn *, rtx);
 static void update_reg_dead_notes (rtx_insn *, rtx_insn *);
 static void fix_reg_dead_note (rtx_insn *, rtx);
@@ -1703,7 +1703,7 @@ own_thread_p (rtx thread, rtx label, int 
allow_fallthrough)
BARRIER in relax_delay_slots.  */
 
 static void
-update_block (rtx_insn *insn, rtx where)
+update_block (rtx_insn *insn, rtx_insn *where)
 {
   /* Ignore if this was in a delay slot and it came from the target of
  a branch.  */
@@ -3118,7 +3118,6 @@ relax_delay_slots (rtx_insn *first)
 {
   rtx_insn *insn, *next;
   rtx_sequence *pat;
-  rtx trial;
   rtx_insn *delay_insn;
   rtx target_label;
 
@@ -3271,10 +3270,10 @@ relax_delay_slots (rtx_insn *first)
  for (i = 0; i < XVECLEN (pat, 0); i++)
INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)) = 0;
 
- trial = PREV_INSN (insn);
+ rtx_insn *prev = PREV_INSN (insn);
  delete_related_insns (insn);
  gcc_assert (GET_CODE (pat) == SEQUENCE);
- add_insn_after (delay_insn, trial, NULL);
+ add_insn_after (delay_insn, prev, NULL);
  after = delay_insn;
  for (i = 1; i < pat->len (); i++)
after = emit_copy_of_insn_after (pat->insn (i), after);
@@ -3295,9 +3294,9 @@ relax_delay_slots (rtx_insn *first)
 
   /* If this jump goes to another unconditional jump, thread it, but
 don't convert a jump into a RETURN here.  */
-  trial = skip_consecutive_labels (follow_jumps (target_label,
-delay_jump_insn,
-));
+  rtx trial = skip_consecutive_labels (follow_jumps (target_label,
+delay_jump_insn,
+));
   if (ANY_RETURN_P (trial))
trial = find_end_label (trial);
 
@@ -3401,10 +3400,10 @@ relax_delay_slots (rtx_insn *first)
  for (i = 0; i < XVECLEN (pat, 0); i++)
INSN_FROM_TARGET_P (XVECEXP (pat, 0, i)) = 0;
 
- trial = PREV_INSN (insn);
+ rtx_insn *prev = PREV_INSN (insn);
  delete_related_insns (insn);
  gcc_assert (GET_CODE (pat) == SEQUENCE);
- add_insn_after (delay_jump_insn, trial, NULL);
+ add_insn_after (delay_jump_insn, prev, NULL);
  after = delay_jump_insn;
  for (i = 1; i < pat->len (); i++)
after = emit_copy_of_insn_after (pat->insn (i), after);
-- 
2.9.3.dirty

libgo patch committed: add missing build tag

2016-10-28 Thread Ian Lance Taylor

This patch to libgo adds a missing build tag to
runtime/lfstack_32bit.go, fixing the build on 32-bit PPC.  This should
fix GCC PR 78143.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu, which admittedly does not have the problem.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241655)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7fb11c908ddab4932cc416f16657cec3bc878a1a
+4a8df8f8622c140777996786866395448622ac3f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/lfstack_32bit.go
===
--- libgo/go/runtime/lfstack_32bit.go   (revision 241427)
+++ libgo/go/runtime/lfstack_32bit.go   (working copy)
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build 386 arm nacl armbe m68k mips mipsle mips64p32 mips64p32le mipso32 
mipsn32 s390 sparc
+// +build 386 arm nacl armbe m68k mips mipsle mips64p32 mips64p32le mipso32 
mipsn32 ppc s390 sparc
 
 package runtime

Re: [SPARC] Add support for overflow arithmetic

2016-10-28 Thread Jakub Jelinek

On Fri, Oct 28, 2016 at 07:27:56PM +0200, Eric Botcazou wrote:
> Thanks for the hint.  The hook is the way to go I think because BITS_PER_WORD
> is not a constant, so the default would not be properly initialized.  Here's a
> tentative patch, I'll add a couple of SPARC-specific testcases if accepted.
> 
> Tested on SPARC/Solaris, OK for the mainline?
> 
> 
>   * doc/tm.texi.in (Target Macros) Add TARGET_MIN_ARITHMETIC_PRECISION.
>   * doc/tm.texi: Regenerate.
>   * internal-fn.c (expand_arith_overflow): Rewrite handling of target
>   dependent support by means of TARGET_MIN_ARITHMETIC_PRECISION.
>   * target.def (min_arithmetic_precision): New hook.
>   * targhooks.c (default_min_arithmetic_precision): New function.
>   * targhooks.h (default_min_arithmetic_precision): Declare.
>   * config/sparc/sparc.c (TARGET_MIN_ARITHMETIC_PRECISION): Define.
>   (sparc_min_arithmetic_precision): New function.

Ok, thanks.

Jakub

[PATCH] Make filesystem::path work with basic_string_view (P0392R0)

2016-10-28 Thread Jonathan Wakely


This integrates string_view with experimental::filesystem::path, which
is not actually in the TS, but is required for std::filesystem. I've
made it work with experimental::string_view in C++14 mode, and with
std::string_view in C++17 mode. I suppose I could have made it work
with either in C++17 mode, but that would be a bit more complicated.
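
A minimal sketch of the selection logic described above, assuming libstdc++ and a compiler in C++14 or C++17 mode. The helper name append_view is hypothetical and is not part of the patch; it only mirrors the way the new operator+= appends the view's data() and size() to an owned string.

```cpp
#include <string>

// Pick the string_view flavour by language mode.
#if __cplusplus > 201402L
# include <string_view>
using std::basic_string_view;
#else
# include <experimental/string_view>
using std::experimental::basic_string_view;
#endif

// Hypothetical helper: append the characters a view refers to.
// The view need not be null-terminated, so pass pointer and length.
inline std::string&
append_view(std::string& dst, basic_string_view<char> sv)
{
  dst.append(sv.data(), sv.size());
  return dst;
}
```

Passing the length explicitly matters because a view produced by substr() usually points into the middle of a larger buffer.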

* include/experimental/bits/fs_path.h (__is_path_src)
(_S_range_begin, _S_range_end): Overload to treat string_view as a
Source object.
(path::operator+=, path::compare): Overload for basic_string_view.
* testsuite/experimental/filesystem/path/construct/string_view.cc:
New test.
* testsuite/experimental/filesystem/path/construct/
string_view_cxx17.cc: New test.

Tested x86_64-linux, committed to trunk.

commit d554b7a9a7cad560a4e483bd6fbb900d91da2655
Author: Jonathan Wakely 
Date:   Fri Oct 28 19:30:29 2016 +0100

Make filesystem::path work with basic_string_view (P0392R0)

* include/experimental/bits/fs_path.h (__is_path_src)
(_S_range_begin, _S_range_end): Overload to treat string_view as a
Source object.
(path::operator+=, path::compare): Overload for basic_string_view.
* testsuite/experimental/filesystem/path/construct/string_view.cc:
New test.
* testsuite/experimental/filesystem/path/construct/
string_view_cxx17.cc: New test.

diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h 
b/libstdc++-v3/include/experimental/bits/fs_path.h
index f6a290d..70a5445 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -44,6 +44,9 @@
 #include 
 #include 
 #include 
+#if __cplusplus == 201402L
+# include 
+#endif
 
 #if defined(_WIN32) && !defined(__CYGWIN__)
 # define _GLIBCXX_FILESYSTEM_IS_WINDOWS 1
@@ -61,6 +64,12 @@ inline namespace v1
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
+#if __cplusplus == 201402L
+  using std::experimental::basic_string_view;
+#elif __cplusplus > 201402L
+  using std::basic_string_view;
+#endif
+
   /**
* @ingroup filesystem
* @{
@@ -87,6 +96,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   static __is_encoded_char<_CharT>
   __is_path_src(const basic_string<_CharT, _Traits, _Alloc>&, int);
 
+#if __cplusplus >= 201402L
+template
+  static __is_encoded_char<_CharT>
+  __is_path_src(const basic_string_view<_CharT, _Traits>&, int);
+#endif
+
 template
   static std::false_type
   __is_path_src(const _Unknown&, ...);
@@ -130,6 +145,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _S_range_end(const basic_string<_CharT, _Traits, _Alloc>& __str)
   { return __str.data() + __str.size(); }
 
+#if __cplusplus >= 201402L
+template
+  static const _CharT*
+  _S_range_begin(const basic_string_view<_CharT, _Traits>& __str)
+  { return __str.data(); }
+
+template
+  static const _CharT*
+  _S_range_end(const basic_string_view<_CharT, _Traits>& __str)
+  { return __str.data() + __str.size(); }
+#endif
+
 template())),
 typename _Val = typename std::iterator_traits<_Iter>::value_type>
@@ -243,6 +270,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 path& operator+=(const string_type& __x);
 path& operator+=(const value_type* __x);
 path& operator+=(value_type __x);
+#if __cplusplus >= 201402L
+path& operator+=(basic_string_view __x);
+#endif
 
 template
   _Path<_Source>&
@@ -311,6 +341,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 int compare(const path& __p) const noexcept;
 int compare(const string_type& __s) const;
 int compare(const value_type* __s) const;
+#if __cplusplus >= 201402L
+int compare(const basic_string_view __s) const;
+#endif
 
 // decomposition
 
@@ -768,6 +801,16 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 return *this;
   }
 
+#if __cplusplus >= 201402L
+  inline path&
+  path::operator+=(basic_string_view __x)
+  {
+_M_pathname.append(__x.data(), __x.size());
+_M_split_cmpts();
+return *this;
+  }
+#endif
+
   template
 inline path::_Path<_CharT*, _CharT*>&
 path::operator+=(_CharT __x)
@@ -909,6 +952,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   inline int
   path::compare(const value_type* __s) const { return compare(path(__s)); }
 
+#if __cplusplus >= 201402L
+  inline int
+  path::compare(basic_string_view __s) const
+  { return compare(path(__s)); }
+#endif
+
   inline path
   path::filename() const { return empty() ? path() : *--end(); }
 
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/path/construct/string_view.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/path/construct/string_view.cc
new file mode 100644
index 000..13ebaaa
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/experimental/filesystem/path/construct/string_view.cc
@@ -0,0 +1,56 @@
+// { dg-options "-lstdc++fs -std=gnu++1z" }
+// { dg-do run { target c++1z } }
+// { dg-require-filesystem-ts

[PATCH] fix linker name for uClibc

2016-10-28 Thread Waldemar Brodkorb

Hi,

uClibc-ng can be used for the Microblaze architecture.
It is regularly tested with qemu-system-microblaze in little and
big endian mode.

2016-10-28  Waldemar Brodkorb  

gcc/
* config/microblaze/linux.h: Add UCLIBC_DYNAMIC_LINKER.

diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
index ae8523c..b3bf43a 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -29,6 +29,7 @@
 #define TLS_NEEDS_GOT 1
 
 #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
+#define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0"
 
 #if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */
 #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}"


best regards
 Waldemar

Re: [PATCH] Fix PR77407

2016-10-28 Thread Marc Glisse


On Wed, 28 Sep 2016, Richard Biener wrote:


The following patch implements patterns to catch x / abs (x)
and x / -x, taking advantage of undefinedness at x == 0 as
opposed to the PR having testcases with explicit != 0 checks.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

Richard.

2016-09-28  Richard Biener  

PR middle-end/77407
* match.pd: Add X / abs (X) -> X < 0 ? -1 : 1 and
X / -X -> -1 simplifications.


I notice that we still have the following comment a few lines above:

/* Make sure to preserve divisions by zero.  This is the reason why
   we don't simplify x / x to 1 or 0 / x to 0.  */

Did we give up on preserving divisions by 0? Can we now do the 2 
simplifications listed by the comment?
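
For reference, a small sketch of what the X / abs (X) pattern from the quoted ChangeLog computes. The function names are made up for illustration. Both functions agree for every nonzero x whose absolute value is representable; at x == 0 the division is undefined, which is exactly the case the question above is about.

```cpp
#include <cstdlib>

// Sign of x obtained by division; undefined behavior when x == 0.
int sign_by_division(int x)
{
  return x / std::abs(x);
}

// The form the simplification rewrites it to: X < 0 ? -1 : 1.
int sign_by_compare(int x)
{
  return x < 0 ? -1 : 1;
}
```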


--
Marc Glisse

Re: [Aarch64] Variant field

2016-10-28 Thread Andrew Pinski

On Fri, Oct 28, 2016 at 10:50 AM, Benedikt Huber
 wrote:
> Hi,
>
> In the aarch64 backend we would need VARIANT field in AARCH64_CORE to specify 
> and detect
> variants of xgene.
>
> I found this patch of Andrew Pinski
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02148.html
>
> However it did not find its way to trunk.
> Why was that the case?

Because the full patch set has not been reviewed.  Parts of it were
reviewed back then and parts in the last few weeks :).

> What is the correct way to proceed when (re)using this patch for submission?
> Is that possible at all?

I was in the process of resubmitting that patch set.  In fact I was
able to commit patches 1, 2 and 4 already.  Patch 3 was resubmitted 5
days ago:
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01855.html

I hope someone will approve it and then I will resubmit the last patch
which adds the variant portion.

The patch set does not apply directly any more as one variable name has changed.

Thanks,
Andrew

>
> Thank you and best regards,
> Benedikt
>

libgo patch committed: redirect mkrsysinfo.sh grep output to /dev/null

2016-10-28 Thread Ian Lance Taylor

This patch to libgo redirects the output of a grep command in
mkrsysinfo.sh to /dev/null.  The output otherwise appears in the
middle of a build log, where it is harmless but confusing.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 241432)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-6d9929a1641b180e724c2fdcdd55f6a254f1dec0
+7fb11c908ddab4932cc416f16657cec3bc878a1a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/mkrsysinfo.sh
===
--- libgo/mkrsysinfo.sh (revision 241347)
+++ libgo/mkrsysinfo.sh (working copy)
@@ -78,7 +78,7 @@ if grep '^const _epoll_data_offset ' ${O
   fi
 fi
 # Make sure EPOLLET is positive.
-if grep '^const _EPOLLET = [0-9]' gen-sysinfo.go; then
+if grep '^const _EPOLLET = [0-9]' gen-sysinfo.go > /dev/null 2>&1; then
   echo "const _EPOLLETpos = _EPOLLET" >> ${OUT}
 else
   echo "const _EPOLLETpos = 0x8000" >> ${OUT}

[Aarch64] Variant field

2016-10-28 Thread Benedikt Huber

Hi,

In the aarch64 backend we would need VARIANT field in AARCH64_CORE to specify 
and detect
variants of xgene.

I found this patch of Andrew Pinski
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02148.html

However it did not find its way to trunk.
Why was that the case?
What is the correct way to proceed when (re)using this patch for submission?
Is that possible at all?

Thank you and best regards,
Benedikt




[PATCH] Fix filesystem::path for iterators with const value_type

2016-10-28 Thread Jonathan Wakely


For some reason the Filesystem library says that you can construct
paths from iterators with value_type that is a possibly const encoded
character type. I don't know why we support const value_type in this
place, when normally that is bogus (even const_iterators have a
non-const value_type, and various algorithms won't compile with const
value_type).

Anyway, this fixes path to allow such wonky iterators, and fixes a bug
where I was using *iter++ which isn't defined for input iterators.
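
A minimal sketch of the two fixes described above, written as a free function rather than the actual path::_S_convert member. The name copy_until_null is hypothetical. It strips cv-qualifiers from the iterator's value_type before naming the string type, and it dereferences and increments in separate steps instead of using *iter++.

```cpp
#include <iterator>
#include <string>
#include <type_traits>

// Copy characters from a null-terminated input range into a string.
template<typename InputIt>
std::basic_string<
  typename std::remove_cv<
    typename std::iterator_traits<InputIt>::value_type>::type>
copy_until_null(InputIt src)
{
  using T = typename std::iterator_traits<InputIt>::value_type;
  using Char = typename std::remove_cv<T>::type;  // basic_string<const char> is not usable
  std::basic_string<Char> tmp;
  for (; *src != T{}; ++src)   // increment separately from the dereference
    tmp.push_back(*src);
  return tmp;
}
```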

* include/experimental/bits/fs_path.h
(path::_S_convert<_Iter>(_Iter, _Iter)): Remove cv-qualifiers from
iterator's value_type.
(path::_S_convert<_Iter>(_Iter __first, __null_terminated)): Likewise.
Do not use operation not supported by input iterators.
(path::__is_path_iter_src): Add partial specialization for const
encoded character types.
* testsuite/experimental/filesystem/path/construct/range.cc: Test
construction from input iterators with const value types.

Tested powerpc64le-linux, committed to trunk.


commit e8d357aafe250824728fd9e69dcf650871f6a6e6
Author: Jonathan Wakely 
Date:   Fri Oct 28 18:07:18 2016 +0100

Fix filesystem::path for iterators with const value_type

* include/experimental/bits/fs_path.h
(path::_S_convert<_Iter>(_Iter, _Iter)): Remove cv-qualifiers from
iterator's value_type.
(path::_S_convert<_Iter>(_Iter __first, __null_terminated)): Likewise.
Do not use operation not supported by input iterators.
(path::__is_path_iter_src): Add partial specialization for const
encoded character types.
* testsuite/experimental/filesystem/path/construct/range.cc: Test
construction from input iterators with const value types.

diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h 
b/libstdc++-v3/include/experimental/bits/fs_path.h
index 4d7291f..f6a290d 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -385,7 +385,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _S_convert(_Iter __first, _Iter __last)
   {
using __value_type = typename std::iterator_traits<_Iter>::value_type;
-   return _Cvt<__value_type>::_S_convert(__first, __last);
+   return _Cvt>::_S_convert(__first, __last);
   }
 
 template
@@ -393,10 +393,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _S_convert(_InputIterator __src, __null_terminated)
   {
using _Tp = typename std::iterator_traits<_InputIterator>::value_type;
-   std::basic_string<_Tp> __tmp;
-   while (*__src != _Tp{})
- __tmp.push_back(*__src++);
-   return _S_convert(__tmp.data(), __tmp.data() + __tmp.size());
+   std::basic_string> __tmp;
+   for (; *__src != _Tp{}; ++__src)
+ __tmp.push_back(*__src);
+   return _S_convert(__tmp.c_str(), __tmp.c_str() + __tmp.size());
   }
 
 static string_type
@@ -571,6 +571,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 struct path::__is_encoded_char : std::true_type
 { using value_type = char32_t; };
 
+  template
+struct path::__is_encoded_char : __is_encoded_char<_Tp> { };
+
   struct path::_Cmpt : path
   {
 _Cmpt(string_type __s, _Type __t, size_t __pos)
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/path/construct/range.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/path/construct/range.cc
index b68e65d..3dfec2f 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/path/construct/range.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/path/construct/range.cc
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 using std::experimental::filesystem::path;
 using __gnu_test::compare_paths;
@@ -54,6 +55,52 @@ test01()
 compare_paths(p1, p7);
 compare_paths(p1, p8);
 #endif
+
+using __gnu_test::test_container;
+using __gnu_test::input_iterator_wrapper;
+// Test with input iterators and const value_types
+test_container
+  r1((), () + s.size());
+path p9(r1.begin(), r1.end());
+compare_paths(p1, p9);
+
+test_container
+  r2((), () + s.size() + 1); // includes null-terminator
+path p10(r2.begin());
+compare_paths(p1, p10);
+
+test_container
+  r3(s.c_str(), s.c_str() + s.size());
+path p11(r3.begin(), r3.end());
+compare_paths(p1, p11);
+
+test_container
+  r4(s.c_str(), s.c_str() + s.size() + 1); // includes null-terminator
+path p12(r4.begin());
+compare_paths(p1, p12);
+
+#if _GLIBCXX_USE_WCHAR_T
+// Test with input iterators and const value_types
+test_container
+  r5((), () + ws.size());
+path p13(r5.begin(), r5.end());
+compare_paths(p1, p13);
+
+test_container
+  r6((), () + ws.size() + 1);

Re: [PR debug/77773] segfault when compiling __simd64_float16_t with -g

2016-10-28 Thread Richard Biener

On October 28, 2016 6:46:07 PM GMT+02:00, Aldy Hernandez wrote:
>On 10/28/2016 01:40 AM, Richard Biener wrote:
>> On Thu, Oct 27, 2016 at 6:14 PM, Aldy Hernandez wrote:
>>> On 10/27/2016 12:35 AM, Richard Biener wrote:
>>>>
>>>> On Wed, Oct 26, 2016 at 9:17 PM, Aldy Hernandez wrote:
>>>>>
>>>>> The following one-liner segfaults on arm-eabi when compiled with
>>>>> -mfloat-abi=hard -g:
>>>>>
>>>>> __simd64_float16_t usingit;
>>>>>
>>>>> The problem is that the pretty printer (in simple_type_specifier()) is
>>>>> dereferencing a NULL result from c_common_type_for_mode:
>>>>>
>>>>>   int prec = TYPE_PRECISION (t);
>>>>>   if (ALL_FIXED_POINT_MODE_P (TYPE_MODE (t)))
>>>>>     t = c_common_type_for_mode (TYPE_MODE (t),
>>>>>                                 TYPE_SATURATING (t));
>>>>>   else
>>>>>     t = c_common_type_for_mode (TYPE_MODE (t),
>>>>>                                 TYPE_UNSIGNED (t));
>>>>>   if (TYPE_NAME (t))
>>>>>
>>>>> The type in question is:
>>>>>
>>>>>
>>>>>
>>>>> which corresponds to HFmode and which AFAICT, does not have a type by
>>>>> design.
>>>>>
>>>>> I see that other uses of *type_for_mode() throughout the compiler check
>>>>> the result for NULL, so perhaps we should do the same here.
>>>>>
>>>>> The attached patch fixes the problem.
>>>>>
>>>>> OK for trunk?
>>>>
>>>>
>>>> Your added assert shows another possible issue - can you fix this by
>>>> assigning the result of c_common_type_for_mode to a new variable, like
>>>> common_t and use that for the TYPE_NAME (...) case?  I think this was
>>>> what was intended.
>>>
>>>
>>> Certainly.
>>>
>>> OK pending tests?
>>
>> Ok.
>>
>
>Thanks.
>
>I just noticed this is also a GCC 6 regression.  Assuming the GCC 6 
>branch is open for regression bugfixes, is this OK for the branch?

Yes.

Richard.

>Aldy

Re: [SPARC] Add support for overflow arithmetic

2016-10-28 Thread Eric Botcazou

> Then to some extent defining WORD_REGISTER_OPERATIONS on SPARC is a lie,
> it only has "INT_REGISTER_OPERATIONS", i.e. all operations smaller than
> int are performed on the whole register, int operations can be really done
> in SImode in the IL (no need to sign/zero extend anything to DImode, if you
> just ignore the high 32 bits).

On the other hand SPARC perfectly matches the documentation:

 -- Macro: WORD_REGISTER_OPERATIONS
 Define this macro to 1 if operations between registers with
 integral mode smaller than a word are always performed on the
 entire register.  Most RISC machines have this property and most
 CISC machines do not.

If you don't define it for SPARC, then you'll never define it!  The macro 
makes it possible to do some optimizations in combine.c and rtlanal.c so it 
looks quite useful.  Note that SPARC is one of the very few RISC targets that 
don't define PROMOTE_MODE for variables since a patch of yours from 1999:
  https://gcc.gnu.org/ml/gcc-patches/1999-12n/msg00202.html
so it's already parameterized to avoid sign/zero-extending to DImode.

> Guess easiest would be to add some targetm constant or hook that gives
> you bit precision - integral arithmetics smaller than this precision is
> performed in precision.  Then define it by default to
> #ifdef WORD_REGISTER_OPERATIONS
>   BITS_PER_WORD
> #else
>   BITS_PER_UNIT
> #endif
> and for sparc set to 32, then use this targetm constant or hook in
> internal-fn.c instead of WORD_REGISTER_OPERATIONS and BITS_PER_WORD.

Thanks for the hint.  The hook is the way to go I think because BITS_PER_WORD 
is not a constant, so the default would not be properly initialized.  Here's a 
tentative patch, I'll add a couple of SPARC-specific testcases if accepted.
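
To make the hook-with-default idea concrete, here is a minimal standalone sketch.  It is not the patch itself: the `Target` struct, its fields and the function names are hypothetical stand-ins for the real target macros and hook, and the widening test is only what the comment in internal-fn.c suggests.

```cpp
#include <cassert>

// Hypothetical stand-in for a target description; the real compiler uses
// the WORD_REGISTER_OPERATIONS, BITS_PER_WORD and BITS_PER_UNIT macros.
struct Target
{
  bool word_register_operations;
  unsigned bits_per_word;   // not a compile-time constant in general
  unsigned bits_per_unit;
  unsigned (*min_arithmetic_precision) (const Target &);
};

// Default hook: whole-word arithmetic if the target performs sub-word
// operations on the entire register, otherwise byte precision.
unsigned
default_min_precision (const Target &t)
{
  return t.word_register_operations ? t.bits_per_word : t.bits_per_unit;
}

// Override for a 64-bit target that also maintains 32-bit condition codes.
unsigned
sparc_like_min_precision (const Target &)
{
  return 32;
}

// Assumed use: widen an operation of precision PREC right away when the
// target has no arithmetic that narrow.
bool
widen_right_away (const Target &t, unsigned prec)
{
  return prec < t.min_arithmetic_precision (t);
}
```

Because the default is computed inside a function rather than stored in a constant initializer, it can read a word size that is only known after option processing, which is the reason given above for preferring a hook.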

Tested on SPARC/Solaris, OK for the mainline?


* doc/tm.texi.in (Target Macros): Add TARGET_MIN_ARITHMETIC_PRECISION.
* doc/tm.texi: Regenerate.
* internal-fn.c (expand_arith_overflow): Rewrite handling of target
dependent support by means of TARGET_MIN_ARITHMETIC_PRECISION.
* target.def (min_arithmetic_precision): New hook.
* targhooks.c (default_min_arithmetic_precision): New function.
* targhooks.h (default_min_arithmetic_precision): Declare.
* config/sparc/sparc.c (TARGET_MIN_ARITHMETIC_PRECISION): Define.
(sparc_min_arithmetic_precision): New function.

-- 
Eric Botcazou
Index: doc/tm.texi
===
--- doc/tm.texi	(revision 241611)
+++ doc/tm.texi	(working copy)
@@ -10618,6 +10618,23 @@ smaller than a word are always performed
 Most RISC machines have this property and most CISC machines do not.
 @end defmac
 
+@deftypefn {Target Hook} {unsigned int} TARGET_MIN_ARITHMETIC_PRECISION (void)
+On some RISC architectures with 64-bit registers, the processor also
+maintains 32-bit condition codes that make it possible to do real 32-bit
+arithmetic, although the operations are performed on the full registers.
+
+On such architectures, defining this hook to 32 tells the compiler to try
+using 32-bit arithmetical operations setting the condition codes instead
+of doing full 64-bit arithmetic.
+
+More generally, define this hook on RISC architectures if you want the
+compiler to try using arithmetical operations setting the condition codes
+with a precision lower than the word precision.
+
+You need not define this hook if @code{WORD_REGISTER_OPERATIONS} is not
+defined to 1.
+@end deftypefn
+
 @defmac LOAD_EXTEND_OP (@var{mem_mode})
 Define this macro to be a C expression indicating when insns that read
 memory in @var{mem_mode}, an integral mode narrower than a word, set the
Index: doc/tm.texi.in
===
--- doc/tm.texi.in	(revision 241611)
+++ doc/tm.texi.in	(working copy)
@@ -7575,6 +7575,8 @@ smaller than a word are always performed
 Most RISC machines have this property and most CISC machines do not.
 @end defmac
 
+@hook TARGET_MIN_ARITHMETIC_PRECISION
+
 @defmac LOAD_EXTEND_OP (@var{mem_mode})
 Define this macro to be a C expression indicating when insns that read
 memory in @var{mem_mode}, an integral mode narrower than a word, set the
Index: internal-fn.c
===
--- internal-fn.c	(revision 241611)
+++ internal-fn.c	(working copy)
@@ -1824,12 +1836,11 @@ expand_arith_overflow (enum tree_code co
 	  return;
 	}
 
-  /* For sub-word operations, if target doesn't have them, start
-	 with precres widening right away, otherwise do it only
-	 if the most simple cases can't be used.  */
-  if (WORD_REGISTER_OPERATIONS
-	  && orig_precres == precres
-	  && precres < BITS_PER_WORD)
+  /* For operations with low precision, if target doesn't have them, start
+	 with precres widening right away, otherwise do it only if the most
+	 simple cases can't be used.  */
+  const int min_precision =

Re: [PR debug/77773] segfault when compiling __simd64_float16_t with -g

2016-10-28 Thread Aldy Hernandez


On 10/28/2016 01:40 AM, Richard Biener wrote:

On Thu, Oct 27, 2016 at 6:14 PM, Aldy Hernandez  wrote:

On 10/27/2016 12:35 AM, Richard Biener wrote:


On Wed, Oct 26, 2016 at 9:17 PM, Aldy Hernandez  wrote:


The following one-liner segfaults on arm-eabi when compiled with
-mfloat-abi=hard -g:

__simd64_float16_t usingit;

The problem is that the pretty printer (in simple_type_specifier()) is
dereferencing a NULL result from c_common_type_for_mode:

  int prec = TYPE_PRECISION (t);
  if (ALL_FIXED_POINT_MODE_P (TYPE_MODE (t)))
t = c_common_type_for_mode (TYPE_MODE (t), TYPE_SATURATING
(t));
  else
t = c_common_type_for_mode (TYPE_MODE (t), TYPE_UNSIGNED
(t));
  if (TYPE_NAME (t))

The type in question is:



which corresponds to HFmode and which AFAICT, does not have a type by
design.

I see that other uses of *type_for_mode() throughout the compiler check
the result for NULL, so perhaps we should do the same here.

The attached patch fixes the problem.

OK for trunk?



Your added assert shows another possible issue - can you fix this by
assigning
the result of c_common_type_for_mode to a new variable, like common_t and
use that for the TYPE_NAME (...) case?  I think this was what was
intended.
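
The suggestion above (keep the lookup result in a separate variable and test it before use) can be sketched in isolation as follows.  This is a toy model, not the c-pretty-print.c code: `Type`, `type_for_mode` and `describe` are hypothetical names, and the lookup is assumed to return a null pointer for modes that have no corresponding type.

```cpp
#include <cassert>
#include <string>

// Toy model of a type node: it may or may not have a name.
struct Type
{
  const char *name;
  int mode;
};

// Hypothetical stand-in for c_common_type_for_mode: the lookup can fail
// and return nullptr, as it apparently does for HFmode.
const Type *
type_for_mode (int mode)
{
  static const Type int_type = { "int", 1 };
  return mode == 1 ? &int_type : nullptr;
}

std::string
describe (const Type &t)
{
  // Keep the lookup result in its own variable instead of overwriting T,
  // and check it before dereferencing.
  const Type *common_t = type_for_mode (t.mode);
  if (common_t && common_t->name)
    return common_t->name;
  return "<unnamed>";
}
```

With this shape the original type is still available after a failed lookup, so the fallback path does not have to guess what it was.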



Certainly.

OK pending tests?


Ok.



Thanks.

I just noticed this is also a GCC 6 regression.  Assuming the GCC 6 
branch is open for regression bugfixes, is this OK for the branch?


Aldy

[gomp4] propagating conditionals in worker-vector partitioned loops

2016-10-28 Thread Cesar Philippidis

I've applied the patch to gomp-4_0-branch to correct an issue involving
the propagation of variables used in conditional expressions to worker
and vector partitioned loops. More details regarding this patch can be
found here 
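
As a rough illustration of the single-predicate idea in the patch below, here is a scalar model.  `ThreadPos` and `skip_single_region` are hypothetical names; the assumption, taken from the RTL in the diff, is that the two axis positions are ORed together and compared against zero.

```cpp
#include <cassert>

// Hypothetical model of a thread's position along the two axes.
struct ThreadPos
{
  unsigned worker;
  unsigned vector;
};

// One combined predicate: true for every thread that is not at position
// zero on both axes, i.e. every thread that should skip the region meant
// to be executed by a single thread.
bool
skip_single_region (ThreadPos pos)
{
  unsigned combined = pos.worker | pos.vector;
  return combined != 0;
}
```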

Cesar
2016-10-26  Cesar Philippidis  

	gcc/
	* config/nvptx/nvptx.c (nvptx_single): Use a single predicate
	for loops partitioned across both worker and vector axes.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/broadcast-1.c: New test.


diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 7bf5987..4e6ed60 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -3507,11 +3507,38 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
   /* Insert the vector test inside the worker test.  */
   unsigned mode;
   rtx_insn *before = tail;
+  rtx wvpred = NULL_RTX;
+  bool skip_vector = false;
+
+  /* Create a single predicate for loops containing both worker and
+ vectors.  */
+  if (cond_branch
+  && (GOMP_DIM_MASK (GOMP_DIM_WORKER) & mask)
+  && (GOMP_DIM_MASK (GOMP_DIM_VECTOR) & mask))
+{
+  rtx regx = gen_reg_rtx (SImode);
+  rtx regy = gen_reg_rtx (SImode);
+  rtx tmp = gen_reg_rtx (SImode);
+  wvpred = gen_reg_rtx (BImode);
+
+  emit_insn_before (gen_oacc_dim_pos (regx, const1_rtx), head);
+  emit_insn_before (gen_oacc_dim_pos (regy, const2_rtx), head);
+  emit_insn_before (gen_rtx_SET (tmp, gen_rtx_IOR (SImode, regx, regy)),
+			head);
+  emit_insn_before (gen_rtx_SET (wvpred, gen_rtx_NE (BImode, tmp,
+			 const0_rtx)),
+			head);
+
+  skip_mask &= ~(GOMP_DIM_MASK (GOMP_DIM_VECTOR));
+  skip_vector = true;
+}
+
   for (mode = GOMP_DIM_WORKER; mode <= GOMP_DIM_VECTOR; mode++)
 if (GOMP_DIM_MASK (mode) & skip_mask)
   {
 	rtx_code_label *label = gen_label_rtx ();
-	rtx pred = cfun->machine->axis_predicate[mode - GOMP_DIM_WORKER];
+	rtx pred = skip_vector ? wvpred
+	  : cfun->machine->axis_predicate[mode - GOMP_DIM_WORKER];
 
 	if (!pred)
 	  {
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/broadcast-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/broadcast-1.c
new file mode 100644
index 000..4dcb60d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/broadcast-1.c
@@ -0,0 +1,49 @@
+/* Ensure that worker-vector state conditional expressions are
+   properly handled by the nvptx backend.  */
+
+#include 
+#include 
+
+
+#define N 1024
+
+int A[N][N] ;
+
+void test(int x)
+{
+#pragma acc parallel  num_gangs(16) num_workers(4) vector_length(32) copyout(A)
+  {
+#pragma acc loop gang
+for(int j=0;j

Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-10-28 Thread Andrew Burgess

* Jeff Law  [2016-10-28 09:58:14 -0600]:

> On 09/15/2016 08:24 AM, Andrew Burgess wrote:
> > * Jakub Jelinek  [2016-09-14 15:07:56 +0200]:
> > 
> > > On Wed, Sep 14, 2016 at 02:00:48PM +0100, Andrew Burgess wrote:
> > > > In an attempt to get this patch merged (as I still think that it's
> > > > correct) I've investigated, and documented a little more about how I
> > > > think things currently work.  I'm sure most people reading this will
> > > > already know this, but hopefully, if my understanding is wrong someone
> > > > can point it out.
> > > 
> > > I wonder if user_defined_section_attribute instead shouldn't be moved
> > > into struct function and be handled as a per-function flag then.
> > 
> > That would certainly solve the problem I'm trying to address.  But I
> > wonder, how is that different to looking for a section attribute on
> > the function DECL?
> I'm not sure it is significantly different.  It seems like it's just an
> implementation detail.  I'd err on the side of putting this into the struct
> function rather than on the DECL node simply to keep the size of DECL nodes
> from increasing.  Even if you can find suitable free flag bits, those can
> likely be better used for other purposes.

I didn't add anything to the DECL, the information we need is already
there.  The relevant chunk of the patch is:

@@ -2890,7 +2889,7 @@ pass_partition_blocks::gate (function *fun)
 we are going to omit the reordering.  */
  && optimize_function_for_speed_p (fun)
  && !DECL_COMDAT_GROUP (current_function_decl)
- && !user_defined_section_attribute);
+ && !lookup_attribute ("section", DECL_ATTRIBUTES (fun->decl)));
 unsigned

I have not made any changes to add anything new to the DECL.  I guess
an argument _could_ be made that looking up an attribute is too
expensive to be used in a pass::gate function (I haven't looked into
how expensive it is) but I figured that initially at least it's better
to reuse the data we already have around than to add a new flag that
duplicates something we already have.
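
To illustrate the difference between a per-declaration lookup and a global flag, here is a small self-contained model.  `FunctionDecl`, `has_attribute` and `partition_blocks_gate` are hypothetical stand-ins for the DECL, lookup_attribute and the pass gate; the real data structures are different.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of a function declaration carrying attributes.
struct FunctionDecl
{
  std::string name;
  std::vector<std::string> attributes;
};

// Stand-in for lookup_attribute: a linear scan over the attribute list.
bool
has_attribute (const FunctionDecl &fn, const std::string &attr)
{
  for (const std::string &a : fn.attributes)
    if (a == attr)
      return true;
  return false;
}

// Gate for a per-function pass: consult the declaration being compiled
// rather than a global flag that was set while some other function was
// being parsed.
bool
partition_blocks_gate (const FunctionDecl &fn)
{
  return !has_attribute (fn, "section");
}
```

The answer here depends only on the function passed in, so it cannot be affected by the order in which functions were parsed.  The cost is one scan of a normally short list per gate call.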

> I'm still pondering the actual patch.  It's not forgotten.

Would it help clarify things if I added some printf style tracing and
posted a trace?  This might help highlight how
user_defined_section_attribute is set in a different phase of
compilation and so can't possibly be of any use when deciding whether
to perform the pass.

I'm still keen to see this merged, so any extra leg work I can do to
help move this forward, please let me know; I'm happy to help.

Thanks,
Andrew

Re: [PATCHv2 4/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

2016-10-28 Thread Andre Vieira (lists)

On 27/10/16 11:44, Kyrill Tkachov wrote:
> 
> On 27/10/16 11:00, Andre Vieira (lists) wrote:
>> On 26/10/16 17:30, Kyrill Tkachov wrote:
>>> On 26/10/16 17:26, Andre Vieira (lists) wrote:
 On 26/10/16 13:51, Kyrill Tkachov wrote:
> Hi Andre,
>
> On 25/10/16 17:29, Andre Vieira (lists) wrote:
>> On 24/08/16 12:01, Andre Vieira (lists) wrote:
>>> On 25/07/16 14:23, Andre Vieira (lists) wrote:
 This patch extends support for the ARMv8-M Security Extensions
 'cmse_nonsecure_entry' attribute to safeguard against leak of
 information through unbanked registers.

 When returning from a nonsecure entry function we clear all
 caller-saved
 registers that are not used to pass return values, by writing
 either
 the
 LR, in case of general purpose registers, or the value 0, in case
 of FP
 registers. We use the LR to write to APSR and FPSCR too. We
 currently do
 not support entry functions that pass arguments or return
 variables on
 the stack and we diagnose this. This patch relies on the existing
 code
 to make sure callee-saved registers used in cmse_nonsecure_entry
 functions are saved and restored thus retaining their nonsecure
 mode
 value, this should be happening already as it is required by AAPCS.

 This patch also clears padding bits for cmse_nonsecure_entry
 functions
 with struct and union return types. For unions a bit is only
 considered
 a padding bit if it is an unused bit in every field of that union.
 The
 function that calculates these is used in a later patch to do the
 same
 for arguments of cmse_nonsecure_call's.

 *** gcc/ChangeLog ***
 2016-07-25  Andre Vieira
Thomas Preud'homme  

* config/arm/arm.c (output_return_instruction): Clear
registers.
(thumb2_expand_return): Likewise.
(thumb1_expand_epilogue): Likewise.
(thumb_exit): Likewise.
(arm_expand_epilogue): Likewise.
(cmse_nonsecure_entry_clear_before_return): New.
(comp_not_to_clear_mask_str_un): New.
(compute_not_to_clear_mask): New.
* config/arm/thumb1.md (*epilogue_insns): Change length
 attribute.
* config/arm/thumb2.md (*thumb2_return): Likewise.

 *** gcc/testsuite/ChangeLog ***
 2016-07-25  Andre Vieira
Thomas Preud'homme  

* gcc.target/arm/cmse/cmse.exp: Test different multilibs
 separate.
* gcc.target/arm/cmse/struct-1.c: New.
* gcc.target/arm/cmse/bitfield-1.c: New.
* gcc.target/arm/cmse/bitfield-2.c: New.
* gcc.target/arm/cmse/bitfield-3.c: New.
* gcc.target/arm/cmse/baseline/cmse-2.c: Test that
 registers are
 cleared.
* gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
* gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
* gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.

>>> Updated this patch to correctly clear only the cumulative
>>> exception-status (0-4,7) and the condition code bits (28-31) of the
>>> FPSCR. I also adapted the code to handle the bigger floating
>>> point
>>> register files.
>>>
>>> 
>>>
>>> This patch extends support for the ARMv8-M Security Extensions
>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>> information through unbanked registers.
>>>
>>> When returning from a nonsecure entry function we clear all
>>> caller-saved
>>> registers that are not used to pass return values, by writing
>>> either the
>>> LR, in case of general purpose registers, or the value 0, in case
>>> of FP
>>> registers. We use the LR to write to APSR. For FPSCR we clear
>>> only the
>>> cumulative exception-status (0-4, 7) and the condition code bits
>>> (28-31). We currently do not support entry functions that pass
>>> arguments
>>> or return variables on the stack and we diagnose this. This patch
>>> relies
>>> on the existing code to make sure callee-saved registers used in
>>> cmse_nonsecure_entry functions are saved and restored thus retaining
>>> their nonsecure mode value, this should be happening

Re: [PATCH][AArch64] Add a SHA1H pattern

2016-10-28 Thread James Greenhalgh

On Fri, Oct 28, 2016 at 04:54:05PM +0100, Wilco Dijkstra wrote:
> James Greenhalgh wrote:
> > On Wed, Oct 26, 2016 at 12:11:44PM +, Wilco Dijkstra wrote:
> > > Add a SHA1H pattern with a V2SI input.  This avoids unnecessary
> > > DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).
> >
> > I think this is incorrect for big endian - element 0 of a vec_select in
> > big-endian for V4SImode is the high 32-bits (i.e. bits 96-127 of the
> > architected register). I think you'd need two patterns, one as below for
> > !BYTES_BIG_ENDIAN, and one selecting element 3 for BYTES_BIG_ENDIAN.
> 
> Yes that's true, big-endian SIMD works in mysterious ways... Here is the 
> updated
> patch (tested on aarch64_be-none-elf too):
> 
> Add LE/BE SHA1H patterns with a V2SI input.  This avoids unnecessary
> DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).


Thanks, this respin looks OK to me.

James

> ChangeLog:
> 2016-10-28  Wilco Dijkstra  
> 
>   * config/aarch64/aarch64-simd.md (aarch64_crypto_sha1hv4si): New 
> pattern.
>   (aarch64_be_crypto_sha1hv4si): New pattern.
> --

Re: [patch,testsuite] Support dg-require-effective-target label_offsets.

2016-10-28 Thread Mike Stump

On Oct 27, 2016, at 3:16 AM, Georg-Johann Lay  wrote:
> 
> Now imagine some arithmetic like & - &  This might result in one 
> or two stub addresses, and difference between such addresses is a complete 
> different thing than the difference between the original labels:  The result 
> might differ in absolute value and in sign, i.e. you can't even determine 
> whether LAB1 or LAB2 comes first in the generated code as the order of stubs 
> might differ from the order of respective labels.

So, I think none of this matters.  Every address gs(LAB) fits in 16 bits 
by definition, and every gs(LAB1) - gs(LAB2) fits in 16 bits and thus is 
valid in all 16-bit contexts.  The fact that the order of the stubs differs 
from the order of the actual code is irrelevant; it is a private 
implementation detail of the port.  The point is that the semantics are 
fixed, constant and useful.  Indeed, that there is a stub at all is a private 
implementation detail of the port.  I think the `extra' helpful warning from 
avr_print_operand_address is wrong and should be removed.  Think of the label 
as gs(LAB), not LAB; burn LAB from your mind.  Once you do that, you see you 
can't talk about the order LAB1 > LAB2; the concept doesn't make any sense.  
The _only_ thing you can talk about is gs(LAB1) > gs(LAB2).  And because of 
that, it is always consistent and works just fine.

Once those misguided complaints from gcc and binutils are fixed, are there 
any failing cases?

Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-10-28 Thread Jeff Law


On 09/15/2016 08:24 AM, Andrew Burgess wrote:

* Jakub Jelinek  [2016-09-14 15:07:56 +0200]:


On Wed, Sep 14, 2016 at 02:00:48PM +0100, Andrew Burgess wrote:

In an attempt to get this patch merged (as I still think that it's
correct) I've investigated, and documented a little more about how I
think things currently work.  I'm sure most people reading this will
already know this, but hopefully, if my understanding is wrong someone
can point it out.


I wonder if user_defined_section_attribute instead shouldn't be moved
into struct function and be handled as a per-function flag then.


That would certainly solve the problem I'm trying to address.  But I
wonder, how is that different to looking for a section attribute on
the function DECL?
I'm not sure it is significantly different.  It seems like it's just an 
implementation detail.  I'd err on the side of putting this into the 
struct function rather than on the DECL node simply to keep the size of 
DECL nodes from increasing.  Even if you can find suitable free flag 
bits, those can likely be better used for other purposes.



I'm still pondering the actual patch.  It's not forgotten.

jeff

Re: [PATCH][AArch64] Add a SHA1H pattern

2016-10-28 Thread Wilco Dijkstra

James Greenhalgh wrote:
> On Wed, Oct 26, 2016 at 12:11:44PM +, Wilco Dijkstra wrote:
> > Add a SHA1H pattern with a V2SI input.  This avoids unnecessary
> > DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).
>
> I think this is incorrect for big endian - element 0 of a vec_select in
> big-endian for V4SImode is the high 32-bits (i.e. bits 96-127 of the
> architected register). I think you'd need two patterns, one as below for
> !BYTES_BIG_ENDIAN, and one selecting element 3 for BYTES_BIG_ENDIAN.

Yes that's true, big-endian SIMD works in mysterious ways... Here is the updated
patch (tested on aarch64_be-none-elf too):

Add LE/BE SHA1H patterns with a V2SI input.  This avoids unnecessary
DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).
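
The lane numbering issue raised in the review can be summarised with a small helper.  This is only a sketch under the assumption stated there, namely that for big-endian the element index used in a vec_select is the reverse of the architectural lane number; `rtl_lane` is a hypothetical name.

```cpp
#include <cassert>

// Map an architectural lane number to the element index used in RTL,
// assuming big-endian numbering is simply reversed.
constexpr unsigned
rtl_lane (bool bytes_big_endian, unsigned n_elts, unsigned lane)
{
  return bytes_big_endian ? n_elts - 1 - lane : lane;
}

// For a four-element vector, architectural lane 0 is element 0 on
// little-endian and element 3 on big-endian, which matches the two
// patterns in the patch.
static_assert (rtl_lane (false, 4, 0) == 0, "little-endian lane 0");
static_assert (rtl_lane (true, 4, 0) == 3, "big-endian lane 0");
```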

ChangeLog:
2016-10-28  Wilco Dijkstra  

* config/aarch64/aarch64-simd.md (aarch64_crypto_sha1hv4si): New 
pattern.
(aarch64_be_crypto_sha1hv4si): New pattern.
--

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9ce7f00050913aebd9f83ae9c4ce4ad469dd0d98..89bdcb3f7ed53d092dd95c81fe4a15fb15dc907c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -5705,6 +5705,26 @@
   [(set_attr "type" "crypto_sha1_fast")]
 )
 
+(define_insn "aarch64_crypto_sha1hv4si"
+  [(set (match_operand:SI 0 "register_operand" "=w")
+   (unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
+(parallel [(const_int 0)]))]
+UNSPEC_SHA1H))]
+  "TARGET_SIMD && TARGET_CRYPTO && !BYTES_BIG_ENDIAN"
+  "sha1h\\t%s0, %s1"
+  [(set_attr "type" "crypto_sha1_fast")]
+)
+
+(define_insn "aarch64_be_crypto_sha1hv4si"
+  [(set (match_operand:SI 0 "register_operand" "=w")
+   (unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
+(parallel [(const_int 3)]))]
+UNSPEC_SHA1H))]
+  "TARGET_SIMD && TARGET_CRYPTO && BYTES_BIG_ENDIAN"
+  "sha1h\\t%s0, %s1"
+  [(set_attr "type" "crypto_sha1_fast")]
+)
+
 (define_insn "aarch64_crypto_sha1su1v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=w")
 (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0")

Re: [RFA] Fix various PPC build failures due to int-in-boolean-context code

2016-10-28 Thread Jakub Jelinek

On Fri, Oct 28, 2016 at 09:12:29AM -0600, Jeff Law wrote:
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Avoid
>   false positive from int-in-boolean-context warnings.
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 5e35e33..38a5226 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -3880,7 +3880,7 @@ rs6000_option_override_internal (bool global_init_p)
>  
>If there is a TARGET_DEFAULT, use that.  Otherwise fall back to using
>-mcpu=powerpc, -mcpu=powerpc64, or -mcpu=powerpc64le defaults.  */
> -  HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT
> +  HOST_WIDE_INT flags = ((TARGET_DEFAULT) != 0 ? TARGET_DEFAULT

Why ()s around TARGET_DEFAULT?  If they are needed, they should be provided
in the TARGET_DEFAULT macro definition.  So I think
  HOST_WIDE_INT flags
= (TARGET_DEFAULT != 0 ? TARGET_DEFAULT
   : processor_target_table[cpu_index].target_enable);
is what we want to use (the processor_target_table[cpu_index].target_enable
line is too long where it is right now).

>: processor_target_table[cpu_index].target_enable);
>rs6000_isa_flags |= (flags & ~rs6000_isa_flags_explicit);
>  }

Jakub

Fix bfin port WRT fallthru warnings

2016-10-28 Thread Jeff Law



These were pretty obvious when looking at the code.  Verified the bfin 
ports from config-list.mk will build with a trunk compiler.


Installing on the trunk.  Now onward to the target independent bits 
(which are few).
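
For readers unfamiliar with the pattern, the bfin.h change has roughly the following shape.  This is a reduced sketch, not the real macro: `Cpu` and `builtin_defines` are hypothetical, and only one pair of CPU cases is shown.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum Cpu { CPU_BF542, CPU_BF542M };

// Collect the macros that would be defined for CPU.  The "M" variant
// defines its own macro and then deliberately continues into the base
// variant's case; the comment documents that intent for the compiler.
std::vector<std::string>
builtin_defines (Cpu cpu)
{
  std::vector<std::string> defs;
  switch (cpu)
    {
    case CPU_BF542M:
      defs.push_back ("__ADSPBF542M__");
      /* FALLTHRU */
    case CPU_BF542:
      defs.push_back ("__ADSPBF542__");
      defs.push_back ("__ADSPBF54x__");
      break;
    }
  return defs;
}
```

Where the fall-through is not intended, as in the bfin.c hunk, the fix is an explicit break instead of a comment.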


Jeff
commit 71012b1c342ab2b69494429ec2d60d94248acea5
Author: law 
Date:   Fri Oct 28 15:22:28 2016 +

* config/bfin/bfin.c (bfin_legitimate_address_p): Add missing
fallthru comment.
* config/bfin/bfin.h (TARGET_CPU_CPP_BUILTINS): Likewise.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@241651 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cfd0929..3f2ea4d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-10-28  Jeff Law  
+
+   * config/bfin/bfin.c (bfin_legitimate_address_p): Add missing
+   fallthru comment.
+   * config/bfin/bfin.h (TARGET_CPU_CPP_BUILTINS): Likewise.
+
 2016-10-28  Segher Boessenkool  
 
PR rtl-optimization/78029
diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index 9b81868..2dfd038 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -2718,6 +2718,7 @@ bfin_legitimate_address_p (machine_mode mode, rtx x, bool strict)
&& REG_P (XEXP (x, 0))
&& bfin_valid_reg_p (REGNO (XEXP (x, 0)), strict, mode, POST_INC))
   return true;
+break;
   case PRE_DEC:
 if (LEGITIMATE_MODE_FOR_AUTOINC_P (mode)
&& XEXP (x, 0) == stack_pointer_rtx
diff --git a/gcc/config/bfin/bfin.h b/gcc/config/bfin/bfin.h
index a85c8c4..b5f1544 100644
--- a/gcc/config/bfin/bfin.h
+++ b/gcc/config/bfin/bfin.h
@@ -110,30 +110,35 @@
  break;\
case BFIN_CPU_BF542M:   \
  builtin_define ("__ADSPBF542M__");\
+ /* FALLTHRU */\
case BFIN_CPU_BF542:\
  builtin_define ("__ADSPBF542__"); \
  builtin_define ("__ADSPBF54x__"); \
  break;\
case BFIN_CPU_BF544M:   \
  builtin_define ("__ADSPBF544M__");\
+ /* FALLTHRU */\
case BFIN_CPU_BF544:\
  builtin_define ("__ADSPBF544__"); \
  builtin_define ("__ADSPBF54x__"); \
  break;\
case BFIN_CPU_BF547M:   \
  builtin_define ("__ADSPBF547M__");\
+ /* FALLTHRU */\
case BFIN_CPU_BF547:\
  builtin_define ("__ADSPBF547__"); \
  builtin_define ("__ADSPBF54x__"); \
  break;\
case BFIN_CPU_BF548M:   \
  builtin_define ("__ADSPBF548M__");\
+ /* FALLTHRU */\
case BFIN_CPU_BF548:\
  builtin_define ("__ADSPBF548__"); \
  builtin_define ("__ADSPBF54x__"); \
  break;\
case BFIN_CPU_BF549M:   \
  builtin_define ("__ADSPBF549M__");\
+ /* FALLTHRU */\
case BFIN_CPU_BF549:\
  builtin_define ("__ADSPBF549__"); \
  builtin_define ("__ADSPBF54x__"); \

Re: [RFA] Fix various PPC build failures due to int-in-boolean-context code

2016-10-28 Thread Jeff Law


On 10/28/2016 09:17 AM, Jakub Jelinek wrote:

On Fri, Oct 28, 2016 at 09:12:29AM -0600, Jeff Law wrote:

* config/rs6000/rs6000.c (rs6000_option_override_internal): Avoid
false positive from int-in-boolean-context warnings.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5e35e33..38a5226 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3880,7 +3880,7 @@ rs6000_option_override_internal (bool global_init_p)

 If there is a TARGET_DEFAULT, use that.  Otherwise fall back to using
 -mcpu=powerpc, -mcpu=powerpc64, or -mcpu=powerpc64le defaults.  */
-  HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT
+  HOST_WIDE_INT flags = ((TARGET_DEFAULT) != 0 ? TARGET_DEFAULT


Why ()s around TARGET_DEFAULT?
Not strictly needed.  But I didn't see a need to change that given 
they've been in the port "forever" and they don't impact readability in 
any significant way.




If they are needed, they should be provided

in the TARGET_DEFAULT macro definition.  So I think
  HOST_WIDE_INT flags
= (TARGET_DEFAULT != 0 ? TARGET_DEFAULT
   : processor_target_table[cpu_index].target_enable);
is what we want to use (the processor_target_table[cpu_index].target_enable
line is too long where it is right now).
Agreed.  I wasn't looking to do any cleanups, just get everything 
building again with config-list.mk.


jeff

[RFA] Fix various PPC build failures due to int-in-boolean-context code

2016-10-28 Thread Jeff Law



The PPC port is stumbling over the new integer in boolean context warnings.

In particular this code from rs6000_option_override_internal is 
problematical:


  HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT
                         : processor_target_table[cpu_index].target_enable);


The compiler is flagging the (TARGET_DEFAULT) condition.  That's 
supposed to be a boolean.


After all the macro expansions are done it ultimately looks something 
like this:


 long flags = (((1L << 7)) ? (1L << 7)
: processor_target_table[cpu_index].target_enable);

Note the (1L << 7) used as the condition for the ternary.  That's what 
has the int-in-boolean-context warning tripping.  It's a false positive 
IMHO.


Working around the warning is pretty trivial; we can just compare 
against zero, i.e.


((TARGET_DEFAULT) != 0 ? ... : ...);

With that change all the PPC configurations in config-list.mk can be 
built with a trunk compiler.
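
A reduced, standalone version of the workaround might look like this.  `TARGET_DEFAULT_DEMO` and `pick_flags` are hypothetical names used only for illustration; the macro is assumed to expand to a shifted constant as in the expansion shown above.

```cpp
#include <cassert>

// Stand-in for a TARGET_DEFAULT-style macro expanding to a shifted constant.
#define TARGET_DEFAULT_DEMO (1L << 7)

long
pick_flags (long table_default)
{
  // Writing the condition as an explicit comparison states the intent and
  // avoids using the result of a shift directly as a truth value.
  return TARGET_DEFAULT_DEMO != 0 ? TARGET_DEFAULT_DEMO : table_default;
}
```

The value selected is the same as before; only the form of the condition changes.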




OK for the trunk?

Jeff
* config/rs6000/rs6000.c (rs6000_option_override_internal): Avoid
false positive from int-in-boolean-context warnings.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5e35e33..38a5226 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3880,7 +3880,7 @@ rs6000_option_override_internal (bool global_init_p)
 
 If there is a TARGET_DEFAULT, use that.  Otherwise fall back to using
 -mcpu=powerpc, -mcpu=powerpc64, or -mcpu=powerpc64le defaults.  */
-  HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT
+  HOST_WIDE_INT flags = ((TARGET_DEFAULT) != 0 ? TARGET_DEFAULT
 : processor_target_table[cpu_index].target_enable);
   rs6000_isa_flags |= (flags & ~rs6000_isa_flags_explicit);
 }

Re: MAINTAINERS update

2016-10-28 Thread Carl E. Love

Hi, 

I added myself to the MAINTAINERS file (Write After Approval) on
10/27/2016.  The commit was r241636.

Sorry, forgot the patch the first time.
 
  Carl Love


Index: ChangeLog
===
--- ChangeLog   (revision 241636)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2016-10-27  Carl Love  
+
+   * MAINTAINERS (Write After Approval): Add myself.
+
 2016-10-27  Andrew Burgess  
 
* MAINTAINERS (Reviewers): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 241636)
+++ MAINTAINERS (working copy)
@@ -479,6 +479,7 @@
 Manuel López-Ibáñez
 Martin v. Löwis

 H.J. Lu
+Carl Love  
 Christophe Lyon
 Luis Machado   
 Ziga Mahkovec

MAINTAINERS update

2016-10-28 Thread Carl E. Love

Hi,

I added myself to the MAINTAINERS file (Write After Approval) on
10/27/2016.  The commit was r241636.

 Carl Love

[PATCH] Implement std::filesystem for C++17

2016-10-28 Thread Jonathan Wakely


Here's a patch to move std::experimental::filesystem to
std::filesystem, and then add using declarations to pull it all back
into std::experimental::filesystem.
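
The general shape of that arrangement, reduced to a toy example, is sketched below.  The `demo` namespace and the `fs1` inline namespace are hypothetical stand-ins (user code cannot add to std), and `path` and `exists` here are trivial placeholders rather than the library's definitions.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// "demo" is a hypothetical top-level namespace standing in for std.
namespace demo
{
namespace filesystem
{
inline namespace fs1
{
  struct path
  {
    std::string text;
    bool empty () const { return text.empty (); }
  };

  inline bool exists (const path &p) { return !p.empty (); }
}
}

namespace experimental
{
namespace filesystem
{
inline namespace v1
{
  // Re-export the definitions under the old names.
  using demo::filesystem::path;
  using demo::filesystem::exists;
}
}
}
}

// Both spellings name the same type, so objects can be passed freely
// between code written against either namespace.
static_assert (std::is_same<demo::filesystem::path,
			    demo::experimental::filesystem::path>::value,
	       "using-declaration re-exports the same type");
```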

The definitions are still in the separate libstdc++fs.a archive, not
in libstdc++.so, and I plan to keep it that way for GCC 7, because there
are some changes likely to happen to the spec that might affect the
API and ABI.

I'll probably commit this in about a week, during the Issaquah
meeting.


commit a287ae40c1cb242f5b56b0d12eef4464704292d0
Author: Jonathan Wakely 
Date:   Fri Oct 21 13:57:32 2016 +0100

std::filesystem

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index a1190bc..a01fa86 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -662,14 +662,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
   Adopt the File System TS for C++17	 
   
 	http://www.w3.org/1999/xlink; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0218r1.html;>
 	P0218R1
 	
   
-   No 
+   7 
__has_include(filesystem) ,
 	  __cpp_lib_filesystem >= 201603 
 
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 15a164e..cfd82bd 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -36,6 +36,7 @@ std_headers = \
 	${std_srcdir}/complex \
 	${std_srcdir}/condition_variable \
 	${std_srcdir}/deque \
+	${std_srcdir}/filesystem \
 	${std_srcdir}/forward_list \
 	${std_srcdir}/fstream \
 	${std_srcdir}/functional \
diff --git a/libstdc++-v3/include/experimental/bits/fs_dir.h b/libstdc++-v3/include/experimental/bits/fs_dir.h
index 70a95eb..b42fec7 100644
--- a/libstdc++-v3/include/experimental/bits/fs_dir.h
+++ b/libstdc++-v3/include/experimental/bits/fs_dir.h
@@ -40,11 +40,9 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
-namespace experimental
-{
 namespace filesystem
 {
-inline namespace v1
+inline namespace __fs1
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
@@ -353,9 +351,8 @@ _GLIBCXX_END_NAMESPACE_CXX11
 
   // @} group filesystem
 _GLIBCXX_END_NAMESPACE_VERSION
-} // namespace v1
+} // namespace __fs1
 } // namespace filesystem
-} // namespace experimental
 } // namespace std
 
 #endif // C++11
diff --git a/libstdc++-v3/include/experimental/bits/fs_fwd.h b/libstdc++-v3/include/experimental/bits/fs_fwd.h
index fb8521a..ad17465 100644
--- a/libstdc++-v3/include/experimental/bits/fs_fwd.h
+++ b/libstdc++-v3/include/experimental/bits/fs_fwd.h
@@ -40,11 +40,9 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
-namespace experimental
-{
 namespace filesystem
 {
-inline namespace v1
+inline namespace __fs1
 {
 #if _GLIBCXX_INLINE_VERSION
 inline namespace __7 { }
@@ -283,9 +281,8 @@ _GLIBCXX_END_NAMESPACE_CXX11
 
   // @} group filesystem
 _GLIBCXX_END_NAMESPACE_VERSION
-} // namespace v1
+} // namespace __fs1
 } // namespace filesystem
-} // namespace experimental
 } // namespace std
 
 #endif // C++11
diff --git a/libstdc++-v3/include/experimental/bits/fs_ops.h b/libstdc++-v3/include/experimental/bits/fs_ops.h
index 62a9826..9ff300d 100644
--- a/libstdc++-v3/include/experimental/bits/fs_ops.h
+++ b/libstdc++-v3/include/experimental/bits/fs_ops.h
@@ -38,11 +38,9 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
-namespace experimental
-{
 namespace filesystem
 {
-inline namespace v1
+inline namespace __fs1
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
@@ -287,9 +285,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // @} group filesystem
 _GLIBCXX_END_NAMESPACE_VERSION
-} // namespace v1
+} // namespace __fs1
 } // namespace filesystem
-} // namespace experimental
 } // namespace std
 
 #endif // C++11
diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h
index 4d7291f..35d39dc 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -52,11 +52,9 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
-namespace experimental
-{
 namespace filesystem
 {
-inline namespace v1
+inline namespace __fs1
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_BEGIN_NAMESPACE_CXX11
@@ -337,6 +335,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 bool is_absolute() const;
 bool is_relative() const { return !is_absolute(); }
 
+#if __cplusplus > 201402L
+// TODO generation
+path lexically_normal() const;
+path lexically_relative(const path& __base) const;
+path lexically_proximate(const path& __base) const;
+#endif
+
 // iterators
 class iterator;
 typedef iterator const_iterator;
@@ -1028,9 +1033,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   // @} group filesystem
 _GLIBCXX_END_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_VERSION
-} // namespace v1
+} // namespace __fs1
 } // namespace filesystem
-} // namespace experimental
 } // namespace std
 
 #endif // C++11
diff --git

Re: [PATCH] sched: Do not mix prologue and epilogue insns

2016-10-28 Thread Bernd Schmidt


On 10/28/2016 03:37 PM, Segher Boessenkool wrote:

This patch makes scheduling not reorder prologue insns relative to
epilogue insns and vice versa.  This fixes PR78029.


This seems good to me.


Bernd

Re: [PATCH] Fix and testcases for pr72747

2016-10-28 Thread Will Schmidt

On Fri, 2016-10-28 at 08:31 -0500, Will Schmidt wrote:
> On Fri, 2016-10-28 at 10:38 +0200, Richard Biener wrote:
> > On Thu, Oct 27, 2016 at 5:37 PM, Will Schmidt  
> > wrote:
> > > Hi,
> > >
> > > Per PR72747, a statement such as "v = vec_splats (1);" correctly
> > > initializes a vector.  However, a statement such as "v[1] = v[0] =
> > > vec_splats (1);" initializes both v[1] and v[0] to random garbage.
> > >
> > > It has been determined that this is occurring because we did not emit
> > > the actual initialization statement before our final exit from
> > > gimplify_init_constructor, at which time we lose the expression when we
> > > assign *expr_p to either NULL or object.  This problem affected both
> > > constant and non-constant initializers.  Corrected this by moving the
> > > logic to emit the statement up earlier within the if/else logic.
> > >
> > > Bootstrapped and make check ran without regressions on
> > > powerpc64le-unknown-linux-gnu.
> > >
> > > OK for trunk?
> > 
> > Ok.
> > 
> > Richard.
> 
> Committed revision 241647.
> Thanks.   

Oops, forgot to ask.  After some burn-in time, is this OK to backport to
the 5 and 6 branches? 

The bug was first noticed in GCC 5.

Thanks,
-Will
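
The source-level pattern that the two new tests exercise can be sketched
outside the testsuite.  This is only a minimal illustration, assuming GCC's
generic vector extension in place of AltiVec; splat4 and cascaded_splat_ok
are hypothetical names and are not part of the patch.

```cpp
#include <cassert>

// GNU vector extension (GCC/Clang), used here as a stand-in for the
// AltiVec "__vector int" type that the real tests use.
typedef int v4si __attribute__ ((vector_size (16)));

// Hypothetical helper: copy one value into all four lanes.
inline v4si splat4 (int x)
{
  v4si r = { x, x, x, x };
  return r;
}

// Cascaded assignment: after "a = b = splat4 (x)" both a and b must hold
// the splat value.  PR72747 was about the inner assignment being lost.
inline bool cascaded_splat_ok (int x)
{
  v4si a, b;
  a = b = splat4 (x);
  for (int i = 0; i < 4; ++i)
    if (a[i] != x || b[i] != x)
      return false;
  return true;
}

// Small usage example.
inline void self_test ()
{
  assert (cascaded_splat_ok (42));
}
```

The real tests check the gimple dump rather than run-time values, so a
run-time check like this one is a weaker, complementary sanity check.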

> 
> 
> > 
> > > Thanks,
> > > -Will
> > >
> > > gcc:
> > > 2016-10-26  Will Schmidt 
> > >
> > > PR middle-end/72747
> > > * gimplify.c (gimplify_init_constructor): Move emit of constructor
> > > assignment to earlier in the if/else logic.
> > >
> > > testsuite:
> > > 2016-10-26  Will Schmidt 
> > >
> > > PR middle-end/72747
> > > * c-c++-common/pr72747-1.c: New test.
> > > * c-c++-common/pr72747-2.c: Likewise.
> > >
> > > Index: gcc/gimplify.c
> > > ===
> > > --- gcc/gimplify.c  (revision 241600)
> > > +++ gcc/gimplify.c  (working copy)
> > > @@ -4730,24 +4730,23 @@
> > >
> > >if (ret == GS_ERROR)
> > >  return GS_ERROR;
> > > -  else if (want_value)
> > > +  /* If we have gimplified both sides of the initializer but have
> > > + not emitted an assignment, do so now.  */
> > > +  if (*expr_p)
> > >  {
> > > +  tree lhs = TREE_OPERAND (*expr_p, 0);
> > > +  tree rhs = TREE_OPERAND (*expr_p, 1);
> > > +  gassign *init = gimple_build_assign (lhs, rhs);
> > > +  gimplify_seq_add_stmt (pre_p, init);
> > > +}
> > > +  if (want_value)
> > > +{
> > >*expr_p = object;
> > >return GS_OK;
> > >  }
> > >else
> > >  {
> > > -  /* If we have gimplified both sides of the initializer but have
> > > -not emitted an assignment, do so now.  */
> > > -  if (*expr_p)
> > > -   {
> > > - tree lhs = TREE_OPERAND (*expr_p, 0);
> > > - tree rhs = TREE_OPERAND (*expr_p, 1);
> > > - gassign *init = gimple_build_assign (lhs, rhs);
> > > - gimplify_seq_add_stmt (pre_p, init);
> > > - *expr_p = NULL;
> > > -   }
> > > -
> > > +  *expr_p = NULL;
> > >return GS_ALL_DONE;
> > >  }
> > >  }
> > > Index: gcc/testsuite/c-c++-common/pr72747-1.c
> > > ===
> > > --- gcc/testsuite/c-c++-common/pr72747-1.c  (revision 0)
> > > +++ gcc/testsuite/c-c++-common/pr72747-1.c  (working copy)
> > > @@ -0,0 +1,16 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > > +/* { dg-options "-maltivec -fdump-tree-gimple" } */
> > > +
> > > +/* PR 72747: Test that cascaded definition is happening for constant vectors. */
> > > +
> > > +#include <altivec.h>
> > > +
> > > +int main (int argc, char *argv[])
> > > +{
> > > +   __vector int v1,v2;
> > > +   v1 = v2 = vec_splats ((int) 42);
> > > +   return 0;
> > > +}
> > > +/* { dg-final { scan-tree-dump-times " v2 = { 42, 42, 42, 42 }" 1 "gimple" } } */
> > > +
> > > Index: gcc/testsuite/c-c++-common/pr72747-2.c
> > > ===
> > > --- gcc/testsuite/c-c++-common/pr72747-2.c  (revision 0)
> > > +++ gcc/testsuite/c-c++-common/pr72747-2.c  (working copy)
> > > @@ -0,0 +1,18 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > > +/* { dg-options "-c -maltivec -fdump-tree-gimple" } */
> > > +
> > > +/* PR 72747: test that cascaded definition is happening for non constants. */
> > > +
> > > +void foo ()
> > > +{
> > > +  extern int i;
> > > +  __vector int v,w;
> > > +v = w = (vector int) { i };
> > > +}
> > > +
> > > +int main (int argc, char *argv[])
> > > +{
> > > +  return 0;
> > > +}
> > > +/* { dg-final { scan-tree-dump-times " w = {i.0_1}" 1 "gimple" } } */
> > >
> > >
> > 
> 
>

Re: [PATCH] GIMPLE store merging pass

2016-10-28 Thread Kyrill Tkachov



On 27/10/16 14:31, Richard Biener wrote:

On Mon, 24 Oct 2016, Kyrill Tkachov wrote:


Hi all,

This is a slight update over [1] with Richard's feedback addressed.
In terminate_all_aliasing_chains we now terminate the chain early if
the destination is writing to a base offset by a variable amount.
This avoids walking the store chain and performing more alias checks.

The param max-stores-to-merge is introduced to limit the number of statements
we merge. Its default value is set to 64, which should be enough for now and
avoids blowing up compile time in cases such as [2].

I've also introduced a timevar for the pass to allow us to track it in
-ftime-report.
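
For readers new to the pass, the kind of code it targets can be sketched as
follows.  This is a hedged illustration only: Header, init_bytewise and
init_merged are hypothetical names, and the hand-merged version merely shows
the effect one would hope for, not what the pass is guaranteed to emit.

```cpp
#include <cassert>
#include <cstring>

// Four adjacent one-byte fields with no padding between them.
struct Header
{
  unsigned char a, b, c, d;
};

static_assert (sizeof (Header) == 4, "Header is expected to be 4 bytes");

// Four narrow constant stores to adjacent locations: the sort of
// sequence a store merging pass can consider combining.
inline void init_bytewise (Header *h)
{
  h->a = 1;
  h->b = 2;
  h->c = 3;
  h->d = 4;
}

// Hand-written equivalent: one 4-byte copy from a constant image.
inline void init_merged (Header *h)
{
  static const unsigned char image[4] = { 1, 2, 3, 4 };
  std::memcpy (h, image, sizeof image);
}

// Both versions must leave identical bytes in memory.
inline bool same_result ()
{
  Header x, y;
  init_bytewise (&x);
  init_merged (&y);
  return std::memcmp (&x, &y, sizeof (Header)) == 0;
}
```

Comparing the bytes with memcmp keeps the check independent of endianness.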

Bootstrapped and tested on aarch64, arm, x86_64.

Ok?

Ok if you add documentation for the two new --params.


Thanks!
I've added the documentation and this is what I'll be committing.

Thanks again for the review.
Kyrill


2016-10-28  Kyrylo Tkachov  

PR middle-end/22141
* Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
* common.opt (fstore-merging): New Optimization option.
* opts.c (default_options_table): Add entry for
OPT_ftree_store_merging.
* fold-const.h (can_native_encode_type_p): Declare prototype.
* fold-const.c (can_native_encode_type_p): Define.
* params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
(PARAM_MAX_STORES_TO_MERGE): Likewise.
* timevar.def (TV_GIMPLE_STORE_MERGING): New timevar.
* passes.def: Insert pass_tree_store_merging.
* tree-pass.h (make_pass_store_merging): Declare extern
prototype.
* gimple-ssa-store-merging.c: New file.
* doc/invoke.texi (Optimization Options): Document
-fstore-merging.
(--param documentation): Document store-merging-allow-unaligned
and max-stores-to-merge.

2016-10-28  Kyrylo Tkachov  
Jakub Jelinek  
Andrew Pinski  

PR middle-end/22141
PR rtl-optimization/23684
* gcc.c-torture/execute/pr22141-1.c: New test.
* gcc.c-torture/execute/pr22141-2.c: Likewise.
* gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
* gcc.target/aarch64/ldp_stp_4.c: Likewise.
* gcc.dg/store_merging_1.c: New test.
* gcc.dg/store_merging_2.c: Likewise.
* gcc.dg/store_merging_3.c: Likewise.
* gcc.dg/store_merging_4.c: Likewise.
* gcc.dg/store_merging_5.c: Likewise.
* gcc.dg/store_merging_6.c: Likewise.
* gcc.dg/store_merging_7.c: Likewise.
* gcc.target/i386/pr22141.c: Likewise.
* gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
* g++.dg/init/new17.C: Likewise.


Thanks,
Richard.


Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01459.html
[2] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01880.html

2016-10-24  Kyrylo Tkachov  

 PR middle-end/22141
 * Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
 * common.opt (fstore-merging): New Optimization option.
 * opts.c (default_options_table): Add entry for
 OPT_ftree_store_merging.
 * fold-const.h (can_native_encode_type_p): Declare prototype.
 * fold-const.c (can_native_encode_type_p): Define.
 * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
 (PARAM_MAX_STORES_TO_MERGE): Likewise.
 * timevar.def (TV_GIMPLE_STORE_MERGING): New timevar.
 * passes.def: Insert pass_tree_store_merging.
 * tree-pass.h (make_pass_store_merging): Declare extern
 prototype.
 * gimple-ssa-store-merging.c: New file.
 * doc/invoke.texi (Optimization Options): Document
 -fstore-merging.

2016-10-24  Kyrylo Tkachov  
 Jakub Jelinek  
 Andrew Pinski  

 PR middle-end/22141
 PR rtl-optimization/23684
 * gcc.c-torture/execute/pr22141-1.c: New test.
 * gcc.c-torture/execute/pr22141-2.c: Likewise.
 * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
 * gcc.target/aarch64/ldp_stp_4.c: Likewise.
 * gcc.dg/store_merging_1.c: New test.
 * gcc.dg/store_merging_2.c: Likewise.
 * gcc.dg/store_merging_3.c: Likewise.
 * gcc.dg/store_merging_4.c: Likewise.
 * gcc.dg/store_merging_5.c: Likewise.
 * gcc.dg/store_merging_6.c: Likewise.
 * gcc.dg/store_merging_7.c: Likewise.
 * gcc.target/i386/pr22141.c: Likewise.
 * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
 * g++.dg/init/new17.C: Likewise.



diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d5fd1048d7dca71cbbf20b4f3fca129ebfb34dea..622d038f6fb50c35ff48bc84a69967071aa17e90 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1296,6 +1296,7 @@ OBJS = \
 	gimple-ssa-isolate-paths.o \
 	gimple-ssa-nonnull-compare.o \
 	gimple-ssa-split-paths.o \
+	gimple-ssa-store-merging.o \
 	gimple-ssa-strength-reduction.o \
 	gimple-ssa-sprintf.o \
 	gimple-ssa-warn-alloca.o \
diff --git a/gcc/common.opt

[PATCH] Implement std::launder for C++17

2016-10-28 Thread Jonathan Wakely


This implements std::launder, using Jakub's new __builtin_launder.

In order to allow our headers to be used with Clang (which doesn't
implement all our new built-ins yet) I've added a couple of checks
using Clang's __has_builtin.

My initial version of std::launder used
static_assert(!is_function_v<_Tp> && !is_void_v<_Tp>) which required
including <type_traits>, and that uses the new built-in
__has_unique_object_representations, so I guarded that too. In the end
I decided not to include <type_traits> in <new> and dealt with the
function and void cases using deleted overloads of std::launder
instead.
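
The deleted-overload technique can be tried in isolation.  The sketch below
is an assumption-laden stand-in: launder_like and accepts are hypothetical
names, and launder_like simply returns its argument instead of calling
__builtin_launder, so it is not an optimization barrier.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch
{
  // Stand-in for the primary template; the real one calls the built-in.
  template<typename T>
    constexpr T* launder_like (T* p) noexcept
    { return p; }

  // Function pointers: the more specialized templates win and are deleted.
  template<typename R, typename... A>
    void launder_like (R (*)(A...)) = delete;
  template<typename R, typename... A>
    void launder_like (R (*)(A..., ...)) = delete;

  // (Possibly cv-qualified) void pointers: non-templates win and are deleted.
  void launder_like (void*) = delete;
  void launder_like (const void*) = delete;
  void launder_like (volatile void*) = delete;
  void launder_like (const volatile void*) = delete;

  // Detects whether launder_like (P) is a usable call expression.
  template<typename P, typename = void>
    struct accepts : std::false_type { };
  template<typename P>
    struct accepts<P, decltype (void (launder_like (std::declval<P> ())))>
    : std::true_type { };

  // Small usage example.
  inline void self_test ()
  {
    int x = 7;
    assert (*launder_like (&x) == 7);
    assert (accepts<int*>::value);
    assert (!accepts<void*>::value);
  }
}
```

One appeal of this design is that the rejection happens in overload
resolution, so no type-traits header is needed at the point of definition.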

* doc/xml/manual/status_cxx2017.xml: Update status.
* doc/html/*: Regenerate.
* include/std/type_traits (has_unique_object_representations): Guard
with __has_builtin check.
* libsupc++/new (launder): Define for C++17.
* testsuite/18_support/launder/1.cc: New test.
* testsuite/18_support/launder/requirements.cc: New test.
* testsuite/18_support/launder/requirements_neg.cc: New test.

Tested powerpc64le-linux, committed to trunk.

commit 802f00ede1c4ade0bfa0d6f8a3f4d212296e75bd
Author: Jonathan Wakely 
Date:   Thu Oct 20 21:52:39 2016 +0100

Implement std::launder for C++17

* doc/xml/manual/status_cxx2017.xml: Update status.
* doc/html/*: Regenerate.
* include/std/type_traits (has_unique_object_representations): Guard
with __has_builtin check.
* libsupc++/new (launder): Define for C++17.
* testsuite/18_support/launder/1.cc: New test.
* testsuite/18_support/launder/requirements.cc: New test.
* testsuite/18_support/launder/requirements_neg.cc: New test.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index a1190bc..d008bd9e 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -69,14 +69,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
Core Issue 1776: Replacement of class objects containing 
reference members
   
http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0137r1.html;>
P0137R1

   
-   No 
+   7 
__cpp_lib_launder >= 201606 
 
 
@@ -281,7 +280,7 @@ Feature-testing recommendations for C++.
N4089

   
-   5 ? 
+   6 
   
 
 
@@ -292,7 +291,7 @@ Feature-testing recommendations for C++.
  N4366

   
-   5 ? 
+   6 
   
 
 
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index d402b5b..6824c9e 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3041,6 +3041,14 @@ template 
 template <typename _From, typename _To>
   constexpr bool is_convertible_v = is_convertible<_From, _To>::value;
 
+#ifdef __has_builtin
+# if !__has_builtin(__has_unique_object_representations)
+// Try not to break non-GNU compilers that don't support the built-in:
+#  define _GLIBCXX_NO_BUILTIN_HAS_UNIQ_OBJ_REP 1
+# endif
+#endif
+
+#ifndef _GLIBCXX_NO_BUILTIN_HAS_UNIQ_OBJ_REP
 # define __cpp_lib_has_unique_object_representations 201606
   /// has_unique_object_representations
   template
@@ -3049,6 +3057,8 @@ template 
   remove_cv_t>
   )>
 { };
+#endif
+#undef _GLIBCXX_NO_BUILTIN_HAS_UNIQ_OBJ_REP
 
 #endif // C++17
 
diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
index 477fadc..6bc6c00 100644
--- a/libstdc++-v3/libsupc++/new
+++ b/libstdc++-v3/libsupc++/new
@@ -176,6 +176,41 @@ inline void operator delete[](void*, void*) _GLIBCXX_USE_NOEXCEPT { }
 //@}
 } // extern "C++"
 
+#if __cplusplus > 201402L
+#ifdef __has_builtin
+# if !__has_builtin(__builtin_launder)
+// Try not to break non-GNU compilers that don't support the built-in:
+#  define _GLIBCXX_NO_BUILTIN_LAUNDER 1
+# endif
+#endif
+
+#ifndef _GLIBCXX_NO_BUILTIN_LAUNDER
+namespace std
+{
+#define __cpp_lib_launder 201606
+  /// Pointer optimization barrier [ptr.launder]
+  template<typename _Tp>
+    constexpr _Tp*
+    launder(_Tp* __p) noexcept
+    { return __builtin_launder(__p); }
+
+  // The program is ill-formed if T is a function type or
+  // (possibly cv-qualified) void.
+
+  template<typename _Ret, typename... _Args>
+    void launder(_Ret (*)(_Args...)) = delete;
+  template<typename _Ret, typename... _Args>
+    void launder(_Ret (*)(_Args......)) = delete;
+
+  void launder(void*) = delete;
+  void launder(const void*) = delete;
+  void launder(volatile void*) = delete;
+  void launder(const volatile void*) = delete;
+}
+#endif // _GLIBCXX_NO_BUILTIN_LAUNDER
+#undef _GLIBCXX_NO_BUILTIN_LAUNDER
+#endif // C++17
+
 #pragma GCC visibility pop
 
 #endif
diff --git a/libstdc++-v3/testsuite/18_support/launder/1.cc b/libstdc++-v3/testsuite/18_support/launder/1.cc
new file mode 100644
index 0000000..51096a3
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/launder/1.cc
@@ -0,0 +1,56 @@
+//

Re: [PATCH, LIBGCC] Avoid count_leading_zeros with undefined result (PR 78067)

2016-10-28 Thread Bernd Edlinger

On 10/27/16 22:23, Joseph Myers wrote:
> On Thu, 27 Oct 2016, Bernd Edlinger wrote:
>
>> Hi,
>>
>> by code reading I became aware that libgcc can call count_leading_zeros
>> in certain cases which can give undefined results.  This happens on
>> signed int128 -> float or double conversions, when the int128 is in the range
>> INT64_MAX+1 to UINT64_MAX.
>
> I'd expect testcases added to the testsuite that exercise this case at
> runtime, if not already present.
>

Yes, thanks.  I somehow expected there were already test cases,
somewhere, but now when you ask that, I begin to doubt as well...

I will try to add an asm("int 3") and see if that gets hit at all.


Bernd.
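
The guard under discussion can be illustrated with a small wrapper that gives
a defined answer for a zero input.  This is a sketch, assuming a 64-bit word
and the GCC/Clang built-in __builtin_clzll (whose result is undefined for 0);
clz64_defined is a hypothetical name, not something in libgcc.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Leading-zero count of a 64-bit word that is also defined for zero.
// __builtin_clzll (0) is undefined, so zero is handled explicitly and
// reported as the full word width.
inline unsigned clz64_defined (std::uint64_t x)
{
  const unsigned width = sizeof (x) * CHAR_BIT;
  if (x == 0)
    return width;
  return static_cast<unsigned> (__builtin_clzll (x));
}

// Small usage example.
inline void self_test ()
{
  assert (clz64_defined (0) == 64);
  assert (clz64_defined (1) == 63);
}
```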

Re: Fix PR77309, combine eliminates sign bit comparison

2016-10-28 Thread Segher Boessenkool

Hi Bernd,

On Fri, Oct 28, 2016 at 01:18:19PM +0200, Bernd Schmidt wrote:
> In this PR, we manage to simplify the code down to
> 
> (lt (and (reg) (signbit)) (const 0))
> 
> simplify_comparison then calls make_compound_operation on the AND 
> expression, and that turns it into a ZERO_EXTRACT of a single bit, 
> changing the meaning of the comparison.
> 
> The problem is a special case we have for comparisons in 
> make_compound_operation, it thinks it should convert single-bit ANDs 
> into such extractions. But this is only valid if the bit isn't the sign 
> bit, or if we're testing for equality with zero.
> 
> The following patch was bootstrapped and tested on x86_64-linux. Ok?

Okay, thanks!


Segher
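
The difference Bernd describes can be reproduced at the source level.  The
following is only an illustration under assumptions: it uses 32-bit integers,
and the function names are hypothetical; it models the zero-extended
single-bit extract by hand rather than showing what combine generates.

```cpp
#include <cassert>
#include <cstdint>

// Signed comparison on the masked value: true when the sign bit is set,
// because (x & sign bit) is then the most negative value.
inline bool masked_is_negative (std::int32_t x)
{
  return (x & INT32_MIN) < 0;
}

// A zero-extended one-bit extract yields 0 or 1, which is never negative,
// so the same "less than zero" test changes meaning.
inline bool extracted_is_negative (std::int32_t x)
{
  const std::uint32_t bit = (static_cast<std::uint32_t> (x) >> 31) & 1u;
  return static_cast<std::int32_t> (bit) < 0;
}

// Equality with zero is preserved by the extraction.
inline bool masked_is_nonzero (std::int32_t x)
{
  return (x & INT32_MIN) != 0;
}

inline bool extracted_is_nonzero (std::int32_t x)
{
  return ((static_cast<std::uint32_t> (x) >> 31) & 1u) != 0;
}

// Small usage example.
inline void self_test ()
{
  assert (masked_is_negative (-1));
  assert (!extracted_is_negative (-1));
}
```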

[PATCH] sched: Do not mix prologue and epilogue insns

2016-10-28 Thread Segher Boessenkool

This patch makes scheduling not reorder prologue insns relative to
epilogue insns and vice versa.  This fixes PR78029.

The problem in that PR:
We have two insns, in this order:

(insn/f 300 299 267 8 (set (reg:DI 65 lr)
(reg:DI 0 0)) 579 {*movdi_internal64}
 (expr_list:REG_DEAD (reg:DI 0 0)
(expr_list:REG_CFA_RESTORE (reg:DI 65 lr)
(nil
...
(insn/f 310 268 134 8 (set (mem/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 144 [0x90])) [6  S8 A8])
(reg:DI 0 0)) 579 {*movdi_internal64}
 (expr_list:REG_DEAD (reg:DI 0 0)
(expr_list:REG_CFA_OFFSET (set (mem/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 144 [0x90])) [6  S8 A8])
(reg:DI 65 lr))
(nil

and sched swaps them (when compiling for power6, it tries to put memory
stores together, so insn 310 is moved up past 300 to go together with
some other store).  But the REG_CFA_RESTORE and REG_CFA_OFFSET cannot be
swapped (they both say where the orig value of LR now lives).

Tested on powerpc64-linux {-m32,-m64}, no regressions.

Is this okay for trunk?


Segher
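
The bookkeeping added to sched_analyze_insn can be modelled on a toy
instruction stream.  This is a simplified sketch under assumptions: insns are
plain indices, Kind, LogueDeps and logue_dependences are hypothetical names,
and the lists are kept in insertion order, which is not claimed to match the
order of the real INSN_LISTs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Prologue, Epilogue, Other };

// Mirrors the three new fields described for struct deps_desc.
struct LogueDeps
{
  std::vector<int> last_prologue;
  std::vector<int> last_epilogue;
  bool last_logue_was_epilogue = false;
};

// For every insn, list the earlier insns it must not be moved above.
inline std::vector<std::vector<int>>
logue_dependences (const std::vector<Kind> &insns)
{
  LogueDeps deps;
  std::vector<std::vector<int>> result (insns.size ());
  for (std::size_t i = 0; i < insns.size (); ++i)
    {
      const int id = static_cast<int> (i);
      if (insns[i] == Kind::Prologue)
        {
          // Depend on the most recent run of epilogue insns.
          result[i] = deps.last_epilogue;
          if (deps.last_logue_was_epilogue)
            deps.last_prologue.clear ();
          deps.last_prologue.push_back (id);
          deps.last_logue_was_epilogue = false;
        }
      else if (insns[i] == Kind::Epilogue)
        {
          // Depend on the most recent run of prologue insns.
          result[i] = deps.last_prologue;
          if (!deps.last_logue_was_epilogue)
            deps.last_epilogue.clear ();
          deps.last_epilogue.push_back (id);
          deps.last_logue_was_epilogue = true;
        }
    }
  return result;
}

// Example stream: P0 P1 E2 P3 E4.
inline std::vector<std::vector<int>> example_dependences ()
{
  return logue_dependences ({ Kind::Prologue, Kind::Prologue,
                              Kind::Epilogue, Kind::Prologue,
                              Kind::Epilogue });
}
```

In the example stream, the epilogue insn at index 2 is held behind both
leading prologue insns, and each later insn is held behind only the
immediately preceding run of the other kind.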


2016-10-28  Segher Boessenkool  

PR rtl-optimization/78029
* function.c (prologue_contains, epilogue_contains): New functions.
(record_prologue_seq, record_epilogue_seq): New functions.
* function.h (prologue_contains, epilogue_contains,
record_prologue_seq, record_epilogue_seq): New declarations.
* sched-deps.c (sched_analyze_insn): Make dependencies to prevent
mixing prologue and epilogue insns.
(init_deps): Initialize the new fields in struct deps_desc.
* sched-int.h (struct deps_desc): New fields last_prologue,
last_epilogue, and last_logue_was_epilogue.
* shrink-wrap.c (emit_common_heads_for_components): Record all
emitted prologue and epilogue insns.
(emit_common_tails_for_components): Ditto.
(insert_prologue_epilogue_for_components): Ditto.

---
 gcc/function.c| 23 +++
 gcc/function.h|  4 
 gcc/sched-deps.c  | 28 
 gcc/sched-int.h   | 11 +++
 gcc/shrink-wrap.c |  6 ++
 5 files changed, 72 insertions(+)

diff --git a/gcc/function.c b/gcc/function.c
index ea40ad1..0b1d168 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5752,6 +5752,18 @@ contains (const_rtx insn, hash_table 
*hash)
 }
 
 int
+prologue_contains (const_rtx insn)
+{
+  return contains (insn, prologue_insn_hash);
+}
+
+int
+epilogue_contains (const_rtx insn)
+{
+  return contains (insn, epilogue_insn_hash);
+}
+
+int
 prologue_epilogue_contains (const_rtx insn)
 {
   if (contains (insn, prologue_insn_hash))
@@ -5761,6 +5773,17 @@ prologue_epilogue_contains (const_rtx insn)
   return 0;
 }
 
+void
+record_prologue_seq (rtx_insn *seq)
+{
+  record_insns (seq, NULL, &prologue_insn_hash);
+}
+
+void
+record_epilogue_seq (rtx_insn *seq)
+{
+  record_insns (seq, NULL, &epilogue_insn_hash);
+}
 
 /* Set JUMP_LABEL for a return insn.  */
 
diff --git a/gcc/function.h b/gcc/function.h
index 590a490..e854c7f 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -628,7 +628,11 @@ extern void clobber_return_register (void);
 extern void expand_function_end (void);
 extern rtx get_arg_pointer_save_area (void);
 extern void maybe_copy_prologue_epilogue_insn (rtx, rtx);
+extern int prologue_contains (const_rtx);
+extern int epilogue_contains (const_rtx);
 extern int prologue_epilogue_contains (const_rtx);
+extern void record_prologue_seq (rtx_insn *);
+extern void record_epilogue_seq (rtx_insn *);
 extern void emit_return_into_block (bool simple_p, basic_block bb);
 extern void set_return_jump_label (rtx_insn *);
 extern bool active_insn_between (rtx_insn *head, rtx_insn *tail);
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index 6cd8332..1ebd776 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -3502,6 +3502,31 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx_insn *insn)
   if (!deps->readonly)
deps->last_args_size = insn;
 }
+
+  /* We must not mix prologue and epilogue insns.  See PR78029.  */
+  if (prologue_contains (insn))
+{
+  add_dependence_list (insn, deps->last_epilogue, true, REG_DEP_ANTI, true);
+  if (!deps->readonly)
+   {
+ if (deps->last_logue_was_epilogue)
+   free_INSN_LIST_list (&deps->last_prologue);
+ deps->last_prologue = alloc_INSN_LIST (insn, deps->last_prologue);
+ deps->last_logue_was_epilogue = false;
+   }
+}
+
+  if (epilogue_contains (insn))
+{
+  add_dependence_list (insn, deps->last_prologue, true, REG_DEP_ANTI, true);
+  if (!deps->readonly)
+   {
+ if (!deps->last_logue_was_epilogue)
+   free_INSN_LIST_list (&deps->last_epilogue);
+ deps->last_epilogue = alloc_INSN_LIST (insn, deps->last_epilogue);
+ deps->last_logue_was_epilogue = true;
+   }
+}
 }
 
 /* Return TRUE if INSN

Re: [PATCH] Fix and testcases for pr72747

2016-10-28 Thread Will Schmidt

On Fri, 2016-10-28 at 10:38 +0200, Richard Biener wrote:
> On Thu, Oct 27, 2016 at 5:37 PM, Will Schmidt  
> wrote:
> > Hi,
> >
> > Per PR72747, a statement such as "v = vec_splats (1);" correctly
> > initializes a vector.  However, a statement such as "v[1] = v[0] =
> > vec_splats (1);" initializes both v[1] and v[0] to random garbage.
> >
> > It has been determined that this is occurring because we did not emit
> > the actual initialization statement before our final exit from
> > gimplify_init_constructor, at which time we lose the expression when we
> > assign *expr_p to either NULL or object.  This problem affected both
> > constant and non-constant initializers.  Corrected this by moving the
> > logic to emit the statement up earlier within the if/else logic.
> >
> > Bootstrapped and make check ran without regressions on
> > powerpc64le-unknown-linux-gnu.
> >
> > OK for trunk?
> 
> Ok.
> 
> Richard.

Committed revision 241647.
Thanks.   


> 
> > Thanks,
> > -Will
> >
> > gcc:
> > 2016-10-26  Will Schmidt 
> >
> > PR middle-end/72747
> > * gimplify.c (gimplify_init_constructor): Move emit of constructor
> > assignment to earlier in the if/else logic.
> >
> > testsuite:
> > 2016-10-26  Will Schmidt 
> >
> > PR middle-end/72747
> > * c-c++-common/pr72747-1.c: New test.
> > * c-c++-common/pr72747-2.c: Likewise.
> >
> > Index: gcc/gimplify.c
> > ===
> > --- gcc/gimplify.c  (revision 241600)
> > +++ gcc/gimplify.c  (working copy)
> > @@ -4730,24 +4730,23 @@
> >
> >if (ret == GS_ERROR)
> >  return GS_ERROR;
> > -  else if (want_value)
> > +  /* If we have gimplified both sides of the initializer but have
> > + not emitted an assignment, do so now.  */
> > +  if (*expr_p)
> >  {
> > +  tree lhs = TREE_OPERAND (*expr_p, 0);
> > +  tree rhs = TREE_OPERAND (*expr_p, 1);
> > +  gassign *init = gimple_build_assign (lhs, rhs);
> > +  gimplify_seq_add_stmt (pre_p, init);
> > +}
> > +  if (want_value)
> > +{
> >*expr_p = object;
> >return GS_OK;
> >  }
> >else
> >  {
> > -  /* If we have gimplified both sides of the initializer but have
> > -not emitted an assignment, do so now.  */
> > -  if (*expr_p)
> > -   {
> > - tree lhs = TREE_OPERAND (*expr_p, 0);
> > - tree rhs = TREE_OPERAND (*expr_p, 1);
> > - gassign *init = gimple_build_assign (lhs, rhs);
> > - gimplify_seq_add_stmt (pre_p, init);
> > - *expr_p = NULL;
> > -   }
> > -
> > +  *expr_p = NULL;
> >return GS_ALL_DONE;
> >  }
> >  }
> > Index: gcc/testsuite/c-c++-common/pr72747-1.c
> > ===
> > --- gcc/testsuite/c-c++-common/pr72747-1.c  (revision 0)
> > +++ gcc/testsuite/c-c++-common/pr72747-1.c  (working copy)
> > @@ -0,0 +1,16 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > +/* { dg-options "-maltivec -fdump-tree-gimple" } */
> > +
> > +/* PR 72747: Test that cascaded definition is happening for constant vectors. */
> > +
> > +#include <altivec.h>
> > +
> > +int main (int argc, char *argv[])
> > +{
> > +   __vector int v1,v2;
> > +   v1 = v2 = vec_splats ((int) 42);
> > +   return 0;
> > +}
> > +/* { dg-final { scan-tree-dump-times " v2 = { 42, 42, 42, 42 }" 1 "gimple" } } */
> > +
> > Index: gcc/testsuite/c-c++-common/pr72747-2.c
> > ===
> > --- gcc/testsuite/c-c++-common/pr72747-2.c  (revision 0)
> > +++ gcc/testsuite/c-c++-common/pr72747-2.c  (working copy)
> > @@ -0,0 +1,18 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > +/* { dg-options "-c -maltivec -fdump-tree-gimple" } */
> > +
> > +/* PR 72747: test that cascaded definition is happening for non constants. */
> > +
> > +void foo ()
> > +{
> > +  extern int i;
> > +  __vector int v,w;
> > +v = w = (vector int) { i };
> > +}
> > +
> > +int main (int argc, char *argv[])
> > +{
> > +  return 0;
> > +}
> > +/* { dg-final { scan-tree-dump-times " w = {i.0_1}" 1 "gimple" } } */
> >
> >
>

Re: [PATCH] Fix computation of register limit for -fsched-pressure

2016-10-28 Thread Pat Haugen

On 10/28/2016 06:38 AM, Bin.Cheng wrote:
> On Fri, Oct 28, 2016 at 12:27 PM, Tamar Christina
>  wrote:
>> > Looking at it again,
>> >
>> > it seems to be that the testcase should be adjusted.
>> > There's no actual spilling. It just uses more registers than before due to 
>> > the scheduling.
> Sorry I didn't look into the test, but using more registers sounds
> like a regression too?  At least we need to make sure it's reasonable.

Using more or fewer registers is not unexpected; it all depends on a target's
number of call-used vs. call-saved registers. As Bin mentioned, that is fine
as long as it's reasonable.

-Pat

[PATCH] Fix PR78128

2016-10-28 Thread Richard Biener


The following fixes a mistake in an earlier patch of mine fixing
PR71002.  fold-const.c make_bit_field_ref doesn't care about alias
sets of references it merges but as it commons to a common base
it assumed the original refs were aliased by the new one.  The
testcase of PR71002 shows this is not true for union references
which are punned to alias-set zero.  Now my fix retained the
alias-set of (one of!) the original refs if it wasn't equal to
the alias-set of the common inner ref.  That obviously breaks
things if the alias sets of the other original refs do not alias
this one.

The following rectifies this by only handling the alias-set zero
case as required by the original PR.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
and branch.

Richard.

2016-10-28  Richard Biener  

PR middle-end/78128
PR middle-end/71002
* fold-const.c (make_bit_field_ref): Only adjust alias set
when the original alias set was zero.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 241642)
+++ gcc/fold-const.c(working copy)
@@ -3787,11 +3787,11 @@ make_bit_field_ref (location_t loc, tree
 {
   tree result, bftype;
 
-  if (get_alias_set (inner) != get_alias_set (orig_inner))
+  alias_set_type iset = get_alias_set (orig_inner);
+  if (iset == 0 && get_alias_set (inner) != iset)
 inner = fold_build2 (MEM_REF, TREE_TYPE (inner),
 build_fold_addr_expr (inner),
-build_int_cst
- (reference_alias_ptr_type (orig_inner), 0));
+build_int_cst (ptr_type_node, 0));
 
   if (bitpos == 0 && !reversep)
 {

Re: [patch,testsuite] Support dg-require-effective-target label_offsets.

2016-10-28 Thread Georg-Johann Lay


On 27.10.2016 12:49, Bernd Schmidt wrote:

On 10/27/2016 12:16 PM, Georg-Johann Lay wrote:

Now imagine some arithmetic like & - &  This might result in
one or two stub addresses, and difference between such addresses is a
complete different thing than the difference between the original
labels:  The result might differ in absolute value and in sign, i.e. you
can't even determine whether LAB1 or LAB2 comes first in the generated
code as the order of stubs might differ from the order of respective
labels.


Ok, so you can't expect to use the value directly, but I think this is not too
different from other targets. You still should be able to add the difference
back to & and expect to get & or gs(&). That's what the
execution test does - why isn't this working?

Bernd


What you are writing makes sense...  Just replacing a label should not do any
harm (except for ordering).  I had a deeper look into it, and some tests are
failing because Binutils support for special expressions is missing, not
because the tests don't work per se.  Patching the generated asm to the results
that a working Binutils would provide makes the tests pass.


Johann

Re: [PATCH VECT]Skip unnecessary data dependence check after visited store stmt in slp

2016-10-28 Thread Richard Biener

On Fri, Oct 28, 2016 at 1:10 PM, Bin Cheng  wrote:
> Hi,
> Function vect_slp_analyze_node_dependences delays data-dependence check for 
> visited store stmts until we run into the last store, because all stores are 
> sunk/vectorized at the position of the last one.  The problem is that it 
> still checks data-dep for current store stmt after the delay part of code.  
> This is unnecessary no matter the last store stmt is encountered or not.  
> This patch fixes the issue by simple refactoring.  Bootstrap and test on 
> x86_64.  Is it OK?

Ok.

Richard.

> Thanks,
> bin
>
> 2016-10-27  Bin Cheng  
>
> * tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Skip
> unnecessary data dependence check after visited store stmt.

Re: [PATCH VECT]Swap operands for cond_reduction when necessary

2016-10-28 Thread Richard Biener

On Wed, Oct 26, 2016 at 6:42 PM, Bin Cheng  wrote:
> Hi,
> For stmt defining reduction, GCC vectorizer assumes that the reduction 
> variable is always the last (second) operand.  Another fact is that 
> vectorizer doesn't swap operands for cond_reduction during analysis stage.  
> The problem is GCC middle-end may canonicalize cond_expr into a form that 
> reduction variable is not the last one.  At the moment, such case cannot be 
> vectorized.
> The patch fixes this issue by swapping operands in cond_reduction when it's 
> necessary.  The patch also swaps it back if vectorization fails.  The patch 
> resolves failures introduced by previous match.pd patches.  In addition, 
> couple cases are XPASSed on AArch64 now, which means more loops are 
> vectorized.  I will send following patch addressing those XPASS tests.
> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?

@@ -1225,6 +1225,20 @@ destroy_loop_vec_info (loop_vec_info loop_vinfo, bool clean_stmts)
swap_ssa_operands (stmt,
   gimple_assign_rhs1_ptr (stmt),
   gimple_assign_rhs2_ptr (stmt));
+ else if (code == COND_EXPR
+  && CONSTANT_CLASS_P (gimple_assign_rhs2 (stmt)))
+   {
+ tree cond_expr = gimple_assign_rhs1 (stmt);
+ enum tree_code cond_code = TREE_CODE (cond_expr);
+
+ gcc_assert (TREE_CODE_CLASS (cond_code) == tcc_comparison);
+ /* HONOR_NANS doesn't matter when inverting it back.  */

I think this doesn't hold true for COND_EXPRs that were originally
this way, as canonicalization is also inhibited by this.  I suggest
simply not inverting back when cond_code == ERROR_MARK, as we can't
possibly have swapped it to the current non-canonical way.

Ok with that change.

Thanks,
Richard.

+ cond_code = invert_tree_comparison (cond_code, false);
+ gcc_assert (cond_code != ERROR_MARK);
+ TREE_SET_CODE (cond_expr, cond_code);
+ swap_ssa_operands (stmt, gimple_assign_rhs2_ptr (stmt),
+gimple_assign_rhs3_ptr (stmt));


> Thanks,
> bin
>
> 2016-10-25  Bin Cheng  
>
> * tree-vect-loop.c (destroy_loop_vec_info): Handle cond_expr.
> (vect_is_simple_reduction): Swap cond_reduction by inversion.

Re: [PATCH VECT]Support operand swapping for cond_expr in vect_slp

2016-10-28 Thread Richard Biener

On Thu, Oct 27, 2016 at 3:37 PM, Bin Cheng  wrote:
> Hi,
> During analysis, vect_slp checks if statements of a group are isomorphic to 
> each other, specifically, all statements have to be isomorphic to the first 
> one.  Apparently, operands of commutative operators (PLUS_EXPR/MINUS_EXPR 
> etc.) could be swapped when checking isomorphic property.  Though vect_slp 
> has basic support for such commutative operators, the related code is not 
> properly implemented:
>   1) vect_build_slp_tree mixes operand swapping in the current slp tree node 
> and operand swapping in its child slp tree nodes.
>   2) Operand swapping in the current slp tree is incorrect when 
> vect_get_and_check_slp_defs has already committed to a fixed operand order.
> In addition, operand swapping for COND_EXPR is implemented in a wrong way 
> (place) because:
>   3) vect_get_and_check_slp_defs swaps operands for COND_EXPR by changing 
> comparison code after vect_build_slp_tree_1 checks the code consistency for 
> the statement group.
>   4) vect_build_slp_tree_1 should support operand swapping for COND_EXPR 
> while it doesn't.
>
> This patch addresses above issues.  It supports COND_EXPR by recording 
> swapping information in vect_build_slp_tree_1 and applies the swap in 
> vect_get_and_check_slp_defs.  It supports two types of swapping: swapping and
> inverting.  The patch also does refactoring so that operand swapping in child 
> slp tree node and the current slp tree node are differentiated.  With this 
> patch, failures (gcc.dg/vect/slp-cond-3.c) revealed by previous COND_EXPR 
> match.pd patches are resolved.
> Bootstrap and test on x86_64 and AArch64.  Is it OK?
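
As a rough illustration of the two kinds of COND_EXPR rewriting mentioned
above (a sketch with hypothetical function names, not code from the patch):
"swapping" is read here as exchanging the comparison operands while
mirroring the comparison code, and "inverting" as inverting the comparison
code while exchanging the two arms.  For integer operands both yield the
same value as the reference statement.

```c
/* Reference statement: comparison `a < b', arms (x, y).  */
int
cond_ref (int a, int b, int x, int y)
{
  return a < b ? x : y;
}

/* "Swapping": comparison operands exchanged, code mirrored
   (`<' becomes `>'); the arms stay in place.  */
int
cond_swapped (int a, int b, int x, int y)
{
  return b > a ? x : y;
}

/* "Inverting": comparison code inverted (`<' becomes `>='),
   the two arms exchanged.  */
int
cond_inverted (int a, int b, int x, int y)
{
  return a >= b ? y : x;
}
```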

Ok, but please reinstate the early-out here:

@@ -905,18 +960,10 @@ vect_build_slp_tree (vec_info *vinfo,
   slp_oprnd_info oprnd_info;
   FOR_EACH_VEC_ELT (stmts, i, stmt)
 {
-  switch (vect_get_and_check_slp_defs (vinfo, stmt, i, &oprnds_info))
-   {
-   case 0:
- break;
-   case -1:
- matches[0] = false;
- vect_free_oprnd_info (oprnds_info);
- return NULL;



you seem to needlessly continue checking other DEFs if one returns -1.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-10-25  Bin Cheng  
>
> * tree-vect-slp.c (vect_get_and_check_slp_defs): New parameter SWAP.
> Check slp defs for COND_EXPR by swapping/inverting operands if
> indicated by the new parameter SWAP.
> (vect_build_slp_tree_1): New parameter SWAP.  Check COND_EXPR stmt
> is isomorphic to the first stmt via swapping/inverting.  Store swap
> information in the new parameter SWAP.
> (vect_build_slp_tree): New local array SWAP and pass it to function
> vect_build_slp_tree_1.  Cleanup result handling code for function
> call to vect_get_and_check_slp_defs.  Skip operand swapping if the
> order of operands has been fixed as indicated by SWAP[i].

[PATCH][0/2] GIMPLE Frontend

2016-10-28 Thread Richard Biener


I've posted two patches implementing a GIMPLE Frontend to the extent
required for simple unit testing of GIMPLE passes.  The work was
mostly done by Prasad Ghangal during this year's GSoC project.  I've
picked it up to ensure it would be ready for the end of stage1
even though the frontend itself can IMHO be developed throughout
stage3 where required.

https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02336.html
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02335.html

The current status is that simple C testcases can be compiled
and dumped at some random pass with -gimple and that dump
output can be fed back into the GIMPLE frontend (by adding -fgimple
to the C compiler invocation).

Most syntax and semantic checking is only done by the GIMPLE
verification code we have in the middle-end as the parser is
modeled after the C one which usually accepts more complex input
than allowed by GIMPLE.

I expect there is syntax we still do not parse (MEM_REF comes to
my mind) plus on-the-side info that we either do not have a good
source representation for or that the parser does not yet parse
(generally EH info or things like range info or points-to info).

Currently what the parser emits is a mixed bag of low and high
gimple (it's technically high gimple as it still has the outer
gimple BIND).  It parses SSA form with PHIs being represented
as a function call to __PHI with pairs of label (comes-from)
and value.

A feature of the frontend is that it allows (or rather requires
for all declarations) mixing C with GIMPLE.  This means you
can write a single function in GIMPLE and surround it with
a (unit-)test harness written in C.

I hope we can get this into GCC 7, kind of as a preview as I expect
it will evolve once we get the first unit tests (I didn't try to
re-write existing tests).

Thanks,
Richard.

[PATCH][2/2] GIMPLE Frontend, middle-end changes

2016-10-28 Thread Richard Biener


These are the middle-end changes and additions to the testsuite.

They are pretty self-contained, I've organized the changelog
entries below in areas of changes:

 1) dump changes - we add a -gimple dump modifier that allows most
 function dumps to be directly fed back into the GIMPLE FE

 2) pass manager changes to implement the startwith("pass-name")
 feature which implements unit-testing for GIMPLE passes

 3) support for "SSA" input, a __PHI stmt that is lowered once the
 CFG is built, a facility to allow a specific SSA name to be allocated
 plus a small required change in the SSA rewriter to allow for
 pre-existing PHI arguments

Bootstrapped and tested on x86_64-unknown-linux-gnu (together with [1/2]).

I can approve all these changes myself but any comments are welcome.

Thanks,
Richard.

2016-10-28  Prasad Ghangal  
Richard Biener  

* dumpfile.h (TDF_GIMPLE): Add.
* dumpfile.c (dump_options): Add gimple.
* gimple-pretty-print.c (dump_gimple_switch): Adjust dump
for TDF_GIMPLE.
(dump_gimple_label): Likewise.
(dump_gimple_phi): Likewise.
(dump_gimple_bb_header): Likewise.
(dump_phi_nodes): Likewise.
(pp_cfg_jump): Likewise.  Pass in dump flags.
(dump_implicit_edges): Adjust.
* passes.c (pass_init_dump_file): Do not dump function header
for TDF_GIMPLE.
* tree-cfg.c (dump_function_to_file): Dump function return type
and __GIMPLE keyword for TDF_GIMPLE.  Change guard for dumping
GIMPLE stmts.
* tree-pretty-print.c (dump_decl_name): Adjust dump for TDF_GIMPLE.
(dump_generic_node): Likewise.

* function.h (struct function): Add pass_startwith member.
* passes.c (execute_one_pass): Implement startwith.

* tree-ssanames.c (make_ssa_name_fn): New argument, check for version
and assign proper version for parsed ssa names.
* tree-ssanames.h (make_ssa_name_fn): Add new argument to the function.
* internal-fn.c (expand_PHI): New function.
* internal-fn.h (expand_PHI): Declared here.
* internal-fn.def: New definition for PHI.
* tree-cfg.c (lower_phi_internal_fn): New function.
(build_gimple_cfg): Call it.
(verify_gimple_call): Condition for passing label as arg in internal
function PHI.
* tree-into-ssa.c (rewrite_add_phi_arguments): Handle already
present PHIs with arguments.

testsuite/
* gcc.dg/gimplefe-1.c: New testcase.
* gcc.dg/gimplefe-2.c: Likewise.
* gcc.dg/gimplefe-3.c: Likewise.
* gcc.dg/gimplefe-4.c: Likewise.
* gcc.dg/gimplefe-5.c: Likewise.
* gcc.dg/gimplefe-6.c: Likewise.
* gcc.dg/gimplefe-7.c: Likewise.
* gcc.dg/gimplefe-8.c: Likewise.
* gcc.dg/gimplefe-9.c: Likewise.
* gcc.dg/gimplefe-10.c: Likewise.
* gcc.dg/gimplefe-11.c: Likewise.
* gcc.dg/gimplefe-12.c: Likewise.
* gcc.dg/gimplefe-13.c: Likewise.
* gcc.dg/gimplefe-14.c: Likewise.
* gcc.dg/gimplefe-15.c: Likewise.
* gcc.dg/gimplefe-16.c: Likewise.
* gcc.dg/gimplefe-17.c: Likewise.
* gcc.dg/gimplefe-18.c: Likewise.

diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 74522a6..e9483bc 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -108,13 +108,15 @@ static const struct dump_option_value_info dump_options[] =
   {"nouid", TDF_NOUID},
   {"enumerate_locals", TDF_ENUMERATE_LOCALS},
   {"scev", TDF_SCEV},
+  {"gimple", TDF_GIMPLE},
   {"optimized", MSG_OPTIMIZED_LOCATIONS},
   {"missed", MSG_MISSED_OPTIMIZATION},
   {"note", MSG_NOTE},
   {"optall", MSG_ALL},
   {"all", ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_TREE | TDF_RTL | TDF_IPA
| TDF_STMTADDR | TDF_GRAPH | TDF_DIAGNOSTIC | TDF_VERBOSE
-   | TDF_RHS_ONLY | TDF_NOUID | TDF_ENUMERATE_LOCALS | TDF_SCEV)},
+   | TDF_RHS_ONLY | TDF_NOUID | TDF_ENUMERATE_LOCALS | TDF_SCEV
+   | TDF_GIMPLE)},
   {NULL, 0}
 };
 
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index 3f08b16..b7d70f2 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -82,9 +82,10 @@ enum tree_dump_index
 #define TDF_CSELIB (1 << 23)   /* Dump cselib details.  */
 #define TDF_SCEV   (1 << 24)   /* Dump SCEV details.  */
 #define TDF_COMMENT (1 << 25)   /* Dump lines with prefix ";;"  */
-#define MSG_OPTIMIZED_LOCATIONS  (1 << 26)  /* -fopt-info optimized sources */
-#define MSG_MISSED_OPTIMIZATION  (1 << 27)  /* missed opportunities */
-#define MSG_NOTE (1 << 28)  /* general optimization info */
+#define TDF_GIMPLE (1 << 26)   /* Dump in GIMPLE FE syntax  */
+#define MSG_OPTIMIZED_LOCATIONS  (1 << 27)  /* -fopt-info optimized sources */
+#define MSG_MISSED_OPTIMIZATION  (1 << 28)  /* missed opportunities */
+#define MSG_NOTE (1 << 29)  /* general optimization info */
 #define MSG_ALL

[PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)

2016-10-28 Thread Richard Biener


These are the C (and ObjC) Frontend changes required by the GIMPLE
Frontend which is now itself contained in c/gimple-parser.[ch].

Most changes are due to a new c-parser.h header where we export
stuff from the C parser that the GIMPLE frontend requires.  Other
changes include new __GIMPLE and __PHI keywords, handling
__GIMPLE as new declspec and dispatching to the GIMPLE parser
for __GIMPLE marked function definitions.

We'd like to include the GIMPLE parser in GCC 7.  As the parser
is pretty self-contained (and now works to a good extent), it can
be improved during stage3 or when testcases show that it needs
improvement.

Bootstrapped and tested on x86_64-unknown-linux-gnu (together with [2/2])
for C, ObjC.

Ok for trunk?

Thanks,
Richard.

2016-10-28  Prasad Ghangal  
Richard Biener  

c/
* Make-lang.in (C_AND_OBJC_OBJS): Add gimple-parser.o.
* config-lang.in (gtfiles): Add c/c-parser.h.
* c-tree.h (enum c_declspec_word): Add cdw_gimple.
(struct c_declspecs): Add gimple_pass member and gimple_p flag.
* c-parser.c (enum c_id_kind, struct c_token,
c_parser_next_token_is, c_parser_next_token_is_not,
c_parser_next_token_is_keyword,
enum c_lookahead_kind, enum c_dtr_syn, enum c_parser_prec):
Split out to ...
* c-parser.h: ... new header.
* c-parser.c: Include c-parser.h and gimple-parser.h.
(c_parser_peek_token, c_parser_peek_2nd_token,
c_token_starts_typename, c_parser_next_token_starts_declspecs,
c_parser_next_tokens_start_declaration, c_parser_consume_token,
c_parser_error, c_parser_require, c_parser_skip_until_found,
c_parser_declspecs, c_parser_declarator, c_parser_peek_nth_token,
c_parser_cast_expression): Export.
(c_parser_tokens_buf): New function.
(c_parser_error): Likewise.
(c_parser_set_error): Likewise.
(c_parser_declspecs): Handle RID_GIMPLE.
(c_parser_declaration_or_fndef): Parse __GIMPLE marked body
via c_parser_parse_gimple_body.
* c-parser.h (c_parser_peek_token, c_parser_peek_2nd_token,
c_token_starts_typename, c_parser_next_token_starts_declspecs,
c_parser_next_tokens_start_declaration, c_parser_consume_token,
c_parser_error, c_parser_require, c_parser_skip_until_found,
c_parser_declspecs, c_parser_declarator, c_parser_peek_nth_token,
c_parser_cast_expression): Declare.
(struct c_parser): Declare forward.
(c_parser_tokens_buf): Declare.
(c_parser_error): Likewise.
(c_parser_set_error): Likewise.
* gimple-parser.c: New file.
* gimple-parser.h: Likewise.

obj-c/
* config-lang.in (gtfiles): Add c/c-parser.h.

c-family/
* c-common.c (c_common_reswords): Add RID_GIMPLE, RID_PHI types.
* c-common.h (enum rid): Add RID_GIMPLE, RID_PHI.
* c.opt (fgimple): New option.

* doc/invoke.texi (fgimple): Document.
 
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 307862b..2997c83 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -435,6 +435,8 @@ const struct c_common_resword c_common_reswords[] =
   { "__underlying_type", RID_UNDERLYING_TYPE, D_CXXONLY },
   { "__volatile",  RID_VOLATILE,   0 },
   { "__volatile__",RID_VOLATILE,   0 },
+  { "__GIMPLE",RID_GIMPLE, D_CONLY },
+  { "__PHI",   RID_PHI,D_CONLY },
   { "alignas", RID_ALIGNAS,D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "alignof", RID_ALIGNOF,D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "asm", RID_ASM,D_ASM },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 547bab2..1fbe060 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -118,6 +118,12 @@ enum rid
 
   RID_FRACT, RID_ACCUM, RID_AUTO_TYPE, RID_BUILTIN_CALL_WITH_STATIC_CHAIN,
 
+  /* "__GIMPLE", for the GIMPLE-parsing extension to the C frontend. */
+  RID_GIMPLE,
+
+  /* "__PHI", for parsing PHI function in GIMPLE FE.  */
+  RID_PHI,
+
   /* C11 */
   RID_ALIGNAS, RID_GENERIC,
 
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 458d453..24d3b8e 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -200,6 +200,10 @@ F
 Driver C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after %qs)
 -F <dir>	Add <dir> to the end of the main framework include path.
 
+fgimple
+C Var(flag_gimple) Init(0)
+Enable parsing GIMPLE.
+
 H
 C ObjC C++ ObjC++
 Print the name of header files as they are used.
diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
index 72c9ae7..cd7108b 100644
--- a/gcc/c/Make-lang.in
+++ b/gcc/c/Make-lang.in
@@ -51,7 +51,8 @@ CFLAGS-c/gccspec.o += $(DRIVER_DEFINES)
 # Language-specific object files for C and Objective C.
 C_AND_OBJC_OBJS = attribs.o c/c-errors.o c/c-decl.o c/c-typeck.o \
   c/c-convert.o

Re: [PATCH] Fix computation of register limit for -fsched-pressure

2016-10-28 Thread Bin.Cheng

On Fri, Oct 28, 2016 at 12:27 PM, Tamar Christina
 wrote:
> Looking at it again,
>
> it seems to be that the testcase should be adjusted.
> There's no actual spilling. It just uses more registers than before due to 
> the scheduling.
Sorry I didn't look into the test, but using more registers sounds
like a regression too?  At least we need to make sure it's reasonable.

Thanks,
bin
>
> I will update the testcase.
>
> Thanks.
>
> 
> From: gcc-patches-ow...@gcc.gnu.org  on behalf 
> of Tamar Christina 
> Sent: Friday, October 28, 2016 10:53:20 AM
> To: Pat Haugen; Maxim Kuvyrkov
> Cc: GCC Patches; nd
> Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure
>
> Forwarding to list as well.
> 
> From: Tamar Christina
> Sent: Friday, October 28, 2016 10:52:17 AM
> To: Pat Haugen; Maxim Kuvyrkov
> Cc: GCC Patches
> Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure
>
> Hi Pat,
>
> The commit seems to be causing some odd stack spills on aarch64.
>
> I've created a new ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78142
>
> Thanks,
> Tamar
>
> 
> From: gcc-patches-ow...@gcc.gnu.org  on behalf 
> of Pat Haugen 
> Sent: Tuesday, October 18, 2016 4:07:55 PM
> To: Maxim Kuvyrkov
> Cc: GCC Patches
> Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure
>
> On 10/18/2016 05:31 AM, Maxim Kuvyrkov wrote:
>>> > I see your point and agree that current code isn't optimal.  However, I 
>>> > don't think your patch is accurate either.  Consider 
>>> > https://gcc.gnu.org/onlinedocs/gccint/Register-Basics.html and let's 
>>> > assume that FIXED_REGISTERS in class CL is set for a third of the 
>>> > registers, and CALL_USED_REGISTERS is set to "1" for another third of 
>>> > registers.  So we have a third available for zero-cost allocation 
>>> > (CALL_USED_REGISTERS-FIXED_REGISTERS), a third available for spill-cost 
>>> > allocation (ALL_REGISTERS-CALL_USED_REGISTERS) and a third non-available 
>>> > (FIXED_REGISTERS).
>>> >
>>> > For a non-loop-single-basic-block function we should be targeting only 
>>> > the third of register available at zero-cost -- correct?
> Yes.
>
>   This is what is done by the current code, but, apparently, by accident.  It 
> seems that the right register count can be obtained with:
>>> >
>>> >  for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
>>> > -  if (call_used_regs[ira_class_hard_regs[cl][i]])
>>> > -++call_used_regs_num[cl];
>>> > +  if (!call_used_regs[ira_class_hard_regs[cl][i]]
>>> > +   || fixed_regs[ira_class_hard_regs[cl][i]])
>>> > +++call_saved_regs_num[cl];
>>> >
>>> > Does this look correct to you?
>> Thinking some more, it seems like fixed_regs should not be available to the 
>> scheduler no matter what.  Therefore, the number of fixed registers should 
>> be subtracted from ira_class_hard_regs_num[cl] without any scaling 
>> (entry_freq / bb_freq).
>
> Ahh, yes, I forgot about FIXED_REGISTERS being included in 
> CALL_USED_REGISTERS. I agree they should be totally removed from the register 
> limit calculation. I'll rework the patch.
>
> Thanks,
> Pat
>

Re: [PATCH] Fix computation of register limit for -fsched-pressure

2016-10-28 Thread Tamar Christina

Looking at it again,

it seems to be that the testcase should be adjusted.
There's no actual spilling. It just uses more registers than before due to the 
scheduling.

I will update the testcase.

Thanks.


From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Friday, October 28, 2016 10:53:20 AM
To: Pat Haugen; Maxim Kuvyrkov
Cc: GCC Patches; nd
Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure

Forwarding to list as well.

From: Tamar Christina
Sent: Friday, October 28, 2016 10:52:17 AM
To: Pat Haugen; Maxim Kuvyrkov
Cc: GCC Patches
Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure

Hi Pat,

The commit seems to be causing some odd stack spills on aarch64.

I've created a new ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78142

Thanks,
Tamar


From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Pat Haugen 
Sent: Tuesday, October 18, 2016 4:07:55 PM
To: Maxim Kuvyrkov
Cc: GCC Patches
Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure

On 10/18/2016 05:31 AM, Maxim Kuvyrkov wrote:
>> > I see your point and agree that current code isn't optimal.  However, I 
>> > don't think your patch is accurate either.  Consider 
>> > https://gcc.gnu.org/onlinedocs/gccint/Register-Basics.html and let's 
>> > assume that FIXED_REGISTERS in class CL is set for a third of the 
>> > registers, and CALL_USED_REGISTERS is set to "1" for another third of 
>> > registers.  So we have a third available for zero-cost allocation 
>> > (CALL_USED_REGISTERS-FIXED_REGISTERS), a third available for spill-cost 
>> > allocation (ALL_REGISTERS-CALL_USED_REGISTERS) and a third non-available 
>> > (FIXED_REGISTERS).
>> >
>> > For a non-loop-single-basic-block function we should be targeting only the 
>> > third of register available at zero-cost -- correct?
Yes.

  This is what is done by the current code, but, apparently, by accident.  It 
seems that the right register count can be obtained with:
>> >
>> >  for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
>> > -  if (call_used_regs[ira_class_hard_regs[cl][i]])
>> > -++call_used_regs_num[cl];
>> > +  if (!call_used_regs[ira_class_hard_regs[cl][i]]
>> > +   || fixed_regs[ira_class_hard_regs[cl][i]])
>> > +++call_saved_regs_num[cl];
>> >
>> > Does this look correct to you?
> Thinking some more, it seems like fixed_regs should not be available to the 
> scheduler no matter what.  Therefore, the number of fixed registers should be 
> subtracted from ira_class_hard_regs_num[cl] without any scaling (entry_freq / 
> bb_freq).

Ahh, yes, I forgot about FIXED_REGISTERS being included in CALL_USED_REGISTERS. 
I agree they should be totally removed from the register limit calculation. 
I'll rework the patch.

Thanks,
Pat
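
To make the register partition discussed in this thread concrete, here is a
small C sketch.  It is only an illustration under stated assumptions: the
struct, the function names and the plain char arrays are hypothetical
stand-ins, not the scheduler's data structures.  It follows the point made
above that fixed registers are also marked call-used, so they have to be
separated out first and then excluded from any limit unconditionally.

```c
#include <stddef.h>

/* Hypothetical summary of one register class.  */
struct reg_counts
{
  int fixed;           /* never available to the scheduler */
  int call_clobbered;  /* call-used but not fixed: usable at zero cost */
  int call_saved;      /* not call-used: usable at save/restore cost */
};

/* FIXED[i] and CALL_USED[i] are nonzero flags per register.  As with
   CALL_USED_REGISTERS, every fixed register is assumed to be flagged
   call-used as well, so FIXED must be tested first.  */
struct reg_counts
classify_regs (const char *fixed, const char *call_used, size_t n)
{
  struct reg_counts c = { 0, 0, 0 };

  for (size_t i = 0; i < n; ++i)
    {
      if (fixed[i])
        ++c.fixed;
      else if (call_used[i])
        ++c.call_clobbered;
      else
        ++c.call_saved;
    }
  return c;
}

/* Registers that can count towards a pressure limit at all: fixed
   registers are removed outright, without any frequency scaling.  */
int
usable_regs (struct reg_counts c)
{
  return c.call_clobbered + c.call_saved;
}
```

For the "thirds" example in the quoted text (nine registers, three fixed,
three more call-used), classify_regs yields three registers in each group
and usable_regs returns six.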

Fix PR77309, combine eliminates sign bit comparison

2016-10-28 Thread Bernd Schmidt


In this PR, we manage to simplify the code down to

(lt (and (reg) (signbit)) (const 0))

simplify_comparison then calls make_compound_operation on the AND 
expression, and that turns it into a ZERO_EXTRACT of a single bit, 
changing the meaning of the comparison.


The problem is a special case we have for comparisons in 
make_compound_operation, it thinks it should convert single-bit ANDs 
into such extractions. But this is only valid if the bit isn't the sign 
bit, or if we're testing for equality with zero.
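
The effect can be reproduced at the C level.  The following is a minimal
sketch with hypothetical function names, assuming a 32-bit value; it mimics
the RTL shapes rather than using any combine.c code.  Masking with the sign
bit keeps the bit in the sign position, so a signed "less than zero" test
still works, whereas a one-bit zero-extract yields 0 or 1 and can never be
negative.  Equality tests against zero are unaffected.

```c
#include <stdint.h>

/* (lt (and x signbit) 0): the masked value is INT32_MIN or 0.  */
int
lt_zero_masked (int32_t x)
{
  int32_t masked = x & INT32_MIN;
  return masked < 0;
}

/* The same test after rewriting the AND as a one-bit zero-extract:
   the extracted value is 0 or 1, so `< 0' is always false.  */
int
lt_zero_extracted (int32_t x)
{
  int32_t bit = (int32_t) (((uint32_t) x >> 31) & 1u);
  return bit < 0;
}

/* (ne (and x signbit) 0) and its zero-extract form agree.  */
int
ne_zero_masked (int32_t x)
{
  return (x & INT32_MIN) != 0;
}

int
ne_zero_extracted (int32_t x)
{
  return (((uint32_t) x >> 31) & 1u) != 0;
}
```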


The following patch was bootstrapped and tested on x86_64-linux. Ok?


Bernd
PR rtl-optimization/77309
	* combine.c (make_compound_operation): Allow EQ for IN_CODE, and
	don't assume an equality comparison for plain COMPARE.
	(simplify_comparison): Pass a more accurate code to
	make_compound_operation.

PR rtl-optimization/77309
	* gcc.dg/torture/pr77309.c: New test.

Index: gcc/combine.c
===
--- gcc/combine.c	(revision 241233)
+++ gcc/combine.c	(working copy)
@@ -7757,7 +7757,8 @@ extract_left_shift (rtx x, int count)
 
IN_CODE says what kind of expression we are processing.  Normally, it is
SET.  In a memory address it is MEM.  When processing the arguments of
-   a comparison or a COMPARE against zero, it is COMPARE.  */
+   a comparison or a COMPARE against zero, it is COMPARE, or EQ if more
+   precisely it is an equality comparison against zero.  */
 
 rtx
 make_compound_operation (rtx x, enum rtx_code in_code)
@@ -7771,6 +7772,7 @@ make_compound_operation (rtx x, enum rtx
   rtx new_rtx = 0;
   rtx tem;
   const char *fmt;
+  bool equality_comparison = false;
 
   /* PR rtl-optimization/70944.  */
   if (VECTOR_MODE_P (mode))
@@ -7780,6 +7782,11 @@ make_compound_operation (rtx x, enum rtx
  address, we stay there.  If we have a comparison, set to COMPARE,
  but once inside, go back to our default of SET.  */
 
+  if (in_code == EQ)
+{
+  equality_comparison = true;
+  in_code = COMPARE;
+}
   next_code = (code == MEM ? MEM
 	   : ((code == COMPARE || COMPARISON_P (x))
 		  && XEXP (x, 1) == const0_rtx) ? COMPARE
@@ -7988,11 +7995,12 @@ make_compound_operation (rtx x, enum rtx
   /* If we are in a comparison and this is an AND with a power of two,
 	 convert this into the appropriate bit extract.  */
   else if (in_code == COMPARE
-	   && (i = exact_log2 (UINTVAL (XEXP (x, 1)))) >= 0)
+	   && (i = exact_log2 (UINTVAL (XEXP (x, 1)))) >= 0
+	   && (equality_comparison || i < GET_MODE_PRECISION (mode) - 1))
 	new_rtx = make_extraction (mode,
-			   make_compound_operation (XEXP (x, 0),
-			next_code),
-			   i, NULL_RTX, 1, 1, 0, 1);
+   make_compound_operation (XEXP (x, 0),
+			next_code),
+   i, NULL_RTX, 1, 1, 0, 1);
 
   /* If the one operand is a paradoxical subreg of a register or memory and
 	 the constant (limited to the smaller mode) has only zero bits where
@@ -12425,7 +12433,11 @@ simplify_comparison (enum rtx_code code,
  We can never remove a SUBREG for a non-equality comparison because
  the sign bit is in a different place in the underlying object.  */
 
-  op0 = make_compound_operation (op0, op1 == const0_rtx ? COMPARE : SET);
+  rtx_code op0_mco_code = SET;
+  if (op1 == const0_rtx)
+op0_mco_code = code == NE || code == EQ ? EQ : COMPARE;
+
+  op0 = make_compound_operation (op0, op0_mco_code);
   op1 = make_compound_operation (op1, SET);
 
   if (GET_CODE (op0) == SUBREG && subreg_lowpart_p (op0)
Index: gcc/testsuite/gcc.dg/torture/pr77309.c
===
--- gcc/testsuite/gcc.dg/torture/pr77309.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr77309.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do run } */
+
+int a, b;
+
+int main ()
+{
+  long c = 1 % (2 ^ b);
+  c = -c & ~(~(b ^ ~b) || a);
+
+  if (c >= 0)
+__builtin_abort ();
+
+  return 0;
+}

[PATCH VECT]Skip unnecessary data dependence check after visited store stmt in slp

2016-10-28 Thread Bin Cheng

Hi,
Function vect_slp_analyze_node_dependences delays the data-dependence check
for visited store stmts until we run into the last store, because all stores
are sunk/vectorized at the position of the last one.  The problem is that it
still checks data dependence for the current store stmt after the delayed part
of the code.  This is unnecessary whether or not the last store stmt has been
encountered.  This patch fixes the issue with a simple refactoring.
Bootstrapped and tested on x86_64.  Is it OK?

Thanks,
bin

2016-10-27  Bin Cheng  

* tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Skip
unnecessary data dependence check after visited store stmt.

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index c99fa40..9346cfe 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -583,6 +583,7 @@ vect_slp_analyze_node_dependences (slp_instance instance, slp_tree node,
  if (!dr_b)
return false;
 
+ bool dependent = false;
  /* If we run into a store of this same instance (we've just
 marked those) then delay dependence checking until we run
 into the last store because this is where it will have
@@ -599,22 +600,21 @@ vect_slp_analyze_node_dependences (slp_instance instance, slp_tree node,
= STMT_VINFO_DATA_REF (vinfo_for_stmt (store));
  ddr_p ddr = initialize_data_dependence_relation
(dr_a, store_dr, vNULL);
- if (vect_slp_analyze_data_ref_dependence (ddr))
-   {
- free_dependence_relation (ddr);
- return false;
-   }
+ dependent = vect_slp_analyze_data_ref_dependence (ddr);
  free_dependence_relation (ddr);
+ if (dependent)
+   break;
}
}
-
- ddr_p ddr = initialize_data_dependence_relation (dr_a, dr_b, vNULL);
- if (vect_slp_analyze_data_ref_dependence (ddr))
+ else
{
+ ddr_p ddr = initialize_data_dependence_relation (dr_a,
+  dr_b, vNULL);
+ dependent = vect_slp_analyze_data_ref_dependence (ddr);
  free_dependence_relation (ddr);
- return false;
}
- free_dependence_relation (ddr);
+ if (dependent)
+   return false;
}
 }
   return true;

Re: [PATCH, GCC] Fix conflicting posix_memalign declaration error

2016-10-28 Thread Szabolcs Nagy

On 28/10/16 11:38, Bernd Schmidt wrote:
> On 10/27/2016 10:47 PM, Caroline Tice wrote:
>>
>> * config/i386/pmm_malloc.h (posix_memalign):  Add ifdefs to only
>> decorate the declaration with 'throw()' if __GLIBC__ is defined.
> 
> I seem to recall a similar patch being submitted by Szabolcs. My suggestion 
> at the time was to move _mm_malloc
> into libgcc so that it could just include the right header.
> 

i stopped working on that patch because it seems
gcc-6 does not care about inconsistent exception
specification in this particular case any more
(even in strict standard-conforming mode) and thus
i could not provide a test case along with the patch,
which H.J.Lu asked for.

(i also think that hardwiring libc specific
knowledge here is wrong: glibc might change its
declaration, c++ code should not try to redeclare
standard c interfaces, which is just another
regression in the c++ language compared to c.)

Re: [patch] Use straight-line sequence for signed overflow additive operation

2016-10-28 Thread Bernd Schmidt


On 10/27/2016 05:26 PM, Eric Botcazou wrote:

as suggested by Segher, this changes the generic signed-signed-signed case of
expand_addsub_overflow to using a straight-line code sequence instead of a
branchy one, the new sequence being also shorter.
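
For readers unfamiliar with the idea, one well-known branch-free way to
detect signed addition overflow is sketched below in C.  This is an
assumption-laden illustration, not necessarily the sequence that
expand_addsub_overflow emits after the patch; the function name is
hypothetical.  The sum is computed in unsigned arithmetic, and overflow is
read off the sign bit of (sum ^ a) & (sum ^ b): it is set exactly when both
operands share a sign that the sum does not have.

```c
#include <limits.h>

/* Return 1 if A + B overflows `int', 0 otherwise, and store the
   wrapped sum in *RES.  No branches are needed.  */
int
add_overflows (int a, int b, int *res)
{
  unsigned int ua = (unsigned int) a;
  unsigned int ub = (unsigned int) b;
  unsigned int sum = ua + ub;   /* unsigned addition wraps, no UB */

  /* Sign bit of (sum ^ a) & (sum ^ b), shifted down to bit 0.  */
  unsigned int sign = (sum ^ ua) & (sum ^ ub);
  int overflow = (int) (sign >> (sizeof (unsigned int) * CHAR_BIT - 1));

  /* Converting an out-of-range value to `int' is implementation-defined;
     GCC reduces it modulo 2^N, giving the wrapped result.  */
  *res = (int) sum;
  return overflow;
}
```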



* dojump.c (do_jump_by_parts_greater_rtx): Invert probability when
swapping the arms of the branch.
* internal-fn.c (expand_addsub_overflow): Use a straight-line code
sequence for the generic signed-signed-signed case.


Ok.


Bernd

Re: [PATCH, GCC] Fix conflicting posix_memalign declaration error

2016-10-28 Thread Bernd Schmidt


On 10/27/2016 10:47 PM, Caroline Tice wrote:


* config/i386/pmm_malloc.h (posix_memalign):  Add ifdefs to only
decorate the declaration with 'throw()' if __GLIBC__ is defined.


I seem to recall a similar patch being submitted by Szabolcs. My 
suggestion at the time was to move _mm_malloc into libgcc so that it 
could just include the right header.



Bernd

Re: [PATCH] Fix computation of register limit for -fsched-pressure

2016-10-28 Thread Tamar Christina


Forwarding to list as well.

From: Tamar Christina
Sent: Friday, October 28, 2016 10:52:17 AM
To: Pat Haugen; Maxim Kuvyrkov
Cc: GCC Patches
Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure

Hi Pat,

The commit seems to be causing some odd stack spills on aarch64.

I've created a new ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78142

Thanks,
Tamar


From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Pat Haugen 
Sent: Tuesday, October 18, 2016 4:07:55 PM
To: Maxim Kuvyrkov
Cc: GCC Patches
Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure

On 10/18/2016 05:31 AM, Maxim Kuvyrkov wrote:
>> > I see your point and agree that current code isn't optimal.  However, I 
>> > don't think your patch is accurate either.  Consider 
>> > https://gcc.gnu.org/onlinedocs/gccint/Register-Basics.html and let's 
>> > assume that FIXED_REGISTERS in class CL is set for a third of the 
>> > registers, and CALL_USED_REGISTERS is set to "1" for another third of 
>> > registers.  So we have a third available for zero-cost allocation 
>> > (CALL_USED_REGISTERS-FIXED_REGISTERS), a third available for spill-cost 
>> > allocation (ALL_REGISTERS-CALL_USED_REGISTERS) and a third non-available 
>> > (FIXED_REGISTERS).
>> >
>> > For a non-loop-single-basic-block function we should be targeting only the 
>> > third of register available at zero-cost -- correct?
Yes.

  This is what is done by the current code, but, apparently, by accident.  It 
seems that the right register count can be obtained with:
>> >
>> >  for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
>> > -  if (call_used_regs[ira_class_hard_regs[cl][i]])
>> > -++call_used_regs_num[cl];
>> > +  if (!call_used_regs[ira_class_hard_regs[cl][i]]
>> > +   || fixed_regs[ira_class_hard_regs[cl][i]])
>> > +++call_saved_regs_num[cl];
>> >
>> > Does this look correct to you?
> Thinking some more, it seems like fixed_regs should not be available to the 
> scheduler no matter what.  Therefore, the number of fixed registers should be 
> subtracted from ira_class_hard_regs_num[cl] without any scaling (entry_freq / 
> bb_freq).

Ahh, yes, I forgot about FIXED_REGISTERS being included in CALL_USED_REGISTERS. 
I agree they should be totally removed from the register limit calculation. 
I'll rework the patch.

Thanks,
Pat

Re: [PATCH, ARM/testsuite 6/7] Force soft float in ARMv6-M and ARMv8-M Baseline options

2016-10-28 Thread Thomas Preudhomme


On 22/09/16 16:47, Richard Earnshaw (lists) wrote:

On 22/09/16 15:51, Thomas Preudhomme wrote:

Sorry, noticed an error in the patch. It was not caught during testing
because GCC was built with --with-mode=thumb. Correct patch attached.

Best regards,

Thomas

On 22/09/16 14:49, Thomas Preudhomme wrote:

Hi,

ARMv6-M and ARMv8-M Baseline only support soft float ABI. Therefore, the
arm_arch_v8m_base add option should pass -mfloat-abi=soft, much like
-mthumb is
passed for architectures that only support Thumb instruction set. This
patch
adds -mfloat-abi=soft to both arm_arch_v6m and arm_arch_v8m_base add
options.
Patch is in attachment.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2016-07-15  Thomas Preud'homme  

* lib/target-supports.exp (add_options_for_arm_arch_v6m): Add
-mfloat-abi=soft option.
(add_options_for_arm_arch_v8m_base): Likewise.


Is this ok for trunk?

Best regards,

Thomas


6_softfloat_testing_v6m_v8m_baseline.patch


diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 0dabea0850124947a7fe333e0b94c4077434f278..b5d72f1283be6a6e4736a1d20936e169c1384398 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3540,24 +3540,25 @@ proc check_effective_target_arm_fp16_hw { } {
 # Usage: /* { dg-require-effective-target arm_arch_v5_ok } */
 #/* { dg-add-options arm_arch_v5 } */
 #   /* { dg-require-effective-target arm_arch_v5_multilib } */
-foreach { armfunc armflag armdef } { v4 "-march=armv4 -marm" __ARM_ARCH_4__
-v4t "-march=armv4t" __ARM_ARCH_4T__
-v5 "-march=armv5 -marm" __ARM_ARCH_5__
-v5t "-march=armv5t" __ARM_ARCH_5T__
-v5te "-march=armv5te" __ARM_ARCH_5TE__
-v6 "-march=armv6" __ARM_ARCH_6__
-v6k "-march=armv6k" __ARM_ARCH_6K__
-v6t2 "-march=armv6t2" __ARM_ARCH_6T2__
-v6z "-march=armv6z" __ARM_ARCH_6Z__
-v6m "-march=armv6-m -mthumb" __ARM_ARCH_6M__
-v7a "-march=armv7-a" __ARM_ARCH_7A__
-v7r "-march=armv7-r" __ARM_ARCH_7R__
-v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
-v7em "-march=armv7e-m -mthumb" __ARM_ARCH_7EM__
-v8a "-march=armv8-a" __ARM_ARCH_8A__
-v8_1a "-march=armv8.1a" __ARM_ARCH_8A__
-v8m_base "-march=armv8-m.base -mthumb" __ARM_ARCH_8M_BASE__
-v8m_main "-march=armv8-m.main -mthumb" __ARM_ARCH_8M_MAIN__ } {
+foreach { armfunc armflag armdef } {
+   v4 "-march=armv4 -marm" __ARM_ARCH_4__
+   v4t "-march=armv4t" __ARM_ARCH_4T__
+   v5 "-march=armv5 -marm" __ARM_ARCH_5__
+   v5t "-march=armv5t" __ARM_ARCH_5T__
+   v5te "-march=armv5te" __ARM_ARCH_5TE__
+   v6 "-march=armv6" __ARM_ARCH_6__
+   v6k "-march=armv6k" __ARM_ARCH_6K__
+   v6t2 "-march=armv6t2" __ARM_ARCH_6T2__
+   v6z "-march=armv6z" __ARM_ARCH_6Z__
+   v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
+   v7a "-march=armv7-a" __ARM_ARCH_7A__
+   v7r "-march=armv7-r" __ARM_ARCH_7R__
+   v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
+   v7em "-march=armv7e-m -mthumb" __ARM_ARCH_7EM__
+   v8a "-march=armv8-a" __ARM_ARCH_8A__
+   v8_1a "-march=armv8.1a" __ARM_ARCH_8A__
+   v8m_base "-march=armv8-m.base -mthumb -mfloat-abi=soft" __ARM_ARCH_8M_BASE__
+   v8m_main "-march=armv8-m.main -mthumb" __ARM_ARCH_8M_MAIN__ } {
 eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef ] {
proc check_effective_target_arm_arch_FUNC_ok { } {
if { [ string match "*-marm*" "FLAG" ] &&



I think if you're going to do this you need to also check that changing
the ABI in this way isn't incompatible with other aspects of how the
user has invoked dejagnu.


The reason this patch was made is that without it dg-require-effective-target 
arm_arch_v8m_base_ok evaluates to true for an arm-none-linux-gnueabihf toolchain 
but then any testcase containing a function for such a target (such as the 
atomic-op-* in gcc.target/arm) will error out because ARMv8-M Baseline does not 
support hard float ABI.
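To make the failure mode concrete, a testcase for this target might look like the sketch below. The directive names follow the usage comment in target-supports.exp; the function add_one is made up for illustration and does not come from the testsuite.

```c
/* Hypothetical testcase sketch: the directives follow the usage comment in
   target-supports.exp, the function itself is invented for illustration.  */
/* { dg-do compile } */
/* { dg-require-effective-target arm_arch_v8m_base_ok } */
/* { dg-add-options arm_arch_v8m_base } */

/* As described above, a function definition is what triggers the error
   when the toolchain defaults to the hard float ABI.  */
int
add_one (int x)
{
  return x + 1;
}
```

With -mfloat-abi=soft included in the added options, such a test would presumably either be skipped cleanly or compile, instead of erroring out.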


I see 2 ways to fix this:

1) the approach taken in this patch, ie saying that to select ARMv8-M baseline 
architecture you need the right -march, -mthumb but also the right float ABI.


Note that the comment at the top of that procedure says:
# Creates a series of routines that return 1 if the given architecture
# can be selected and a routine to give the flags

[PATCH][GIMPLE FE] Revert some unnecessary changes, get a .gimple dump

2016-10-28 Thread Richard Biener


Tested on x86_64-unknown-linux-gnu.

Richard.

2016-10-28  Richard Biener  

c/
* gimple-parser.c: Include tree-dump.h.
(c_parser_parse_gimple_body): Do not claim PROP_gimple_lcf
or PROP_gimple_leh.  Dump to .gimple dump file.
(c_parser_parse_ssa_name): Set DECL_GIMPLE_REG_P for vector
and complex vars we create SSA names for.

* gimplify.c (gimplify_function_tree): Revert unnecessary changes.
* tree-cfg.c (dump_function_to_file): Adjust condition when to
dump GIMPLE.

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index 392f6b0..7f8d948 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssanames.h"
 #include "gimple-ssa.h"
 #include "tree-dfa.h"
+#include "tree-dump.h"
 
 
 /* Gimple parsing functions.  */
@@ -114,10 +115,11 @@ c_parser_parse_gimple_body (c_parser *parser)
   /* While we have SSA names in the IL we do not have a CFG built yet
  and PHIs are represented using a PHI internal function.  We do
  have lowered control flow and exception handling (well, we do not
- have parser support for EH yet).  */
-  cfun->curr_properties = PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh;
+ have parser support for EH yet).  But as we still have BINDs
+ we have to go through lowering again.  */
+  cfun->curr_properties = PROP_gimple_any;
 
-  return;
+  dump_function (TDI_generic, current_function_decl);
 }
 
 /* Parse a compound statement in gimple function body.
@@ -696,6 +698,9 @@ c_parser_parse_ssa_name (c_parser *parser,
  c_parser_error (parser, "base variable or SSA name not declared");
  return error_mark_node;
}
+ if (VECTOR_TYPE_P (TREE_TYPE (parent))
+ || TREE_CODE (TREE_TYPE (parent)) == COMPLEX_TYPE)
+   DECL_GIMPLE_REG_P (parent) = 1;
  name = make_ssa_name_fn (cfun, parent,
   gimple_build_nop (), version);
}
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index efae537..5da1725 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -12337,22 +12337,7 @@ gimplify_function_tree (tree fndecl)
   && !needs_to_live_in_memory (ret))
 DECL_GIMPLE_REG_P (ret) = 1;
 
-  if (!cfun->gimple_body)
-bind = gimplify_body (fndecl, true);
-  else
-{
-  gimple_seq seq;
-  gimple *outer_stmt;
-  seq = cfun->gimple_body;
-  outer_stmt = gimple_seq_first_stmt (seq);
-  if (gimple_code (outer_stmt) == GIMPLE_BIND
- && gimple_seq_first (seq) == gimple_seq_last (seq))
-   bind = as_a  (outer_stmt);
-  else
-   bind = gimple_build_bind (NULL_TREE, seq, NULL);
-
-  DECL_SAVED_TREE (fndecl) = NULL_TREE;
-}
+  bind = gimplify_body (fndecl, true);
 
   /* The tree body of the function is no longer needed, replace it
  with the new GIMPLE body.  */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 7d7763d..e99e102 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -7666,7 +7666,7 @@ dump_function_to_file (tree fndecl, FILE *file, int flags)
 
   fprintf (file, "}\n");
 }
-  else if (DECL_SAVED_TREE (fndecl) == NULL)
+  else if (fun->curr_properties & PROP_gimple_any)
 {
   /* The function is now in GIMPLE form but the CFG has not been
 built yet.  Emit the single sequence of GIMPLE statements

Re: [PATCH] Fix COMPONENT_REF expansion (PR rtl-optimization/77919)

2016-10-28 Thread Jakub Jelinek

On Fri, Oct 28, 2016 at 10:52:34AM +0200, Richard Biener wrote:
> > I've already committed the original patch based on Eric's review, but
> > managed to come up with another testcase that still ICEs (one with two
> > different complex modes).  Is the following ok for trunk if it passes
> > bootstrap/regtest?
> 
> As we're dealing with memory isn't GET_MODE_SIZE the correct thing to
> use?

GET_MODE_PRECISION is what the VIEW_CONVERT_EXPR case tests:
  /* If the input and output modes are both the same, we are done.  */
  if (mode == GET_MODE (op0))
;
  /* If neither mode is BLKmode, and both modes are the same size
 then we can use gen_lowpart.  */
  else if (mode != BLKmode && GET_MODE (op0) != BLKmode
   && (GET_MODE_PRECISION (mode)
   == GET_MODE_PRECISION (GET_MODE (op0)))
   && !COMPLEX_MODE_P (GET_MODE (op0)))  
{
  if (GET_CODE (op0) == SUBREG)
op0 = force_reg (GET_MODE (op0), op0);
  temp = gen_lowpart_common (mode, op0);  
  if (temp)
op0 = temp;
  else
{
  if (!REG_P (op0) && !MEM_P (op0))
op0 = force_reg (GET_MODE (op0), op0);
  op0 = gen_lowpart (mode, op0);
}
}
The CONCAT operands can be a MEM (just likely won't be both MEM or adjacent
MEM).

BTW, the VCE part could also handle 2 different complex modes.

Jakub

Re: [PATCH] Fix COMPONENT_REF expansion (PR rtl-optimization/77919)

2016-10-28 Thread Richard Biener

On Fri, 28 Oct 2016, Jakub Jelinek wrote:

> On Fri, Oct 28, 2016 at 01:32:22AM -0600, Jeff Law wrote:
> > >I think so.  I'll leave the rest to people more familiar with RTL
> > >expansion -- generally I thought the callers of expand() have to deal
> > >with expansions that return a different mode?
> > You generally have to deal with expansions that return the object in a new
> > pseudo instead of the one you asked for -- so the caller has to test for
> > that and emit a copy when it happens.
> > 
> > I don't offhand recall cases where we have to deal with getting a result in
> > a different mode than was asked.  But given the history of the expanders, I
> > wouldn't be surprised if there's oddball cases where that can happen.
> 
> I've already committed the original patch based on Eric's review, but
> managed to come up with another testcase that still ICEs (one with two
> different complex modes).  Is the following ok for trunk if it passes
> bootstrap/regtest?

As we're dealing with memory isn't GET_MODE_SIZE the correct thing to
use?

> 2016-10-28  Jakub Jelinek  
> 
>   PR rtl-optimization/77919
>   * expr.c (expand_expr_real_1) : Only avoid forcing
>   into memory if both modes are complex and their inner modes have the
>   same precision.  If the two modes are different complex modes, convert
>   each part separately and generate a new CONCAT.
> 
>   * g++.dg/torture/pr77919-2.C: New test.
> 
> --- gcc/expr.c.jj 2016-10-28 10:35:14.753234774 +0200
> +++ gcc/expr.c2016-10-28 10:35:28.760057716 +0200
> @@ -10422,10 +10422,35 @@ expand_expr_real_1 (tree exp, rtx target
> {
>   if (bitpos == 0
>   && bitsize == GET_MODE_BITSIZE (GET_MODE (op0))
> - && COMPLEX_MODE_P (mode1))
> + && COMPLEX_MODE_P (mode1)
> + && COMPLEX_MODE_P (GET_MODE (op0))
> + && (GET_MODE_PRECISION (GET_MODE_INNER (mode1))
> + == GET_MODE_PRECISION (GET_MODE_INNER (GET_MODE (op0)))))
> {
>   if (reversep)
> op0 = flip_storage_order (GET_MODE (op0), op0);
> + if (mode1 != GET_MODE (op0))
> +   {
> + rtx parts[2];
> + for (int i = 0; i < 2; i++)
> +   {
> + rtx op = read_complex_part (op0, i != 0);
> + if (GET_CODE (op) == SUBREG)
> +   op = force_reg (GET_MODE (op), op);
> + rtx temp = gen_lowpart_common (GET_MODE_INNER (mode1),
> +op);
> + if (temp)
> +   op = temp;
> + else
> +   {
> + if (!REG_P (op) && !MEM_P (op))
> +   op = force_reg (GET_MODE (op), op);
> + op = gen_lowpart (GET_MODE_INNER (mode1), op);
> +   }
> + parts[i] = op;
> +   }
> + op0 = gen_rtx_CONCAT (mode1, parts[0], parts[1]);
> +   }
>   return op0;
> }
>   if (bitpos == 0
> --- gcc/testsuite/g++.dg/torture/pr77919-2.C.jj   2016-10-28 10:35:49.294798140 +0200
> +++ gcc/testsuite/g++.dg/torture/pr77919-2.C  2016-10-28 10:29:38.0 +0200
> @@ -0,0 +1,10 @@
> +// PR rtl-optimization/77919
> +// { dg-do compile }
> +
> +typedef _Complex long long B;
> +struct A { A (double) {} _Complex double i; };
> +typedef struct { B b; } C;
> +struct D { D (const B &x) : b (x) {} B b; };
> +static inline B foo (const double *x) { C *a; a = (C *) x; return a->b; }
> +static inline D baz (const A &x) { return foo ((double *) &x); }
> +D b = baz (0);
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix COMPONENT_REF expansion (PR rtl-optimization/77919)

2016-10-28 Thread Jakub Jelinek

On Fri, Oct 28, 2016 at 01:32:22AM -0600, Jeff Law wrote:
> >I think so.  I'll leave the rest to people more familiar with RTL
> >expansion -- generally I thought the callers of expand() have to deal
> >with expansions that return a different mode?
> You generally have to deal with expansions that return the object in a new
> pseudo instead of the one you asked for -- so the caller has to test for
> that and emit a copy when it happens.
> 
> I don't offhand recall cases where we have to deal with getting a result in
> a different mode than was asked.  But given the history of the expanders, I
> wouldn't be surprised if there's oddball cases where that can happen.

I've already committed the original patch based on Eric's review, but
managed to come up with another testcase that still ICEs (one with two
different complex modes).  Is the following ok for trunk if it passes
bootstrap/regtest?

2016-10-28  Jakub Jelinek  

PR rtl-optimization/77919
* expr.c (expand_expr_real_1) : Only avoid forcing
into memory if both modes are complex and their inner modes have the
same precision.  If the two modes are different complex modes, convert
each part separately and generate a new CONCAT.

* g++.dg/torture/pr77919-2.C: New test.

--- gcc/expr.c.jj   2016-10-28 10:35:14.753234774 +0200
+++ gcc/expr.c  2016-10-28 10:35:28.760057716 +0200
@@ -10422,10 +10422,35 @@ expand_expr_real_1 (tree exp, rtx target
  {
if (bitpos == 0
&& bitsize == GET_MODE_BITSIZE (GET_MODE (op0))
-   && COMPLEX_MODE_P (mode1))
+   && COMPLEX_MODE_P (mode1)
+   && COMPLEX_MODE_P (GET_MODE (op0))
+   && (GET_MODE_PRECISION (GET_MODE_INNER (mode1))
+   == GET_MODE_PRECISION (GET_MODE_INNER (GET_MODE (op0)))))
  {
if (reversep)
  op0 = flip_storage_order (GET_MODE (op0), op0);
+   if (mode1 != GET_MODE (op0))
+ {
+   rtx parts[2];
+   for (int i = 0; i < 2; i++)
+ {
+   rtx op = read_complex_part (op0, i != 0);
+   if (GET_CODE (op) == SUBREG)
+ op = force_reg (GET_MODE (op), op);
+   rtx temp = gen_lowpart_common (GET_MODE_INNER (mode1),
+  op);
+   if (temp)
+ op = temp;
+   else
+ {
+   if (!REG_P (op) && !MEM_P (op))
+ op = force_reg (GET_MODE (op), op);
+   op = gen_lowpart (GET_MODE_INNER (mode1), op);
+ }
+   parts[i] = op;
+ }
+   op0 = gen_rtx_CONCAT (mode1, parts[0], parts[1]);
+ }
return op0;
  }
if (bitpos == 0
--- gcc/testsuite/g++.dg/torture/pr77919-2.C.jj 2016-10-28 10:35:49.294798140 +0200
+++ gcc/testsuite/g++.dg/torture/pr77919-2.C 2016-10-28 10:29:38.0 +0200
@@ -0,0 +1,10 @@
+// PR rtl-optimization/77919
+// { dg-do compile }
+
+typedef _Complex long long B;
+struct A { A (double) {} _Complex double i; };
+typedef struct { B b; } C;
+struct D { D (const B &x) : b (x) {} B b; };
+static inline B foo (const double *x) { C *a; a = (C *) x; return a->b; }
+static inline D baz (const A &x) { return foo ((double *) &x); }
+D b = baz (0);


Jakub

Re: [PR debug/77773] segfault when compiling __simd64_float16_t with -g

2016-10-28 Thread Richard Biener

On Thu, Oct 27, 2016 at 6:14 PM, Aldy Hernandez  wrote:
> On 10/27/2016 12:35 AM, Richard Biener wrote:
>>
>> On Wed, Oct 26, 2016 at 9:17 PM, Aldy Hernandez  wrote:
>>>
>>> The following one-liner segfaults on arm-eabi when compiled with
>>> -mfloat-abi=hard -g:
>>>
>>> __simd64_float16_t usingit;
>>>
>>> The problem is that the pretty printer (in simple_type_specifier()) is
>>> dereferencing a NULL result from c_common_type_for_mode:
>>>
>>>   int prec = TYPE_PRECISION (t);
>>>   if (ALL_FIXED_POINT_MODE_P (TYPE_MODE (t)))
>>> t = c_common_type_for_mode (TYPE_MODE (t), TYPE_SATURATING
>>> (t));
>>>   else
>>> t = c_common_type_for_mode (TYPE_MODE (t), TYPE_UNSIGNED
>>> (t));
>>>   if (TYPE_NAME (t))
>>>
>>> The type in question is:
>>>
>>> 
>>>
>>> which corresponds to HFmode and which AFAICT, does not have a type by
>>> design.
>>>
>>> I see that other uses of *type_for_mode() throughout the compiler check
>>> the
>>> result for NULL, so perhaps we should do the same here.
>>>
>>> The attached patch fixes the problem.
>>>
>>> OK for trunk?
>>
>>
>> Your added assert shows another possible issue - can you fix this by
>> assigning
>> the result of c_common_type_for_mode to a new variable, like common_t and
>> use that for the TYPE_NAME (...) case?  I think this was what was
>> intended.
>
>
> Certainly.
>
> OK pending tests?

Ok.

Thanks,
Richard.
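The suggestion above can be sketched in a self-contained toy form. This is only an illustration under stated assumptions: struct toy_type, toy_type_for_mode and toy_type_name are hypothetical stand-ins, not GCC's tree nodes or the real c_common_type_for_mode.

```c
#include <stddef.h>

/* Toy stand-in for a type node; not GCC's real tree structure.  */
struct toy_type
{
  const char *name;
  int mode;
};

/* Hypothetical lookup: returns NULL for modes that have no matching
   type, as c_common_type_for_mode is reported to do for HFmode.  */
const struct toy_type *
toy_type_for_mode (int mode)
{
  static const struct toy_type int_type = { "int", 1 };
  return mode == 1 ? &int_type : NULL;
}

/* Keep the original type in T and store the lookup result in COMMON_T,
   so a NULL result never replaces the type we started with.  */
const char *
toy_type_name (const struct toy_type *t)
{
  const struct toy_type *common_t = toy_type_for_mode (t->mode);
  if (common_t && common_t->name)
    return common_t->name;
  return t->name;
}
```

Using a separate variable means the fallback path still has the original type available when the lookup yields nothing.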

> Aldy

Re: [PATCH] Fix and testcases for pr72747

2016-10-28 Thread Richard Biener

On Thu, Oct 27, 2016 at 5:37 PM, Will Schmidt  wrote:
> Hi,
>
> Per PR72747, A statement such as "v = vec_splats (1);" correctly
> initializes a vector.  However, a statement such as "v[1] = v[0] =
> vec_splats (1);" initializes both v[1] and v[0] to random garbage.
>
> It has been determined that this is occurring because we did not emit
> the actual initialization statement before our final exit from
> gimplify_init_constructor, at which time we lose the expression when we
> assign *expr_p to either NULL or object.  This problem affected both constant
> and non-constant initializers.  Corrected this by moving the logic to
> emit the statement up earlier within the if/else logic.
>
> Bootstrapped and make check ran without regressions on
> powerpc64le-unknown-linux-gnu.
>
> OK for trunk?

Ok.

Richard.
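For readers without a PowerPC toolchain, the cascaded-assignment pattern from the PR can be approximated with the generic vector extension. This is a sketch under that assumption: it uses a GCC/Clang vector_size type instead of AltiVec's vec_splats, and the names v4si, splat_both and all_lanes_equal are invented for illustration.

```c
/* Assumption: GCC or Clang generic vector extension, not AltiVec.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Chained assignment from a vector constructor: both *a and *b must
   end up holding the splat value.  */
void
splat_both (v4si *a, v4si *b, int x)
{
  *a = *b = (v4si) { x, x, x, x };
}

/* Returns 1 if every lane of *V equals X, 0 otherwise.  */
int
all_lanes_equal (const v4si *v, int x)
{
  for (int i = 0; i < 4; i++)
    if ((*v)[i] != x)
      return 0;
  return 1;
}
```

The inner assignment is the one whose value is wanted by the outer one, which is the situation the patch description refers to.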

> Thanks,
> -Will
>
> gcc:
> 2016-10-26  Will Schmidt 
>
> PR middle-end/72747
> * gimplify.c (gimplify_init_constructor): Move emit of constructor
> assignment to earlier in the if/else logic.
>
> testsuite:
> 2016-10-26  Will Schmidt 
>
> PR middle-end/72747
> * c-c++-common/pr72747-1.c: New test.
> * c-c++-common/pr72747-2.c: Likewise.
>
> Index: gcc/gimplify.c
> ===
> --- gcc/gimplify.c  (revision 241600)
> +++ gcc/gimplify.c  (working copy)
> @@ -4730,24 +4730,23 @@
>
>if (ret == GS_ERROR)
>  return GS_ERROR;
> -  else if (want_value)
> +  /* If we have gimplified both sides of the initializer but have
> + not emitted an assignment, do so now.  */
> +  if (*expr_p)
>  {
> +  tree lhs = TREE_OPERAND (*expr_p, 0);
> +  tree rhs = TREE_OPERAND (*expr_p, 1);
> +  gassign *init = gimple_build_assign (lhs, rhs);
> +  gimplify_seq_add_stmt (pre_p, init);
> +}
> +  if (want_value)
> +{
>*expr_p = object;
>return GS_OK;
>  }
>else
>  {
> -  /* If we have gimplified both sides of the initializer but have
> -not emitted an assignment, do so now.  */
> -  if (*expr_p)
> -   {
> - tree lhs = TREE_OPERAND (*expr_p, 0);
> - tree rhs = TREE_OPERAND (*expr_p, 1);
> - gassign *init = gimple_build_assign (lhs, rhs);
> - gimplify_seq_add_stmt (pre_p, init);
> - *expr_p = NULL;
> -   }
> -
> +  *expr_p = NULL;
>return GS_ALL_DONE;
>  }
>  }
> Index: gcc/testsuite/c-c++-common/pr72747-1.c
> ===
> --- gcc/testsuite/c-c++-common/pr72747-1.c  (revision 0)
> +++ gcc/testsuite/c-c++-common/pr72747-1.c  (working copy)
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec -fdump-tree-gimple" } */
> +
> +/* PR 72747: Test that cascaded definition is happening for constant vectors. */
> +
> +#include 
> +
> +int main (int argc, char *argv[])
> +{
> +   __vector int v1,v2;
> +   v1 = v2 = vec_splats ((int) 42);
> +   return 0;
> +}
> +/* { dg-final { scan-tree-dump-times " v2 = { 42, 42, 42, 42 }" 1 "gimple" } } */
> +
> Index: gcc/testsuite/c-c++-common/pr72747-2.c
> ===
> --- gcc/testsuite/c-c++-common/pr72747-2.c  (revision 0)
> +++ gcc/testsuite/c-c++-common/pr72747-2.c  (working copy)
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-c -maltivec -fdump-tree-gimple" } */
> +
> +/* PR 72747: test that cascaded definition is happening for non constants. */
> +
> +void foo ()
> +{
> +  extern int i;
> +  __vector int v,w;
> +v = w = (vector int) { i };
> +}
> +
> +int main (int argc, char *argv[])
> +{
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-times " w = {i.0_1}" 1 "gimple" } } */
>
>

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-28 Thread Richard Biener

On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
> On 10/27/2016 03:35 PM, Richard Biener wrote:
>> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
>>> Running simple test-case w/o the proper header file causes ICE:
>>> strncmp ("a", "b", -1);
>>>
>>> 0xe74462 tree_to_uhwi(tree_node const*)
>>> ../../gcc/tree.c:7324
>>> 0x90a23f host_size_t_cst_p
>>> ../../gcc/fold-const-call.c:63
>>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, tree_node*, 
>>> tree_node*)
>>> ../../gcc/fold-const-call.c:1512
>>> 0x787b01 fold_builtin_3
>>> ../../gcc/builtins.c:8385
>>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, bool)
>>> ../../gcc/builtins.c:8465
>>> 0x9052b1 fold(tree_node*)
>>> ../../gcc/fold-const.c:11919
>>> 0x6de2bb c_fully_fold_internal
>>> ../../gcc/c/c-fold.c:185
>>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
>>> ../../gcc/c/c-fold.c:90
>>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
>>> ../../gcc/c/c-typeck.c:10369
>>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
>>> ../../gcc/c/c-typeck.c:10414
>>> 0x6cb578 c_parser_statement_after_labels
>>> ../../gcc/c/c-parser.c:5430
>>> 0x6cd333 c_parser_compound_statement_nostart
>>> ../../gcc/c/c-parser.c:4944
>>> 0x6cdbde c_parser_compound_statement
>>> ../../gcc/c/c-parser.c:4777
>>> 0x6c93ac c_parser_declaration_or_fndef
>>> ../../gcc/c/c-parser.c:2176
>>> 0x6d51ab c_parser_external_declaration
>>> ../../gcc/c/c-parser.c:1574
>>> 0x6d5c09 c_parser_translation_unit
>>> ../../gcc/c/c-parser.c:1454
>>> 0x6d5c09 c_parse_file()
>>> ../../gcc/c/c-parser.c:18173
>>> 0x72ffd2 c_common_parse_file()
>>> ../../gcc/c-family/c-opts.c:1087
>>>
>>> Following patch improves the host_size_t_cst_p predicate.
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>
>>> Ready to be installed?
>>
>> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
>> CHAR_BIT test is now redundant.
>>
>> OTOH it was probably desired to allow -1 here?  A little looking back
>> in time should tell.
>
> Ok, it started with r229922, where it was changed from:
>
>   if (tree_fits_uhwi_p (len) && p1 && p2)
> {
>   const int i = strncmp (p1, p2, tree_to_uhwi (len));
> ...
>
> to current version:
>
> case CFN_BUILT_IN_STRNCMP:
>   {
> bool const_size_p = host_size_t_cst_p (arg2, );
>
> Thus I'm suggesting to change it back to that.
>
> Ready to be installed?

Let's ask Richard.

Richard.
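As background for the -1 question: once the <string.h> prototype is in scope, the length argument is converted to size_t, so -1 becomes the largest size_t value, and strncmp still stops at the terminating NUL. The sketch below shows only that library-level behaviour; the helper names are hypothetical and nothing here reproduces the folding code in fold-const-call.c.

```c
#include <stddef.h>
#include <string.h>

/* Converting a negative bound to size_t wraps around; -1 becomes the
   maximum value representable in size_t.  */
size_t
bound_as_size_t (long n)
{
  return (size_t) n;
}

/* strncmp compares at most N characters and never looks past a
   terminating NUL, so a very large bound is harmless for C strings.  */
int
compare_with_bound (const char *a, const char *b, long n)
{
  return strncmp (a, b, bound_as_size_t (n));
}
```

Without the header the call presumably reaches the folder with a plain negative int, which is the case the predicate has to reject or accept deliberately.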

> Thanks,
> Martin
>
>>
>> Richard.
>>
>>> Martin
>

[PATCH][GIMPLE FE] Adjust __GIMPLE parsing

2016-10-28 Thread Richard Biener


The following handles __GIMPLE as a declspec, which better follows other
similar handling.  It also reverts the C FE parts back to rely
on finish_function (which adjusts things like visibility, something we'll
need in the end).  And it makes the __GIMPLE specs (currently only
startswith) optional.

Tested on x86_64-unknown-linux-gnu.

Given the similarity to attributes I'm not sure if we want to try
re-using the attribute list parsing as well as the storage during
declspec processing (I'd have to experiment with that).  At least
once we add more specs to __GIMPLE this might make things simpler.

Richard.

2016-10-28  Richard Biener  

c/
* c-parser.c (c_parser_declaration_or_fndef): Move __GIMPLE
parsing to declspecs parsing.  Rely on finish_function.
(c_parser_declspecs): Handle RID_GIMPLE.
* c-tree.h (enum c_declspec_word): Add cdw_gimple.
(struct c_declspecs): Add gimple_pass member and gimple_p flag.
* gimple-parser.c (c_parser_gimple_pass_list): Adjust interface,
make specs optional and consume closing paren.
* gimple-parser.h (c_parser_gimple_pass_list): Adjust.

testsuite/
* gcc.dg/gimplefe-1.c: Drop optional specs for __GIMPLE.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 413d8a7..2997c83 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -436,7 +436,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__volatile",  RID_VOLATILE,   0 },
   { "__volatile__",RID_VOLATILE,   0 },
   { "__GIMPLE",RID_GIMPLE, D_CONLY },
-  { "__PHI",   RID_PHI,D_CONLY},
+  { "__PHI",   RID_PHI,D_CONLY },
   { "alignas", RID_ALIGNAS,D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "alignof", RID_ALIGNOF,D_CXXONLY | D_CXX11 | D_CXXWARN },
   { "asm", RID_ASM,D_ASM },
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 63385fc..d21b8e9 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -159,7 +159,7 @@ struct GTY(()) c_parser {
   /* The look-ahead tokens.  */
   c_token * GTY((skip)) tokens;
   /* Buffer for look-ahead tokens.  */
-  c_token GTY(()) tokens_buf[4];
+  c_token tokens_buf[4];
   /* How many look-ahead tokens are available (0 - 4, or
  more if parsing from pre-lexed tokens).  */
   unsigned int tokens_avail;
@@ -1563,8 +1563,6 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
   tree all_prefix_attrs;
   bool diagnosed_no_specs = false;
   location_t here = c_parser_peek_token (parser)->location;
-  bool gimple_body_p = false;
-  char *pass = NULL;
 
   if (static_assert_ok
   && c_parser_next_token_is_keyword (parser, RID_STATIC_ASSERT))
@@ -1650,19 +1648,6 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
   return;
 }
 
-  if (c_parser_next_token_is (parser, CPP_KEYWORD))
-{
-  c_token *kw_token = c_parser_peek_token (parser);
-  if (kw_token->keyword == RID_GIMPLE)
-   {
- gimple_body_p = true;
- c_parser_consume_token (parser);
- c_parser_gimple_pass_list (parser, );
- c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
-"expected %<)%>");
-   }
-}
-
   finish_declspecs (specs);
   bool auto_type_p = specs->typespec_word == cts_auto_type;
   if (c_parser_next_token_is (parser, CPP_SEMICOLON))
@@ -1793,7 +1778,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
   struct c_declarator *declarator;
   bool dummy = false;
   timevar_id_t tv;
-  tree fnbody;
+  tree fnbody = NULL_TREE;
   /* Declaring either one or more declarators (in which case we
 should diagnose if there were no declaration specifiers) or a
 function definition (in which case the diagnostic for
@@ -2076,7 +2061,6 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
c_parser_declaration_or_fndef (parser, false, false, false,
   true, false, NULL, vNULL);
   store_parm_decls ();
-
   if (omp_declare_simd_clauses.exists ()
  || !vec_safe_is_empty (parser->cilk_simd_fn_tokens))
c_finish_omp_declare_simd (parser, current_function_decl, NULL_TREE,
@@ -2086,23 +2070,23 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
   DECL_STRUCT_FUNCTION (current_function_decl)->function_start_locus
= c_parser_peek_token (parser)->location;
 
-  if (gimple_body_p && flag_gimple)
+  /* If the definition was marked with __GIMPLE then parse the
+ function body as GIMPLE.  */
+  if (specs->gimple_p)
{
- cfun->pass_startwith = pass;
+ cfun->pass_startwith = specs->gimple_pass;
  bool saved = in_late_binary_op;
  in_late_binary_op = true;
  c_parser_parse_gimple_body (parser);
  in_late_binary_op = saved;
-

Re: [arm.c] Use VAR_P

2016-10-28 Thread Kyrill Tkachov



On 28/10/16 06:00, Prathamesh Kulkarni wrote:

Hi,
This patch replaces TREE_CODE (x) == VAR_DECL with VAR_P (x) in arm.c.
Bootstrap+tested on arm-linux-gnueabihf.
OK to commit ?


Ok (I would consider this obvious).
Thanks,
Kyrill


Thanks,
Prathamesh

Re: RFC [2/3] divmod transform v2 - override expand_divmod_libfunc for ARM port

2016-10-28 Thread Kyrill Tkachov



On 27/10/16 14:31, Prathamesh Kulkarni wrote:

On 26 October 2016 at 18:51, Kyrill Tkachov  wrote:

On 16/10/16 07:00, Prathamesh Kulkarni wrote:

Hi,
This patch overrides expand_divmod_libfunc hook for ARM port.
I separated the SImode tests into a separate file from the DImode tests
because certain ARM configs (cortex-a15) have a hardware div insn for
SImode but not for DImode, and for that config we want the SImode tests to
be disabled but not the DImode tests. The patch therefore has two
effective-target checks: divmod and divmod_simode.
Cross-tested on arm*-*-*.
OK to commit ?


Looks ok to me, the implementation of the hook is straightforward though
I have a question.
arm_expand_divmod_libfunc is not supposed to ever be called for SImode
TARGET_IDIV.
It asserts it rather than just failing the expansion in some way.
How does the midend know not to call TARGET_EXPAND_DIVMOD_LIBFUNC in that
case, does it
just check if the relevant sdiv optab is not available?

Yes. The divmod transform isn't enabled if target supports hardware
div in the same
or wider mode even if divmod libfunc is available for the given mode.
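For context, the source pattern a divmod transform looks for is a quotient and a remainder computed from the same operands. The sketch below shows only that pattern; div_and_mod is a hypothetical name, and whether a given target merges the two operations into one library call depends on the conditions described above.

```c
/* Quotient and remainder of the same operands: the pair of operations
   a divmod transform may merge into a single library call when the
   target has no hardware divide for the mode.  */
void
div_and_mod (int a, int b, int *quot, int *rem)
{
  *quot = a / b;
  *rem = a % b;
}
```

On the ARM EABI a helper such as __aeabi_idivmod returns both results at once, which is why computing them together can be cheaper than two separate calls.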

If so, this is ok for trunk assuming a bootstrap and test run on
arm-none-linux-gnueabihf
shows no issues. Would be good to try one for --with-cpu=cortex-a15 and one
with a !TARGET_IDIV
target, say --with-cpu=cortex-a9.

Bootstrap+tested on arm-linux-gnueabihf --with-cpu=cortex-a15 and
--with-cpu=cortex-a9.
Also cross-tested on arm*-*-*.
OK to commit ?


Yes, thanks.
Kyrill


Thanks,
Prathamesh

Sorry for the delay.

Thanks,
Kyrill



Thanks,
Prathamesh

Re: [PATCH] Fix COMPONENT_REF expansion (PR rtl-optimization/77919)

2016-10-28 Thread Jeff Law


On 10/28/2016 01:25 AM, Richard Biener wrote:

On Thu, 27 Oct 2016, Jakub Jelinek wrote:


Hi!

The following testcase ICEs on x86_64-linux with -O1, the problem is
that we expand assignment from COMPONENT_REF of MEM_REF into a V4SImode
SSA_NAME.  The MEM_REF has non-addressable DCmode var inside of it, and
type of a struct containing a single V4SImode element.

The bug seems to be that if the op0 (i.e. get_inner_reference expanded)
is a CONCAT and we want a reference that covers all bits of the CONCAT,
we just short path it and return it immediately, rather than trying
to convert it to the requested mode.

I've bootstrapped/regtested on x86_64-linux and i686-linux the following
patch, which takes the short path only if we want a complex mode.
The only place it makes a difference in both bootstraps/regtests was this
new testcase.
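At source level, the testcase reads the storage of a _Complex double as a 16-byte integer vector. A rough portable analogue is sketched below; it swaps the pointer cast used in the testcase for memcpy, assumes a 16-byte _Complex double as on x86_64, and the names v4si and complex_bits_to_v4si are invented for illustration.

```c
#include <string.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Assumption: _Complex double occupies 16 bytes, as on x86_64.  */
_Static_assert (sizeof (_Complex double) == sizeof (v4si),
                "sketch assumes a 16-byte _Complex double");

/* Copy the object representation of *Z into *OUT, covering all bits of
   the complex value, which is the full-width reference discussed above.  */
void
complex_bits_to_v4si (const _Complex double *z, v4si *out)
{
  memcpy (out, z, sizeof *out);
}
```

The requested mode here is a vector mode rather than a complex one, so the expander cannot simply hand back the two-part complex value unchanged.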

Though, perhaps COMPLEX_MODE_P (mode1) is also wrong, if mode1 isn't
GET_MODE (op0), then we still will return something with unexpected mode
(e.g. DCmode vs. CDImode); I wonder if for such mismatches we shouldn't
just force_reg (convert_modes ()) each CONCAT operand separately and
create a new CONCAT.  Do we have a guarantee that COMPLEX_MODE_P (GET_MODE 
(op0))
if op0 is CONCAT?


I think so.  I'll leave the rest to people more familiar with RTL
expansion -- generally I thought the callers of expand() have to deal
with expansions that return a different mode?
You generally have to deal with expansions that return the object in a 
new pseudo instead of the one you asked for -- so the caller has to test 
for that and emit a copy when it happens.


I don't offhand recall cases where we have to deal with getting a result 
in a different mode than was asked.  But given the history of the 
expanders, I wouldn't be surprised if there's oddball cases where that 
can happen.


jeff

Re: [PATCH] Fix COMPONENT_REF expansion (PR rtl-optimization/77919)

2016-10-28 Thread Richard Biener

On Thu, 27 Oct 2016, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on x86_64-linux with -O1, the problem is
> that we expand assignment from COMPONENT_REF of MEM_REF into a V4SImode
> SSA_NAME.  The MEM_REF has non-addressable DCmode var inside of it, and
> type of a struct containing a single V4SImode element.
> 
> The bug seems to be that if the op0 (i.e. get_inner_reference expanded)
> is a CONCAT and we want a reference that covers all bits of the CONCAT,
> we just short path it and return it immediately, rather than trying
> to convert it to the requested mode.
> 
> I've bootstrapped/regtested on x86_64-linux and i686-linux the following
> patch, which takes the short path only if we want a complex mode.
> The only place it makes a difference in both bootstraps/regtests was this
> new testcase.
> 
> Though, perhaps COMPLEX_MODE_P (mode1) is also wrong, if mode1 isn't
> GET_MODE (op0), then we still will return something with unexpected mode
> (e.g. DCmode vs. CDImode); I wonder if for such mismatches we shouldn't
> just force_reg (convert_modes ()) each CONCAT operand separately and
> create a new CONCAT.  Do we have a guarantee that COMPLEX_MODE_P (GET_MODE 
> (op0))
> if op0 is CONCAT?

I think so.  I'll leave the rest to people more familiar with RTL 
expansion -- generally I thought the callers of expand() have to deal
with expansions that return a different mode?

Richard.

> 2016-10-27  Jakub Jelinek  
> 
>   PR rtl-optimization/77919
>   * expr.c (expand_expr_real_1) : Force CONCAT into
>   MEM if mode1 is not a complex mode.
> 
>   * g++.dg/torture/pr77919.C: New test.
> 
> --- gcc/expr.c.jj 2016-10-27 20:50:22.699586175 +0200
> +++ gcc/expr.c2016-10-27 21:15:30.146309091 +0200
> @@ -10421,7 +10421,8 @@ expand_expr_real_1 (tree exp, rtx target
>   if (GET_CODE (op0) == CONCAT && !must_force_mem)
> {
>   if (bitpos == 0
> - && bitsize == GET_MODE_BITSIZE (GET_MODE (op0)))
> + && bitsize == GET_MODE_BITSIZE (GET_MODE (op0))
> + && COMPLEX_MODE_P (mode1))
> {
>   if (reversep)
> op0 = flip_storage_order (GET_MODE (op0), op0);
> --- gcc/testsuite/g++.dg/torture/pr77919.C.jj 2016-10-27 21:15:19.883440139 +0200
> +++ gcc/testsuite/g++.dg/torture/pr77919.C 2016-10-27 21:16:01.694906242 +0200
> @@ -0,0 +1,11 @@
> +// PR rtl-optimization/77919
> +// { dg-do compile }
> +// { dg-additional-options "-Wno-psabi" }
> +
> +struct A { A (double) {} _Complex double i; };
> +typedef int __attribute__ ((vector_size (16))) B;
> +typedef struct { B b; } C;
> +struct D { D (const B &x) : b (x) {} B b; };
> +static inline B foo (const double *x) { C *a; a = (C *) x; return a->b; }
> +static inline D baz (const A &x) { return foo ((double *) &x); }
> +D b = baz (0);
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix COMPONENT_REF expansion (PR rtl-optimization/77919)

2016-10-28 Thread Eric Botcazou

> Though, perhaps COMPLEX_MODE_P (mode1) is also wrong, if mode1 isn't
> GET_MODE (op0), then we still will return something with unexpected mode
> (e.g. DCmode vs. CDImode); I wonder if for such mismatches we shouldn't
> just force_reg (convert_modes ()) each CONCAT operand separately and
> create a new CONCAT.  Do we have a guarantee that COMPLEX_MODE_P (GET_MODE
> (op0)) if op0 is CONCAT?

In practice, at this point of the pipeline, I'd think so.
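
For readers following the thread, here is a minimal toy model of the condition the patch changes. It is a sketch only, not GCC code: toy_mode, toy_complex_mode_p, toy_mode_bitsize and take_short_path are hypothetical stand-ins for machine modes, COMPLEX_MODE_P, GET_MODE_BITSIZE and the guarded block in expand_expr_real_1. The bit sizes assume the usual x86_64 values (V4SImode, DCmode and CDImode are all 128 bits wide).

```c
#include <stdbool.h>

/* Toy stand-ins for machine modes; NOT GCC's real machine_mode enum.  */
enum toy_mode { TOY_V4SI, TOY_DC, TOY_CDI };

/* Hypothetical analogue of COMPLEX_MODE_P.  */
static bool
toy_complex_mode_p (enum toy_mode m)
{
  return m == TOY_DC || m == TOY_CDI;
}

/* Hypothetical analogue of GET_MODE_BITSIZE; all three toy modes
   are assumed to be 128 bits wide.  */
static unsigned
toy_mode_bitsize (enum toy_mode m)
{
  switch (m)
    {
    case TOY_V4SI: return 128;  /* 4 x 32-bit int */
    case TOY_DC:   return 128;  /* 2 x 64-bit double */
    case TOY_CDI:  return 128;  /* 2 x 64-bit integer */
    }
  return 0;
}

/* Return true if the "return the whole CONCAT as is" short path would
   be taken.  With PATCHED false this models the old condition, with
   PATCHED true the condition after the proposed change.  */
static bool
take_short_path (enum toy_mode op0_mode, enum toy_mode mode1,
                 unsigned bitpos, unsigned bitsize, bool patched)
{
  bool covers_all = bitpos == 0 && bitsize == toy_mode_bitsize (op0_mode);
  if (!patched)
    return covers_all;
  return covers_all && toy_complex_mode_p (mode1);
}
```

In this model, a V4SImode read of a DCmode CONCAT takes the short path before the patch and no longer does afterwards, while a CDImode read of a DCmode CONCAT still takes it, which is the remaining mismatch the quoted question is about.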

> 2016-10-27  Jakub Jelinek  
> 
>   PR rtl-optimization/77919
>   * expr.c (expand_expr_real_1) : Force CONCAT into
>   MEM if mode1 is not a complex mode.
> 
>   * g++.dg/torture/pr77919.C: New test.

I think that's OK.

-- 
Eric Botcazou

Re: [PATCH] Fix a REE bug (PR rtl-optimization/78132)

2016-10-28 Thread Eric Botcazou

> 2016-10-27  Jakub Jelinek  
> 
>   PR rtl-optimization/78132
>   * ree.c (combine_reaching_defs): Give up if copy_needed and
>   !HARD_REGNO_MODE_OK (REGNO (src_reg), dst_mode).
> 
>   * gcc.target/i386/pr78132.c: New test.

OK, thanks.

-- 
Eric Botcazou
