Removed charlen_03.f90 and charlen_10.f90 testcases

2016-12-17 Thread Steve Kargl
I have committed a patch to remove the charlen_03.f90 and
charlen_10.f90 testcases.  These have been attached to PR
fortran/78746.

Index: ChangeLog
===
--- ChangeLog   (revision 243777)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2016-12-17  Steven G. Kargl  
+
+   PR fortran/78746
+   * charlen_03.f90: Remove test. 
+   * charlen_10.f90: Ditto.
+
 2016-12-17  Jakub Jelinek  
 
PR sanitizer/78832
Index: gfortran.dg/charlen_03.f90
===
--- gfortran.dg/charlen_03.f90  (revision 243777)
+++ gfortran.dg/charlen_03.f90  (nonexistent)
@@ -1,9 +0,0 @@
-! { dg-do compile }
-! PR fortran/65173
-program p
-   type t
-  character(:), allocatable :: x(n) ! { dg-error "must have a deferred 
shape" }
-   end type
-end
-! { dg-excess-errors "must be of INTEGER type" }
-
Index: gfortran.dg/charlen_10.f90
===
--- gfortran.dg/charlen_10.f90  (revision 243777)
+++ gfortran.dg/charlen_10.f90  (nonexistent)
@@ -1,9 +0,0 @@
-! { dg-do compile }
-! PR fortran/65173
-program p
-   type t
-  character(:), allocatable :: x(y)1  ! { dg-error "must have a deferred 
shape" }
-   end type
-end
-! { dg-excess-errors "must be of INTEGER type" } 
-

-- 
Steve


Re: [Patch, Fortran, committed] PR 78592: [7 Regression] ICE in gfc_find_specific_dtio_proc, at fortran/interface.c:4939

2016-12-17 Thread Janus Weil
2016-12-17 22:49 GMT+01:00 Janus Weil :
> Hi Mikael,
>
>>> I have just committed a completely obvious patch for this PR. All it
>>> does is rearrange some expressions to avoid an ICE (see attachment):
>>>
>> I have made a late review of it, and I think it’s not as innocent as it
>> seems.
>> With it, if the first element’s formal is not properly set, the rest of the
>> generic linked list is ignored.
>>
>> Here is a variant of the testcase committed.
>> It shows no error if the module procedure line is commented, and two errors
>> if it’s uncommented, one error saying that the write of z2 should use DTIO.
>> The latter error should not appear.
>
> thanks for the comment, and sorry that I didn't notice this problem.
>
> The attached patch should fix it. What do you think about it?
>
> Btw, with trunk r243776 I get an ICE on your test case, when the
> module procedure is commented out. You don't see this?

I have opened PR 78848 for this.

Cheers,
Janus


Re: [PATCH] fix powerpc64le bootstrap failure caused by r243661 (PR 78817)

2016-12-17 Thread Martin Sebor

On 12/17/2016 01:01 PM, Markus Trippelsdorf wrote:

On 2016.12.16 at 18:27 +0100, Jakub Jelinek wrote:

On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote:

No.  The first call to sm_read_sector just doesn't exit.  So it is warning
about dead code.


If the code is dead then GCC should eliminate it.  With it eliminated


There is (especially with jump threading, but not limited to that, other
optimizations may result in that too), code that even very smart optimizing
compiler isn't able to prove that is dead.


the warning would be gone.  The warning isn't smart and it doesn't
try to be.  It only works with what GCC gives it.  In this case the
dump shows that GCC thinks the code is reachable.  If it isn't that
would seem to highlight a missed optimization opportunity, not a need
to make the warning smarter than the optimizer.


No, it highlights that the warning is done in a wrong place where it suffers
from too many false positives.


Another issue with Martin's patch is that it adds many false positives
when one uses -fsanitize=undefined, e.g.:

 % cat test.ii
struct A {
  char *msg;
  A(const char *);
};
A::A(const char *p1) {
  msg = new char[__builtin_strlen(p1) + 1];
  __builtin_strcpy(msg, p1);
}

 % g++ -Wall  -O2 -c test.ii
 % g++ -Wall -fsanitize=undefined -O2 -c test.ii
test.ii: In constructor ‘A::A(const char*)’:
test.ii:6:34: warning: argument 1 null where non-null expected [-Wnonnull]
   msg = new char[__builtin_strlen(p1) + 1];
  ^~~~
test.ii:6:34: note: in a call to built-in function ‘long unsigned int 
__builtin_strlen(const char*)’


I agree that these warnings should probably not be issued, though
it's interesting to see where they come from.  The calls are in
the code emitted by GCC, are reachable, and end up taking place
with the right Ubsan runtime recovery options.  It turns out that
Ubsan transforms calls to nonnull functions into conditional
branches testing the argument for null, like so:

if (s == 0)
  __builtin___ubsan_handle_nonnull_arg();
n = strlen (s);

and GCC then transforms those into

if (s == 0)
  {
__builtin___ubsan_handle_nonnull_arg();
n = strlen (NULL);
  }

When the ubsan_handle_nonnull_arg function returns to the caller
the call to strlen(NULL) is made.

This doesn't happen when -fno-sanitize-recover=undefined is used
when compiling the file because then Ubsan inserts calls to
__builtin___ubsan_handle_nonnull_arg_abort instead which is declared
noreturn.

It would be easy to suppress these warnings when -fsantize=undefined
is used.  Distinguishing the Ubsan-inserted calls from those in the
source code would be more challenging.  Implementing the warning as
a pass that runs before the Ubsan pass gets around the problem.

Martin


Re: [Patch, Fortran, committed] PR 78592: [7 Regression] ICE in gfc_find_specific_dtio_proc, at fortran/interface.c:4939

2016-12-17 Thread Janus Weil
Hi Mikael,

>> I have just committed a completely obvious patch for this PR. All it
>> does is rearrange some expressions to avoid an ICE (see attachment):
>>
> I have made a late review of it, and I think it’s not as innocent as it
> seems.
> With it, if the first element’s formal is not properly set, the rest of the
> generic linked list is ignored.
>
> Here is a variant of the testcase committed.
> It shows no error if the module procedure line is commented, and two errors
> if it’s uncommented, one error saying that the write of z2 should use DTIO.
> The latter error should not appear.

thanks for the comment, and sorry that I didn't notice this problem.

The attached patch should fix it. What do you think about it?

Btw, with trunk r243776 I get an ICE on your test case, when the
module procedure is commented out. You don't see this?

Cheers,
Janus




> program p
>type t
>end type
>type(t) :: z
>type, extends(t) :: t2
>end type
>class(t2), allocatable :: z2
>interface write(formatted)
>   procedure wf2
>   module procedure wf   ! error
>end interface
>print *, z
>allocate(z2)
>print *, z2 ! spurious error
>   contains
>subroutine wf2(this, a, b, c, d, e)
>   class(t2), intent(in) :: this
>   integer, intent(in) :: a
>   character, intent(in) :: b
>   integer, intent(in) :: c(:)
>   integer, intent(out) :: d
>   character, intent(inout) :: e
>end subroutine wf2
> end
>
>
>
>>
>> pr78592.diff
>>
>> Index: gcc/fortran/interface.c
>> ===
>> --- gcc/fortran/interface.c (revision 243004)
>> +++ gcc/fortran/interface.c (working copy)
>> @@ -4933,15 +4933,15 @@ gfc_find_specific_dtio_proc (gfc_symbol *derived,
>>   && tb_io_st->n.sym
>>   && tb_io_st->n.sym->generic)
>> {
>> - gfc_interface *intr;
>> - for (intr = tb_io_st->n.sym->generic; intr; intr = intr->next)
>> + for (gfc_interface *intr = tb_io_st->n.sym->generic;
>> +  intr && intr->sym && intr->sym->formal;
>> +  intr = intr->next)
>> {
>>   gfc_symbol *fsym = intr->sym->formal->sym;
>> - if (intr->sym && intr->sym->formal
>> - && ((fsym->ts.type == BT_CLASS
>> - && CLASS_DATA (fsym)->ts.u.derived == extended)
>> -   || (fsym->ts.type == BT_DERIVED
>> -   && fsym->ts.u.derived == extended)))
>> + if ((fsym->ts.type == BT_CLASS
>> +  && CLASS_DATA (fsym)->ts.u.derived == extended)
>> + || (fsym->ts.type == BT_DERIVED
>> + && fsym->ts.u.derived == extended))
>> {
>>   dtio_sub = intr->sym;
>>   break;
>
>
Index: gcc/fortran/interface.c
===
--- gcc/fortran/interface.c (revision 243776)
+++ gcc/fortran/interface.c (working copy)
@@ -4949,17 +4949,19 @@ gfc_find_specific_dtio_proc (gfc_symbol *derived,
  && tb_io_st->n.sym->generic)
{
  for (gfc_interface *intr = tb_io_st->n.sym->generic;
-  intr && intr->sym && intr->sym->formal;
-  intr = intr->next)
+  intr && intr->sym; intr = intr->next)
{
- gfc_symbol *fsym = intr->sym->formal->sym;
- if ((fsym->ts.type == BT_CLASS
-  && CLASS_DATA (fsym)->ts.u.derived == extended)
- || (fsym->ts.type == BT_DERIVED
- && fsym->ts.u.derived == extended))
+ if (intr->sym->formal)
{
- dtio_sub = intr->sym;
- break;
+ gfc_symbol *fsym = intr->sym->formal->sym;
+ if ((fsym->ts.type == BT_CLASS
+ && CLASS_DATA (fsym)->ts.u.derived == extended)
+ || (fsym->ts.type == BT_DERIVED
+ && fsym->ts.u.derived == extended))
+   {
+ dtio_sub = intr->sym;
+ break;
+   }
}
}
}


Re: [Patch, Fortran, committed] PR 78592: [7 Regression] ICE in gfc_find_specific_dtio_proc, at fortran/interface.c:4939

2016-12-17 Thread Mikael Morin

Hello,

Le 30/11/2016 à 10:52, Janus Weil a écrit :

Hi all,

I have just committed a completely obvious patch for this PR. All it
does is rearrange some expressions to avoid an ICE (see attachment):

I have made a late review of it, and I think it’s not as innocent as it 
seems.
With it, if the first element’s formal is not properly set, the rest of 
the generic linked list is ignored.


Here is a variant of the testcase committed.
It shows no error if the module procedure line is commented, and two 
errors if it’s uncommented, one error saying that the write of z2 should 
use DTIO. The latter error should not appear.



program p
   type t
   end type
   type(t) :: z
   type, extends(t) :: t2
   end type
   class(t2), allocatable :: z2
   interface write(formatted)
  procedure wf2
  module procedure wf   ! error
   end interface
   print *, z
   allocate(z2)
   print *, z2 ! spurious error
  contains
   subroutine wf2(this, a, b, c, d, e)
  class(t2), intent(in) :: this
  integer, intent(in) :: a
  character, intent(in) :: b
  integer, intent(in) :: c(:)
  integer, intent(out) :: d
  character, intent(inout) :: e
   end subroutine wf2
end





pr78592.diff

Index: gcc/fortran/interface.c
===
--- gcc/fortran/interface.c (revision 243004)
+++ gcc/fortran/interface.c (working copy)
@@ -4933,15 +4933,15 @@ gfc_find_specific_dtio_proc (gfc_symbol *derived,
  && tb_io_st->n.sym
  && tb_io_st->n.sym->generic)
{
- gfc_interface *intr;
- for (intr = tb_io_st->n.sym->generic; intr; intr = intr->next)
+ for (gfc_interface *intr = tb_io_st->n.sym->generic;
+  intr && intr->sym && intr->sym->formal;
+  intr = intr->next)
{
  gfc_symbol *fsym = intr->sym->formal->sym;
- if (intr->sym && intr->sym->formal
- && ((fsym->ts.type == BT_CLASS
- && CLASS_DATA (fsym)->ts.u.derived == extended)
-   || (fsym->ts.type == BT_DERIVED
-   && fsym->ts.u.derived == extended)))
+ if ((fsym->ts.type == BT_CLASS
+  && CLASS_DATA (fsym)->ts.u.derived == extended)
+ || (fsym->ts.type == BT_DERIVED
+ && fsym->ts.u.derived == extended))
{
  dtio_sub = intr->sym;
  break;




Re: [PATCH] fix powerpc64le bootstrap failure caused by r243661 (PR 78817)

2016-12-17 Thread Markus Trippelsdorf
On 2016.12.16 at 18:27 +0100, Jakub Jelinek wrote:
> On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote:
> > > No.  The first call to sm_read_sector just doesn't exit.  So it is warning
> > > about dead code.
> > 
> > If the code is dead then GCC should eliminate it.  With it eliminated
> 
> There is (especially with jump threading, but not limited to that, other
> optimizations may result in that too), code that even very smart optimizing
> compiler isn't able to prove that is dead.
> 
> > the warning would be gone.  The warning isn't smart and it doesn't
> > try to be.  It only works with what GCC gives it.  In this case the
> > dump shows that GCC thinks the code is reachable.  If it isn't that
> > would seem to highlight a missed optimization opportunity, not a need
> > to make the warning smarter than the optimizer.
> 
> No, it highlights that the warning is done in a wrong place where it suffers
> from too many false positives.

Another issue with Martin's patch is that it adds many false positives
when one uses -fsanitize=undefined, e.g.:

 % cat test.ii
struct A {
  char *msg;
  A(const char *);
};
A::A(const char *p1) {
  msg = new char[__builtin_strlen(p1) + 1];
  __builtin_strcpy(msg, p1);
}

 % g++ -Wall  -O2 -c test.ii
 % g++ -Wall -fsanitize=undefined -O2 -c test.ii
test.ii: In constructor ‘A::A(const char*)’:
test.ii:6:34: warning: argument 1 null where non-null expected [-Wnonnull]
   msg = new char[__builtin_strlen(p1) + 1];
  ^~~~
test.ii:6:34: note: in a call to built-in function ‘long unsigned int 
__builtin_strlen(const char*)’

-- 
Markus


Re: [PATCH] Fix -fcompare-debug sanopt bug (PR sanitizer/78832)

2016-12-17 Thread Richard Biener
On December 16, 2016 9:56:10 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The following testcase fails with -fcompare-debug, because we have a bb
>containing 2 ASAN_MARK (POISON, ...) calls immediately after each
>other,
>followed with -g only by debug stmts till end of basic block.
>sanitize_asan_mark_poison walks stmts in a bb backwards and assumes
>(incorrectly) that gsi_remove will move the iterator backwards too, but
>it moves it forward (and if removing stmt at the end of bb turns it
>into
>gsi end).  The effect is that with -g, we remove both ASAN_MARK calls
>(desirable), because after the first removal gsi is moved to the
>following
>debug stmt which we do nothing about once again and get to the first
>ASAN_MARK, but with -g0 gsi becomes end and we don't remove anything
>further
>from the bb after removing the last stmt.
>
>Fixed by copying the iterator to a copy, gsi_prev and then gsi_remove
>on the
>copy.  The rest is just cleanup, I think we don't need the bool vars
>when we
>can easily continue; instead.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-12-16  Jakub Jelinek  
>
>   PR sanitizer/78832
>   * sanopt.c (sanitize_asan_mark_unpoison): Remove next variable, use
>   continue if gsi_next should be skipped.
>   (sanitize_asan_mark_poison): Remove prev variable, use continue if
>   gsi_prev should be skipped.  When removing ASAN_MARK, do gsi_prev
>   first and gsi_remove on a previously made copy of the iterator.
>
>   * gcc.dg/asan/pr78832.c: New test.
>
>--- gcc/sanopt.c.jj2016-12-14 20:28:14.0 +0100
>+++ gcc/sanopt.c   2016-12-16 17:08:03.016432304 +0100
>@@ -740,7 +740,6 @@ sanitize_asan_mark_unpoison (void)
>   gimple_stmt_iterator gsi;
>   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
>   {
>-bool next = true;
> gimple *stmt = gsi_stmt (gsi);
> if (gimple_call_internal_p (stmt, IFN_ASAN_MARK))
>   {
>@@ -753,12 +752,11 @@ sanitize_asan_mark_unpoison (void)
> unlink_stmt_vdef (stmt);
> release_defs (stmt);
> gsi_remove (, true);
>-next = false;
>+continue;
>   }
>   }
> 
>-if (next)
>-  gsi_next ();
>+gsi_next ();
>   }
> }
> }
>@@ -840,7 +838,6 @@ sanitize_asan_mark_poison (void)
>   gimple_stmt_iterator gsi;
>   for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
>   {
>-bool prev = true;
> gimple *stmt = gsi_stmt (gsi);
> if (maybe_contains_asan_check (stmt))
>   break;
>@@ -850,12 +847,13 @@ sanitize_asan_mark_poison (void)
>   fprintf (dump_file, "Removing ASAN_MARK poison\n");
> unlink_stmt_vdef (stmt);
> release_defs (stmt);
>-gsi_remove (, true);
>-prev = false;
>+gimple_stmt_iterator gsi2 = gsi;
>+gsi_prev ();
>+gsi_remove (, true);
>+continue;
>   }
> 
>-if (prev)
>-  gsi_prev ();
>+gsi_prev ();
>   }
> }
> }
>--- gcc/testsuite/gcc.dg/asan/pr78832.c.jj 2016-12-16
>17:22:27.446280214 +0100
>+++ gcc/testsuite/gcc.dg/asan/pr78832.c2016-12-16 17:23:17.117638783
>+0100
>@@ -0,0 +1,22 @@
>+/* PR sanitizer/78832 */
>+/* { dg-do compile } */
>+/* { dg-additional-options "-fcompare-debug" } */
>+
>+void bar (int *);
>+
>+int
>+foo (int x)
>+{
>+  int *f = 0;
>+  if (x)
>+goto lab;
>+  {
>+int y, z;
>+bar ();
>+int *d = 
>+bar ();
>+int *e = 
>+  }
>+  f = 
>+  lab: return 6;
>+}
>
>   Jakub




Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-17 Thread Richard Sandiford
Jeff Law  writes:
> This is the first of the 4 part patchkit to address deficiencies in our 
> DSE implementation.
>
> This patch addresses the P2 regression 33562 which has been a low 
> priority regression since gcc-4.3.  To summarize, DSE no longer has the 
> ability to detect an aggregate store as dead if subsequent stores are 
> done in a piecemeal fashion.
>
> I originally tackled this by changing how we lower complex objects. 
> That was sufficient to address 33562, but was reasonably rejected.
>
> This version attacks the problem by improving DSE to track stores to 
> memory at a byte level.  That allows us to determine if a series of 
> stores completely covers an earlier store (thus making the earlier store 
> dead).
>
> A useful side effect of this is we can detect when parts of a store are 
> dead and potentially rewrite the store.  This patch implements that for 
> complex object initializations.  While not strictly part of 33562, it's 
> so closely related that I felt it belongs as part of this patch.
>
> This originally limited the size of the tracked memory space to 64 
> bytes.  I bumped the limit after working through the CONSTRUCTOR and 
> mem* trimming patches.  The 256 byte limit is still fairly arbitrary and 
> I wouldn't lose sleep if we throttled back to 64 or 128 bytes.

FWIW (and shouldn't affect whether the patch goes in)...

If SVE support is accepted for GCC 8 then we have the additional problem
that the sizes of useful aggregates can be runtime invariants.  For example,
in the SVE equivalent of float64x2x2_t, it would be good to be able to
detect that an assignment to the full structure is dead if we later
assign to both of the individual vectors.  Probably the most convenient
way of doing that would be to track ranges rather than individual bytes,
like the local part of the RTL DSE pass does.  (It was relatively easy
to convert local RTL DSE to variable-length modes, but the global part
also uses byte bitmaps.)

I guess bitmaps are going to be more efficient for small structures or
for accesses that occur in an arbitrary order.  But tracking ranges
means adding at most one fixed-size range record per update, so I suppose
the throttle would be on the number of records (= number of gaps + 1)
rather than the size of the structure.

Thanks,
Richard


Re: [PATCH 1/2] [ARC] Generating code for profiling.

2016-12-17 Thread Matthias Klose
On 16.12.2016 14:16, Claudiu Zissulescu wrote:
> Committed with the suggested mods.
> 
> Thank you for your review,
> Claudiu
> 
>> -Original Message-
>> From: Andrew Burgess [mailto:andrew.burg...@embecosm.com]
>> Sent: Wednesday, December 14, 2016 1:13 PM
>> To: Claudiu Zissulescu 
>> Cc: gcc-patches@gcc.gnu.org; francois.bed...@synopsys.com
>> Subject: Re: [PATCH 1/2] [ARC] Generating code for profiling.
>>
>> * Claudiu Zissulescu  [2016-12-05
>> 12:59:12 +0100]:

libgcc/config/arc/gmon is empty now. Removed this directory as obvious.

Matthias





[PATCH, i386]: Further cleanups to bitmanip instructions

2016-12-17 Thread Uros Bizjak
The patch moves splitting to after epilogue_completed, so eventual
regrename pass won't break register matching requirements for
TARGET_AVOID_FALSE_DEP_FOR_BMI fix.

Also, remove unneeded expanders and merge a couple of patterns.

2016-12-17  Uros Bizjak  

* config/i386/i386.md (*tzcnt_1): Merge *tzcnt_1_falsedep_1
and *tzcnt_1 to define_insn_and_split pattern.  Adjust split
condition to split after epilogue_completed.
(ctz2): Remove expander.
(ctz2): Merge *ctz2_falsedep_1 and *ctz2 to
define_insn_and_split pattern.  Adjust split condition to split
after epilogue_completed.
(clz2_lznct): Remove expander.
(clz2_lzcnt): Merge *clz2_lzcnt_falsedep_1 and
*clz2 to define_insn_and_split pattern.  Adjust split
condition to split after epilogue_completed.
(_): Remove expander.
(_): Merge *__falsedep_1 and
*_ to define_insn_and_split pattern.  Adjust split
condition to split after epilogue_completed.
(_hi): New insn pattern.
(popcount2): Remove expander.
(popcount2): Merge *popcount2_falsedep_1 and
*popcount2 to define_insn_and_split pattern.  Adjust split
condition to split after epilogue_completed.
(popcounthi2): New insn pattern.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 243746)
+++ config/i386/i386.md (working copy)
@@ -12569,19 +12566,17 @@
   ix86_expand_clear (operands[2]);
 })
 
-; False dependency happens when destination is only updated by tzcnt,
-; lzcnt or popcnt.  There is no false dependency when destination is
-; also used in source.
-(define_insn_and_split "*tzcnt_1_falsedep_1"
+(define_insn_and_split "*tzcnt_1"
   [(set (reg:CCC FLAGS_REG)
(compare:CCC (match_operand:SWI48 1 "nonimmediate_operand" "rm")
 (const_int 0)))
(set (match_operand:SWI48 0 "register_operand" "=r")
(ctz:SWI48 (match_dup 1)))]
-  "TARGET_BMI
-   && TARGET_AVOID_FALSE_DEP_FOR_BMI && optimize_function_for_speed_p (cfun)"
-  "#"
-  "&& reload_completed"
+  "TARGET_BMI"
+  "tzcnt{}\t{%1, %0|%0, %1}";
+  "&& TARGET_AVOID_FALSE_DEP_FOR_BMI && epilogue_completed
+   && optimize_function_for_speed_p (cfun)
+   && !reg_mentioned_p (operands[0], operands[1])"
   [(parallel
 [(set (reg:CCC FLAGS_REG)
  (compare:CCC (match_dup 1) (const_int 0)))
@@ -12588,11 +12583,16 @@
  (set (match_dup 0)
  (ctz:SWI48 (match_dup 1)))
  (unspec [(match_dup 0)] UNSPEC_INSN_FALSE_DEP)])]
-{
-  if (!reg_mentioned_p (operands[0], operands[1]))
-ix86_expand_clear (operands[0]);
-})
+  "ix86_expand_clear (operands[0]);"
+  [(set_attr "type" "alu1")
+   (set_attr "prefix_0f" "1")
+   (set_attr "prefix_rep" "1")
+   (set_attr "btver2_decode" "double")
+   (set_attr "mode" "")])
 
+; False dependency happens when destination is only updated by tzcnt,
+; lzcnt or popcnt.  There is no false dependency when destination is
+; also used in source.
 (define_insn "*tzcnt_1_falsedep"
   [(set (reg:CCC FLAGS_REG)
(compare:CCC (match_operand:SWI48 1 "nonimmediate_operand" "rm")
@@ -12609,20 +12609,6 @@
(set_attr "btver2_decode" "double")
(set_attr "mode" "")])
 
-(define_insn "*tzcnt_1"
-  [(set (reg:CCC FLAGS_REG)
-   (compare:CCC (match_operand:SWI48 1 "nonimmediate_operand" "rm")
-(const_int 0)))
-   (set (match_operand:SWI48 0 "register_operand" "=r")
-   (ctz:SWI48 (match_dup 1)))]
-  "TARGET_BMI"
-  "tzcnt{}\t{%1, %0|%0, %1}"
-  [(set_attr "type" "alu1")
-   (set_attr "prefix_0f" "1")
-   (set_attr "prefix_rep" "1")
-   (set_attr "btver2_decode" "double")
-   (set_attr "mode" "")])
-
 (define_insn "*bsf_1"
   [(set (reg:CCZ FLAGS_REG)
(compare:CCZ (match_operand:SWI48 1 "nonimmediate_operand" "rm")
@@ -12637,13 +12623,6 @@
(set_attr "znver1_decode" "vector")
(set_attr "mode" "")])
 
-(define_expand "ctz2"
-  [(parallel
-[(set (match_operand:SWI48 0 "register_operand")
- (ctz:SWI48
-   (match_operand:SWI48 1 "nonimmediate_operand")))
- (clobber (reg:CC FLAGS_REG))])])
-
 (define_insn_and_split "*ctzhi2"
   [(set (match_operand:SI 0 "register_operand")
(ctz:SI
@@ -12662,28 +12641,47 @@
   DONE;
 })
 
-; False dependency happens when destination is only updated by tzcnt,
-; lzcnt or popcnt.  There is no false dependency when destination is
-; also used in source.
-(define_insn_and_split "*ctz2_falsedep_1"
+(define_insn_and_split "ctz2"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
(ctz:SWI48
  (match_operand:SWI48 1 "nonimmediate_operand" "rm")))
(clobber (reg:CC FLAGS_REG))]
+  ""
+{
+  if (TARGET_BMI)
+return "tzcnt{}\t{%1, %0|%0, %1}";
+  else if (optimize_function_for_size_p (cfun))
+;
+  else if (TARGET_GENERIC)
+/* tzcnt expands to 'rep bsf' and we can use it 

[Ada] Fix PR ada/78845

2016-12-17 Thread Simon Wright
The function Ada.Numerics.Generic_Real_Arrays.Inverse is required (ARM 
G.3.1(72)) to return a matrix with the bounds of the dimension indices swapped, 
i.e. result'Range(1) == input'Range(2) and vice versa. The present code gets 
result'Range(1) correct, but result'Range(2) always starts at 1.

Of course, many users would always use ranges starting at 1 and wouldn't see a 
problem.

2016-12-17  Simon Wright  

PR ada/78845
* a-ngrear.adb (Inverse): call Unit_Matrix with First_1 set to 
A'First(2)
and vice versa.



a-ngrear.adb.diff
Description: Binary data


Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-17 Thread Richard Biener
On December 16, 2016 7:43:22 PM GMT+01:00, Jakub Jelinek  
wrote:
>On Fri, Dec 16, 2016 at 06:35:58PM +, Joseph Myers wrote:
>> On Thu, 15 Dec 2016, Jeff Law wrote:
>> 
>> > This version attacks the problem by improving DSE to track stores
>to memory at
>> > a byte level.  That allows us to determine if a series of stores
>completely
>> > covers an earlier store (thus making the earlier store dead).
>> 
>> Question: suppose you have an assignment for a struct with padding,
>then 
>> stores for all the elements.  Does it detect that the original
>assignment 
>> is dead (because there is no need to copy padding on struct
>assignment)?

In GIMPLE struct assignment is equal to memcpy, so that info is lost after 
leaving frontend realm.

>We fold memcpy into struct assignment early, does the same apply also
>to
>memcpy?  Or shall we fold memcpy into struct assignment only when there
>is
>no padding (or find different IL representation of that)?

If we want to change that we can use a char[] type for copying on GIMPLE.

Richard.

>   Jakub




Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-17 Thread Jeff Law

On 12/16/2016 12:29 AM, Trevor Saunders wrote:

On Thu, Dec 15, 2016 at 06:54:43PM -0700, Jeff Law wrote:

   unsigned cnt = 0;
+  bitmap live_bytes = NULL;
+  bitmap orig_live_bytes = NULL;

   *use_stmt = NULL;

+  /* REF is a memory write.  Go ahead and get its base, size, extent
+ information and encode the bytes written into LIVE_BYTES.  We can
+ handle any case where we have a known base and maximum size.
+
+ However, experimentation has shown that bit level tracking is not
+ useful in practice, so we only track at the byte level.
+
+ Furthermore, experimentation has shown that 99% of the cases
+ that require byte tracking are 64 bytes or less.  */
+  if (valid_ao_ref_for_dse (ref)
+  && (ref->max_size / BITS_PER_UNIT
+ <= PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)))
+{
+  live_bytes = BITMAP_ALLOC (NULL);
+  orig_live_bytes = BITMAP_ALLOC (NULL);
+  bitmap_set_range (live_bytes,
+   ref->offset / BITS_PER_UNIT,
+   ref->max_size / BITS_PER_UNIT);
+  bitmap_copy (orig_live_bytes, live_bytes);


would it maybe make sense to use sbitmaps since the length is known to
be short, and doesn't change after allocation?
So a few interesting things have to be dealt if we want to make this 
change.  I already mentioned the need to bias based on ref->offset so 
that the range of bytes we're tracking is represented 0..size.


While we know the length of the potential dead store we don't know the 
length of the subsequent stores that we're hoping make the original a 
dead store.  Thus when we start clearing LIVE_BYTES based on those 
subsequent stores, we have to normalize those against the ref->offset of 
the original store.


What's even more fun is sizing.  Those subsequent stores may be 
considerably larger.  Which means that a bitmap_clear_range call has to 
be a hell of a lot more careful when working with sbitmaps (we just 
happily stop all over memory in that case) whereas a bitmap the right 
things will "just happen".


On a positive size since we've normalized the potential dead store's 
byte range to 0..size, it means computing trims is easier because we 
inherently know how many bits were originally set.  So compute_trims 
becomes trivial and we can simplify trim_complex_store a bit as well.


And, of course, we don't have a bitmap_{clear,set}_range or a 
bitmap_count_bits implementation for sbitmaps.



It's all a bit of "ugh", but should be manageable.

Jeff