
[Bug middle-end/38857] [4.4 Regression] ICE in selective scheduler

2009-01-20 Thread amonakov at gcc dot gnu dot org



--- Comment #3 from amonakov at gcc dot gnu dot org  2009-01-20 15:45 
---
The failing assert checks whether an instruction was correctly disconnected
from the insn stream (at its original location) so that it can be inserted on
the scheduling boundary by adjusting PREV_INSN/NEXT_INSN links (we try to move
instructions instead of removing them and issuing new ones, to avoid the cost
of re-initializing the associated structures).

There are two different versions of the code that decides whether it is
appropriate to move an instruction: one for places where instructions are
disconnected and one for places where they are inserted into the stream, as
different scheduler data is available at each.  The attached patch (by Andrey)
changes this so that the decision is made at the instruction's original
location and saved in the should_move parameter until we need to move or issue
the instruction at the scheduling boundary.
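
As a rough illustration of the idea (not the actual sel-sched.c code), the
sketch below models the insn stream as a doubly linked list.  The struct and
the function names take_from_original_location and place_at_boundary are
hypothetical; only the should_move flag and the PREV_INSN/NEXT_INSN roles come
from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn in the stream; not GCC's rtx.  */
struct insn
{
  int uid;
  struct insn *prev;   /* plays the role of PREV_INSN */
  struct insn *next;   /* plays the role of NEXT_INSN */
};

/* Unlink INSN from its neighbours without touching its other data.  */
static void
disconnect_insn (struct insn *insn)
{
  if (insn->prev)
    insn->prev->next = insn->next;
  if (insn->next)
    insn->next->prev = insn->prev;
  insn->prev = insn->next = NULL;
}

/* At the original location: decide once whether INSN may be moved,
   disconnect it if so, and return the decision for later use.  */
static bool
take_from_original_location (struct insn *insn, bool movable)
{
  if (movable)
    disconnect_insn (insn);
  return movable;
}

/* At the scheduling boundary: reuse the saved decision.  A moved insn
   must already be disconnected, which is what the assert checks.  */
static void
place_at_boundary (struct insn *insn, struct insn *boundary, bool should_move)
{
  if (!should_move)
    return;   /* a real scheduler would emit a fresh copy here */
  assert (insn->prev == NULL && insn->next == NULL);
  insn->prev = boundary;
  insn->next = boundary->next;
  if (boundary->next)
    boundary->next->prev = insn;
  boundary->next = insn;
}
```

Because the decision is computed in one place and only passed along, the
assert at the boundary can no longer disagree with what happened at the
original location.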

Bootstrapped and regtested on ia64-linux.  I will include the testcase when
sending the patch to gcc-patches@
Steven, can you please also check whether it fixes the testcase you've seen
fail on this assert?


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |amonakov at gcc dot gnu dot
                   |                            |org
         AssignedTo|unassigned at gcc dot gnu   |amonakov at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2009-01-19 22:49:52         |2009-01-20 15:45:10
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38857

[Bug middle-end/38857] [4.4 Regression] ICE in selective scheduler

2009-01-20 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2009-01-20 15:47 
---
Created an attachment (id=17153)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17153&action=view)
proposed patch


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38857

[Bug middle-end/38857] [4.4 Regression] ICE in selective scheduler

2009-01-22 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2009-01-22 12:19 
---
(In reply to comment #6)
> -static bool code_motion_path_driver (insn_t, av_set_t, ilist_t,
> -                                     cmpd_local_params_p, void *);
> +static int code_motion_path_driver (insn_t, av_set_t, ilist_t,
> +                                    cmpd_local_params_p, void *);
> 
> You probably don't want this bit...?
> 

The function returns -1 in some circumstances.  This change is not relevant to
the ICE in question, but is nevertheless a correction (maybe not the best, as
'return true' and 'return false' are used in function's body).  I'm not sure
what's best here -- to include this in PR fix submission, or as a separate
patch.
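
To illustrate why the return type matters, here is a minimal sketch assuming
that bool is C99 _Bool (as with <stdbool.h>).  The two functions are
hypothetical stand-ins, not the real code_motion_path_driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver declared with a bool return type.  The -1 is
   converted to true (1), so the "special" value is lost.  */
static bool
driver_returning_bool (void)
{
  return -1;
}

/* The same driver declared with an int return type keeps -1 intact.  */
static int
driver_returning_int (void)
{
  return -1;
}
```

With the bool version a caller comparing the result against -1 can never see
a match, which is why a check such as 'res != -1' only makes sense once the
function returns int.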

FWIW, there're a couple more unrelated changes:
1) check if a reg is actually a hard reg
   if (REG_P (*cur_rtx)
+      && HARD_REGISTER_P (*cur_rtx)
       && hard_regno_nregs[REGNO(*cur_rtx)][GET_MODE (*cur_rtx)] > 1)

and
2) Do not merge info from successors if not relevant
   /* Merge data, clean up, etc.  */
-  if (code_motion_path_driver_info->after_merge_succs)
+  if (res != -1 && code_motion_path_driver_info->after_merge_succs)
     code_motion_path_driver_info->after_merge_succs (lparams, static_params);

Again, I will submit them separately if so desired.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38857

[Bug middle-end/38857] [4.4 Regression] ICE in selective scheduler

2009-01-29 Thread amonakov at gcc dot gnu dot org

--- Comment #9 from amonakov at gcc dot gnu dot org  2009-01-29 10:53 
---
Subject: Bug 38857

Author: amonakov
Date: Thu Jan 29 10:53:15 2009
New Revision: 143753

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143753
Log:
2009-01-29  Andrey Belevantsev  a...@ispras.ru
Alexander Monakov  amona...@ispras.ru

PR middle-end/38857
* sel-sched.c (count_occurrences_1): Check that *cur_rtx is a hard
register.
(move_exprs_to_boundary): Change return type and pass through
should_move from move_op.  Relax assert.  Update usage ...
(schedule_expr_on_boundary): ... here.  Use should_move instead of
cant_move.
(move_op_orig_expr_found): Indicate that insn was disconnected from
stream.
(code_motion_process_successors): Do not call after_merge_succs
callback if original expression was not found when traversing any of
the branches.
(code_motion_path_driver): Change return type.  Update prototype.
(move_op): Update comment.  Add a new parameter (should_move).  Update
prototype.  Set *should_move based on indication provided by
move_op_orig_expr_found.

2009-01-29  Steve Ellcey  s...@cup.hp.com

PR middle-end/38857
* gcc.c-torture/compile/pr38857.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr38857.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/sel-sched.c
trunk/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38857

[Bug middle-end/38857] [4.4 Regression] ICE in selective scheduler

2009-01-29 Thread amonakov at gcc dot gnu dot org



--- Comment #10 from amonakov at gcc dot gnu dot org  2009-01-29 10:55 
---
Fixed with above commit.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38857

[Bug middle-end/39794] [4.4/4.5 Regression] Miscompile with -O2 -funroll-loops

2009-04-17 Thread amonakov at gcc dot gnu dot org



-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

      Known to fail|                            |4.4.0 4.5.0
      Known to work|                            |4.3.2
           Priority|P3                          |P5
            Summary|Miscompile with -O2 -       |[4.4/4.5 Regression]
                   |funroll-loops               |Miscompile with -O2 -
                   |                            |funroll-loops


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794

[Bug middle-end/39794] [4.4/4.5 Regression] Miscompile with -O2 -funroll-loops

2009-04-17 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2009-04-17 21:55 
---
I attempted to investigate the miscompilation on the 4.4 branch.
The problem seems to appear in dse2 pass.  Basically, after encountering

  313 dx:DI=ax:DI+0x4
  187 {[di:DI+dx:DI]=[di:DI+dx:DI]0x1;clobber flags:CC;}
...
  191 [di:DI+dx:DI+0x4]=cx:SI
  314 dx:DI=ax:DI+0x8
  200 {[di:DI+dx:DI]=[di:DI+dx:DI]0x1;clobber flags:CC;}

and upon considering insn 200, dse2 decides to delete insn 191 and protect insn
187 (both are wrong, 200 depends on 191 and 187 is irrelevant):

**scanning insn=200
  mem: (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63])
(reg:DI 1 dx [orig:84 ivtmp.36 ] [84]))
expanding: r5 into: NULL
expanding: r1 into: (plus:DI (value:DI)
(const_int 8 [0x8]))
expanding value DI into: r0
expanding: r0 into: NULL

   after cselib_expand address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ]
[63])
(reg:DI 0 ax [orig:76 ivtmp.36 ] [76]))
(const_int 8 [0x8]))

   after canon_rtx address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ]
[63])
(reg:DI 0 ax [orig:76 ivtmp.36 ] [76]))
(const_int 8 [0x8]))
  varying cselib base=67 offset = 8
 processing cselib load mem:(mem:SI (plus:DI (reg/v/f:DI 5 di [orig:63 a ]
[63])
(reg:DI 1 dx [orig:84 ivtmp.36 ] [84])) [2 S4 A32])
 processing cselib load against insn 191
 processing cselib load against insn 187
removing from active insn=187 has store
  mem: (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63])
(reg:DI 1 dx [orig:84 ivtmp.36 ] [84]))
expanding: r5 into: NULL
expanding: r1 into: (plus:DI (value:DI)
(const_int 8 [0x8]))
expanding value DI into: r0
expanding: r0 into: NULL

   after cselib_expand address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ]
[63])
(reg:DI 0 ax [orig:76 ivtmp.36 ] [76]))
(const_int 8 [0x8]))

   after canon_rtx address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ]
[63])
(reg:DI 0 ax [orig:76 ivtmp.36 ] [76]))
(const_int 8 [0x8]))
  varying cselib base=67 offset = 8
 processing cselib store [8..12)
trying store in insn=191 gid=-1[8..12)
Locally deleting insn 191
deferring deletion of insn with uid = 191.
mems_found = 1, cannot_delete = false


I wonder how dse2 is supposed to notice that insn 314 changes DX.  E.g. when
checking rhs of insn 200 ([di+dx]) against lhs of insn 191 ([di+dx+4] for
different dx) in check_mem_read_rtx it calls canon_true_dependence (from
dse.c:2224) for [di+dx] and [di+dx+4] which returns false.  However, these
references clearly conflict.  Maybe a stupid question, but shouldn't this
canon_true_dependence call receive canonicalized MEMs from 'base' and
'store_info->cse_base'?
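
A small sketch of the point about canonicalized addresses, under stated
assumptions: the struct and function below are hypothetical and are not the
dse.c or alias.c interfaces.  The base id 67 and offset 8 are taken from the
dump above; the 4-byte widths are assumed from the SI mode, and the offset 4
for insn 187 is derived from dx = ax + 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical canonical address: an opaque base id plus a constant
   byte offset and an access width.  Not dse.c's real data structure.  */
struct canon_ref
{
  int base_id;
  long offset;
  long width;
};

/* Decide whether two canonicalized references may touch the same bytes.  */
static bool
refs_may_conflict (struct canon_ref x, struct canon_ref y)
{
  assert (x.width > 0 && y.width > 0);
  /* Different bases: nothing is known, so answer conservatively.  */
  if (x.base_id != y.base_id)
    return true;
  /* Same base: conflict iff the half-open byte ranges intersect.  */
  return x.offset < y.offset + y.width && y.offset < x.offset + x.width;
}
```

Once dx is expanded, the load in insn 200 and the store in insn 191 both
cover [8..12) relative to the same base, so they conflict, while insn 187
covers [4..8) and does not.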


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |amonakov at gmail dot com,
                   |                            |amonakov at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794

[Bug driver/39851] New: gcc -Q --help=target does not list extensions selected by -march=

2009-04-22 Thread amonakov at gcc dot gnu dot org

Even though -march=nocona enables SSE2 and SSE3 extensions, gcc -Q
--help=target does not list them as enabled.  This may be confusing to the user
(see http://gcc.gnu.org/ml/gcc-help/2009-04/msg00293.html ):

$ gcc -Q --help=target -march=nocona | grep msse[23]
  -msse2                                [disabled]
  -msse3                                [disabled]

-msse[23] would be enabled in gcc/config/i386/i386.c:override_options(), called
from toplev.c:process_options(), which is in turn called from do_compile(). 
OTOH, --help=target is processed in decode_options(), which is executed before
do_compile.  It would be nice if --help=target processing could see options
overridden by target backend.
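
The ordering problem can be sketched as follows.  This is only a model: the
struct and both functions are hypothetical and do not correspond to the real
decode_options or override_options code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag set; the real option state is far larger.  */
struct target_flags
{
  bool arch_nocona;   /* -march=nocona was given */
  bool sse2;
  bool sse3;
};

/* Stand-in for the back end's override step: -march implies ISA flags.  */
static void
override_options_sketch (struct target_flags *flags)
{
  if (flags->arch_nocona)
    flags->sse2 = flags->sse3 = true;
}

/* Stand-in for what --help=target would print for -msse3.  */
static bool
help_reports_sse3 (const struct target_flags *flags)
{
  return flags->sse3;
}
```

Querying the flags before the override step reports -msse3 as disabled, and
querying them afterwards reports it as enabled, which matches the behaviour
described in this report.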


-- 
   Summary: gcc -Q --help=target does not list extensions selected
by -march=
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: driver
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: amonakov at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39851

[Bug middle-end/42130] [graphite-branch] dealII fails

2009-11-25 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2009-11-25 11:48 
---
Tobias,

Please fix the testcase before committing to trunk, like this ('return 0' is
needed to ensure the test does not fail when compiled correctly; 'noclone' to
ensure that foo is not specialized for n=0):

/* { dg-options "-O2 -fno-tree-ch" } */
#include <vector>

using std::vector;

vector<unsigned> __attribute__((noinline,noclone)) foo(unsigned n)
{
  vector<unsigned> *vv = new vector<unsigned>(n, 0u);
  return *vv;
}


int main()
{
  foo(0);
  return 0;
}

(In reply to comment #3)


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |amonakov at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42130

[Bug middle-end/42245] ICE in verify_backedges for 197.parser with sel-sched

2009-12-04 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2009-12-04 18:00 
---
(In reply to comment #0)

Janis,

Thank you for the testcase.  This bug and PR42249 are fixed by Andrey's old
patch:

http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01930.html

The patch in that message still applies cleanly.  I'm working on re-testing it
with current mainline.  If you could test that patch in your environment, it
would be much appreciated.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |amonakov at gcc dot gnu dot
                   |                            |org
         AssignedTo|unassigned at gcc dot gnu   |amonakov at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-12-04 18:00:57
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42245

[Bug rtl-optimization/42246] ICE in init_seqno for 186.crafty with sel-sched

2009-12-04 Thread amonakov at gcc dot gnu dot org



-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |amonakov at gcc dot gnu dot
                   |                            |org
         AssignedTo|unassigned at gcc dot gnu   |amonakov at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-12-04 18:01:40
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42246

[Bug rtl-optimization/42249] unrecognizable insn for 254.gap with sel-sched

2009-12-04 Thread amonakov at gcc dot gnu dot org



-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |amonakov at gcc dot gnu dot
                   |                            |org
         AssignedTo|unassigned at gcc dot gnu   |amonakov at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-12-04 18:02:16
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42249

[Bug rtl-optimization/42294] [4.5 Regression] ICE in code_motion_path_driver for 416.gamess

2009-12-07 Thread amonakov at gcc dot gnu dot org



--- Comment #3 from amonakov at gcc dot gnu dot org  2009-12-07 18:23 
---
Also not reproducible on x86_64-ppc64 cross.
While codegen differences on ppc/ppc64/x86_64 cross are certainly surprising,
in the end this testcase most likely indicates a bug in sel-sched.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

                 CC|                            |abel at ispras dot ru
         AssignedTo|unassigned at gcc dot gnu   |amonakov at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-12-07 18:23:47
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42294

[Bug rtl-optimization/42294] [4.5 Regression] ICE in code_motion_path_driver for 416.gamess

2009-12-08 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2009-12-08 11:55 
---
(In reply to comment #3)
> Also not reproducible on x86_64-ppc64 cross.
> While codegen differences on ppc/ppc64/x86_64 cross are certainly surprising,
> in the end this testcase most likely indicates a bug in sel-sched.

Reproducible on x86_64-ppc64 cross with -fno-section-anchors appended to the
command line.  The native ppc64 compiler seems not to use section anchors on
this testcase.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42294

[Bug middle-end/42245] ICE in verify_backedges for 197.parser with sel-sched

2009-12-28 Thread amonakov at gcc dot gnu dot org



--- Comment #5 from amonakov at gcc dot gnu dot org  2009-12-28 14:23 
---
(In reply to comment #4)
> Re. comment #3 - do you have an example of when/how this can happen? Perhaps
> you can add it to the comment.
> 

Here, we are scheduling a loop that looks like this:

       +---+
       | 4 |<--+
       +-+-+   |
         |     |
         v     |
       +---+   |
       | 5 +-+ |
       +-+-+ | |
         |   | |
         v   | |
       +---+ | |
exit<--+ 6 | | |
       +---+ | |
         |   | |
         v   | |
       +---+ | |
     +-+ 7 | | |
     | +---+ | |
     |       | |
     |       | |
     | +---+ | |
     | | 8 |<+ |
     | +-+-+   |
     |   |     |
     |   v     |
     | +---+   |
     +>| 9 +---+
       +---+

The order of blocks as given by prev_bb/next_bb corresponds to the figure, but
in rev_top_order_index BB8 goes before BB6 (which is OK since they are
topologically independent).  When BB8 becomes empty in the process of
scheduling, in preparation for deleting it we redirect the BB7->BB9 branch to
BB8 (essentially eliminating a now unneeded jump instruction at the end of
BB7).  The corresponding comment in sel-sched-ir.c reads:

/* Check if there is an unnecessary jump in previous basic block leading to
next basic block left after removing INSN from stream.  If it is so, remove
that jump and redirect edge to current basic block (where there was INSN before
deletion).  This way when NOP will be deleted several instructions later with
its basic block we will not get a jump to next instruction, which can be
harmful.  */

This makes BB6 and BB8 topologically dependent.  We then merge BB7 and BB8,
creating a BB6->BB8 branch, which appears to be a backedge (since topological
order has not been recomputed).
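
The effect of the stale order can be shown with a simplified model.  The
array topo_index and both functions are hypothetical; the real scheduler uses
rev_top_order_index and its own backedge check, so this is only a sketch of
the reasoning above, with a smaller index meaning "earlier in topological
order".

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BB 16

/* topo_index[bb] is the position of block BB in a topological order.
   Hypothetical simplification of the scheduler's bookkeeping.  */
static int topo_index[MAX_BB];

/* Record the order given by BLOCKS[0..N-1].  */
static void
set_order (const int *blocks, int n)
{
  for (int i = 0; i < n; i++)
    topo_index[blocks[i]] = i;
}

/* An edge whose destination does not come later than its source in the
   recorded order looks like a backedge.  */
static bool
looks_like_backedge (int src, int dest)
{
  assert (src >= 0 && src < MAX_BB && dest >= 0 && dest < MAX_BB);
  return topo_index[dest] <= topo_index[src];
}
```

With the stale order 4, 5, 8, 6, 7, 9 the new BB6->BB8 edge is classified as
a backedge; after recomputing the order as 4, 5, 6, 8, 9 it is a forward
edge, and only BB9->BB4 remains a backedge.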

Since the explanation is quite lengthy, we'd prefer to just leave a reference
to this PR in the comment.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42245

[Bug libgomp/42519] bootstrap fails on powerpc64-linux because of libgomp

2009-12-28 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2009-12-28 17:45 
---
(In reply to comment #1)
> So, your pthread.h doesn't contain prototype for pthread_getaffinity_np, yet
> libpthread.so.0 exports it?  Otherwise:

Affected system declares those functions in /usr/include/nptl/pthread.h.  Some
clarifications here: http://gcc.gnu.org/ml/gcc/2009-12/msg00376.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42519

[Bug middle-end/42245] ICE in verify_backedges for 197.parser with sel-sched

2010-01-11 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2010-01-11 15:04 
---
Our previous patch (http://gcc.gnu.org/ml/gcc-patches/2009-12/msg01215.html)
failed to correctly fix the problem, and the new testcase uncovers a flaw in
that implementation.

We 'forgot' to recompute topological order if it was invalidated in
tidy_control_flow but not in maybe_tidy_empty_bb called from that function. 
Fixed by passing recompute_toporder_p to the latter on top of the mentioned
previous patch as below (the patch also makes maybe_tidy_empty_bb static by
moving the only caller into the same file).

* sel-sched-ir.c (maybe_tidy_empty_bb): Make static.  Add new
argument.  Update all callers.
(purge_empty_blocks): Export and move from...
* sel-sched.c (purge_empty_blocks): ... here.  Delete.
* sel-sched-ir.h (maybe_tidy_empty_bb): Delete prototype.
(purge_empty_blocks): Declare.

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index cffee1c..e20df17 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -3538,13 +3538,13 @@ sel_recompute_toporder (void)
 }

 /* Tidy the possibly empty block BB.  */
-bool
-maybe_tidy_empty_bb (basic_block bb)
+static bool
+maybe_tidy_empty_bb (basic_block bb, bool recompute_toporder_p)
 {
   basic_block succ_bb, pred_bb;
   edge e;
   edge_iterator ei;
-  bool rescan_p, recompute_toporder_p = false;
+  bool rescan_p;

   /* Keep empty bb only if this block immediately precedes EXIT and
  has incoming non-fallthrough edge, or it has no predecessors or
@@ -3630,7 +3630,7 @@ tidy_control_flow (basic_block xbb, bool full_tidying)
   insn_t first, last;

   /* First check whether XBB is empty.  */
-  changed = maybe_tidy_empty_bb (xbb);
+  changed = maybe_tidy_empty_bb (xbb, false);
   if (changed || !full_tidying)
 return changed;

@@ -3694,7 +3694,7 @@ tidy_control_flow (basic_block xbb, bool full_tidying)
  that contained that jump, becomes empty too.  In such case
  remove it too.  */
   if (sel_bb_empty_p (xbb->prev_bb))
-    changed = maybe_tidy_empty_bb (xbb->prev_bb);
+    changed = maybe_tidy_empty_bb (xbb->prev_bb, recompute_toporder_p);
   else if (recompute_toporder_p)
sel_recompute_toporder ();
 }
@@ -3702,6 +3702,24 @@ tidy_control_flow (basic_block xbb, bool full_tidying)
   return changed;
 }

+/* Purge meaningless empty blocks in the middle of a region.  */
+void
+purge_empty_blocks (void)
+{
+  /* Do not attempt to delete preheader.  */
+  int i = sel_is_loop_preheader_p (BASIC_BLOCK (BB_TO_BLOCK (0))) ? 1 : 0;
+
+  while (i < current_nr_blocks)
+{
+  basic_block b = BASIC_BLOCK (BB_TO_BLOCK (i));
+
+  if (maybe_tidy_empty_bb (b, false))
+   continue;
+
+  i++;
+}
+}
+
 /* Rip-off INSN from the insn stream.  When ONLY_DISCONNECT is true,
do not delete insn's data, because it will be later re-emitted.
Return true if we have removed some blocks afterwards.  */
diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
index 317258c..b5121c0 100644
--- a/gcc/sel-sched-ir.h
+++ b/gcc/sel-sched-ir.h
@@ -1619,7 +1619,7 @@ extern bool tidy_control_flow (basic_block, bool);
 extern void free_bb_note_pool (void);

 extern void sel_remove_empty_bb (basic_block, bool, bool);
-extern bool maybe_tidy_empty_bb (basic_block bb);
+extern void purge_empty_blocks (void);
 extern basic_block sel_split_edge (edge);
 extern basic_block sel_create_recovery_block (insn_t);
 extern void sel_merge_blocks (basic_block, basic_block);
diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 37be754..9271b80 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -6790,24 +6790,6 @@ setup_current_loop_nest (int rgn)
   gcc_assert (LOOP_MARKED_FOR_PIPELINING_P (current_loop_nest));
 }

-/* Purge meaningless empty blocks in the middle of a region.  */
-static void
-purge_empty_blocks (void)
-{
-  /* Do not attempt to delete preheader.  */
-  int i = sel_is_loop_preheader_p (BASIC_BLOCK (BB_TO_BLOCK (0))) ? 1 : 0;
-
-  while (i < current_nr_blocks)
-{
-  basic_block b = BASIC_BLOCK (BB_TO_BLOCK (i));
-
-  if (maybe_tidy_empty_bb (b))
-   continue;
-
-  i++;
-}
-}
-
 /* Compute instruction priorities for current region.  */
 static void
 sel_compute_priorities (int rgn)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42245

[Bug middle-end/37447] [4.4 Regression] test pr28982b.c fails execution on power4 or later with ira change

2008-10-02 Thread amonakov at gcc dot gnu dot org



--- Comment #6 from amonakov at gcc dot gnu dot org  2008-10-02 11:27 
---
This patch also fixes miscompilation of vla1.f90 test on ia64 on sel-sched
branch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37447

[Bug target/37381] [4.4 Regression] ICE in ia64_speculate_insn, at config/ia64/ia64.c:6902

2008-10-14 Thread amonakov at gcc dot gnu dot org



--- Comment #10 from amonakov at gcc dot gnu dot org  2008-10-14 13:41 
---
Since Andrey has just committed ia64 changes from sel-sched branch to trunk,
the underlying problem is fixed and ICEs on original testcase and mentioned
regression tests are gone.  Closing as fixed.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37381

[Bug target/37381] [4.4 Regression] ICE in ia64_speculate_insn, at config/ia64/ia64.c:6902

2008-10-16 Thread amonakov at gcc dot gnu dot org



--- Comment #12 from amonakov at gcc dot gnu dot org  2008-10-16 17:31 
---
Subject: Bug 37381

Author: amonakov
Date: Thu Oct 16 17:30:06 2008
New Revision: 141177

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=141177
Log:
2008-10-16  Alexander Monakov  [EMAIL PROTECTED]

PR target/37381
* gcc.c-torture/compile/pr37381.c: New test.


Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr37381.c
Modified:
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37381

[Bug target/37381] [4.4 Regression] ICE in ia64_speculate_insn, at config/ia64/ia64.c:6902

2008-10-16 Thread amonakov at gcc dot gnu dot org



--- Comment #13 from amonakov at gcc dot gnu dot org  2008-10-16 17:42 
---
H.J., thanks for reminding.  Closing again.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37381

[Bug tree-optimization/41118] New: Wrong dependence analysis in graphite for unrestricted pointers

2009-08-19 Thread amonakov at gcc dot gnu dot org

Consider this example: 

cat > pr41118.c <<EOF

void foo(int n, int *a, int *b)
{
  int i;
  for (i = 0; i < n; i++)
    a[i] = b[i];
}

EOF

gcc -S -O2 pr41118.c -floop-parallelize-all -ftree-parallelize-loops=2

grep GOMP pr41118.s

GCC considers the loop parallel, even though the arrays pointed to by the
arguments may arbitrarily overlap.  This is because dependency analysis in
graphite treats p[i] as global_mem[alias_set_for_p][i].  In this example,
since both a and b have alias set 0, a[0][i0] and b[0][i1] are considered
independent for i0 != i1.
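
To see why overlap matters, the sketch below runs the same loop body in two
iteration orders on overlapping arguments.  foo is the function from the
report; foo_reordered is a hypothetical helper that simply reverses the
iteration order, standing in for one of the orders a parallel schedule is
free to pick.

```c
#include <assert.h>

/* The loop from the report, iterations in program order.  */
static void
foo (int n, int *a, int *b)
{
  for (int i = 0; i < n; i++)
    a[i] = b[i];
}

/* Same body with iterations reversed, standing in for one of the many
   orders a parallel schedule is free to pick.  */
static void
foo_reordered (int n, int *a, int *b)
{
  assert (n >= 0);
  for (int i = n - 1; i >= 0; i--)
    a[i] = b[i];
}
```

Called with a = x + 1 and b = x on x = {1, 2, 3, 4, 5} and n = 4, the
in-order loop leaves {1, 1, 1, 1, 1}, whereas the reordered loop leaves
{1, 1, 2, 3, 4}.  The iterations are therefore not independent unless the
pointers are known not to alias.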


-- 
   Summary: Wrong dependence analysis in graphite for unrestricted
pointers
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: amonakov at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41118

[Bug target/33604] [4.3/4.4 Regression] Revision 119502 causes significantly slower results with 4.3/4.4 compared to 4.2

2008-08-29 Thread amonakov at gcc dot gnu dot org



--- Comment #43 from amonakov at gcc dot gnu dot org  2008-08-29 13:12 
---
Checking original testcase times on x86_64 prescott with gentoo 4.2, 4.3 and
today's trunk:
2.960s  g++-4.2.4 (GCC) 4.2.4 (Gentoo 4.2.4 p1.0)
2.916s  g++-4.3.1 (Gentoo 4.3.1-r1 p1.1) 4.3.1
3.993s  g++ (GCC) 4.4.0 20080829 (experimental)
2.796s  g++ (GCC) 4.4.0 20080829 (experimental) with --param
        max-inline-insns-auto=126

So I believe lack of inlining is 4.4's biggest problem: we do not inline the
3x3 matrix multiplication in the benchmark loop.

While looking at it I found that einline2 dump does not always show the reason
for not inlining. I would like to propose the following patch:

--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -1494,6 +1494,8 @@ cgraph_decide_inlining_incrementally (struct cgraph_node *node,
          }
        if (cgraph_default_inline_p (e->callee, &failed_reason))
          inlined |= try_inline (e, mode, depth);
+       else if (dump_file)
+         fprintf (dump_file, "Not inlining: %s.\n", failed_reason);
       }
   node->aux = (void *)(size_t) old_mode;
   return inlined;


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604

[Bug target/37381] [4.4 Regression] ICE in ia64_speculate_insn, at config/ia64/ia64.c:6902

2008-09-08 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2008-09-08 10:38 
---
Scheduling of instructions dependent on speculative loads was implemented a
bit differently on sel-sched branch and on trunk (before the merge).  Since
ia64.c changes were not checked in, a discrepancy appeared, resulting in an ICE
when attempting to schedule an instruction dependent on speculative load.

This issue is fixed by restoring the scheduler behaviour to the
pre-merge state.  This change will have to be reverted when ia64 changes are
approved (handling of BE_IN_SPEC bits will be implemented in back-end).
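
The filtering itself is a plain bitmask operation.  In the sketch below the
single-bit encodings and the function request_for_backend are hypothetical
(GCC's real ds_t layout is different); only the names BEGIN_SPEC and
BE_IN_SPEC come from the description.

```c
#include <assert.h>

typedef unsigned int ds_t;

/* Hypothetical single-bit encodings; GCC's real ds_t layout differs.  */
enum
{
  BEGIN_DATA    = 1u << 0,
  BE_IN_DATA    = 1u << 1,
  BEGIN_CONTROL = 1u << 2,
  BE_IN_CONTROL = 1u << 3,
  BEGIN_SPEC    = BEGIN_DATA | BEGIN_CONTROL,
  BE_IN_SPEC    = BE_IN_DATA | BE_IN_CONTROL
};

/* Return the part of REQUEST that the back end is shown: only the
   BEGIN_* bits survive, every BE_IN_* bit is dropped.  */
static ds_t
request_for_backend (ds_t request)
{
  ds_t filtered = request & BEGIN_SPEC;
  assert ((filtered & BE_IN_SPEC) == 0);
  return filtered;
}
```

A request carrying both a BEGIN_* and a BE_IN_* bit thus reaches the back end
with only the BEGIN_* bit set, which matches the pre-merge behaviour the fix
restores.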

Bootstrapped and regtested on ia64-linux.  I will add the testcase and re-post
to [EMAIL PROTECTED]


PR target/37381
* haifa-sched.c (sched_speculate_insn): Filter BE_IN_SPEC bits before
passing dependence status to back-end.


Index: gcc/haifa-sched.c
===
--- gcc/haifa-sched.c   (revision 140093)
+++ gcc/haifa-sched.c   (working copy)
@@ -4288,7 +4288,7 @@ sched_speculate_insn (rtx insn, ds_t req
       && !(request & BEGIN_SPEC))
     return 0;

-  return targetm.sched.speculate_insn (insn, request, new_pat);
+  return targetm.sched.speculate_insn (insn, request & BEGIN_SPEC, new_pat);
 }

 static int


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

         AssignedTo|unassigned at gcc dot gnu   |amonakov at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-09-08 10:38:12
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37381

[Bug middle-end/37499] [4.4 Regression] Scheduling pass 2 time increases by order of magnitude

2008-09-16 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2008-09-16 15:12 
---
A patch for this bug has been posted at
http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01135.html

Running the testcase on a similarly configured compiler shows 2.47 seconds
spent in scheduling2, out of 151.27 total time (2.2GHz Opteron 8354).
Unpatched times are 32.43 sched2/180.89 total.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

         AssignedTo|abel at gcc dot gnu dot org |amonakov at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37499

[Bug middle-end/37499] [4.4 Regression] Scheduling pass 2 time increases by order of magnitude

2008-09-18 Thread amonakov at gcc dot gnu dot org

--- Comment #6 from amonakov at gcc dot gnu dot org  2008-09-18 08:31 
---
Subject: Bug 37499

Author: amonakov
Date: Thu Sep 18 08:29:48 2008
New Revision: 140445

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=140445
Log:
2008-09-18  Alexander Monakov  [EMAIL PROTECTED]

PR middle-end/37499
* sched-int.h (struct _haifa_insn_data): Remove unused field
ref_count.

* sched-rgn.c (ref_counts): Remove.
(insn_referenced): New static variable.
(INSN_REF_COUNT): Remove.
(sched_run_compute_dependencies): Use insn_referenced instead of
INSN_REF_COUNT.
(add_branch_dependences): Likewise.  Delete dead assignment.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/sched-int.h
trunk/gcc/sched-rgn.c

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37499

[Bug middle-end/37499] [4.4 Regression] Scheduling pass 2 time increases by order of magnitude

2008-09-18 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2008-09-18 08:34 
---
Fixed with above commit.


-- 

amonakov at gcc dot gnu dot org changed:

               What|Removed                     |Added

             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37499

[Bug middle-end/42245] ICE in verify_backedges for 197.parser with sel-sched

2010-01-14 Thread amonakov at gcc dot gnu dot org

--- Comment #9 from amonakov at gcc dot gnu dot org  2010-01-14 10:29 
---
Subject: Bug 42245

Author: amonakov
Date: Thu Jan 14 10:28:47 2010
New Revision: 155890

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155890
Log:
2010-01-14  Andrey Belevantsev  a...@ispras.ru
Alexander Monakov  amona...@ispras.ru

PR middle-end/42245
* sel-sched-ir.c (sel_recompute_toporder): New.  Use it...
(maybe_tidy_empty_bb): ... here.  Make static.  Add new
argument.  Update all callers.
(tidy_control_flow): ... and here.  Recompute topological order
of basic blocks in region if necessary.
(sel_redirect_edge_and_branch): Change return type.  Return true
if topological order might have been invalidated.
(purge_empty_blocks): Export and move from...
* sel-sched.c (purge_empty_blocks): ... here.
* sel-sched-ir.h (sel_redirect_edge_and_branch): Update prototype.
(maybe_tidy_empty_bb): Delete prototype.
(purge_empty_blocks): Declare.

* gcc.dg/pr42245.c: New.
* gcc.dg/pr42245-2.c: New.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/sel-sched-ir.c
trunk/gcc/sel-sched-ir.h
trunk/gcc/sel-sched.c
trunk/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42245

[Bug middle-end/42245] ICE in verify_backedges for 197.parser with sel-sched

2010-01-14 Thread amonakov at gcc dot gnu dot org



--- Comment #10 from amonakov at gcc dot gnu dot org  2010-01-14 10:38 
---
Subject: Bug 42245

Author: amonakov
Date: Thu Jan 14 10:38:14 2010
New Revision: 155891

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155891
Log:
Add tests missing from previous commit.

PR middle-end/42245
* gcc.dg/pr42245.c: New.
* gcc.dg/pr42245-2.c: New.

Added:
trunk/gcc/testsuite/gcc.dg/pr42245-2.c
trunk/gcc/testsuite/gcc.dg/pr42245.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42245

[Bug rtl-optimization/39453] ICE : in init_seqno, at sel-sched.c:6433

2010-01-14 Thread amonakov at gcc dot gnu dot org



--- Comment #6 from amonakov at gcc dot gnu dot org  2010-01-14 10:40 
---
Subject: Bug 39453

Author: amonakov
Date: Thu Jan 14 10:40:19 2010
New Revision: 155892

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155892
Log:
2010-01-14  Alexander Monakov  amona...@ispras.ru

PR rtl-optimization/39453
PR rtl-optimization/42246
* sel-sched-ir.c (considered_for_pipelining_p): Do not test
for pipelining_p.
(sel_add_loop_preheaders): Add preheader to last_added_blocks.

* gcc.dg/pr39453.c: New.
* gcc.dg/pr42246.c: New.


Added:
trunk/gcc/testsuite/gcc.dg/pr39453.c
trunk/gcc/testsuite/gcc.dg/pr42246.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/sel-sched-ir.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39453

[Bug rtl-optimization/42246] ICE in init_seqno for 186.crafty with sel-sched

2010-01-14 Thread amonakov at gcc dot gnu dot org



--- Comment #5 from amonakov at gcc dot gnu dot org  2010-01-14 10:40 
---
Subject: Bug 42246

Author: amonakov
Date: Thu Jan 14 10:40:19 2010
New Revision: 155892

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155892
Log:
2010-01-14  Alexander Monakov  amona...@ispras.ru

PR rtl-optimization/39453
PR rtl-optimization/42246
* sel-sched-ir.c (considered_for_pipelining_p): Do not test
for pipelining_p.
(sel_add_loop_preheaders): Add preheader to last_added_blocks.

* gcc.dg/pr39453.c: New.
* gcc.dg/pr42246.c: New.


Added:
trunk/gcc/testsuite/gcc.dg/pr39453.c
trunk/gcc/testsuite/gcc.dg/pr42246.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/sel-sched-ir.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42246

[Bug middle-end/42245] ICE in verify_backedges for 197.parser with sel-sched

2010-01-14 Thread amonakov at gcc dot gnu dot org



--- Comment #11 from amonakov at gcc dot gnu dot org  2010-01-14 10:41 
---
Fixed by revision 155890


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42245

[Bug rtl-optimization/39453] ICE : in init_seqno, at sel-sched.c:6433

2010-01-14 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2010-01-14 10:44 
---
Fixed by revision 155892


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39453

[Bug rtl-optimization/42294] [4.5 Regression] ICE in code_motion_path_driver for 416.gamess

2010-01-14 Thread amonakov at gcc dot gnu dot org



--- Comment #10 from amonakov at gcc dot gnu dot org  2010-01-14 10:47 
---
Subject: Bug 42294

Author: amonakov
Date: Thu Jan 14 10:46:57 2010
New Revision: 155893

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155893
Log:
2010-01-14  Alexander Monakov  amona...@ispras.ru

PR rtl-optimization/42294
* sel-sched-ir.h (struct _sel_insn_data): Update comment.
* sel-sched.c (move_exprs_to_boundary): Transitively add all
originators' originators.

* gfortran.dg/pr42294.f: New.


Added:
trunk/gcc/testsuite/gfortran.dg/pr42294.f
Modified:
trunk/gcc/ChangeLog
trunk/gcc/sel-sched-ir.h
trunk/gcc/sel-sched.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42294

[Bug tree-optimization/42771] [4.5 Regression] ICE: in graphite_loop_normal_form, at graphite-sese-to-poly.c (2)

2010-01-25 Thread amonakov at gcc dot gnu dot org



--- Comment #5 from amonakov at gcc dot gnu dot org  2010-01-25 17:06 
---
We fail to find number of iterations after rewriting reductions out of SSA. 
Before graphite pass, IR looks like (for the previous testcase, pr42771.c):

<bb 9>:
  # j_26 = PHI <j_20(10)>

<bb 10>:
  # j_33 = PHI <j_26(9), 1(16)>
  D.2747_16 = B[j_33][0];
  D.2748_17 = j_33 + -1;
  D.2749_18 = B[D.2748_17][0];
  D.2750_19 = D.2749_18 ^ D.2747_16;
  B[j_33][0] = D.2750_19;
  j_20 = j_33 + 1;
  if (jm_14(D) > j_20)
    goto <bb 9>;
  else
    goto <bb 11>;

At the time of the ICE, IR is transformed to
(gdb) call debug_loop(use_loop, 3)
loop_3 (header = 10, latch = 9, niter = (unsigned int) jm_14(D) + 4294967294,
upper_bound = 2147483646)
{
  bb_9 (preds = {bb_10 }, succs = {bb_10 })
  {
  <bb 9>:
    # .MEM_24 = PHI <.MEM_78(10)>
    # VUSE <.MEM_24>
    j_26 = Close_Phi.13[0];
    # .MEM_79 = VDEF <.MEM_24>
    General_Reduction.14[0] = j_26;

  }
  bb_10 (preds = {bb_9 bb_16 }, succs = {bb_9 bb_11 })
  {
  <bb 10>:
    # .MEM_15 = PHI <.MEM_79(9), .MEM_77(16)>
    # VUSE <.MEM_15>
    D.2766_76 = General_Reduction.14[0];
    j_33 = D.2766_76;
    # VUSE <.MEM_15>
    D.2747_16 = B[j_33][0];
    D.2748_17 = j_33 + -1;
    # VUSE <.MEM_15>
    D.2749_18 = B[D.2748_17][0];
    D.2750_19 = D.2749_18 ^ D.2747_16;
    # .MEM_30 = VDEF <.MEM_15>
    B[j_33][0] = D.2750_19;
    j_20 = j_33 + 1;
    # .MEM_78 = VDEF <.MEM_30>
    Close_Phi.13[0] = j_20;
    if (jm_14(D) > j_20)
      goto <bb 9>;
    else
      goto <bb 11>;
  }
}

We would fail to discover scalar evolution of j_33.
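
At the source level, the rewrite corresponds roughly to turning the first
function below into the second.  This is only a hand-written sketch: the
function names are made up, B is flattened to one dimension, and the two local
arrays stand in for Close_Phi.13 and General_Reduction.14 from the dump.

```c
#include <assert.h>

/* Before: j is an ordinary scalar induction variable, so its evolution
   (start 1, step 1) is directly visible.  */
void xor_prefix_scalar (int *b, int jm)
{
  assert (jm >= 2);
  int j = 1;
  do
    {
      b[j] ^= b[j - 1];
      j = j + 1;
    }
  while (jm > j);
}

/* After: the value of j is stored to memory in one block and reloaded in
   the next, so it is no longer a simple PHI-based induction variable.  */
void xor_prefix_through_memory (int *b, int jm)
{
  assert (jm >= 2);
  int close_phi[1];
  int general_reduction[1];
  general_reduction[0] = 1;
  for (;;)
    {
      int j = general_reduction[0];        /* bb 10: reload j */
      b[j] ^= b[j - 1];
      close_phi[0] = j + 1;
      if (!(jm > j + 1))
        break;
      general_reduction[0] = close_phi[0]; /* bb 9: copy through memory */
    }
}
```

Both functions compute the same result; the difference is only in how the
loop counter is carried from one iteration to the next.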


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42771

[Bug tree-optimization/42771] [4.5 Regression][graphite] ICE: in graphite_loop_normal_form, at graphite-sese-to-poly.c (2)

2010-02-10 Thread amonakov at gcc dot gnu dot org



--- Comment #10 from amonakov at gcc dot gnu dot org  2010-02-10 18:26 
---
(In reply to comment #9)
> Fixed as described in
> http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00436.html
>

I don't see how this patch makes simple_iv call from number_of_iterations_exit
return true for j_20.  Could you please kindly explain?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42771

[Bug tree-optimization/43012] [4.5 Regression][graphite] wrong code for -floop-strip-mine in 453.povray

2010-02-10 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2010-02-10 18:41 
---
Confirming. Reproducible on amd64-linux.

This appears to be a bug in CLooG.  Disabling CLooG optimizations on the
graphite branch fixes the bug.  The problem is that CLooG generates wrong bounds
for parts of the strip-mined loop (the bounds of the first and the last loops
are wrong):

for (scat_3=-51;scat_3<=63;scat_3++) {
  S3(scat_3) ;
  S4(scat_3) ;
}
for (scat_3=64;scat_3<=76;scat_3++) {
  S3(scat_3) ;
  S6(scat_3) ;
}
for (scat_3=77;scat_3<=88;scat_3++) {
  S3(scat_3) ;
  S8(scat_3) ;
}
for (scat_3=89;scat_3<=-1;scat_3++) {
  S3(scat_3) ;
}


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2010-02-10 18:41:13
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43012

[Bug tree-optimization/43012] [4.5 Regression][graphite] wrong code for -floop-strip-mine in 453.povray

2010-02-11 Thread amonakov at gcc dot gnu dot org



--- Comment #3 from amonakov at gcc dot gnu dot org  2010-02-11 14:28 
---
(In reply to comment #2)
> Confirming. Reproducible on amd64-linux.
>
> This appears to be a bug in CLooG.  Disable CLooG optimizations on graphite
> branch fixes the bug.  The problem is that CLooG generates wrong bounds for
> parts of strip-mined loop (bounds of the first and the last loops are wrong):

I've sent a CLooG patch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43012

[Bug c/43052] Inline memcmp is much slower than glibc's

2010-02-12 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-02-12 15:45 
---
Confirmed. GCC simply emits repz cmpsb.  There was even an e-mail with
benchmark results and a patch (never applied):

http://gcc.gnu.org/ml/gcc-patches/2009-09/msg02129.html


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2010-02-12 15:45:19
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052

[Bug tree-optimization/36905] [4.3/4.4/4.5 Regression] IV-opts needs a little help with a[i+1]

2010-02-16 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2010-02-16 17:43 
---
(In reply to comment #6)

Looks like this has been fixed.  We do generate good code:

fred:
li 0,100
mtctr 0
.L2:
sthu 3,2(4)
bdnz .L2
blr
.size   fred, .-fred
.ident  GCC: (GNU) 4.5.0 20100215 (experimental)

And ivopts dump is quite sane:

<bb 2>:
  ivtmp.13_17 = (unsigned int) out1_5(D);

<bb 3>:
  # i_13 = PHI <i_3(4), 0(2)>
  # ivtmp.13_15 = PHI <ivtmp.13_16(4), ivtmp.13_17(2)>
  i_3 = i_13 + 1;
  ivtmp.13_16 = ivtmp.13_15 + 2;
  D.2031_18 = (void *) ivtmp.13_16;
  MEM[base: D.2031_18] = in_7(D);
  if (i_3 != 100)
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 4>:
  goto <bb 3>;


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36905

[Bug pending/41998] GCC 4.6 pending patches meta-bug

2010-02-17 Thread amonakov at gcc dot gnu dot org



--- Comment #6 from amonakov at gcc dot gnu dot org  2010-02-17 16:32 
---
Handle ADDR_EXPR in SCEV

http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00666.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41998

[Bug tree-optimization/43174] New: Teaching SCEV about ADDR_EXPR causes regression

2010-02-25 Thread amonakov at gcc dot gnu dot org

With the patch from here: http://gcc.gnu.org/ml/gcc-patches/2010-02/msg00668.html
IVopts begins to create IVs for expressions like &a0[i][j][0].  This may cause
regressions in stack usage and code size (and possibly speed).  Test case:

/* --->8--- */
enum {N=123};
int a0[N][N][N], a1[N][N][N], a2[N][N][N], a3[N][N][N],
a4[N][N][N], a5[N][N][N], a6[N][N][N], a7[N][N][N];

int foo() {
  int i, j, k, s = 0;
  for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
      for (k = 0; k < N; k++) {
        s += a0[i][j][k]; s += a1[i][j][k]; s += a2[i][j][k]; s += a3[i][j][k];
        s += a4[i][j][k]; s += a5[i][j][k]; s += a6[i][j][k]; s += a7[i][j][k];
      }
  return s;
}
/* --->8--- */

Without the patch, IVopts produces one IV for the j loop and 8 IVs for the k
loop.  With the patch, IVopts additionally produces 8 IVs for the j loop (with
a 123*4 increment), 4 of which live on the stack (on x86-64, -O2).

Creation of IVs that live on the stack is likely due to inexact register
pressure estimation in IVopts.

However, it would be nice if IVopts could notice that it's cheaper to take the
final value of the inner loop IVs (e.g. &a0[i][j][k]) instead of incrementing
the IV holding &a0[i][j][0] by 123*4.  It would decrease register pressure and
allow us to generate perfect code for the test case.
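
The equivalence this relies on can be checked with a small sketch.  The helper
names below are invented for illustration, and the 123*4 byte stride assumes a
4-byte int, as on x86-64.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 123 };
int a0[N][N][N];

/* Value the j-loop IV would hold after one increment by N * sizeof (int).  */
int *next_row_by_increment (int i, int j)
{
  assert (i >= 0 && i < N && j >= 0 && j < N);
  return &a0[i][j][0] + N;
}

/* Final value of the k-loop IV: it starts at &a0[i][j][0] and is stepped
   once per k iteration.  */
int *next_row_by_final_value (int i, int j)
{
  assert (i >= 0 && i < N && j >= 0 && j < N);
  int *p = &a0[i][j][0];
  for (int k = 0; k < N; k++)
    p++;
  return p;
}
```

Since both helpers return the same address, the separate j-loop IV is
redundant once the k loop has finished.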


-- 
   Summary: Teaching SCEV about ADDR_EXPR causes regression
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: amonakov at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43174

[Bug tree-optimization/43209] [4.5 Regression] ICE in try_improve_iv_set, at tree-ssa-loop-ivopts.c:5238

2010-02-28 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2010-02-28 22:01 
---
Confirmed.
The first invocation of get_computation_aff fails with ustep == (long) j, cstep
== (unsigned long) j: constant_multiple_of (ustep, cstep, &rat) returns false
(j is int, STRIP_NOPS ({u,c}step) preserves conversions).


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2010-02-28 22:01:43
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43209

[Bug tree-optimization/43174] Teaching SCEV about ADDR_EXPR causes regression

2010-03-01 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-03-01 17:43 
---
Created an attachment (id=20001)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20001&action=view)
Simplify increments in IVopts using final values of inner loop IVs

A quick & dirty attempt to implement register pressure reduction in outer loops
by using final values of inner loop IVs.  Currently, given
for (i = 0; i < N; i++)
  for (j = 0; j < N; j++)
    s += a[i][j];
we generate something like
bb1
L1:
s.0 = PHI(0, s.2)
i.0 = PHI(0, i.1)
ivtmp.0 = &a[i.0][0]
bb2
L2:
s.1 = PHI(s.0, s.2)
j.0 = PHI(122, j.1)
ivtmp.1 = PHI(ivtmp.0, ivtmp.2)
s.2 = s.1 + MEM(ivtmp.1)
ivtmp.2 = ivtmp.1 + 4
j.1 = j.0 - 1
if (j.1 >= 0) goto L2
bb3
i.1 = i.0 + 1
if (i.1 <= 122) goto L1

This, together with the patch mentioned in the previous comment, allows us to
generate:
ivtmp.0 = &a[0][0]
bb1
L1:
s.0 = PHI(0, s.2)
i.0 = PHI(122, i.1)
ivtmp.1 = PHI(ivtmp.0, ivtmp.4)
bb2
L2:
s.1 = PHI(s.0, s.2)
j.0 = PHI(122, j.1)
ivtmp.2 = PHI(ivtmp.1, ivtmp.3)
s.2 = s.1 + MEM(ivtmp.2)
ivtmp.3 = ivtmp.2 + 4
j.1 = j.0 - 1
if (j.1 >= 0) goto L2
bb3
ivtmp.4 = ivtmp.3 // would be ivtmp.4 = ivtmp.1 + stride
i.1 = i.0 - 1
if (i.1 >= 0) goto L1

The improvement is that ivtmp.1 is not live across the inner loop.
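
In source terms the two shapes look roughly like the pair below.  This is a
hand-written sketch on a flat n*n array with invented function names, not
compiler output.

```c
#include <assert.h>

/* Current shape: the row start is derived from i at the top of every outer
   iteration, so an outer-loop value must survive the inner loop.  */
int sum_recompute (const int *a, int n)
{
  assert (n > 0);
  int s = 0;
  for (int i = 0; i < n; i++)
    {
      const int *p = a + (long) i * n;
      for (int j = 0; j < n; j++)
        s += *p++;
    }
  return s;
}

/* Proposed shape: the inner loop's final pointer value is reused as the
   start of the next row, so nothing extra is kept across the inner loop.  */
int sum_final_value (const int *a, int n)
{
  assert (n > 0);
  int s = 0;
  const int *p = a;
  for (int i = n - 1; i >= 0; i--)
    for (int j = n - 1; j >= 0; j--)
      s += *p++;
  return s;
}
```

Both functions visit the elements in the same order; only the bookkeeping in
the outer loop differs.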

The approach is to store final values of IVs in a hashtable, mapping SSA_NAME
of initial value in the preheader to aff_tree with final value, and then try to
replace increments of new IVs with uses of IVs from inner loops (currently I
just implemented a brute force loop over all IV uses to find a useful entry in
that hashtable).
Does this make sense and sound acceptable?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43174

[Bug tree-optimization/43236] -ftree-loop-distribution produces wrong code in reload1.c:delete_output_reload(), bootstrap fails

2010-03-03 Thread amonakov at gcc dot gnu dot org



--- Comment #6 from amonakov at gcc dot gnu dot org  2010-03-03 13:06 
---
Not a regression: the off-by-one error in the reverse iteration case has been
there since day one.
Patch:
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 13ac7ea..110abdc 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -285,6 +285,8 @@ generate_memset_zero (gimple stmt, tree op0, tree nb_iter,
   addr_base = fold_convert_loc (loc, sizetype, addr_base);
   addr_base = size_binop_loc (loc, MINUS_EXPR, addr_base,
  fold_convert_loc (loc, sizetype, nb_bytes));
+  addr_base = size_binop_loc (loc, PLUS_EXPR, addr_base,
+ TYPE_SIZE_UNIT (TREE_TYPE (op0)));
   addr_base = fold_build2_loc (loc, POINTER_PLUS_EXPR,
   TREE_TYPE (DR_BASE_ADDRESS (dr)),
   DR_BASE_ADDRESS (dr), addr_base);

This fixes the -O[123] miscompilations. -Os is slightly harder to fix, since we
use wrong number of iterations (cond bb is executed 11 times, latch bb with
assignment 10 times).
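
The address arithmetic behind the first fix can be sketched outside the
compiler as follows.  The function names are made up; the patch computes
base - nb_bytes + element size, which is written here as
base - (nb_bytes - element size) so that the intermediate pointer stays inside
the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lowest address written by a loop that stores n elements of elem_size
   bytes each, walking downward from first_store.  */
char *reverse_memset_base (char *first_store, size_t n, size_t elem_size)
{
  assert (n > 0 && elem_size > 0);
  size_t nb_bytes = n * elem_size;
  return first_store - (nb_bytes - elem_size);
}

/* Replace "for (i = hi; i > lo; ) a[--i] = 0;" by a single memset.  */
void zero_downward (int *a, size_t lo, size_t hi)
{
  assert (hi > lo);
  size_t n = hi - lo;
  char *base = reverse_memset_base ((char *) &a[hi - 1], n, sizeof a[0]);
  memset (base, 0, n * sizeof a[0]);
}
```

Without the element-size correction the memset would start one element too
low and clear a[lo - 1] instead of a[hi - 1].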


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43236

[Bug tree-optimization/43236] -ftree-loop-distribution produces wrong code in reload1.c:delete_output_reload(), bootstrap fails

2010-03-03 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2010-03-03 13:38 
---
(In reply to comment #6)
> This fixes the -O[123] miscompilations. -Os is slightly harder to fix, since we
> use wrong number of iterations (cond bb is executed 11 times, latch bb with
> assignment 10 times).

I don't see what is the proper fix for the -Os problem.  The loop structure is
as follows:
bb2
  i = 20
  goto bb4

bb3
  i--
  a[i] = 0
bb4
  if (i > 10) goto bb3
Thus, bb4 is header, bb3 is latch, number_of_exit_cond_executions() is 11,
just_once_each_iteration_p() is true for both bb3 and bb4 (?!)
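
The 11 versus 10 counts can be reproduced with a small model of this CFG.
This is an illustrative sketch only; the struct and function names are
invented.

```c
#include <assert.h>

struct exec_counts { int header; int latch; };

/* Mirror of the CFG above: bb4 is the header holding the exit test,
   bb3 is the latch holding the store.  */
struct exec_counts run_loop (int *a, int start, int stop)
{
  assert (stop >= 0 && start >= stop);
  struct exec_counts c = { 0, 0 };
  int i = start;                /* bb2 */
  for (;;)
    {
      c.header++;               /* bb4 */
      if (!(i > stop))
        break;
      c.latch++;                /* bb3 */
      i--;
      a[i] = 0;
    }
  return c;
}
```

Called with start 20 and stop 10, the header count is one higher than the
latch count, and only the latch count matches the number of stores.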


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43236

[Bug tree-optimization/43236] -ftree-loop-distribution produces wrong code in reload1.c:delete_output_reload(), bootstrap fails

2010-03-09 Thread amonakov at gcc dot gnu dot org



--- Comment #8 from amonakov at gcc dot gnu dot org  2010-03-09 16:55 
---
Given the fact that loop distribution only works for two-bb loops, I think the
fix is to simply take number of latch executions when the stmt is in the latch.

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -389,6 +391,8 @@ generate_builtin (struct loop *loop, bitmap partition, bool copy_p)
goto end;

  write = stmt;
+ if (bb == loop->latch)
+   nb_iter = number_of_latch_executions (loop);
}
}
 }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43236

[Bug tree-optimization/43236] -ftree-loop-distribution produces wrong code in reload1.c:delete_output_reload(), bootstrap fails

2010-03-10 Thread amonakov at gcc dot gnu dot org

--- Comment #9 from amonakov at gcc dot gnu dot org  2010-03-10 12:54 
---
Subject: Bug 43236

Author: amonakov
Date: Wed Mar 10 12:53:51 2010
New Revision: 157339

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157339
Log:
PR tree-optimization/43236
* tree-loop-distribution.c (generate_memset_zero): Fix off-by-one
error in calculation of base address in reverse iteration case.
(generate_builtin): Take number of latch executions if the statement
is in the latch.

* gcc.c-torture/execute/pr43236.c: New.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr43236.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-loop-distribution.c

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43236

[Bug tree-optimization/43236] -ftree-loop-distribution produces wrong code in reload1.c:delete_output_reload(), bootstrap fails

2010-03-10 Thread amonakov at gcc dot gnu dot org



--- Comment #10 from amonakov at gcc dot gnu dot org  2010-03-10 12:57 
---
Both issues are fixed with above commit.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43236

[Bug gcov-profile/43341] New: pragma pack changes padding in struct gcov_info on 64-bit archs

2010-03-12 Thread amonakov at gcc dot gnu dot org

cat <<EOF > t.c
int foo(){}
#pragma pack(1)
EOF

gcc -S -fprofile-generate t.c
grep '^\.LPBX' -A 2 t.s

.LPBX0:
.long   875574314
.quad   0

The padding ('.zero 4') between .long and .quad that correspond to the first
two fields of struct gcov_info (an int and a ptr) is gone.  This makes building
Firefox with profile feedback impossible on amd64.  At least gcc-4.[1345]
behave this way.
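
The layout change is easy to observe on ordinary structs.  In the sketch below
the two struct types are hypothetical stand-ins for the first two fields of
struct gcov_info (an int and a pointer), not the real definition.

```c
#include <assert.h>
#include <stddef.h>

/* Default packing: the pointer is aligned, so padding may follow the int.  */
struct info_default { unsigned int version; void *next; };

/* Same fields declared while #pragma pack(1) is in effect.  */
#pragma pack(push, 1)
struct info_packed { unsigned int version; void *next; };
#pragma pack(pop)

/* Under pack(1) the pointer directly follows the int, with no padding.  */
static_assert (offsetof (struct info_packed, next) == sizeof (unsigned int),
               "packed struct has no padding after the first field");

size_t default_next_offset (void)
{
  return offsetof (struct info_default, next);
}

size_t packed_next_offset (void)
{
  return offsetof (struct info_packed, next);
}
```

On a typical LP64 target the two offsets differ (8 versus 4), which is the
mismatch between the emitted data and what the gcov runtime expects.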


-- 
   Summary: pragma pack changes padding in struct gcov_info on 64-
bit archs
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: amonakov at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43341

[Bug lto/43355] New: Undefined references with -flto -fuse-linker-plugin, dependent on object file ordering

2010-03-13 Thread amonakov at gcc dot gnu dot org

cat <<EOF >a.c
extern int puts(const char*);
char *program;
void fail() {puts(program);}
EOF
cat <<EOF >b.c
extern int puts(const char*);
extern char *program;
extern void fail();
void usage() {puts(program);}
int main(int argc, char *argv[])
{
  program = argv[0];
  if (argc)
    usage();
  else
    fail();
  return 0;
}
EOF

gcc -flto -fuse-linker-plugin -O2 b.c a.c 
/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.0-pre/../../../../x86_64-pc-linux-gnu/bin/ld:
/tmp/ccvkxseZ.lto.o: in function usage:ccHTGBCK.o(.text+0x3): error: undefined
reference to 'program'
/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.0-pre/../../../../x86_64-pc-linux-gnu/bin/ld:
/tmp/ccvkxseZ.lto.o: in function fail:ccHTGBCK.o(.text+0x13): error: undefined
reference to 'program'
/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.0-pre/../../../../x86_64-pc-linux-gnu/bin/ld:
/tmp/ccvkxseZ.lto.o: in function main:ccHTGBCK.o(.text+0x2c): error: undefined
reference to 'program'
collect2: ld returned 1 exit status

Works with reversed ordering or without -fuse-linker-plugin


-- 
   Summary: Undefined references with -flto -fuse-linker-plugin,
dependent on object file ordering
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: amonakov at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43355

[Bug rtl-optimization/32283] [4.3/4.4/4.5 regression] Missed induction variable optimization

2010-03-16 Thread amonakov at gcc dot gnu dot org



--- Comment #28 from amonakov at gcc dot gnu dot org  2010-03-16 15:26 
---
To provide an update of the situation on 4.5 trunk:
AFAIK the situation has been generally improved with Zdenek's second commit (in
comment #23) and auto-inc-dec improvements in 4.5.  However, on the particular
testcase discussed here (comment #20), we still don't DTRT on ia64 (powerpc is
OK, don't know about arm).  There is unfortunate interplay between IVOPTS, RTL
PRE, RTL loop analysis and the auto-inc-dec pass.

First, ivopts produces a harder-to-grok sequence in the loop preheader, transforming
<bb 3>:
  pretmp.2_8 = (short int) v_4(D);

<bb 4>:
  # i_13 = PHI <i_6(5), 0(3)>
  a[i_13] = pretmp.2_8;
  i_6 = i_13 + 1;
  if (len_3(D) > i_6)
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 5>:
  goto <bb 4>;

to

<bb 3>:
  pretmp.2_8 = (short int) v_4(D);
  ivtmp.7_11 = (long unsigned int) &a[0];
  D.2006_17 = (unsigned int) len_3(D);
  D.2007_18 = D.2006_17 + 4294967295;
  D.2008_19 = (long unsigned int) D.2007_18;
  D.2009_20 = D.2008_19 * 2;
  a.13_21 = (long unsigned int) &a;
  D.2011_22 = a.13_21 + 2;
  D.2012_23 = D.2009_20 + D.2011_22;

<bb 4>:
  # ivtmp.7_1 = PHI <ivtmp.7_12(5), ivtmp.7_11(3)>
  D.2005_16 = (void *) ivtmp.7_1;
  MEM[base: D.2005_16]{a[i]} = pretmp.2_8;
  ivtmp.7_12 = ivtmp.7_1 + 2;
  if (ivtmp.7_12 != D.2012_23)
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 5>:
  goto <bb 4>;

The preheader is not cleaned up until RTL PRE.  Then, PRE transforms
L54:
   51 NOTE_INSN_BASIC_BLOCK
   52 [r371:DI]=r373:SI#0
   53 r371:DI=r371:DI+0x2
   55 r392:BI=r371:DI!=r381:DI
   56 pc={(r392:BI!=0x0)?L54:pc}

to

L83:
   82 NOTE_INSN_BASIC_BLOCK
   77 r397:DI=r371:DI+0x2
L54:
   51 NOTE_INSN_BASIC_BLOCK
   52 [r371:DI]=r373:SI#0
   75 r371:DI=r397:DI
  REG_EQUAL: r371:DI+0x2
   55 r392:BI=r371:DI!=r381:DI
   56 pc={(r392:BI!=0x0)?L83:pc}
  REG_DEAD: r392:BI
  REG_BR_PROB: 0x26ac

... which is something auto-inc-dec pass is not able to handle.  If I disable
rtl pre with -fdbg-cnt=pre:0, auto-inc is generated, but doloop pass is
confused instead:
Loop 1 is simple:
  simple exit 4 -> 5
  infinite if: (expr_list:REG_DEP_TRUE (subreg:SI (and:DI (plus:DI (minus:DI
(ashift:DI (reg:DI 390)
(const_int 1 [0x1]))
(reg:DI 371 [ ivtmp.7 ]))
(const:DI (plus:DI (symbol_ref:DI (a) [flags 0x2]  var_decl
0x77968000 a)
(const_int 2 [0x2]
(const_int 1 [0x1])) 0)
(nil))
  number of iterations: (lshiftrt:DI (plus:DI (minus:DI (reg:DI 381 [ D.2012 ])
(reg:DI 371 [ ivtmp.7 ]))
(const_int -2 [0xfffe]))
(const_int 1 [0x1]))
  upper bound: -2
Doloop: Possible infinite iteration case.
Doloop: The loop is not suitable.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32283

[Bug tree-optimization/43415] [4.4/4.5 Regression] Consumes large amounts of memory and time in PRE at -O3

2010-03-18 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-03-18 11:30 
---
Confirming.  4.5 trunk needs lots of memory in PRE.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu dot
   ||org, amonakov at gcc dot gnu
   ||dot org
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Keywords||compile-time-hog, memory-hog
  Known to fail||4.4.3 4.5.0
  Known to work||4.4.2
   Last reconfirmed|-00-00 00:00:00 |2010-03-18 11:30:27
   date||
Summary|[4.4 regression] gcc takes  |[4.4/4.5 Regression]
   |unusually large amounts of  |Consumes large amounts of
   |memory and time to compile  |memory and time in PRE at -
   |nested for loop at -O3  |O3
   Target Milestone|--- |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43415

[Bug c/43423] gcc should vectorize this loop through iteration range splitting

2010-03-18 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-03-18 18:13 
---
Graphite is able to split the loop, but then the vectorizer punts anyway:

gcc -O3 -ftree-vectorizer-verbose=7 -fgraphite-identity -S t.c

t.c:11: note: not vectorized: number of iterations cannot be computed.
t.c:9: note: not vectorized: number of iterations cannot be computed.
t.c:3: note: vectorized 0 loops in function.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423

[Bug target/43603] gcc-4.4.3 ICE on ia64 with -O3

2010-04-06 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2010-04-06 17:10 
---
Thanks for the analysis.
This is reproducible on trunk with -O2 -fsel-sched-pipelining
-fselective-scheduling2 (with -O3, pressure-aware loop invariant motion
slightly changes the code, and it's not possible to disable it (not even with
-fno-ira-loop-pressure, because it's enabled unconditionally in
ia64_override_options)).
The real problem is that we are attempting to clone an instruction with asm
operands as a bookkeeping copy -- that should never happen.  Thus, I think
copy_rtx calls should stay.  We'll have to fix the scheduler to never attempt
such code motion.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||abel at gcc dot gnu dot org,
   ||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43603

[Bug rtl-optimization/45472] [4.5/4.6 Regression] ICE: in move_op_ascend, at sel-sched.c:6124 with -fselective-scheduling2

2010-09-20 Thread amonakov at gcc dot gnu dot org



--- Comment #4 from amonakov at gcc dot gnu dot org  2010-09-20 14:49 
---
A small testcase to illustrate the problem with volatile fields.

//--->8---
struct vv {volatile long a, b;} vv1, vv2;

int foo()
{
  vv1 = vv2;
}
//--->8---

gcc/cc1 -O2 -frename-registers -fschedule-insns2 vol.c

movq    vv2+8(%rip), %rax
movq    vv2(%rip), %rdx
movq    %rax, vv1+8(%rip)
movq    %rdx, vv1(%rip)

The compiler reorders accesses to volatile fields.  As Andrey said, /v bits are
missing on MEMs even in the .expand dump.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45472

[Bug gcov-profile/43341] pragma pack changes padding in struct gcov_info on 64-bit archs

2010-04-21 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-04-21 16:43 
---
Created an attachment (id=20455)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20455&action=view)
proposed patch


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43341

[Bug gcov-profile/43825] gcov is initialized wrong on x86_64

2010-04-21 Thread amonakov at gcc dot gnu dot org



--- Comment #7 from amonakov at gcc dot gnu dot org  2010-04-21 16:45 
---


*** This bug has been marked as a duplicate of 43341 ***


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org
 Status|WAITING |RESOLVED
 Resolution||DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43825

[Bug gcov-profile/43341] pragma pack changes padding in struct gcov_info on 64-bit archs

2010-04-21 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2010-04-21 16:45 
---
*** Bug 43825 has been marked as a duplicate of this bug. ***


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||tglek at mozilla dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43341

[Bug gcov-profile/43825] gcov is initialized wrong on x86_64

2010-04-21 Thread amonakov at gcc dot gnu dot org



--- Comment #8 from amonakov at gcc dot gnu dot org  2010-04-21 16:48 
---
Taras, to avoid triggering the problem from firefox you can search for the file
(as I remember there is only one in xulrunner) that uses #pragma pack(1) and
does not reset it, and add #pragma pack() at the end of that file.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43825

[Bug gcov-profile/43341] pragma pack changes padding in struct gcov_info on 64-bit archs

2010-04-21 Thread amonakov at gcc dot gnu dot org



--- Comment #3 from amonakov at gcc dot gnu dot org  2010-04-21 16:54 
---
(In reply to comment #1)
> Created an attachment (id=20455)
> -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20455&action=view) [edit]
> proposed patch
>

GCC generates gcov structures at runtime, and #pragma pack(1) in the source
file affects their layout.  We probably can reset the alignment in
create_coverage to avoid that.  The above patch implements a different approach
-- it rearranges structure fields and manually sets alignment so that layout
does not depend on current structure packing.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43341

[Bug fortran/42958] Weird temporary array allocation

2010-04-28 Thread amonakov at gcc dot gnu dot org



--- Comment #19 from amonakov at gcc dot gnu dot org  2010-04-28 15:15 
---
(In reply to comment #18)
>  3) for the same reason you can also drop the + 1 in computing the allocation
> size which is currently (ubound - lbound + 1) * 4

Sorry, but isn't +1 needed because bounds are inclusive?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42958

[Bug c/44116] 64bit inodes for source code causes Value too large for defined data type (XFS,inode64)

2010-05-13 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-05-13 14:46 
---
> r...@matylda1: /mnt/data/kasparek# LC_ALL=C gcc -o test.o test-10356.c
> cc1: error: test-10356.c: Value too large for defined data type

> The first this I need to help with is how to
> check if the code that causes this (expect somewhere is used 32bit variable to
> store the inode) is from gcc itself or it is some third-party library. I

Execute
gcc -o test.o test-10356.c -### 2>&1 | grep cc1
to get the cc1 command line.  Then use gdb --args cc1 cmdline to debug the
compiler.  Getting a backtrace before the abort would be nice.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44116

[Bug testsuite/44343] malloc check in libstdc++-v3/testsuite/22_locale/codecvt/unshift/char/1.cc

2010-05-31 Thread amonakov at gcc dot gnu dot org



--- Comment #1 from amonakov at gcc dot gnu dot org  2010-05-31 14:23 
---
Yes, it's a bug in 1.cc that was fixed for 4.6.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44343

[Bug rtl-optimization/44838] [4.6 regression] RTL loop unrolling causes FAIL: gcc.dg/pr39794.c

2010-07-07 Thread amonakov at gcc dot gnu dot org



--- Comment #16 from amonakov at gcc dot gnu dot org  2010-07-07 09:54 
---
(In reply to comment #15)
> Subject: Re:  [4.6 regression] RTL loop
> unrolling causes FAIL: gcc.dg/pr39794.c
>
> I am not sure what you mean -- I may be misunderstanding how rtl alias analysis
> works, but as far as I can tell, what unroller does (just preserving the
> MEM_ATTRs) is conservatively correct (so, potentially it may make us believe
> that there are dependences that are not really present, but it should not cause
> a wrong-code bug).

Consider this simplified example:

for (i ...)
  {
/*A*/  t = a[i];
/*B*/  a[i+1] = t;
  }
MEM_ATTRS would indicate that memory references in A and B do not alias.

Unrolling by 2 produces:
for (i ...)
  {
/*A */ t = a[i];
/*B */ a[i+1] = t;
/*A'*/ t = a[i+1];
/*B'*/ a[i+2] = t;
  }
Preserving MEM_ATTRS wrongly indicates that memory references in B and A' do
not alias, and the scheduler then may happen to lift A' above B.
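
The effect of that reordering can be shown with plain C.  In this sketch the
function names are invented, the array is assumed to hold at least n + 1
elements, and the second function hand-applies the invalid schedule.

```c
#include <assert.h>

/* Unrolled by 2, statements in program order: A, B, A', B'.  */
void propagate_in_order (int *a, int n)
{
  assert (n >= 0 && n % 2 == 0);
  for (int i = 0; i < n; i += 2)
    {
      int t = a[i];          /* A  */
      a[i + 1] = t;          /* B  */
      t = a[i + 1];          /* A' */
      a[i + 2] = t;          /* B' */
    }
}

/* Same body after a schedule that lifts A' above B.  */
void propagate_hoisted (int *a, int n)
{
  assert (n >= 0 && n % 2 == 0);
  for (int i = 0; i < n; i += 2)
    {
      int t = a[i];          /* A  */
      int t2 = a[i + 1];     /* A' now reads the value B has not yet stored */
      a[i + 1] = t;          /* B  */
      a[i + 2] = t2;         /* B' */
    }
}
```

On the input {1, 2, 3, 4, 5} with n = 4, the in-order version yields
{1, 1, 1, 1, 1} while the hoisted version yields {1, 1, 2, 2, 4}.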


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838

[Bug rtl-optimization/43494] gcc.c-torture/execute/vector-2.c fails with -fpic/-fPIC

2010-07-20 Thread amonakov at gcc dot gnu dot org



--- Comment #9 from amonakov at gcc dot gnu dot org  2010-07-20 08:25 
---
It's probably worth noting that the disambiguated MEMs are of different widths:

 load from (mem/c/i:DI (reg/f:DI 14 r14 [351]) [2 t+8 S8 A64])
 store to  (mem/s/j/c:SI (reg/f:DI 15 r15 [343]) [2 t+4 S4 A32])

(btw, IIUC the above mems do not alias indeed, and the problem is in
disambiguating the above store and a load from
(mem/c/i:DI (post_inc:DI (reg/f:DI 14 r14 [351])) [2 t+0 S8 A128]) )

Also, with -fno-strict-aliasing the failure vanishes.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43494

[Bug rtl-optimization/43494] gcc.c-torture/execute/vector-2.c fails with -fpic/-fPIC

2010-07-20 Thread amonakov at gcc dot gnu dot org



--- Comment #13 from amonakov at gcc dot gnu dot org  2010-07-20 14:13 
---
(In reply to comment #10)
> Re. comment 9: Well, the order of *this* store and *this* load is the
> difference between the test case failing or passing. So I do not think the
> problem is between this load and another store.

To clarify, I'm saying that the problem is in moving insn 9 (the store) past
insn 21, not past insn 28.  Your simplified dumps in comment #10 support that
(r15 is incremented and decremented, r14 is post-modified; thus, both insn 21
and insn 9 touch memory[r12]).  Insn 28 is not relevant to the miscompilation.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43494

[Bug rtl-optimization/43494] [4.4/4.5/4.6 Regression] Overlooked dependency causes wrong scheduling, wrong code

2010-07-21 Thread amonakov at gcc dot gnu dot org



--- Comment #17 from amonakov at gcc dot gnu dot org  2010-07-21 08:32 
---
(In reply to comment #16)
> OK, I think I finally understand what Alexander tried to explain, and I've
> annotated the code. Alexander, does this look right to you?

Yes, thanks.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43494

[Bug rtl-optimization/43494] [4.4/4.5/4.6 Regression] Overlooked dependency causes wrong scheduling, wrong code

2010-07-21 Thread amonakov at gcc dot gnu dot org



--- Comment #21 from amonakov at gcc dot gnu dot org  2010-07-21 10:07 
---
(In reply to comment #20)
> (Even sel-sched apparently does not use cselib, that's surprising!)

Offtopic: yes, using cselib in sel-sched is not quite straightforward, since we
need it to work on arbitrary regions (as I understand it, cselib is designed to
work on EBBs).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43494

[Bug rtl-optimization/38644] [4.3/4.4/4.5/4.6 Regression] Optimization flag -O1 -fschedule-insns2 causes wrong code

2010-08-12 Thread amonakov at gcc dot gnu dot org



--- Comment #22 from amonakov at gcc dot gnu dot org  2010-08-12 10:12 
---
It looks like the patch from comment #16 should fix the problem, but it was not
reviewed and/or applied.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644

[Bug rtl-optimization/38644] [4.3/4.4/4.5/4.6 Regression] Optimization flag -O1 -fschedule-insns2 causes wrong code

2010-08-12 Thread amonakov at gcc dot gnu dot org



--- Comment #25 from amonakov at gcc dot gnu dot org  2010-08-12 12:00 
---
(In reply to comment #23)
> The patch from comment #16 only fixes the symptom, and only on ARM. It is not a
> proper fix for the generic problem that is apparently also visible on POWER.

The PR30282 audit trail contains more discussion of this problem.  Jim Wilson
argues that it should be addressed by emitting stack ties in epilogues for
targets that suffer from this problem (other targets apparently do not suffer
from it, thanks to the red zone).  POWER was fixed that way (PR44199).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644

[Bug rtl-optimization/44919] ICE on ia64 with -O3 at sel-sched.c:4672

2010-09-06 Thread amonakov at gcc dot gnu dot org

--- Comment #8 from amonakov at gcc dot gnu dot org  2010-09-06 08:57 
---
Subject: Bug 44919

Author: amonakov
Date: Mon Sep  6 08:56:43 2010
New Revision: 163904

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=163904
Log:
PR rtl-optimization/44919
* sel-sched.c (move_cond_jump): Remove assert, check that
the several blocks case can only happen with mutually exclusive
insns instead.  Rewrite the movement code to support moving through
several basic blocks. 

* g++.dg/opt/pr44919.C: New.

Added:
trunk/gcc/testsuite/g++.dg/opt/pr44919.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/sel-sched.c
trunk/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44919

[Bug rtl-optimization/44919] ICE on ia64 with -O3 at sel-sched.c:4672

2010-09-06 Thread amonakov at gcc dot gnu dot org



--- Comment #9 from amonakov at gcc dot gnu dot org  2010-09-06 09:00 
---
(In reply to comment #7)
> Any progress with the copyright assignment?

The copyright assignment is renewed, and I have committed the patch to the
current development branch on Andrey's behalf.  It will be committed to release
branches in a few days.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44919

[Bug rtl-optimization/44919] ICE on ia64 with -O3 at sel-sched.c:4672

2010-09-12 Thread amonakov at gcc dot gnu dot org



--- Comment #11 from amonakov at gcc dot gnu dot org  2010-09-12 20:34 
---
Subject: Bug 44919

Author: amonakov
Date: Sun Sep 12 20:34:26 2010
New Revision: 164234

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164234
Log:
Backport from mainline
2010-09-06  Andrey Belevantsev  a...@ispras.ru

PR rtl-optimization/44919
* sel-sched.c (move_cond_jump): Remove assert, check that
the several blocks case can only happen with mutually exclusive
insns instead.  Rewrite the movement code to support moving through
several basic blocks.

* g++.dg/opt/pr44919.C: New.


Added:
branches/gcc-4_5-branch/gcc/testsuite/g++.dg/opt/pr44919.C
Modified:
branches/gcc-4_5-branch/gcc/ChangeLog
branches/gcc-4_5-branch/gcc/sel-sched.c
branches/gcc-4_5-branch/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44919

[Bug rtl-optimization/44919] ICE on ia64 with -O3 at sel-sched.c:4672

2010-09-12 Thread amonakov at gcc dot gnu dot org



--- Comment #12 from amonakov at gcc dot gnu dot org  2010-09-12 20:36 
---
Subject: Bug 44919

Author: amonakov
Date: Sun Sep 12 20:35:53 2010
New Revision: 164235

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164235
Log:
Backport from mainline
2010-09-06  Andrey Belevantsev  a...@ispras.ru

PR rtl-optimization/44919
* sel-sched.c (move_cond_jump): Remove assert, check that
the several blocks case can only happen with mutually exclusive
insns instead.  Rewrite the movement code to support moving through
several basic blocks.

* g++.dg/opt/pr44919.C: New.


Added:
branches/gcc-4_4-branch/gcc/testsuite/g++.dg/opt/pr44919.C
Modified:
branches/gcc-4_4-branch/gcc/ChangeLog
branches/gcc-4_4-branch/gcc/sel-sched.c
branches/gcc-4_4-branch/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44919

[Bug rtl-optimization/44919] ICE on ia64 with -O3 at sel-sched.c:4672

2010-09-12 Thread amonakov at gcc dot gnu dot org



--- Comment #13 from amonakov at gcc dot gnu dot org  2010-09-12 20:38 
---
Fixed on release branches with above commits.


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44919

[Bug rtl-optimization/45652] [4.6 Regression] gcc.dg/compat/scalar-by-value-3 FAILs with -O2 -fselective-scheduling2

2010-09-13 Thread amonakov at gcc dot gnu dot org



--- Comment #2 from amonakov at gcc dot gnu dot org  2010-09-13 16:53 
---
Confirmed.  Not related to PR43949 since selective scheduling does not use
cselib.  The miscompilation seems to come from RTL aliasing: sel-sched lifts a
load that references stack via a general-purpose register above a store via
%rsp.

bad  cmdline: cc1 -O2 -fselective-scheduling2 -fdbg-cnt=sel_sched_insn_cnt:31
good cmdline: cc1 -O2 -fselective-scheduling2 -fdbg-cnt=sel_sched_insn_cnt:30

The no-aliasing decision comes from (base_alias_check):
1742  /* If one address is a stack reference there can be no alias:
1743     stack references using different base registers do not alias,
1744     a stack reference can not alias a parameter, and a stack reference
1745     can not alias a global.  */
1746  if ((GET_CODE (x_base) == ADDRESS && GET_MODE (x_base) == Pmode)
1747      || (GET_CODE (y_base) == ADDRESS && GET_MODE (y_base) == Pmode))
1748    return 0;

Related GDB session:
Breakpoint 4, base_alias_check (x=0x76f20920, y=0x76f2d018,
x_mode=DImode, y_mode=SImode) at
/home/monoid/checkout/git/gcc-selfixes/gcc/alias.c:1687
1687  rtx x_base = find_base_term (x);
(gdb) up
#1  0x0076da1d in true_dependence_1 (mem=0x76f2d030,
mem_mode=SImode, mem_addr=0x76f2d018, x=0x76f30870,
x_addr=0x76f20920, 
varies=0x14041f2 <rtx_varies_p>, mem_canonicalized=0 '\000') at
/home/monoid/checkout/git/gcc-selfixes/gcc/alias.c:2440
2440  if (! base_alias_check (x_addr, mem_addr, GET_MODE (x), mem_mode))
(gdb) call debug_rtx(mem)
(mem/s/c:SI (plus:DI (reg/f:DI 7 sp)
(const_int 12 [0xc])) [5 ap.fp_offset+0 S4 A32])
(gdb) call debug_rtx(x)
(mem/s:DI (reg:DI 4 si) [0 MEM[(struct S * {ref-all})addr.0_2]+0 S8 A64])
(gdb) down
#0  base_alias_check (x=0x76f20920, y=0x76f2d018, x_mode=DImode,
y_mode=SImode) at /home/monoid/checkout/git/gcc-selfixes/gcc/alias.c:1687
1687  rtx x_base = find_base_term (x);
(gdb) n
...
(gdb) list
1741
1742  /* If one address is a stack reference there can be no alias:
1743     stack references using different base registers do not alias,
1744     a stack reference can not alias a parameter, and a stack reference
1745     can not alias a global.  */
1746  if ((GET_CODE (x_base) == ADDRESS && GET_MODE (x_base) == Pmode)
1747      || (GET_CODE (y_base) == ADDRESS && GET_MODE (y_base) == Pmode))
1748    return 0;
1749
1750  return 1;
(gdb) call debug_rtx(x_base)
(address (reg:DI 4 si))
(gdb) call debug_rtx(y_base)
(address:DI (reg/f:DI 7 sp))
(gdb) fin
Run till exit from #0  base_alias_check (x=0x76f20920, y=0x76f2d018,
x_mode=DImode, y_mode=SImode)
at /home/monoid/checkout/git/gcc-selfixes/gcc/alias.c:1746
0x0076da1d in true_dependence_1 (mem=0x76f2d030, mem_mode=SImode,
mem_addr=0x76f2d018, x=0x76f30870, x_addr=0x76f20920, 
varies=0x14041f2 <rtx_varies_p>, mem_canonicalized=0 '\000') at
/home/monoid/checkout/git/gcc-selfixes/gcc/alias.c:2440
2440  if (! base_alias_check (x_addr, mem_addr, GET_MODE (x), mem_mode))
Value returned is $58 = 0
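
The decision traced above can be replayed with a toy model of the final test in base_alias_check. This is only a sketch: the toy_* types and the function are hypothetical stand-ins rather than GCC's rtx API, and it assumes that Pmode is DImode on this target and that the unsuffixed (address ...) printed for x_base means VOIDmode:

```c
#include <assert.h>

/* Hypothetical stand-ins for the rtx code and machine mode of a base term. */
enum toy_code { TOY_ADDRESS, TOY_OTHER };
enum toy_mode { TOY_VOIDmode, TOY_DImode };

/* Assumption: pointer mode is DImode, as on a 64-bit x86 target. */
#define TOY_Pmode TOY_DImode

struct toy_base { enum toy_code code; enum toy_mode mode; };

/* Mirrors only the quoted lines 1746-1750: 0 means "cannot alias",
   1 means "may alias". */
int toy_base_alias_tail(struct toy_base x_base, struct toy_base y_base)
{
  if ((x_base.code == TOY_ADDRESS && x_base.mode == TOY_Pmode)
      || (y_base.code == TOY_ADDRESS && y_base.mode == TOY_Pmode))
    return 0;
  return 1;
}
```

With x_base modelled as { TOY_ADDRESS, TOY_VOIDmode } for si and y_base as { TOY_ADDRESS, TOY_DImode } for sp, the second clause fires and the result is 0, matching the value returned in the session: the sp-based reference alone is enough to declare no alias, whatever si points to.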


-- 

amonakov at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu dot
   ||org
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|0000-00-00 00:00:00 |2010-09-13 16:53:59
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45652
