[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64

2007-05-10 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #10 from sebastian dot pop at cri dot ensmp dot fr  2007-05-10 
09:32 ---
Subject: Re:  -ftree-vectorize results in internal compiler error on AMD64

Zdenek's patch for cleaning the dataref analysis is also fixing this bug.
http://gcc.gnu.org/ml/gcc-patches/2007-05/msg00634.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371



[Bug tree-optimization/29738] Missed constant propagation into loops

2006-11-06 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #1 from sebastian dot pop at cri dot ensmp dot fr  2006-11-06 
11:48 ---
Subject: Re:  Missed constant propagation into loops

I think the problem is that i is a global variable and thus foo is
potentially
considered as modifying i.  Have you tried

void foo (void);
void bar (void)
{
  int i, j;
  i = 0;
  for (j = 0; j  1; j++)
if (i)
  foo ();
}

This should be a simpler case that either is correctly simplified or I know
what to do for making it work.

For the original problem, why don't we propagate constants in

  #   i_3 = V_MUST_DEF i_2;
  i = 0;

  # NONLOCAL.6_19 = PHI NONLOCAL.6_9(5), NONLOCAL.6_11(2);
  # i_18 = PHI i_7(5), i_3(2);

i.e. replacing the use of i_3 with just the constant 0?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29738



[Bug middle-end/26900] Number of iterations not know for simple loop

2006-06-19 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #11 from sebastian dot pop at cri dot ensmp dot fr  2006-06-19 
07:50 ---
Subject: Re:  Number of iterations not know for simple loop

I thought that this bug should have been fixed by now:
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01749.html
what is the status of that patch?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26900



[Bug tree-optimization/27331] [4.2 Regression] segfault in fold_convert with -ftree-vectorize

2006-06-16 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #12 from sebastian dot pop at cri dot ensmp dot fr  2006-06-16 
08:55 ---
Subject: Re:  [4.2 Regression] segfault in fold_convert with -ftree-vectorize

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 The patch is in predcom branch.  

I should have missed this patch on gcc-patches, sorry.

 The relevant part (fixing also another
 similar bug) is below.
 

I think that this is better than what I have proposed, as there is no
reason to keep the dr if we know that its access function is not known.
So your patch is okay for trunk if it bootstraps, and pass regtest.

Thanks.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27331



[Bug tree-optimization/27331] [4.2 Regression] segfault in fold_convert with -ftree-vectorize

2006-06-15 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #9 from sebastian dot pop at cri dot ensmp dot fr  2006-06-15 
13:19 ---
Subject: Re:  [4.2 Regression] segfault in fold_convert with -ftree-vectorize

rakdver at gcc dot gnu dot org wrote:
 Ccing people responsible for data dependence analysis.  

Thanks, for the ping, this bug was not on my todo list.  I'll take a look.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27331



[Bug tree-optimization/27331] [4.2 Regression] segfault in fold_convert with -ftree-vectorize

2006-06-15 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #10 from sebastian dot pop at cri dot ensmp dot fr  2006-06-15 
14:52 ---
Subject: Re:  [4.2 Regression] segfault in fold_convert with -ftree-vectorize

You said that you had a fix in predcom, is that fix in your local
tree, or have you sent a patch to gcc-patches?

Here is a fix for this PR.  I don't like the data-ref code that deals
with component_refs, as it seems very fragile.

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 114678)
+++ tree-data-ref.c (working copy)
@@ -1947,7 +1947,8 @@ create_data_ref (tree memref, tree stmt,
  object_analysis divided by the size of data type.
   */
   if (!DR_BASE_OBJECT (dr)
-  || (TREE_CODE (memref) == COMPONENT_REF  DR_NUM_DIMENSIONS (dr) == 1))
+  || (TREE_CODE (memref) == COMPONENT_REF  DR_NUM_DIMENSIONS (dr) == 1
+  !automatically_generated_chrec_p (DR_ACCESS_FN (dr, 0
 {
   tree access_fn;
   tree new_step;


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27331



[Bug tree-optimization/27332] [4.2 Regression] ICE in try_interchange_loops with -ftree-loop-linear

2006-05-10 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #3 from sebastian dot pop at cri dot ensmp dot fr  2006-05-10 
12:26 ---
Created an attachment (id=11429)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11429action=view)
fix

proposed fix.  I didn't tested it other than making sure it fixed the bug.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27332



[Bug tree-optimization/26435] [4.1/4.2 regression] ICE with -O1 -ftree-loop-linear and higher optimization

2006-05-10 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #5 from sebastian dot pop at cri dot ensmp dot fr  2006-05-10 
12:32 ---
Created an attachment (id=11430)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11430action=view)
first shot at fixing this

didn't tested the patch, other than with RUNTESTFLAGS=tree-ssa.exp
and the current pr report.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26435



[Bug tree-optimization/19590] IVs with the same evolution not eliminated

2006-04-10 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #12 from sebastian dot pop at cri dot ensmp dot fr  2006-04-10 
09:14 ---
Created an attachment (id=11235)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11235action=view)
proposed fix

This patch fixes the problem, but probably it is a more general optimization
fix than the one described in the PR.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19590



[Bug bootstrap/27063] New: Fail to build gcc-core-4.2 snapshots

2006-04-06 Thread sebastian dot pop at cri dot ensmp dot fr
If you download only the gcc-core-4.2 snapshot, for example
ftp://ftp.uvsq.fr/pub/gcc/snapshots/4.2-20060401/gcc-core-4.2-20060401.tar.bz2
and do a ../configure --enable-languages=c  make
make will stop complaining about some objC file that is not in the package:

make[2]: *** No rule to make target `../../gcc/objc/objc-act.c', needed by
`s-gtype'.  Stop.


-- 
   Summary: Fail to build gcc-core-4.2 snapshots
   Product: gcc
   Version: 4.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27063



[Bug tree-optimization/26939] loop number of iterations analysis confused by what PRE did (PRE is doing its job)

2006-04-02 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #12 from sebastian dot pop at cri dot ensmp dot fr  2006-04-02 
08:12 ---
Created an attachment (id=11184)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11184action=view)
fix

This patch fixes this problem.  I'll bootntestncommit.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26939



[Bug middle-end/23411] [data deps] Distance on outer loops for self output deps

2006-03-29 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #3 from sebastian dot pop at cri dot ensmp dot fr  2006-03-29 
20:39 ---
Fixed by the recent changes.


-- 

sebastian dot pop at cri dot ensmp dot fr changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23411



[Bug middle-end/23412] [data deps] Overflow problem in Omega

2006-03-29 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #3 from sebastian dot pop at cri dot ensmp dot fr  2006-03-29 
20:40 ---
Fixed on autovect.


-- 

sebastian dot pop at cri dot ensmp dot fr changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23412



[Bug middle-end/23413] [data deps] Overflow problem in Omega

2006-03-29 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #2 from sebastian dot pop at cri dot ensmp dot fr  2006-03-29 
20:42 ---
Fixed on autovect.


-- 

sebastian dot pop at cri dot ensmp dot fr changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23413



[Bug tree-optimization/26719] [4.1/4.2 Regression] Computed (integer) table changes with -O

2006-03-28 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #4 from sebastian dot pop at cri dot ensmp dot fr  2006-03-28 
22:44 ---
Created an attachment (id=11146)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11146action=view)
first step

with this patch scev returns (int) (char) {0,+,1} but then 
chrec_convert_aggressive is removing the casts and the result
is just {0,+,1}.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26719



[Bug tree-optimization/26859] [4.2 Regression] ICE Segmentation Fault

2006-03-28 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #9 from sebastian dot pop at cri dot ensmp dot fr  2006-03-29 
00:27 ---
Created an attachment (id=11147)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11147action=view)
proposed fix

this patch (not tested yet) fixes the problem: it avoids a division by zero.
Part of the problem is that convert returns a step for which it sets the
overflow flag when converting unsigned ffe to int -2.  The patch resets
the overflow flag that has no sense for the step of a sequence.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26859



[Bug tree-optimization/23821] [4.0/4.1/4.2 Regression] DOM and VRP creating harder to optimize code

2006-02-13 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #8 from sebastian dot pop at cri dot ensmp dot fr  2006-02-13 
08:45 ---
Subject: Re:  [4.0/4.1/4.2 Regression] DOM and VRP creating harder to optimize
code

This case reminds me the peeled chrec unification that I had to
disable on autovect branch (I probably have to run the transformation
as a stand alone pass outside the analyzer for not disturbing the user
passes).  In that case we're looking at a code like

loop
  x = phi (0, a)
  a = phi (1, a + 1)
endloop

such that a simple transformation can make x a simple iv.  This case
is also quite important, as it occurs about 300 times during a bootstrap.

Now for the current problem, we could run a pass just after loop_init
for cleaning all these constructs.  As suggested, we would have to
build an equivalence relation either like VRP, or on demand.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23821



[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64

2006-01-05 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #4 from sebastian dot pop at cri dot ensmp dot fr  2006-01-05 
16:32 ---
(In reply to comment #3)
 Vectorizer's dump file can help.
 

vect dump ends with:

/home/seb/ex/pr25371.c:26: note: Access function of PHI: {0, +,
1}_4(get_loop_exit_condition 
  if (i_99  D.1700_228) goto L54; else goto L55;)

/home/seb/ex/pr25371.c:26: note: === vec_transform_loop ===
/home/seb/ex/pr25371.c:26: note: === vect_do_peeling_for_alignment ===

and the problem seems to come from the fact that in
vect_create_addr_base_for_vector_ref, we're calling 
base_offset = force_gimple_operand (base_offset, new_stmt, false, dest);  
with a base_offset that contains a chrec:
((num_tD.1607 *) (long unsigned intD.4) {0, +, nD.1614_23}_4 * 16B);


-- 

sebastian dot pop at cri dot ensmp dot fr changed:

   What|Removed |Added

 CC||sebastian dot pop at cri dot
   ||ensmp dot fr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371




[Bug tree-optimization/24670] New: VRP ICE in compare_name_with_value

2005-11-04 Thread sebastian dot pop at cri dot ensmp dot fr
This ICE occured during a compile of libfloat with mainline.
Attached is a delta reduced test that produces the ICE when compiled with -O2.

softfloat/bits64/softfloat.c: In function ‘float128_rem’:
softfloat/bits64/softfloat.c:4483: internal compiler error: in
compare_name_with_value, at tree-vrp.c:3064


__inline__ void
shift128Right (int count, long long int *z1Ptr)
{
  long long int z1;
  if (count == 0);
  else if (count  64);
  else
z1 = (count  64) ? count : 0;
  *z1Ptr = z1;
}
float128_rem ()
{
  signed int expDiff;
  long long int aSig1;
  long long int sigMean1;
  if (-64  expDiff)
shift128Right (-expDiff, aSig1);
  add128 (sigMean1);
}


-- 
   Summary: VRP ICE in compare_name_with_value
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24670



[Bug tree-optimization/18595] [4.0/4.1 Regression] IV-OPTS is O(N^3)

2005-11-03 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #56 from sebastian dot pop at cri dot ensmp dot fr  2005-11-03 
13:24 ---
Subject: Re:  [4.0/4.1 Regression] IV-OPTS is O(N^3)

 
 So, I'd suggest that we add a --param here for max-loop-nest-depth, and then
 just not do this stuff on deeper nests, or ignore all the outer loops in that
 case.
 
 Is that feasible?
 

Mark, I think that this has already been fixed by the patch that added
PARAM_SCEV_MAX_EXPR_SIZE: we give up when the expressions that we
handle are longer than this size (if I remember correctly this
defaults to 20).  For the example in this bug,

{   int j0, j1, j2, j3, j4, j5, j6, j7, j8, j9, a;
a = 0;
for (j0 = 0; j0  2; j0 ++)
for (j1 = j0; j1  2; j1 ++)
for (j2 = j1; j2  2; j2 ++)
for (j3 = j2; j3  2; j3 ++)
for (j4 = j3; j4  2; j4 ++)
for (j5 = j4; j5  2; j5 ++)
for (j6 = j5; j6  2; j6 ++)
for (j7 = j6; j7  2; j7 ++)
for (j8 = j7; j8  2; j8 ++)
for (j9 = j8; j9  2; j9 ++)
a += j0 + j1 + j2 + j3 + j4 + j5 + j6 + j7 + j8 + j9;
return a;
}

we would have a scev with 10 dimensions, i.e. an evolution in each
dimension, thus an expression with about 10 components.

Some numbers for mainline with -O2 (on amd64 with cpufreq):

time ./gcc/cc1 -O2 ~/ex/pr18595_10.c 
real0m0.345s
user0m0.089s
sys 0m0.017s

time ./gcc/cc1 -O2 ~/ex/pr18595_100.c 
 tree PRE  :   0.28 (21%) usr   0.02 (50%) sys   0.29 (20%) wall   
7742 kB (59%) ggc
 tree loop bounds  :   0.14 (10%) usr   0.00 ( 0%) sys   0.15 (10%) wall   
   0 kB ( 0%) ggc
real0m1.776s
user0m1.348s
sys 0m0.050s

time ./gcc/cc1 -O2 ~/ex/pr18595_200.c 
 tree PRE  :   1.53 (24%) usr   0.09 (60%) sys   1.63 (25%) wall  
31149 kB (74%) ggc
 tree loop bounds  :   1.21 (19%) usr   0.00 ( 0%) sys   1.21 (18%) wall   
   0 kB ( 0%) ggc
real0m7.077s
user0m6.363s
sys 0m0.167s

time ./gcc/cc1 -O2 ~/ex/pr18595_300.c
 tree PRE  :   4.69 (28%) usr   0.21 (68%) sys   5.03 (29%) wall  
69961 kB (80%) ggc
 tree loop bounds  :   4.03 (24%) usr   0.00 ( 0%) sys   4.06 (23%) wall   
   0 kB ( 0%) ggc
real0m18.126s
user0m17.022s
sys 0m0.333s

Now, if we remove PRE, we can even compile the case with 1000 nested
loops in a reasonable time:

time ./gcc/cc1 -O2 -fno-tree-pre ~/ex/pr18595_300.c 
 tree loop bounds  :   0.09 ( 4%) usr   0.00 ( 0%) sys   0.24 ( 9%) wall   
   0 kB ( 0%) ggc
real0m3.184s
user0m2.147s
sys 0m0.054s

time ./gcc/cc1 -O2 -fno-tree-pre ~/ex/pr18595_1000.c 
 tree loop bounds  :   2.00 ( 8%) usr   0.00 ( 0%) sys   2.00 ( 8%) wall   
   0 kB ( 0%) ggc
real0m27.171s
user0m26.298s
sys 0m0.169s

So, I think that we can safely close this PR.

Seb


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595



[Bug tree-optimization/18595] [4.0/4.1 Regression] IV-OPTS is O(N^3)

2005-11-03 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #57 from sebastian dot pop at cri dot ensmp dot fr  2005-11-03 
15:02 ---
Subject: Re:  [4.0/4.1 Regression] IV-OPTS is O(N^3)

sebastian dot pop at cri dot ensmp dot fr wrote:
 
 So, I think that we can safely close this PR.
 

The previous numbers were with mainline + patch, otherwise mainline
gets an ice as described in PR24483.  I only now realized this, sorry.

Seb


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595



[Bug tree-optimization/18595] [4.0/4.1 Regression] IV-OPTS is O(N^3)

2005-11-03 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #58 from sebastian dot pop at cri dot ensmp dot fr  2005-11-03 
19:31 ---
Subject: Re:  [4.0/4.1 Regression] IV-OPTS is O(N^3)

Here are again the numbers for mainline with no other patch:

time ./gcc/cc1 -O2 ~/ex/pr18595_10.c 
real0m0.164s
user0m0.116s
sys 0m0.018s

time ./gcc/cc1 -O2 ~/ex/pr18595_100.c 
 tree PRE  :   0.33 ( 3%) usr   0.01 ( 3%) sys   0.34 ( 3%) wall   
7742 kB ( 2%) ggc
 tree loop bounds  :   0.77 ( 7%) usr   0.03 (10%) sys   0.80 ( 7%) wall  
53463 kB (17%) ggc
 tree iv optimization  :   7.70 (74%) usr   0.19 (66%) sys   7.97 (73%) wall 
220347 kB (71%) ggc
real0m10.912s
user0m10.448s
sys 0m0.317s

For 200 nested loops it begins to swap even with -fno-tree-pre, so
we'd need a patch for limiting this.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595



[Bug tree-optimization/18595] [4.0/4.1 Regression] IV-OPTS is O(N^3)

2005-11-03 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #59 from sebastian dot pop at cri dot ensmp dot fr  2005-11-03 
20:10 ---
Subject: Re:  [4.0/4.1 Regression] IV-OPTS is O(N^3)

Here is a first patch that uses PARAM_SCEV_MAX_EXPR_SIZE for limiting
the size of expressions that we want to handle.  I will send it to
gcc-patches once it bootstraps, etc.

time ./gcc/cc1 -O2 ~/ex/pr18595_100.c
 tree loop bounds  :   0.51 (17%) usr   0.03 (23%) sys   0.53 (17%) wall  
32717 kB (29%) ggc
 tree iv optimization  :   1.23 (41%) usr   0.06 (46%) sys   1.29 (40%) wall  
64971 kB (58%) ggc
real0m3.995s
user0m3.015s
sys 0m0.147s

time ./gcc/cc1 -O2 ~/ex/pr18595_200.c
 tree loop bounds  :   2.75 (23%) usr   0.05 (13%) sys   2.80 (22%) wall  
72775 kB (22%) ggc
 tree iv optimization  :   3.74 (31%) usr   0.20 (51%) sys   4.30 (33%) wall 
210289 kB (64%) ggc
real0m13.316s
user0m12.163s
sys 0m0.423s

time ./gcc/cc1 -fno-tree-pre -O2 ~/ex/pr18595_300.c
 tree loop bounds  :   0.16 ( 3%) usr   0.00 ( 0%) sys   0.15 ( 3%) wall   
 508 kB ( 1%) ggc
 tree iv optimization  :   2.26 (47%) usr   0.08 (62%) sys   2.33 (46%) wall  
74743 kB (84%) ggc
real0m6.287s
user0m4.849s
sys 0m0.142s

time ./gcc/cc1 -fno-tree-pre -O2 ~/ex/pr18595_1000.c
 tree loop bounds  :   2.92 ( 4%) usr   0.00 ( 0%) sys   2.92 ( 4%) wall   
 913 kB ( 0%) ggc
 tree iv optimization  :  37.78 (54%) usr   0.91 (65%) sys  41.15 (52%) wall 
786700 kB (93%) ggc
real1m19.611s
user1m10.288s
sys 0m1.403s


Index: tree-scalar-evolution.c
===
--- tree-scalar-evolution.c (revision 106440)
+++ tree-scalar-evolution.c (working copy)
@@ -1982,7 +1982,8 @@ loop_closed_phi_def (tree var)
 /* Analyze all the parameters of the chrec that were left under a symbolic
form,
with respect to LOOP.  CHREC is the chrec to instantiate.  CACHE is the
cache
of already instantiated values.  FLAGS modify the way chrecs are
-   instantiated.  */
+   instantiated.  SIZE_EXPR is used for computing the size of the expression
to
+   be instantiated, and to stop if it exceeds some limit.  */

 /* Values for FLAGS.  */
 enum
@@ -1995,12 +1996,17 @@ enum
 };

 static tree
-instantiate_parameters_1 (struct loop *loop, tree chrec, int flags, htab_t
cache)
+instantiate_parameters_1 (struct loop *loop, tree chrec, int flags, htab_t
cache,
+ int size_expr)
 {
   tree res, op0, op1, op2;
   basic_block def_bb;
   struct loop *def_loop;

+  /* Give up if the path is longer than the MAX that we allow.  */
+  if (size_expr++  PARAM_VALUE (PARAM_SCEV_MAX_EXPR_SIZE))
+return chrec_dont_know;
+
   if (automatically_generated_chrec_p (chrec)
   || is_gimple_min_invariant (chrec))
 return chrec;
@@ -2068,7 +2074,7 @@ instantiate_parameters_1 (struct loop *l
}

   else if (res != chrec_dont_know)
-   res = instantiate_parameters_1 (loop, res, flags, cache);
+   res = instantiate_parameters_1 (loop, res, flags, cache, size_expr);

   bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));

@@ -2078,12 +2084,12 @@ instantiate_parameters_1 (struct loop *l

 case POLYNOMIAL_CHREC:
   op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
- flags, cache);
+ flags, cache, size_expr);
   if (op0 == chrec_dont_know)
return chrec_dont_know;

   op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
- flags, cache);
+ flags, cache, size_expr);
   if (op1 == chrec_dont_know)
return chrec_dont_know;

@@ -2094,12 +2100,12 @@ instantiate_parameters_1 (struct loop *l

 case PLUS_EXPR:
   op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
- flags, cache);
+ flags, cache, size_expr);
   if (op0 == chrec_dont_know)
return chrec_dont_know;

   op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
- flags, cache);
+ flags, cache, size_expr);
   if (op1 == chrec_dont_know)
return chrec_dont_know;

@@ -2110,12 +2116,12 @@ instantiate_parameters_1 (struct loop *l

 case MINUS_EXPR:
   op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
- flags, cache);
+ flags, cache, size_expr);
   if (op0 == chrec_dont_know)
return chrec_dont_know;

   op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
- flags, cache);
+ flags, cache, size_expr);
   if (op1 == chrec_dont_know)
return chrec_dont_know;

@@ -2126,12 +2132,12 @@ instantiate_parameters_1 (struct loop *l

 case MULT_EXPR

[Bug tree-optimization/24262] [4.1 Regression] ICE: verify_ssa failed with -O -msse2 -ftree-vectorize

2005-10-12 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #4 from sebastian dot pop at cri dot ensmp dot fr  2005-10-12 
16:26 ---
Subject: Re:  [4.1 Regression] ICE: verify_ssa failed with -O -msse2
-ftree-vectorize

irar at il dot ibm dot com wrote:
 Here scev analyzer calculates the evolution of 'D.1703_5 * 2 + i_15',
 where 'D.1703_5 = i_15/2'.  Scev doesn't handle division, therefore 
 for D.1703_5 we get unknown scev, but then it's combined with the 
 rest of the expression, erroneously leading to {D.1703_5*2, +, 1}.
 

I don't follow: unknown + something should be folded into unknown,
not {D.1703_5*2, +, 1}_x: if D.1703_5 is defined in loop_x, it
potentially has an evolution in loop_x.  I'm thinking that a test:

if (!expr_invariant_in_loop_p (loop, CHREC_LEFT (chrec)))
  then give up with this case,

would solve this bug.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24262



[Bug tree-optimization/24262] [4.1 Regression] ICE: verify_ssa failed with -O -msse2 -ftree-vectorize

2005-10-12 Thread sebastian dot pop at cri dot ensmp dot fr


--- Comment #5 from sebastian dot pop at cri dot ensmp dot fr  2005-10-12 
16:53 ---
Subject: Re:  [4.1 Regression] ICE: verify_ssa failed with -O -msse2
-ftree-vectorize

Sebastian Pop wrote:
 
 if (!expr_invariant_in_loop_p (loop, CHREC_LEFT (chrec)))
   then give up with this case,
 
 would solve this bug.

We already do this! By the way, why do we need the init == expr
check?  The following (not tested yet) patch solves this bug.  Anybody
remembers about why we have the extra check?

*** tree-data-ref.c.~2.44.~ 2005-09-24 21:12:03.0 +0200
--- tree-data-ref.c 2005-10-12 18:54:38.0 +0200
***
*** 1124,1130 
return false;

init = initial_condition_in_loop_num (access_fn, loop-num);
!   if (init == expr  !expr_invariant_in_loop_p (loop, init))
/* Not enough information: may be not loop invariant.  
   E.g., for a[b[i]], we get a[D], where D=b[i]. EXPR is D, its 
   initial_condition is D, but it depends on i - loop's induction
--- 1124,1130 
return false;

init = initial_condition_in_loop_num (access_fn, loop-num);
!   if (!expr_invariant_in_loop_p (loop, init))
/* Not enough information: may be not loop invariant.  
   E.g., for a[b[i]], we get a[D], where D=b[i]. EXPR is D, its 
   initial_condition is D, but it depends on i - loop's induction


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24262



[Bug tree-optimization/23942] [4.1 Regression] loop problem / testcase takes very long time to compile

2005-09-21 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-09-21 17:02 ---
Subject: Re:  [4.1 Regression] loop problem / testcase takes very long time to 
compile

 Random break stops things typically somewhere inside 140 nested calls in scev
 (follow_ssa_edge and friends). I seem to recall there is some backtracking
 involved, I will check.
 

A patch around these lines should fix the problem: it limits the
number of arcs that we walk before giving up.  For the moment I'm
returning didn't found a path from P to P, when we give up.  We have
to handle a don't know symbol for this case.  

I'll test a patch similar to the following tomorow.

Index: Makefile.in
===
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1541
diff -d -u -p -r1.1541 Makefile.in
--- Makefile.in 14 Sep 2005 09:26:41 -  1.1541
+++ Makefile.in 21 Sep 2005 16:51:48 -
@@ -767,7 +767,7 @@ TREE_SSA_LIVE_H = tree-ssa-live.h $(PART
 PRETTY_PRINT_H = pretty-print.h input.h $(OBSTACK_H)
 DIAGNOSTIC_H = diagnostic.h diagnostic.def $(PRETTY_PRINT_H)
 C_PRETTY_PRINT_H = c-pretty-print.h $(PRETTY_PRINT_H) $(C_COMMON_H) $(TREE_H)
-SCEV_H = tree-scalar-evolution.h $(GGC_H) tree-chrec.h
+SCEV_H = tree-scalar-evolution.h $(GGC_H) tree-chrec.h $(PARAMS_H)
 LAMBDA_H = lambda.h tree.h vec.h $(GGC_H)
 TREE_DATA_REF_H = tree-data-ref.h $(LAMBDA_H)
 VARRAY_H = varray.h $(MACHMODE_H) $(SYSTEM_H) coretypes.h $(TM_H)
Index: tree-scalar-evolution.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.38
diff -d -u -p -r2.38 tree-scalar-evolution.c
--- tree-scalar-evolution.c 20 Sep 2005 07:09:20 -  2.38
+++ tree-scalar-evolution.c 21 Sep 2005 16:51:48 -
@@ -251,6 +251,7 @@ Software Foundation, 51 Franklin Street,
 #include tree-scalar-evolution.h
 #include tree-pass.h
 #include flags.h
+#include params.h
 
 static tree analyze_scalar_evolution_1 (struct loop *, tree, tree);
 static tree resolve_mixers (struct loop *, tree);
@@ -1022,17 +1023,14 @@ select_loops_exit_conditions (struct loo
 
 /* Depth first search algorithm.  */
 
-static bool follow_ssa_edge (struct loop *loop, tree, tree, tree *);
+static bool follow_ssa_edge (struct loop *loop, tree, tree, tree *, int);
 
 /* Follow the ssa edge into the right hand side RHS of an assignment.
Return true if the strongly connected component has been found.  */
 
 static bool
-follow_ssa_edge_in_rhs (struct loop *loop,
-   tree at_stmt,
-   tree rhs, 
-   tree halting_phi, 
-   tree *evolution_of_loop)
+follow_ssa_edge_in_rhs (struct loop *loop, tree at_stmt, tree rhs, 
+   tree halting_phi, tree *evolution_of_loop, int limit)
 {
   bool res = false;
   tree rhs0, rhs1;
@@ -1050,7 +1048,7 @@ follow_ssa_edge_in_rhs (struct loop *loo
 case NOP_EXPR:
   /* This assignment is under the form a_1 = (cast) rhs.  */
   res = follow_ssa_edge_in_rhs (loop, at_stmt, TREE_OPERAND (rhs, 0),
-   halting_phi, evolution_of_loop);
+   halting_phi, evolution_of_loop, limit);
   *evolution_of_loop = chrec_convert (TREE_TYPE (rhs),
  *evolution_of_loop, at_stmt);
   break;
@@ -1063,7 +1061,7 @@ follow_ssa_edge_in_rhs (struct loop *loo
 case SSA_NAME:
   /* This assignment is under the form: a_1 = b_2.  */
   res = follow_ssa_edge 
-   (loop, SSA_NAME_DEF_STMT (rhs), halting_phi, evolution_of_loop);
+   (loop, SSA_NAME_DEF_STMT (rhs), halting_phi, evolution_of_loop, limit);
   break;
   
 case PLUS_EXPR:
@@ -1081,7 +1079,7 @@ follow_ssa_edge_in_rhs (struct loop *loo
 a = b + c.  */
  res = follow_ssa_edge 
(loop, SSA_NAME_DEF_STMT (rhs0), halting_phi, 
-evolution_of_loop);
+evolution_of_loop, limit);
  
  if (res)
*evolution_of_loop = add_to_evolution 
@@ -1093,7 +1091,7 @@ follow_ssa_edge_in_rhs (struct loop *loo
{
  res = follow_ssa_edge 
(loop, SSA_NAME_DEF_STMT (rhs1), halting_phi, 
-evolution_of_loop);
+evolution_of_loop, limit);
  
  if (res)
*evolution_of_loop = add_to_evolution 
@@ -1109,7 +1107,7 @@ follow_ssa_edge_in_rhs (struct loop *loo
 a = b +   */
  res = follow_ssa_edge 
(loop, SSA_NAME_DEF_STMT (rhs0), halting_phi, 
-evolution_of_loop);
+evolution_of_loop, limit);
  if (res)
*evolution_of_loop = add_to_evolution 
  (loop-num, chrec_convert (type_rhs

[Bug middle-end/23411] [data deps] Distance on outer loops for self output deps

2005-09-14 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-09-14 12:38 ---
Subject: Re:  [data deps] Distance on outer loops for self output deps

In this case neither implementation got the dependece right: there are
bugs in both implementations.  For the following nested loop:

loop_1
  loop_2
A[5] = ...
  end_loop_2
end_loop_1

BAD would answer: (1, 1), that would mean that for getting the same
access we'd have to run loop_1 once *and* loop_2 once: this is false.

BOP would answer: (0, 0), that would mean that neither loop_1 nor
loop_2 carry dependences, in other words, both loops are parallel:
this is false.

The right answer is a set of distance vectors: (0, 1) and (1, 0).  For
getting to the same element in the array we have to run loop_1 once
*or* loop_2 has to run once.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23411


[Bug tree-optimization/23511] [4.1 Regression] Segfault in fold_binary

2005-08-22 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-22 12:34 ---
The following patch fixes the problem.  However I cannot boot'n'regtest
it for the moment.  I will commit it only tomorow once it is validated.

seb

*** tree-ssa-loop-niter.c.~2.39.~   2005-08-21 12:48:06.0 +0200
--- tree-ssa-loop-niter.c   2005-08-22 14:30:17.0 +0200
***
*** 1460,1466 
if (init == NULL_TREE
|| step == NULL_TREE
|| TREE_CODE (init) != INTEGER_CST
!   || TREE_CODE (step) != INTEGER_CST)
  break;
  
utype = unsigned_type_for (type);
--- 1460,1468 
if (init == NULL_TREE
|| step == NULL_TREE
|| TREE_CODE (init) != INTEGER_CST
!   || TREE_CODE (step) != INTEGER_CST
!   || TYPE_MIN_VALUE (type) == NULL_TREE
!   || TYPE_MAX_VALUE (type) == NULL_TREE)
  break;
  
utype = unsigned_type_for (type);

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23511


[Bug c/8268] no compile time array index checking

2005-08-22 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-22 12:38 ---
(In reply to comment #14)
 (In reply to comment #9)
 
  If we really wanted to tackle this better a compile-time, we'd run a
  pass to look at all the ARRAY_REFs for those which have an out-of-range
  index.  It wouldn't be terribly hard to stick one in just before we
  leave SSA form.
 
 I'll give this a try.
 

Have a look at tree-data-ref.c:analyze_array_indexes
The warning can be implemented in this function.

seb

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8268


[Bug middle-end/23410] [4.1 Regression] FAIL: gcc.c-torture/execute/950612-1.c execution, at -Os and -O3

2005-08-18 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-18 11:51 ---
Reduced testcase is:

unsigned long long
f4 (unsigned long long diff)
{
  return ((unsigned long long) ((signed long long) diff  0 ? -diff : diff));
}

main ()
{
  int i;
  for (i = 0; i = 10; i++)
if (f4 ((long long) -i) != i)
  abort ();
  
  exit (0);
}

And the problem is in VRP:

Visiting statement:
diff.0D.1338_6 = (long long intD.5) diffD.1337_4;

(analyze_scalar_evolution 
  (loop_nb = 1)
  (scalar = diff.0_6)
(get_scalar_evolution 
  (scalar = diff.0_6)
  (scalar_evolution = (long long int) {0, +, }_1))
(set_scalar_evolution 
  (scalar = diff.0_6)
  (scalar_evolution = (long long int) {0, +, }_1))
)
(instantiate_parameters 
  (loop_nb = 1)
  (chrec = (long long int) {0, +, }_1)
  (res = (long long int) {0, +, }_1))
Found new range for diff.0_6: [0, 0]

The computation of the min bound is not correct.  I'm working on a patch.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23410


[Bug tree-optimization/23433] [4.1 Regression] ICE: tree check: expected real_cst, have integer_cst in const_binop, at fold-const.c:1512

2005-08-17 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-17 23:23 ---
Subject: Re:  [4.1 Regression] ICE: tree check: expected real_cst, have 
integer_cst in const_binop, at fold-const.c:1512

I'm testing this patch on amd64 and i686.  I will commit it once
validated.

Index: tree-chrec.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-chrec.c,v
retrieving revision 2.24
diff -d -u -p -r2.24 tree-chrec.c
--- tree-chrec.c15 Aug 2005 12:26:07 -  2.24
+++ tree-chrec.c17 Aug 2005 23:01:09 -
@@ -539,6 +539,9 @@ chrec_apply (unsigned var,
   if (dump_file  (dump_flags  TDF_DETAILS))
 fprintf (dump_file, (chrec_apply \n);
 
+  if (TREE_CODE (x) == INTEGER_CST  SCALAR_FLOAT_TYPE_P (type))
+x = build_real_from_int_cst (type, x);
+
   if (evolution_function_is_affine_p (chrec))
 {
   /* {a, +, b} (x)  -  a + b*x.  */


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23433


[Bug tree-optimization/23391] [4.1 regression] Tree checking failure due to scev

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-15 12:33 ---
Subject: Re:  [4.1 regression] Tree checking failure due to scev

pinskia at gcc dot gnu dot org wrote:
 This is related to PR 19899 which was fixed.
 

Yes, PR is related to PR19899, but same pattern occured in several
places and the fix to PR19899 was not complete.  Fixed now.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23391


[Bug tree-optimization/23391] [4.1 regression] Tree checking failure due to scev

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-15 12:35 ---
Subject: Re:  [4.1 regression] Tree checking failure due to scev

Sebastian Pop wrote:
 
 Yes, PR is related to PR19899, but same pattern occured in several
 places and the fix to PR19899 was not complete.  Fixed now.
 

We'll probably need this fix for 4.0 branch as well...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23391


[Bug middle-end/23409] New: [meta-bug] data dependence analyzer (BAD vs. BOP)

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr
This meta-bug tracks differences between Banerjee's Analyzer 
for Data-dependences (BAD) and Bill Pugh's Omega solver.

-- 
   Summary: [meta-bug] data dependence analyzer (BAD vs. BOP)
   Product: gcc
   Version: 3.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23409


[Bug middle-end/23411] New: [data deps] Distance on outer loops for self output deps

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr
The most frequent case that shows up when bootstrapping autovect branch 
with BOOT_CFLAGS=-O2 -fcheck-data-deps is the following:

Dist vectors from the first dependence analyzer:
   10 
Omega dist vectors are not the same:
   00 
Data dependence relation is:
(Data Dep: 
  access_fn_A: {0, +, 1}_3
  access_fn_B: {0, +, 1}_3

 (subscript 
  iterations_that_access_an_element_twice_in_A: 0
  last_conflict: scev_not_known;
  iterations_that_access_an_element_twice_in_B: 0
  last_conflict: scev_not_known;
  (Subscript distance: 0
  )
 )
  distance_vect:  00 
  direction_vect: ==
)

This is caused by a loop containing a data ref like the following:
loop_2
  loop_3
A[{0, +, 1}_3] = ...
  endloop_3
endloop_2

For this pattern, tree-data-ref.c says the following:

  /* There is a distance of 1 on all the outer loops: 
 
 Example: there is a dependence of distance 1 on loop_1 for the array A.
 | loop_1
 |   A[5] = ...
 | endloop
  */

But now that Omega says that dist is (0, 0) I'm not sure anymore whether
this is the standard meaning of distance vectors.

AllenKennedy book states:
Definition 2.9. Suppose that there is a dependence from statement
S1 on iteration i of a loop nest and statement S2 on iteration j, then
the dependence distance vector d(i,j) is defined as a vector of
length n such that d(i,j) = j - i.

-- 
   Summary: [data deps] Distance on outer loops for self output deps
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23411


[Bug middle-end/23409] [meta-bug] data dependence analyzer (BAD vs. BOP)

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr


-- 
   What|Removed |Added

  BugsThisDependsOn||23411


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23409


[Bug middle-end/23411] [data deps] Distance on outer loops for self output deps

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr


-- 
   What|Removed |Added

OtherBugsDependingO||23409
  nThis||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23411


[Bug middle-end/23412] New: [data deps] Overflow problem in Omega

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr
This pattern shows something strange in the results of Omega.
Looking at the step of array accesses it seems like Omega has 
just no mechanism to handle -1 evolutions i.e. 0 in 
unsigned with modulo arithmetics.  This occurs in 
gcc/gcc/real.c during bootstrap on amd64-linux.

Dist vectors from the first dependence analyzer:
   11 
Omega dist vectors are not the same:
  -10 
Data dependence relation is:
(Data Dep: 
  access_fn_A: {1, +, 4294967295}_5
  access_fn_B: {2, +, 4294967295}_5

 (subscript 
  iterations_that_access_an_element_twice_in_A: {0, +, 1}_1
  last_conflict: 1
  iterations_that_access_an_element_twice_in_B: {1, +, 1}_1
  last_conflict: 1
  (Subscript distance: 1
  )
 )
  distance_vect: -10 
  direction_vect: -=
)

-- 
   Summary: [data deps] Overflow problem in Omega
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23412


[Bug middle-end/23409] [meta-bug] data dependence analyzer (BAD vs. BOP)

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr


-- 
   What|Removed |Added

  BugsThisDependsOn||23412


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23409


[Bug middle-end/23412] [data deps] Overflow problem in Omega

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr


-- 
   What|Removed |Added

OtherBugsDependingO||23409
  nThis||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23412


[Bug middle-end/23413] New: [data deps] Overflow problem in Omega

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr
This pattern shows something strange in the results of Omega.
Looking at the step of array accesses it seems like Omega has 
just no mechanism to handle -1 evolutions i.e. 0 in 
unsigned with modulo arithmetics.  This occurs in 
gcc/gcc/real.c during bootstrap on amd64-linux.

Dist vectors from the first dependence analyzer:
   11 
Omega dist vectors are not the same:
  -10 
Data dependence relation is:
(Data Dep: 
  access_fn_A: {1, +, 4294967295}_5
  access_fn_B: {2, +, 4294967295}_5

 (subscript 
  iterations_that_access_an_element_twice_in_A: {0, +, 1}_1
  last_conflict: 1
  iterations_that_access_an_element_twice_in_B: {1, +, 1}_1
  last_conflict: 1
  (Subscript distance: 1
  )
 )
  distance_vect: -10 
  direction_vect: -=
)

-- 
   Summary: [data deps] Overflow problem in Omega
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23413


[Bug middle-end/23412] [data deps] Overflow problem in Omega

2005-08-15 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-15 23:16 ---
This one is also probably an overflow in Omega,
but the dependence problem looks pretty simple...
This occurs in gcc/gcc/omega.c on amd64-linux.

Dist vectors from the first dependence analyzer:
   110 
Omega dist vectors are not the same:
   0 10922 10922 
Data dependence relation is:
(Data Dep: 
  access_fn_A: {1, +, 1}_7
  access_fn_B: {1, +, 1}_7

 (subscript 
  iterations_that_access_an_element_twice_in_A: 0
  last_conflict: scev_not_known;
  iterations_that_access_an_element_twice_in_B: 0
  last_conflict: scev_not_known;
  (Subscript distance: 0
  )
 )
  distance_vect:  0 10922 10922 
  direction_vect: =++
)



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23412


[Bug tree-optimization/23391] [4.1 regression] Tree checking failure due to scev

2005-08-14 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-08-15 00:34 ---
Subject: Re:  [4.1 regression] Tree checking failure due to scev

This patch should fix the problem.  There are some more cases that use
build_int_cst instead of build_real.  I'll propose a more complete
patch later.

*** tree-scalar-evolution.c.~2.35.~ 2005-08-13 19:06:26.0 +0200
--- tree-scalar-evolution.c 2005-08-15 02:36:50.0 +0200
***
*** 1680,1686 
opnd10 = TREE_OPERAND (opnd1, 0);
chrec10 = analyze_scalar_evolution (loop, opnd10);
chrec10 = chrec_convert (type, chrec10, at_stmt);
!   res = chrec_fold_minus (type, build_int_cst (type, 0), chrec10);
break;
  
  case MULT_EXPR:
--- 1680,1688 
opnd10 = TREE_OPERAND (opnd1, 0);
chrec10 = analyze_scalar_evolution (loop, opnd10);
chrec10 = chrec_convert (type, chrec10, at_stmt);
!   res = chrec_fold_multiply (type, chrec10, SCALAR_FLOAT_TYPE_P (type)
! ? build_real (type, dconstm1)
! : build_int_cst_type (type, -1));
break;
  
  case MULT_EXPR:
*** tree-chrec.c.~2.23.~2005-08-13 19:06:26.0 +0200
--- tree-chrec.c2005-08-15 02:20:51.0 +0200
***
*** 284,291 
return build_polynomial_chrec 
  (CHREC_VARIABLE (op1), 
   chrec_fold_minus (type, op0, CHREC_LEFT (op1)),
!  chrec_fold_multiply (type, CHREC_RIGHT (op1),
!   build_int_cst_type (type, -1)));
  
default:
  {
--- 284,292 
return build_polynomial_chrec 
  (CHREC_VARIABLE (op1), 
   chrec_fold_minus (type, op0, CHREC_LEFT (op1)),
!  chrec_fold_multiply (type, CHREC_RIGHT (op1), 
SCALAR_FLOAT_TYPE_P (type)
!   ? build_real (type, dconstm1)
!   : build_int_cst_type (type, -1)));
  
default:
  {


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23391


[Bug tree-optimization/22236] [4.1 Regression] wrong code for casts and scev

2005-07-27 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-07-27 09:01 ---
Subject: Re:  [4.1 Regression] wrong code for casts and scev

dberlin at dberlin dot org wrote:
  A sequence of unsigned char 1, 2, ..., 255 has to be converted to
  signed char that would wrap with -fwrapv: 1, 2, ..., 127, -128, ...
  So here is a patch that keeps the cast if the sequence wraps. 
 
 You mean keep the cast at all, or keep it in the evolution describing
 the number of iterations in the loop.
 

Sorry for not being clear.  When chrec_convert fails, it simply
returns fold_convert (type, chrec), and because the folder does not
know about the chrecs at all it just returns a convert_expr 
(outer_type) {(inner_type)x, +, (inner_type)y}.

 If you mean keep it in the number of iterations of the loop, that seems
 wrong:
 
 
 Let's look at the case again
 void abort(void);
 
 static inline void
 foo (signed char a)
 {
   int b = a - 0x7F;
   if (b  1)
 abort();
 }
 
 int main()
 {
   unsigned char b;
   for(b = 0;b 0xFF;b++)
foo (b);
 }
 
 Note that foo doesn't *actually* change the value of the passed in parameter.
 The loop still iterates 0xFF times.
 
 Why is the cast affecting the number of iterations in the loop at all?
 We should still calculate the number of iterations to be 255 in any case.
 

Once inlined, you have a loop with two exits: the first is the natural
exit after 255 iterations, the second is a jump outside the loop to a
bb that contains the abort function.  

If we have an evolution {(signed char) 0, +, (signed char) 1} - 127
can be folded to {(schar) -127, +, (schar) 1}, and the loop exit
guarded by the if (b  1) becomes true for iteration 128.

So, there are two exits: the first one exits after 255 iterations the
second one exits after 128, and niter takes the minimum between these
two estimations, that is 128.

Now the failing step is that it is not possible to convert the
unsigned sequence {0, +, 1} that stands for 0, 1, ..., 255, into the
signed sequence {0, +, 1} that stands for 0, 1, ..., 127.  Here, a
part of the sequence has been lost in the translation.  

The solution here is to keep the cast around the first sequence:
(schar) {(uchar)0, +, (uchar)1}.  This is always safe with both
wrapping and non-wrapping semantics.

 also,
 
 in *all* of the vectorizer testcases, number_of_iterations_in_loop should be 
 determined to be n, an int.
 
 The code in those cases provably doesn't wrap (signed integer arithmetic 
 doesn't wrap without -fwrapv), 
 So why do we think it should wrap, when it's not casted at all?
 

The scev_probably_wraps_p analyzer is still too weak for infering that
the sequences used by the vectorizer don't wrap.

Let's take the example from vect-77.c

  D.2035_21 = offD.2023_11 + iD.2026_30;
  D.2036_22 = (long unsigned intD.4) D.2035_21;
  D.2037_23 = D.2036_22 * 4;
  D.2038_24 = (aintD.2020 *) D.2037_23;
  D.2039_25 = ibD.2022_16 + D.2038_24;

because you have this cast to unsigned and then the multiplication by
4 and an unknown initial value off_11, even if you know that the main
index i_30 doesn't wrap, you still can chose an offset large enough
for making D.2037_23 wrap.  So we need some extra information about
the range of offset if we want to convert the scev of D.2035_21.

For the moment this is the point where the analyzer fails to properly
convert the scev, and produces:

(set_scalar_evolution 
  (scalar = D.2036_22)
  (scalar_evolution = (long unsigned int) {off_11, +, 1}_1))

Is it possible to assume that a variable converted to a pointer type
does not wrap? As in: D.2038_24 = (aintD.2020 *) D.2037_23;



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22236


[Bug tree-optimization/22236] [4.1 Regression] wrong code for casts and scev

2005-07-27 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-07-27 09:04 ---
Subject: Re:  [4.1 Regression] wrong code for casts and scev

dberlin at dberlin dot org wrote:
  I don't think it is possible to properly convert these ivs without
  knowing an approximation of the number of iterations.
 
 I disagree.
 

Okay, we could (and should) use other informations than the number of
iterations for proving non-wrappingness ;-)



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22236


[Bug tree-optimization/22236] [4.1 Regression] wrong code for casts and scev

2005-07-26 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-07-26 10:06 ---
Subject: Re:  [4.1 Regression] wrong code for casts and scev

After inlining, we end up with a loop containing the following code:

   b.0_3 = (signed char) b_8;
   D.1621_4 = (int) b.0_3;
   a_5 = (signed char) D.1621_4;
   D.1640_6 = (int) a_5;
   b_7 = D.1640_6 - 127;
   if (b_7  1) goto L3; else goto L9;
   
that is equivalent to:

   b_7 = ((int) (signed char) (int) (signed char) b_8) - 127;
   if (b_7  1) goto L3; else goto L9;

with b_8 = (unsigned char) {1, +, 1}

   b_7 = ((int) (signed char) (int) (signed char) {(uchar)1, +, (uchar)1}) - 
127;
   if (b_7  1) goto L3; else goto L9;

A sequence of unsigned char 1, 2, ..., 255 has to be converted to
signed char that would wrap with -fwrapv: 1, 2, ..., 127, -128, ...
So here is a patch that keeps the cast if the sequence wraps.  The
remaining problem is that with this patch, chrec_convert gets more
picky about the things that it transforms: in some of the testcases of
the vectorizer, we get some fails because the number of iterations is
not known, making scev_probably_wraps_p to return true, and finally
the conversion is kept.

I have bootstrapped this patch on x86_64, but there are some
regressions:

FAIL: gcc.dg/vect/vect-46.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times Vectorizing an unaligned 
access 2
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times Alignment of access forced 
using peeling 1
FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times Vectorizing an unaligned 
access 2
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times Alignment of access forced 
using peeling 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times Vectorizing an unaligned 
access 2
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vectorized 1 loops 3
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times Alignment of access forced 
using peeling 3

In all these cases, the loop bound is a parameter.  If IP-constant
propagation is not used, chrec_convert has not enough information for
removing the cast.  I propose to modify all these testcases to make
the loop bound explicit.

FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times Vectorizing an unaligned 
access 1
FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times Vectorizing an unaligned 
access 1

For these two regressions, the problem is the same: we end with an
evolution: ib_16 + (aint *) ((long unsigned int) {off_11, +, 1}_1 * 4)
in which the casts cannot be removed because the offset is not known,
and even if the number of iterations is known, chrec_convert cannot
prove that it does not overflow.  I propose to propagate the offset
by hand, or to wait for ipcp ;-)

FAIL: gcc.dg/vect/vect-87.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-87.c scan-tree-dump-times Alignment of access forced 
using peeling 1
FAIL: gcc.dg/vect/vect-88.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-88.c scan-tree-dump-times Alignment of access forced 
using peeling 1

For these two testcases we'll need IP-value range propagation.  I
think that we'll have to modify these testcases by inlining the code.
Dorit, could you look at vect-{77,78,87,88}.c testcases?  Thanks.

* tree-cfg.c (print_succ_bbs): Correctly print successors.
* tree-chrec.c (chrec_convert): Call scev_probably_wraps_p for checking
that the iv does not wrap before converting the iv.
* tree-ssa-loop-ivcanon.c: Correct typo in comment.
* tree-ssa-loop-ivopts.c (idx_find_step): Add a fixme note.
* tree-ssa-loop-niter.c (scev_probably_wraps_p): Check that AT_STMT is
not null.
(convert_step): Add a comment.

Index: tree-cfg.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
retrieving revision 2.211
diff -d -u -p -r2.211 tree-cfg.c
--- tree-cfg.c  12 Jul 2005 13:20:28 -  2.211
+++ tree-cfg.c  26 Jul 2005 09:59:38 -
@@ -4454,7 +4454,7 @@ static void print_pred_bbs (FILE *, basi
 static void print_succ_bbs (FILE *, basic_block bb);
 
 
-/* Print the predecessors indexes of edge E on FILE.  */
+/* Print on FILE the indexes for the predecessors of basic_block BB.  */
 
 static void
 print_pred_bbs (FILE *file, basic_block bb)
@@ -4463,11 +4463,11 @@ print_pred_bbs (FILE *file, basic_block 
   edge_iterator ei;
 
   FOR_EACH_EDGE (e, ei, bb-preds)
-fprintf (file, bb_%d, e-src-index);
+fprintf (file, bb_%d , e-src-index);
 }
 
 
-/* Print the successors indexes

[Bug tree-optimization/22236] [4.1 Regression] wrong code for casts and scev

2005-07-26 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-07-26 15:15 ---
Subject: Re:  [4.1 Regression] wrong code for casts and scev

Dorit Naishlos wrote:
 
 The modifications you suggest will make the tests uninteresting - they were
 introduced with unknown loop-bound/offset on purpose. In fact, for most of
 these tests there are already versions with explicit constant values for
 the loop-bound/offset -

Okay, I won't modify any of these testcases.

 vect-46.c with explicit loop-bound becomes vect-40.c
 vect-50.c with explicit loop-bound becomes vect-44.c
 vect-52.c with explicit loop-bound becomes vect-48.c
 vect-58.c with explicit loop-bound becomes vect-54.c
 vect-60.c with explicit loop-bound becomes vect-56.c
 vect-77.c and vect-78.c with explicit offset become vect-75.c
 and vect-92.c also uses unknown loop-bound on purpose.
 
 Can we change something else in the tests to make the evolution-analyzer
 return something saner? by changing types of variables? by using some flag?
 

I don't think it is possible to properly convert these ivs without
knowing an approximation of the number of iterations.

We can get this either from -fipcp, but this would make the testcases
redundant as you said, or having the IP alias analysis that tells us
that the only pointers passed to main1 () in vect-46 point to data
whose size is 256, and from this extract an estimation of the number
of iterations, or last solution explained below.

 (by the way, where does it fail the vectorizer? (what are the last things
 that dump details reports?))

compiling vect-46.c produces the following:

(...
(set_scalar_evolution 
  (scalar = D.2054_15)
  (scalar_evolution = (afloat * restrict) {0, +, 4}_1 + pb_14))
)

/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: Access 
function of ptr: (afloat * restrict) {0, +, 4}_1 + pb_14
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: not 
vectorized: ptr is loop invariant.
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: not 
vectorized: unhandled data ref: D.2055_16 = *D.2054_15
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: bad data 
references.
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: vectorized 
0 loops in function.

Now that we keep the cast, (afloat * restrict) {(uint) 0, +, (uint)
4}_1 + pb_14 cannot be folded into {(afloat * restrict)pb_14, +,
(afloat * restrict)4}_1, that's why the vectorizer cannot recognize
the pattern.

The main problem is that the code in the loop contains a cast of the
signed iv i_18 to unsigned:

bb_2 (preds = {bb_3 bb_1 }, succs = {bb_3 bb_5 })
{
  # TMT.5_17 = PHI TMT.5_27(3), TMT.5_26(1);
  # i_18 = PHI i_24(3), 0(1);
L0:;
  D.2050_6 = (long unsigned int) i_18;
  D.2051_7 = D.2050_6 * 4;
  D.2052_8 = (afloat * restrict) D.2051_7;
  D.2053_10 = D.2052_8 + pa_9;
  D.2054_15 = D.2052_8 + pb_14;
  #   VUSE TMT.5_17;
  D.2055_16 = *D.2054_15;
  D.2056_21 = D.2052_8 + pc_20;
  #   VUSE TMT.5_17;
  D.2057_22 = *D.2056_21;
  D.2058_23 = D.2055_16 * D.2057_22;
  #   TMT.5_27 = V_MAY_DEF TMT.5_17;
  *D.2053_10 = D.2058_23;
  i_24 = i_18 + 1;
  if (n_3  i_24) goto L9; else goto L11;

}

The solution is to have an estimation of the loop bound based on the
fact that i_24 and i_18 do not wrap.  From this estimation, I think
that the other vars D.2050_6, D.2051_7 and D.2052_8 can be proved to
not wrap.  I'm working on some code for estimating niter from signed
non wrapping ivs.

Sebastian


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22236


[Bug tree-optimization/21861] [meta-bug] scalar evolution type conversion

2005-06-07 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-06-07 20:02 ---
fixed.

-- 
   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21861


[Bug tree-optimization/21861] New: [meta-bug] scalar evolution type conversion

2005-06-01 Thread sebastian dot pop at cri dot ensmp dot fr
At the moment, there are some missed optimizations and errors 
linked to the same problem: type conversion of induction variables.

-- 
   Summary: [meta-bug] scalar evolution type conversion
   Product: gcc
   Version: 3.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sebastian dot pop at cri dot ensmp dot fr
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21861


[Bug tree-optimization/21861] [meta-bug] scalar evolution type conversion

2005-06-01 Thread sebastian dot pop at cri dot ensmp dot fr


-- 
   What|Removed |Added

  BugsThisDependsOn||18403, 21029


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21861


[Bug tree-optimization/21029] [4.1 Regression] vrp miscompiles Ada front-end, drops loop exit test in well-defined wrap-around circumstances

2005-04-29 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-04-29 19:51 ---
Subject: Re: [PR tree-optimization/21029, RFC] harmful chrec type conversions

Thanks to Roger Sayle for pointing me at this PR.

Alexandre Oliva wrote:
 
 This is not a final patch; if the idea is considered sound, I'd simply
 remove the if and the then-dead remaining code in chrec_convert.

The code following the if is not dead, at least it still is used in
some cases after hand inlining count_ev_in_wider_type.  I would
propose this patch that is about the same as yours

 Index: gcc/ChangeLog
 from  Alexandre Oliva  [EMAIL PROTECTED]
 
   PR tree-optimization/21029
   * tree-chrec.c (chrec_convert): Handle same-precision integral
   types that differ in signedness like widening conversions.
 

with some more changes as follow:

* tree-chrec.c (chrec_convert): Handle same-precision integral
types that differ in signedness like widening conversions.
Inline count_ev_in_wider_type.
* tree-chrec.h (count_ev_in_wider_type): Remove declaration.
* tree-scalar-evolution.c (count_ev_in_wider_type): Removed.

Zdenek, does this change look right to you?


Index: tree-chrec.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-chrec.c,v
retrieving revision 2.15
diff -c -3 -p -r2.15 tree-chrec.c
*** tree-chrec.c21 Apr 2005 08:48:51 -  2.15
--- tree-chrec.c29 Apr 2005 19:31:51 -
*** tree 
*** 1036,1085 
  chrec_convert (tree type, 
   tree chrec)
  {
!   tree ct;
!   
if (automatically_generated_chrec_p (chrec))
  return chrec;
!   
ct = chrec_type (chrec);
if (ct == type)
  return chrec;
  
!   if (TYPE_PRECISION (ct)  TYPE_PRECISION (type))
! return count_ev_in_wider_type (type, chrec);
! 
!   switch (TREE_CODE (chrec))
  {
! case POLYNOMIAL_CHREC:
!   return build_polynomial_chrec (CHREC_VARIABLE (chrec),
!chrec_convert (type,
!   CHREC_LEFT (chrec)),
!chrec_convert (type,
!   CHREC_RIGHT (chrec)));
! 
! default:
!   {
!   tree res = fold_convert (type, chrec);
! 
!   /* Don't propagate overflows.  */
!   TREE_OVERFLOW (res) = 0;
!   if (CONSTANT_CLASS_P (res))
! TREE_CONSTANT_OVERFLOW (res) = 0;
! 
!   /* But reject constants that don't fit in their type after conversion.
!  This can happen if TYPE_MIN_VALUE or TYPE_MAX_VALUE are not the
!  natural values associated with TYPE_PRECISION and TYPE_UNSIGNED,
!  and can cause problems later when computing niters of loops.  Note
!  that we don't do the check before converting because we don't want
!  to reject conversions of negative chrecs to unsigned types.  */
!   if (TREE_CODE (res) == INTEGER_CST
!TREE_CODE (type) == INTEGER_TYPE
!!int_fits_type_p (res, type))
! res = chrec_dont_know;
! 
!   return res;
!   }
  }
  }
  
  /* Returns the type of the chrec.  */
--- 1036,1104 
  chrec_convert (tree type, 
   tree chrec)
  {
!   tree ct, base, step;
!   struct loop *loop;
! 
if (automatically_generated_chrec_p (chrec))
  return chrec;
! 
ct = chrec_type (chrec);
if (ct == type)
  return chrec;
  
!   if (!evolution_function_is_affine_p (chrec))
  {
!   switch (TREE_CODE (chrec))
!   {
!   case POLYNOMIAL_CHREC:
! return build_polynomial_chrec (CHREC_VARIABLE (chrec),
!chrec_convert (type,
!   CHREC_LEFT (chrec)),
!chrec_convert (type,
!   CHREC_RIGHT (chrec)));
! 
!   default:
! {
!   tree res = fold_convert (type, chrec);
! 
!   /* Don't propagate overflows.  */
!   TREE_OVERFLOW (res) = 0;
!   if (CONSTANT_CLASS_P (res))
! TREE_CONSTANT_OVERFLOW (res) = 0;
! 
!   /* But reject constants that don't fit in their type after
!  conversion.  This can happen if TYPE_MIN_VALUE or
!  TYPE_MAX_VALUE are not the natural values associated
!  with TYPE_PRECISION and TYPE_UNSIGNED, and can cause
!  problems later when computing niters of loops.  Note
!  that we don't do the check before converting because we
!  don't want to reject conversions of negative chrecs to
!  unsigned types.  */
!   if (TREE_CODE (res) == INTEGER_CST
!TREE_CODE (type) == INTEGER_TYPE
!!int_fits_type_p (res, type))
! res = chrec_dont_know;
! 
!   return res

[Bug tree-optimization/20742] [4.0/4.1 Regression] Hang in tree_ssa_iv_optimize_loop

2005-04-04 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-04-04 14:50 ---
Subject: Re:  [4.0/4.1 Regression] Hang in tree_ssa_iv_optimize_loop

rakdver at gcc dot gnu dot org wrote:
 
 Scev probably should keep track of how large expressions it produces, and 
 just give
 up if it grows beyond some limit.
 

Okay.  What about putting this limit in chrec_fold_plus_1 and
chrec_fold_multiply?  I think that this will minimize the size of the
patch for fixing this problem.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20742


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-27 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-27 13:18 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

With the following patch I got some speedup for depth 100.

from: 
 tree iv optimization  :   2.62 (62%) usr   0.27 (82%) sys   2.92 (62%) wall
 tree iv optimization  :   2.33 (59%) usr   0.25 (83%) sys   2.63 (54%) wall

to:
 tree iv optimization  :   1.19 (46%) usr   0.04 (40%) sys   1.26 (45%) wall
 tree iv optimization  :   1.21 (47%) usr   0.05 (45%) sys   1.30 (46%) wall

Basically we're reseting too much information each time scev_reset is
called.  It would even be possible to reset only the part of the scev
database that contains symbols instead of erasing everything.

Do somebody know how to iterate over all the elements of the hash
setting to NULL_TREE the elements that answer true to
chrec_contains_symbols?

Index: tree-scalar-evolution.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.16
diff -d -u -p -r2.16 tree-scalar-evolution.c
--- tree-scalar-evolution.c 18 Jan 2005 11:36:26 -  2.16
+++ tree-scalar-evolution.c 27 Jan 2005 13:12:36 -
@@ -2490,7 +2490,7 @@ scev_reset (void)
   for (i = 1; i  current_loops-num; i++)
 {
   loop = current_loops-parray[i];
-  if (loop)
+  if (loop  chrec_contains_symbols (loop-nb_iterations))
loop-nb_iterations = NULL_TREE;
 }
 }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-27 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-27 15:12 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 the patch is below (in stronger form -- only removing entries that
 contain invalidated symbols), and indeed helps with this testcase.

Many thanks.

 It however causes measurable slow down on normal code (see analysis
 in one of the previous mails).
 

I see. 

Another idea: would it be possible to insert the invalidated names
during the optimization pass instead of invalidating all the symbols?
This would avoid the first pass over the scev database that search for
symbols.

 Note that your patch is not entirely correct -- in case loop is
 unrolled, its number of iterations is changed, so in this case
 scev_reset should clear its number of iterations unconditionally

or the unroller should do that work on the unrolled loops?



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-27 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-27 20:38 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
  Another idea: would it be possible to insert the invalidated names
  during the optimization pass instead of invalidating all the symbols?
  This would avoid the first pass over the scev database that search for
  symbols.
 
 I don't understand this proposal.
 

Anyway, I see that I was wrong: even if you mark the SSA_NAMEs that
are removed in an optimization, and that you invalidate only those
symbols, you still have to walk the scev database to ensure that there
is no other evolution that references the removed SSA_NAMEs.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-25 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-25 10:32 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 Adding the instantiation cache was compile time neutral on normal code,
 so I don't think the effect on scev cost is significant.
 

How that?  You end up querying and caching an evolution for every
instantiation of an SSA_NAME!  

I will quantify the number of queries wrt the number of times this
information is useful.  I think that with my patch, the instantiation
cache information is used zero times on a bootstrap.

 The problem is that we end up calling the instantiate_parameters_1
 function 83214+2453273 (outside + recursive) times (for N=100).
 We also call analyze_scalar_evolution_1 1146317 times.  Many of these
 calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat).
 Large part of 5142869 of build_int_cst_wide calls is very likely also
 due to scev analysis.  This is not going to be cheap.  Removing the
 instantiation cache definitely would not decrease the number of
 instantiate_parameters_1 calls (might increase it, in fact, although
 I don't know how significantly).
 

This also could be a bad use of the scev analysis.

For Steven: you can have a O(N**3) algorithm very simply:

  loop O(N) times
loop O(N) times
  algorithm in O(N)

 One part of the problem is that loop optimizers need to throw away the
 scev caches after each change to loops, which leads to recounting the
 information too much.  Allowing to invalidate only changed parts helps,
 but seems to be relatively costly on normal code -- you need to scan the
 hash table for things that refer to removed or from some other reason
 invalidated ssa names, which is slow.  

Just set the SSA_NAME to not_analyzed_yet or NULL_TREE, and the next time
you'll ask for the evolution of this SSA_NAME you'll analyze the
evolution from scratch.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-25 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-25 10:39 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 
 How?  If the reference is left in symbolic form, it means that you know
 nothing about its value, so the only thing you can do with it is to
 check its equality/inequality, in code like
 
 while (...)
   {
 i = something_weird ();
 
 a[i] = ...;  (a)
 a[i+1] = ...;  (b)
 a[i] = ...;  (c)
   }
 

The following is probably a more frequent case:

i = 0
x = something_weird () or a function parameter
loop
  i = i + 1 
  a[i] = ...
  ... = a[i + x]
endloop

where you end with a symbolic distance vector.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-25 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-25 11:02 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 
 (*) I hope; scev is a mess of mutualy recursive functions --
 analyze_scalar_evolution calling number_of_iterations calling
 instantiate_parameters calling analyze_scalar_evolution again, with
 instantiate_parameters hacked so that the infinite cycle cannot occur is
 my favourite one.  Nobody can say anything sure about behavior of scev
 -- it is not even well defined what analyze_scalar_evolutions will
 return to you, 

It returns to you an evolution that might contain SSA_NAMEs.

 unless you call instantiate_parameters or resolve_mixers
 on the result.
 

And once you call instantiate_parameters on the result you're
guaranteed that what you get does contain only determined constants,
or otherwise the result is chrec_dont_know.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-25 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-25 12:03 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 
 --- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni 
 dot cz  2005-01-25 11:14 ---
 Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)
 
  rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
   Adding the instantiation cache was compile time neutral on normal code,
   so I don't think the effect on scev cost is significant.
   
  
  How that?  You end up querying and caching an evolution for every
  instantiation of an SSA_NAME!  
 
 which is pretty cheap (constant time operation); note that we do an
 equivalent lookup in the analyze_scalar_evolution call in any case, also
 without causing any compile time problems.  I haven't measured any slowdown on
 normal testcases.
 
  I will quantify the number of queries wrt the number of times this
  information is useful.  I think that with my patch, the instantiation
  cache information is used zero times on a bootstrap.
 
 I don't think so.  Please come up with some numbers, otherwise this
 discussion is pointless.
 

during a bootstrap: 

instantiation cache queries: 1146452

instantiation cache uses: 247452 of which 143977 were scev_not_known,
the other were SSA_NAMEs.

this was counted with a grep|wc with the following patch: 

Index: tree-scalar-evolution.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.16
diff -d -u -p -r2.16 tree-scalar-evolution.c
--- tree-scalar-evolution.c 18 Jan 2005 11:36:26 -  2.16
+++ tree-scalar-evolution.c 25 Jan 2005 12:00:57 -
@@ -1898,8 +1898,15 @@ get_instantiated_value (htab_t cache, tr
   pattern.var = version;
   info = htab_find (cache, pattern);
 
+  fprintf (stdout, IC_query \n);
+
   if (info)
-return info-chrec;
+{
+  fprintf (stdout, IC_used_once \n);
+  print_generic_expr (stdout, info-chrec, 0);
+  fprintf (stdout, \n);
+  return info-chrec;
+}
   else
 return NULL_TREE;
 }
@@ -1915,6 +1922,8 @@ set_instantiated_value (htab_t cache, tr
   pattern.var = version;
   slot = htab_find_slot (cache, pattern, INSERT);
 
+  fprintf (stdout, IC_query \n);
+
   if (*slot)
 info = *slot;
   else
@@ -1990,7 +1999,8 @@ instantiate_parameters_1 (struct loop *l
 result again.  */
   bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec));
   res = analyze_scalar_evolution (def_loop, chrec);
-  res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs, 
cache);
+  if  (res != chrec_dont_know)
+   res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs, 
cache);
   bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));
 
   /* Store the correct value to the cache.  */
@@ -2000,8 +2010,14 @@ instantiate_parameters_1 (struct loop *l
 case POLYNOMIAL_CHREC:
   op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
  allow_superloop_chrecs, cache);
+  if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+
   op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
  allow_superloop_chrecs, cache);
+  if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+
   if (CHREC_LEFT (chrec) != op0
  || CHREC_RIGHT (chrec) != op1)
chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1);
@@ -2010,8 +2026,14 @@ instantiate_parameters_1 (struct loop *l
 case PLUS_EXPR:
   op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs, cache);
+  if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+
   op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs, cache);
+  if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+
   if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
chrec = chrec_fold_plus (TREE_TYPE (chrec), op0, op1);
@@ -2020,8 +2042,14 @@ instantiate_parameters_1 (struct loop *l
 case MINUS_EXPR:
   op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs, cache);
+  if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+
   op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs, cache);
+  if (op1 == chrec_dont_know)
+return chrec_dont_know;
+
   if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
 chrec = chrec_fold_minus (TREE_TYPE (chrec), op0, op1);
@@ -2030,8 +2058,14 @@ instantiate_parameters_1 (struct loop *l

[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-25 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-25 12:44 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
 More seriously -- which of the possibilities?  If I have loops like
 
 while (...)
   {
 while (...)
   {
 x_1 = something ();
   }
 x_2 = phi (x_1);
 x_3 = x_2 + 1;
   }
 
 What will analyze_scalar_evolutions return for x_3? There are (at least)
 three possible valid values:
 
 x_3

This would be the answer if analyze_scalar_evolutions would be the
identity function.  If you want, you could change analyze_scalar_evolutions
such that it behaves like that, and decide that the instantiation do
the rest of the work (I mean moving the code that is currently in
analyze_scalar_evolutions to the instantiation phase).

 x_2 + 1

If you decide to reconstruct the tree expression, there is no reason
to stop on a phi node that has a single argument.  Why would you like
to get this answer as the reconstructed tree?

 x_1 + 1
 

IMO this would be the answer, although I didn't checked.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-25 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-25 16:26 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

  If you decide to reconstruct the tree expression, there is no reason
  to stop on a phi node that has a single argument.  Why would you like
  to get this answer as the reconstructed tree?
 
 because this answer preserves loop closed ssa form -- the answer x_1 + 1
 copy propagates the value outsied of the loop.  In some applications
 this choice could make sense.
 

Okay, if it makes sense, you could modify the analyzer such that it
stops reconstructing the tree once it is on the phi following a loop.
This won't change the information provided after the instantiation.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-24 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-24 21:36 ---
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

 
 Still there remain some inefficiences within the scev analysis itself.
 

Zdenek, have you tried to revert the patch that caches the
instantiation information?  This could speed up things a little.  

On a side note, I wonder how many times we're using this instantiation
cache, in other words whether we pay a high price for the scev
analysis for some (probably) rare cases...



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


[Bug tree-optimization/19224] [4.0 regression] Endless loop compiling simple file: Bug in tree-scalar-evolution.c (instantiate_parameters_1)?

2005-01-02 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-02 20:37 ---
(In reply to comment #5)
 Caused by an exponential time complexity of instantiate_parameters.
 I am working on a patch.

The following patch solves the exponential time complexity.

Sorry for the inconvenient.
Sebastian

*** instantiate_parameters_1 (struct loop *l
*** 1946,1960 
 result again.  Avoid the cyclic instantiation in mixers.  */
bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec));
res = analyze_scalar_evolution (def_loop, chrec);
!   res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs);
bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));
return res;
  
  case POLYNOMIAL_CHREC:
op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
  allow_superloop_chrecs);
op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
  allow_superloop_chrecs);
if (CHREC_LEFT (chrec) != op0
  || CHREC_RIGHT (chrec) != op1)
chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1);
--- 2005,2026 
 result again.  Avoid the cyclic instantiation in mixers.  */
bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec));
res = analyze_scalar_evolution (def_loop, chrec);
!   if (res != chrec_dont_know)
!   res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs);
bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));
return res;
  
  case POLYNOMIAL_CHREC:
op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (CHREC_LEFT (chrec) != op0
  || CHREC_RIGHT (chrec) != op1)
chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 1963,1970 
--- 2029,2042 
  case PLUS_EXPR:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
chrec = chrec_fold_plus (TREE_TYPE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 1973,1980 
--- 2045,2058 
  case MINUS_EXPR:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
  chrec = chrec_fold_minus (TREE_TYPE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 1983,1990 
--- 2061,2074 
  case MULT_EXPR:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
chrec = chrec_fold_multiply (TREE_TYPE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 2018,2027 
--- 2102,2120 
  case 3:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op2 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 2),
  allow_superloop_chrecs);
+   if (op2 == chrec_dont_know)
+   return

[Bug tree-optimization/19224] [4.0 regression] Endless loop compiling simple file: Bug in tree-scalar-evolution.c (instantiate_parameters_1)?

2005-01-02 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-02 21:46 ---
(In reply to comment #7)
 Subject: Re:  [4.0 regression] Endless loop compiling simple file: Bug in
tree-scalar-evolution.c (instantiate_parameters_1)?
 
 not really (well, maybe in this particular case, but not in general).
 Your patch definitely helps (and you should submit it anyway), but
 yhe problem is that even with your patch it may happen that
 instantiate_parameters is called recursively several times for one
 ssa name if the expression contains it several times, and if this
 happens for several expressions in a row, you get the exponential
 complexity.  I am just testing the patch below that caches the results
 to avoid this problem.
 

Richtig.  I'm just speeding the chrec_dont_know case, whereas your patch
solves the more general problem of huge expressions that have to be 
instantiated and that don't necessarily lead to unknown evolutions.

Thanks.

PS: I'll test my patch separately and will post it to gcc-patches.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19224


[Bug tree-optimization/19224] [4.0 regression] Endless loop compiling simple file: Bug in tree-scalar-evolution.c (instantiate_parameters_1)?

2005-01-02 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2005-01-02 23:15 ---
Subject: Re:  [4.0 regression] Endless loop compiling simple file: Bug in 
tree-scalar-evolution.c (instantiate_parameters_1)?

rakdver at gcc dot gnu dot org wrote:
 
 --- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-02 
 18:55 ---
 Caused by an exponential time complexity of instantiate_parameters.
 I am working on a patch.
 

The following patch solves the exponential time complexity.

Sorry for the inconvenient.
Sebastian

*** instantiate_parameters_1 (struct loop *l
*** 1946,1960 
 result again.  Avoid the cyclic instantiation in mixers.  */
bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec));
res = analyze_scalar_evolution (def_loop, chrec);
!   res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs);
bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));
return res;
  
  case POLYNOMIAL_CHREC:
op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
  allow_superloop_chrecs);
op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
  allow_superloop_chrecs);
if (CHREC_LEFT (chrec) != op0
  || CHREC_RIGHT (chrec) != op1)
chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1);
--- 2005,2026 
 result again.  Avoid the cyclic instantiation in mixers.  */
bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec));
res = analyze_scalar_evolution (def_loop, chrec);
!   if (res != chrec_dont_know)
!   res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs);
bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));
return res;
  
  case POLYNOMIAL_CHREC:
op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (CHREC_LEFT (chrec) != op0
  || CHREC_RIGHT (chrec) != op1)
chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 1963,1970 
--- 2029,2042 
  case PLUS_EXPR:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
chrec = chrec_fold_plus (TREE_TYPE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 1973,1980 
--- 2045,2058 
  case MINUS_EXPR:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
  chrec = chrec_fold_minus (TREE_TYPE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 1983,1990 
--- 2061,2074 
  case MULT_EXPR:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1 == chrec_dont_know)
+   return chrec_dont_know;
+ 
if (TREE_OPERAND (chrec, 0) != op0
  || TREE_OPERAND (chrec, 1) != op1)
chrec = chrec_fold_multiply (TREE_TYPE (chrec), op0, op1);
*** instantiate_parameters_1 (struct loop *l
*** 2018,2027 
--- 2102,2120 
  case 3:
op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
  allow_superloop_chrecs);
+   if (op0 == chrec_dont_know)
+   return chrec_dont_know;
+ 
op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
  allow_superloop_chrecs);
+   if (op1

[Bug tree-optimization/17100] Missed opportunity for value range optimization

2004-12-19 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  
2004-12-19 23:43 ---
Patch in autovect-branch:
http://gcc.gnu.org/ml/gcc-patches/2004-12/msg01444.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17100


[Bug middle-end/18005] [4.0 Regression] ICE with simple loop with VLA

2004-10-19 Thread sebastian dot pop at cri dot ensmp dot fr

--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2004-10-19 
10:03 ---
Subject: Re:  [4.0 Regression] ICE with simple loop with VLA

Patch is here:

http://gcc.gnu.org/ml/gcc-patches/2004-10/msg01592.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18005