from:"Daniel Berlin"

Re: [Bug tree-optimization/32183] [4.3 Regression] reassoc2 can move extra calculations into a loop

2007-10-10 Thread Daniel Berlin

On 10 Oct 2007 08:58:00 -, steven at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:


 --- Comment #33 from steven at gcc dot gnu dot org  2007-10-10 08:57 
 ---
 What happened with the suggestion to only do this in reassoc2 (see comment
 #27)?


Yeah, i'm not sure why we just made both reassocs more expensive when
we only care what happens with the second.

Re: [Bug c++/33604] [4.3 Regression] Revision 119502 causes significantly slower results with 4.3 compared to 4.2

2007-10-01 Thread Daniel Berlin

I'm not fixing this until someone can tell me what exactly is going
wrong.  There have been *so* many changes to PTA since that revision
that the majority of the code it touched doesn't even do the same
thing anymore.
My guess is that this is a case where adding extra vdefs/vuses made
some dumb optimizer able to see something it can't when the chains are
separate like they should be.


On 1 Oct 2007 21:04:40 -, hjl at lucon dot org
[EMAIL PROTECTED] wrote:


 --- Comment #4 from hjl at lucon dot org  2007-10-01 21:04 ---
 I saw 40% performance regression at -O3 with testcase in comment #1 on
 Linux/x86-64.


 --


 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33604

Re: [Bug c/32575] [4.2/4.3 regression] With -ftree-vrp miscompiles a single line of code in SQLite

2007-09-05 Thread Daniel Berlin

On 28 Aug 2007 15:58:29 -, jakub at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:


 --- Comment #6 from jakub at gcc dot gnu dot org  2007-08-28 15:58 ---
 if (a == 0) a = bar (); isn't necessary either.

 salias has:

   # BLOCK 2 freq:1
   # PRED: ENTRY [100.0%]  (fallthru,exec)
   # VUSE <qD.2026_12(D), SMT.25D.2079_13(D)> { qD.2026 SMT.25D.2079 }
   D.2027_3 = foo ();
   pD.2025_4 = (struct S *) D.2027_3;
   if (pD.2025_4 == 0B)
     goto <bb 3>;
   else
     goto <bb 4>;
   # SUCC: 3 [7.3%]  (true,exec) 4 [92.7%]  (false,exec)

   # BLOCK 3 freq:735
   # PRED: 2 [7.3%]  (true,exec)
   # qD.2026_15 = VDEF <qD.2026_12(D)>
   # SMT.25D.2079_16 = VDEF <SMT.25D.2079_13(D)>
   # SMT.26D.2080_17 = VDEF <SMT.26D.2080_14(D)> { qD.2026 SMT.25D.2079 SMT.26D.2080 }
   __builtin_memset (&qD.2026, 0, 24);
   # SUCC: 4 [100.0%]  (fallthru,exec)

   # BLOCK 4 freq:1
   # PRED: 2 [92.7%]  (false,exec) 3 [100.0%]  (fallthru,exec)
   # qD.2026_11 = PHI <qD.2026_12(D)(2), qD.2026_15(3)>
   # pD.2025_1 = PHI <pD.2025_4(2), &qD.2026(3)>
   # qD.2026_18 = VDEF <qD.2026_11> { qD.2026 }
   pD.2025_1->s1D.2008 = aD.2021_6(D);
   # qD.2026_19 = VDEF <qD.2026_18> { qD.2026 }
   pD.2025_1->s2D.2009 = bD.2022_7(D);

 Shouldn't the VDEFs be a PHI of some SMT and qD?
For VDEF/VUSE, you will never have a PHI of anything other than
multiple versions of the same SMT/virtual variable.

The above looks right to me at a glance.
It is probably pruning the result using TBAA, which is why p->s isn't
thought to access the SMT.

Re: [Bug tree-optimization/33159] [4.3 Regression] wrong VDEF for gcc.target/i386/cmov4.c

2007-08-23 Thread Daniel Berlin

Yes, you are right.
I wasn't thinking clearly

 --- Comment #4 from bonzini at gnu dot org  2007-08-23 14:04 ---
 Hmmm, a store into an int * could not touch nodekind itself, only a store
 into an int ** could.

 Isn't SMT.8 the VDEF saying it could touch *the thing pointed to by nodekind*?

Re: [Bug c++/32900] New: [4.2/4.3 regression] compile time and memory regression

2007-07-25 Thread Daniel Berlin


Points-to memory with these is almost nothing, so don't look at meef.
It looks like size goes up for each function and is not fully
recovered by the time we start the next.

On 25 Jul 2007 22:25:22 -, debian-gcc at lists dot debian dot org
[EMAIL PROTECTED] wrote:

[forwarded from http://bugs.debian.org/431608]

c++ source files generated with sip-qt take much longer (4.2) and much more
memory (4.3) to build, than building with 4.1:

4.1 20070718   0m58.881s  about 400mb
4.2.1 release  86m13.933s  about 400mb
4.3 20070720   14m51.718s  about 1.5gb

built on i486-linux-gnu. 4.1 and 4.2 are built with --enable-checking=release,
4.3 with the setting from trunk.

  Matthias


--
   Summary: [4.2/4.3 regression] compile time and memory regression
   Product: gcc
   Version: 4.2.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: debian-gcc at lists dot debian dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32900

Re: [Bug tree-optimization/32746] [4.3 Regression] tree-ssa-operands int.comp error

2007-07-22 Thread Daniel Berlin


I already submitted a patch for this (see my followup to HP that fixes
valid_gimple_expression_p).
As soon as i can bootstrap on darwin, i will commit it.
If someone wants to do so before me, all you need to do is change
is_gimple_addressable to is_gimple_id in valid_gimple_expression_p

Re: [Bug tree-optimization/32746] [4.3 Regression] tree-ssa-operands int.comp error

2007-07-13 Thread Daniel Berlin


valid_gimple_expression_p claims
((struct RegisterLayout *) (char *) SimulatedRegisters)->intmask;

is valid GIMPLE, when it is not.



On 13 Jul 2007 23:37:00 -, hp at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #4 from hp at gcc dot gnu dot org  2007-07-13 23:36 ---
Also happens for cris-axis-elf and likely other 32-bit platforms.


--

hp at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||hp at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32746

Re: [Bug tree-optimization/32705] [4.3 regression] ICE in set_ssa_val_to, at tree-ssa-sccvn.c:1022

2007-07-11 Thread Daniel Berlin


The only way i can see this happening is if you have a truly
uninitialized variable, or there is something we have missed.

Does this function have cfun->static_chain_decl being used, and we
have a copy of that here?

It is theoretically safe to call set_ssa_val_to with to == vn_top, but
it's probably a bug somewhere, and i'd rather eliminate the bug cases
before turning it off.

On 11 Jul 2007 20:10:10 -, ebotcazou at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #5 from ebotcazou at gcc dot gnu dot org  2007-07-11 20:10 
---
 Can someone paste the output of debug_generic_stmt (to) and
 debug_tree(to) at the point of failure?

(gdb) p debug_tree(to)
 var_decl 0x557f7114 vn_top.181
type void_type 0x55716804 void sizes-gimplified visited VOID
align 8 symtab 0 alias set 36 canonical type 0x55716804
pointer_to_this pointer_type 0x55716870
used ignored VOID file ../c87b26b.adb line 4
align 8
$4 = void
(gdb) p debug_generic_stmt(to)
vn_top.181


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32705

Re: [Bug tree-optimization/32328] [4.2/4.3 Regression] -fstrict-aliasing causes skipped code

2007-07-04 Thread Daniel Berlin


On 4 Jul 2007 03:29:25 -, mmitchel at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--


Just as an update:
I have been working with richi (I code, he tests :P) diligently on a
patch for mainline, and have one that fixes the dealii regression (and
thus, should fix this as well).

Re: [Bug middle-end/30075] Missed optimizations with -fwhole-program -combine

2007-06-26 Thread Daniel Berlin


On 26 Jun 2007 03:10:26 -, acahalan at gmail dot com
[EMAIL PROTECTED] wrote:



--- Comment #4 from acahalan at gmail dot com  2007-06-26 03:10 ---
(In reply to comment #3)
 Subject: Re:  Missed optimizations with -fwhole-program -combine

 I would not expect this to be fixed anytime soon.  I have yet to find
 any real people who use either combine or -fwhole-program.  They use
 *way* too much memory on real programs.  As a result, no real people
 involved in optimization work on optimizers for them.

I'm real, and I want to use those.



That's nice and all, but i still wouldn't expect any work on them
until LTO is finished.
They are useless options right now.
I'd vote to remove them.

Re: [Bug tree-optimization/30052] [4.2 Regression] possible quadratic behaviour.

2007-05-20 Thread Daniel Berlin


On 20 May 2007 04:57:45 -, pluto at agmk dot net
[EMAIL PROTECTED] wrote:



--- Comment #25 from pluto at agmk dot net  2007-05-20 05:57 ---
Subject: Re:  [4.2 Regression] possible quadratic behaviour.




--



Change line 4275 of the patched tree-ssa-structalias.c to be rhs.var =
vi->id instead of rhs.var = id

Remove the id variable declaration.

This would have only affected fortran 



http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30052

Re: [Bug tree-optimization/30052] [4.2 Regression] possible quadratic behaviour.

2007-05-19 Thread Daniel Berlin


On 19 May 2007 14:30:43 -, pluto at agmk dot net
[EMAIL PROTECTED] wrote:



--- Comment #21 from pluto at agmk dot net  2007-05-19 15:30 ---
with this patch gcc works much better.

xf86ScanPci.i : 84MB / ~5sec.
sipQtCorepart0.ii.bz2 : 340MB / ~440sec


There are optimizations that could be made to the 440 seconds if they
are in PTA solving, but they wouldn't really help mainline much, so
i'm not sure if it is worth it.

Re: [Bug tree-optimization/30052] [4.2 Regression] possible quadratic behaviour.

2007-05-19 Thread Daniel Berlin


On 19 May 2007 17:16:35 -, pluto at agmk dot net
[EMAIL PROTECTED] wrote:



--- Comment #23 from pluto at agmk dot net  2007-05-19 18:16 ---
bad news, this patch ices fortran build:

(...)
../../../libgfortran/intrinsics/selected_int_kind.f90:22: internal compiler
error: in process_constraint, at tree-ssa-structalias.c:2260


Meh, send me the file.
This is just a small bug somewhere in the backport.

Re: [Bug libstdc++/29286] [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

2007-05-14 Thread Daniel Berlin


On 14 May 2007 08:25:27 -, rguenth at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #60 from rguenth at gcc dot gnu dot org  2007-05-14 09:25 
---
But it doesn't have a result, does it?  Given that, I wonder how moving stmts
across it is prevented?


Okay, so then it needs an LHS that defines a new SSA name, otherwise,
we'll end up with dead ones everywhere, and they will keep other dead
code alive.

Re: [Bug tree-optimization/30604] Unable to coalesce ssa_names x and y which are marked as MUST COALESCE

2007-03-09 Thread Daniel Berlin


On 8 Mar 2007 20:12:16 -, amacleod at redhat dot com
[EMAIL PROTECTED] wrote:



--- Comment #7 from amacleod at redhat dot com  2007-03-08 20:12 ---
Looking at the original testcase, the complaint is that _t_8232 and _t_3 are
both used in the PHI definition of _t_7.  (using mainline from march 5th)

ie,  _t_7(ab) = PHI , _t_8232, ... , _t_3, ...



Uh, did you not put the (ab) next to the arguments, or do they really
not have SSA_NAME_OCCURS_IN_ABNORMAL_PHI set on them? (They should)


I can't really read the detailed output from FRE, but it does seem to have
replaced a bunch of expressions with _t_3, so that would appear to be the
culprit.


It won't value number things with SSA_NAME_OCCURS_IN_ABNORMAL_PHI set,
so it should never eliminate anything with them.

Re: [Bug tree-optimization/30089] Compiling FreeFem3d uses unreasonable amount of time and memory

2007-01-13 Thread Daniel Berlin


okay, i'll update changelog, submit and commit.

On 13 Jan 2007 23:02:13 -, rguenth at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #21 from rguenth at gcc dot gnu dot org  2007-01-13 23:02 
---
The patch fixed the freefem memory regression.


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30089

Re: [Bug tree-optimization/30089] Compiling FreeFem3d uses unreasonable amount of time and memory

2007-01-09 Thread Daniel Berlin


Try the attached, let me know how it goes.



On 9 Jan 2007 21:17:05 -, rguenth at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #16 from rguenth at gcc dot gnu dot org  2007-01-09 21:17 
---
Pling!


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30089


--- gcc/tree.h	(/mirror/gcc-trunk)	(revision 1114)
+++ gcc/tree.h	(/local/gcc-clean)	(revision 1114)
@@ -2449,10 +2449,14 @@ struct tree_decl_minimal GTY(())
 struct tree_memory_tag GTY(())
 {
   struct tree_decl_minimal common;
+
+  bitmap GTY ((skip)) aliases;
+
   unsigned int is_global:1;
 };
 
 #define MTAG_GLOBAL(NODE) (TREE_MEMORY_TAG_CHECK (NODE)->mtag.is_global)
+#define MTAG_ALIASES(NODE) (TREE_MEMORY_TAG_CHECK (NODE)->mtag.aliases)
 
 struct tree_struct_field_tag GTY(())
 {
--- gcc/tree-ssa-alias.c	(/mirror/gcc-trunk)	(revision 1114)
+++ gcc/tree-ssa-alias.c	(/local/gcc-clean)	(revision 1114)
@@ -90,6 +90,7 @@ struct alias_stats_d
 
 /* Local variables.  */
 static struct alias_stats_d alias_stats;
+static bitmap_obstack alias_bitmap_obstack;
 
 /* Local functions.  */
 static void compute_flow_insensitive_aliasing (struct alias_info *);
@@ -99,7 +100,7 @@ static bool may_alias_p (tree, HOST_WIDE
 static tree create_memory_tag (tree type, bool is_type_tag);
 static tree get_smt_for (tree, struct alias_info *);
 static tree get_nmt_for (tree);
-static void add_may_alias (tree, tree, struct pointer_set_t *);
+static void add_may_alias (tree, tree);
 static struct alias_info *init_alias_info (void);
 static void delete_alias_info (struct alias_info *);
 static void compute_flow_sensitive_aliasing (struct alias_info *);
@@ -194,19 +195,21 @@ static void
 mark_aliases_call_clobbered (tree tag, VEC (tree, heap) **worklist,
 			 VEC (int, heap) **worklist2)
 {
+  bitmap aliases;
+  bitmap_iterator bi;
   unsigned int i;
-  VEC (tree, gc) *ma;
   tree entry;
   var_ann_t ta = var_ann (tag);
 
   if (!MTAG_P (tag))
 return;
-  ma = may_aliases (tag);
-  if (!ma)
+  aliases = may_aliases (tag);
+  if (!aliases)
 return;
 
-  for (i = 0; VEC_iterate (tree, ma, i, entry); i++)
+  EXECUTE_IF_SET_IN_BITMAP (aliases, 0, i, bi)
 {
+  entry = referenced_var (i);
   if (!unmodifiable_var_p (entry))
 	{
 	  add_to_worklist (entry, worklist, worklist2, ta->escape_mask);
@@ -264,7 +267,8 @@ compute_tag_properties (void)
   changed = false;  
   for (k = 0; VEC_iterate (tree, taglist, k, tag); k++)
 	{
-	  VEC (tree, gc) *ma;
+	  bitmap ma;
+	  bitmap_iterator bi;
 	  unsigned int i;
 	  tree entry;
 	  bool tagcc = is_call_clobbered (tag);
@@ -277,8 +281,9 @@ compute_tag_properties (void)
 	  if (!ma)
 	continue;
 
-	  for (i = 0; VEC_iterate (tree, ma, i, entry); i++)
+	  EXECUTE_IF_SET_IN_BITMAP (ma, 0, i, bi)
 	{
+	  entry = referenced_var (i);
 	  /* Call clobbered entries cause the tag to be marked
 		 call clobbered.  */
 	  if (!tagcc && is_call_clobbered (entry))
@@ -508,8 +513,9 @@ sort_mp_info (VEC(mp_info_t,heap) *list)
 static void
 create_partition_for (mp_info_t mp_p)
 {
+  bitmap_iterator bi;
   tree mpt, sym;
-  VEC(tree,gc) *aliases;
+  bitmap aliases;
   unsigned i;
 
   if (mp_p->num_vops <= (long) MAX_ALIASED_VOPS)
@@ -556,11 +562,12 @@ create_partition_for (mp_info_t mp_p)
   else
 {
   aliases = may_aliases (mp_p->var);
-  gcc_assert (VEC_length (tree, aliases) > 1);
+  gcc_assert (!bitmap_empty_p (aliases));
 
   mpt = NULL_TREE;
-  for (i = 0; VEC_iterate (tree, aliases, i, sym); i++)
+  EXECUTE_IF_SET_IN_BITMAP (aliases, 0, i, bi)
 	{
+	  sym = referenced_var (i);
 	  /* Only set the memory partition for aliased symbol SYM if
 	 SYM does not belong to another partition.  */
 	  if (memory_partition (sym) == NULL_TREE)
@@ -614,11 +621,10 @@ rewrite_alias_set_for (tree tag, bitmap 
   else
 {
   /* Create a new alias set for TAG with the new partitions.  */
-  var_ann_t ann;
 
-  ann = var_ann (tag);
-  for (i = 0; VEC_iterate (tree, ann->may_aliases, i, sym); i++)
+  EXECUTE_IF_SET_IN_BITMAP (MTAG_ALIASES (tag), 0, i, bi)
 	{
+	  sym = referenced_var (i);
 	  mpt = memory_partition (sym);
 	  if (mpt)
 	bitmap_set_bit (new_aliases, DECL_UID (mpt));
@@ -627,9 +633,7 @@ rewrite_alias_set_for (tree tag, bitmap 
 	}
 
   /* Rebuild the may-alias array for TAG.  */
-  VEC_free (tree, gc, ann->may_aliases);
-  EXECUTE_IF_SET_IN_BITMAP (new_aliases, 0, i, bi)
-	VEC_safe_push (tree, gc, ann->may_aliases, referenced_var (i));
+  bitmap_copy (MTAG_ALIASES (tag), new_aliases);
 }
 }
 
@@ -691,7 +695,10 @@ compute_memory_partitions (void)
   /* Each reference to VAR will produce as many VOPs as elements
 	 exist in its alias set.  */
   mp.var = var;
-  mp.num_vops = VEC_length (tree, may_aliases (var));
+  if (!may_aliases (var))
+	mp.num_vops = 0;
+  else
+	mp.num_vops = bitmap_count_bits (may_aliases (var));
 
   /* No point grouping singleton alias sets.  */
   if

Re: [Bug libstdc++/29286] [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

2007-01-01 Thread Daniel Berlin


On 1 Jan 2007 00:41:44 -, mark at codesourcery dot com
[EMAIL PROTECTED] wrote:



--- Comment #26 from mark at codesourcery dot com  2007-01-01 00:41 ---
Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

dberlin at gcc dot gnu dot org wrote:

 If we add a placement_new_expr, and not try to revisit our interpretation of
 the standard, we can just DTRT and fix placement new. This would be best for
 optimizations, and IMHO, for users.

I agree that treating placement new specially makes sense.  The first
argument to a placement new operator could be considered to have an
unspecified dynamic type on entrance to the operator, while the return
value has the dynamic type specified by the operator.  (So that the
pointer returned by new (x) int has type int *.)


Right.
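
The behaviour being agreed on here can be illustrated with a small sketch.
This is only an illustration of what the standard is taken to require, not
code from the bug report; the function name reuse_storage_as_int is
hypothetical, and the sketch assumes int and float have the same size and
alignment.

```cpp
#include <new>

// Reuses the storage of a float object for an int via placement new.
// Hypothetical helper, assuming sizeof(int) == sizeof(float).
int reuse_storage_as_int(float* storage, int value) {
  static_assert(sizeof(int) == sizeof(float), "sketch assumes equal sizes");
  *storage = 1.0f;                         // the storage currently holds a float
  int* as_int = new (storage) int(value);  // the storage now holds an int
  return *as_int;                          // read through the pointer placement new returned
}
```

A type-based alias analysis that assumes the float store and the int load
cannot touch the same memory would be free to reorder them, which is the
kind of miscompilation this PR is about.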



I'm not sure that placement_new_expr is the best way to accomplish this,
but, maybe it is.  Another possibility would be to define an attribute
or attributes to specify the dynamic type of arguments and return types,
and then have the C++ front end annotate all placement new operators
with those attributes.

It would be nice if we could transform those attributes on
gimplification to something like an alias-preserving cast (or
something of that nature) that states that the cast is type unioning
for alias purposes (IE that the possible types of the result for
TBAA/etc purposes is the union of the type of the cast and the type of
the cast's operand).
Not a fully fleshed out idea, just something that popped into my head.

Re: [Bug tree-optimization/29922] [4.3 Regression] [Linux] ICE in insert_into_preds_of_block

2006-12-19 Thread Daniel Berlin


I will try to get back to this bug this week. I was fighting some
other fights last week, i apologize.

Re: [Bug libstdc++/30203] New: std::vector::size() 10x speedup (patch)

2006-12-14 Thread Daniel Berlin


And what are the timings with a recent version of g++ and actually
turning on optimization?
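
As a rough sketch of why the optimization level matters for this report:
the two formulations below compute the same value, and the iterator
version only adds thin wrapper calls that an optimizing build would be
expected to inline. ToyVector and its members are hypothetical stand-ins,
not libstdc++'s actual layout.

```cpp
#include <cstddef>

// Toy stand-in for a vector's storage (an assumption for illustration).
struct ToyVector {
  int* start;
  int* finish;

  // Thin iterator wrapper around a raw pointer.
  struct Iter {
    int* p;
    friend std::ptrdiff_t operator-(Iter a, Iter b) { return a.p - b.p; }
  };

  Iter begin() const { return Iter{start}; }
  Iter end() const { return Iter{finish}; }

  // Mirrors "size_type(end() - begin())".
  std::size_t size_via_iterators() const {
    return static_cast<std::size_t>(end() - begin());
  }

  // Mirrors "_M_finish - _M_start".
  std::size_t size_via_pointers() const {
    return static_cast<std::size_t>(finish - start);
  }
};
```

Without optimization each wrapper is a real call with real temporaries, so
a large gap at -O0 does not by itself say much about optimized builds.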


On 13 Dec 2006 17:38:06 -, charles at rebelbase dot com
[EMAIL PROTECTED] wrote:

vector::size() in bits/stl_vector.h is currently implemented as

  size_type
  size() const
  { return size_type(end() - begin()); }

A faster implementation is

  size_type
  size() const
  { return _M_impl._M_finish - _M_impl._M_start; }

Which avoids the temporary iterators' life cycles
and operator- calls.

I tried a simple timing test on both implementations,
and the latter appears to be 10x faster:

(11:35:56)(charles xyzzy)(~): cat test.cc
#include <vector>
int main () {
  std::vector<int> x (100);
  unsigned long l = 0;
  const unsigned long iterations = 1;
  for (unsigned long i=0; i<iterations; ++i)
    l += x.size ();
  return 0;
}
(11:35:58)(charles xyzzy)(~): g++ -o test test.cc -lstdc++
(11:36:05)(charles xyzzy)(~): time ./test

real    0m3.692s
user    0m3.676s
sys     0m0.004s
(11:36:10)(charles xyzzy)(~): cat test2.cc
#include <vector>
int main () {
  std::vector<int> x (100);
  unsigned long l = 0;
  const unsigned long iterations = 1;
  for (unsigned long i=0; i<iterations; ++i)
    l += x._M_impl._M_finish - x._M_impl._M_start;
  return 0;
}
(11:36:13)(charles xyzzy)(~): g++ -o test2 test2.cc -lstdc++
(11:36:19)(charles xyzzy)(~): time ./test2

real    0m0.342s
user    0m0.336s
sys     0m0.004s


--
   Summary: std::vector::size() 10x speedup (patch)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: charles at rebelbase dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30203

Re: [Bug middle-end/30075] Missed optimizations with -fwhole-program -combine

2006-12-05 Thread Daniel Berlin


I would not expect this to be fixed anytime soon.  I have yet to find
any real people who use either combine or -fwhole-program.  They use
*way* too much memory on real programs.  As a result, no real people
involved in optimization work on optimizers for them.

On 5 Dec 2006 19:38:51 -, pinskia at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--

Re: [Bug debug/29792] DWARF: Not all inline concrete instances are being generated

2006-11-14 Thread Daniel Berlin


OK, so I'll have to find another way of using the DWARF info to see if an inline
routine, such as __task_rq_lock was used at all in the build or was just
included in the DWARF info but not referenced anywhere, have to dig more into
the available information...

BTW, if, in these cases, DW_TAG_subroutine is not referenced, what is the
purpose of it being included? Is there a reason my limited knowledge is not
realising?


Well, it is referenced. It did exist in the source, and was inlined.
That's what we output.  DW_TAG_subprogram with no PC range is actually
common.

Because all the inlined instances were optimized away, there are no
DW_TAG_inlined_* entries for them.




--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29792

Re: [Bug debug/29792] DWARF: Not all inline concrete instances are being generated

2006-11-13 Thread Daniel Berlin


On 12 Nov 2006 20:39:43 -, acme at mandriva dot com
[EMAIL PROTECTED] wrote:



--- Comment #5 from acme at mandriva dot com  2006-11-12 20:39 ---
(In reply to comment #4)
 The only thing left from __task_rq_lock is a label.

SNIP

 task_cpu were inlined and we constant proped the value of rq the first of the
 way through the function which we inlined this to.

OK, I thought that this was due to something like what you described, even not
knowing that much about gcc internals, but I thought that even in this case the
DW_TAG_inlined_subroutine would be emitted, or hoped to as it would allow me to
do what I want with my tools :-\


There is nothing to emit debug info about, so we don't.

Re: [Bug debug/29792] DWARF: Not all inline concrete instances are being generated

2006-11-13 Thread Daniel Berlin


On 13 Nov 2006 16:16:50 -, acme at mandriva dot com
[EMAIL PROTECTED] wrote:



--- Comment #8 from acme at mandriva dot com  2006-11-13 16:16 ---
  OK, I thought that this was due to something like what you described, even not
  knowing that much about gcc internals, but I thought that even in this case the
  DW_TAG_inlined_subroutine would be emitted, or hoped to as it would allow me to
  do what I want with my tools :-\

 There is nothing to emit debug info about, so we don't.


Well, at least gcc emits this:

 1a2f2: Abbrev Number: 65 (DW_TAG_subprogram)
 DW_AT_sibling : a324
 DW_AT_name: (indirect string, offset: 0x4515): __task_rq_lock
 DW_AT_decl_file   : 1
 DW_AT_decl_line   : 378
 DW_AT_prototyped  : 1
 DW_AT_type: 9a2f
 DW_AT_inline  : 3  (declared as inline and inlined)

But no DW_TAG_inlined_subroutine, as we've been discussing:

[EMAIL PROTECTED] net-2.6.20]$ readelf -wi ../OUTPUT/qemu/net-2.6.20/kernel/sched.o | grep a2f2
 DW_AT_sibling : a2f2
 1a2f2: Abbrev Number: 65 (DW_TAG_subprogram)
[EMAIL PROTECTED] net-2.6.20]$


I'm quite aware of what GCC outputs here :)

However, past the initial declarations, we don't output debug
information about what the state of the IR is at random points in the
compilation, only about what the final output looks like.  Since there
is no inlined code left, we don't end up saying there is an inlined
subroutine.

Even if we could change this, i'm not sure we'd want to.  It doesn't
seem incorrect at all to do what we do.
Otherwise, you'd end up with inlined subroutine DIEs with no low
pc/high pc associated with them, which seems nonsensical.

Re: [Bug java/29587] jc1: out of memory allocating 4072 bytes after a total of 708630224 bytes

2006-11-09 Thread Daniel Berlin


Can you try the attached and let me know if it fixes it?


fordanglin.diff
Description: Binary data

Re: [Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread Daniel Berlin


A detailed proposal:

So here is what i was thinking of.  When i say symbols below, I mean
some VAR_DECL or structure that has a name (like our memory tags
do).  A symbol is *not* a real variable that occurred in the user
program.  When I say variable i mean a variable that occurred in the
user program.

The real problem with our alias system in terms of precision, and
often in terms of number of useless vops, is that we are trying to use
real, existing, variables, to approximate the portions of the heap a
statement accesses.

When things access portions of the heap we can't see (nonlocal
variables), we fall down badly in terms of precision because we can
eliminate every single local variable as an alias, and need to then
just say it accesses some nonlocal variable.  This causes precision
problems because it means that statements accessing nonlocal variables
that we can *prove* don't interfere, still currently share a VUSE
between them.

We also have useless vops whenever we have points-to sets that
intersect between all statements that interfere, because we end up
adding aliases for you can eliminate the members of the alias set

We also currently rely on operand-scan time pruning, which is very ugly.

There is a way to get the minimal number of vuses/vdefs necessary to
represent completely precise (in terms of info we have) aliasing,
while still being able to degrade the precision gracefully in order to
bring the number of vuses/vdefs down.

The scheme i propose *never* has overlapping live ranges of the
individual symbols, even though the symbols may represent pieces of
the same heap.

In other words, you can rely on the fact that once an individual
symbol has a new version, there will never be a vuse of an old version
of that symbol.



The current vdef/vuse scheme consists of creating memory tags to
represent portions of the heap.  When a memory tag has aliases, we use
its alias list to generate virtual operands.  When a memory tag does
not have aliases, we generate a virtual operand of the base symbol.

The basic idea in the new scheme is to never have a list of aliases
for a symbol representing portions of the heap.  The symbols
representing portions of the heap are themselves always the target of
a vuse/vdef.  The aliases they represent are immaterial (though we can
keep a list if something wants it).

This enables us to have a smaller number of vops, and have something
else generate the set of symbols in a precise manner, rather than have
things like the operand scanner try to post process it.

The symbols are also attached to the load/store statements, and not to
the variables.

The operand renamer only has to add vuses/vdefs for all the symbols
attached to a statement, and it is done.

In the simplest, dumb, non-precise version of this scheme, this means
you only have one symbol, called MEM, and generate vuse/vdefs
linking every load/store together.

In the absolute most-precise version of this scheme, you partition the
loads/store conflicts in statements into symbols that represent
statement conflictingness.

In a completely naive, O(N^3) version, the following algorithm will
work and generate completely precise results:

Collect all loads and stores into a list (lslist)
for each statement in lslist (x):
  for each statement in lslist (y):
    if x conflicts with y:
      if there is no partition for x, y, create a new one containing x and y.
      otherwise
        for every partition y belongs to:
          if all members of this partition have memory access that
          conflicts with x:
            add x to this partition
          otherwise
            create a new partition containing all members of the
            partition except the ones x does not conflict with.
            add x to this partition


This is a very very slow way to do it, but it should be clear (there
are much much much faster ways to do this).

Basically, a single load/store statement can belong to multiple
partitions.  All members of a given partition conflict with each
other.
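
Since every member of a partition conflicts with every other member, one
way to read the scheme is that partitions are the maximal cliques of the
statement conflict graph. The sketch below is not the loop described
above; it swaps in a plain Bron-Kerbosch enumeration to show that reading.
The names StmtSet, grow_partition and conflict_partitions are hypothetical,
statements are assumed to be numbered 0..n-1, and the conflict relation is
assumed symmetric.

```cpp
#include <set>
#include <vector>

// conflicts[i] holds the statements that conflict with statement i
// (assumed symmetric, no self-conflicts).
using StmtSet = std::set<int>;

// Bron-Kerbosch without pivoting: extends `current` by every candidate and
// records it once nothing more can be added.
static void grow_partition(const std::vector<StmtSet>& conflicts,
                           StmtSet current, StmtSet candidates,
                           StmtSet excluded, std::vector<StmtSet>& out) {
  if (candidates.empty() && excluded.empty()) {
    // Statements that conflict with nothing get no partition here.
    if (current.size() > 1) out.push_back(current);
    return;
  }
  const StmtSet snapshot = candidates;
  for (int v : snapshot) {
    StmtSet next_current = current;
    next_current.insert(v);
    StmtSet next_candidates, next_excluded;
    for (int u : candidates)
      if (conflicts[v].count(u)) next_candidates.insert(u);
    for (int u : excluded)
      if (conflicts[v].count(u)) next_excluded.insert(u);
    grow_partition(conflicts, next_current, next_candidates, next_excluded, out);
    candidates.erase(v);
    excluded.insert(v);
  }
}

// Returns every maximal set of mutually conflicting statements.
std::vector<StmtSet> conflict_partitions(const std::vector<StmtSet>& conflicts) {
  std::vector<StmtSet> out;
  StmtSet all;
  for (int i = 0; i < static_cast<int>(conflicts.size()); ++i) all.insert(i);
  grow_partition(conflicts, {}, all, {}, out);
  return out;
}
```

For four statements where 0, 1 and 2 all conflict with each other and 3
conflicts only with 2, this yields the two partitions {0, 1, 2} and {2, 3}.
Clique enumeration is exponential in the worst case, so this is only a way
to see what the partitions are, not a proposal for how to compute them.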

given the following set of memory accesses statements:

a, b, c, d

where:
a conflicts with b and c
b conflicts with c and d
c conflicts with a and b
d conflicts with a and c

you will end up with 3 partitions:
part1: {a, b, c}
part2: {b, c, d}
part3: {d, a, c}

statement c will conflict with every member of partition 1 and thus
get partition 1, rather than a new partition.

You now create symbols for each partition, and for each statement in
the partition, add the symbol to its list.

Thus, in the above example we get

statement a - symbols: MEM.PART1, MEM.PART3
statement b - symbols: MEM.PART1, MEM.PART2
statement c - symbols: MEM.PART1, MEM.PART2, MEM.PART3
statement d - symbols MEM.PART2, MEM.PART3

As mentioned before, the operand renamer simply adds a vdef/vuse for
each symbol in the statement list.
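
The step from partitions to per-statement symbol lists can be sketched as
follows. The function name symbols_per_statement and the use of single
characters as statement ids are assumptions made for illustration; only
the MEM.PARTn naming comes from the example above.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Gives each partition a symbol named MEM.PART<n> (1-based) and attaches
// that symbol to every statement in the partition.
std::map<char, std::vector<std::string>>
symbols_per_statement(const std::vector<std::set<char>>& partitions) {
  std::map<char, std::vector<std::string>> symbols;
  for (std::size_t i = 0; i < partitions.size(); ++i) {
    const std::string name = "MEM.PART" + std::to_string(i + 1);
    for (char stmt : partitions[i]) symbols[stmt].push_back(name);
  }
  return symbols;
}
```

Fed the three partitions listed earlier, this reproduces the table above:
statement c ends up with all three symbols and the others with two each.
The renamer would then emit one vdef/vuse per entry in a statement's list.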

Note that this is the minimal number of symbols necessary to precisely
represent the conflicting accesses.

If the number of partitions grows

Re: [Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread Daniel Berlin



Memory SSA brings down the number of virtual operators to exactly one per
statement.


However, it does so in a way that makes the traditional things that
actually want to do cool memory optimizations, harder.

I'm still on the fence over whether it's a good idea or not.



 verified before we introduce a million new bugs with mem-ssa (nothing
 personal, it simply is too large and too intrusive change not to bring
 any).

Intrusive?  Well, the only pass that was wired to the previous virtual operator
scheme was PRE.  DSE is also wired but to a lesser extent.  No other
optimization had to be changed for mem-ssa.  It's obviously intrusive in the
renamer, but that's it.


Uh, LIM and store sinking are too.  Roughly all of our memory optimizations are.

The basic problem is in mem-ssa that vdefs and vuses don't accurately
reflect what symbols are being defined and used anymore.  They
represent the factoring of a use and definition of a whole bunch of
symbols.

Things like PRE and DSE break not because they are wired to the
previous virtual operator scheme so much, but because they rely on
the virtual use/def chains accurately representing where a symbol
representing a memory access dies.  In mem-ssa, you have VDEF's of the
same symbol all over the place.

The changes i have to make to PRE (and to the other things) to account
for this is actually to rebuild the non-mem-ssa-factored (IE the
current factored) form out of the chains by seeing what symbols they
really affect.

This is going to be expensive, and IMHO, is what almost all of our SSA
memory optimizations are going to have to do.

So while mem-ssa doesn't affect *precision*, it does affect how you
can use the chains in a very significant way.

For at least all the opts i see us doing, it makes them more or less
useless without doing things (like reexpanding them) first. Because
this is true, I'm not sure it's a good idea at all, which is why i'm
still on the fence.

Re: [Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread Daniel Berlin


In mem-ssa, you have VDEF's of the
same symbol all over the place.


 version of a symbol

Re: [Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-06 Thread Daniel Berlin


Zdenek, can you revert your patch  until we fix this?
It might be a month or two before i get back to it.

(Yeah, i know it sucks to have to do this, but)

On 6 Nov 2006 15:12:30 -, hjl at lucon dot org
[EMAIL PROTECTED] wrote:



--- Comment #14 from hjl at lucon dot org  2006-11-06 15:12 ---
I checked gcc 4.3. The same source code, which is miscompiled in gcc from
SPEC CPU 2006, is there. It is most likely that gcc 4.3 is also miscompiled
and now generating wrong unwind/debug info, if not wrong instructions.


--

hjl at lucon dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2006-11-06 15:12:29
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680

Re: [Bug java/29587] jc1: out of memory allocating 4072 bytes after a total of 708630224 bytes

2006-11-05 Thread Daniel Berlin


On 5 Nov 2006 21:22:24 -, dave at hiauly1 dot hia dot nrc dot ca
[EMAIL PROTECTED] wrote:



--- Comment #7 from dave at hiauly1 dot hia dot nrc dot ca  2006-11-05 
21:22 ---
Subject: Re:  jc1: out of memory allocating 4072 bytes after a total of
708630224 bytes

 Can you bzip2 compress -fdump-tree-alias-vops-details-blocks-stats (it's going
 to be very large) and put it somewhere for me?

The files are here: ftp://hiauly1.hia.nrc.ca/outgoing/berlin/.


Thanks!
So this ends up being what i thought.  The variables aren't being
collapsed, but i can't figure out why (IE it can't prove they are the
same).  This causes it to give them separate solution bitmaps, and the
solutions are very large, and involve thousands of variables, so
thousands * thousands = a lot of memory.


However, all of these variables should collapse, as they do in the
earlier functions.
They also collapse on my machine on this testcase (which admittedly
has different code there).
It is, in fact, *incredibly* strange that not a single variable is
collapsed or unified in this function.

I'm not sure what to do here.
Can you poke around in perform_var_substitution and see if you can
figure out what conditions are causing all the variables to fail out
of the collapsing.  Particularly, roughly every variable that has a
constraint like foo = ESCAPED_VARS in the dump should be getting
collapsed to ESCAPED_VARS.



Re: [Bug java/29587] jc1: out of memory allocating 4072 bytes after a total of 708630224 bytes

2006-11-04 Thread Daniel Berlin



The change on the 19th caused a significant increase in memory
consumption http://gcc.gnu.org/ml/gcc-patches/2006-10/msg01029.html
and java bootstrap failures on s390, s390x and ia64.  See this
thread http://gcc.gnu.org/ml/gcc-patches/2006-10/msg01058.html.


Except that all of these were fixed in the followup patch and a later
typo fix, *including* the memory usage (see honza's tester).

Re: [Bug tree-optimization/14784] [Tree-ssa] alias analysis deficiency

2006-10-31 Thread Daniel Berlin


Details, source, etc needed.


On 31 Oct 2006 15:02:02 -, hjl at lucon dot org
[EMAIL PROTECTED] wrote:



--- Comment #10 from hjl at lucon dot org  2006-10-31 15:02 ---
It miscompiles dwarf2out.c in gcc in SPEC CPU 2006.

Re: [Bug tree-optimization/29585] [4.2/4.3 Regression] tree check: expected ssa_name, have var_decl in is_old_name, at tree-into-ssa.c:558

2006-10-25 Thread Daniel Berlin


On 25 Oct 2006 05:23:00 -, pinskia at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #4 from pinskia at gcc dot gnu dot org  2006-10-25 05:22 ---
_ZTCN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE0_13basic_ostream

#   _ZTI13basic_ostream = V_MAY_DEF _ZTI13basic_ostream_16;
#   _ZTIN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE = V_MAY_DEF
_ZTIN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE_17;
#   _ZTCN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE0_13basic_ostream =
V_MAY_DEF
_ZTCN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE0_13basic_ostream;
#   _ZTSN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE = V_MAY_DEF
_ZTSN33_GLOBAL__N_t.cc__2292CFAC11NullostreamE;
#   _ZTVN10__cxxabiv120__si_class_type_infoE = V_MAY_DEF
_ZTVN10__cxxabiv120__si_class_type_infoE;
#   _ZTI8ios_base = V_MAY_DEF _ZTI8ios_base;
#   _ZTS13basic_ostream = V_MAY_DEF _ZTS13basic_ostream;
#   _ZTVN10__cxxabiv121__vmi_class_type_infoE = V_MAY_DEF
_ZTVN10__cxxabiv121__vmi_class_type_infoE;
#   _ZTS8ios_base = V_MAY_DEF _ZTS8ios_base;
#   _ZTVN10__cxxabiv117__class_type_infoE = V_MAY_DEF
_ZTVN10__cxxabiv117__class_type_infoE;
#   SFT.5 = V_MAY_DEF SFT.5;
#   SFT.6 = V_MAY_DEF SFT.6;
#   SFT.7 = V_MAY_DEF SFT.7;
#   SFT.8 = V_MAY_DEF SFT.8;
#   SFT.9 = V_MAY_DEF SFT.9;
#   NONLOCAL.15 = V_MAY_DEF NONLOCAL.15;
this_9->_vptr.basic_ostream = iftmp.1_13;




Uh, this is pretty weird.
*all* of these should have been marked for renaming, not just NONLOCAL.

Re: [Bug tree-optimization/25737] ACATS c974001 c974013 hang with struct aliasing

2006-09-24 Thread Daniel Berlin


On 24 Sep 2006 18:23:41 -, ebotcazou at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #37 from ebotcazou at gcc dot gnu dot org  2006-09-24 18:23 
---
 No, really, you don't seem to understand.
 If you respect these DECL_NONADDRESSABLE_P or
 TYPE_NONALIASED_COMPONENT flags, you are going to make them unaliased.
 Your whole bug report is that they are not aliased and should be, and
 that the loads and stores currently don't interfere but should.

I think I understand your viewpoint: the name of TYPE_NONALIASED_COMPONENT
and DECL_NONADDRESSABLE_P seems to imply that setting them would always
result in less V_MAY_DEF's in the code.  But...


The name, and all the documentation, which say they cannot be
addressed, which means they cannot be pointed to by any pointer, which
means they are unaliased.


 Diego, the short summary is that Eric has some Ada testcases where we
 end up with less V_MAY_DEF's than he thinks should.  He believes that
 respecting these flags, which specify you cannot form the address of a
 certain component, etc, will somehow cause him to end up with more
 aliasing and fix his testcase by anything other than luck.

...that's not so simple.  If you look at how these flags work in GCC 3.x,
you'll see that setting them has some impact on the alias sets used to access
memory references, via can_address_p and the MEM_KEEP_ALIAS_SET_P flag.
In GCC 4 dialect, this would result in different V_MAY_DEF's, not less.


If so, then you've both hacked around something more fundamental, and
the documentation of all these flags doesn't actually match what you
really mean.




I'm not saying that this is a sane design or that we should try to replicate
it in GCC 4, I'm just saying that for the time being struct aliasing totally
overlooks this mechanism and doesn't work for Ada because of that.

Okay, and i'm saying i don't plan on accepting fixes that appear to
hack around well accepted infrastructure to try to fix symptoms.
Really. That's all.  I'm not going to approve patches that randomly
skip fields because it seems to get the right result sometimes.  If
you want to try to explain what all this is actually trying to do, i'm
happy to work with you to come up with a sane solution.

Re: [Bug tree-optimization/28944] New: tree-dce incorrectly removes an assignment.

2006-09-03 Thread Daniel Berlin


asm volatile
(
"push   %1  \n\t"
"call   *%0 \n\t"
"add    $4, %%esp   \n\t"
:
: "r" ( test ), "r" ( x )
);

asm statements are not allowed to alter control flow
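
If the goal is just to call through the pointer, one option is to let the compiler emit the call itself. A minimal sketch, assuming test is a pointer to a function taking one int (the typedef, the helper names, and the record_arg callee are made up for illustration; the original report doesn't show them):

```c
#include <assert.h>

/* Hypothetical signature for the function the asm was calling. */
typedef void (*callback_fn)(int);

static int last_arg = -1;

/* Hypothetical callee, only here so the sketch can be exercised. */
static void record_arg(int x)
{
  last_arg = x;
}

/* Plain C call through the pointer: the compiler sees the control
   transfer and handles argument passing and stack cleanup itself. */
void call_through_pointer(callback_fn test, int x)
{
  test(x);
}

/* Self-check: the callee must have run with the argument we passed. */
void check_call_through_pointer(void)
{
  call_through_pointer(record_arg, 42);
  assert(last_arg == 42);
}
```

Calling check_call_through_pointer() from main exercises it; since the call is ordinary C, the optimizers can see every store the callee might do.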

Re: [Bug tree-optimization/28937] [4.2 regression] ICE in add_virtual_operand, at tree-ssa-operands.c:1309

2006-09-03 Thread Daniel Berlin


Why does loop change the SMT usage?
In addition, since there are times loop doesn't do anything, you
should simply be returning PROP_smt_usage when it does do something,
and nothing otherwise.

On 4 Sep 2006 03:52:04 -, pinskia at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:



--- Comment #4 from pinskia at gcc dot gnu dot org  2006-09-04 03:52 ---
Note the patch is:
Index: tree-ssa-loop.c
===
--- tree-ssa-loop.c (revision 116671)
+++ tree-ssa-loop.c (working copy)
@@ -405,9 +405,11 @@ struct tree_opt_pass pass_complete_unrol
   TV_COMPLETE_UNROLL,  /* tv_id */
   PROP_cfg | PROP_ssa, /* properties_required */
   0,   /* properties_provided */
-  0,   /* properties_destroyed */
+  PROP_smt_usage,  /* properties_destroyed */
   0,   /* todo_flags_start */
-  TODO_dump_func | TODO_verify_loops,  /* todo_flags_finish */
+  TODO_dump_func
+| TODO_verify_loops
+| PROP_smt_usage,  /* todo_flags_finish */
   0/* letter */
 };


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28937

Re: [Bug tree-optimization/28798] remove_phi_node attempts removal of a phi node resized by resize_phi_node

2006-08-25 Thread Daniel Berlin

hosking at cs dot purdue dot edu wrote:
 --- Comment #13 from hosking at cs dot purdue dot edu  2006-08-24 15:27 
 ---
 Is this enough?
 
 Here is the dump output, followed by stack traces at the resize and remove
 points (the remove goes on to fail). 

So, this edge can't exist.
Note:

 Its src is:
 
 (gdb) p *(e->src)
 $12 = {
   index = 0, 
 }
 
 Its dest is:
 
 (gdb) p *(e->dest)
 $13 = {
   index = 0, 
 }
 

It claims to be an edge from block 0 to block 0, but according to
your dump, block 0 is not a successor of block 0 (IE it is not a self loop).

--Dan

Re: [Bug tree-optimization/15452] [tree-ssa] Optimize cascaded a = a == 0;

2006-08-24 Thread Daniel Berlin

pinskia at gcc dot gnu dot org wrote:
 --- Comment #6 from pinskia at gcc dot gnu dot org  2006-08-24 04:27 
 ---
 Another interesting case would be (but which could be handled by VRP):
 int
 foo (int a)
 {
   a = a!=0;
   a = a!=0;
   a = a!=0;
   a = a!=0;
   a = a!=0;
   return a;
 }
 Which should be optimized to:
 int foo(int a) { return a!=0;}
 
 
Uh, FRE could also optimize this to the same thing, I just don't
remember whether it bothers to look at conditionals as eliminable
expressions.
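
For reference, the equivalence being asked for is easy to state in plain C. A small sketch (the function names are made up here) showing that the cascade and the single comparison agree, which is what any pass doing this would have to prove:

```c
#include <assert.h>
#include <limits.h>

/* The cascaded form from the comment: after the first statement a is
   already 0 or 1, so every later statement leaves it unchanged. */
int cascaded(int a)
{
  a = a != 0;
  a = a != 0;
  a = a != 0;
  a = a != 0;
  a = a != 0;
  return a;
}

/* The form it should be optimized to. */
int collapsed(int a)
{
  return a != 0;
}

/* Self-check over a handful of interesting inputs. */
void check_cascade(void)
{
  int samples[] = { INT_MIN, -1, 0, 1, 2, INT_MAX };
  unsigned n;

  for (n = 0; n < sizeof samples / sizeof samples[0]; n++)
    assert(cascaded(samples[n]) == collapsed(samples[n]));
}
```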

Re: [Bug tree-optimization/28798] remove_phi_node attempts removal of a phi node resized by resize_phi_node

2006-08-23 Thread Daniel Berlin

hosking at cs dot purdue dot edu wrote:
 --- Comment #7 from hosking at cs dot purdue dot edu  2006-08-23 22:29 
 ---
 This is with the Modula-3 backend.  I am porting it to 4.1.1 and encountered
 this problem with -O3 turned on.
 
 
Does 4.1 have the check for EDGE_CRITICAL_P in insert_aux?

If not, that is the problem.

Re: [Bug tree-optimization/28798] remove_phi_node attempts removal of a phi node resized by resize_phi_node

2006-08-23 Thread Daniel Berlin

hosking at cs dot purdue dot edu wrote:
 --- Comment #11 from hosking at cs dot purdue dot edu  2006-08-24 00:57 
 ---
 (In reply to comment #9)
 Does 4.1 have the check for EDGE_CRITICAL_P in insert_aux?
 
 Yes:
 
   /* This can happen in the very weird case
  that our fake infinite loop edges have caused a
  critical edge to appear.  */
   if (EDGE_CRITICAL_P (pred))
 {
   cant_insert = true;
   break;
 }
 
 

Honestly, there should be no other case in which the edge actually needs
to be split.  It is just a shortcut rather than trying to figure out whether we
want the beginning of the succ or the end of the pred (it figures it out
for us).

If you could attach the dump from
-fdump-tree-crited-vops-details-blocks-stats, and tell me what pred,
src, and block are, that would be helpful.

Without more, it's either something *very* strange in the code modula3
is creating (or broken gimplification), *or* the edge inserter is
confused and believes it needs to create a block in a case it doesn't.

Re: [Bug tree-optimization/28798] remove_phi_node attempts removal of a phi node resized by resize_phi_node

2006-08-22 Thread Daniel Berlin

pinskia at gcc dot gnu dot org wrote:
 --- Comment #2 from pinskia at gcc dot gnu dot org  2006-08-22 06:17 
 ---
 We should never have needed resize_phi_node inside PRE and resize_phi_node also
 does an exact replacement so that means you are keeping a reference to the old
 PHI node when adding an edge which is wrong.
 
 
PRE never directly calls resize_phi_node
The insert_on_edge call PRE makes should *never* cause the number of
predecessors to change, so i can't see why resize_phi_node would ever be
called.

Without an example case where it does, i can't debug this further.

However, it's not wrong to keep a reference to a phi node when a
random edge in the program changes.  The API that doesn't allow such a
thing is just broken. This is a symptom of the fact that our phi node
arguments are stored in pretend vectors, even though it would be saner
to use an embedded vec in that structure.  This would allow reallocating
arguments without having to change the entire phi node structure.

Re: [Bug tree-optimization/28643] redundant phi-node in latch-block prevents vectorization

2006-08-08 Thread Daniel Berlin

pinskia at gcc dot gnu dot org wrote:
 --- Comment #1 from pinskia at gcc dot gnu dot org  2006-08-08 01:47 
 ---
 SSA copy prop with dce after that should really be the correct way.
 

Err, SSA copy prop should be enough, actually, since after copy-prop,
the phi will have no users (and they shouldn't care about code with no
uses that doesn't access memory).

Though it's interesting that this redundant phi survives so long. What
is creating it?

Re: [Bug c/28073] Type-punned pointer passed as function parameter generates bad assembly sequence

2006-06-19 Thread Daniel Berlin

sorenj at us dot ibm dot com wrote:
 --- Comment #2 from sorenj at us dot ibm dot com  2006-06-19 16:44 ---
 Changing just one line of the test program to the (AFAIK) legal C code.  By
 casting through void *, we are addressing Andrew's concerns about violating 
 the
 C rules. 

No you aren't.
The only thing that matters is what the type of the dereferenced pointer
is, not the intermediate casts.

For example,

int *foo;
float b;
float *c;

b = 5.0;
foo = (int *)&b;
c = (float *)foo;
printf("%f\n", *c);


is legal.
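
To make the distinction concrete, here is a hedged sketch (function names are invented, and it assumes float and uint32_t have the same size and that a float is suitably aligned for an int pointer, which holds on the usual targets). The first function is the round trip above; the second shows one well-defined way to look at the bytes when that is what you actually want:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the example above: the object is a float and the
   final dereference is through a float *, so the detour is harmless. */
float read_via_roundtrip(float *p)
{
  int *foo = (int *)p;
  float *c = (float *)foo;
  return *c;
}

/* Inspecting the representation without dereferencing a pointer of
   the wrong type: copy the bytes instead. */
uint32_t float_bits(float f)
{
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits;
}

/* Inverse of float_bits, again via memcpy. */
float bits_to_float(uint32_t bits)
{
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

void check_punning(void)
{
  float b = 5.0f;

  assert(sizeof(float) == sizeof(uint32_t));   /* assumption stated above */
  assert(read_via_roundtrip(&b) == 5.0f);
  assert(bits_to_float(float_bits(b)) == b);
}
```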


 
   Foo *pFoo = *(Foo **) ((void *)longPtr); /* // BAD! */

Still not legal.

 
 eliminates the type-punned warning, even at the highest possible warning 
 level,
 and continues to generate code that results in a bad return value.  This test
 case illustrates that this problem is actually worse than we originally
 thought, as now incorrect code is generated without any warning.

We can't issue warnings in every case because it is impossible to detect
every case.

We could probably issue a warning in this case.

Re: [Bug tree-optimization/28003] [4.2 Regression] optimizer bug

2006-06-13 Thread Daniel Berlin

pinskia at gcc dot gnu dot org wrote:
 --- Comment #2 from pinskia at gcc dot gnu dot org  2006-06-13 04:41 
 ---
 Hmm, we get after dce, just:
   reduced_cell_two_folds[26] = {};
 
 And DCE removes:
   this_616 = reduced_cell_two_folds[26].u;
 
   #   SMT.68_1055 = V_MAY_DEF SMT.68_1054;
   this_616->elems[0] = 1;
   #   SMT.68_1056 = V_MAY_DEF SMT.68_1055;
   this_616->elems[1] = 0;
   #   SMT.68_1057 = V_MAY_DEF SMT.68_1056;
   this_616->elems[2] = 0;
 ...
   this_621 = reduced_cell_two_folds[26].h;
 ...
   #   SMT.68_1058 = V_MAY_DEF SMT.68_1057;
   this_621->elems[0] = 2;
   #   SMT.68_1059 = V_MAY_DEF SMT.68_1058;
   this_621->elems[1] = 1;
   #   SMT.68_1060 = V_MAY_DEF SMT.68_1059;
   this_621->elems[2] = 1;
 
 
 Which does not make sense.  Nothing is special in alias shows what is going
 wrong.
 
 

The only thing i can think of is that SMT.68 is not marked global.
Is it?

Re: [Bug target/27855] reassociation pass produces ~30% slower matrix multiplication code

2006-06-02 Thread Daniel Berlin

steven at gcc dot gnu dot org wrote:
 --- Comment #4 from steven at gcc dot gnu dot org  2006-06-02 23:19 
 ---
 Real bug, despite Andrew's usual portion of x86-hate.
 
 

It'd be good to know what exactly is going wrong.
Reassociation only touches floating point because someone asked me to
make it touch floating point.

It still shouldn't have *this* much of an effect; my guess is it is
triggering some bad behavior elsewhere.

Re: [Bug middle-end/27445] create_tmp_var_raw (gimplify.c) inadventently asserts 'volatile' on temps

2006-05-05 Thread Daniel Berlin


 I haven't looked into the rev. history, to see why/when this fix was made,
 but will ask the hypothetical: was this fix made to workaround the
 misbehavior in create_tmp_var_raw()?  Note that create_tmp_var_raw()
 is exported from gimplify.c and appears to be called from quite a few
 places.  The question arises: what are the preconditions for calling
 create_tmp_var_raw()?  If you want to assert that it uses whatever
 type was passed in and all the callers have to remove qualifiers
 as necessary that's fine, but requires some knowledge of the original
 intent behind create_tmp_var_raw() and the assumptions its callers make.
 I'd be tempted to add an assert that the type passed in has no qualifiers
 if that is a pre-condition.
 
Compiler temporaries we generate explicitly have the same qualifiers as
the expression they are generated from.  This is by design.

Re: [Bug tree-optimization/26304] [4.2 Regression] 25_algorithms/prev_permutation/1.cc on powerpc{64,}-linux and powerpc-darwin

2006-04-23 Thread Daniel Berlin

On Sun, 2006-04-23 at 23:14 +, pinskia at gcc dot gnu dot org wrote:
 
 --- Comment #17 from pinskia at gcc dot gnu dot org  2006-04-23 23:14 
 ---
 Rewriting that loop like:
 [kudzu:local/trunk/gcc] pinskia% svn diff tree-ssa-loop-niter.c
 Index: tree-ssa-loop-niter.c
 ===
 --- tree-ssa-loop-niter.c   (revision 113199)
 +++ tree-ssa-loop-niter.c   (working copy)
 @@ -1939,6 +1939,7 @@ scev_probably_wraps_p (tree type, tree b
tree unsigned_type, valid_niter;
tree base_plus_step, bpsps;
int cps, cpsps;
 +  bool known_not_to_wrap;
 
/* FIXME: The following code will not be used anymore once
   http://gcc.gnu.org/ml/gcc-patches/2005-06/msg02025.html is
 @@ -2077,8 +2078,10 @@ scev_probably_wraps_p (tree type, tree b
 
estimate_numbers_of_iterations_loop (loop);
   for (bound = loop->bounds; bound; bound = bound->next)
 -if (proved_non_wrapping_p (at_stmt, bound, type, valid_niter))
 -  return false;
 +if (!proved_non_wrapping_p (at_stmt, bound, type, valid_niter))
 +  known_not_to_wrap = false;
 +  if (known_not_to_wrap)
 +   return false;
 
/* At this point we still don't have a proof that the iv does not
   overflow: give up.  */
 

known_not_to_wrap may be uninitialized at the if statement here.
You need to init it to true.
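
Roughly, the loop shape with the flag initialized looks like this. It is only a sketch of the pattern: struct bound and its proved_non_wrapping field are stand-ins invented here for the real loop bounds and the proved_non_wrapping_p () call, not the actual tree-ssa-loop-niter.c types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the loop's list of bounds. */
struct bound {
  bool proved_non_wrapping;   /* stand-in for proved_non_wrapping_p () */
  struct bound *next;
};

/* Returns true only if every bound in the list was proved not to wrap. */
bool all_bounds_non_wrapping(const struct bound *bounds)
{
  bool known_not_to_wrap = true;   /* must start out true */
  const struct bound *b;

  for (b = bounds; b; b = b->next)
    if (!b->proved_non_wrapping)
      known_not_to_wrap = false;

  return known_not_to_wrap;
}

void check_all_bounds(void)
{
  struct bound second = { true, NULL };
  struct bound first = { true, &second };

  assert(all_bounds_non_wrapping(&first));
  second.proved_non_wrapping = false;
  assert(!all_bounds_non_wrapping(&first));
}
```

Without the initializer the flag is read uninitialized whenever no bound fails the check, which is exactly the case the patch cares about.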

Re: [Bug tree-optimization/27140] Compiling LLVM now takes nearly 5x as long with 4.1 as it did with 4.0

2006-04-13 Thread Daniel Berlin



On Apr 13, 2006, at 1:30 PM, rspencer at x10sys dot com wrote:




--- Comment #6 from rspencer at x10sys dot com  2006-04-13  
20:30 ---

Created an attachment (id=11261)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11261&action=view)
Timing results with -fno-tree-salias

Andrew Pinskia suggested that I try -fno-tree-salias. This decreased
compilation time by about 10% (244 secs vs 265 secs).


Only by virtue of the fact that you have a smaller number of phi nodes.
It's not going to give an order of magnitude improvement here.

Re: [Bug tree-optimization/19590] IVs with the same evolution not eliminated

2006-04-08 Thread Daniel Berlin

 --- Comment #10 from stevenb dot gcc at gmail dot com  2006-04-08 21:13 
 ---
 Subject: Re:  IVs with the same evolution not eliminated
 
  The new SCC value numberer for PRE i'm working on gets this case right (and
  this is in fact, one of the advantages of SCC based value numbering).
 
 Is the SCC-VN patch I posted long ago still of some use to you, or are
 you writing something new from scratch?
I ended up rewriting it from scratch, for other reasons.

In particular
1. I keep separate hash tables for unary, binary, references, and phi
expressions, each with their own structure
This is because you really want valuized structures in the hash table.
Your implementation will get the wrong answers during optimistic lookup
at times, because the value representative for a phi argument can change
and will get hashed to the wrong value.
2. I keep track of what expressions simplified to, and whether they have
constants in the simplified expression.   This enables much more
simplification than simply storing the value number name.

In particular, in something like
int main(int argc)
{
  int a;
  int b;
  int c;
  int d;
  a = argc + 4;
  b = argc + 8;
  c = a  b;
  d = a + 4;
  return c + d;
}

We will prove that d and b have the same value.

BTW, you missed the part of the thesis where he explains that phi nodes
in different blocks can't be congruent to each other (this isn't quite
true, but it's a much harder property to prove).

3. I needed the structures i made so i could directly transform the
results into value handles.
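
To spell out the b/d equivalence from point 2: d is (argc + 4) + 4, which simplifies to argc + 8, the same expression as b. A tiny sketch (function names invented, and it only checks values, it is not the value numberer) that confirms the two computations agree:

```c
#include <assert.h>

/* b from the example: argc + 8, computed directly. */
int value_of_b(int argc)
{
  int b = argc + 8;
  return b;
}

/* d from the example: a + 4, where a = argc + 4.  Keeping track of
   what a simplified to, including its constant, is what lets this
   be rewritten as argc + 8. */
int value_of_d(int argc)
{
  int a = argc + 4;
  int d = a + 4;
  return d;
}

void check_same_value(void)
{
  int argc;

  for (argc = 0; argc < 100; argc++)
    assert(value_of_b(argc) == value_of_d(argc));
}
```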

Re: [Bug tree-optimization/27056] New: ICE in loop_depth_of_name

2006-04-06 Thread Daniel Berlin

On Thu, 2006-04-06 at 11:49 +, jakub at gcc dot gnu dot org wrote:
 On the attached testcase with today's gcc-4_1-branch
 -m32 -g -O2 I get ICE during copy propagation.  Unfortunately, even doing 
 minor
 changes in different routines makes the problem go away.
 What I see in the dumps is:
 1) at *t26.ssa, in draw_digit, there are two SSA_NAMEs with version 2:

This is already wrong :)

Re: [Bug tree-optimization/26944] [4.1/4.2 Regression] -ftree-ch generates worse code

2006-03-31 Thread Daniel Berlin


 Compare pretmp.28_49 with pretmp.32_11, why are the arguments in a different
 order? Is there something unstable in the PRE algorithm?
 

No, we just call fold on the expressions we build, and whatever it gives
us, we use :)

Re: [Bug tree-optimization/26781] [4.2 Regression] ICE in tree-ssa-pre.c at create_component_ref_by_piec

2006-03-21 Thread Daniel Berlin

On Tue, 2006-03-21 at 15:02 +, malitzke at metronets dot com wrote:
 
 --- Comment #5 from malitzke at metronets dot com  2006-03-21 15:02 
 ---
 The two if (tree_code(genop) == VALUE_HANDLE) at lines 2190 of tree-ssa-pre.c
 look suspicious to me.
 
 
They aren't suspicious at all.

Re: [Bug tree-optimization/26726] -fivopts producing out of bounds array refs

2006-03-17 Thread Daniel Berlin

On Fri, 2006-03-17 at 12:40 +, mueller at gcc dot gnu dot org wrote:
 
 --- Comment #2 from mueller at gcc dot gnu dot org  2006-03-17 12:40 
 ---
 one possible workaround would be to lower the ARRAY_REF's to indirect mem 
 refs,
 which I don't track
 
 

Uh, no.
We are in fact, trying to do the exact opposite in the future (keep
things array ref as long as possible)

Re: [Bug tree-optimization/26626] [4.2 Regression] ICE in in add_virtual_operand

2006-03-09 Thread Daniel Berlin

On Thu, 2006-03-09 at 22:54 +, pinskia at gcc dot gnu dot org wrote:
 
 --- Comment #3 from pinskia at gcc dot gnu dot org  2006-03-09 22:54 
 ---
 The difference between copyprop and before is the following.
 Before:
   rv.0_3 = rv.0_2;
   #   VUSE NMT.7_13;
   D.1900_4 = rv.0_3->d;
 
 After:
 rv.0_3 = rv.0_2;
 #   VUSE SMT.6;
 D.1900_4 = rv.0_2->d;


This is nonsensical, and very bad.

Re: [Bug tree-optimization/26608] New: address of local variables are said to escape even though it is obvious they don't

2006-03-08 Thread Daniel Berlin

On Wed, 2006-03-08 at 18:59 +, pinskia at gcc dot gnu dot org wrote:
 Testcase:
 int *d1;
 int g(int *b)
 {
   d1 = b;
 }
 int f(int a, int b, int c)
 {
   int i, j;
   int *d;
   if (a)
 d = &i;
   else
 d = &j;
   i = 2;
   j = 3;
   g(&b);
   if (i!=2)
link_error();
   if (j!=3)
link_error();
   return *d;
 }
 int main(void)
 {
   f(1, 2,3);
   return 0;
 }
 
 This should link with optimize but right now i and j are said to be call
 clobbered for some reason.

What does the dump say?
My guess is that it believes that they are returned from the call, even
though they are not.

Re: [Bug tree-optimization/26443] [4.2 regression] ICE in add_virtual_operand, at tree-ssa-operands.c:1867

2006-02-24 Thread Daniel Berlin

On Fri, 2006-02-24 at 13:06 +, pinskia at gcc dot gnu dot org wrote:
 
 --- Comment #2 from pinskia at gcc dot gnu dot org  2006-02-24 13:06 
 ---
 Confirmed.  Though VRP2 is just doing constant propagation at this point.
 
 

Last time i looked at a bug like this, it was actually some other pass
not rescanning operands when it should have.

Re: [Bug fortran/26444] gfortran does not compile cp2k

2006-02-23 Thread Daniel Berlin

On Thu, 2006-02-23 at 18:37 +, jb at gcc dot gnu dot org wrote:
 
 --- Comment #2 from jb at gcc dot gnu dot org  2006-02-23 18:37 ---
 I have the current CVS of cp2k, it fails with
 
 gfortran  -c -O3 -g -ffast-math -fomit-frame-pointer message_passing.f90
 ...
 message_passing.f90: In function 'mp_perf_env_create':
 message_passing.f90:58: internal compiler error: in add_virtual_operand, at
 tree-ssa-operands.c:1867
 
 Confirmed.
 
 And yes, it seems cp2k is a good testsuite for modern Fortran features.
 
 

This assert means some pass changed TMT usage without the right update
flags.
Andrew, can you try to figure out what pass did this (it should be
relatively simple to see what the last pass touching the statement in
question is).

Re: [Bug tree-optimization/14784] [Tree-ssa] alias analysis deficiency

2006-02-16 Thread Daniel Berlin

On Thu, 2006-02-16 at 21:40 +, pinskia at gcc dot gnu dot org wrote:
 
 --- Comment #4 from pinskia at gcc dot gnu dot org  2006-02-16 21:40 
 ---
 We get:
   # bitmap_free_7 = PHI bitmap_free_1(4), bitmap_free_6(5);
 L0:;
 
   # bitmap_free_1 = PHI bitmap_free_7(3), bitmap_free_2(2);
 L4:;
   #   VUSE bitmap_free_1;
   D.1534_4 = head_3->using_obstack;
   if (D.1534_4 != 0) goto L1; else goto L0;
 
 L1:;
   #   bitmap_free_6 = V_MUST_DEF bitmap_free_1;
   bitmap_free = elt_5;
   goto bb 3 (L0);
 
 I cannot figure out why Daniel's recent patches did not fix this one.

Probably the !POINTER_TYPE_P check

Re: [Bug tree-optimization/8361] [4.1/4.2 regression] C++ compile-time performance regression

2006-02-11 Thread Daniel Berlin


 Flags: -O3
 
 GCC 4.0 (release branch today):
 real0m24.412s   0m25.000s   0m24.771s
 user0m23.921s   0m24.430s   0m24.210s
 sys 0m0.368s0m0.408s0m0.420s
 
 GCC 4.1 (release branch today):
 real0m33.260s   0m33.140s   0m33.188s
 user0m32.602s   0m32.522s   0m32.554s
 sys 0m0.556s0m0.544s0m0.600s
 
 GCC 4.2 (trunk today):
 real0m36.544s   0m36.614s   0m36.492s
 user0m35.950s   0m35.942s   0m35.994s
 sys 0m0.544s0m0.600s0m0.464s
 
 
 Significant compile time sinks in GCC 4.1 that don't appear in GCC 4.0:
  tree PTA  :   2.31 ( 7%) usr
  tree SSA incremental  :   2.14 ( 6%) usr
  expand:   1.71 ( 5%) usr
 

So, could you do me a favor if you get a chance, and change the macro
DONT_PROPAGATE_WITH_ANYTHING to 1 in tree-ssa-structalias.c, and see if
it speeds it up at all?

Re: [Bug tree-optimization/24169] Address (full struct) escapes even though the called function does not cause it to escape

2006-01-03 Thread Daniel Berlin

On Sun, 2006-01-01 at 00:41 +, pinskia at gcc dot gnu dot org wrote:
 
 --- Comment #1 from pinskia at gcc dot gnu dot org  2006-01-01 00:41 
 ---
 Just a clarification here, I just want the SFT for k.j to be considered call
 clobbered for this testcase.
 
This is not anywhere near as easy as you think it is.

In fact, we used to only call clobber k.j.  Because our standards
experts tell us that doing pointer arithmetic magic to get back to k.i
is legal, we could only consider this function to clobber *just* k.j if
the pointer doesn't escape from f, *and* f does not do any pointer
arithmetic on its arguments.

This is usually *not* the case, making this testcase more or less not
interesting at all.
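
For anyone wondering what kind of pointer arithmetic is meant, here is a sketch of the usual idiom. The struct layout is a guess for illustration (the email only names the fields k.i and k.j), and this shows the pattern the compiler has to allow for, not a ruling on what the standard permits:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; only the field names come from the discussion. */
struct K {
  int i;
  int j;
};

/* Given only a pointer to the j field, step back to the enclosing
   struct and store to its i field. */
void store_to_i_via_j(int *pj, int value)
{
  struct K *whole = (struct K *)((char *)pj - offsetof(struct K, j));
  whole->i = value;
}

void check_store_to_i_via_j(void)
{
  struct K k = { 1, 2 };

  store_to_i_via_j(&k.j, 99);
  assert(k.i == 99);   /* k.i changed although only &k.j was passed */
  assert(k.j == 2);
}
```

A callee like store_to_i_via_j is why passing &k.j can't, in general, be treated as clobbering just k.j.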

Re: [Bug rtl-optimization/24762] [killloop-branch] code motion of non-invariant expressions with hard registers.

2005-11-09 Thread Daniel Berlin

On Wed, 2005-11-09 at 23:45 +, steven at gcc dot gnu dot org wrote:
 
 --- Comment #10 from steven at gcc dot gnu dot org  2005-11-09 23:45 
 ---
 Actually, flow.c does get it right.

Okay, then df.c on dataflow branch should get it right too.

Re: [Bug tree-optimization/24694] New: Address taken and addressable variables and call clobber

2005-11-06 Thread Daniel Berlin

On Sun, 2005-11-06 at 15:46 +, pinskia at gcc dot gnu dot org wrote:
 Take the following code:
 int f(int);
 int g(void)
 {
   int i;
   int *iptr = &i;
   int **ipp = &iptr;
   **ipp = 1;
   f(i);
   return **ipp;
 }
 --
 Here we consider i being call clobber because we lose the fact that iptr is
 addressable 

 but we don't look to see if its address escapes at all (which in
 this case it does not).
No, we don't actually.
In fact, that's not even close to what happens.

iptr isn't renamed, and thus, we assume the address taking of i and storage 
into iptr is the same as a global store, 
because we know nothing about unrenamed variables.

Re: [Bug rtl-optimization/8361] [3.4/4.0/4.1 regression] C++ compile-time performance regression

2005-10-12 Thread Daniel Berlin

On Thu, 2005-10-13 at 03:34 +, pinskia at gcc dot gnu dot org wrote:
 
 --- Comment #57 from pinskia at gcc dot gnu dot org  2005-10-13 03:34 
 ---
 A semi recent 4.1 (the 10th) gives:
  tree PTA  :   1.60 ( 6%) usr   0.02 ( 1%) sys   1.73 ( 6%) wall  
 10338 kB ( 1%) ggc
  tree alias analysis   :   1.32 ( 5%) usr   0.19 (10%) sys   1.48 ( 5%) wall  
 18910 kB ( 3%) ggc
 
 while 4.0 gave:
  tree PTA  :   0.50 ( 2%) usr   0.00 ( 0%) sys   0.48 ( 2%) wall
  tree alias analysis   :   0.73 ( 3%) usr   0.00 ( 0%) sys   0.76 ( 3%) wall
 
 So this is definitely a 4.1 regression.
 
 

I'm pretty sure we run PTA more times in 4.1 than 4.0
Maybe i'm wrong.
Can you oprofile this and give me some kind of hotspot to look into in
PTA?

Re: [Bug libgcj/24170] [SECURITY] readdir_r considered harmful

2005-10-02 Thread Daniel Berlin




On Sun, 2 Oct 2005, ben at decadentplace dot org dot uk wrote:




--- Comment #1 from ben at decadentplace dot org dot uk  2005-10-02 23:16 
---
Can someone please remove this from public view, as Mozilla does for security
bugs  on their Bugzilla?


Unlike mozilla, we do not remove security bugs from public view.
Nobody has ever set a policy for gcc that says we should (IE 
without taking a position on the merits of whether we should have such a policy, we 
don't).

Re: [Bug tree-optimization/24146] Optimizes away FPU control word store

2005-09-30 Thread Daniel Berlin

On Fri, 2005-09-30 at 13:58 +, rearnsha at gcc dot gnu dot org
wrote:
 --- Additional Comments From rearnsha at gcc dot gnu dot org  2005-09-30 
 13:58 ---
 (In reply to comment #1)
  volatile is needed here.
 
 No, the manual says:
 An @code{asm} instruction without any output operands will be treated
 identically to a volatile @code{asm} instruction.
 
 So this insn should be kept even though it isn't explicitly volatile.
 

Then i guess we should teach the FE to just mark them volatile, so we
don't have to worry about this in the middle end.

Re: [Bug tree-optimization/24146] [4.0 Regression] Optimizes away FPU control word store

2005-09-30 Thread Daniel Berlin

On Fri, 2005-09-30 at 14:07 +, pinskia at gcc dot gnu dot org wrote:
 --- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-30 
 14:07 ---
 I still say this is invalid.
 

well, that just makes you wrong.

the docs clearly say it's supposed to be treated as volatile.

Re: [Bug tree-optimization/24001] Simple redundancy not eliminated

2005-09-22 Thread Daniel Berlin

On Thu, 2005-09-22 at 08:31 +, rguenth at gcc dot gnu dot org wrote:
 --- Additional Comments From rguenth at gcc dot gnu dot org  2005-09-22 
 08:31 ---
 load-pre should sink the load and fix the problem at the tree level.

Uh, load PRE doesn't sink loads, it would lift it.

Re: [Bug middle-end/23672] Fold does not fold (a^b)^a to b

2005-09-16 Thread Daniel Berlin

On Sat, 2005-09-17 at 02:12 +, pinskia at gcc dot gnu dot org wrote:
 --- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-17 
 02:12 ---
 Confirmed.
 

The new reassoc should take care of this
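
The identity itself is just xor being associative and commutative with a ^ a == 0, so (a ^ b) ^ a == (a ^ a) ^ b == b. A quick sketch (function names made up) that checks it on a few values:

```c
#include <assert.h>
#include <limits.h>

/* The unfolded expression from the bug title. */
unsigned xor_twice(unsigned a, unsigned b)
{
  return (a ^ b) ^ a;
}

void check_xor_twice(void)
{
  unsigned samples[] = { 0u, 1u, 2u, 0x55555555u, 0xdeadbeefu, UINT_MAX };
  unsigned n = sizeof samples / sizeof samples[0];
  unsigned x, y;

  /* For every pair, the a terms cancel and only b is left. */
  for (x = 0; x < n; x++)
    for (y = 0; y < n; y++)
      assert(xor_twice(samples[x], samples[y]) == samples[y]);
}
```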

Re: [Bug tree-optimization/23386] [4.1 Regression] bitmap.c is being miscompiled (VRP)

2005-08-14 Thread Daniel Berlin

On Sun, 2005-08-14 at 17:32 +, pinskia at gcc dot gnu dot org wrote:
 --- Additional Comments From pinskia at gcc dot gnu dot org  2005-08-14 
 17:32 ---
 Here is something which is a little more reduced:
 int f[100];
 int g[100];
 unsigned char
 f1 (int a, int b)
 {
   unsigned ix;
   if (a == b)
 return 1;
   for (ix = 4; ix--;)
   if (f[ix] != g[ix])
 return 0;
   return 1;
 }
 
 int main(void)
 {
   if (!f1 (1, 2))
 __builtin_abort();
   return 0;
 }
 
 
The SSA version used in the pointer arithmetic doesn't wrap. The other
SSA versions do.
We can't afford to simply assume that everything wraps, or else we can't
calculate the number of iterations on pretty much any loop.
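
To make the wrapping concrete, here is a small sketch built around the `for (ix = 4; ix--;)` idiom from the reduced test case. The helper name and its out-parameter are hypothetical; the point is only that the counter itself legitimately wraps when the loop exits.

```c
#include <limits.h>

/* Counts the iterations of the countdown idiom and reports the value
   ix holds once the loop has finished. */
unsigned count_down(unsigned start, unsigned *final_ix)
{
    unsigned ix;
    unsigned trips = 0;

    for (ix = start; ix--;)
        trips++;

    /* The final test sees ix == 0 (false) and then still decrements,
       so ix wraps around to UINT_MAX.  That is well defined for
       unsigned types. */
    *final_ix = ix;
    return trips;
}
```

Starting from 4, the loop body runs four times and ix ends up as UINT_MAX, so any range derived for ix has to allow for that wrap.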

Re: [Bug tree-optimization/23361] Can't eliminate empty loops with power of two step and variable bounds

2005-08-12 Thread Daniel Berlin

On Fri, 2005-08-12 at 19:10 +, pinskia at gcc dot gnu dot org wrote:
 --- Additional Comments From pinskia at gcc dot gnu dot org  2005-08-12 
 19:10 ---

Personally, i think -funsafe-loop-optimizations should be on by default
in -O3, with a warning for when we rely on it.

It's *incredibly* rare that a user actually intends for a loop counter
to be able to overflow.

Re: [Bug libstdc++/23278] SJLJ-exceptions broken

2005-08-09 Thread Daniel Berlin




On Tue, 9 Aug 2005, jacob dot navia at ants dot com wrote:



--- Additional Comments From jacob dot navia at ants dot com  2005-08-09 
19:57 ---
If I can't mix SJLJ exceptions with DWARF2 exceptions how this is supposed to
work?

How is what supposed to work?


I mean I have to rebuild all libraries including libc, libm, and whatever

Yes.



This can't be.


It is.


 Besides, why this mixing should lead to the address of a
function being stored in the high 32 bits of a 64 bit address?
Possibly because it's attempting to read the wrong place as if it 
was an unwind table, and gets confused.


k

Re: [Bug java/1427] gcj should generate N_MAIN stab or DW_AT_entry_point dwarf2 debug info

2005-08-08 Thread Daniel Berlin

On Tue, 2005-08-09 at 04:11 +, woodzltc at sources dot redhat dot
com wrote:
 --- Additional Comments From woodzltc at sources dot redhat dot com  
 2005-08-09 04:11 ---
 OK. I had some time and would like to have a look into this, and I found 
 something inconsistent. My founding is listed below, wishing that it can help 
 clarify the situation a little:  
  
 1. Someone mentioned DW_AT_entry_point in above comments. It should be a typo 
 IMHO.  In DWARF standard, there is no such an attribute named 
 DW_AT_entry_point, but there does exist a tag named DW_TAG_entry_point. 
  
 2. Seen from the DWARF standard, DW_TAG_entry_point doesn't live to act as 
 what was supposed to do.  Section-3.3 of DWARF-3 standard (Subroutine and 
 Entry Point Entries) says: 

DWARF3 is not quite standardized yet.
But it's weeks away.

  
 DW_TAG_entry_point    A Fortran alternate entry point 

Yes, well, i can bring it up if you want, but it seems the right way to
describe your entry points.

  
 Although I am not very sure about what it means by alternate entry point.  
 But I believe that it is not to represent the entry point in the final 
 executable.  

This is wrong, at least for fortran.

  
 3. I had a browsing over the DWARF standard, didn't found anything that is 
 the 
 same as N_MAIN in stabs.  Maybe we can suggest DWARF to add such a tag?  Any 
 comments? 
  

Please add an issue on dwarf.freestandards.org and i'll take it from
there.

 Regards 
 - Wu Zhou

Re: [Bug c++/23278] New: SJLJ-exceptions broken

2005-08-07 Thread Daniel Berlin

On Sun, 2005-08-07 at 19:50 +, jacob dot navia at ants dot com
wrote:
 We have a program (c++) that needs c++ SJLJ exceptions. We have built all
 compilers from 3.3.1 to 3.3.6 and they all have the same bug:

 In the first throw that the program does, we get an exception in the runtime

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 1166014832 (LWP 24573)]
 parse_lsda_header (context=0x457f6978,
 p=0xd5a040 <Address 0xd5a040 out of bounds>, info=0x457f6900)
 at ../../../../gcc-3.3.6/libstdc++-v3/libsupc++/eh_personality.cc:62
 62        lpstart_encoding = *p++;
 


You can't mix SJLJ exceptions and dwarf2 exceptions, which is what happened 
here, AFAICT

Re: [Bug c++/22602] New: I can't enter a bug here

2005-07-21 Thread Daniel Berlin

On Fri, 2005-07-22 at 00:57 +, jacob dot navia at ants dot com
wrote:
 Because there is a size limitation to 64K in this software.
 I prepared a single file with no includes that faithfully reproduced the bug:
 bug0.cpp: In member function 'double AtomicDouble::CompareExchange(double, 
 double) volatile':
 bug0.cpp:4999: internal compiler error: in create_tmp_var, at gimplify.c:368
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See <URL:http://gcc.gnu.org/bugs.html> for instructions.
 
 This took me hours. THEN, I entered here the file.
 
 This software told me when I pressed the submit button that
 my stuff was bigger than 64K, then IT DISCARDED ALL MY INPUT.
 
 NICE.
 
 I have worked like 3 hours more but the file size went down from 350k to 162K
 only. It is becoming increasingly difficult to reduce the size.
 
 In this times, *ANY* include directive will produce file sizes of more than
 64K.
 
 Why this stupid limitation?
 

Uh, because we want you to *attach the file*, not *paste it into the
comments*.

Click create new attachment

Why the heck would we want to see 65k of text in the comments of a bug?

Re: [Bug tree-optimization/22376] PTA is slow on a silly unrealistic test case

2005-07-14 Thread Daniel Berlin

On Thu, 2005-07-14 at 17:13 +, pinskia at gcc dot gnu dot org wrote:
 --- Additional Comments From pinskia at gcc dot gnu dot org  2005-07-14 
 17:13 ---
 Confirmed, patch here: 
 http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00918.html.
 

I'm waiting for mainline to settle a bit before committing to make sure
we don't cause more problems.

Re: Someone introduced a libiberty crashing bug in the past week

2005-06-20 Thread Daniel Berlin

On Mon, 2005-06-20 at 16:05 +, Joseph S. Myers wrote:
 On Mon, 20 Jun 2005, Daniel Berlin wrote:
 
  The crash line is 
  3729  if (pedantic && !DECL_IN_SYSTEM_HEADER (fundecl))
  
  Here, fundecl is null.
 
 Any problem with fundecl being null should also be reproducible with a 
 call through a function pointer where fundecl would never have been set to 
 non-null anyway.  Restoring
 
   fundecl = function;
 
 in the if (TREE_CODE (function) == FUNCTION_DECL) part of 
 build_function_call should fix the particular ICE, but the problem with 
 function pointers should still get a PR filed.

I'll do this

Re: [Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2005-05-22 Thread Daniel Berlin

On Sun, 2005-05-22 at 19:36 +, rakdver at gcc dot gnu dot org wrote:
 --- Additional Comments From rakdver at gcc dot gnu dot org  2005-05-22 
 19:36 ---
 Because do_something does not have to return, therefore
 get_type2 does not necessarily have to be executed.
 In this case we cannot move the call to get_type2 from
 the loop (since do_something could for example initialize
 some table used internally by get_type2).
 

This is wrong.
do_something can't write.
it's const.

Re: [Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2005-05-22 Thread Daniel Berlin

 .
 
 Nevertheless, even if we are very strict with the definition, moving
 get_type2 out of the loop is not a good idea, since get_type2 might
 potentially be very expensive (and we have no way how to determine
 that this is not the case), thus we would lose in case get_type2
 should be never executed.
 
 

Don't we attempt to detect zero trip loops?
(If not, we should :P)

Re: [Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2005-05-22 Thread Daniel Berlin

On Sun, 2005-05-22 at 21:13 +, rakdver at atrey dot karlin dot mff
dot cuni dot cz wrote:
 --- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni 
 dot cz  2005-05-22 21:13 ---
 Subject: Re:  missed optimization due with const function and pulling out of 
 loops
 
   Nevertheless, even if we are very strict with the definition, moving
   get_type2 out of the loop is not a good idea, since get_type2 might
   potentially be very expensive (and we have no way how to determine
   that this is not the case), thus we would lose in case get_type2
   should be never executed.
   
   
  
  Don't we attempt to detect zero trip loops?
  (If not, we should :P)
 
 I don't see how this is relevant to the PR.
 

Uh, you claimed we won't move get_type2 out, even if it is const,
because it might not normally execute.

If we can't prove we don't execute the loop, you should move it out.
Otherwise, your logic would hold for get_type1 just the same, which we
*do* move out of the loop.

IOW, there is no reason to move get_type1 out but not get_type2
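
A rough sketch of the transformation under discussion follows. The original test case is not shown in this thread, so `scale`, `sum_in_loop` and `sum_hoisted` are hypothetical stand-ins, and the code assumes a compiler that understands GCC's `const` function attribute.

```c
/* Hypothetical stand-in for a function marked const: its result
   depends only on its argument and it reads no global memory. */
__attribute__((const)) static int scale(int x)
{
    return x * 3 + 1;
}

/* As written by the user: the call sits inside the loop. */
int sum_in_loop(const int *v, int n, int k)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i] * scale(k);
    return total;
}

/* After hoisting: the call is evaluated once, and only when the loop
   is known to run at least once, so a zero-trip loop never pays for
   the call. */
int sum_hoisted(const int *v, int n, int k)
{
    int total = 0;
    if (n > 0) {
        const int s = scale(k);
        for (int i = 0; i < n; i++)
            total += v[i] * s;
    }
    return total;
}
```

The `n > 0` guard is the zero-trip check mentioned earlier in the thread: with it in place, hoisting never executes a call that the original program would have skipped.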

Re: [Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2005-05-22 Thread Daniel Berlin

On Sun, 2005-05-22 at 21:36 +, rakdver at gcc dot gnu dot org wrote:
 --- Additional Comments From rakdver at gcc dot gnu dot org  2005-05-22 
 21:36 ---
 Do you still believe we should move gettype2 out of the loop???

Okay, let's compromise.
If i make cgraph do noreturn and infinite loop detection, so that we
know everything we possibly can about do_something and gettype2, and
we detect neither for do_something, are you still going to claim
that we shouldn't move it out of the loop?

ISTM that presuming a call in a loop is incredibly expensive is
wrong, when that call is const. Your case seems the very extreme corner
case, not the common case.

People mark const on simple calls (remember, const can't read from
anything but readonly memory), not huge monster calls that do lots of
stuff.
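
For readers unfamiliar with the distinction being argued about, here is a minimal sketch using GCC's `const` and `pure` function attributes. The table and function names are hypothetical.

```c
/* Ordinary writable global data. */
static int table[4] = { 10, 20, 30, 40 };

/* pure: no side effects, but the result may depend on global memory,
   so a call can only be reused while that memory is unchanged. */
__attribute__((pure)) int lookup(int i)
{
    return table[i & 3];
}

/* const: the result depends on the arguments alone, so two calls with
   the same argument can always be merged or hoisted. */
__attribute__((const)) int square(int x)
{
    return x * x;
}

/* Changes what lookup(i) returns; square() is unaffected by design. */
void set_entry(int i, int value)
{
    table[i & 3] = value;
}
```

A function like `lookup` must not be marked const: after `set_entry` changes the table, a previously computed result is stale, which is exactly the situation a const function is never allowed to be in.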

Re: [Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2005-05-22 Thread Daniel Berlin

On Sun, 2005-05-22 at 21:51 +, rakdver at atrey dot karlin dot mff
dot cuni dot cz wrote:
 --- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni 
 dot cz  2005-05-22 21:50 ---
 Subject: Re:  missed optimization due with const function and pulling out of 
 loops
 
  const is different from pure, const cannot read from memory.
 
 this is something that have been discussed many times; some people like
 the definition with behaves like if (that enables you for example to
 cache or precompute the results of the function) more, and it is used in
 several existing programs.  Anyway, the argument that the function may
 be costly is valid regardless of whether you want to strictly enforce
 the no memory access constraint, or whether you use the more useful
 definition.
 

These people are strictly wrong, and will in fact get burned by the new
pure/const detection (which is better about recursive calls).

We shouldn't let people who have the wrong definition of const get in
the way of optimization

Re: [Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2005-05-22 Thread Daniel Berlin

 on the other hand, we should not let the definition make the concept
 useless.  Being able to make

The definition actually matches what other compilers call isolated (no
access to global variables)  combined with the property called
side-effect free (calling multiple times with same parameters is same
as calling once).
We could of course, split these concepts if we wanted to.


 
 int something(int i)
 {
   static int a[100];
 
   if (a[i] == 0)
 a[i] = somewhat_slow_computation;
   return a[i];
 }
 
 const is fairly useful.

 
 Anyway, moving possibly non-executed const function may cause also
 other problems.  Consider
 
 int my_fancy_divide(int x, int z) attribute(const)
 {
   return x / z;
 }
 
 while (...)
   {
 if (z != 0)
   x = my_fancy_divide (x,z)
   }

Uh, this may be const, but you can't move these out anyway because the
value of the parameters has changed, so i'm not sure what you are going
for.
 Thus you would also have to require the const function to be total.
 Making const still more and more useless.
 
 

const has a very specific definition already. Moving get_type2 out of
the loop is consistent with that definition.

Re: [Bug tree-optimization/13761] [tree-ssa] component refs to the same struct should not alias

2005-04-23 Thread Daniel Berlin

On Sat, 2005-04-23 at 16:52 +, steven at gcc dot gnu dot org wrote:
 --- Additional Comments From steven at gcc dot gnu dot org  2005-04-23 
 16:52 ---
 Will the second part of the struct alias merge fix Dann's original 
 test case?  (http://gcc.gnu.org/wiki/Structure Aliasing Part II) 
 

Yes, but not immediately.

structure aliasing part ii is really two parts

First is a new alias analyzer to handle structure fields and allow
interprocedural analysis.
Second is improving our representation to handle base+offset
dereferences.

The second is a lot harder than one would think.

Re: [Bug middle-end/20674] unexpected result from floating compare

2005-03-28 Thread Daniel Berlin

On Mon, 2005-03-28 at 23:05 +, piaget at us dot ibm dot com wrote:
 --- Additional Comments From piaget at us dot ibm dot com  2005-03-28 
 23:05 ---
 323 compares 2 values across a function call ... something a programmer can 
 reasonably consider. My problem occurs with 2 successive lines of code 
 (admittedly with 2 compares per line). I don't have a problem that the value 
 of the variable changes after precision truncation ... but it seems like a bug 
 that the compiler uses a full precision value for the 1st test and a 
 truncated value for the 2nd test (the 2nd test being the next line of C++ code).

Except, the value could have been spilled and reloaded from registers
between those two source lines, which on x86, is where the problem comes
from.
The problem is no different simply because the *source* lines happen to
be right next to each other.
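
One common user-side workaround, sketched here under the assumption that the extra precision comes from values living in x87 registers, is to force both operands through memory before comparing them. The helper names are hypothetical. GCC's -ffloat-store option aims at a similar effect for variables.

```c
/* Forces a value through a double-sized memory slot, so any excess
   register precision is rounded away before the value is used. */
double round_to_double(double x)
{
    volatile double slot = x;   /* the store rounds to 64-bit double */
    return slot;
}

/* Compares two values only after both have been rounded the same way,
   so the result no longer depends on which one happened to be spilled. */
int same_after_rounding(double a, double b)
{
    return round_to_double(a) == round_to_double(b);
}
```

This does not change what the compiler is allowed to do; it only makes both sides of the comparison go through the same truncation.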

Re: [Bug rtl-optimization/20376] The missed-optimization of general induction variables in the new rtl-level loop optimizer cause performance degradation.

2005-03-07 Thread Daniel Berlin

On Tue, 2005-03-08 at 03:18 +, pinskia at physics dot uc dot edu
wrote:
 --- Additional Comments From pinskia at physics dot uc dot edu  
 2005-03-08 03:18 ---
 Subject: Re:  The missed-optimization of general induction variables in the 
 new rtl-level loop optimizer cause performance degradation.
 
 
 On Mar 7, 2005, at 10:16 PM, Diego Novillo wrote:
 
  pinskia at gcc dot gnu dot org wrote:
 
  Why isn't the tree level loop IV-OPTs doing this?
  Because variable i is static.
 
 I think you commenting on the wrong bug.

In swim, most of the loop bounds are accessed through the COMMON block,
which is a structure.

Re: [Bug tree-optimization/20134] New: 176.gcc miscompare with -m64 after DOM change

2005-02-21 Thread Daniel Berlin

On Tue, 2005-02-22 at 00:12 +, janis at gcc dot gnu dot org wrote:
 The SPEC CPU2000 test 176.gcc has been failing on powerpc64-*-linux-gnu
 with -m64 -O1 since this patch was added:
   
  
 
 2004-10-23  Daniel Berlin  [EMAIL PROTECTED]
   
  
 
 * tree-ssa-dom.c (record_equality): Use loop depth to determine
 which way to record the equality as well.
 (loop_depth_of_name): New function.


This can't be the real cause of the problem, however; it must just be
exposing a latent bug.
It just changes the direction we record the equality, so that we will
use one variable instead of another.
The code still believes both variables to be equal.
In other words, there is something in record_equality that isn't
correct, or some pass later on is now doing something wrong as a result.

Can you print out the values of x, y, and prev_x we are passing to
record_const_or_copy_1 in record_equality before and after the patch,
for that function?

Re: [Bug tree-optimization/14741] missing transformations lead to poorly optimized code

2005-01-28 Thread Daniel Berlin


On Fri, 28 Jan 2005, jv244 at cam dot ac dot uk wrote:
--- Additional Comments From jv244 at cam dot ac dot uk  2005-01-28 
16:31 ---
You could try gfortran -O3 -mtune=pentium4  -ffast-math -mfpmath=sse
-ftree-loop-linear -ftree-vectorize yourcode.f90 and see if it helps.
Unhappily, seems to make things slower:
multgen/basic_mult  gfortran -O3 -mtune=pentium4  -ffast-math -mfpmath=sse
-ftree-loop-linear -ftree-vectorize mult.f90
mult.f90:0: warning: SSE instruction set disabled, using 387 arithmetics
You'd need -msse2 or -msse (or is it -march=pentium4 that enables these?)

Re: [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-23 Thread Daniel Berlin

I believe seb/zdenek already submitted patches for speeding up scev quite 
recently, with the goal of alleviating this problem.
I'm pretty sure they have not been applied yet.

Re: [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)

2005-01-23 Thread Daniel Berlin


On Sun, 24 Jan 2005, rakdver at gcc dot gnu dot org wrote:
--- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-24 
01:46 ---
On a side note, PRE also seems to have problems with the testcase.  With the
patch mentioned above, the largest consumers of compile time are ivopts (45%)
and pre (20%).
Uh, there was a bug filed about this, and i fixed it, last i looked.

Re: [Bug inline-asm/11203] source doesn't compile with -O0 but they compile with -O3

2005-01-22 Thread Daniel Berlin



The reason is dead simple: register allocation is NP-complete, so it
is even *theoretically* not possible to write register allocators that
always find a coloring.
register allocation in general is NP-complete, yes, but it seems u forget 
that
this is about finding the optimal solution while gcc fails finding any solution
which in practice is a matter of assigning the registers beginning from the most
constrained operands to the least, and copying a few things on the stack if gcc
cant figure out howto access them, sure this method might fail in 0.001% of the
practical cases and need a 2nd or 3rd pass where it tries different registers
it might also happen that in some intentionally overconstrained cases it ends up
searching the whole 5040 possible assignments of 7 registers onto 7 non memory
operands but still it wont fail
Just to also point out, it doesn't appear to be NP complete for register 
interference graphs, because they all seem to be 1-perfect.
Various papers have observed this, and i've actually  compiled all of gcc, 
libstdc++, etc, and every package ever on my computer, and not once has a 
single non-1-perfect interference graph 
occurred [my compiler would abort if it was true].

On 1-perfect graphs you can solve this problem in O(time it takes to 
determine the max clique), and there already exists a polynomial time 
algorithm for max-clique on perfect graphs.


 
That means any register allocator will always
fail on some very constrained asm input.
now that statement is just false, not to mention irrelevant as none of 
these asm
statemets are unreasonably constrained
You are correct, NP completeness does not imply impossibility.
There are only a finite number of possibilities.

 And you cannot allow it to
run indefinitely until a coloring is found, because then you've turned
the graph coloring problem into the halting problem because you can't
prove that a coloring exists and that the register allocator algorithm
will terminate.
this is ridiculous, the number of possible colorings is finite, u can 
always try
them all in finite time
You are right, he is wrong.
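
The termination argument can be sketched directly. The code below is not GCC's register allocator. It is a plain exhaustive backtracking colorer with hypothetical names, shown only to illustrate that a search over a finite set of assignments always ends with either a coloring or a definite "no".

```c
#include <stdbool.h>

#define MAX_NODES 16

/* True if `node` may take `color` given the colors already assigned
   to nodes 0 .. node-1. */
static bool color_ok(bool adj[MAX_NODES][MAX_NODES],
                     const int *colors, int node, int color)
{
    for (int other = 0; other < node; other++)
        if (adj[node][other] && colors[other] == color)
            return false;
    return true;
}

/* Tries every color for `node`, recursing on the remaining nodes.
   At most k^n assignments exist, so the search must terminate. */
static bool color_from(int n, bool adj[MAX_NODES][MAX_NODES],
                       int k, int *colors, int node)
{
    if (node == n)
        return true;                    /* every node has a color */
    for (int c = 0; c < k; c++) {
        if (color_ok(adj, colors, node, c)) {
            colors[node] = c;
            if (color_from(n, adj, k, colors, node + 1))
                return true;
        }
    }
    return false;                       /* no color fits: backtrack */
}

/* Returns true and fills colors[0..n-1] if the graph described by the
   symmetric adjacency matrix `adj` can be colored with k colors. */
bool color_graph(int n, bool adj[MAX_NODES][MAX_NODES], int k, int *colors)
{
    return color_from(n, adj, k, colors, 0);
}
```

The worst case is exponential, which is what NP-completeness is about, but the search space is finite, so the halting problem does not come into it.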

Re: [Bug debug/19367] [4.0 Regression] ICE: tree_check in lookup_local_die with local `using'

2005-01-10 Thread Daniel Berlin


On Mon, 10 Jan 2005, pinskia at gcc dot gnu dot org wrote:
--- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-10 
21:56 ---
Confirmed, I think this is the boost ICE.
This happens because the orig_decl that we are trying to use in emitting 
the using decl info appears to have been trashed or garbage collected 
before we emit it.

I think i know why.

Re: [Bug debug/19267] New: [4.0 regression] execute/921215-1.c fails with -fpic at -O3 -g

2005-01-05 Thread Daniel Berlin


On Wed, 5 Jan 2005, ghazi at gcc dot gnu dot org wrote:
When running the testsuite with -fpic/-fPIC, I get an additional failure in the
testsuite with mainline:
FAIL: gcc.c-torture/execute/921215-1.c compilation,  -O3 -g
The regression appeared sometime in the last day or so between these postings:
http://gcc.gnu.org/ml/gcc-testresults/2005-01/msg00135.html
http://gcc.gnu.org/ml/gcc-testresults/2005-01/msg00179.html
The compilation dies like this:
921215-1.c:22: internal compiler error: in gen_subprogram_die, at
dwarf2out.c:11207
in the source we have:
11207 gcc_assert (errorcount);
The problem is that errorcount is zero, so the gcc_assert() dies.
I'm about to submit a patch that will fix this.

Fix longstanding bugzilla annoyance

2004-12-19 Thread Daniel Berlin

Accept bug should now assign the bug to you, as one expects it to.
Sorry it took so long for me to fix this, it kept falling off my todo 
list since it was really a minor annoyance :)

--Dan

Re: GCC C bug: sizeof a union of structs returns zero value

2004-12-16 Thread Daniel Berlin


On Thu, 16 Dec 2004, Hugh Daniel wrote:
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
 Note, I gave up on GNATS after repeatedly getting this error message
no matter what I did to the text:

You have not described how to repeat the bug
You have not defined a category for the bug

 If there is a maintainer of the [EMAIL PROTECTED] bot I would
be happy to help debug the problem with your script.
If you can pass me the full raw email message you sent to the script 
(including headers, etc), i'm  happy to try to debug it.

Note that the [EMAIL PROTECTED] is (or should be) deprecated.
The bug reporting instructions will point you to report bugs using our 
bugzilla system now.

The gcc-gnats script is only really to handle the occasional gcc-gnats 
email that comes in.

--Dan

Re: [Bug rtl-optimization/16613] [3.4 Regression] compile time regression, when adding cerr usage

2004-12-10 Thread Daniel Berlin


On Fri, 10 Dec 2004, andre maute wrote:
Once more i couldn't upload an attachment
with the bugzilla upload form, so i send it here.
You can email it to [EMAIL PROTECTED] with a subject of Bug 16613 
(or whatever the bug number is), and it'll auto-add it to the bug for you.

Re: [Bug c++/18368] New: C++ error message regression

2004-11-07 Thread Daniel Berlin

Yes, it happens at global scope too.
struct foo {}
void method () {}
will give the same error
On Sun, 8 Nov 2004, sabre at nondot dot org wrote:
On this c++ code:
struct C {
 struct foo { int A; }
 void method();
};
This probably also happens at global scope.
-Chris
