Re: [patch tree-optimization]: Improve reassociation pass for bitwise-operations

2011-08-04 Thread Michael Matz
Hi, On Wed, 3 Aug 2011, Kai Tietz wrote: This machinery doesn't work in this case. That's why you have to extend it. The issue about this machinery is that it assumes that the statement itself gets transformed, but for normalized form of invert of bitwise operations it is essential to

Re: [patch tree-optimization]: Improve reassociation pass for bitwise-operations

2011-08-03 Thread Michael Matz
Hi, On Tue, 2 Aug 2011, Kai Tietz wrote: this patch improves the ability of reassociation pass to simplify more complex bitwise-binary operations with comparisons. We break-up for this patch statements like (X | Y) != 0 to X != 0 | Y != 0, and (X | Y) == 0 to expanded X == 0 & Y == 0.
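
The rewrite described in the snippet above rests on two identities: (X | Y) != 0 holds exactly when X != 0 | Y != 0 does, and (X | Y) == 0 holds exactly when X == 0 & Y == 0 does. The following is only an illustrative sketch in plain C, not code from the patch; all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: compare the OR of both operands against zero. */
static bool or_nonzero(unsigned x, unsigned y) { return (x | y) != 0; }

/* Expanded form: OR of the two separate comparisons. */
static bool or_nonzero_expanded(unsigned x, unsigned y) { return (x != 0) | (y != 0); }

/* Original form of the equality test. */
static bool or_zero(unsigned x, unsigned y) { return (x | y) == 0; }

/* Expanded form: AND of the two separate comparisons. */
static bool or_zero_expanded(unsigned x, unsigned y) { return (x == 0) & (y == 0); }

/* Exhaustively compare both forms for all operand pairs below `limit`. */
static void check_identities(unsigned limit)
{
  for (unsigned x = 0; x < limit; x++)
    for (unsigned y = 0; y < limit; y++) {
      assert(or_nonzero(x, y) == or_nonzero_expanded(x, y));
      assert(or_zero(x, y) == or_zero_expanded(x, y));
    }
}
```

Calling check_identities(64) exercises every pair of small operands; the expanded forms expose the individual comparisons, which is what gives a reassociation pass more operands to combine.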

Re: [patch tree-optimization]: Improve reassociation pass for bitwise-operations

2011-08-03 Thread Michael Matz
Hi, On Wed, 3 Aug 2011, Kai Tietz wrote: Implement all of this in the normal reassoc machinery that already exists. Don't implement your own walker (which btw is superlinear because you recurse into both operands).  If no simplifications were possible you have to fold back the NOTs

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-27 Thread Michael Matz
Hi, On Wed, 27 Jul 2011, Richard Guenther wrote: I don't think it is safe to try to get at the VLA type the way you do. I don't understand in what way it's not safe. Do you mean I don't manage to find the type always, or that I find the wrong type, or something else? I think you

Re: [patch] Fix PR tree-optimization/49771

2011-07-26 Thread Michael Matz
Hi, On Tue, 26 Jul 2011, Ulrich Weigand wrote: Well, REG_ATTRS->decl is again a decl, not an SSA name. I suppose we'd need to pick a conservative REGNO_POINTER_ALIGN during expansion of the SSA name partition - iterate over all of them in the partition and pick the lowest alignment. Or

Re: Fix pass_partition_blocks vs -O0

2011-07-25 Thread Michael Matz
Hi, On Fri, 22 Jul 2011, Richard Henderson wrote: Well, technically it's not broken yet. It will be as soon as it starts touching DF data, since this pass runs before pass_df_initialize_no_opt. But the only real consumer of BB_PARTITION is pass_reorder_blocks. And that pass is already

Re: [PATCH] Fix PR49715, (float)unsigned - (float)signed

2011-07-22 Thread Michael Matz
Hi, On Fri, 22 Jul 2011, Richard Guenther wrote: Regresses vectorization on i?86 because that defines floathi expanders but the vectorizer does not recognize a short -> float conversion as that requires different sized vectors (the int -> short truncation is also a complication for it). But

Re: [PATCH] Rewrite TRUTH_NOT_EXPR as BIT_{NOT,XOR}_EXPR

2011-07-19 Thread Michael Matz
Hi, On Tue, 19 Jul 2011, Richard Guenther wrote: *** forward_propagate_comparison (gimple stm *** 1164,1170 } /* We can propagate the condition into a statement that computes the logical negation of the comparison result. */ ! else if

Re: [PATCH] Make VRP optimize useless conversions

2011-07-11 Thread Michael Matz
Hi, On Mon, 11 Jul 2011, Richard Guenther wrote: The following actually works. Bootstrapped and tested on x86_64-unknown-linux-gnu. Can you double-check it? Seems sensible. Given this: short s; int i; for (s = 0; s <= 127; s++) i += (signed char)(unsigned char)s; return i;
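
One possible reading of the loop quoted above, offered as an assumption since the message is cut off: for s in the range 0..127 the cast chain (signed char)(unsigned char)s preserves the value, so a range-aware pass could treat both conversions as useless. The sketch below checks that in plain C. The function names are hypothetical, and the accumulator is initialized here although the quoted fragment leaves `i` uninitialized.

```c
#include <assert.h>

/* The cast chain from the quoted loop, applied to a single value. */
static int cast_chain(short s) { return (signed char)(unsigned char)s; }

/* The quoted loop, with the accumulator initialized to zero. */
static int sum_with_casts(void)
{
  short s;
  int i = 0;
  for (s = 0; s <= 127; s++)
    i += (signed char)(unsigned char)s;
  return i;
}

/* The same loop with both conversions dropped. */
static int sum_without_casts(void)
{
  short s;
  int i = 0;
  for (s = 0; s <= 127; s++)
    i += s;
  return i;
}

/* Within 0..127 the cast chain is the identity. */
static void check_range(void)
{
  for (short s = 0; s <= 127; s++)
    assert(cast_chain(s) == s);
}
```

Outside that range the chain is generally not the identity (128 does not fit in an 8-bit signed char), which is why the known value range of s matters for this kind of simplification.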

Re: [PATCH] Make VRP optimize useless conversions

2011-07-08 Thread Michael Matz
Hi, On Fri, 8 Jul 2011, Richard Guenther wrote: It should be indeed safe with the current handling of conversions, but better be safe. So, like the following? No. The point is that you can't compare the bounds that VRP computes with each other when the outcome affects correctness. Think

Re: [patch tree-optimization]: [3 of 3]: Boolify compares more

2011-07-08 Thread Michael Matz
Hi, On Fri, 8 Jul 2011, Kai Tietz wrote: This is the reworked patch, It fixes vrp to handle bitwise one-bit precision typed operations and to handle some type hoisting cases, Some cases can't be handled as long as vrp doesn't allow to insert new statements in folding pass. To have in

Re: [PATCH] Fix folding of -(unsigned)(a * -b)

2011-07-07 Thread Michael Matz
Hi, On Thu, 7 Jul 2011, Richard Guenther wrote: Index: gcc/fold-const.c === --- gcc/fold-const.c (revision 175962) +++ gcc/fold-const.c (working copy) @@ -7561,7 +7561,7 @@ fold_unary_loc (location_t loc, enum tre if

Re: [PATCH] Make VRP optimize useless conversions

2011-07-07 Thread Michael Matz
Hi, On Thu, 7 Jul 2011, Richard Guenther wrote: + tree rhs1 = gimple_assign_rhs1 (stmt); + gimple def_stmt = SSA_NAME_DEF_STMT (rhs1); + value_range_t *final, *inner; + + /* Obtain final and inner value-ranges for a conversion + sequence

Re: [PATCH] Fix PR49645, with C FE pieces

2011-07-06 Thread Michael Matz
Hi, On Wed, 6 Jul 2011, Richard Guenther wrote: *** copy_reference_ops_from_ref (tree ref, V *** 579,585 memset (temp, 0, sizeof (temp)); /* We do not care for spurious type qualifications. */ ! temp.type = TYPE_MAIN_VARIANT (TREE_TYPE (ref));

Re: [PATCH] Address lowering [1/3] Main patch

2011-07-05 Thread Michael Matz
Hi, On Tue, 5 Jul 2011, William J. Schmidt wrote: Hm, I didn't think it was (currently) possible for a gimple statement to have a mem-ref on both RHS and LHS. Is that incorrect? This is easily changed if so, or if the possibility should be left open for the future. Think aggregate

Re: [PATCH] Address lowering [1/3] Main patch

2011-07-04 Thread Michael Matz
Hi, On Mon, 4 Jul 2011, Richard Guenther wrote: I still do not like the implementation of yet another CSE machinery given that we already have two. From reading it it really seems to be a normal block-local CSE, without anything fancy. Hence, moving the pass just a little earlier (before

Re: PATCH [5/n]: Prepare x32: PR middle-end/48016: Inconsistency in non-local goto save area

2011-06-29 Thread Michael Matz
Hi, On Wed, 29 Jun 2011, H.J. Lu wrote: diff --git a/gcc/function.c b/gcc/function.c index 81c4d39..131bc09 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -4780,7 +4780,7 @@ expand_function_start (tree subr) cfun->nonlocal_goto_save_area,

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-28 Thread Michael Matz
Hi, On Tue, 28 Jun 2011, Richard Guenther wrote: I'd name the predicate value_preserving_conversion_p which I think is what you mean. "harmless" isn't really descriptive. Note that you include non-value-preserving conversions, namely int -> unsigned int. It seems that Andrew really does

Re: varpool alias reorg

2011-06-27 Thread Michael Matz
Hi, On Mon, 27 Jun 2011, Jan Hubicka wrote: I still like to stream unmodified builtins as builtins, as that is similar to pre-loading the streamer caches with things like void_type_node or sizetype. Doing so will need us to solve the other one decl rules probably. I didn't really get

Re: [PATCH] Middle-end arrays, forward-ported to trunk (again)

2011-06-22 Thread Michael Matz
Hi, On Tue, 21 Jun 2011, Richard Guenther wrote: I failed to see where the scalarizer inserts the temporary vars it creates into the scope blocks (thus the gimplify.c hunk ...). Any help here is welcome. The scoping of the scalarizer is a bit funny. gfc_start_scalarized_body sets up

Re: [PATCH PR45098] Disallow NULL pointer in pointer arithmetic

2011-06-20 Thread Michael Matz
Hi, On Mon, 20 Jun 2011, Richard Guenther wrote: of the specifications; rather, we should consider whether there is a situation where a user could reasonably expect NULL + 0 to be valid.  In the example by Richard, int __attribute__((noinline)) foo (void *p, int i) {  return p +

Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10))

2011-06-17 Thread Michael Matz
Hi, On Thu, 16 Jun 2011, Richard Guenther wrote: If people want to not create useless conversions in the first place, though, I suspect there are lots of places that create useless conversions in the compiler. Yeah, the above looks like it comes from the frontends - gimplification should

Re: PATCH [5/n]: Prepare x32: PR middle-end/48016: Inconsistency in non-local goto save area

2011-06-15 Thread Michael Matz
Hi, On Sat, 11 Jun 2011, H.J. Lu wrote: We are very inconsistent when saving and restoring non-local goto save area. See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48016 for detailed analysis. OK for trunk? + /* FIXME: update_nonlocal_goto_save_area may pass SA in the wrong mode.

Re: Cgraph alias reorg 2/14 (introduction of alias walkers)

2011-06-10 Thread Michael Matz
Hi, On Fri, 10 Jun 2011, Jan Hubicka wrote: +static inline struct cgraph_node * +cgraph_function_or_thunk_node (struct cgraph_node *node, enum availability *availability) +{ + if (availability) +*availability = cgraph_function_body_availability (node); + return node; + return

Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10))

2011-06-07 Thread Michael Matz
Hi, On Tue, 7 Jun 2011, Richard Guenther wrote: fold_convert_loc it to the expected type, while the middle-end has the notion of useless type conversions, fold-const.c is also used by FEs and I think it is expected to have the types exactly matching. So (T)s1[10] instead of s1[10]

Re: [4.6 PATCH] Workaround for stack slot sharing problems with unrolling (PR fortran/49103)

2011-06-07 Thread Michael Matz
Hi, On Tue, 7 Jun 2011, Richard Guenther wrote: + tree base = get_base_address (lhs); Probably easier and more complete to do if (lhs && TREE_CODE (lhs) != SSA_NAME) { tree base = get_base_address (lhs); I don't like the patch too

better wpa [2/n]: merge some top-level trees

2011-06-01 Thread Michael Matz
Hi, so here's something more of my patch queue. It adds the facility to merge also other trees than types over compilation unit borders. This specific patch only deals with STRING_CST and INTEGER_CST nodes. Originally I used that place for merging declarations, hence the naming of some

Re: better wpa [2/n]: merge some top-level trees

2011-06-01 Thread Michael Matz
Hi, On Wed, 1 Jun 2011, Michael Matz wrote: Hi, so here's something more of my patch queue. It adds the facility to merge also other trees than types over compilation unit borders. This specific patch only deals with STRING_CST and INTEGER_CST nodes. Originally I used that place

[rfa] Give thunks correct RESULT_DECL

2011-06-01 Thread Michael Matz
Hi, I noticed this a while ago while working on early merging of decls. When we build thunk decls ourselves we give RESULT_DECL of it integer_type, even when the thunk decl itself says something else. (In particular thunks can very well return void or a pointer type). This fixes that glitch.

Re: [PR48866] three alternative fixes

2011-05-30 Thread Michael Matz
Hi, On Mon, 30 May 2011, Alexandre Oliva wrote: 3. expand dominators before dominated blocks, so that DEFs of replaceable SSA names are expanded before their uses. Expand them when they're encountered, but not requiring a REG as a result. Save the RTL expression that

Re: RFC: explicitely mark out-of-scope deaths

2011-05-27 Thread Michael Matz
Hi, On Thu, 26 May 2011, Martin Jambor wrote: I assume DSE does not remove the stores as that would defeat the purpose of the patch. Right. (The volatileness currently prevents the removal). If after optimizations such as SRA, these special stores are the only statements accessing the

Re: RFC: explicitely mark out-of-scope deaths

2011-05-27 Thread Michael Matz
Hi, On Thu, 26 May 2011, Steven Bosscher wrote: on IRC we discussed about this, here's the RFC patch.  It bootstraps and causes some minor regressions most probably due to some missing sprinkled checks for the special clobber insns and sometimes due to having to adjust some regexps.

Re: RFC: explicitely mark out-of-scope deaths

2011-05-27 Thread Michael Matz
Hi, On Fri, 27 May 2011, Jakub Jelinek wrote: On Fri, May 27, 2011 at 03:59:47PM +0200, Michael Matz wrote: On Thu, 26 May 2011, Steven Bosscher wrote: on IRC we discussed about this, here's the RFC patch.  It bootstraps and causes some minor regressions most probably due to some

RFC: explicitely mark out-of-scope deaths

2011-05-26 Thread Michael Matz
Hi, on IRC we discussed about this, here's the RFC patch. It bootstraps and causes some minor regressions most probably due to some missing sprinkled checks for the special clobber insns and sometimes due to having to adjust some regexps. Anyway, stack slot sharing is currently using the

Re: PATCH: Add pause intrinsic

2011-05-26 Thread Michael Matz
Hi, On Thu, 26 May 2011, Andrew Haley wrote: +Generates the @code{pause} machine instruction. But that's missing the fact that it generates a compiler memory barrier, which is important. And if you think it's not a compiler memory barrier, please explain a. Why it's not a

Re: Better location streaming

2011-05-26 Thread Michael Matz
Hi, On Thu, 26 May 2011, Richard Guenther wrote: Hmm, I plan to optimize string streaming (since we always stream one uleb to set it is non-NULL that can be easily handled by assigning NULL string index 0). How precisely you however suggest to bitpack line/column and string offset

Re: PATCH: Add pause intrinsic

2011-05-25 Thread Michael Matz
Hi, On Wed, 25 May 2011, Richard Guenther wrote: asm volatile ("" : : : "memory") in fact will work as a full memory barrier How? You surely need MFENCE or somesuch, unless all you care about is a compiler barrier. That's what I think needs to be clarified. Well, yes, I'm talking
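
The exchange above turns on the difference between a compiler barrier and a hardware fence. As a hedged sketch, assuming GCC or a compatible compiler: an empty extended asm with a "memory" clobber stops the compiler from moving memory accesses across it but emits no instruction, while the __sync_synchronize builtin is documented to issue a full memory barrier (on x86 that typically means a fence instruction such as MFENCE). The helper and variable names are hypothetical, and the single-threaded usage only shows where each barrier would be placed.

```c
#include <assert.h>

static int payload;
static int ready;

/* Compiler-only barrier: no instruction is emitted, but the compiler
   may not reorder or cache memory accesses across this point. */
static inline void compiler_barrier(void)
{
  __asm__ __volatile__("" : : : "memory");
}

/* Full barrier: GCC builtin that also orders accesses at the hardware level. */
static inline void full_barrier(void)
{
  __sync_synchronize();
}

/* Store the payload, then set the flag, with a barrier after each store. */
static void publish(int value)
{
  payload = value;
  compiler_barrier();   /* keeps the compiler from sinking the payload store */
  ready = 1;
  full_barrier();       /* additionally orders the stores for other CPUs */
}

/* Read the payload once the flag has been observed. */
static int consume(void)
{
  assert(ready);
  return payload;
}
```

If only the compiler's reordering is a concern, the first helper is enough; ordering that other processors can rely on needs the second.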

Ensure all frontends have the same number of common nodes

2011-05-20 Thread Michael Matz
Hi, for various changes we can run into the situation that frontends fill the cache with a different number of common nodes than the LTO frontend, leading to confusion of which numbers mean which tree. This can happen when some global trees are NULL, or when some global trees are

Merge OBJS-common,-md,-archive

2011-05-20 Thread Michael Matz
Hi, On Fri, 20 May 2011, Joseph S. Myers wrote: (Apart from the arbitrary division between GCC_OBJS and the xgcc link rule, mentioned above, there are other arbitrary divisions that don't make sense to me. In particular, the separation between OBJS-common, OBJS-md and OBJS-archive, all

Re: Don't let search bots look at buglist.cgi

2011-05-17 Thread Michael Matz
Hi, On Mon, 16 May 2011, Ian Lance Taylor wrote: httpd being in the top-10 always, fiddling with bugzilla URLs? (Note, I don't have access to gcc.gnu.org, I'm relaying info from multiple instances of discussion on #gcc and richi poking on it; that said, it still might not be web

Re: Don't let search bots look at buglist.cgi

2011-05-16 Thread Michael Matz
Hi, On Mon, 16 May 2011, Andrew Haley wrote: On 16/05/11 10:45, Richard Guenther wrote: On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor i...@google.com wrote: I noticed that buglist.cgi was taking quite a bit of CPU time. I looked at some of the long running instances, and they were

Re: [patch gimplifier]: Make sure TRUTH_NOT_EXPR has boolean_type_node type and argument

2011-05-16 Thread Michael Matz
Hi, On Mon, 16 May 2011, Richard Guenther wrote: We can't use a test for BOOLEAN_TYPE as the middle-end considers a INTEGER_TYPE with same precision/signedness as compatible and thus may propagate a variable of INTEGER_TYPE there. I don't understand why promoting bools to

Re: Don't let search bots look at buglist.cgi

2011-05-16 Thread Michael Matz
Hi, On Mon, 16 May 2011, Andrew Haley wrote: It's not quite the same information, surely. Wouldn't searchers be directed to an email rather than the bug itself? Yes, though there is a link in all mails. Right, so we are contemplating a reduction in search quality in exchange for

Re: Don't let search bots look at buglist.cgi

2011-05-16 Thread Michael Matz
Hi, On Mon, 16 May 2011, Andrew Haley wrote: It routinely is. bugzilla performance is terrible most of the time for me (up to the point of five timeouts in sequence), svn speed is mediocre at best, and people with access to gcc.gnu.org often observe loads > 25, mostly due to I/O.

Re: [PATCH] Fix tree parts of PR18041

2011-05-10 Thread Michael Matz
Hi, On Tue, 10 May 2011, Richard Guenther wrote: struct B { unsigned bit0 : 1; unsigned bit1 : 1; }; void foo (struct B *b) { b->bit0 = b->bit0 | b->bit1; } we with this patch generate D.2686_2 = b_1(D)->bit0; D.2688_4 = b_1(D)->bit1; D.2693_10 = D.2688_4 ^ D.2686_2;

Re: [PATCH] split tree_type, a.k.a. tuplifying types

2011-05-10 Thread Michael Matz
Hi, On Tue, 10 May 2011, Nathan Froyd wrote: +  /* Do not stream TYPE_POINTER_TO or TYPE_REFERENCE_TO.  */ Add some wording as to why not? This was copied from existing comments, but I do not remember why we were doing this. Not too critical, anyway. I'm not entirely sure; I'm

Re: Cgraph thunk reorg

2011-05-06 Thread Michael Matz
Hi, On Fri, 6 May 2011, Jan Hubicka wrote: *** dump_cgraph_node (FILE *f, struct cgraph *** 1874,1880 if (node->only_called_at_exit) fprintf (f, " only_called_at_exit"); ! fprintf (f, "\n called by: "); for (edge = node->callers; edge; edge =

Re: [google]: initialize language field for clone function struct

2011-05-04 Thread Michael Matz
Hi, On Wed, 4 May 2011, Richard Guenther wrote: It prevents save_expr from being called at global level, since you cannot create SAVE_EXPRs outside functions.  Likewise in variable_size. I see several places in fold-const.c that are not properly guarded then. But anyway, if it is

Re: [google]: initialize language field for clone function struct

2011-05-04 Thread Michael Matz
Hi, On Wed, 4 May 2011, Richard Kenner wrote: There are pros and cons about early optimization, actually. Generating extremely optimized IL very early can actually tie up subsequent passes. For instance, loop unrolling and vectorization. There are others in the literature. Sure, in

Recognize -Ofast like -ffast-math for crtfastmath.o

2011-05-04 Thread Michael Matz
Hi, -Ofast is intended to be -O3 plus -ffast-math. For the compiler proper this works, but under -ffast-math we add crtfastmath.o (or some equivalent) to the link command line on some targets. As usual for our specs this uses matching on command line arguments, hence we'll explicitly have

Re: [PATCH] Finish int_const_binop partial transition

2011-05-03 Thread Michael Matz
Hi, On Tue, 3 May 2011, Richard Guenther wrote: --- 5858,5890 /* If these are the same operation types, we can associate them assuming no overflow. */ ! if (tcode == code) ! { ! double_int mul; ! int overflow_p; ! mul =

Re: Turn streamer cache to pointer_map

2011-05-02 Thread Michael Matz
Hi, On Mon, 2 May 2011, Richard Guenther wrote:    /* The mapping between tree nodes and slots into the nodes array.  */ !   struct pointer_map_t GTY((skip)) *node_map; If you skip node_map you can end up with false entries for re-used trees.  So I don't think that's a good idea. Or

Re: Turn streamer cache to pointer_map

2011-05-02 Thread Michael Matz
Hi, On Mon, 2 May 2011, Richard Guenther wrote: --- 348,367 bool insert_at_next_slot_p) { void **slot; unsigned ix; bool existed_p; gcc_assert (t); ! slot = pointer_map_insert (cache->node_map, t); ! if (!*slot) ix

Re: better wpa [1/n]: merge types during read-in

2011-04-29 Thread Michael Matz
Hi, On Thu, 21 Apr 2011, Richard Guenther wrote: It would have been nice to have the top-level tree merging as a separate patch, as I am not convinced it is correct, but see below ... I'll split it out. Like so (also including the other remarks). Regstrapping on x86_64-linux in

Fix initialization of warn_maybe_uninitialized

2011-04-28 Thread Michael Matz
Hi, since the split of warn_uninitialized to warn_maybe_uninitialized the fortran and Ada frontends have changed behaviours (causing uninit_func.adb to fail). I've left out the Java frontend because it also didn't set warn_uninitialized with -Wall before, and go because it doesn't do

Re: Add inline-analysis predicates to edges

2011-04-27 Thread Michael Matz
Hi, On Wed, 27 Apr 2011, Jan Hubicka wrote: *** false_predicate (void) *** 163,168 --- 166,195 } + /* Return true if P is (false). */ + + static inline bool + true_predicate_p (struct predicate *p) Comment doesn't match function. + { + return

Re: better wpa [1/n]: merge types during read-in

2011-04-21 Thread Michael Matz
Hi, On Wed, 20 Apr 2011, Michael Matz wrote: It would have been nice to have the top-level tree merging as a separate patch, as I am not convinced it is correct, but see below ... I'll split it out. Like so (also including the other remarks). Regstrapping on x86_64-linux in progress

Re: Improve stack layout heuristic.

2011-04-21 Thread Michael Matz
Hi, On Wed, 20 Apr 2011, Easwaran Raman wrote: But you're right - not adding that conflict doesn't actually reduce the size of bit maps. Reverting back to what was there originally. Thanks, I have no more issues with the patch. You'll need to find someone who can formally approve it,

Re: [PATCH] make LABEL_DECL has its own rtx field for its associated CODE_LABEL

2011-04-21 Thread Michael Matz
Hi, On Wed, 20 Apr 2011, Richard Guenther wrote: I had occasion to try this today; this inheritance structure doesn't work.  The truncated inheritance tree looks like: * decl_common  * field_decl  * const_decl  * decl_with_rtl    * label_decl    * result_decl    * parm_decl

Re: better wpa [1/n]: merge types during read-in

2011-04-20 Thread Michael Matz
Hi, On Wed, 20 Apr 2011, Richard Guenther wrote: + /* A hashtable of trees that potentially refer to variables or functions + that must be replaced with their prevailing variant. */ + static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node))) htab_t + tree_with_vars; +

Re: Improve stack layout heuristic.

2011-04-20 Thread Michael Matz
Hi, On Tue, 19 Apr 2011, Easwaran Raman wrote: That is correct but is also what the use of stack_vars[u].representative achieves alone, ... I am adding a check to that effect. ... without any check. @@ -596,7 +581,8 @@ if (vb->conflicts) {

Re: better wpa [1/n]: merge types during read-in

2011-04-20 Thread Michael Matz
Hi, On Wed, 20 Apr 2011, Richard Guenther wrote: If t is a type, why fix up its field if it may not be the canonical variant? Because type merging to work sometimes requires already canonicalized fields, at least that's what I found in investigating why some types weren't merged

Fix PR48703: segfault in mangler due to -g

2011-04-20 Thread Michael Matz
Hi, as noted in the bug trail the fix for PR48207 broke compilation of C++ programs with -g. This variant fixes the bug too without breaking -g. Basically we have to set assembler names early also for TYPE_DECLs, we can't rely on the frontends langhook to do that after free_lang_data. Okay

Re: Fix PR48703: segfault in mangler due to -g

2011-04-20 Thread Michael Matz
Hi, I wrote: Basically we have to set assembler names early also for TYPE_DECLs, we can't rely on the frontends langhook to do that after free_lang_data. Okay for trunk assuming regstrapping on x86_64-linux works? Patch retracted, doesn't even survive testsuite. The problem is that we

better wpa [1/n]: merge types during read-in

2011-04-19 Thread Michael Matz
Hi, I have a backlog of random improvements to the WPA phase of LTO compilations, all with the common goal of reducing peak memory usage. I was basically dumping all trees that the WPA phase read in, and then tried to think about which trees can be merged with already existing ones very

Re: Improve stack layout heuristic.

2011-04-18 Thread Michael Matz
Hi, [FWIW I can't approve patches, but some feedback nevertheless] On Sun, 17 Apr 2011, Easwaran Raman wrote: This patch improves the heuristic used in assigning stack location to stack variables. Currently, if there are 3 variables A, B and C with their sizes in increasing order and A and

Fix PR48622 (lto ICE, lto bootstrap)

2011-04-16 Thread Michael Matz
Hi, since r172430 lto bootstrap is broken, as well as the attached testcase (pr48622) and cpu2006 compilation (pr48645). The inline summary writer used a different order for size and time than the reader expected. I've committed the below patch as obvious (r172603) after verifying that lto

Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Michael Matz
Hi, On Thu, 14 Apr 2011, Dominique Dhumieres wrote: I have forgotten to mention that I have a variant of fatigue in which I have done the inlining manually along with few other optimizations and the timing for it is [macbook] lin/test% gfc -Ofast fatigue_v8.f90 [macbook] lin/test% time

Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Michael Matz
Hi, On Fri, 15 Apr 2011, Dominique Dhumieres wrote: Michael, Yes, this is due to the DECL_EXPR statement which is rendered by the dumper just the same as a normal decl. The testcase looks for exactly one such decl, but with -fstack-arrays there are exactly two for each such array.

Re: Implement stack arrays even for unknown sizes

2011-04-14 Thread Michael Matz
Hi, On Thu, 14 Apr 2011, Michael Matz wrote: no stack-arrays | with stack-arrays; no additional options: 10.2s | 8.8s; + -fwhole-program: 7.1s | 8.8s; + -fwhole-program -flto: 10.1s

Re: Implement stack arrays even for unknown sizes

2011-04-12 Thread Michael Matz
Hello, On Mon, 11 Apr 2011, Steven Bosscher wrote: Try this patch.  I've verified that capacita and nf work with it and -march=native -ffast-math -funroll-loops -fstack-arrays -O3 .  In fact all of polyhedron works for me on these flags.  (I've set a ulimit -s of 512MB, but I don't know

Re: Fix PR47612

2011-04-12 Thread Michael Matz
Hi, On Tue, 12 Apr 2011, Bernd Schmidt wrote: This fixes a problem on cc0 machines where we split a sequence of insns at a point where we shouldn't - between a cc0 setter and a cc0 user. The fix is simple enough; just make sure not to pick a cc0 setter as the end of such a sequence. The

Re: Implement stack arrays even for unknown sizes

2011-04-11 Thread Michael Matz
Hi, On Sun, 10 Apr 2011, Dominique Dhumieres wrote: I find that both nf.f90 and capacita.f90 segfault in runtime for any stack size. On x86_64-apple-darwin10, nf.f90 works. However if I run it through valgrind I get ==64815== Memcheck, a memory error detector ==64815== Copyright (C)

Re: Implement stack arrays even for unknown sizes

2011-04-11 Thread Michael Matz
On Sat, 9 Apr 2011, Paul Richard Thomas wrote: I find that both nf.f90 and capacita.f90 segfault in runtime for any stack size. Try this patch. I've verified that capacita and nf work with it and -march=native -ffast-math -funroll-loops -fstack-arrays -O3 . In fact all of polyhedron works

Re: Fix PR48389: ICE in make_edges

2011-04-08 Thread Michael Matz
Hi, On Fri, 8 Apr 2011, Jakub Jelinek wrote: On Fri, Apr 08, 2011 at 03:33:49PM +0200, Michael Matz wrote: --- testsuite/gcc.target/i386/pr48389.c (revision 0) +++ testsuite/gcc.target/i386/pr48389.c (revision 0) @@ -0,0 +1,12 @@ +/* PR middle-end/48389 */ +/* { dg-do compile

Re: PATCH: PR middle-end/48440: [4.7 Regression] FAIL: gcc.c-torture/compile/labels-3.c

2011-04-07 Thread Michael Matz
Hi, On Thu, 7 Apr 2011, Richard Guenther wrote: 5600      newx = simplify_subreg (outermode, op, innermode, byte); (gdb) f 1 #1  0x00708494 in expand_expr_real_2 (ops=0x7fffb0c0, target=0x0,    tmode=VOIDmode, modifier=EXPAND_INITIALIZER)    at

Re: [cxx-mem-model] bitfield tests

2011-04-06 Thread Michael Matz
Hi, On Mon, 4 Apr 2011, Aldy Hernandez wrote: (5) Do we agree that all such cpus use a byte-granular modification mask? Now, as of (0) I might agree to disregard the original Alpha, but as the embedded world moves to SMP I'm not sure we can disregard non-cache coherent NUMA setups

Re: [PATCH] make LABEL_DECL has its own rtx field for its associated CODE_LABEL

2011-04-05 Thread Michael Matz
Hi, On Mon, 4 Apr 2011, Nathan Froyd wrote: On Mon, Apr 04, 2011 at 05:52:00PM +0200, Steven Bosscher wrote: Have you looked into maybe putting the CODE_LABEL for a LABEL_DECL in an on-the-side structure (hash table, whatever)? It looks like it is only used during expansion of SWITCH

Re: Fix realloc_on_assign_2.f03, random segfaults/ICEs

2011-03-30 Thread Michael Matz
Hi, On Wed, 30 Mar 2011, Tobias Burnus wrote: On 03/30/2011 06:21 PM, Michael Matz wrote: Okay for trunk? (regstrapping on x86_64-linux in progress) OK. Thanks for tracing the bug. I think the issue could be the reason for the elusive PR 47516 - thus, you might consider adding that PR

Random cleanups [2/4]: canonicalize ctor values

2011-03-30 Thread Michael Matz
Hi, this came up when looking into why the static ctors contain useless trees (like casts). We can simply canonicalize them while varpool analyzes pending decls. It'll look at initializers once, where we can gimplify them. This requires making canonicalize_constructor_val be able to be

Random cleanups [3/4]: zero out DECL_VINDEX field

2011-03-30 Thread Michael Matz
Hi, I noticed this while working on early-merging LTO. The DECL_VINDEX slot of FUNCTION_DECLs is supposed to hold the numeric index of the vtable slot if it's a virtual function. During parsing the C++ frontend uses it to hold a reference to itself, which then later is supposed to be

Random cleanups [4/4]: Streamlining streamer

2011-03-30 Thread Michael Matz
Hi, I fear I wasn't as thorough in also splitting this one into several patches, but the different cleanups are at least mostly in different files. They are: * lto-lang remembers all builtin decls in a local list, to be returned by the getdecls langhook. But as we have our own

Re: [wwwdocs] Add Subversion revisions to the timeline

2011-03-26 Thread Michael Matz
Hi, On Sat, 26 Mar 2011, Richard Guenther wrote:   GCC 4.7 Stage 1 (starts 2011-03-14)      GCC 4.6.0 release (2011-03-25) -       | +       | r171512        |        v The idea is to include the copy-source revision on the trunk or the respective branch, so that you can use the

Re: [wwwdocs] Add Subversion revisions to the timeline

2011-03-26 Thread Michael Matz
Hi, On Sat, 26 Mar 2011, Richard Guenther wrote: Uh, well - the information is readily available from SVN Hmm, you have a very unusual definition of readily available :) Well - of course svn sucks, but svn log --stop-on-copy svn://gcc.gnu.org/svn/gcc/branches/branch-name | tail

Re: [build, lto] Only accept -fuse-linker-plugin if linker supports -plugin (PR lto/46944)

2011-03-25 Thread Michael Matz
Hi, [sorry for breaking the threading I've deleted the mails I'm answering already] In any case, citing from http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01250.html Here's the patch I came up with. It is on top of the previous one, so if we want to backport to 4.6 later, both are

Re: [PATCH] Fix -fcrossjumping at -O1 (PR rtl-optimization/48156)

2011-03-18 Thread Michael Matz
Hi, On Fri, 18 Mar 2011, Kenneth Zadeck wrote: I believe that this is not the right way to go. if someone specifies -fcrossjumping, then the pass should turn on live for the duration of the pass just as ifcvt does. If they ask for crossjumping you should give them crossjumping and not

Re: [PATCH][i386] Implement ix86_emit_swdivsf more efficiently

2011-03-17 Thread Michael Matz
Hi, On Mon, 14 Mar 2011, Richard Guenther wrote: This rewrites the iteration step of swdivsf to be more register efficient (two registers instead of four, no load of a FP constant). This matches how ICC emits the rcp sequence and causes no overall loss of precision (Micha might still

Re: [PATCH 01/18] add typed_tree structure

2011-03-11 Thread Michael Matz
Hi, On Thu, 10 Mar 2011, Nathan Froyd wrote: * tree.h (struct typed_tree): New. IMO this should be called tree_typed, like the other structs in tree.h . Ciao, Michael.
