Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread Jan Hubicka
On Wed, Mar 05, 2008 at 09:38:13PM +0100, Michael Matz wrote: Hi, On Wed, 5 Mar 2008, Aurelien Jarno wrote: So I think gcc at least needs an *option* to revert to the old behavior, and there's a good argument to make it the default for now, at least for x86/x86-64 on Linux.

Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Jan Hubicka
Hi, based on the discussion, this is the change I would like to make to the passmanager. I am sending only the header change first, because the actual change will need updating all PM datastructure initializers and compensating the testsuite and documentation for the removal of RTL dump letters, so I would

Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Jan Hubicka
Jan Hubicka wrote: This looks mostly fine to me. note that i added you to pr35094 since this patch will resolve that issue. I guess that one of the questions that i would have is why not have there be a base structure for the core passmanager fields, and then a union that contains

Re: API for callgraph and IPA passes for whole program optimization

2008-03-12 Thread Jan Hubicka
On 3/9/08 7:26 AM, Jan Hubicka wrote: compensate testsuite and documentation for the removal of RTL dump letters so I would rather do that just once. Does this seem OK? Yup, thanks for doing this. The patch include the read/write methods that will be just placeholders on mainline

Re: gcc 4.3.0 i386 default question

2008-03-12 Thread Jan Hubicka
Hi, Did the default i386 CPU model that gcc generates code for change between 4.2.x and 4.3.0? I didn't see anything in the release notes that jumps out at me about this. There wasn't any intent to change the codebase. However, the default tuning has now changed to the generic model.

Re: gcc 4.3.0 i386 default question

2008-03-12 Thread Jan Hubicka
David Edelsohn wrote: Joel Sherrill writes: Joel Those all look like checks to see if the compiler itself Joel supports Altivec -- not a run-time check on the hardware Joel like the Neon check_effective_target_arm_neon_hw appears Joel to be. Look at

Re: Bootstrap comparison failures on i586

2008-04-05 Thread Jan Hubicka
if (parts.base) { if (REGNO_POINTER_ALIGN (REGNO (parts.base)) 32) -- 820 return 0; } I think parts.base is OK so it's probably REGNO_POINTER_ALIGN Uh, while converting the regno_pointer_align from GGC to malloced memory, I mistakenly used xmalloc instead of

Re: Bootstrap comparison failures on i586

2008-04-05 Thread Jan Hubicka
Committed the following as obvious. I am sorry for all the fallout... Index: ChangeLog === *** ChangeLog (revision 133932) --- ChangeLog (working copy) *** *** 1,5 --- 1,7 2008-04-05 Jan Hubicka [EMAIL

Re: Bootstrap failure on i686-apple-darwin9

2008-04-15 Thread Jan Hubicka
Does this help? Thanks for the answer, but now I have: ... ../../gcc-4.4-work/gcc/except.c: In function 'set_nothrow_function_flags': ../../gcc-4.4-work/gcc/except.c:2787: error: 'struct rtl_data' has no member named 'epilogue_delay_list' make[3]: *** [except.o] Error 1 ... Sorry,

Re: ICE while bootstrap for x86_64-pc-mingw32 in cp-demangle.c: 2905

2008-05-02 Thread Jan Hubicka
Hi, it seems, that the compilation of libiberty for x86_64-pc-mingw32 leads to an ICE in cp-demangle.c:2905: internal compiler error verify_cgraph_node failed breaks bootstrap. Hi, this was caused by my PM patch. I've committed a fix for it this morning, so please let me know if any problems

Re: apparent memory increase

2008-05-23 Thread Jan Hubicka
You may have seen this warning from the memory consumption tester: http://gcc.gnu.org/ml/gcc-regression/2008-05/msg00041.html ... related to the recent identifier GC patch. I looked into this a little. My theory is that this is an artifact of how the tester collects its data. In

Re: Why is the length of *sse_prologue_save_insn 135?

2008-05-24 Thread Jan Hubicka
Hi Jan, Uros, Hi, I guess I was just lazy to figure out the size at the time of writing the pattern. Length is not used for anything useful at the moment, but fixing it definitely won't hurt. Honza i386.md has (define_insn *sse_prologue_save_insn [(set (mem:BLK (plus:DI (match_operand:DI

Re: [lto] Streaming out language-specific DECL/TYPEs

2008-06-03 Thread Jan Hubicka
On Mon, Jun 2, 2008 at 5:10 PM, Diego Novillo [EMAIL PROTECTED] wrote: In g++.dg/torture/20070621-1.C we are trying to stream out a structure that contains a TEMPLATE_DECL. This currently causes a failure in lto-function-out.c:output_tree because not only TEMPLATE_DECL is C++-specific,

Re: [lto] Streaming out language-specific DECL/TYPEs

2008-06-03 Thread Jan Hubicka
On Mon, Jun 2, 2008 at 20:37, Kenneth Zadeck [EMAIL PROTECTED] wrote: the problem with making this a langhook is that there is no there-there in that on the serialize in side, you would have to recreate the c++ front end code that expects this tree code. (if there is no such code, then

Re: Is this a typo in setup_incoming_varargs_64?

2008-06-05 Thread Jan Hubicka
Hi, setup_incoming_varargs_64 in i386.c has /* Compute address to jump to : label - 5*eax + nnamed_sse_arguments*5 */ The comments don't match the code. Should the comments be /* Compute address to jump to : label - 4*eax + nnamed_sse_arguments*4 */
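The question in this message turns on a single stride constant. As a rough sketch only (this is not the code in i386.c), the computed jump can be modelled as plain arithmetic. The function name `save_entry_address` and the parameter `insn_size` are hypothetical; `insn_size` stands in for the per-register stride that the comment and the code are said to disagree on (5 versus 4).

```c
/* Hypothetical model of the address computation quoted above:
     label - insn_size*eax + nnamed_sse_arguments*insn_size
   insn_size is an assumption standing in for the 4-versus-5 stride;
   none of these names come from i386.c.  */
long
save_entry_address (long label, long eax, long nnamed_sse_arguments,
                    long insn_size)
{
  return label - insn_size * eax + nnamed_sse_arguments * insn_size;
}
```

With the same `label`, `eax` and `nnamed_sse_arguments`, changing `insn_size` from 5 to 4 moves the target, which is why a comment that disagrees with the code is worth fixing.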

Re: [lto] Streaming out language-specific DECL/TYPEs

2008-06-05 Thread Jan Hubicka
Jan Hubicka wrote: Sure if it works, we should be lowering the types during gimplification so we don't need to store all this in memory... But C++ FE still use its local data later in stuff like thunks, but we will need to cgraphize them anyway. I agree. The only use of language

Re: [whopr] Design/implementation alternatives for the driver and WPA

2008-06-05 Thread Jan Hubicka
Hi, I am jumping in somewhat late, as yesterday I was in meetings without internet access. (and I probably will be offline again tomorrow) I think that in basic terms we all mostly agree (we want to implement an optimization scheme that does not get everything into memory, we want to parallelize the

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread Jan Hubicka
1. Extend the register save area to put upper 128bit at the end. Pros: Aligned access. Save stack space if 256bit registers are used. Cons Split access. Require more split access beyond 256bit. 2. Extend the register save area to put full 256bit YMMs at the end. The

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread Jan Hubicka
ymm0 and xmm0 are the same register. xmm0 is the lower 128bit of ymm0. I am not sure if we need separate XMM registers from YMM registers. Yes, I know that xmm0 is lower part of ymm0. I still think we ought to be able to support varargs that do save ymm0 registers only when ymm values are

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-09 Thread Jan Hubicka
On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote: On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote: ymm0 and xmm0 are the same register. xmm0 is the lower 128bit of ymm0. I am not sure if we need separate XMM registers from YMM registers. Yes, I

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
I don't understand why you want to pass __m256 and 256-bit vector values to anonymous arguments in registers. The only thing the vararg functions would do with it would be save it somewhere on the stack. Given the x86_64 ABI, you can't expect calling an implicitly prototyped or non-vararg

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek [EMAIL PROTECTED] wrote: On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: 1) make __m256 passed on stack on variadic functions and in registers otherwise. Then we don't need to worry about varargs changes at all. This will break

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-15 Thread Jan Hubicka
On Wed, Jun 11, 2008 at 07:49:12AM -0700, H.J. Lu wrote: I guess we all agree on passing variadic arguments on stack (that is only those belonging on ...) and rest in registers. It seems easiest in regard to future register set extensions too. Only negative thing is that calls to

Re: newlib libgcov

2008-06-15 Thread Jan Hubicka
Hello, In our GCC porting, we use newlib instead of libc. Today I tried to use profiling feedback based optimization with option -fprofile-arcs. But the executable doesn't produce .gcda file. I examined the disassembled binary file and found the following functions are basically just dummy

Re: Results for 4.4.0 20080618 (experimental) (GCC) testsuite on i686-pc-linux-gnu

2008-06-18 Thread Jan Hubicka
FAIL: gcc.dg/weak/weak-6.c (test for errors, line 5) FAIL: gcc.dg/weak/weak-6.c (test for excess errors) FAIL: gcc.dg/weak/weak-7.c (test for errors, line 5) FAIL: gcc.dg/weak/weak-7.c (test for excess errors) These look like they were caused by one of your patches. Yes, they get

Re: Merging tuples branch into mainline today

2008-07-25 Thread Jan Hubicka
I think that someone, though, should be committed to fixing this pass ASAP after it's checked in; waiting until late August to fix it seems bad. Is there someone else who can commit to working on it as a high priority after the main tuples checkin? I would obviously vote in favor of

Re: Merging tuples branch into mainline today

2008-07-25 Thread Jan Hubicka
Jan Hubicka wrote: So while the passes are probably now well in benchmark toy category and they will need many changes to be useful in general, I think it is good to have something we can test the framework at. Do these passes actually help on benchmarks? Yes, those help significantly

Re: [tuples] New memory/time comparison vs trunk

2008-07-27 Thread Jan Hubicka
- The rest of the memory utilization difference is mostly in inlining (240Kb) and SSA update (50Kb). I think the main focus points should be DSE and trying to get a good way of measuring the memory utilization differences. Jan, any suggestion? I've switched memory tester to tuples now.

Re: Recent warning regression: no return statement in function returning non-void

2008-07-27 Thread Jan Hubicka
On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer [EMAIL PROTECTED] wrote: I believe the following happened in the last 48 or so hours; I saw this triggered by my nightly Wine builds which in turn use my nightly GCC builds. ;-) For code like the following where we have an infinite loop in

Enabling IPCP by default

2008-08-24 Thread Jan Hubicka
Hi, Since most of the issues with IPCP should be fixed now and it should be as strong as possible with the elementary textbook quality algorithm it uses, I would like to enable it by default. I've tested it on SPEC and C++ benchmarks yesterday and didn't measure any significant improvements. There is

Re: Enabling IPCP by default

2008-08-24 Thread Jan Hubicka
Jan Hubicka [EMAIL PROTECTED] writes: If there are no complaints, I will enable ipcp as proposed after remaining patches are tested and committed (that would be about day after tomorrow) It breaks Ada on ia64: I was hitting same problem on x86_64 and it should be fixed now. Honza /tmp

Re: Enabling IPCP by default

2008-08-28 Thread Jan Hubicka
Hi, after IRA I've re-done x86-64 SPECint testing (SPECfp, CSiBE and C++ benchmark failed because tree was broken at that point, I will get results tomorrow, but there was no surprises already before) also with the new code to eliminate arguments. Luis also did PPC SPEC runs. The most important

Re: Enabling IPCP by default

2008-08-29 Thread Jan Hubicka
Hi, tonight testing on x86_64, i386 and IA-64 didn't seem to bring any new surprises, so I've committed the following patch. I will also update changes page of 4.4. * doc/invoke.texi (-fipa-cp): Enabled by default at -O2/-Os/-O3 (-fipa-cp-clone): Enabled by default at -O3.

Re: Enabling IPCP by default

2008-08-29 Thread Jan Hubicka
On Fri, 29 Aug 2008, Jan Hubicka wrote: Hi, tonight testing on x86_64, i386 and IA-64 didn't seem to bring any new surprises, so I've committed the following patch. I will also update changes page of 4.4. * doc/invoke.texi (-fipa-cp): Enabled by default at -O2/-Os/-O3

Re: Enabling IPCP by default

2008-08-29 Thread Jan Hubicka
On Fri, 29 Aug 2008, Jan Hubicka wrote: Hi, tonight testing on x86_64, i386 and IA-64 didn't seem to bring any new surprises, so I've committed the following patch. I will also update changes page of 4.4. * doc/invoke.texi (-fipa-cp): Enabled by default at -O2/-Os/-O3

New branch: pretty-ipa

2008-11-12 Thread Jan Hubicka
Hi, with LTO getting closer it is obvious that IPA infrastructure needs work and also is getting more interesting ;) I don't think it makes sense to do all the work on LTO branch that contains a lot of temporary stuff, so I've created pretty-ipa branch that unlike LTO branch is targeted to merge

Re: Line insn notes in modulo-sched

2006-03-13 Thread Jan Hubicka
Hi Ayal, The SMS implementation in GCC, in modulo-sched.c, uses line notes to find insn locations, see find_line_note. Why are you using line notes instead of insn locators? Line notes are on the list of Things That Should Not Be, and insn locators replace them. Is there a reason for

Re: [RFC] IL cleanups

2006-05-04 Thread Jan Hubicka
Hi, nice that you are going to look into it. I am quite interested to help here as you can probably guess ;) The overall plan looks good to me. (and is pretty compatible with what I believe is needed) There are a lot of details, however Anything else I may have missed? There are other

Re: Multiple calls to __gcov_init

2006-05-04 Thread Jan Hubicka
On Tue, Apr 25, 2006 at 03:05:26PM +0200, Richard Guenther wrote: On 4/25/06, Momchil Velikov [EMAIL PROTECTED] wrote: Why does GCC emit multiple calls to __gcov_init, via multiple (two) entries in the ctors table? For example int foo () { return 0; } compiled with gcc -S

Re: [RFC] IL cleanups

2006-05-04 Thread Jan Hubicka
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Jan Hubicka wrote on 05/04/06 08:36: If you are interested in some sort of integration of changes in IPA branch (IE whole program in SSA form), I can probably prepare sort of merge patches for review (pretty much as I intend to finally do

Re: Why does __float80 depend on -mmmx/-msse?

2006-06-27 Thread Jan Hubicka
On the other hand, as far as I can see, __float80 is undocumented and unused for the i386. Why does it exist? Jan added it with __float128 also: 2003-10-30 Jan Hubicka [EMAIL PROTECTED] (ix86_init_mmx_sse_builtins): Add __float80, __float128. I think

Patch for PR15832 and PR28071

2006-07-26 Thread Jan Hubicka
Vladimir, I've run into a problem with your patch for PR15832 on the testcase PR rtl-optimization/28071. Bug2.c compiled with -O3 -fno-tree-pre -fno-tree-fre it needs about 200MB extra memory for bitmaps because the bitmap stack_regs ends up very dense and it is copied into all the pavins. (I have

Re: sorry, unimplemented: 64-bit mode not compiled in - ?!

2006-07-28 Thread Jan Hubicka
On Thu, Jul 27, 2006 at 12:56:14PM +0200, Denis Vlasenko wrote: does it mean I need a cross-compiler (to x86_64) to use -m64? It's strange because then -m64 is not useful at all - x86_64 cross compiler defaults to 64 bit anyway... right? It overrides -m32 earlier on the command line.

Re: Incorrect application of loop exit heuristic?

2006-08-08 Thread Jan Hubicka
I was looking in to a degradation for perlbmk on PowerPC and tracked it down to a mispredicted branch within a loop ( if (...) return 0; within the loop). GCC is statically predicting the loop exit as not taken bne-, but it is obviously being taken the greatest share of the time because

Re: Incorrect application of loop exit heuristic?

2006-08-17 Thread Jan Hubicka
Pat Haugen [EMAIL PROTECTED] wrote on 08/08/2006 11:07:58 AM: Jan Hubicka [EMAIL PROTECTED] wrote on 08/08/2006 01:04:33 AM: The code there is basically avoiding loops with many exits to be predicted to not loop at all (ie if you have 10 exits, having every exit with 10
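The concern about loops with many exits can be illustrated with a little arithmetic. This is only a sketch under a simplifying assumption that is not taken from the thread: each exit is treated as independent and taken with the same probability. The function name `loop_stay_probability` is hypothetical, and the per-exit probability is a free parameter because the quoted message is cut off before it states one.

```c
/* Probability that one iteration takes none of n_exits exits, assuming
   (hypothetically) that every exit is independent and is taken with
   probability p_exit.  This is an illustration, not predict.c logic.  */
double
loop_stay_probability (double p_exit, int n_exits)
{
  double stay = 1.0;
  for (int i = 0; i < n_exits; i++)
    stay *= 1.0 - p_exit;
  return stay;
}
```

Under that model, a fixed per-exit probability makes the chance of another iteration shrink quickly as the number of exits grows, which is the effect the heuristic is described as avoiding.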

Re: Incorrect application of loop exit heuristic?

2006-08-20 Thread Jan Hubicka
Pat Haugen [EMAIL PROTECTED] wrote on 08/08/2006 11:07:58 AM: Jan Hubicka [EMAIL PROTECTED] wrote on 08/08/2006 01:04:33 AM: The code there is basically avoiding loops with many exits to be predicted to not loop at all (ie if you have 10 exits, having every exit

Re: mismatched parentheses in reload1.c

2006-08-21 Thread Jan Hubicka
This patch: r116277 | hubicka | 2006-08-21 02:00:14 +0200 (Mon, 21 Aug 2006) | 6 lines PR rtl-optimization/28071 * reload1.c (reg_has_output_reload): Turn into regset. (reload_as_needed,

Re: Trunk bootstrap failure on Linux/x86_64 in reload1.c

2006-08-21 Thread Jan Hubicka
On Mon, Aug 21, 2006 at 05:25:09PM -0400, Andrew Pinski wrote: On Aug 21, 2006, at 11:59 AM, Andreas Jaeger wrote: Trunk fails to build for me with: Maybe related (from http://gcc.gnu.org/regtest/HEAD/): 2006-08-16T23:25:59Z 2006-08-17T14:40:57Z pass native 116195

Re: Trunk bootstrap failure on Linux/x86_64 in reload1.c

2006-08-22 Thread Jan Hubicka
Jan Hubicka [EMAIL PROTECTED] writes: On Mon, Aug 21, 2006 at 05:25:09PM -0400, Andrew Pinski wrote: On Aug 21, 2006, at 11:59 AM, Andreas Jaeger wrote: Trunk fails to build for me with: Maybe related (from http://gcc.gnu.org/regtest/HEAD/): 2006-08-16T23:25

Re: Incorrect application of loop exit heuristic?

2006-08-23 Thread Jan Hubicka
Pat Haugen [EMAIL PROTECTED] wrote on 08/21/2006 01:22:25 PM: Jan Hubicka [EMAIL PROTECTED] wrote on 08/19/2006 07:51:42 PM: Hi, this patch at least hides the ugly details within some abstraction so we can eventually go for propagating reliability information across CFG

Re: mismatched parentheses in reload1.c

2006-08-24 Thread Jan Hubicka
) --- ChangeLog (working copy) *** *** 1,5 --- 1,9 2006-08-24 Jan Hubicka [EMAIL PROTECTED] + * reload1.c (emit_reload_insns): Fix yet another typo in my patch. + + 2006-08-24 Jan Hubicka [EMAIL PROTECTED] + PR debug/26881 * cgraph.c: Fix

Re: IPA branch

2006-09-25 Thread Jan Hubicka
Jan -- I'm trying to plan for GCC 4.3 Stage 1. The IPA branch project is clearly a good thing, and you've been working on it for a long time, so I'd really like to get it into GCC 4.3. However, I'm a little concerned, in reading the project description, that it's not all that far

Re: on removal of line number notes at the end of BBs

2006-10-12 Thread Jan Hubicka
On Oct 11, 2006, Ian Lance Taylor [EMAIL PROTECTED] wrote: int x; int f() { x = 0; while(1); } We get line number notes for code only up to x = 0;. I assume this is only a problem when not optimizing. The opposite, actually. It's optimization that breaks it. Of course

Memory usage of 4.2 versus 4.3 (at branchpoints)

2006-10-21 Thread Jan Hubicka
Hi, to give some perspective to the discussion on memory usage, I generated comparison of 4.2 branchpoint to 4.3 branchpoint from logs of our memory tester. I would say it is quite pleasing to see that 4.3 is not really a regression relative to 4.2 in most tests, as was the custom in previous

Re: compiling very large functions.

2006-11-05 Thread Jan Hubicka
On 11/4/06, Kenneth Zadeck [EMAIL PROTECTED] wrote: Richard Guenther wrote: On 11/4/06, Kenneth Zadeck [EMAIL PROTECTED] wrote: Richard Guenther wrote: On 11/4/06, Kenneth Zadeck [EMAIL PROTECTED] wrote: I think that it is time that we in the GCC community took some time to

Re: compiling very large functions.

2006-11-07 Thread Jan Hubicka
Brooks Moses wrote on 11/06/06 17:41: Is there a need for any fine-grained control on this knob, though, or would it be sufficient to add an -O4 option that's equivalent to -O3 but with no optimization throttling? We need to distinguish two orthogonal issues here: effort and enabled

Re: Whole program optimization and functions-only-called-once.

2009-11-12 Thread Jan Hubicka
On Wed, Nov 4, 2009 at 8:19 PM, Toon Moene t...@moene.org wrote: Jan, I had some time to study the example I sent you a couple of weeks ago. According to visible inspection of the source code, there are 5 functions (subroutines in Fortran parlance) that are called once: MAIN  

Re: is LTO aimed for large programs?

2009-11-12 Thread Jan Hubicka
Perhaps the question is when not to use -flto and use -fwhopr instead? My rule of thumb is: Try -flto first, if it does not work (running out of memory), try -fwhopr. I think the advantage of -flto is also that it is better tested, while -fwhopr has known issues. -fwhopr is quite broken in

Re: Whole program optimization and functions-only-called-once.

2009-11-12 Thread Jan Hubicka
On Wed, Nov 4, 2009 at 1:20 PM, Toon Moene t...@moene.org wrote: You don't happen to recall the bug number ? It might be related to PR 41735 which I noticed when looking at the generated assembly and trying to compare 4.5 to 4.4. I fixed this bug today, so it might help. But it is related

Re: Build broken in libstdc++ on x86_64-linux

2009-11-12 Thread Jan Hubicka
Hi, the build is currently, ie 154122, broken in libstdc++-v3: ./src/system_error.cc:95:1: internal compiler error: Segmentation fault Version 154120 works fine for me. I am testing patch for that still. The current version is (updated per Joseph's comment about COMDAT making

Re: Build broken in libstdc++ on x86_64-linux

2009-11-12 Thread Jan Hubicka
Jan Hubicka wrote: I am testing patch for that still. The current version is (updated per Joseph's comment about COMDAT making sense on !PUBLIC functions). Thanks Honza, I just built successfully r154128 Note that there are still testsuite regressions found by the sanity check

Re: Whole program optimization and functions-only-called-once.

2009-11-12 Thread Jan Hubicka
Hi, this is WIP patch to deal with the unreachable clones problem. It basically renders the clones as unanalyzed cgraph nodes (but with still body in) so IPA passes don't see them. Honza Index: cgraph.c === --- cgraph.c

Re: aliases without a _DECL?

2010-02-05 Thread Jan Hubicka
Hi, I have no idea what you would like to achieve by this? I assume that you want to add aliases to given declaration without actually creating alias DECLs, just assembler symbol names. But without the DECLs there would be absolutely no way to refer to these within current unit, so I guess

Re: aliases without a _DECL?

2010-02-05 Thread Jan Hubicka
On 02/05/2010 05:56 PM, Jan Hubicka wrote: But without the DECLs there would be absolutely no way to refer to these within current unit, so I guess cgraph don't need to care about them much (i.e. they can just be some list assigned to node or decl). Right, they would just be for binary

Re: Peculiar XPASS of gcc.dg/guality/inline-params.c

2010-03-29 Thread Jan Hubicka
Hi, I have run the testcase with the early inliner disabled and noticed that gcc.dg/guality/inline-params.c XPASSes with early inlining and XFAILs without it. The reason for the (expected) failure is that IPA-CP removes a parameter which is constant (but also unused?). I reckon this is

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
It was. Unfortunately, work on it stopped last year and it is unlikely that I will be assigned to this again. I still have some personal interest on the feature, but given time restrictions, we should make contingency plans. Perhaps the easiest option is to remove the feature. WHOPR does

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
Well, I think this is independent. It makes a lot of sense to make profiling to work in a way so instrumentation happens at linktime with LTO and we can read stuff back. This is relatively easy to do: we need to rewrite profiling pass to work on SSA (that is easy and desirable anyway and on

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
On 4/8/10 14:10 , Jan Hubicka wrote: So I think tying WHOPR and profile feedback too close together is a mistake. Sorry, I didn't mean that. My intent is to make whopr/lto use profiling information if it is available. Much like we do with other optimization decisions

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
On 4/8/10 14:10 , Jan Hubicka wrote: So I think tying WHOPR and profile feedback too close together is a mistake. Sorry, I didn't mean that. My intent is to make whopr/lto use profiling information if it is available. Much like we do with other optimization decisions

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
On 4/8/10 14:30 , Jan Hubicka wrote: On 4/8/10 14:10 , Jan Hubicka wrote: So I think tying WHOPR and profile feedback too close together is a mistake. Sorry, I didn't mean that. My intent is to make whopr/lto use profiling information if it is available. Much like we do

Re: WHOPR bootstrap, when/how?

2010-04-09 Thread Jan Hubicka
On Thu, 8 Apr 2010, Jan Hubicka wrote: :) We need debug info and hammer out all bugs of course! I would also like to see possibility to LTO bootstrap without gold and possibility to not generate assembly into LTO .o files. In the typical use where one builds app with LTO

Re: WHOPR bootstrap, when/how?

2010-04-09 Thread Jan Hubicka
On Fri, 9 Apr 2010, Jan Hubicka wrote: On Thu, 8 Apr 2010, Jan Hubicka wrote: :) We need debug info and hammer out all bugs of course! I would also like to see possibility to LTO bootstrap without gold and possibility to not generate assembly into LTO .o files

Re: branch probabilities on multiway branches

2010-04-13 Thread Jan Hubicka
Hi All, The following bit of code in predict.c implies branch probabilities are strictly evenly distributed for multiway branches at present. The comment suggests it is possible to generate better estimates for more generic cases, apart from being involved. Could anyone point me to the

Re: branch probabilities on multiway branches

2010-04-15 Thread Jan Hubicka
On Thu, Apr 15, 2010 at 1:11 PM, Rahul Kharche ra...@icerasemi.com wrote: The calculate branch probabilities algorithm (1) in the Wu Larus paper also evenly distributes branch probabilities when number of outgoing edges is 2, e.g. switch cases implemented as jump tables. Are they any

Re: LTO question

2010-04-28 Thread Jan Hubicka
On 4/28/10 10:26 , Manuel López-Ibáñez wrote: Not yet, I mistakenly thought -fwhole-program is the same as -fwhopr and it is just for solving scaling issue of large program.(These two options do look similar :-). I shall try next. Yep, -fwhopr is not ideal name, but I guess there is not

Re: LTO question

2010-04-29 Thread Jan Hubicka
2010/4/29 Jan Hubicka hubi...@ucw.cz: On 4/28/10 10:26 , Manuel López-Ibáñez wrote: Not yet, I mistakenly thought -fwhole-program is the same as -fwhopr and it is just for solving scaling issue of large program.(These two options do look similar :-). I shall try next. Yep

Re: LTO vs static library archives [was Re: lto1: internal compiler error: in lto_symtab_merge_decls_1, at lto-symtab.c:549]

2010-04-29 Thread Jan Hubicka
Well, we'd then need to re-architect the symbol merging and LTO unit read-in to properly honor linking semantics (drop a LTO unit from an archive if it doesn't resolve any unresolved symbols). I don't know how easy that will be, but it shouldn't be impossible at least. We also should keep

Re: LTO vs static library archives [was Re: lto1: internal compiler error: in lto_symtab_merge_decls_1, at lto-symtab.c:549]

2010-04-29 Thread Jan Hubicka
2010/4/29 Jan Hubicka hubi...@ucw.cz: Well, we'd then need to re-architect the symbol merging and LTO unit read-in to properly honor linking semantics (drop a LTO unit from an archive if it doesn't resolve any unresolved symbols).  I don't know how easy that will be, but it shouldn't

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
GCC-4.5.0 and LLVM-2.7 were released recently. To understand where we stand after releasing GCC-4.5.0 I benchmarked it on SPEC2000 for x86/x86-64 and posted the comparison of it with the previous GCC releases and LLVM-2.7. Even benchmarking SPEC2000 takes a lot of time on the fastest

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
Thanks for the comments. FDO will probably improve SPEC2000 score. Although it is not obvious for some tests because the train data sets for them are different from the reference data sets and it might actually mislead the compiler. There are several studies on the topic and it is

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
BTW we are also tracking SPEC2k6 with and without LTO (not FDO runs) http://gcc.opensuse.org/SPEC/CINT/sb-barbella.suse.de-ai-64/recent.html http://gcc.opensuse.org/SPEC/CINT/sb-barbella.suse.de-head-64-2006/recent.html not all 2k6 tests pass with LTO so it will need a bit care to compare

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
for that yet (well, hoping that submitting the thesis will make this easier). What are the LIPO's features that are missing in -flto -fprofile-use? Honza David On Thu, Apr 29, 2010 at 2:38 PM, Steven Bosscher stevenb@gmail.com wrote: On Thu, Apr 29, 2010 at 11:27 PM, Jan Hubicka hubi

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
2010/4/30 Jan Hubicka hubi...@ucw.cz: Thanks for the suggestion. Raksit currently is busy with merging trunk changes back to lw-ipo branch which can be a daunting task. After that this can be done.  (Our internal release is based on 4.4). I must say that LIPO is something I always

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-30 Thread Jan Hubicka
In theory, LIPO should not generate better results than LTO+FDO. What makes LIPO attractive is that it allows distributed build from the beginning. Its integration with large distributed build system is also easy. Another point is that LIPO can be decoupled from FDO as well. The integration

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-30 Thread Jan Hubicka
Interesting.  My plan for profiling with LTO is to ultimately make it linktime transform.  This will be more difficult with WHOPR (i.e. instrumenting need function bodies that are not available at WPA time), but I believe it is solvable: just assign uids to the edges and do

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-05-01 Thread Jan Hubicka
Vortex needs -fno-strict-aliasing. It casts between two record types with one record being a 'prefix' of another. So today's runs are complete. Thanks to Richi who fixed ICE in symtab merging that affected perl and GCC. With vortex problem was that in addition to -fno-strict-aliasing it is
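The 'prefix' cast mentioned in this message can be pictured with a small sketch. The record types and the function `record_kind` below are hypothetical and are not taken from Vortex; they only show the general shape of reading a leading field through a shorter record type.

```c
/* Hypothetical record types: struct full_rec starts with the same
   member as struct head_rec, so head_rec is a 'prefix' of full_rec.
   These names are illustrative and do not come from Vortex.  */
struct head_rec { int kind; };
struct full_rec { int kind; int payload; };

/* Read the shared leading field through the shorter type.  Type-based
   alias analysis may treat accesses through the two record types as
   unrelated, which is the usual reason for building such code with
   -fno-strict-aliasing.  */
int
record_kind (struct full_rec *r)
{
  struct head_rec *h = (struct head_rec *) r;
  return h->kind;
}
```

Whether a given build actually misbehaves without the flag depends on the optimizations applied; the sketch shows the pattern, not a guaranteed miscompilation.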

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-05-02 Thread Jan Hubicka
On Sat, May 1, 2010 at 2:36 AM, Jan Hubicka hubi...@ucw.cz wrote: Vortex needs -fno-strict-aliasing.  It casts between two record types with one record being a 'prefix' of another. So today's runs are complete.  Thanks to Richi who fixed ICE in symtab merging that affected perl

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-05-04 Thread Jan Hubicka
On Sun, May 2, 2010 at 6:45 AM, Jan Hubicka hubi...@ucw.cz wrote: That depends. The following cases exist in vortex: 1) the value is runtime constant -- it is read from input file but never changed -- e.g.: QueBug. Nothing can be done by the compiler in this case; 2) Global variable

Re: GIMPLE types merging in LTO compiler

2010-05-14 Thread Jan Hubicka
On Fri, May 14, 2010 at 9:33 PM, Eric Botcazou ebotca...@adacore.com wrote: Ugh.  This presents a chicken-and-egg problem to symbol resolution and type-merging. To be clear, the issue is sth like unit1 - int size; int a[size]; unit2 -- extern int size; extern

Re: Does `-fwhole-program' make sense when compiling shared libraries?

2010-05-17 Thread Jan Hubicka
On Mon, May 17, 2010 at 10:57:31AM -0700, Toon Moene wrote: On 05/17/2010 08:08 PM, Dave Korn wrote: Hi! PR42904 is a bug where, when compiling a windows DLL using -fwhole-program, the compiler optimises away the entire library body, because there's no dependency

Re: Does `-fwhole-program' make sense when compiling shared libraries?

2010-05-18 Thread Jan Hubicka
[ hmf. This one got lost to an smtp error when I sent it yesterday. It appears there's more or less agreement that at the moment you're supposed to manually annotate all external entry points if you want to use -fwhole-program on a library. On windows, where we often do that anyway, it

Re: Where does the time go?

2010-05-21 Thread Jan Hubicka
On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li davi...@google.com wrote: On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher stevenb@gmail.com wrote: On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li davi...@google.com wrote: stack variable overlay and stack slot assignments

Re: Where does the time go?

2010-05-21 Thread Jan Hubicka
2010/5/21 Jan Hubicka hubi...@ucw.cz: On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li davi...@google.com wrote: On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher stevenb@gmail.com wrote: On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li davi...@google.com wrote

Re: externally_visible and resoultion file

2010-05-27 Thread Jan Hubicka
On Wed, May 26, 2010 at 5:53 PM, Bingfeng Mei b...@broadcom.com wrote: Hi, Richard, With resolution file generated by GOLD (or I am going to hack gnu LD),  is externally_visible attribute still needed to annotate those symbols accessed from non-LTO objects when compiling with

Re: gcc compilation broken with --enable-checking=release

2010-05-27 Thread Jan Hubicka
Hi, I've committed the following fix. * cgraph.h (struct cgraph_node): Mark former_clone_of by GTY ((skip)). * cgraphunit.c (clone_of_p): Compile only when checking is enabled. Index: cgraph.h === *** cgraph.h

Unused variables and functions and missing const decls in cc1 binary

2010-05-29 Thread Jan Hubicka
Hi, I do not have time to poke too much about this, but with whole-program build it is easy to see what functions end up being unused in final cc1 binary. Not all of those are unnecessary (and some are for future use, for debugging or used by other binaries), but it might serve as a guideline to

Re: Issue with LTO/-fwhole-program

2010-06-11 Thread Jan Hubicka
Ah, so the problem is the missing -flto in the second compilation step? I think this is a bug in the compiler for not reporting this somehow. Is there are PR open for this? Compiler can not report it because it does not see the other object files. It is really up to user to understand

Re: Issue with LTO/-fwhole-program

2010-06-11 Thread Jan Hubicka
On 11/06/2010 14:26, Jan Hubicka wrote: Perhaps we can somehow poison the object names that are brought local with -fwhole-program so linking explode, but I am not sure there is way to do so. Could emit warning symbols, but, like the others, I don't see why collect2 can't spot

Re: [RFC] Cleaning up the pass manager

2010-06-15 Thread Jan Hubicka
I have been thinking about doing some cleanups to the pass manager. The goal would be to have the pass manager be the central driver of every action done by the compiler. In particular, the front ends should make use of it and the callgraph manager, instead of the twisted interactions we

Re: Massive performance regression from switching to gcc 4.5

2010-06-25 Thread Jan Hubicka
On Fri, Jun 25, 2010 at 8:15 AM, Jonathan Adamczewski jadam...@utas.edu.au wrote: On 25/06/10 06:39, Richard Guenther wrote: There are btw. some bugs wrt accounting of functions called once being inlined in 4.5 which were fixed on trunk which allow extra inlining. Are these changes
