Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
What do you think of the following plan for turning cgraph into a class hierarchy? We cannot finish it until we have gengtype understanding single inheritance, but we can start changing APIs in preparation. Good you told me, I was about to try that myself. I did not know gengtype does not

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
On 9/5/12, Xinliang David Li davi...@google.com wrote: On Sep 5, 2012 Jan Hubicka hubi...@ucw.cz wrote: OK, the basic idea is that symtab_node is base type of cgraph_node and varpool_node. We may want to drop the historical cgraph/varpool names here, since function_node/variable_node

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
The cgraph redesign probably deserves more discussion. 1) It may be worthwhile to abstract the graph manipulation code into a utility class which is templatized. graphT, nodeT with node inheriting from T. 2) Introduce a global symbol table containing a function table and a global

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
On 9/5/12, Xinliang David Li davi...@google.com wrote: On Sep 5, 2012 Jan Hubicka hubi...@ucw.cz wrote: OK, the basic idea is that symtab_node is base type of cgraph_node and varpool_node. We may want to drop the historical cgraph/varpool names here, since function_node

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
On Wed, Sep 5, 2012 at 5:41 PM, Lawrence Crowl cr...@googlers.com wrote: On 9/5/12, Xinliang David Li davi...@google.com wrote: On Sep 5, 2012 Jan Hubicka hubi...@ucw.cz wrote: OK, the basic idea is that symtab_node is base type of cgraph_node and varpool_node. We may want to drop

Re: Cgraph Modification Plan

2012-09-06 Thread Jan Hubicka
Areas that are confusing and need clean up (IMO) include: 1) handling of aliases and clones I am slowly cleaning up alias stuff, it had major reorg in 4.7 and further cleanups in 4.8. Do you have more specific suggestions? 2) reachability, needed, analyzed bits. The needed bit is not in

Re: Cgraph Modification Plan

2012-09-07 Thread Jan Hubicka
Sorry to interrupt here, but please finish the existing partial C++ transitions instead of starting to work on new ones. Current stage1 will not last forever (stage1 is usually 6 months, so its natural end would be end of September). I'd rather have the current transition to a symbol table

Re: Libtool update for gcc-4.8 (slim-lto bootstrap)?

2012-09-11 Thread Jan Hubicka
Is there any interest in updating the in-tree libtool to something newer? This update would allow to use a -fno-fat-lto-objects lto-bootstrap target, that should speed up the (lto) build time. If there is interest, when would be the best date for such an update? There is definitely an

Re: Cgraph Modification Plan

2012-09-12 Thread Jan Hubicka
We do not yet seem to have consensus on a long term plan. Would it be reasonable to start on short term preparatory work? In particular, I was thinking we could do Add converters and testers. Change callers to use those. and maybe Change callers to use type-safe parameters.
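The hierarchy discussed in this thread (symtab_node as the base type of cgraph_node and varpool_node, plus "converters and testers") can be illustrated with a minimal sketch. This is not GCC's actual code: the `symtab_kind` enum, the `name` field, and the helper names `is_function`, `is_variable`, `try_function`, and `try_variable` are hypothetical, chosen only to show the shape of the idea.

```cpp
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-ins for the symbol table nodes.
// Only the three type names come from the thread; everything else is assumed.
enum class symtab_kind { function, variable };

struct symtab_node {
    symtab_kind kind;
    std::string name;
protected:
    // Only derived node types may be constructed.
    symtab_node(symtab_kind k, std::string n) : kind(k), name(std::move(n)) {}
};

struct cgraph_node : symtab_node {
    explicit cgraph_node(std::string n)
        : symtab_node(symtab_kind::function, std::move(n)) {}
};

struct varpool_node : symtab_node {
    explicit varpool_node(std::string n)
        : symtab_node(symtab_kind::variable, std::move(n)) {}
};

// "Testers": ask what a generic node really is.
inline bool is_function(const symtab_node* n) {
    return n != nullptr && n->kind == symtab_kind::function;
}
inline bool is_variable(const symtab_node* n) {
    return n != nullptr && n->kind == symtab_kind::variable;
}

// "Converters": checked downcasts that return nullptr on a kind mismatch,
// so callers never reinterpret a node as the wrong derived type.
inline cgraph_node* try_function(symtab_node* n) {
    return is_function(n) ? static_cast<cgraph_node*>(n) : nullptr;
}
inline varpool_node* try_variable(symtab_node* n) {
    return is_variable(n) ? static_cast<varpool_node*>(n) : nullptr;
}
```

Callers that take a `cgraph_node*` instead of a generic `symtab_node*` would then get the "type-safe parameters" mentioned above checked at compile time, with the converters confined to the places where the kind is genuinely unknown.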

Re: g++.dg/tree-ssa/pr45453.C time out

2012-10-31 Thread Jan Hubicka
On Wed, Oct 31, 2012 at 3:17 PM, Paolo Carlini paolo.carl...@oracle.com wrote: Hi, whoever a few days ago or so broke this test, can please either fix the testcase, the compiler or just xfail for now the testcase itself, to avoid everybody the waste of time? If you want me to do

Re: g++.dg/tree-ssa/pr45453.C time out

2012-10-31 Thread Jan Hubicka
On Wed, Oct 31, 2012 at 3:17 PM, Paolo Carlini paolo.carl...@oracle.com wrote: Hi, whoever a few days ago or so broke this test, can please either fix the testcase, the compiler or just xfail for now the testcase itself, to avoid everybody the waste of time? If you want

Re: RFC: [ARM] Disable peeling

2012-12-10 Thread Jan Hubicka
I agree that this is a sledgehammer. If aligned/unaligned loads/stores have the same cost then reflect that in the vectorized stmt cost hook. If that alone does not prevent peeling for alignment to happen then the fix is to not consider doing peeling for alignment if aligned/unaligned

Re: Identical basic blocks live long in RTL flow.

2013-01-16 Thread Jan Hubicka
Basic blocks 8/9/10 are identical and live until pass jump2, which is after register allocation. I think these duplicated BBs do not contain additional information and should be better to be removed ASAP, because they might interfere with other passes like ifcvt. So should this issue be

Re: System V Application Binary Interface 0.99.5

2013-01-31 Thread Jan Hubicka
Well, it's hardly an optimization if it's incorrect, and it seems to be incorrect. As the old saying goes, I can make your code infinitely fast if you don't care about the results. It's incorrect to rely on the extension taking place. It's not incorrect to do the extension. The

Re: System V Application Binary Interface 0.99.5

2013-01-31 Thread Jan Hubicka
On 01/30/2013 04:49 PM, Michael Matz wrote: Hmm? GCC generates code that doesn't rely on the extension taking place. Sure, I didn't mean to suggest it was: it's LLVM that's incorrect. Yes, that is an LLVM bug. I am surprised that it went unnoticed for so long, but I guess it is difficult to

Re: System V Application Binary Interface 0.99.5

2013-02-03 Thread Jan Hubicka
On 02/01/2013 12:38 AM, Jan Hubicka wrote: Doing the extensions at caller side always is however IMO a performance bug in GCC. We can definitely drop them at -Os, for non-PRS targets and for calls within compilation unit where we know that GCC is not really producing code like

Re: make_decl_one_only and inlining

2013-02-17 Thread Jan Hubicka
Hi, We recently got a bug report for the GCC D compiler frontend which shows that we currently don't inline any templated functions. The reason seems to be that decl_replaceable_p always returns true for D template functions. We currently just mark such template function instances using

Re: make_decl_one_only and inlining

2013-02-17 Thread Jan Hubicka
Set DECL_COMDAT. You said that didn't work but you didn't fully explain why. A DECL_COMDAT function should be output in every object file in which it is referenced. I wasn't sure if that's the correct approach. If it is, some further investigation will be necessary why it doesn't

Re: Compiler speed (vanilla vs. LTO, PGO and LTO+PGO)

2013-03-26 Thread Jan Hubicka
Yes, the binary size is 8-10% smaller. Unfortunately there are no performance improvements. LTO+PGO-disable-plugin: -rwxr-xr-x 1 markus markus 15025568 Mar 25 15:49 cc1 -rwxr-xr-x 1 markus markus 16198584 Mar 25 15:49 cc1plus -rwxr-xr-x 1 markus markus 13907328 Mar 25 15:49 lto1

Re: Compiler speed (vanilla vs. LTO, PGO and LTO+PGO)

2013-03-28 Thread Jan Hubicka
). Honza Index: ChangeLog === --- ChangeLog (revision 197205) +++ ChangeLog (working copy) @@ -1,5 +1,9 @@ 2013-03-28 Jan Hubicka j...@suse.cz + * lto-cgraph.c (merge_profile_summaries): Fix overflows. + +2013-03-28 Jan

Re: pure/const function attribute and memoization

2013-05-18 Thread Jan Hubicka
On 05/15/2013 11:01 AM, Richard Biener wrote: Now - if there would ever be an architecture where special call-site preparation is required for a callee to write to global memory then marking a function 'const' when it does in fact write to global memory then GCC may choose to optimize the

Re: pure/const function attribute and memoization

2013-05-20 Thread Jan Hubicka
http://sourceware.org/ml/libc-alpha/2013-05/msg00389.html The function is in glibc's math/atest-exp2.c file. I see, I was curious what made LLVM developers to implement the feature about making memory writes unreachable. While I see wild interpretation of the documentation allows it, I

Re: Branch prediction

2013-05-30 Thread Jan Hubicka
Is there any documentation for what gcc does with branch prediction information it gets from profiling? I am interested in this for modern Pentium processors where you can no longer give hints. The profile feedback drives optimizations (i.e. decision what to optimize for speed and what for

Re: What's up with g++.dg/ext/mv*.C?

2013-06-13 Thread Jan Hubicka
On 06/13/2013 12:35 PM, Paolo Carlini wrote: On 06/13/2013 12:28 PM, Paolo Carlini wrote: Hi, these FAILs are much more recent but frankly I'm also puzzled: is a fix actively in the making? Do we have any sort of time for that? This is PR57548 and a patch is approved but unapplied:

Re: Generate coverage informations in different sections.

2013-06-17 Thread Jan Hubicka
Hi, I'm a Xen developer. We have coverage support (lcov replacement) in order to extract coverage information. However would be very helpful to have a way to put counters, structures and strings (file names) related to coverage in different section. Actually there are no such options (it

Re: Wrong code for i686 target with -O3 -flto

2013-08-05 Thread Jan Hubicka
Quoting Uros Bizjak ubiz...@gmail.com: On Sun, Aug 4, 2013 at 2:34 AM, NightStrike nightstr...@gmail.com wrote: On Mon, Jul 22, 2013 at 5:22 AM, Igor Zamyatin izamya...@gmail.com wrote: Hi All! Unfortunately now the compiler generates wrong code for i686 target when options -O3 and -flto are

Re: Bootstrap broken in libobjc/sendmsg.c

2013-09-06 Thread Jan Hubicka
.. looks like this is target/58269, which therefore affects x86_64-linux too. Now this reproduces for me, too. apply_args expansion is trying to preserve AVX register in V8SF mode when AVX is disabled. This leads to move expander to not allow moving it and we end up infinitely recursing trying

Re: Bootstrap broken in libobjc/sendmsg.c

2013-09-06 Thread Jan Hubicka
.. looks like this is target/58269, which therefore affects x86_64-linux too. Now this reproduces for me, too. apply_args expansion is trying to preserve AVX register in V8SF mode when AVX is disabled. This leads to move expander to not allow moving it and we end up infinitely

Re: Bootstrap broken in libobjc/sendmsg.c

2013-09-06 Thread Jan Hubicka
On Fri, Sep 06, 2013 at 02:38:01PM +0200, Jan Hubicka wrote: .. looks like this is target/58269, which therefore affects x86_64-linux too. Now this reproduces for me, too. apply_args expansion is trying to preserve AVX register in V8SF mode when AVX is disabled. This leads

Re: RFC: Inlines, LTO and GCC

2013-09-10 Thread Jan Hubicka
On 10/09/13 10:11, Jakub Jelinek wrote: On Tue, Sep 10, 2013 at 10:06:04AM +0200, David Brown wrote: This last point is crucial. I haven't looked at the code in question, but one point to check is how the functions are called. If they are often called with constant values, then they may

Re: RFC: Inlines, LTO and GCC

2013-09-10 Thread Jan Hubicka
But then inlining / cloning is no longer cheap, no? And will be disabled at -O2? If you declare it inline and not static inline it will be inlined pretty much as before, only it will get unified if it ends up out of line in multiple units. Main difference in between static and non-static

Re: RFC: Inlines, LTO and GCC

2013-09-10 Thread Jan Hubicka
But then inlining / cloning is no longer cheap, no? And will be disabled at -O2? If you declare it inline and not static inline it will be inlined pretty much as before, only it will get unified if it ends up out of line in multiple units. Main difference in between static and

Re: libgcc/sync.c vs. cgraph alias tracking

2013-10-08 Thread Jan Hubicka
MIPS16 code can't do atomic operations directly, so it calls into out-of-line versions that are compiled as -mno-mips16. These out-of-line versions use the same open-coded implementation as you'd get in normal -mno-mips16 code. Hmm, and I assume you don't want to use target attribute for

Re: libgcc/sync.c vs. cgraph alias tracking

2013-10-09 Thread Jan Hubicka
Jan Hubicka hubi...@ucw.cz writes: MIPS16 code can't do atomic operations directly, so it calls into out-of-line versions that are compiled as -mno-mips16. These out-of-line versions use the same open-coded implementation as you'd get in normal -mno-mips16 code. Hmm, and I assume

Re: Testing ICEs resulting from profile directed optimization

2013-10-10 Thread Jan Hubicka
Hi, I have found an ICE reported as (PR 58682) and I have a fix. Cool :) However the testcase involved: * compiling a 5 .i files with -fprofile-generate= * running the executable * compiling the same 5 .i files with -fprofile-use=, and only then getting the ICE. Is there anything in

Re: Testing ICEs resulting from profile directed optimization

2013-10-11 Thread Jan Hubicka
-Original Message- From: Jan Hubicka [mailto:hubi...@ucw.cz] Sent: 10 October 2013 17:24 To: Paulo Matos Cc: gcc@gcc.gnu.org Subject: Re: Testing ICEs resulting from profile directed optimization Hi, I have found an ICE reported as (PR 58682) and I have a fix

Re: Tutorials/pointers for IPA writing passes

2013-12-17 Thread Jan Hubicka
Hi, the overall description of how IPA optimization is structured (describing the difference between simple_ipa_opt_pass and ipa_opt_pass) is in http://arxiv.org/pdf/1010.2196v2.pdf and also in lto.texi. There is not exactly a tutorial, but generally simple_ipa_opt_pass is easier to start with since

Re: Fwd: LLVM collaboration?

2014-02-10 Thread Jan Hubicka
1. There IS an unnecessary fence between GCC and LLVM. License arguments are one reason why we can't share code as easily as we would like, but there is no argument against sharing ideas, cross-reporting bugs, helping each other implement a better compiler/linker/assembler/libraries just

Re: Fwd: LLVM collaboration?

2014-02-11 Thread Jan Hubicka
or missing symbols. Honza I'm assuming the extra symbols would be discarded if no library is found, together with the warning, right? Maybe an error if -Wall or whatever. Can we get someone from the binutils community to opine on that? cheers, --renato On 11 February 2014 02:29, Jan

Re: Fwd: LLVM collaboration?

2014-02-11 Thread Jan Hubicka
On 2014.02.11 at 13:02 -0500, Rafael Espíndola wrote: On 11 February 2014 12:28, Renato Golin renato.go...@linaro.org wrote: Now copying Rafael, which can give us some more insight on the LLVM LTO side. Thanks. On 11 February 2014 09:55, Renato Golin renato.go...@linaro.org

Re: Fwd: LLVM collaboration?

2014-02-11 Thread Jan Hubicka
Since both toolchains do the magic, binutils has no incentive to create any automatic detection of objects. It is mostly a historical decision. At the time the design was for the plugin to be matched to the compiler, and so the compiler could pass that information down to the linker.

Re: Fwd: LLVM collaboration?

2014-02-12 Thread Jan Hubicka
On Wed, 12 Feb 2014, Richard Biener wrote: What about instead of our current odd way of identifying LTO objects simply add a special ELF note telling the linker the plugin to use? .note._linker_plugin '/./libltoplugin.so' that way the linker should try 1) loading that plugin,

TYPE_BINFO and canonical types at LTO

2014-02-13 Thread Jan Hubicka
Hi, I have noticed that record_component_aliases is called during LTO time and it examines contents of BINFO: 0x5cd7a5 record_component_aliases(tree_node*) ../../gcc/alias.c:1005 0x5cd4a9 get_alias_set(tree_node*) ../../gcc/alias.c:895 0x5cc67a

Re: TYPE_BINFO and canonical types at LTO

2014-02-14 Thread Jan Hubicka
This smells bad, since it is given a canonical type that is after the structural equivalency merging that ignores BINFOs, so it may be completely different class with completely different bases than the original. Bases are structurally merged, too and may be exchanged for normal fields

Re: TYPE_BINFO and canonical types at LTO

2014-02-14 Thread Jan Hubicka
This smells bad, since it is given a canonical type that is after the structural equivalency merging that ignores BINFOs, so it may be completely different class with completely different bases than the original. Bases are structurally merged, too and may be exchanged for

Re: TYPE_BINFO and canonical types at LTO

2014-02-16 Thread Jan Hubicka
On Fri, 14 Feb 2014, Jan Hubicka wrote: This smells bad, since it is given a canonical type that is after the structural equivalency merging that ignores BINFOs, so it may be completely different class with completely different bases than the original. Bases

Re: TYPE_BINFO and canonical types at LTO

2014-02-17 Thread Jan Hubicka
Yeah, ok. But we treat those types (B and C) TBAA equivalent because structurally they are the same ;) Luckily C has a proper field for its base (proper means that offset and size are correct as well as the type). It indeed has DECL_ARTIFICIAL set and yes, we treat those as real fields

Re: TYPE_BINFO and canonical types at LTO

2014-02-18 Thread Jan Hubicka
Non-ODR types born from other frontends will then need to be made to alias all the ODR variants that can be done by storing them into the current canonical type hash. (I wonder if we want to support cross language aliasing for non-POD?) Surely for accessing components of non-POD

Re: TYPE_BINFO and canonical types at LTO

2014-02-19 Thread Jan Hubicka
On Tue, 18 Feb 2014, Jan Hubicka wrote: Non-ODR types born from other frontends will then need to be made to alias all the ODR variants that can be done by storing them into the current canonical type hash. (I wonder if we want to support cross language aliasing for non-POD

Re: WPA stream_out form memory consumption

2014-03-25 Thread Jan Hubicka
Hello, I've been compiling Chromium with LTO and I noticed that WPA stream_out forks and do parallel: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02621.html. I am unable to fit in 16GB memory: ld uses about 8GB and lto1 about 6GB. When WPA start to fork, memory consumption increases so

Re: WPA stream_out form memory consumption

2014-04-02 Thread Jan Hubicka
Hello, taking latest trunk gcc, I built Firefox and Chromium. Both projects compiled without debugging symbols and -O2 on an 8-core machine. Firefox: -flto=9, peak memory usage (in LTRANS): 11GB Chromium: -flto=6, peak memory usage (in parallel WPA phase ): 16.5GB I see, the

Re: WPA stream_out form memory consumption

2014-04-02 Thread Jan Hubicka
Previous email presents a bit misleading graphs (influenced by --enable-gather-detailed-mem-stats). Firefox: -flto=9, WPA peak: 8GB, LTRANS peak: 8GB -flto=4, WPA peak: 5GB, LTRANS peak: 3.5GB -flto=1, WPA peak: 3.5GB, LTRANS peak: ~1GB These data shows that parallel WPA streaming

Re: WPA stream_out form memory consumption

2014-04-03 Thread Jan Hubicka
: Previous patch is wrong, I did a mistake in name ;) Martin On 03/27/2014 09:52 AM, Martin Liška wrote: On 03/25/2014 09:50 PM, Jan Hubicka wrote: Hello, I've been compiling Chromium with LTO and I noticed that WPA stream_out forks and do parallel: http://gcc.gnu.org/ml/gcc

Re: WPA stream_out form memory consumption

2014-04-03 Thread Jan Hubicka
Firefox: cgraph.c:869 (cgraph_create_edge_1) 0: 0.0% 0: 0.0% 130358176: 6.9% 0: 0.0%1253444 cgraph.c:510 (cgraph_allocate_node) 0: 0.0% 0: 0.0% 182236800: 9.7% 0: 0.0% 555600 toplev.c:960

Re: WPA stream_out form memory consumptions

2014-04-03 Thread Jan Hubicka
I resend the mail, because I was given 502 error. On 04/03/2014 12:43 AM, Jan Hubicka wrote: Hello, taking latest trunk gcc, I built Firefox and Chromium. Both projects compiled without debugging symbols and -O2 on an 8-core machine. Firefox: -flto=9, peak memory usage (in LTRANS

Re: WPA stream_out form memory consumption

2014-04-07 Thread Jan Hubicka
I added new graph for 'xloc.column = 0' hack, just applied this single patch to trunk. Link: https://drive.google.com/file/d/0B0pisUJ80pO1MW11WHdjMk9KQnc/edit?usp=sharing Good, do these two patches combine together well? (they are rather orthogonal, but perhaps with columns disabled

Re: WPA stream_out form memory consumption

2014-04-07 Thread Jan Hubicka
AFAIK we settled on a simpler one dropping columns at stream-out time that also helped. As for the correct way to do the optimization we agreed(?) that streaming the locations elsewhere and using references to them is more appropriate. At stream-in (or before stream-out) we can then read

Re: WPA stream_out form memory consumption

2014-04-15 Thread Jan Hubicka
On Mon, Apr 7, 2014 at 8:20 PM, Jan Hubicka hubi...@ucw.cz wrote: AFAIK we settled on a simpler one dropping columns at stream-out time that also helped. As for the correct way to do the optimization we agreed(?) that streaming the locations elsewhere and using references to them

Re: IPA: Devirtualization versus placement new

2014-04-25 Thread Jan Hubicka
Summary: Devirtualization uses type information to determine if a virtual method is reachable from a call site. If type information indicates that it is not, devirt marks the site as unreachable. I think this is wrong, and it breaks some programs. At least, it should not do this if the
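To make the concern in this thread concrete, here is a minimal sketch of how placement new can change the dynamic type of an object that lives in a given piece of storage. It is not the testcase from the bug report; the types `Base`, `A`, and `B` and the function `id_after_storage_reuse` are hypothetical and only illustrate why a call site cannot always assume a single dynamic type for a memory location.

```cpp
#include <cstddef>
#include <new>

// Hypothetical class hierarchy, assumed for illustration only.
struct Base {
    virtual int id() const { return 0; }
    virtual ~Base() = default;
};
struct A : Base {
    int id() const override { return 1; }
};
struct B : Base {
    int id() const override { return 2; }
};

// Construct an A in a buffer, destroy it, then construct a B in the very
// same storage. The virtual call at the end must dispatch to B::id().
inline int id_after_storage_reuse() {
    alignas(std::max_align_t)
        unsigned char storage[sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B)];

    Base* p = new (storage) A;   // dynamic type of the object is A
    p->~Base();                  // end the lifetime of the A object

    p = new (storage) B;         // same address, dynamic type is now B
    int result = p->id();        // virtual dispatch selects B::id
    p->~Base();
    return result;               // 2
}
```

An optimizer that reasons about which virtual methods are reachable from a call site has to account for this kind of storage reuse before treating a target as impossible.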

Re: IPA: Devirtualization versus placement new

2014-04-25 Thread Jan Hubicka
On 04/25/2014 03:14 PM, Volker Simonis wrote: Could you therefore please re-categorize this as devirt bug. It is an IPA bug. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60965 Now when I have interest from ubsan direction, I wanted to ask. Would it make sense to turn those unreachables

Re: IPA: Devirtualization versus placement new

2014-04-25 Thread Jan Hubicka
On Fri, Apr 25, 2014 at 08:23:22PM +0200, Jan Hubicka wrote: On 04/25/2014 03:14 PM, Volker Simonis wrote: Could you therefore please re-categorize this as devirt bug. It is an IPA bug. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60965 Now when I have interest from ubsan

Re: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Jan Hubicka
On 05/20/14 04:09, Bruce Adams wrote: Hi, I've been tracking the latest releases of gcc since 4.7 or so (variously interested in C++1y support, cilk and openmp). One thing I've found hard to locate is information about planned inclusions for future releases. As much relies on unpredictable

Re: New C++ IPA fails

2014-05-22 Thread Jan Hubicka
The fix is attached. Ok to commit? OK, thanks! Honza

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-07-15 Thread Jan Hubicka
On 25 June 2014 10:26, Bingfeng Mei b...@broadcom.com wrote: Why is GCC code size so much bigger than LLVM? Does -Ofast have more unrolling on GCC? It doesn't seem increasing code size help performance (164.gzip 197.parser) Is there comparisons for O2? I guess that is more useful for

Re: Comparison of GCC-4.9 and LLVM-3.4 performance on SPECInt2000 for x86-64 and ARM

2014-07-15 Thread Jan Hubicka
On 15 July 2014 15:43, Jan Hubicka hubi...@ucw.cz wrote: I also noticed that GCC code size is bigger for both firefox and libreoffice. There was some extra bloat in 4.9 compared to 4.8. Martin did some tests with -O2 and various flags, perhaps we could throttle some of -O2 optimizations

Re: Symtab node table introduction and GGC issues

2014-07-25 Thread Jan Hubicka
Hello, thank you for your advice. It really looks like I face the same issue as you have seen. As you suggested, I will start with a global pointer to symtab and we'll see further integration into context class (where it should reside according to me). Note that I would suggest an

Re: Is there any possibility to parallel compilation in a single file?

2014-07-29 Thread Jan Hubicka
Thank you for your answer. I find the most time consuming process in compiling a file is the optimization of the cgraph nodes (execute all_passes), This process is sequence, one node by one node. If we divide the cgraph nodes into unrelated forest, we can parallel it, is this way

Re: LTO bootstrap compare errors for ARM64

2014-08-07 Thread Jan Hubicka
As a first step I compared the objdump -D dump between stage2-gcc/gimple.o and stage3-gcc/gimple.o. Differences are in LTO sections .gnu.lto_.decls.0, .gnu.lto_.symtab. Ref: http://paste.ubuntu.com/7949238/ If you see the differences already in .o files (i.e. at compile time), I think the

Re: LTO bootstrap compare errors for ARM64

2014-08-13 Thread Jan Hubicka
Hi Honza, I did not find any differences in tree level dumps. These are the dump differences in IPA In gimple-fold.c.000i.cgraph (--Snip--) _Z25gimple_build_omp_continueP9tree_nodeS0_/761 (gimple_build_omp_continue(tree_node*, tree_node*)) @0x3ff7ebda548 ---

Re: LTO inhibiting dwarf lexical blocks output

2014-08-15 Thread Jan Hubicka
So... I've been getting my feet wet with LTO and debugging and I noticed a seemingly unrelated yet annoying problem. On x86-64, gcc.dg/guality/pr48437.c fails when run in LTO mode. I've compared the dwarf output with and without LTO, and I noticed that the DW_TAG_lexical_block is missing

Re: LTO inhibiting dwarf lexical blocks output

2014-08-17 Thread Jan Hubicka
On Fri, Aug 15, 2014 at 10:08:38PM +0200, Steven Bosscher wrote: On Fri, Aug 15, 2014 at 9:59 PM, Aldy Hernandez wrote: So... I've been getting my feet wet with LTO and debugging and I noticed a seemingly unrelated yet annoying problem. On x86-64, gcc.dg/guality/pr48437.c fails when

Re: LTO inhibiting dwarf lexical blocks output

2014-08-18 Thread Jan Hubicka
The following seems to fix it. In testing now. Will streaming as non-reference prevent DECL from being merged and tails of BLOCK_VAR chains to be corrupted? Honza Richard. Richard. Thanks. Aldy

Re: LTO bootstrap compare errors for ARM64

2014-08-20 Thread Jan Hubicka
--) min size: 6 --- min size: 0 6590c6590 min size: 14 --- min size: 0 6607c6607 min size: 28 (--Snip--) On 7 August 2014 19:14, Jan Hubicka hubi...@ucw.cz wrote: As a first step I compared the objdump -D dump between stage2-gcc

Re: LTO inhibiting dwarf lexical blocks output

2014-08-20 Thread Jan Hubicka
On August 18, 2014 8:46:00 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: The following seems to fix it. In testing now. Will streaming as non-reference prevent DECL from being merged and tails of BLOCK_VAR chains to be corrupted? Yes, the decl ends up in the function section

Re: Some questions about pass web

2014-09-09 Thread Jan Hubicka
On 09/03/14 02:35, Steven Bosscher wrote: On Wed, Sep 3, 2014 at 9:17 AM, Bin.Cheng wrote: Last time I tried, there are several passes after loop_done and before auto-inc-dec can't handle auto-increment addressing mode, including fweb. It surprises me that pass_web can't handle AUTOINC.

[RFC] Dealing with ODR violations in GCC

2014-09-11 Thread Jan Hubicka
Hi, I went through the exercise of running LTO bootstrap with ODR verification on. There are some typename clashes I guess we want to fix. I wonder what approach is preferred, do we want to introduce anonymous namespaces for those? Honza ../../gcc/tlink.c:62:16: warning: type ‘struct

Re: How to access the virtual table?

2014-09-12 Thread Jan Hubicka
I am trying to access the virtual table. My pass is hooked after pass_ipa_pta. Consider Class A which contains virtual function. An object created as : A a; is translated in GIMPLE as struct A a; From variable a we can get its type which is struct A. I tried to see how the

Skipping assembler when producing slim LTO files

2014-09-23 Thread Jan Hubicka
Hi, This patch is something I was playing around with assistance of Ian Taylor. It seems I need bit more help though :) It adds support for direct output of SLIM LTO files to the compiler binary. It works as proof of concept, but there are two key parts missing 1) extension of libiberty's

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
Shouldn't -fbypass-asm be simply mangled by the driver? That is, the user simply specifies -fbypass-asm and via spec magic the driver substitutes this with -fbypass-asm=crtbegin.o? That way at least the user interface should be stable (as we're supposedly removing the requirement for that

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
On Wed, Sep 24, 2014 at 7:47 AM, Andi Kleen a...@firstfloor.org wrote: I wonder how hard it would be to fix simple-object to be able to create from scratch. From a quick look it would be mostly adding the right values into the header? That would need some defines per target. It could

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
On Wed, Sep 24, 2014 at 7:47 AM, Andi Kleen a...@firstfloor.org wrote: I wonder how hard it would be to fix simple-object to be able to create from scratch. From a quick look it would be mostly adding the right values into the header? That would need some defines per target. It

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
On Wed, Sep 24, 2014 at 6:32 PM, Jan Hubicka hubi...@ucw.cz wrote: Libreoffice shows that GCC needs about twice as much of system time. According to profiles, good part is the ugly way we pass stuff down to assembler and other part is memory use during the compilation stage. Are you

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
On Wed, Sep 24, 2014 at 10:04 AM, Steven Bosscher stevenb@gmail.com wrote: On Wed, Sep 24, 2014 at 6:32 PM, Jan Hubicka hubi...@ucw.cz wrote: Libreoffice shows that GCC needs about twice as much of system time. According to profiles, good part is the ugly way we pass stuff down

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Jan Hubicka
On Wed, Sep 24, 2014 at 11:47 PM, Ian Lance Taylor wrote: On Wed, Sep 24, 2014 at 10:04 AM, Steven Bosscher wrote: Are you using -pipe? AFAIR this still isn't the default, even on GNU/Linux, but it is typically a lot faster than without. Is that true even when TMPDIR is on a ram disk?

Re: cgraph_node::verify - quite strong condition that was met by IPA-ICF

2014-09-25 Thread Jan Hubicka
Hello. I've been finalizing IPA ICF testing process and I met a condition for lto-bootstrap, where cgraph_node::verify encounters error: In WPA, I prove that gen_vec_initv16qi can be merged with gen_vec_initv2sf. In the following case, all local calls are redirected: while

Re: [BUILDROBOT] Ada broken

2014-10-02 Thread Jan Hubicka
On Thu, Oct 02, 2014 at 09:52:31PM +0200, Jan-Benedict Glaw wrote: It seems that a full bootstrap including Ada got broken somewhere in the range of r215789 .. r215799. I'm bisecting it (on powerpc64-linux, where it also shows up); it needs full bootstrapping every time, so will be

Re: Loop peeling

2014-10-29 Thread Jan Hubicka
On Wed, Oct 29, 2014 at 12:53 PM, Tejas Belagod tejas.bela...@arm.com wrote: On 29/10/14 09:32, Richard Biener wrote: On Tue, Oct 28, 2014 at 4:55 PM, Evandro Menezes e.mene...@samsung.com wrote: While doing some benchmark flag mining on AArch64, I noticed that -fpeel-loops was a

Re: Loop peeling

2014-10-29 Thread Jan Hubicka
On Tue, Oct 28, 2014 at 4:55 PM, Evandro Menezes e.mene...@samsung.com wrote: While doing some benchmark flag mining on AArch64, I noticed that -fpeel-loops was a mined option often. As a matter of fact, when using it always, even without FDO, it seemed to raise most benchmarks and to

Re: LTO IPA inline decisions in GCC trunk.

2014-11-06 Thread Jan Hubicka
Hi Honza, Hello, I experimented building Coremark with both PGO and LTO at -O3 level on Aarch64 machine. First I generated profiles using the recommended seeds in Coremark's readme.txt. Then compiled again with -O3 -flto and -fprofile-use. I tried using GCC Linaro compiler (september)

Re: gcc-4_9 inlines less funcs than gcc-4_8 because of used_as_abstract_origin flag.

2014-11-22 Thread Jan Hubicka
Hi, this is the patch I committed to mainline 2014-11-22 Jan Hubicka hubi...@ucw.cz * ipa.c (symbol_table::remove_unreachable_nodes): Mark all inline clones as having abstract origin used. * ipa-inline-transform.c (can_remove_node_now_p_1): Drop abstract origin check

Re: gcc-4_9 inlines less funcs than gcc-4_8 because of used_as_abstract_origin flag.

2014-11-22 Thread Jan Hubicka
Thanks for the fix. Is it ok to backport it to gcc-4_9? Yes, it is OK assuming that there are no problems with the patch for a week. (it ought to be safe) Honza

Re: virtual stack regs.

2007-06-19 Thread Jan Hubicka
I would like to get some more information about pr32374. I do not know what virtual_stack_vars are and there is no documentation in the doc directory. It is documented: @findex VIRTUAL_STACK_VARS_REGNUM @cindex @code{FRAME_GROWS_DOWNWARD} and virtual registers @item VIRTUAL_STACK_VARS_REGNUM

Re: failed to compile trunk svn rev 126124

2007-06-29 Thread Jan Hubicka
Same here (OpenSUSE 10.2, gcc 4.1.3), also for rev. 126127. This was caused by an accidental commit of mine (i.e. I've committed cse.c with sharing changes). It is reverted now. Honza

Re: no_new_pseudos

2007-07-02 Thread Jan Hubicka
Kenneth Zadeck wrote: I do not remember if it was stevenb or bonzini that observed that because of changes that came with the dataflow branch it is now trivial to get rid of no_new_pseudos. For the record, this was Steven's observation. And Kenner confirming that this was the original

Re: AMD64 ABI compatibility

2007-07-10 Thread Jan Hubicka
On 09 July 2007 20:48, Nicolas Alt wrote: Hi! On the AMD64 / x86-64Bit architecture, some arguments of a functions are passed using registers, but there seem to be two different conventions out there. The standard ABI uses 6 registers, but Microsoft compilers use only 4. Because of

Re: AMD64 ABI compatibility

2007-07-10 Thread Jan Hubicka
Windows and GCC ABIs are on x86-64 more different than that (they was historically developed in parallel). GCC 4.3 will support attribute for this calling convention contributed by Kai Tiez and Richard Henderson, but before that there is not much to do... Note: My name is Kai Tietz ;) not

Re: AMD64 ABI compatibility

2007-07-10 Thread Jan Hubicka
For MS I would probably suggest ms_abi (it makes it clearer that the attribute is affecting calling convention). For our abi I am not sure, we can use sysv_abi or something else... I will prepare a patch for it. For me ms_abi and sysv_abi is fine. I would say that it is in general very

Re: AMD64 ABI compatibility

2007-07-10 Thread Jan Hubicka
I am on that tricky thing ;) I think I need in i386.c a global variable ix86_amd64_abi which holds the current function abi. This means also that I have to use this variable instead of TARGET_64BIT_MS_ABI. This var may be initialized by init_cumulative_args and the overridden

Re: AMD64 ABI compatibility

2007-07-11 Thread Jan Hubicka
I thank you very much for your great help. Currently I am stuck on x86_function_value_regno_p (macro FUNCTION_VALUE_REGNO_P). It is not clear what's to do here for the FIRST_FLOAT_REG case. Maybe I could use the ms_abi variant for sysv_abi as default too. But I think, it breaks 87 fpu

Re: AMD64 ABI compatibility

2007-07-11 Thread Jan Hubicka
Jan Hubicka wrote on 11.07.2007 14:01:54: I thank you very much for your great help. Currently I am stuck on x86_function_value_regno_p (macro FUNCTION_VALUE_REGNO_P). It is not clear what's to do here for the FIRST_FLOAT_REG case. Maybe I could use the ms_abi variant
