Re: calloc = malloc + memset

2014-06-23 Thread Andi Kleen
Marc Glisse marc.gli...@inria.fr writes: Hello, this is a stage 1 patch, and I'll ping it then, but if you have comments now... FWIW I believe the transformation will break a large variety of micro benchmarks. calloc internally knows that memory fresh from the OS is zeroed. But the memory

Re: calloc = malloc + memset

2014-06-23 Thread Andi Kleen
On Mon, Jun 23, 2014 at 09:00:02PM +0200, Marc Glisse wrote: On Mon, 23 Jun 2014, Andi Kleen wrote: FWIW I believe the transformation will break a large variety of micro benchmarks. calloc internally knows that memory fresh from the OS is zeroed. But the memory may not be faulted in yet

Re: calloc = malloc + memset

2014-06-23 Thread Andi Kleen
On Mon, Jun 23, 2014 at 10:14:25PM +0200, Marc Glisse wrote: On Mon, 23 Jun 2014, Andi Kleen wrote: I would prefer to not do it. For the sake of micro benchmarks? Yes benchmarks are important. -Andi
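
To make the trade-off in this thread concrete, here is a minimal sketch of the two forms the proposed transformation treats as equivalent. The helper names `zeroed_via_malloc`, `zeroed_via_calloc` and `all_zero` are made up for illustration and do not come from the patch. Both allocation helpers return the same contents; the difference raised in the thread is only about when pages get touched. The memset writes every byte immediately, whereas a calloc implementation may hand back fresh OS memory that is already zero and not yet faulted in (this depends on the C library).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: the malloc + memset pattern the optimizer would recognize.
// memset writes every byte, so all pages are touched right away.
unsigned char *zeroed_via_malloc(std::size_t n) {
    void *p = std::malloc(n);
    if (p != nullptr)
        std::memset(p, 0, n);
    return static_cast<unsigned char *>(p);
}

// Hypothetical helper: what the pattern would be rewritten into.
unsigned char *zeroed_via_calloc(std::size_t n) {
    return static_cast<unsigned char *>(std::calloc(n, 1));
}

// Returns true if the first n bytes of p are all zero.
bool all_zero(const unsigned char *p, std::size_t n) {
    assert(p != nullptr);
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] != 0)
            return false;
    return true;
}
```

A micro benchmark that times only the allocation would see the two helpers differently even though their results compare equal, which is the concern voiced above.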

Re: [GOOGLE] Report the difference between profiled and guessed or annotated branch probabilities.

2014-06-27 Thread Andi Kleen
Yi Yang ahyan...@google.com writes: Hi, This patch adds an option. When the option is enabled, GCC will add a record about it in an elf section called .gnu.switches.text.branch.annotation for every branch. This would be nice to have even in mainline for the normal profiling. -Andi --

Re: [Committed] Fix lto.c compiling

2014-06-29 Thread Andi Kleen
Andrew Pinski pins...@gmail.com writes: I committed this as obvious. The changelog says it all. I think the problem is that LTO bootstraps frequently error out with -Werror. That incentivizes LTO users to bootstrap with --disable-werror. [e.g. currently the graphite libraries don't build this

Re: [PATCH, i386] Add prefixes avoidance tuning for silvermont target

2014-07-02 Thread Andi Kleen
Ilya Enkovich enkovich@gmail.com writes: Silvermont processors have penalty for instructions having 4+ bytes of prefixes (including escape bytes in opcode). This situation happens when REX prefix is used in SSE4 instructions. This patch tries to avoid such situation by preferring

Re: [PATCH, i386] Add prefixes avoidance tuning for silvermont target

2014-07-02 Thread Andi Kleen
Mike Stump mikest...@comcast.net writes: Silvermont processors have penalty for instructions having 4+ bytes of prefixes (including escape bytes in opcode). This situation happens when REX prefix is used in SSE4 instructions. This patch tries to avoid such situation by preferring xmm0-xmm7

[PATCH 2/2] Remove x86 cmpstrnsi

2014-07-03 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com In my tests the optimized glibc out of line strcmp is always faster than using inline rep ; cmpsb, even for small strings. The Intel optimization manual also recommends to not use it. So remove the cmpstrnsi instruction. Tested on Sandy Bridge, Westmere

[PATCH 1/2] Remove i386 cmpstrnsi peephole

2014-07-03 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com The peephole that removes the code to compute a tristate for cmpstrnsi when only a boolean jump is needed never triggers in my tests. Just remove it. gcc/: 2014-07-02 Andi Kleen a...@linux.intel.com * config/i386/i386.md: Remove peepholes

[PATCH] Fix bootstrap with ICL

2014-07-08 Thread Andi Kleen
[I couldn't find a patch submission address for ICL, so I'm sending this here] With ICL enabled and an LTO boot strap the ICL build always errors out due to -Werror=maybe-undefined. The following patch fixes the LTO build for me by initializing the variables in question. All warnings were

Re: [PATCH] Fix bootstrap with ICL

2014-07-09 Thread Andi Kleen
On Wed, Jul 09, 2014 at 10:17:01AM -0700, Mike Stump wrote: On Jul 8, 2014, at 9:01 PM, Andi Kleen a...@firstfloor.org wrote: With ICL enabled and an LTO boot strap the ICL build always errors out due to -Werror=maybe-undefined. The following patch fixes the LTO build for me by initializing

Abstract incremental hashing

2014-07-15 Thread Andi Kleen
This patchkit abstracts incremental hashing in tree.c and lto.c to make it easier to plug in new and more efficient hash algorithms. Right now it uses the old hash algorithms. So it's just a cleanup. Passes bootstrap and testing on x86_64-linux. -Andi

[PATCH 2/4] Convert LTO type hashing to the new inchash interface

2014-07-15 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Should not really change any behavior, it's just a more abstract interface, but uses the same underlying hash functions. lto/: 2014-07-10 Andi Kleen a...@linux.intel.com * lto.c (hash_canonical_type): Convert to inchash

[PATCH 1/4] Add an abstract incremental hash data type

2014-07-15 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Some files in gcc, like lto or tree, do large scale incremental hashing. The current jhash implementation of this could likely be improved by using an incremental hash that does not do a full rehashing for every new value added. This patch adds a new
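
As a rough illustration of the kind of interface this patch series describes, here is a toy incremental hasher. It is only a sketch: the class name `toy_inchash`, the FNV-1a mixing step, and the member names are assumptions chosen for the example, not the contents of the actual `inchash.h`. What it shows is that the state is updated once per added value, with no rehash of earlier input, and that the mixing function lives in one place so it could be swapped out.

```cpp
#include <cassert>
#include <cstdint>

// Toy incremental hash; NOT GCC's inchash.
// FNV-1a is used here only because it is short and well known.
class toy_inchash {
public:
    explicit toy_inchash(std::uint32_t seed = 2166136261u) : val_(seed) {}

    // Fold one 32-bit value into the running state, byte by byte.
    void add_int(std::uint32_t v) {
        for (int i = 0; i < 4; ++i) {
            val_ ^= (v >> (8 * i)) & 0xffu;
            val_ *= 16777619u;
        }
    }

    // Order-independent add of two values: sort them first so that
    // (a, b) and (b, a) feed the same byte sequence into the state.
    void add_commutative(std::uint32_t a, std::uint32_t b) {
        if (a > b) {
            std::uint32_t t = a;
            a = b;
            b = t;
        }
        add_int(a);
        add_int(b);
    }

    // Return the hash of everything added so far.
    std::uint32_t end() const { return val_; }

private:
    std::uint32_t val_;
};
```

Callers only see `add_*` and `end`, so replacing the mixing step would not require touching them. That is the property the cover letter describes as making it easier to plug in new hash algorithms.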

[PATCH 4/4] Convert lto streamer out hashing to inchash

2014-07-15 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com No substantial changes, although the hash values will be slightly different. gcc/: 2014-07-10 Andi Kleen a...@linux.intel.com * lto-streamer-out.c (hash_tree): Convert to inchash. (add_flag): New macro. --- gcc/lto-streamer-out.c | 245

[PATCH 3/4] Convert the tree.c type hashing over to inchash

2014-07-15 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Again should not really change any behavior (except for some minor differences with empty types) gcc/: 2014-07-10 Andi Kleen a...@linux.intel.com * tree.c (build_type_attribute_qual_variant): Use inchash. (type_hash_list): Ditto

Re: [PATCH 1/4] Add an abstract incremental hash data type

2014-07-16 Thread Andi Kleen
On Wed, Jul 16, 2014 at 10:40:53PM -0400, Trevor Saunders wrote: +++ b/gcc/inchash.h +class inchash +{ + hashval_t val; normal style would be explicit private: at the end. Ok. + public: + + /* Start incremental hashing, optionally with SEED. */ + void begin (hashval_t

[PING] Re: Abstract incremental hashing

2014-07-20 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: This patchkit abstracts incremental hashing in tree.c and lto.c to make it easier to plug in new and more efficient hash algorithms. Right now it uses the old hash algorithms. So it's just a cleanup. Passes bootstrap and testing on x86_64-linux. Ping

Re: [PATCH 1/4] Add an abstract incremental hash data type

2014-07-23 Thread Andi Kleen
Btw, what will be the way to plug in an alternative hash function? That is, there doesn't seem to be a separation of interface and implementation in your patch (like with a template or a base-class you inherit from). Just change the inchash.h include file. The point was to only change a

Re: [PATCH 4/4] Convert lto streamer out hashing to inchash

2014-07-23 Thread Andi Kleen
On Tue, Jul 22, 2014 at 09:40:15PM -0600, Jeff Law wrote: On 07/15/14 23:31, Andi Kleen wrote: From: Andi Kleen a...@linux.intel.com No substantial changes, although the hash values will be slightly different. gcc/: 2014-07-10 Andi Kleen a...@linux.intel.com * lto-streamer

Re: [PATCH 4/4] Convert lto streamer out hashing to inchash

2014-07-23 Thread Andi Kleen
I think we managed to stay bytecode compatible for 4.8 release series. (Richi knows better) Nope, fortran options broke it at some point. -Andi -- a...@linux.intel.com -- Speaking for myself only.

Re: [PATCH 4/4] Convert lto streamer out hashing to inchash

2014-07-23 Thread Andi Kleen
On Wed, Jul 23, 2014 at 06:00:35PM +0200, Richard Biener wrote: On July 23, 2014 5:15:53 PM CEST, Andi Kleen a...@firstfloor.org wrote: I think we managed to stay bytecode compatible for 4.8 release series. (Richi knows better) Nope, fortran options broke it at some point. We try hard

Re: [PATCH 4/4] Convert lto streamer out hashing to inchash

2014-07-23 Thread Andi Kleen
On Wed, Jul 23, 2014 at 04:21:59PM +0200, Richard Biener wrote: On Wed, Jul 23, 2014 at 5:40 AM, Jeff Law l...@redhat.com wrote: On 07/15/14 23:31, Andi Kleen wrote: From: Andi Kleen a...@linux.intel.com No substantial changes, although the hash values will be slightly different

Re: [PATCH 1/4] Add an abstract incremental hash data type

2014-07-23 Thread Andi Kleen
So there will be at most one hash implementation? One per binary I expect. Modern hash functions are pretty good, so it's unlikely that someone needs to come up with special purpose hashes. I found Bob Jenkins' spooky is rather good for this case (very large incremental keys), but it is only

Re: [PATCH 1/4] Add an abstract incremental hash data type

2014-07-23 Thread Andi Kleen
Why didn't you replace the tree.c uses BTW? Patches were already quite big, but I'll add it. Actually I handled them all in tree.c. Did you mean something else? I didn't convert all of tree-ssa-* and dwarf* so far, and a few other places. This can be done step by step. -Andi

Updated incremental hash patchkit

2014-07-24 Thread Andi Kleen
This version addresses the review feedback. begin is gone now. add_flag is in the class. The changes in tree.c are nearer the original code now. Some other minor cleanups. Passed bootstrap and test on x86_64-linux. Ok to commit now? Thanks, -Andi

[PATCH 2/4] Convert LTO type hashing to the new inchash interface

2014-07-24 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Should not really change any behavior, it's just a more abstract interface, but uses the same underlying hash functions. lto/: 2014-07-24 Andi Kleen a...@linux.intel.com * lto.c (hash_canonical_type): Convert to inchash

[PATCH 1/4] Add an abstract incremental hash data type

2014-07-24 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Some files in gcc, like lto or tree, do large scale incremental hashing. The current jhash implementation of this could likely be improved by using an incremental hash that does not do a full rehashing for every new value added. This patch adds a new

[PATCH 4/4] Convert lto streamer out hashing to inchash

2014-07-24 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com No substantial changes, although the hash values will be slightly different. v2: add_flag moved to inchash. Some minor changes. gcc/: 2014-07-24 Andi Kleen a...@linux.intel.com * lto-streamer-out.c (hash_tree): Convert to inchash. --- gcc/lto

[PATCH 3/4] Convert the tree.c type hashing over to inchash

2014-07-24 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com v2: Use commutative interface. Be much nearer to the old code. gcc/: 2014-07-24 Andi Kleen a...@linux.intel.com * tree.c (build_type_attribute_qual_variant): Use inchash. (type_hash_list): Ditto. (attribute_hash_list): Ditto

Re: Avoid multiple entry SCC regions

2014-07-25 Thread Andi Kleen
Jan Hubicka hubi...@ucw.cz writes: I am lto bootstrapping/regtesting x86_64-linux and intend to commit once it passes. You'll have to redo it with hstates, sorry, as it conflicts with my patchkit which I checked in earlier. -Andi -- a...@linux.intel.com -- Speaking for myself only

Convert more incremental hash users to inchash

2014-07-27 Thread Andi Kleen
This patchkit converts more incremental hash users to the new inchash class. The only larger change is for rtl hashing, which I had to move to a new file to avoid problems with the generator program. All changes should only minimally change behavior. Bootstrapped and tested on x86_64-linux. Ok

[PATCH 2/6] Convert asan.c to inchash

2014-07-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/: 2014-07-25 Andi Kleen a...@linux.intel.com * asan.c (asan_mem_ref_hasher::hash): Convert to inchash. --- gcc/asan.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index 475dd82..f7fa55f

[PATCH 3/6] Convert ipa-devirt to inchash

2014-07-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/: 2014-07-25 Andi Kleen a...@linux.intel.com * ipa-devirt.c (polymorphic_call_target_hasher::hash): Convert to inchash. --- gcc/ipa-devirt.c | 20 +--- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/gcc

[PATCH 4/6] Convert tree-ssa-dom to inchash

2014-07-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/: 2014-07-25 Andi Kleen a...@linux.intel.com * tree-ssa-dom.c (iterative_hash_exprs_commutative): Convert to inchash. (iterative_hash_hashable_expr): Ditto. (avail_expr_hash): Ditto. --- gcc/tree-ssa-dom.c | 79

[PATCH 6/6] Convert tree-ssa-tail-merge to inchash

2014-07-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/: 2014-07-25 Andi Kleen a...@linux.intel.com * tree-ssa-tail-merge.c (same_succ_hash): Convert to inchash. --- gcc/tree-ssa-tail-merge.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/gcc/tree-ssa

[PATCH 5/6] Convert tree-ssa-sccvn to inchash

2014-07-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/: 2014-07-25 Andi Kleen a...@linux.intel.com * tree-ssa-sccvn.c (vn_reference_op_compute_hash): (vn_reference_compute_hash): (vn_nary_op_compute_hash): (vn_phi_compute_hash): * tree-ssa-sccvn.h

[PATCH 1/6] RTL dwarf2out changes

2014-07-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Convert dwarf2out and rtl.c to the new inchash interface. I moved the rtl hash code to another file to avoid having to link all the hash code into the generator functions. gcc/: 2014-07-25 Andi Kleen a...@linux.intel.com * Makefile.in (OBJS

Re: Basic speculation support for polymorphic_call_context

2014-07-28 Thread Andi Kleen
Jan Hubicka hubi...@ucw.cz writes: There are similar testcases in bugzilla where we do not devirtualize because we lost track of type promises C++ language makes on memory accesses. This may give us a clue how common these are. How would the user know without some optional warning? -Andi

Re: [PATCH 1/6] RTL dwarf2out changes

2014-07-28 Thread Andi Kleen
On Mon, Jul 28, 2014 at 11:48:58AM -0700, Cary Coutant wrote: + /* ??? MD5 of another hash doesn't make a lot of sense... */ + hash = hstate.end(); CHECKSUM (hash); [citation needed] I don't see why you think that. Maybe it'd be nicer if we could use hash_loc_operands() to feed its

Re: [PATCH, MPX runtime 1/2] Integrate MPX runtime library

2014-11-11 Thread Andi Kleen
Joseph Myers jos...@codesourcery.com writes: On Tue, 11 Nov 2014, Ilya Enkovich wrote: Hi, This patch integrates MPX runtime library into GCC source tree. MPX runtime is responsible for initialization of MPX feature in HW, signal handling, reporting etc. Library is linked to codes

Re: [PATCH, MPX runtime 1/2] Integrate MPX runtime library

2014-11-11 Thread Andi Kleen
It is similar to libsanitizer. Putting it in glibc isn't going to work well for MPX. Can you explain it more please? -Andi

Re: [PATCH, MPX runtime 1/2] Integrate MPX runtime library

2014-11-11 Thread Andi Kleen
On Tue, Nov 11, 2014 at 01:04:42PM -0800, H.J. Lu wrote: On Tue, Nov 11, 2014 at 1:01 PM, Andi Kleen a...@firstfloor.org wrote: It is similar to libsanitizer. Putting it in glibc isn't going to work well for MPX. Can you explain it more please? Are you suggesting putting MPX run-time

[WEB][PATCH] Describe -pg and LTO changes

2014-11-16 Thread Andi Kleen
This patch describes some user visible changes that were added to gcc 5. Ok to commit? -Andi Index: htdocs/gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.25 diff -u -r1.25

Re: [PING^2] Re: [PATCH] Add memory barriers to xbegin/xend/xabort

2014-11-17 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: Ping^2! Andi Kleen a...@firstfloor.org writes: Ping! From: Andi Kleen a...@linux.intel.com xbegin/xend/xabort were missing memory barriers. This can lead to memory operations being moved out of transactions, which would cause unexpected races

Re: [WEB][PATCH] Describe -pg and LTO changes

2014-11-17 Thread Andi Kleen
Jan Hubicka hubi...@ucw.cz writes: the C and C++ languages to support data and task parallelism.</li> +<li>New attribute <code>no_reorder</code> prevents reordering of selected symbols. + This enables to link-time optimize Linux kernel without need to use +

Re: [PATCH] Add memory barriers to xbegin/xend/xabort

2014-11-17 Thread Andi Kleen
H.J. Lu hjl.to...@gmail.com writes: On Wed, Oct 29, 2014 at 11:07 PM, Andi Kleen a...@linux.intel.com wrote: Hmm, can't the insns themselves properly clobber/use memory? The transactions don't really use the memory. They just guard it, like a lock. So the intrinsic doesn't know what

Re: [GOOGLE] Fix AutoFDO size issue

2014-11-17 Thread Andi Kleen
Xinliang David Li davi...@google.com writes: Ok for now as a workaround, but this is probably not a long term fix. Is the workaround needed for the mainline autofdo version too? -Andi David On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen de...@google.com wrote: The patch was updated to

[PATCH 2/2] Make -Q --help print param defaults and min/max values

2014-09-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Make -Q --help print the --param default, min, max values, similar to how it prints the defaults for other flags. This is useful to let an option auto tuner automatically query all needed information about gcc params (previously it needed to access

[PATCH 1/2] Remove -fshort-double

2014-09-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com -fshort-double has crashed the compiler since 4.6 (see PR60410). Since it's an obscure option that apparently nobody uses, the best way to fix it seems to be to just remove it. This prevents constant ICEs when running a gcc optimization flags autotuner. gcc

Re: [PING] [PATCH] Add direct support for Linux kernel __fentry__ patching

2014-09-27 Thread Andi Kleen
-not __fentry__ } } */ /* Origin: Andi Kleen */ diff --git a/gcc/testsuite/gcc.target/i386/record-mcount.c b/gcc/testsuite/gcc.target/i386/record-mcount.c index dae413e..26b0dbc 100644 --- a/gcc/testsuite/gcc.target/i386/record-mcount.c +++ b/gcc/testsuite/gcc.target/i386/record-mcount.c @@ -1,5 +1,5

Re: [PING] [PATCH] Add direct support for Linux kernel __fentry__ patching

2014-09-27 Thread Andi Kleen
On Sat, Sep 27, 2014 at 06:45:21PM +0200, Dominique d'Humières wrote: I think the patch for gcc.target/i386/nop-mcount.c should be True. Thanks. -Andi

[PATCH 2/2] Remove x86 cmpstrnsi

2014-09-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com In my tests the optimized glibc out of line strcmp is always faster than using inline rep ; cmpsb, even for small strings. The Intel optimization manual also recommends to not use it. So remove the cmpstrnsi instruction. Tested on Sandy Bridge, Westmere

[PATCH 1/2] Remove i386 cmpstrnsi peephole

2014-09-27 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com The peephole that removes the code to compute a tristate for cmpstrnsi when only a boolean jump is needed never triggers in my tests. Just remove it. gcc/: 2014-09-27 Andi Kleen a...@linux.intel.com * config/i386/i386.md: Remove peepholes

Re: [PATCH 2/2] Remove x86 cmpstrnsi

2014-09-27 Thread Andi Kleen
On Sat, Sep 27, 2014 at 08:45:18PM +0200, Oleg Endo wrote: On Sat, 2014-09-27 at 11:10 -0700, Andi Kleen wrote: From: Andi Kleen a...@linux.intel.com In my tests the optimized glibc out of line strcmp is always faster than using inline rep ; cmpsb, even for small strings. The Intel

Re: [PATCH 1/2] Remove -fshort-double

2014-09-29 Thread Andi Kleen
As we saw LTO fixes for -fshort-double it's clear that this flag _is_ used for some embedded archs. Did we? It has been ICEing since 4.5, which is before LTO. -Andi

Re: [PATCH 1/2] Remove -fshort-double

2014-09-29 Thread Andi Kleen
So - no, you can't simply remove it. But IMHO it should become a target-specific flag. How about a patch to just disable it for x86? -Andi

[PATCH 3/5] Add test cases for all the new cilk errors

2014-10-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/testsuite/: 2014-09-30 Andi Kleen a...@linux.intel.com * c-c++-common/cilk-plus/CK/errors.c: New test. --- gcc/testsuite/c-c++-common/cilk-plus/CK/errors.c | 56 1 file changed, 56 insertions(+) create mode 100644

[PATCH 4/5] Fix some of the existing Cilk tests for the new errors.

2014-10-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com gcc/testsuite/: 2014-09-30 Andi Kleen a...@linux.intel.com * c-c++-common/cilk-plus/AN/misc.c (main): Handle new cilk errors. --- gcc/testsuite/c-c++-common/cilk-plus/AN/misc.c | 8 1 file changed, 4 insertions(+), 4 deletions

[PATCH 5/5] Add illegal cilk checks to C++ front.

2014-10-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Add calls for several illegal Cilk cases to the C++ frontend. C++ usually doesn't ICE unlike C on illegal cilk, but it's better to match C in what is allowed and what is not. if (_Cilk_spawn ...) is still not errored, but at least it doesn't ICE. gcc/cp

[PATCH 2/5] Error out for Cilk_spawn or array expression in forbidden places

2014-10-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com _Cilk_spawn or Cilk array expressions are only allowed on their own, but not in for(), if(), switch, do, while, goto, etc. The C parser didn't always check for that, which led to ICEs earlier for invalid code. Add a generic helper that checks this and call

[PATCH 1/5] Fix error location for cilk error message

2014-10-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Output the correct location for an existing cilk error message. gcc/c-family/: 2014-09-28 Andi Kleen a...@linux.intel.com * cilk.c (recognize_spawn): Use expression location for error message. --- gcc/c-family/cilk.c | 2 +- 1 file

[PATCH 2/2] Add illegal cilk checks to C++ front.

2014-10-03 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Add calls for several illegal Cilk cases to the C++ frontend. C++ usually doesn't ICE unlike C on illegal cilk, but it's better to match C in what is allowed and what is not. if (_Cilk_spawn ...) is still not errored, but at least it doesn't ICE. gcc/cp

Updated cilk error patches

2014-10-03 Thread Andi Kleen
This version addresses the localization problem pointed out by Joseph. No other changes. I only reposted the two changed patches in the patchkit, the others have already been approved. Passes bootstrap and test suite on x86_64-linux. -Andi

[PATCH 1/2] Error out for Cilk_spawn or array expression in forbidden places

2014-10-03 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com _Cilk_spawn or Cilk array expressions are only allowed on their own, but not in for(), if(), switch, do, while, goto, etc. The C parser didn't always check for that, which led to ICEs earlier for invalid code. Add a generic helper that checks this and call

Re: [PATCH 2/2] Add illegal cilk checks to C++ front.

2014-10-03 Thread Andi Kleen
On Fri, Oct 03, 2014 at 07:10:05PM +0200, Paolo Carlini wrote: Hi, On 10/03/2014 04:08 PM, Andi Kleen wrote: + if (check_no_cilk (destination, + Cilk array notation cannot be used as a computed goto expression, + %_Cilk_spawn% statement cannot be used as a computed goto

Re: [PATCH 2/2] Add illegal cilk checks to C++ front.

2014-10-03 Thread Andi Kleen
I have no idea, but there are lots of error_at() all over which don't use _. So I just follow precedent. The problem is, you are *not* calling error_at directly, you are According to Joseph it's ok because I named the arguments _msgid. -Andi

Re: [PATCH] AutoFDO patch for trunk

2014-10-14 Thread Andi Kleen
Dehao Chen de...@google.com writes: + +@item -fauto-profile +@itemx -fauto-profile=@var{path} +@opindex fauto-profile +Enable sampling based feedback directed optimizations, and optimizations +generally profitable only with profile feedback available. + +The following options are enabled:

Re: [PATCH, i386]: Fix PR 59432, sync/atomic FAILs on 32bit x86 systems without .cfi directives

2014-10-16 Thread Andi Kleen
Uros Bizjak ubiz...@gmail.com writes: Hello! Now that %ebx is no more fixed, we can remove all PIC related complications in atomic_compare_and_swapdwi_doubleword pattern. The immediate consequence is, that we avoid hidden xchgs that clobbered unwinding state. Could also do the same in

Re: [Google/4-8] Support for user-guided feedback-directed library optimization

2014-05-09 Thread Andi Kleen
Teresa Johnson tejohn...@google.com writes: Passes regression tests. Ok for google branches? +{ + char parameter[1000]; + sprintf (parameter, %s=%ld,

Re: [Google/4-8] Support for user-guided feedback-directed library optimization

2014-05-10 Thread Andi Kleen
On Fri, May 09, 2014 at 08:11:40PM -0700, Teresa Johnson wrote: Thanks for catching that, I will fix it. BTW I first misunderstood the goal of you patch. (probably because there is no documentation ... something that should also be fixed) I originally thought it was a way to let user code use

Re: [RFC] Old school parallelization of WPA streaming

2014-02-20 Thread Andi Kleen
I plan to commit it shortly (I am just slowly progressing through the bug reports and TODOs accumulated) - indeed for bigger apps and the edit/relink cycle it is a life saver ;) I haven't tested exactly around this, but I see a ~10s (~5%) improved kernel LTO build time going from 4.9-20140209 to

Re: [wwwdocs] RFC - mention Cilk Plus in the GCC 4.9 release notes

2014-03-08 Thread Andi Kleen
Iyer, Balaji V balaji.v.i...@intel.com writes: The sentence Current only... should be changed to something like this: Currently all the features except _Cilk_for have been implemented. It would be also good if the documentation mentioned that you have to specify -lcilkrts -Andi --

Re: [wwwdocs] RFC - mention Cilk Plus in the GCC 4.9 release notes

2014-03-08 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: Iyer, Balaji V balaji.v.i...@intel.com writes: The sentence Current only... should be changed to something like this: Currently all the features except _Cilk_for have been implemented. It would be also good if the documentation mentioned that you have

Re: Cilk with -lcilkrts (was: Re: [wwwdocs] RFC - mention Cilk Plus in the GCC 4.9 release notes)

2014-03-08 Thread Andi Kleen
On Sat, Mar 08, 2014 at 09:22:54PM +0100, Tobias Burnus wrote: Andi Kleen wrote: It would be also good if the documentation mentioned that you have to specify -lcilkrts Wouldn't it make more sense to automatically add the option? For instance like the following? Or do we need to do the same

Re: [wwwdocs] RFC - mention Cilk Plus in the GCC 4.9 release notes

2014-03-08 Thread Andi Kleen
_Cilk_spawn is the correct keyword. cilk_spawn can be used if the user includes cilk/cilk.h which has the following 3 lines (and that's the whole file) #define cilk_spawn _Cilk_spawn #define cilk_sync _Cilk_sync #define cilk_for _Cilk_for In Cilk there are basically 3 keywords:

Re: [wwwdocs] RFC - mention Cilk Plus in the GCC 4.9 release notes

2014-03-08 Thread Andi Kleen
Everything except _Cilk_for should be supported. Imagine you're a new cilk user. For you it's totally obvious what everything is. But someone new to it won't know anything about what everything is. So you have to tell them. -Andi

[C++ PING] Re: [PATCH 5/5] Add illegal cilk checks to C++ front.

2014-10-26 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: Ping! Can someone from the C++ side please approve this patch? That's the only patch not approved in this patch kit, but blocking the commit. -Andi From: Andi Kleen a...@linux.intel.com Add calls for several illegal Cilk cases to the C++ frontend. C

[PATCH] Add memory barriers to xbegin/xend/xabort

2014-10-28 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com xbegin/xend/xabort were missing memory barriers. This can lead to memory operations being moved out of transactions, which would cause unexpected races. Always generate implicit memory barriers for these intrinsics. The compat header versions always

Re: [PATCH] Add memory barriers to xbegin/xend/xabort

2014-10-30 Thread Andi Kleen
Hmm, can't the insns themselves properly clobber/use memory? The transactions don't really use the memory. They just guard it, like a lock. So the intrinsic doesn't know what memory is used inside the transaction, but the accesses still cannot be moved out. I think a barrier is the only
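
To illustrate what "barrier" means in this exchange, below is a sketch that puts a plain compiler barrier around a guarded region. It deliberately does not use the RTM intrinsics, which need hardware support. The names `compiler_barrier`, `toy_begin`, `toy_end` and `guarded_add` are invented for the example, and the empty asm statement with a "memory" clobber is the common GCC idiom for telling the compiler not to move memory accesses across a point. Whether the patch uses this same mechanism internally is not stated in the excerpt, so treat this as an analogy only.

```cpp
#include <cassert>

// Compiler-level barrier (GCC/Clang syntax). It emits no instruction, but the
// "memory" clobber tells the compiler that memory may be read or written
// here, so loads and stores are not reordered across it at compile time.
static inline void compiler_barrier() {
    __asm__ __volatile__("" ::: "memory");
}

static int toy_depth = 0;  // stand-in for "currently inside a guarded region"

// Stand-ins for the begin/end of a transaction. Like a lock, they do not
// know which memory the guarded code will touch, hence the blanket barrier.
static inline void toy_begin() { ++toy_depth; compiler_barrier(); }
static inline void toy_end()   { compiler_barrier(); --toy_depth; }

// The accesses to *shared must stay between toy_begin() and toy_end().
int guarded_add(int *shared, int delta) {
    toy_begin();
    *shared += delta;
    int seen = *shared;
    toy_end();
    return seen;
}
```

The sketch only constrains the compiler; it says nothing about hardware ordering, which for real transactions is the job of the instructions themselves.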

Re: [C++ PING^2] Re: [PATCH 5/5] Add illegal cilk checks to C++ front.

2014-11-03 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: Ping!^2 Andi Kleen a...@firstfloor.org writes: Ping! Can someone from the C++ side please approve this patch? That's the only patch not approved in this patch kit, but blocking the commit. -Andi From: Andi Kleen a...@linux.intel.com Add calls

Re: RFC: Update ISL under gcc/infrastructure/ ? // Remove CLooG?

2014-11-10 Thread Andi Kleen
Roman Gareev gareevro...@gmail.com writes: Hi Tobias, I've attached a patch which removes using of CLooG library from Graphite. Is it fine for trunk? Could you please also remove -Werror by default from cloog? Currently with LTO builds warnings in one of these libraries usually break the

Re: [C++ PING^3] Re: [PATCH 5/5] Add illegal cilk checks to C++ front.

2014-11-10 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: Ping!^3 Andi Kleen a...@firstfloor.org writes: Ping!^2 Andi Kleen a...@firstfloor.org writes: Ping! Can someone from the C++ side please approve this patch? That's the only patch not approved in this patch kit, but blocking the commit. -Andi

[PING] Re: [PATCH] Add memory barriers to xbegin/xend/xabort

2014-11-10 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: Ping! From: Andi Kleen a...@linux.intel.com xbegin/xend/xabort were missing memory barriers. This can lead to memory operations being moved out of transactions, which would cause unexpected races. Always generate implicit memory barriers

Re: [PATCH 5/5] Add illegal cilk checks to C++ front.

2014-11-10 Thread Andi Kleen
On Sun, Nov 09, 2014 at 11:03:50PM -0600, Jason Merrill wrote: On 10/01/2014 11:26 PM, Andi Kleen wrote: + if (check_no_cilk (cond, in a condition for a for-loop)) Why is this one in while the others are as? I think in was somewhere hard coded in the test suite and I wanted to minimize test

Re: [PING^3] Re: [PATCH 1/2] Add -B support to gcc-ar/ranlib/nm

2014-08-27 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: PING! Andi Kleen a...@firstfloor.org writes: PING^2 ! Would be nice to make slim bootstrap work, it really speeds it up quite a bit. From: Andi Kleen a...@linux.intel.com To use gcc-{ar,ranlib} for boot strap we need to add a -B option

Re: [PING^3] Re: [PATCH 1/2] Add -B support to gcc-ar/ranlib/nm

2014-08-30 Thread Andi Kleen
Hi Richard, On Thu, Aug 28, 2014 at 10:18:22AM +0200, Richard Biener wrote: This also matches joined -B/foo +{ + const char *arg = av[i] + 2; + const char *end; + + memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i)); + ac--; + if (*arg ==

[PATCH 1/2] Add -B support to gcc-ar/ranlib/nm

2014-08-31 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com To use gcc-{ar,ranlib} for boot strap we need to add a -B option to the tool. Since ar has weird and unusual argument conventions implement the code by hand instead of using any libraries. v2: Fix typo v3: Improve comments. Use strlen. Use DIR_SEPARATOR. Add
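
The hand-written option handling described here can be sketched roughly as follows. This is not the code from the patch: the function name `take_B_prefixes` and the use of `std::vector<std::string>` are assumptions made for readability. It shows the two spellings that come up in the review, the joined form `-B/foo` and the separate form `-B /foo`, and leaves every other argument untouched so that the wrapped tool's own unusual argument conventions are not disturbed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the code from the patch: remove every "-B<dir>"
// or "-B <dir>" from args and return the directories in command-line order.
std::vector<std::string> take_B_prefixes(std::vector<std::string> &args) {
    std::vector<std::string> prefixes;
    std::vector<std::string> rest;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string &a = args[i];
        if (a.compare(0, 2, "-B") != 0) {
            rest.push_back(a);                // not ours, pass through
        } else if (a.size() > 2) {
            prefixes.push_back(a.substr(2));  // joined form: -B/foo
        } else if (i + 1 < args.size()) {
            prefixes.push_back(args[++i]);    // separate form: -B /foo
        }
        // A trailing bare "-B" with no directory is dropped in this sketch.
    }
    args = rest;
    return prefixes;
}
```

For example, given `-B/opt/gcc rc -B /usr/lib libx.a x.o`, the helper returns the two directories and leaves `rc libx.a x.o` for the wrapped program.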

[PATCH 2/2] Support slim LTO bootstrap

2014-08-31 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Change the bootstrap-lto config file to use slim (non fat) LTO. Speeds up the LTO bootstrap by ~18% on a 4 core system. This requires using gcc-ar/ranlib in post stage 1 builds, so these are passed to all sub builds. v2: Change existing config file

Updated slim LTO patchkit

2014-08-31 Thread Andi Kleen
This patchkit implements slim LTO bootstrap, which speeds up LTO bootstrap by only compiling once. I implemented all of Richard's feedback for the new -B option for gcc-ar. Passes full bootstrap and test on x86_64-linux and LTO bootstrap. Ok to commit now? -Andi

Re: [PATCH 2/2] Support slim LTO bootstrap

2014-08-31 Thread Andi Kleen
Andi Kleen a...@firstfloor.org writes: diff --git a/config/bootstrap-lto-slim.mk b/config/bootstrap-lto-slim.mk new file mode 100644 index 000..9e065e1 --- /dev/null +++ b/config/bootstrap-lto-slim.mk This file was not supposed to be included anymore. I removed it in my version. Instead

Re: [PATCH 2/2] Support slim LTO bootstrap

2014-09-01 Thread Andi Kleen
-ffat-lto-objects is automatically active for hosts not supporting the linker plugin. Is gcc-ar$(exeext) unconditionally built and found on such hosts? Will gcc-ar work without finding a linker plugin on such hosts? Currently it errors out. I suppose that could be turned into a warning I

Re: [PATCH 1/2] Add -B support to gcc-ar/ranlib/nm

2014-09-01 Thread Andi Kleen
On Mon, Sep 01, 2014 at 01:34:17PM +0200, Richard Biener wrote: On Sun, Aug 31, 2014 at 4:51 PM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com To use gcc-{ar,ranlib} for boot strap we need to add a -B option to the tool. Since ar has weird and unusual

[PATCH] gcc-ar: Turn plugin not found case into a warning

2014-09-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com Only give a warning when gcc-ar/nm/ranlib cannot find the plugin. In this case do not pass a plugin argument to the wrapped program. This should make it work on non linker plugin systems, so that the build system can use it unconditionally. gcc/: 2014-09

[PATCH] Add -fno-instrument-function

2014-09-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com [This was an old patch of mine that has been posted before, but never made it in] This adds a new C/C++ option to force __attribute__((no_instrument_function)) on every function compiled. This is useful together with LTO. You may want to have the whole

[PATCH] Add direct support for Linux kernel __fentry__ patching

2014-09-01 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com The Linux kernel dynamically patches in __fentry__ calls in and out at runtime. This allows using function tracing for debugging in production kernels without (significant) performance penalty. For this it needs a table pointing to each __fentry__ call

[PATCH] Force rtl templates to be inlined

2014-09-02 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com I noticed that with the trunk compiler a range of the new rtl inlines show up as hot in a profiler during stage1. I think that happens because stage1 is not using optimization and does not inline plain inline. And these rtl inlines are very frequently called
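
For readers unfamiliar with the mechanism under discussion, this is a small sketch of forcing inlining with the `always_inline` function attribute. The macro name `TOY_ALWAYS_INLINE`, the `toy_node` struct and the accessor are invented for illustration; they are not the rtl helpers touched by the patch.

```cpp
#include <cassert>

// GCC and Clang accept this attribute; other compilers fall back to plain inline.
#if defined(__GNUC__)
#define TOY_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define TOY_ALWAYS_INLINE inline
#endif

struct toy_node {
    int code;
    int value;
};

// A tiny checked accessor of the kind that gets called very frequently.
// With plain `inline`, an unoptimized (-O0) build typically keeps the call;
// the attribute asks the compiler to inline it regardless of optimization level.
TOY_ALWAYS_INLINE int toy_value(const toy_node *n) {
    assert(n != nullptr);
    return n->value;
}
```

This matches the situation described above, where stage1 is built without optimization and small accessors otherwise show up as real calls in the profile.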

Re: [PATCH] Force rtl templates to be inlined

2014-09-02 Thread Andi Kleen
there have been bugs in the past in the area of always_inline too. You're arguing for my patch. It would find those bugs. -Andi

Re: [PATCH] Add -fno-instrument-function

2014-09-02 Thread Andi Kleen
Hmm, why not make -no-pg (does that exist?) and/or -mno-fentry I'm not sure. do this? That is, I don't see the need for a new option. That would be really odd behavior. A yes/no option whose default is controlled by other object files' command line. And -pg would be for all files in LTO,

Re: [PATCH] Force rtl templates to be inlined

2014-09-02 Thread Andi Kleen
Or we simply should make -finline work at -O0 (I suppose it might already work?) and use it. Yes that's probably better. There are more hot inlines in the stage 1 profile (like wi::storage_ref or vec::length) I suspect with the ongoing C++'ification that will get worse. -Andi --
