Marc Glisse marc.gli...@inria.fr writes:
Hello,
this is a stage 1 patch, and I'll ping it then, but if you have
comments now...
FWIW I believe the transformation will break a large variety of micro
benchmarks.
calloc internally knows that memory fresh from the OS is zeroed.
But the memory
On Mon, Jun 23, 2014 at 09:00:02PM +0200, Marc Glisse wrote:
On Mon, 23 Jun 2014, Andi Kleen wrote:
FWIW I believe the transformation will break a large variety of micro
benchmarks.
calloc internally knows that memory fresh from the OS is zeroed.
But the memory may not be faulted in yet
On Mon, Jun 23, 2014 at 10:14:25PM +0200, Marc Glisse wrote:
On Mon, 23 Jun 2014, Andi Kleen wrote:
I would prefer to not do it.
For the sake of micro benchmarks?
Yes benchmarks are important.
-Andi
Yi Yang ahyan...@google.com writes:
Hi,
This patch adds an option. When the option is enabled, GCC will add a
record about it in an elf section called
.gnu.switches.text.branch.annotation for every branch.
This would be nice to have even in mainline for the normal profiling.
-Andi
--
Andrew Pinski pins...@gmail.com writes:
I committed this as obvious. The changelog says it all.
I think the problem is that LTO bootstraps frequently error out with
-Werror. That incentivizes LTO users to bootstrap with --disable-werror.
[e.g. currently the graphite libraries don't build this
Ilya Enkovich enkovich@gmail.com writes:
Silvermont processors have a penalty for instructions having 4+ bytes of
prefixes (including escape bytes in the opcode). This situation happens
when a REX prefix is used in SSE4 instructions. This patch tries to
avoid such situations by preferring
Mike Stump mikest...@comcast.net writes:
Silvermont processors have a penalty for instructions having 4+ bytes of
prefixes (including escape bytes in the opcode). This situation happens
when a REX prefix is used in SSE4 instructions. This patch tries to
avoid such situations by preferring xmm0-xmm7
From: Andi Kleen a...@linux.intel.com
In my tests the optimized glibc out of line strcmp is always faster than
using inline rep ; cmpsb, even for small strings. The Intel optimization manual
also recommends to not use it. So remove the cmpstrnsi instruction.
Tested on Sandy Bridge, Westmere
From: Andi Kleen a...@linux.intel.com
The peephole that removes the code to compute a tristate for cmpstrnsi
when only a boolean jump is needed never triggers in my tests. Just
remove it.
gcc/:
2014-07-02 Andi Kleen a...@linux.intel.com
* config/i386/i386.md: Remove peepholes
[I couldn't find a patch submission address for ICL, so I'm sending this
here]
With ICL enabled and an LTO bootstrap the ICL build always errors
out due to -Werror=maybe-uninitialized. The following patch fixes
the LTO build for me by initializing the variables in question.
All warnings were
On Wed, Jul 09, 2014 at 10:17:01AM -0700, Mike Stump wrote:
On Jul 8, 2014, at 9:01 PM, Andi Kleen a...@firstfloor.org wrote:
With ICL enabled and an LTO bootstrap the ICL build always errors
out due to -Werror=maybe-uninitialized. The following patch fixes
the LTO build for me by initializing
This patchkit abstracts incremental hashing in tree.c and lto.c
to make it easier to plug in new and more efficient hash algorithms.
Right now it uses the old hash algorithms. So it's just a cleanup.
Passes bootstrap and testing on x86_64-linux.
-Andi
From: Andi Kleen a...@linux.intel.com
Should not really change any behavior, it's just a more abstract
interface, but uses the same underlying hash functions.
lto/:
2014-07-10 Andi Kleen a...@linux.intel.com
* lto.c (hash_canonical_type): Convert to inchash
From: Andi Kleen a...@linux.intel.com
Some files in gcc, like lto or tree, do large scale incremental hashing.
The current jhash implementation of this could likely be improved
by using an incremental hash that does not do a full rehashing
for every new value added.
This patch adds a new
From: Andi Kleen a...@linux.intel.com
No substantial changes, although the hash values will be slightly
different.
gcc/:
2014-07-10 Andi Kleen a...@linux.intel.com
* lto-streamer-out.c (hash_tree): Convert to inchash.
(add_flag): New macro.
---
gcc/lto-streamer-out.c | 245
From: Andi Kleen a...@linux.intel.com
Again should not really change any behavior (except
for some minor differences with empty types)
gcc/:
2014-07-10 Andi Kleen a...@linux.intel.com
* tree.c (build_type_attribute_qual_variant): Use inchash.
(type_hash_list): Ditto
On Wed, Jul 16, 2014 at 10:40:53PM -0400, Trevor Saunders wrote:
+++ b/gcc/inchash.h
+class inchash
+{
+ hashval_t val;
normal style would be explicit private: at the end.
Ok.
+ public:
+
+ /* Start incremental hashing, optionally with SEED. */
+ void begin (hashval_t
Andi Kleen a...@firstfloor.org writes:
This patchkit abstracts incremental hashing in tree.c and lto.c
to make it easier to plug in new and more efficient hash algorithms.
Right now it uses the old hash algorithms. So it's just a cleanup.
Passes bootstrap and testing on x86_64-linux.
Ping
Btw, what will be the way to plug in an alternative hash function?
That is, there doesn't seem to be a separation of interface
and implementation in your patch (like with a template or a base-class
you inherit from).
Just change the inchash.h include file. The point was to only
change a
On Tue, Jul 22, 2014 at 09:40:15PM -0600, Jeff Law wrote:
On 07/15/14 23:31, Andi Kleen wrote:
From: Andi Kleen a...@linux.intel.com
No substantial changes, although the hash values will be slightly
different.
gcc/:
2014-07-10 Andi Kleen a...@linux.intel.com
* lto-streamer
I think we managed to
stay bytecode compatible for 4.8 release series. (Richi knows better)
Nope, fortran options broke it at some point.
-Andi
--
a...@linux.intel.com -- Speaking for myself only.
On Wed, Jul 23, 2014 at 06:00:35PM +0200, Richard Biener wrote:
On July 23, 2014 5:15:53 PM CEST, Andi Kleen a...@firstfloor.org wrote:
I think we managed to
stay bytecode compatible for 4.8 release series. (Richi knows better)
Nope, fortran options broke it at some point.
We try hard
On Wed, Jul 23, 2014 at 04:21:59PM +0200, Richard Biener wrote:
On Wed, Jul 23, 2014 at 5:40 AM, Jeff Law l...@redhat.com wrote:
On 07/15/14 23:31, Andi Kleen wrote:
From: Andi Kleen a...@linux.intel.com
No substantial changes, although the hash values will be slightly
different
So there will be at most one hash implementation?
One per binary I expect. Modern hash functions are pretty good,
so it's unlikely that someone needs to come up with special
purpose hashes.
I found Bob Jenkins' spooky is rather good for this case (very
large incremental keys), but it is only
Why didn't you replace the tree.c uses BTW?
Patches were already quite big, but I'll add it.
Actually I handled them all in tree.c. Did you
mean something else?
I didn't convert all of tree-ssa-* and dwarf* so far,
and a few other places. This can be done step by step.
-Andi
This version addresses the review feedback. begin is gone now.
add_flag is in the class. The changes in tree.c are nearer
the original code now. Some other minor cleanups.
Passed bootstrap and test on x86_64-linux. Ok to commit
now?
Thanks,
-Andi
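
The interface described in this thread (values added one at a time, add_flag inside the class, a commutative variant, a final end call) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name inc_hash, the add_int helper, the default seed, and the FNV-1a-style mixing step are all hypothetical and are not the code in the patch, which reuses GCC's existing hash functions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only; not GCC's inchash.h.
class inc_hash
{
public:
  // The default seed is the FNV offset basis (an assumption of this sketch).
  explicit inc_hash (std::uint32_t seed = 2166136261u) : val (seed) {}

  // Fold LEN bytes at DATA into the running state, one byte at a time,
  // so adding a value never rehashes what was added before.
  void add (const void *data, std::size_t len)
  {
    const unsigned char *p = static_cast<const unsigned char *> (data);
    for (std::size_t i = 0; i < len; i++)
      val = (val ^ p[i]) * 16777619u;   // FNV-1a style step
  }

  void add_int (std::uint32_t v) { add (&v, sizeof v); }

  void add_flag (bool f) { add_int (f ? 1u : 0u); }

  // Order-independent merge: sort the two finished hashes before adding.
  void add_commutative (const inc_hash &a, const inc_hash &b)
  {
    std::uint32_t x = a.end (), y = b.end ();
    if (x > y)
      {
        std::uint32_t t = x;
        x = y;
        y = t;
      }
    add_int (x);
    add_int (y);
  }

  // Return the hash of everything added so far.
  std::uint32_t end () const { return val; }

private:
  std::uint32_t val;
};
```

Because callers only see add/end, swapping the mixing step for a different algorithm would touch this one class and nothing else, which is the point made in the thread about changing only the include file.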
From: Andi Kleen a...@linux.intel.com
Should not really change any behavior, it's just a more abstract
interface, but uses the same underlying hash functions.
lto/:
2014-07-24 Andi Kleen a...@linux.intel.com
* lto.c (hash_canonical_type): Convert to inchash
From: Andi Kleen a...@linux.intel.com
Some files in gcc, like lto or tree, do large scale incremental hashing.
The current jhash implementation of this could likely be improved
by using an incremental hash that does not do a full rehashing
for every new value added.
This patch adds a new
From: Andi Kleen a...@linux.intel.com
No substantial changes, although the hash values will be slightly
different.
v2: add_flag moved to inchash. Some minor changes.
gcc/:
2014-07-24 Andi Kleen a...@linux.intel.com
* lto-streamer-out.c (hash_tree): Convert to inchash.
---
gcc/lto
From: Andi Kleen a...@linux.intel.com
v2: Use commutative interface. Be much nearer to the old
code.
gcc/:
2014-07-24 Andi Kleen a...@linux.intel.com
* tree.c (build_type_attribute_qual_variant): Use inchash.
(type_hash_list): Ditto.
(attribute_hash_list): Ditto
Jan Hubicka hubi...@ucw.cz writes:
I am lto bootstrapping/regtesting x86_64-linux and intend to commit once it
passes.
You'll have to redo it with hstates, sorry, as it conflicts with my
patchkit which I checked in earlier.
-Andi
--
a...@linux.intel.com -- Speaking for myself only
This patchkit converts more incremental hash users to the new
inchash class. The only larger change is for rtl hashing,
which I had to move to a new file to avoid problems
with the generator program. All changes should only
minimally change behavior.
Bootstrapped and tested on x86_64-linux. Ok
From: Andi Kleen a...@linux.intel.com
gcc/:
2014-07-25 Andi Kleen a...@linux.intel.com
* asan.c (asan_mem_ref_hasher::hash): Convert to inchash.
---
gcc/asan.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/gcc/asan.c b/gcc/asan.c
index 475dd82..f7fa55f
From: Andi Kleen a...@linux.intel.com
gcc/:
2014-07-25 Andi Kleen a...@linux.intel.com
* ipa-devirt.c (polymorphic_call_target_hasher::hash):
Convert to inchash.
---
gcc/ipa-devirt.c | 20 +---
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/gcc
From: Andi Kleen a...@linux.intel.com
gcc/:
2014-07-25 Andi Kleen a...@linux.intel.com
* tree-ssa-dom.c (iterative_hash_exprs_commutative): Convert to inchash.
(iterative_hash_hashable_expr): Ditto.
(avail_expr_hash): Ditto.
---
gcc/tree-ssa-dom.c | 79
From: Andi Kleen a...@linux.intel.com
gcc/:
2014-07-25 Andi Kleen a...@linux.intel.com
* tree-ssa-tail-merge.c (same_succ_hash): Convert to inchash.
---
gcc/tree-ssa-tail-merge.c | 22 ++
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/gcc/tree-ssa
From: Andi Kleen a...@linux.intel.com
gcc/:
2014-07-25 Andi Kleen a...@linux.intel.com
* tree-ssa-sccvn.c (vn_reference_op_compute_hash):
(vn_reference_compute_hash):
(vn_nary_op_compute_hash):
(vn_phi_compute_hash):
* tree-ssa-sccvn.h
From: Andi Kleen a...@linux.intel.com
Convert dwarf2out and rtl.c to the new inchash interface.
I moved the rtl hash code to another file to avoid having to link
all the hash code into the generator functions.
gcc/:
2014-07-25 Andi Kleen a...@linux.intel.com
* Makefile.in (OBJS
Jan Hubicka hubi...@ucw.cz writes:
There are similar testcases in bugzilla where we do not devirtualize because we
lost track of the type promises the C++ language makes on memory accesses. This may
give us a clue how common these are.
How would the user know without some optional warning?
-Andi
On Mon, Jul 28, 2014 at 11:48:58AM -0700, Cary Coutant wrote:
+ /* ??? MD5 of another hash doesn't make a lot of sense... */
+ hash = hstate.end();
CHECKSUM (hash);
[citation needed] I don't see why you think that. Maybe it'd be nicer
if we could use hash_loc_operands() to feed its
Joseph Myers jos...@codesourcery.com writes:
On Tue, 11 Nov 2014, Ilya Enkovich wrote:
Hi,
This patch integrates MPX runtime library into GCC source tree. MPX
runtime is responsible for initialization of MPX feature in HW, signal
handling, reporting etc. Library is linked to codes
It is similar to libsanitizer. Putting it in glibc isn't going to work well
for MPX.
Can you explain it more please?
-Andi
On Tue, Nov 11, 2014 at 01:04:42PM -0800, H.J. Lu wrote:
On Tue, Nov 11, 2014 at 1:01 PM, Andi Kleen a...@firstfloor.org wrote:
It is similar to libsanitizer. Putting it in glibc isn't going to work well
for MPX.
Can you explain it more please?
Are you suggesting putting MPX run-time
This patch describes some user visible changes that were
added to gcc 5.
Ok to commit?
-Andi
Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.25
diff -u -r1.25
Andi Kleen a...@firstfloor.org writes:
Ping^2!
Andi Kleen a...@firstfloor.org writes:
Ping!
From: Andi Kleen a...@linux.intel.com
xbegin/xend/xabort were missing memory barriers. This can
lead to memory operations being moved out of transactions, which would
cause unexpected races
Jan Hubicka hubi...@ucw.cz writes:
the C and C++ languages to support data and task parallelism.</li>
+<li>New attribute <code>no_reorder</code> prevents reordering of
selected symbols.
+ This enables to link-time optimize Linux kernel without need to use
+
H.J. Lu hjl.to...@gmail.com writes:
On Wed, Oct 29, 2014 at 11:07 PM, Andi Kleen a...@linux.intel.com wrote:
Hmm, can't the insns themselves properly clobber/use memory?
The transactions don't really use the memory. They just guard it,
like a lock.
So the intrinsic doesn't know what
Xinliang David Li davi...@google.com writes:
Ok for now as a workaround, but this is probably not a long term fix.
Is the workaround needed for the mainline autofdo version too?
-Andi
David
On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen de...@google.com wrote:
The patch was updated to
From: Andi Kleen a...@linux.intel.com
Make -Q --help print the --param default, min, max values, similar
to how it prints the defaults for other flags. This is useful
to let an option auto-tuner automatically query all needed information
about gcc params (previously it needed to access
From: Andi Kleen a...@linux.intel.com
-fshort-double has crashed the compiler since 4.6 (see PR60410).
Since it's an obscure option that apparently nobody uses, the
best way to fix it seems to be to just remove it.
This prevents constant ICEs when running a gcc optimization flags
autotuner.
gcc
-not __fentry__ } } */
/* Origin: Andi Kleen */
diff --git a/gcc/testsuite/gcc.target/i386/record-mcount.c
b/gcc/testsuite/gcc.target/i386/record-mcount.c
index dae413e..26b0dbc 100644
--- a/gcc/testsuite/gcc.target/i386/record-mcount.c
+++ b/gcc/testsuite/gcc.target/i386/record-mcount.c
@@ -1,5 +1,5
On Sat, Sep 27, 2014 at 06:45:21PM +0200, Dominique d'Humières wrote:
I think the patch for gcc.target/i386/nop-mcount.c should be
True. Thanks.
-Andi
From: Andi Kleen a...@linux.intel.com
In my tests the optimized glibc out of line strcmp is always faster than
using inline rep ; cmpsb, even for small strings. The Intel optimization manual
also recommends to not use it. So remove the cmpstrnsi instruction.
Tested on Sandy Bridge, Westmere
From: Andi Kleen a...@linux.intel.com
The peephole that removes the code to compute a tristate for cmpstrnsi
when only a boolean jump is needed never triggers in my tests. Just
remove it.
gcc/:
2014-09-27 Andi Kleen a...@linux.intel.com
* config/i386/i386.md: Remove peepholes
On Sat, Sep 27, 2014 at 08:45:18PM +0200, Oleg Endo wrote:
On Sat, 2014-09-27 at 11:10 -0700, Andi Kleen wrote:
From: Andi Kleen a...@linux.intel.com
In my tests the optimized glibc out of line strcmp is always faster than
using inline rep ; cmpsb, even for small strings. The Intel
As we saw LTO fixes for -fshort-double it's clear that this flag _is_ used
for some embedded archs.
Did we? It has been ICEing since 4.5, which is before LTO.
-Andi
So - no, you can't simply remove it. But IMHO it should become a
target-specific flag.
How about a patch to just disable it for x86?
-Andi
From: Andi Kleen a...@linux.intel.com
gcc/testsuite/:
2014-09-30 Andi Kleen a...@linux.intel.com
* c-c++-common/cilk-plus/CK/errors.c: New test.
---
gcc/testsuite/c-c++-common/cilk-plus/CK/errors.c | 56
1 file changed, 56 insertions(+)
create mode 100644
From: Andi Kleen a...@linux.intel.com
gcc/testsuite/:
2014-09-30 Andi Kleen a...@linux.intel.com
* c-c++-common/cilk-plus/AN/misc.c (main): Handle
new cilk errors.
---
gcc/testsuite/c-c++-common/cilk-plus/AN/misc.c | 8
1 file changed, 4 insertions(+), 4 deletions
From: Andi Kleen a...@linux.intel.com
Add calls for several illegal Cilk cases to the C++ frontend.
C++ usually doesn't ICE unlike C on illegal cilk, but it's
better to match C in what is allowed and what is not.
if (_Cilk_spawn ...) is still not errored, but at least it doesn't ICE.
gcc/cp
From: Andi Kleen a...@linux.intel.com
_Cilk_spawn or Cilk array expressions are only allowed on their own,
but not in for(), if(), switch, do, while, goto, etc.
The C parser didn't always check for that, which led to ICEs earlier
for invalid code.
Add a generic helper that checks this and call
From: Andi Kleen a...@linux.intel.com
Output the correct location for an existing cilk error message.
gcc/c-family/:
2014-09-28 Andi Kleen a...@linux.intel.com
* cilk.c (recognize_spawn): Use expression location
for error message.
---
gcc/c-family/cilk.c | 2 +-
1 file
From: Andi Kleen a...@linux.intel.com
Add calls for several illegal Cilk cases to the C++ frontend.
C++ usually doesn't ICE unlike C on illegal cilk, but it's
better to match C in what is allowed and what is not.
if (_Cilk_spawn ...) is still not errored, but at least it doesn't ICE.
gcc/cp
This version addresses the localization problem pointed out by Joseph.
No other changes. I only reposted the two changed patches in the patchkit,
the others have already been approved.
Passes bootstrap and test suite on x86_64-linux.
-Andi
From: Andi Kleen a...@linux.intel.com
_Cilk_spawn or Cilk array expressions are only allowed on their own,
but not in for(), if(), switch, do, while, goto, etc.
The C parser didn't always check for that, which led to ICEs earlier
for invalid code.
Add a generic helper that checks this and call
On Fri, Oct 03, 2014 at 07:10:05PM +0200, Paolo Carlini wrote:
Hi,
On 10/03/2014 04:08 PM, Andi Kleen wrote:
+ if (check_no_cilk (destination,
+ "Cilk array notation cannot be used as a computed goto expression",
+ "%<_Cilk_spawn%> statement cannot be used as a computed goto
I have no idea, but there are lots of error_at() calls all over which
don't use _. So I just followed precedent.
The problem is, you are *not* calling error_at directly, you are
According to Joseph it's ok because I named the arguments _msgid.
-Andi
Dehao Chen de...@google.com writes:
+
+@item -fauto-profile
+@itemx -fauto-profile=@var{path}
+@opindex fauto-profile
+Enable sampling based feedback directed optimizations, and optimizations
+generally profitable only with profile feedback available.
+
+The following options are enabled:
Uros Bizjak ubiz...@gmail.com writes:
Hello!
Now that %ebx is no more fixed, we can remove all PIC related
complications in atomic_compare_and_swapdwi_doubleword pattern. The
immediate consequence is, that we avoid hidden xchgs that clobbered
unwinding state.
Could also do the same in
Teresa Johnson tejohn...@google.com writes:
Passes regression tests. Ok for google branches?
+{
+ char parameter[1000];
+ sprintf (parameter, "%s=%ld",
On Fri, May 09, 2014 at 08:11:40PM -0700, Teresa Johnson wrote:
Thanks for catching that, I will fix it.
BTW I first misunderstood the goal of your patch.
(probably because there is no documentation ... something that should
also be fixed)
I originally thought it was a way to let user code use
I plan to commit it shortly (I am just slowly progressing through the
bug reports and TODOs that have accumulated)
- indeed for bigger apps and the edit/relink cycle it is a life saver ;)
I haven't tested exactly around this, but I see a ~10s (~5%) improved kernel
LTO build time going from 4.9-20140209 to
Iyer, Balaji V balaji.v.i...@intel.com writes:
The sentence "Current only..." should be changed to something like this:
"Currently all the features except _Cilk_for have been implemented."
It would be also good if the documentation mentioned that you have to
specify -lcilkrts
-Andi
--
Andi Kleen a...@firstfloor.org writes:
Iyer, Balaji V balaji.v.i...@intel.com writes:
The sentence "Current only..." should be changed to something like this:
"Currently all the features except _Cilk_for have been implemented."
It would be also good if the documentation mentioned that you have
On Sat, Mar 08, 2014 at 09:22:54PM +0100, Tobias Burnus wrote:
Andi Kleen wrote:
It would be also good if the documentation mentioned that you have
to specify -lcilkrts
Wouldn't it make more sense to automatically add the option? For
instance like the following? Or do we need to do the same
_Cilk_spawn is the correct keyword. cilk_spawn can be used if the user
includes <cilk/cilk.h>, which has the following 3 lines (and that's the whole
file)
#define cilk_spawn _Cilk_spawn
#define cilk_sync _Cilk_sync
#define cilk_for _Cilk_for
In Cilk there are basically 3 keywords:
Everything except _Cilk_for should be supported.
Imagine you're a new cilk user. For you it's totally obvious
what everything is. But someone new to it won't
know anything about any of it. So you have to tell them.
-Andi
Andi Kleen a...@firstfloor.org writes:
Ping!
Can someone from the C++ side please approve this patch?
That's the only patch not approved in this patch kit, but blocking
the commit.
-Andi
From: Andi Kleen a...@linux.intel.com
Add calls for several illegal Cilk cases to the C++ frontend.
C
From: Andi Kleen a...@linux.intel.com
xbegin/xend/xabort were missing memory barriers. This can
lead to memory operations being moved out of transactions, which would
cause unexpected races.
Always generate implicit memory barriers for these intrinsics.
The compat header versions always
Hmm, can't the insns themselves properly clobber/use memory?
The transactions don't really use the memory. They just guard it,
like a lock.
So the intrinsic doesn't know what memory is used inside the transaction,
but the accesses still cannot be moved out.
I think a barrier is the only
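
What "the accesses cannot be moved out" means at the compiler level can be sketched like this. It is an illustration under assumptions, not the patch: guard_begin and guard_end are hypothetical stand-ins that do not start a hardware transaction at all; they only show a compiler-level barrier placed on both sides of the guarded accesses, here expressed with the standard std::atomic_signal_fence.

```cpp
#include <atomic>

// Hypothetical stand-ins for a transaction begin/end pair. They do NOT
// start a hardware transaction; they only carry the compiler barrier.
inline void guard_begin ()
{
  // Compiler-level fence: it typically emits no instruction, but the
  // compiler may not move memory accesses across it.
  std::atomic_signal_fence (std::memory_order_seq_cst);
}

inline void guard_end ()
{
  std::atomic_signal_fence (std::memory_order_seq_cst);
}

int shared_counter = 0;

// The increment stays between the two fences; without them the compiler
// would be free to hoist or sink the access past the begin/end calls.
int guarded_increment ()
{
  guard_begin ();
  int v = ++shared_counter;
  guard_end ();
  return v;
}
```

The guard functions never name the memory they protect, which mirrors the point in the reply: the begin/end operations do not use the memory themselves, so a general barrier is the way to keep unrelated accesses inside.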
Andi Kleen a...@firstfloor.org writes:
Ping!^2
Andi Kleen a...@firstfloor.org writes:
Ping!
Can someone from the C++ side please approve this patch?
That's the only patch not approved in this patch kit, but blocking
the commit.
-Andi
From: Andi Kleen a...@linux.intel.com
Add calls
Roman Gareev gareevro...@gmail.com writes:
Hi Tobias,
I've attached a patch which removes using of CLooG library from
Graphite. Is it fine for trunk?
Could you please also remove -Werror by default from cloog?
Currently with LTO builds warnings in one of these libraries
usually break the
Andi Kleen a...@firstfloor.org writes:
Ping!^3
Andi Kleen a...@firstfloor.org writes:
Ping!^2
Andi Kleen a...@firstfloor.org writes:
Ping!
Can someone from the C++ side please approve this patch?
That's the only patch not approved in this patch kit, but blocking
the commit.
-Andi
Andi Kleen a...@firstfloor.org writes:
Ping!
From: Andi Kleen a...@linux.intel.com
xbegin/xend/xabort were missing memory barriers. This can
lead to memory operations being moved out of transactions, which would
cause unexpected races.
Always generate implicit memory barriers
On Sun, Nov 09, 2014 at 11:03:50PM -0600, Jason Merrill wrote:
On 10/01/2014 11:26 PM, Andi Kleen wrote:
+ if (check_no_cilk (cond, "in a condition for a for-loop"))
Why is this one "in" while the others are "as"?
I think "in" was somewhere hard coded in the test suite
and I wanted to minimize test
Andi Kleen a...@firstfloor.org writes:
PING!
Andi Kleen a...@firstfloor.org writes:
PING^2 !
Would be nice to make slim bootstrap work, it really speeds it up quite
a bit.
From: Andi Kleen a...@linux.intel.com
To use gcc-{ar,ranlib} for boot strap we need to add a -B option
Hi Richard,
On Thu, Aug 28, 2014 at 10:18:22AM +0200, Richard Biener wrote:
This also matches joined -B/foo
+{
+ const char *arg = av[i] + 2;
+ const char *end;
+
+ memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i));
+ ac--;
+ if (*arg ==
From: Andi Kleen a...@linux.intel.com
To use gcc-{ar,ranlib} for boot strap we need to add a -B option
to the tool. Since ar has weird and unusual argument conventions
implement the code by hand instead of using any libraries.
v2: Fix typo
v3: Improve comments. Use strlen. Use DIR_SEPARATOR. Add
From: Andi Kleen a...@linux.intel.com
Change the bootstrap-lto config file to use slim (non fat) LTO.
Speeds up the LTO bootstrap by ~18% on a 4 core system.
This requires using gcc-ar/ranlib in post stage 1 builds, so these
are passed to all sub builds.
v2: Change existing config file
This patchkit implements slim LTO bootstrap, which speeds up LTO
bootstrap by only compiling once.
I implemented all of Richard's feedback for the new -B option for
gcc-ar.
Passes full bootstrap and test on x86_64-linux and LTO bootstrap.
Ok to commit now?
-Andi
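
The hand-written -B handling discussed in the review (accepting both the joined form -B/foo and a separate path word, and removing the option before the remaining arguments reach ar) might look roughly like the sketch below. The function name extract_B_options and the use of std::vector are assumptions made for illustration; the quoted patch works on the raw argv array with memmove instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: remove every -B option from ARGS and return the
// directories it named. Accepts both "-B/foo" and "-B /foo".
std::vector<std::string>
extract_B_options (std::vector<std::string> &args)
{
  std::vector<std::string> dirs;
  for (std::size_t i = 0; i < args.size ();)
    {
      if (args[i].compare (0, 2, "-B") != 0)
        {
          i++;                                // not a -B option, keep it
          continue;
        }
      std::string dir = args[i].substr (2);   // text joined to -B, if any
      args.erase (args.begin () + i);         // drop the -B word itself
      if (dir.empty ())
        {
          if (i >= args.size ())
            break;                            // dangling -B, nothing to take
          dir = args[i];
          args.erase (args.begin () + i);     // drop the separate path word
        }
      dirs.push_back (dir);
    }
  return dirs;
}
```

For example, the arguments rc -B/foo lib.a -B /bar x.o leave rc lib.a x.o for the wrapped tool and yield the directories /foo and /bar.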
Andi Kleen a...@firstfloor.org writes:
diff --git a/config/bootstrap-lto-slim.mk b/config/bootstrap-lto-slim.mk
new file mode 100644
index 000..9e065e1
--- /dev/null
+++ b/config/bootstrap-lto-slim.mk
This file was not supposed to be included anymore. I removed it in my
version. Instead
-ffat-lto-objects is automatically active for hosts not supporting the
linker plugin. Is gcc-ar$(exeext) unconditionally built and found
on such hosts? Will gcc-ar work without finding a linker plugin
on such hosts?
Currently it errors out. I suppose that could be turned into a warning
I
On Mon, Sep 01, 2014 at 01:34:17PM +0200, Richard Biener wrote:
On Sun, Aug 31, 2014 at 4:51 PM, Andi Kleen a...@firstfloor.org wrote:
From: Andi Kleen a...@linux.intel.com
To use gcc-{ar,ranlib} for boot strap we need to add a -B option
to the tool. Since ar has weird and unusual
From: Andi Kleen a...@linux.intel.com
Only give a warning when gcc-ar/nm/ranlib cannot find the plugin.
In this case do not pass a plugin argument to the wrapped program.
This should make it work on non linker plugin systems, so
that the build system can use it unconditionally.
gcc/:
2014-09
From: Andi Kleen a...@linux.intel.com
[This was an old patch of mine that has been posted before,
but never made it in]
This adds a new C/C++ option to force
__attribute__((no_instrument_function)) on every function compiled.
This is useful together with LTO. You may want to have the whole
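
For reference, this is what the per-function attribute looks like when written by hand. It is a minimal sketch with hypothetical function names: when code is built with -finstrument-functions, GCC inserts entry and exit hooks into each function, and the attribute opts a single function out; the option proposed in the message would apply that effect to every function compiled.

```cpp
// Opted out: no entry/exit hooks are inserted here even when the file
// is compiled with -finstrument-functions.
__attribute__ ((no_instrument_function))
int fast_path (int x)
{
  return x + 1;
}

// Instrumented as usual when -finstrument-functions is in effect.
int traced_path (int x)
{
  return fast_path (x) * 2;
}
```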
From: Andi Kleen a...@linux.intel.com
The Linux kernel dynamically patches in __fentry__ calls in and
out at runtime. This allows using function tracing for debugging
in production kernels without (significant) performance penalty.
For this it needs a table pointing to each __fentry__ call
From: Andi Kleen a...@linux.intel.com
I noticed that with the trunk compiler a range of the new rtl
inlines show up as hot in a profiler during stage1. I think
that happens because stage1 is not using optimization
and does not inline plain inline. And these rtl inlines
are very frequently called
there have been bugs in the past in the area of always_inline too.
You're arguing for my patch. It would find those bugs.
-Andi
Hmm, why not make -no-pg (does that exist?) and/or -mno-fentry
I'm not sure.
do this? That is, I don't see the need for a new option.
That would be really odd behavior. A yes/no option whose default
is controlled by other object files' command line.
And -pg would be for all files in LTO,
Or we simply should make -finline work at -O0 (I suppose it might already
work?) and use it.
Yes that's probably better. There are more hot inlines in the stage 1 profile
(like wi::storage_ref or vec::length)
I suspect with the ongoing C++'ification that will get worse.
-Andi
--