[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2024-03-03 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
Bug 84402 depends on bug 113575, which changed state.

Bug 113575 Summary: [14 Regression] memory hog building insn-opinit.o 
(i686-linux-gnu -> riscv64-linux-gnu)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113575

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-10-31 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
Bug 84402 depends on bug 54179, which changed state.

Bug 54179 Summary: please split insn-emit.c !
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54179

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-10-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #71 from CVS Commits ---
The master branch has been updated by Robin Dapp:

https://gcc.gnu.org/g:184378027e92f51e02d3649e0ca523f487fd2810

commit r14-5034-g184378027e92f51e02d3649e0ca523f487fd2810
Author: Robin Dapp 
Date:   Thu Oct 12 11:23:26 2023 +0200

genemit: Split insn-emit.cc into several partitions.

On riscv insn-emit.cc has grown to over 1.2 million lines of code and
compiling it takes considerable time.
Therefore, this patch adjusts genemit to create several partitions
(insn-emit-1.cc to insn-emit-n.cc).  The available patterns are
written to the given files in a sequential fashion.

Similar to match.pd a configure option --with-insnemit-partitions=num
is introduced that makes the number of partitions configurable.

gcc/ChangeLog:

PR bootstrap/84402
PR target/111600

* Makefile.in: Handle split insn-emit.cc.
* configure: Regenerate.
* configure.ac: Add --with-insnemit-partitions.
* genemit.cc (output_peephole2_scratches): Print to file instead
of stdout.
(print_code): Ditto.
(gen_rtx_scratch): Ditto.
(gen_exp): Ditto.
(gen_emit_seq): Ditto.
(emit_c_code): Ditto.
(gen_insn): Ditto.
(gen_expand): Ditto.
(gen_split): Ditto.
(output_add_clobbers): Ditto.
(output_added_clobbers_hard_reg_p): Ditto.
(print_overload_arguments): Ditto.
(print_overload_test): Ditto.
(handle_overloaded_code_for): Ditto.
(handle_overloaded_gen): Ditto.
(print_header): New function.
(handle_arg): New function.
(main): Split output into 10 files.
* gensupport.cc (count_patterns): New function.
* gensupport.h (count_patterns): Define.
* read-md.cc (md_reader::print_md_ptr_loc): Add file argument.
* read-md.h (class md_reader): Change definition.
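
A minimal sketch of what writing patterns "in a sequential fashion" could look like, assuming it means contiguous chunks in the original pattern order. The function name and the pattern names below are hypothetical and are not taken from genemit.cc.

```python
def partition_sequentially(patterns, num_files):
    """Split patterns into num_files contiguous chunks, preserving order."""
    if num_files < 1:
        raise ValueError("num_files must be at least 1")
    # Ceiling division so that every pattern lands in some file.
    per_file = max(1, -(-len(patterns) // num_files))
    return [patterns[i * per_file:(i + 1) * per_file] for i in range(num_files)]


# Hypothetical pattern names, only for illustration.
patterns = [f"pattern_{i}" for i in range(7)]
for index, chunk in enumerate(partition_sequentially(patterns, 3), start=1):
    print(f"insn-emit-{index}.cc: {len(chunk)} patterns")
```

With contiguous chunks the later files can end up empty when there are fewer patterns than partitions; the sketch does not try to rebalance them.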

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-07-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
Bug 84402 depends on bug 54179, which changed state.

Bug 54179 Summary: please split insn-emit.c !
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54179

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|WONTFIX                     |---

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-05-05 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #70 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:0a85544e1aaeca41133ecfc438cda913dbc0f122

commit r14-501-g0a85544e1aaeca41133ecfc438cda913dbc0f122
Author: Tamar Christina 
Date:   Fri May 5 13:42:17 2023 +0100

match.pd: Use splits in makefile and make configurable.

This updates the build system to split up match.pd files into chunks of 10.
This also introduces a new flag --with-matchpd-partitions which can be used
to change the number of partitions.

For the analysis of why 10 please look at the previous patch in the series.

gcc/ChangeLog:

PR bootstrap/84402
* Makefile.in (NUM_MATCH_SPLITS, MATCH_SPLITS_SEQ,
GIMPLE_MATCH_PD_SEQ_SRC, GIMPLE_MATCH_PD_SEQ_O,
GENERIC_MATCH_PD_SEQ_SRC, GENERIC_MATCH_PD_SEQ_O): New.
(OBJS, MOSTLYCLEANFILES, .PRECIOUS): Use them.
(s-match): Split into s-generic-match and s-gimple-match.
* configure.ac (with-matchpd-partitions,
DEFAULT_MATCHPD_PARTITIONS): New.
* configure: Regenerate.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-05-05 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #69 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:703417a030b3d80f55ba1402adc3f1692d3631e5

commit r14-500-g703417a030b3d80f55ba1402adc3f1692d3631e5
Author: Tamar Christina 
Date:   Fri May 5 13:38:50 2023 +0100

match.pd: automatically partition *-match.cc files.

Following on from Richi's RFC[1] this is another attempt to split up
match.pd into multiple gimple-match and generic-match files.  This version
is fully automated and requires no human intervention.

First things first, some perf numbers.  The following shows the effect of
the patch on my desktop doing parallel compilation of gimple-match:

+--------+------------------+--------+------------------+
| splits | rel. improvement | splits | rel. improvement |
+--------+------------------+--------+------------------+
|      1 | 0.00%            |     33 | 91.03%           |
|      2 | 71.77%           |     34 | 84.02%           |
|      3 | 100.71%          |     35 | 83.42%           |
|      4 | 143.08%          |     36 | 78.80%           |
|      5 | 176.18%          |     37 | 74.06%           |
|      6 | 174.40%          |     38 | 55.76%           |
|      7 | 176.62%          |     39 | 66.90%           |
|      8 | 168.35%          |     40 | 18.25%           |
|      9 | 189.80%          |     41 | 16.55%           |
|     10 | 171.77%          |     42 | 47.02%           |
|     11 | 152.82%          |     43 | 15.29%           |
|     12 | 112.20%          |     44 | 21.63%           |
|     13 | 158.57%          |     45 | 41.53%           |
|     14 | 158.57%          |     46 | 21.98%           |
|     15 | 152.07%          |     47 | -42.74%          |
|     16 | 151.70%          |     48 | -32.62%          |
|     17 | 131.52%          |     49 | 11.81%           |
|     18 | 133.11%          |     50 | 34.07%           |
|     19 | 137.33%          |     51 | 2.71%            |
|     20 | 103.83%          |     52 | -22.23%          |
|     21 | 132.47%          |     53 | 32.30%           |
|     22 | 116.52%          |     54 | 21.45%           |
|     23 | 112.73%          |     55 | 40.02%           |
|     24 | 111.94%          |     56 | 42.83%           |
|     25 | 112.73%          |     57 | -9.98%           |
|     26 | 104.07%          |     58 | 18.01%           |
|     27 | 113.27%          |     59 | -4.91%           |
|     28 | 96.77%           |     60 | 22.94%           |
|     29 | 93.42%           |     61 | -3.73%           |
|     30 | 87.67%           |     62 | -27.43%          |
|     31 | 89.54%           |     63 | -1.05%           |
|     32 | 84.42%           |     64 | -5.44%           |
+--------+------------------+--------+------------------+

As can be seen there seems to be a point of diminishing returns in doing
splits.  This comes from the fact that these match files consume a sizeable
amount of headers.  At a certain point the parsing overhead of the headers
dominates and you start losing the gains.

As such from this I've made the default 10 splits per file to allow for
some room for growth in the future without needing changes to the split
amount.  Since 5-10 show roughly the same gains it means we can afford to
double the file sizes before we need to up the split amount.  This can be
controlled by the configure parameter --with-matchpd-partitions=.
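
The diminishing returns can be illustrated with a toy cost model. This is only a sketch under invented assumptions: every split pays a fixed header-parsing cost plus its share of the matcher code, and splits run in waves limited by the number of cores. The three constants are hypothetical and are not measurements from the patch, so the model does not reproduce the figures in the table.

```python
import math

HEADER_COST = 10.0   # hypothetical seconds to parse shared headers, paid per split
BODY_COST = 100.0    # hypothetical seconds to compile all matcher code once
CORES = 8            # hypothetical number of parallel jobs


def build_time(splits, header=HEADER_COST, body=BODY_COST, cores=CORES):
    """Wall-clock estimate: splits run in waves of at most `cores` jobs."""
    per_split = header + body / splits
    waves = math.ceil(splits / cores)
    return waves * per_split


def relative_improvement(splits):
    """Speed-up over a single file, as a fraction (1.0 means twice as fast)."""
    return build_time(1) / build_time(splits) - 1.0


for n in (1, 2, 5, 8, 16, 64):
    print(f"{n:2d} splits: {relative_improvement(n):7.2%}")
```

In this model the per-split header cost stops shrinking once the body share is small, and every additional wave of jobs repeats that cost, which is why the gain peaks and then falls again.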

At 10 splits the sizes of the files are:

 1.2M gimple-match-1.cc
 490K gimple-match-2.cc
 459K gimple-match-3.cc
 462K gimple-match-4.cc
 466K gimple-match-5.cc
 690K gimple-match-6.cc
 517K gimple-match-7.cc
 693K gimple-match-8.cc
1011K gimple-match-9.cc
 490K gimple-match-10.cc
 210K gimple-match-auto.h

The reason gimple-match-1.cc is so large is because it got allocated a very
large function: gimple_simplify_NE_EXPR.

Because of these sporadically large functions the allocation to a split
happens based on the amount of data already written to a split instead of
just a simple round robin allocation (though the patch supports that too).
This means that once gimple_simplify_NE_EXPR is allocated to
gimple-match-1.cc nothing uses it again until the rest of the files catch
up.
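
The size-based allocation described above can be sketched as a greedy "least data written so far" choice. This is an illustration, not the genmatch code: the function name is hypothetical, and all sizes and the f_a to f_d names are made up.

```python
def allocate_by_size(functions, num_splits):
    """Assign each (name, size) to the split with the least data written so far."""
    written = [0] * num_splits                 # amount written to each split
    assignment = [[] for _ in range(num_splits)]
    for name, size in functions:
        target = written.index(min(written))   # first least-loaded split
        assignment[target].append(name)
        written[target] += size
    return assignment, written


# Hypothetical sizes: one very large function followed by small ones.
functions = [
    ("gimple_simplify_NE_EXPR", 900),
    ("f_a", 100), ("f_b", 100), ("f_c", 100), ("f_d", 100),
]
assignment, written = allocate_by_size(functions, 3)
print(assignment)
print(written)
```

After the large function lands in the first split, the remaining functions go to the other splits until their totals catch up, whereas round robin would have handed the fourth function back to the already oversized first split.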

To support this split a new header file *-match-auto.h is generated to
allow the individual files to compile separately.

Lastly for the auto generated files I use pragmas to silence the unused
predicate warnings instead of the previous Makefile way because I couldn't
find a way to set them without knowing the number of split files beforehand.

Finally with this change, bootstrap time has dropped 8 minutes on AArch64.

[1] 

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-05-05 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #68 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:27fcf994c5515e1bbf2ff03d28fd2fa927c7e7b5

commit r14-499-g27fcf994c5515e1bbf2ff03d28fd2fa927c7e7b5
Author: Tamar Christina 
Date:   Fri May 5 13:37:49 2023 +0100

genmatch: split shared code to gimple-match-exports.cc

In preparation for automatically splitting match.pd files I split off the
non-static helper functions that are shared between the match.pd functions
to another file.

This file can be compiled in parallel and also allows us to later avoid
duplicate symbols errors.

gcc/ChangeLog:

PR bootstrap/84402
* Makefile.in (OBJS): Add gimple-match-exports.o.
* genmatch.cc (decision_tree::gen): Export gimple_gimplify helpers.
* gimple-match-head.cc (gimple_simplify, gimple_resimplify1,
gimple_resimplify2, gimple_resimplify3, gimple_resimplify4,
gimple_resimplify5, constant_for_folding, convert_conditional_op,
maybe_resimplify_conditional_op, gimple_match_op::resimplify,
maybe_build_generic_op, build_call_internal, maybe_push_res_to_seq,
do_valueize, try_conditional_simplification, gimple_extract,
gimple_extract_op, canonicalize_code, commutative_binary_op_p,
commutative_ternary_op_p, first_commutative_argument,
associative_binary_op_p, directly_supported_p,
get_conditional_internal_fn): Moved to gimple-match-exports.cc
* gimple-match-exports.cc: New file.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-05-05 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #66 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:e487fcc0f7466ea663a0fea52076337bebd42b8b

commit r14-497-ge487fcc0f7466ea663a0fea52076337bebd42b8b
Author: Tamar Christina 
Date:   Fri May 5 13:36:01 2023 +0100

match.pd: Remove commented out line pragmas unless -vv is used.

genmatch currently outputs commented out line directives that have no
effect but that the compiler still has to parse only to discard.

They are however handy when debugging genmatch output.  As such this moves
them behind the -vv flag.

gcc/ChangeLog:

PR bootstrap/84402
* genmatch.cc (output_line_directive): Only emit commented directive
when -vv.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-05-05 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #67 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:c0ce29bc1ce329001b6c02bb3d34bcbb086e1b72

commit r14-498-gc0ce29bc1ce329001b6c02bb3d34bcbb086e1b72
Author: Tamar Christina 
Date:   Fri May 5 13:36:43 2023 +0100

match.pd: CSE the dump output check.

This is a small improvement in QoL codegen for match.pd to save time not
re-evaluating the condition for printing debug information in every
function.

There is a small but consistent runtime and compile time win here.  The
runtime win comes from not having to do the condition over again, and on Arm
platforms we now use the new test-and-branch support for booleans to only
have a single instruction here.

gcc/ChangeLog:

PR bootstrap/84402
* genmatch.cc (decision_tree::gen, write_predicate): Generate new
debug_dump var.
(dt_simplify::gen_1): Use it.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-05-05 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #65 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:580cda3c2799b1f8323af770e52f1eb0fa204718

commit r14-496-g580cda3c2799b1f8323af770e52f1eb0fa204718
Author: Tamar Christina 
Date:   Fri May 5 13:35:17 2023 +0100

match.pd: don't emit label if not needed

This is a small QoL codegen improvement for match.pd to not emit labels
when they are not needed.  The codegen is nice and there is a small (but
consistent) improvement in compile time.

gcc/ChangeLog:

PR bootstrap/84402
* genmatch.cc (dt_simplify::gen_1): Only emit labels if used.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #64 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:75cda3be0232f745cda4e177d514f6900390af0b

commit r13-6902-g75cda3be0232f745cda4e177d514f6900390af0b
Author: Richard Biener 
Date:   Tue Mar 28 12:42:14 2023 +0200

bootstrap/84402 - improve (match ...) code generation

The following avoids duplicating matching code for (match ...)
in match.pd when possible.  That's more easily possible for
(match ...) than simplify because we do not need to handle
common matches (those would be diagnosed only during compiling)
nor is the result able to inspect the active operator.

Specifically this reduces the size of the generated matches for
the atomic ops as noted in PR108129.

gimple-match.cc shrinks from 245k lines to 209k lines with this patch.

PR bootstrap/84402
PR tree-optimization/108129
* genmatch.cc (lower_for): For (match ...) delay
substituting into the match operator if possible.
(dt_operand::gen_gimple_expr): For user_id look at the
first substitute for determining how to access operands.
(dt_operand::gen_generic_expr): Likewise.
(dt_node::gen_kids): Properly sort user_ids according
to their substitutes.
(dt_node::gen_kids_1): Code-generate user_id matching.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-28 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #63 from rguenther at suse dot de  ---
On Tue, 28 Mar 2023, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
> 
> Alexander Monakov  changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |amonakov at gcc dot gnu.org
> 
> --- Comment #61 from Alexander Monakov  ---
> (In reply to Richard Biener from comment #60)
> > > This one is btw. a known issue PR108129.
> > 
> > But the revision only slightly changes the patterns so I'm very curious
> > how it arrived at 30% slowdown.
> 
> It adds an extra 'convert2?' to 'nop_atomic_bit_test_and_p' matchers, and
> since match.pd expansion works by emitting match subtrees twice for each '?'
> component, that gives an extra 2x factor to the already bad combinatorial
> explosion going on in those patterns.
> 
> We really need to rework match-and-simplify emission in a smarter way. I've
> looked at that in January once, but there's a few things I'd need help
> understanding, such as...
> 
> > The "trivial" improvement of course would be to special-case
> > iterator uses also for (match ...) like we do for (simplify ...) where
> > we can delay substitution.
> 
> ... this. Is there a short explanation what's 'delayed substitution' in this
> context?

'delayed substitution' works for (simplify (...)) by not expanding the
substitution for each (for ..) iterator but instead passing it as
variable to a split out common function.

For (match (...)) the "substitution" part is trivial so there's no
point doing that.  But instead we can look to apply something similar
to the "matching" part.  When we have

(for X (A B ...)
 (simplify
  (op (X (op2 ...) ...))
  ...

we get for the matching of 'X' (if it's not at the toplevel)

 switch (...)
 {
 case A:
  {
   .. match the rest ..
  }
 case B:
  {
   .. match the rest ..
  }
...

but we can instead emit (maybe only in a subset of cases?)

 switch (...)
 {
 case A:
 case B:
 case ...:
  {
   .. match the rest ..
  }

in theory we support things like

(for X (plus IFN_POW)
 (...

as both operators are binary - so that's cases we cannot handle this way.

Basically we'd keep the user-defined operator in the AST and adjust
code-generation to deal with that.

I will try to do that.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #62 from Jakub Jelinek  ---
Looking at gimple-match.cc, the case CFN_BUILT_IN_ATOMIC_FETCH_OR_{1,2,4,8,16}:
etc. blocks are identical there, except for the numbers in next_after_fail*
label numbers.
So, could we perhaps expand everything the way we do and just when emitting a
switch hash the subtree of the cases to be emitted and if the hashes are equal
also compare and if the subtrees are the same (== would result in the same
text being emitted into the output except for the label numbers) emit multiple
cases with the same block?
Admittedly I haven't looked yet at the data structures genmatch.cc uses before
emitting the source, so don't know whether it is feasible.
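
A rough sketch of the merging idea, simplified to operate on already generated text rather than on genmatch's internal subtrees (which, as noted above, may or may not allow this). The dictionary lookup does the "hash, then compare" step; label numbers are normalized first. The case bodies and CFN_HYPOTHETICAL_OTHER are invented for illustration.

```python
import re

# Label names differ between otherwise identical blocks, so ignore the number.
_FAIL_LABEL = re.compile(r"next_after_fail\d+")


def merge_identical_cases(cases):
    """Group (label, body) pairs whose bodies match up to label numbers."""
    groups = {}  # normalized body -> (labels, first body seen)
    for label, body in cases:
        key = _FAIL_LABEL.sub("next_after_fail", body)
        labels, _ = groups.setdefault(key, ([], body))
        labels.append(label)
    return list(groups.values())


def emit_switch(cases):
    """Emit one block per distinct body, preceded by all of its case labels."""
    lines = ["switch (fn)", "  {"]
    for labels, body in merge_identical_cases(cases):
        lines += [f"  case {label}:" for label in labels]
        lines += ["    {", f"      {body}", "    }"]
    lines.append("  }")
    return "\n".join(lines)


cases = [
    ("CFN_BUILT_IN_ATOMIC_FETCH_OR_1", "if (!ok) goto next_after_fail1;"),
    ("CFN_BUILT_IN_ATOMIC_FETCH_OR_2", "if (!ok) goto next_after_fail2;"),
    ("CFN_HYPOTHETICAL_OTHER", "return false;"),
]
print(emit_switch(cases))
```

The two FETCH_OR cases end up sharing one block, so the body is emitted once instead of once per case.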

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Alexander Monakov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #61 from Alexander Monakov  ---
(In reply to Richard Biener from comment #60)
> > This one is btw. a known issue PR108129.
> 
> But the revision only slightly changes the patterns so I'm very curious
> how it arrived at 30% slowdown.

It adds an extra 'convert2?' to 'nop_atomic_bit_test_and_p' matchers, and since
match.pd expansion works by emitting match subtrees twice for each '?'
component, that gives an extra 2x factor to the already bad combinatorial
explosion going on in those patterns.

We really need to rework match-and-simplify emission in a smarter way. I've
looked at that in January once, but there's a few things I'd need help
understanding, such as...

> The "trivial" improvement of course would be to special-case
> iterator uses also for (match ...) like we do for (simplify ...) where
> we can delay substitution.

... this. Is there a short explanation what's 'delayed substitution' in this
context?
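
To make the 2x factor per '?' concrete, here is a small sketch that enumerates the variants of a pattern with optional components. The pattern shape is hypothetical (only 'convert2?' is named in the comment above), and real genmatch lowering works on its own AST rather than on tuples of names.

```python
from itertools import product


def expand_optional(components):
    """components: list of (name, is_optional); return every concrete variant."""
    choices = [
        ((name,), ()) if optional else ((name,),)
        for name, optional in components
    ]
    # Concatenate the chosen pieces of each combination into one tuple.
    return [sum(combo, ()) for combo in product(*choices)]


# Hypothetical matcher with two optional conversions around a fixed operator.
before = [("convert", True), ("bit_and", False), ("convert1", True)]
after = before + [("convert2", True)]   # one more '?' component

print(len(expand_optional(before)))
print(len(expand_optional(after)))
```

Each optional component doubles the number of variants, so k optional components give 2**k expansions of the same matcher.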

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #60 from Richard Biener  ---
(In reply to Martin Liška from comment #59)
> (In reply to Andrew Carlotti from comment #58)
> > Since November 2021, there's been a significant regression in the compile
> > time for gimple-match.cc during a bootstrap build (+100% in Stage 2, +73% in
> > Stage 3), with this regression accounting for over 20% of the current total
> > bootstrap time on some aarch64 machines.
> 
> Thanks for the interesting numbers! Yeah, it's very unfortunate :/
> 
> > 
> > Most of the change in compile time is due to the following 6 commits (of
> > which one is a performance improvement, and one only regressed the Stage 2
> > build):
> > 
> > 7df89377a7ae3906255e38a79be8e5d962c3a0df 24th November 2021
> > Enhance optimize_atomic_bit_test_and to handle truncation. (Hongtao Liu)
> > Stage 2: +27%
> > Stage 3: +33%
> 
> This one is btw. a known issue PR108129.

But the revision only slightly changes the patterns so I'm very curious
how it arrived at 30% slowdown.

Note these (match ..) patterns that are not used from inside match.pd itself
(and do not use other (match ..)) would be perfect candidates to emit
to separate files.  Either by explicit syntax or magically where the former
would be easier to cater for in the Makefile.

The "trivial" improvement of course would be to special-case
iterator uses also for (match ...) like we do for (simplify ...) where
we can delay substitution.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-27 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #59 from Martin Liška  ---
(In reply to Andrew Carlotti from comment #58)
> Since November 2021, there's been a significant regression in the compile
> time for gimple-match.cc during a bootstrap build (+100% in Stage 2, +73% in
> Stage 3), with this regression accounting for over 20% of the current total
> bootstrap time on some aarch64 machines.

Thanks for the interesting numbers! Yeah, it's very unfortunate :/

> 
> Most of the change in compile time is due to the following 6 commits (of
> which one is a performance improvement, and one only regressed the Stage 2
> build):
> 
> 7df89377a7ae3906255e38a79be8e5d962c3a0df 24th November 2021
> Enhance optimize_atomic_bit_test_and to handle truncation. (Hongtao Liu)
> Stage 2: +27%
> Stage 3: +33%

This one is btw. a known issue PR108129.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2023-03-27 Thread andrew.carlotti at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Andrew Carlotti  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew.carlotti at arm dot com

--- Comment #58 from Andrew Carlotti  ---
Since November 2021, there's been a significant regression in the compile time
for gimple-match.cc during a bootstrap build (+100% in Stage 2, +73% in Stage
3), with this regression accounting for over 20% of the current total bootstrap
time on some aarch64 machines.

Most of the change in compile time is due to the following 6 commits (of which
one is a performance improvement, and one only regressed the Stage 2 build):

7df89377a7ae3906255e38a79be8e5d962c3a0df 24th November 2021
Enhance optimize_atomic_bit_test_and to handle truncation. (Hongtao Liu)
Stage 2: +27%
Stage 3: +33%

9a53101caadae1b5c8d791d247b05268ee4f7f92 16th May 2022
Add MIN/MAX folding from fold_cond_expr_with_comparison to match.pd (Richard
Biener)
Stage 2: +15%
Stage 3: +15%

409978d58dafa689c5b3f85013e2786526160f2c 9th August 2022
tree-optimization/106514 - add --param max-jump-thread-paths (Richard Biener)
Stage 2: -7%
Stage 3: -10%

011d0a033ab370ea38b06b813ac62be8dde0801b 18th August 2022
Make path_range_query standalone and add reset_path. (Aldy Hernandez)
Stage 2: +5%
Stage 3: +0%

4d9db4bdd458a4b526f59e4bc5bbd549d3861cea 12th December 2022
middle-end: simplify complex if expressions where comparisons are inverse of
one another. (Tamar Christina)
Stage 2: +10%
Stage 3: +9%

733a1b777f16cd397b43a242d9c31761f66d3da8 13th January 2023
sched-deps: do not schedule pseudos across calls [PR108117] (Alexander Monakov)
Stage 2: +14%
Stage 3: +9%
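
As a quick cross-check of the figures above, the per-commit changes can be compounded. This assumes each percentage is measured relative to the state just before that commit, so that the changes multiply; the comment does not state how they were measured.

```python
import math

# Per-commit changes in percent, in the order listed above.
stage2 = [27, 15, -7, 5, 10, 14]
stage3 = [33, 15, -10, 0, 9, 9]


def compound(changes_percent):
    """Combine successive relative changes into one overall fraction."""
    return math.prod(1 + c / 100 for c in changes_percent) - 1


print(f"Stage 2: {compound(stage2):+.1%}")
print(f"Stage 3: {compound(stage3):+.1%}")
```

Under that assumption the six commits account for roughly +79% in Stage 2 and +64% in Stage 3, which is consistent with them being "most of" the reported +100% and +73%.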

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-12-01 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #57 from Martin Liška  ---
Created attachment 53997
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53997&action=edit
Partial linking path

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-12-01 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #56 from Martin Liška  ---
Created attachment 53996
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53996&action=edit
make all-host on Ryzen 9 with LTO partial linking

Using partial linking for the following 4 objects (gimple-match.o
generic-match.o insn-recog.o insn-emit.o), I can speed up the build of all-host
by almost 30s (from 145 to 115 seconds).

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-12-01 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #55 from Martin Liška  ---
Created attachment 53995
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53995&action=edit
make all-host on Ryzen 9

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-12-01 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #54 from Martin Liška  ---
> Try LTOing libbackend.a?

So this option is not feasible either: we're paying too high a price for
parallel WPA of the LTO and the resulting time on 32 cores is even slower :/

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-11-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #53 from Martin Liška  ---
(In reply to Richard Biener from comment #50)
> (In reply to Martin Liška from comment #48)
> > Created attachment 53989 [details]
> > CPU utilization of make all-host on recent AMD server
> > 
> > The situation with a recent AMD server is really bad! Having 192 cores, the
> > average CPU utilization of `make all-host` is 6% !
> 
> Just do more builds in parallel!

No! I'm speaking about faster edit-build-debug cycles and also about faster
builds of gcc packages.

> There's just 903 .o files in gcc/ and
> libbackend.a just has 490 of them.  It's not surprising the few larger
> files stretch out the compile-time here.

Well, gimple-match.o takes ~66s on my new AMD Ryzen 9 5950X CPU :/

> Try LTOing libbackend.a?

Yep, that's our parallel for free approach and I would welcome that, however:


during IPA pass: inline
In member function ‘quick_push’,
inlined from ‘make_forwarders_with_degenerate_phis’ at /home/marxin/Programming/gcc/gcc/tree-ssa-dce.cc:1848:6:
/home/marxin/Programming/gcc/gcc/vec.h:1958:28: internal compiler error: Segmentation fault
 1958 |   return m_vec->quick_push (obj);
  |^
0x102f987 internal_error(char const*, ...)
???:0
0x117935b cgraph_node::get_untransformed_body()
???:0
0x123f6e9 optimize_inline_calls(tree_node*)
???:0
0x123e4d2 inline_transform(cgraph_node*)
???:0
0x123da5f execute_all_ipa_transforms(bool)
???:0
0x15ebe1b cgraph_node::expand()
???:0
0x15e2f6d symbol_table::compile()
???:0
0x15d0368 lto_main()
???:0

I'll isolate that and hope we can add a configure option for LTOed
libbackend.a.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-11-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #52 from Richard Biener  ---
(In reply to Richard Biener from comment #51)
> (In reply to Martin Liška from comment #49)
> 
> [...]
> 
> > Can please any GNU make expert judge here? Starting e.g. gimple-match.cc
> > early would really help to speed up the build process.
> 
> this has come up in the past and there's no reliable way to order things
> (just use make -j on such machines and overcommit?)

Doesn't make a difference to overall time so early starting isn't the issue
it seems.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-11-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #51 from Richard Biener  ---
(In reply to Martin Liška from comment #49)

[...]

> Can please any GNU make expert judge here? Starting e.g. gimple-match.cc
> early would really help to speed up the build process.

this has come up in the past and there's no reliable way to order things
(just use make -j on such machines and overcommit?)

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-11-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #50 from Richard Biener  ---
(In reply to Martin Liška from comment #48)
> Created attachment 53989 [details]
> CPU utilization of make all-host on recent AMD server
> 
> The situation with a recent AMD server is really bad! Having 192 cores, the
> average CPU utilization of `make all-host` is 6% !

Just do more builds in parallel!  There's just 903 .o files in gcc/ and
libbackend.a just has 490 of them.  It's not surprising the few larger
files stretch out the compile-time here.  Try LTOing libbackend.a?

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-11-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #49 from Martin Liška  ---
One more observation I made, apparently we're trying to sort (in Makefile.in)
OBJS with the biggest at the very beginning:

  # Language-independent object files.
  # We put the *-match.o and insn-*.o files first so that a parallel make
  # will build them sooner, because they are large and otherwise tend to be
  # the last objects to finish building.
  OBJS = \
  gimple-match.o \
  generic-match.o \
  insn-attrtab.o \
  insn-automata.o \

That's fine, plus we add an order-only dependency of all objects on
generated_files:

  # In order for parallel make to really start compiling the expensive
  # objects from $(OBJS) as early as possible, build all their
  # prerequisites strictly before all objects.
  $(ALL_HOST_OBJS) : | $(generated_files)

Using that, we should see gimple-match.o being spawned very soon, but it's not
the case. Imagine you have already built all-host and let's see what happens:

$ rm -f gimple-match.o ; rm -f tree*.o && make -j4 --debug=b libbackend.a 2>&1 | less
...
   File 'gimple-match.o' does not exist.
 Prerequisite 'cs-bconfig.h' is newer than target 'bconfig.h'.
Must remake target 'bconfig.h'.
 Prerequisite 'cstamp-h' is newer than target 'auto-host.h'.
Must remake target 'auto-host.h'.
 Prerequisite 's-options' is newer than target 'optionlist'.
Must remake target 'optionlist'.
 Prerequisite 's-gtyp-input' is newer than target 'gtyp-input.list'.
Must remake target 'gtyp-input.list'.
 Prerequisite 's-bversion' is newer than target 'bversion.h'.
Must remake target 'bversion.h'.
 Prerequisite 'cs-config.h' is newer than target 'config.h'.
Must remake target 'config.h'.
...
   File 'tree-vrp.o' does not exist.
   File 'tree.o' does not exist.
 Prerequisite 's-i386-bt' is newer than target 'i386-builtin-types.inc'.
Must remake target 'i386-builtin-types.inc'.
   File 'gimple-match.o' does not exist.
 Prerequisite 's-modes-h' is newer than target 'insn-modes.h'.
Must remake target 'insn-modes.h'.
 Prerequisite 's-modes-inline-h' is newer than target 'insn-modes-inline.h'.
Must remake target 'insn-modes-inline.h'.
 Prerequisite 's-version' is newer than target 'version.h'.
Must remake target 'version.h'.
 Prerequisite 's-options-h' is newer than target 'options.h'.
Must remake target 'options.h'.
 Prerequisite 's-genrtl-h' is newer than target 'genrtl.h'.
Must remake target 'genrtl.h'.
 Prerequisite 's-modes-m' is newer than target 'min-insn-modes.cc'.
Must remake target 'min-insn-modes.cc'.
...
   File 'gimple-match.o' does not exist.
 Prerequisite 's-gtype' is newer than target 'gtype-desc.h'.
Must remake target 'gtype-desc.h'.
 Prerequisite 's-constants' is newer than target 'insn-constants.h'.
Must remake target 'insn-constants.h'.
...
  Must remake target 'tree-affine.o'.
g++  -fno-PIE -c   -g -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common 
-DHAVE_CONFIG_H -I. -I. -I/home/marxin/Programming/gcc/gcc
-I/home/marxin/Programming/gcc/gcc/.
-I/home/marxin/Programming/gcc/gcc/../include
-I/home/marxin/Programming/gcc/gcc/../libcpp/include
-I/home/marxin/Programming/gcc/gcc/../libcody 
-I/home/marxin/Programming/gcc/gcc/../libdecnumber
-I/home/marxin/Programming/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/marxin/Programming/gcc/gcc/../libbacktrace   -o tree-affine.o -MT
tree-affine.o -MMD -MP -MF ./.deps/tree-affine.TPo
/home/marxin/Programming/gcc/gcc/tree-affine.cc
   File 'tree-call-cdce.o' does not exist.
  Must remake target 'tree-call-cdce.o'.
g++  -fno-PIE -c   -g -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common 
-DHAVE_CONFIG_H -I. -I. -I/home/marxin/Programming/gcc/gcc
-I/home/marxin/Programming/gcc/gcc/.
-I/home/marxin/Programming/gcc/gcc/../include
-I/home/marxin/Programming/gcc/gcc/../libcpp/include
-I/home/marxin/Programming/gcc/gcc/../libcody 
-I/home/marxin/Programming/gcc/gcc/../libdecnumber
-I/home/marxin/Programming/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/marxin/Programming/gcc/gcc/../libbacktrace   -o tree-call-cdce.o -MT
tree-call-cdce.o -MMD -MP -MF 

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-11-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #48 from Martin Liška  ---
Created attachment 53989
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53989&action=edit
CPU utilization of make all-host on recent AMD server

The situation on a recent AMD server is really bad! With 192 cores, the
average CPU utilization of `make all-host` is only 6%!
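To make the utilization figure concrete, here is a minimal sketch of how an average like that can be computed from per-job start and end times. It assumes every job is single-threaded and keeps one core fully busy while it runs. The job list is invented for illustration and is not taken from the attached measurement.

```python
def average_utilization(jobs, cores):
    """Fraction of the available core-time that was busy.

    jobs  -- list of (start, end) wall-clock times in seconds, one per job
    cores -- number of cores that were available to make
    """
    if not jobs:
        return 0.0
    wall = max(end for _, end in jobs) - min(start for start, _ in jobs)
    if wall <= 0:
        return 0.0
    busy = sum(end - start for start, end in jobs)
    return busy / (cores * wall)


# Hypothetical build: one 100 s compilation plus 191 short 5 s jobs,
# all started at the same time on a 192-core machine.
jobs = [(0.0, 100.0)] + [(0.0, 5.0)] * 191
print(f"average utilization: {average_utilization(jobs, 192):.1%}")
```

A single long-running job is enough to drag the average down, because the other cores sit idle until it finishes.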

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-06-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #47 from Segher Boessenkool  ---
(In reply to Sam James from comment #46)
> Even partially making the build less recursive would likely help a fair bit.

It will help a bit, sure, but not nearly as much as you perhaps hope for.

There are quite a few "synchronisation" points where nothing after it can be
done until everything before it has been done.  Partly this is just because
we have a three-stage bootstrap, but also there are some generator programs
that everything else depends on (on its output that is), and those are real
chokepoints.
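The effect of such chokepoints can be sketched as a critical-path computation: no number of cores can finish the build faster than its longest dependency chain. The target names and durations below are hypothetical and only illustrate the idea; they are not measurements of the GCC build.

```python
from functools import lru_cache


def critical_path(durations, deps):
    """Length of the longest dependency chain, in seconds.

    durations -- {target: seconds needed to build it}
    deps      -- {target: [targets that must be finished first]}
    """
    @lru_cache(maxsize=None)
    def finish(target):
        # A target can only start once its slowest prerequisite is done.
        ready = max((finish(d) for d in deps.get(target, ())), default=0.0)
        return ready + durations[target]

    return max(finish(t) for t in durations)


# Hypothetical build: a generator program that every object file waits for.
durations = {"gen": 10.0, "a.o": 5.0, "b.o": 5.0, "c.o": 5.0, "link": 2.0}
deps = {
    "a.o": ["gen"],
    "b.o": ["gen"],
    "c.o": ["gen"],
    "link": ["a.o", "b.o", "c.o"],
}

serial = sum(durations.values())        # 27.0 s with -j1
bound = critical_path(durations, deps)  # 17.0 s even with unlimited cores
print(f"best possible speedup: {serial / bound:.2f}x")
```

In this toy graph the generator alone accounts for more than half of the lower bound, which is why splitting or speeding up such steps matters more than adding cores.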

Also, recursive make is a scourge of humanity, for sure, but fixing this has
to be done in auto first and foremost.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2022-05-28 Thread sam at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Sam James  changed:

   What|Removed |Added

 CC||sam at gentoo dot org

--- Comment #46 from Sam James  ---
Even partially making the build less recursive would likely help a fair bit. 

The classic text on this is
https://accu.org/journals/overload/14/71/miller_2004/. This doesn't mean that
splitting up files is futile, but when watching a build, much of the time, make
doesn't even get to traverse into each of the directories, because it doesn't
know if it's able to. It can safely be done in stages.

Using includes would let you get a lot of the current state wrt split
directories. Could even just have a certain number of toplevel directories but
non-recursive within them.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-10-31 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #45 from Eric Gallager  ---
(In reply to Martin Liška from comment #0)
> [...]
> Then I built GCC with -j1 and used following parser to generate reports:
> https://github.com/marxin/script-misc/blob/master/parse-make-log.py

The new URL for that script is now this, btw:
https://github.com/marxin/script-misc/blob/master/legacy/parse-make-log.py

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-10-11 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #44 from Eric Gallager  ---
(In reply to Martin Liška from comment #43)
> (In reply to Eric Gallager from comment #42)
> > Is this just about parallelism bottlenecks for the main build target (e.g.
> > just `make` or `make all`), or does it apply to other Makefile targets, too?
> > (e.g. the testsuite via `make check`, or docs via `make pdf` or something)
> 
> Well, it was intended to cover only the main build, which pdf can be seen as
> part of.

I usually have to run `make pdf` as a separate build target, though, as it
doesn't get run as part of the main build for me... and the bottleneck there,
for the pdf target, is in libstdc++ for me...

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-10-11 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #43 from Martin Liška  ---
(In reply to Eric Gallager from comment #42)
> Is this just about parallelism bottlenecks for the main build target (e.g.
> just `make` or `make all`), or does it apply to other Makefile targets, too?
> (e.g. the testsuite via `make check`, or docs via `make pdf` or something)

Well, it was intended to cover only the main build, which pdf can be seen as
part of. On the other hand, `make check` should belong to a different PR if you
have troubles with it.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-10-09 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #42 from Eric Gallager  ---
Is this just about parallelism bottlenecks for the main build target (e.g. just
`make` or `make all`), or does it apply to other Makefile targets, too? (e.g.
the testsuite via `make check`, or docs via `make pdf` or something)

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-07-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #41 from Andrew Pinski  ---
Latest discussion of this can also be found at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571555.html

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-07-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2018-09-07 00:00:00 |2021-7-18
   Target Milestone|10.4|---

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2021-04-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|10.3|10.4

--- Comment #40 from Richard Biener  ---
GCC 10.3 is being released, retargeting bugs to GCC 10.4.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-07-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|10.2|10.3

--- Comment #39 from Richard Biener  ---
GCC 10.2 is released, adjusting target milestone.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-07-12 Thread rjiejie at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #38 from jojo  ---
(In reply to Martin Liška from comment #36)
> (In reply to jojo from comment #35)
> > (In reply to Martin Liška from comment #30)
> > > A possible solution can be usage of '-flinker-output=nolto-rel -r' for
> > > huge files.
> > 
> > it's useful for splitting huge files ?
> 
> There's experiment I did:
> 
> $ time g++ -O2 /tmp/gimple-match.ii -c
> 
> real    0m35.790s
> user    0m35.490s
> sys     0m0.268s
> 
> $ time g++ -O2 /tmp/gimple-match.ii -c -flto
> 
> real    0m8.138s
> user    0m7.915s
> sys     0m0.202s
> 
> $ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o
> 
> real    0m9.087s
> user    1m56.028s
> sys     0m3.292s
> 
> $ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o --param lto-partitions=8
> 
> real    0m7.350s
> user    0m48.548s
> sys     0m0.976s
> 
> $ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o --param lto-partitions=4
> 
> real    0m9.847s
> user    0m30.462s
> sys     0m0.392s
> 
> so for N==4 we get to 8+10s = 18s (compared to the original 36s). And total
> user time is 30+8, which is comparable to the original 36s.

That looks like only a modest reduction for a huge file like insn-emit.c.
I would like to use a shell tool such as 'csplit' to split it and compile the
pieces in parallel.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-07-09 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #37 from Richard Biener  ---
(In reply to Martin Liška from comment #36)
> (In reply to jojo from comment #35)
> > (In reply to Martin Liška from comment #30)
> > > A possible solution can be usage of '-flinker-output=nolto-rel -r' for
> > > huge files.
> > 
> > it's useful for splitting huge files ?
> 
> There's experiment I did:
> 
> $ time g++ -O2 /tmp/gimple-match.ii -c
> 
> real    0m35.790s
> user    0m35.490s
> sys     0m0.268s
> 
> $ time g++ -O2 /tmp/gimple-match.ii -c -flto
> 
> real    0m8.138s
> user    0m7.915s
> sys     0m0.202s
> 
> $ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o
> 
> real    0m9.087s
> user    1m56.028s
> sys     0m3.292s
> 
> $ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o --param lto-partitions=8
> 
> real    0m7.350s
> user    0m48.548s
> sys     0m0.976s
> 
> $ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o --param lto-partitions=4
> 
> real    0m9.847s
> user    0m30.462s
> sys     0m0.392s
> 
> so for N==4 we get to 8+10s = 18s (compared to the original 36s). And total
> user time is 30+8, which is comparable to the original 36s.

The GSoC parallelism project this year is supposed to replicate this
in a cheaper way and also develop some magic to automatically trigger
it when it seems profitable.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-07-09 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #36 from Martin Liška  ---
(In reply to jojo from comment #35)
> (In reply to Martin Liška from comment #30)
> > A possible solution can be usage of '-flinker-output=nolto-rel -r' for huge
> > files.
> 
> it's useful for splitting huge files ?

There's experiment I did:

$ time g++ -O2 /tmp/gimple-match.ii -c

real    0m35.790s
user    0m35.490s
sys     0m0.268s

$ time g++ -O2 /tmp/gimple-match.ii -c -flto

real    0m8.138s
user    0m7.915s
sys     0m0.202s

$ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o

real    0m9.087s
user    1m56.028s
sys     0m3.292s

$ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o --param lto-partitions=8

real    0m7.350s
user    0m48.548s
sys     0m0.976s

$ time gcc -flto=auto -flinker-output=nolto-rel gimple-match.o  -r -o gimple-match2.o --param lto-partitions=4

real    0m9.847s
user    0m30.462s
sys     0m0.392s

so for N==4 we get to 8+10s = 18s (compared to the original 36s). And total
user time is 30+8, which is comparable to the original 36s.
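The arithmetic behind that summary can be re-checked with a few lines of Python. The timings are copied from the runs above; the helper name `seconds` and the dictionary layout are just illustrative choices.

```python
import re


def seconds(stamp):
    """Convert a `time` value such as '1m56.028s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures taken from the experiment above.
baseline = {"real": "0m35.790s", "user": "0m35.490s"}     # plain -O2 compile
compile_lto = {"real": "0m8.138s", "user": "0m7.915s"}    # -O2 -c -flto
ltrans_4 = {"real": "0m9.847s", "user": "0m30.462s"}      # lto-partitions=4

wall = seconds(compile_lto["real"]) + seconds(ltrans_4["real"])
user = seconds(compile_lto["user"]) + seconds(ltrans_4["user"])

print(f"wall: {wall:.1f}s vs {seconds(baseline['real']):.1f}s")
print(f"user: {user:.1f}s vs {seconds(baseline['user']):.1f}s")
```

So the two-step LTO route roughly halves the wall time while spending about the same total CPU time as the single compilation.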

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-07-09 Thread rjiejie at me dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

jojo  changed:

   What|Removed |Added

 CC||rjiejie at me dot com

--- Comment #35 from jojo  ---
(In reply to Martin Liška from comment #30)
> A possible solution can be usage of '-flinker-output=nolto-rel -r' for huge
> files.

it's useful for splitting huge files ?

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-05-07 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #34 from Eric Gallager  ---
(In reply to Giuliano Belinassi from comment #32)
> (In reply to Eric Gallager from comment #31)
> > I think this came up at Cauldron, but I forget what exactly people said
> > about it...
> 
> Actually this PR comes before Cauldron 2019. 

By "came up" I meant simply that it was mentioned, not that that was where it
originated...

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2020-05-07 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|10.0|10.2

--- Comment #33 from Jakub Jelinek  ---
GCC 10.1 has been released.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-11-07 Thread giuliano.belinassi at usp dot br
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #32 from Giuliano Belinassi  ---
(In reply to Eric Gallager from comment #31)
> I think this came up at Cauldron, but I forget what exactly people said
> about it...

Actually this PR comes before Cauldron 2019. One way to fix this issue is to
make the match.pd parser output several smaller gimple-match.c files and add
these to the Makefile. The same procedure could be repeated for other big
files.

Another solution is to parallelize GCC internals and make GCC communicate with
Make somehow so that when a CPU is idle, it starts compiling some files in
parallel.
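The first idea, emitting several smaller files instead of one, boils down to partitioning the generated functions. Below is a minimal sketch of a contiguous, order-preserving split. The function names and the `gimple-match-N.c` file names are hypothetical, and this is not how the match.pd generator itself is implemented.

```python
def partition(items, parts):
    """Split items into `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for index in range(parts):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if index < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Hypothetical generated functions, split across four output files.
functions = [f"simplify_{i}" for i in range(10)]
for number, chunk in enumerate(partition(functions, 4), start=1):
    print(f"gimple-match-{number}.c: {len(chunk)} functions")
```

Splitting by count is the simplest choice; balancing by estimated compile cost would likely give more even compile times, at the price of a more complicated generator.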

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-11-06 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #31 from Eric Gallager  ---
I think this came up at Cauldron, but I forget what exactly people said about
it...

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-05-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|marxin at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #30 from Martin Liška  ---
A possible solution can be usage of '-flinker-output=nolto-rel -r' for huge
files.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-02-07 Thread giuliano.belinassi at usp dot br
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #29 from Giuliano Belinassi  ---
> No, the proper fix would be to split the generated files and compile them in
> parallel. Similarly for all the insn-*.c generated files. That would be the
> proper fix.

Indeed. However, I am working on parallelizing the compilation with threads.
This may lead to a solution, but may not be the best for this scenario.

> Anyway, I like the graph you made :)

Thank you.

> But what version of GCC is this graph, with what exact configuration?

* This is the gcc that I used to build: *

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 8.2.0-14'
--with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr
--with-gcc-major-version-only --program-suffix=-8
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-libmpx
--enable-plugin --enable-default-pie --with-system-zlib
--with-target-system-zlib --enable-objc-gc=auto --enable-multiarch
--disable-werror --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 8.2.0 (Debian 8.2.0-14) 

* The gcc that I built: *

Using built-in specs.
COLLECT_GCC=./xgcc
Target: x86_64-pc-linux-gnu
Configured with: /home/giulianob/gcc_svn/trunk//configure --disable-checking
--disable-bootstrap
Thread model: posix
gcc version 9.0.1 20190205 (experimental) (GCC)

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-02-07 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #28 from Segher Boessenkool  ---
But what version of GCC is this graph, with what exact configuration?

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-02-07 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #27 from Martin Liška  ---
> Since gimple-match.c takes so long to compile, I was wondering if it might
> be possible to reorder the compilation so we can push its compilation early
> in the dependency graph.

No, the proper fix would be to split the generated files and compile them in
parallel. Similarly for all the insn-*.c generated files. That would be the
proper fix.

Anyway, I like the graph you made :)

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2019-02-07 Thread giuliano.belinassi at usp dot br
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Giuliano Belinassi  changed:

   What|Removed |Added

 CC||giuliano.belinassi at usp dot br

--- Comment #26 from Giuliano Belinassi  ---
Created attachment 45630
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45630&action=edit
make -j 64 all-gcc, with --disable-bootstrap, on 64-cores. Blue means
dependency to gimple-match.

Since gimple-match.c takes so long to compile, I was wondering if it might be
possible to reorder the compilation so we can push its compilation early in the
dependency graph.

I did the following steps: 
 1) 'configure --disable-bootstrap'
 2) 'make -j 64 all-gcc'
 3) 'make clean'. 
 4) 'make gimple-match.o' using a wrapper[1] that I created to log all files
required by gimple-match, and plotted the attached graphic. Here, blue means
dependency and the largest bar is the 'gimple-match.c' itself.

I used a 64-core AMD Opteron 6376 for this.

Any ideas?

[1] https://github.com/giulianobelinassi/gcc-timer-analysis
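A timing wrapper of the kind described in step 4 can be sketched in a few lines. This is not the tool linked in [1]; the function name, the log file name and the log format are assumptions made for the example.

```python
import subprocess
import sys
import time


def timed_run(argv, log_path):
    """Run a command, append its wall time to log_path, return its exit code."""
    start = time.monotonic()
    result = subprocess.run(argv)
    elapsed = time.monotonic() - start
    with open(log_path, "a", encoding="utf-8") as log:
        # One line per command: elapsed seconds, a tab, then the command line.
        log.write(f"{elapsed:.3f}\t{' '.join(argv)}\n")
    return result.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python wrapper.py g++ -c foo.cc -o foo.o
    sys.exit(timed_run(sys.argv[1:], "compile-times.log"))
```

Pointing the build's compiler variable at such a wrapper would log every compilation without changing the Makefiles, assuming the build passes the compiler command through unchanged.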

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-11-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

   Target Milestone|9.0 |10.0

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-09-10 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org

--- Comment #25 from Martin Liška  ---
Let me assign it.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-09-06 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-09-07
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #24 from Eric Gallager  ---
(In reply to Martin Liška from comment #23)
> I can easily split insn-emit.c. Once we know which way a split should be
> done, I can prepare a patch for that.

Confirmed, please do this!

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-04-26 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #23 from Martin Liška  ---
I can easily split insn-emit.c. Once we know which way a split should be done,
I can prepare a patch for that.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-04-04 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #22 from Martin Liška  ---
(In reply to rguent...@suse.de from comment #21)
> On Wed, 4 Apr 2018, marxin at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
> > 
> > --- Comment #20 from Martin Liška  ---
> > For the libsanitizer/*/*_interceptors I made a quick patch:
> > https://github.com/marxin/gcc/commit/5ce658230db567474997fa411f23ac78366487ce
> > which basically splits asan_interceptors.cc and
> > sanitizer_common_interceptors.inc and moves the implementation of string
> > functions to a separate compile unit.
> > This shrinks the build time of asan_interceptors.cc from 38s to 34s with a
> > checking-enabled stage1 compiler.
> > 
> > I believe splitting the interceptors into a couple of logical sub-files
> > will make it very fast. I can imagine splitting them into components like
> > string, stdio, time, process, thread, math, ...
> > 
> > List of interceptors grepped from sanitizer_common_interceptors.inc:
> 
> The question is of course _why_ it is this slow.  It's not that this
> is 1s of functions or very large ones...

It's analyzed here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78288

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-04-04 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #21 from rguenther at suse dot de  ---
On Wed, 4 Apr 2018, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
> 
> --- Comment #20 from Martin Liška  ---
> For the libsanitizer/*/*_interceptors I made a quick patch:
> https://github.com/marxin/gcc/commit/5ce658230db567474997fa411f23ac78366487ce
> which basically splits asan_interceptors.cc and
> sanitizer_common_interceptors.inc and moves the implementation of string
> functions to a separate compile unit.
> This shrinks the build time of asan_interceptors.cc from 38s to 34s with a
> checking-enabled stage1 compiler.
> 
> I believe splitting the interceptors into a couple of logical sub-files will
> make it very fast. I can imagine splitting them into components like string,
> stdio, time, process, thread, math, ...
> 
> List of interceptors grepped from sanitizer_common_interceptors.inc:

The question is of course _why_ it is this slow.  It's not that this
is 1s of functions or very large ones...

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-04-04 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #20 from Martin Liška  ---
For the libsanitizer/*/*_interceptors I made a quick patch:
https://github.com/marxin/gcc/commit/5ce658230db567474997fa411f23ac78366487ce
which basically splits asan_interceptors.cc and
sanitizer_common_interceptors.inc and moves the implementation of string
functions to a separate compile unit.
This shrinks the build time of asan_interceptors.cc from 38s to 34s with a
checking-enabled stage1 compiler.

I believe splitting the interceptors into a couple of logical sub-files will
make it very fast. I can imagine splitting them into components like string,
stdio, time, process, thread, math, ...

List of interceptors grepped from sanitizer_common_interceptors.inc:

INTERCEPTOR(SIZE_T, strlen, const char *s) {
INTERCEPTOR(SIZE_T, strnlen, const char *s, SIZE_T maxlen) {
INTERCEPTOR(char*, strndup, const char *s, uptr size) {
INTERCEPTOR(char*, __strndup, const char *s, uptr size) {
INTERCEPTOR(char*, textdomain, const char *domainname) {
INTERCEPTOR(int, strcmp, const char *s1, const char *s2) {
INTERCEPTOR(int, strncmp, const char *s1, const char *s2, uptr size) {
INTERCEPTOR(int, strcasecmp, const char *s1, const char *s2) {
INTERCEPTOR(int, strncasecmp, const char *s1, const char *s2, SIZE_T size) {
INTERCEPTOR(char*, strstr, const char *s1, const char *s2) {
INTERCEPTOR(char*, strcasestr, const char *s1, const char *s2) {
INTERCEPTOR(char*, strtok, char *str, const char *delimiters) {
INTERCEPTOR(void*, memmem, const void *s1, SIZE_T len1, const void *s2,
INTERCEPTOR(char*, strchr, const char *s, int c) {
INTERCEPTOR(char*, strchrnul, const char *s, int c) {
INTERCEPTOR(char*, strrchr, const char *s, int c) {
INTERCEPTOR(SIZE_T, strspn, const char *s1, const char *s2) {
INTERCEPTOR(SIZE_T, strcspn, const char *s1, const char *s2) {
INTERCEPTOR(char *, strpbrk, const char *s1, const char *s2) {
INTERCEPTOR(void *, memset, void *dst, int v, uptr size) {
INTERCEPTOR(void *, memmove, void *dst, const void *src, uptr size) {
INTERCEPTOR(void *, memcpy, void *dst, const void *src, uptr size) {
INTERCEPTOR(int, memcmp, const void *a1, const void *a2, uptr size) {
INTERCEPTOR(void*, memchr, const void *s, int c, SIZE_T n) {
INTERCEPTOR(void*, memrchr, const void *s, int c, SIZE_T n) {
INTERCEPTOR(double, frexp, double x, int *exp) {
INTERCEPTOR(float, frexpf, float x, int *exp) {
INTERCEPTOR(long double, frexpl, long double x, int *exp) {
INTERCEPTOR(SSIZE_T, read, int fd, void *ptr, SIZE_T count) {
INTERCEPTOR(SIZE_T, fread, void *ptr, SIZE_T size, SIZE_T nmemb, void *file) {
INTERCEPTOR(SSIZE_T, pread, int fd, void *ptr, SIZE_T count, OFF_T offset) {
INTERCEPTOR(SSIZE_T, pread64, int fd, void *ptr, SIZE_T count, OFF64_T offset) {
INTERCEPTOR_WITH_SUFFIX(SSIZE_T, readv, int fd, __sanitizer_iovec *iov,
INTERCEPTOR(SSIZE_T, preadv, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(SSIZE_T, preadv64, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(SSIZE_T, write, int fd, void *ptr, SIZE_T count) {
INTERCEPTOR(SIZE_T, fwrite, const void *p, uptr size, uptr nmemb, void *file) {
INTERCEPTOR(SSIZE_T, pwrite, int fd, void *ptr, SIZE_T count, OFF_T offset) {
INTERCEPTOR(SSIZE_T, pwrite64, int fd, void *ptr, OFF64_T count,
INTERCEPTOR_WITH_SUFFIX(SSIZE_T, writev, int fd, __sanitizer_iovec *iov,
INTERCEPTOR(SSIZE_T, pwritev, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(SSIZE_T, pwritev64, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(int, prctl, int option, unsigned long arg2,
INTERCEPTOR(unsigned long, time, unsigned long *t) {
INTERCEPTOR(__sanitizer_tm *, localtime, unsigned long *timep) {
INTERCEPTOR(__sanitizer_tm *, localtime_r, unsigned long *timep, void *result) {
INTERCEPTOR(__sanitizer_tm *, gmtime, unsigned long *timep) {
INTERCEPTOR(__sanitizer_tm *, gmtime_r, unsigned long *timep, void *result) {
INTERCEPTOR(char *, ctime, unsigned long *timep) {
INTERCEPTOR(char *, ctime_r, unsigned long *timep, char *result) {
INTERCEPTOR(char *, asctime, __sanitizer_tm *tm) {
INTERCEPTOR(char *, asctime_r, __sanitizer_tm *tm, char *result) {
INTERCEPTOR(long, mktime, __sanitizer_tm *tm) {
INTERCEPTOR(char *, strptime, char *s, char *format, __sanitizer_tm *tm) {
INTERCEPTOR(int, vscanf, const char *format, va_list ap)
INTERCEPTOR(int, vsscanf, const char *str, const char *format, va_list ap)
INTERCEPTOR(int, vfscanf, void *stream, const char *format, va_list ap)
INTERCEPTOR(int, __isoc99_vscanf, const char *format, va_list ap)
INTERCEPTOR(int, __isoc99_vsscanf, const char *str, const char *format,
INTERCEPTOR(int, __isoc99_vfscanf, void *stream, const char *format, va_list ap)
INTERCEPTOR(int, scanf, const char *format, ...)
INTERCEPTOR(int, fscanf, void *stream, const char *format, ...)
INTERCEPTOR(int, sscanf, const char *str, const char *format, ...)
INTERCEPTOR(int, __isoc99_scanf, const char *format, ...)
INTERCEPTOR(int, __isoc99_fscanf, void *stream, const char *format, ...)
INTERCEPTOR(int, 
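As a rough illustration of the proposed grouping, the grepped lines can be bucketed by function name. The component names come from the comment above; the regular expressions that assign a function to a component are guesses made for this sketch, not an agreed split.

```python
import re
from collections import defaultdict

# Matches INTERCEPTOR(type, name, ...) and INTERCEPTOR_WITH_SUFFIX(type, name, ...).
INTERCEPTOR_RE = re.compile(r"INTERCEPTOR(?:_WITH_SUFFIX)?\(\s*[^,]+,\s*(\w+)\s*,")

# Hypothetical classification rules; the first matching rule wins.
RULES = [
    ("time", re.compile(r"time")),
    ("string", re.compile(r"^(_*str|mem|textdomain)")),
    ("stdio", re.compile(r"scanf$|^[fp]?(read|write)")),
    ("math", re.compile(r"^frexp")),
    ("process", re.compile(r"^prctl$")),
]


def interceptor_name(line):
    """Return the intercepted function name, or None for incomplete lines."""
    match = INTERCEPTOR_RE.match(line)
    return match.group(1) if match else None


def component(name):
    for label, pattern in RULES:
        if pattern.search(name):
            return label
    return "other"


def group(lines):
    groups = defaultdict(list)
    for line in lines:
        name = interceptor_name(line)
        if name is not None:
            groups[component(name)].append(name)
    return dict(groups)


sample = [
    "INTERCEPTOR(SIZE_T, strlen, const char *s) {",
    "INTERCEPTOR(char *, strptime, char *s, char *format, __sanitizer_tm *tm) {",
    "INTERCEPTOR(long double, frexpl, long double x, int *exp) {",
    "INTERCEPTOR_WITH_SUFFIX(SSIZE_T, readv, int fd, __sanitizer_iovec *iov,",
    "INTERCEPTOR(int, prctl, int option, unsigned long arg2,",
]
print(group(sample))
```

The time rule is checked before the string rule so that strptime lands with the time functions rather than with the str* family.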

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-23 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #19 from Martin Liška  ---
(In reply to Tom Tromey from comment #17)
> The results in comment #13 seem to be missing some compilations --
> I would have expected to see more files from libcpp in there.
> As it is I only see directives.o and line-map.o.

There was a minimum threshold of 0.5s; please take a look at the log file in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402#c18
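The effect of such a cutoff is easy to reproduce: anything that compiles faster than the threshold simply disappears from the report. The file names and timings below are invented for illustration, and `report` is a hypothetical helper, not the script used for the attachments.

```python
def report(durations, threshold=0.5):
    """Return (name, seconds) pairs at or above the threshold, slowest first."""
    kept = [(name, secs) for name, secs in durations.items() if secs >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical compile times in seconds.
durations = {
    "directives.o": 0.9,
    "line-map.o": 0.7,
    "charset.o": 0.4,
    "files.o": 0.3,
}

for name, secs in report(durations):
    print(f"{name:15s} {secs:.1f}s")
```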

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-23 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #18 from Martin Liška  ---
Created attachment 43492
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43492&action=edit
Parallel build of make all-host on 128 core EPYC machine (log file)

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-22 Thread tromey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #17 from Tom Tromey  ---
The results in comment #13 seem to be missing some compilations --
I would have expected to see more files from libcpp in there.
As it is I only see directives.o and line-map.o.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-21 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

  Attachment #43478 is obsolete|0   |1

--- Comment #16 from Martin Liška  ---
Created attachment 43482
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43482&action=edit
-ftime-report for most time consuming files on Haswell machine

Properly generated with -O2, which was missing in the previous version.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-21 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #15 from Segher Boessenkool  ---
This is a -O0 build?  That's what that time report shows afaics.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-21 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #14 from Martin Liška  ---
Created attachment 43478
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43478&action=edit
-ftime-report for most time consuming files on Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-21 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

 CC||hubicka at ucw dot cz,
   ||rguenth at gcc dot gnu.org
   Target Milestone|--- |9.0

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-16 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #13 from Martin Liška  ---
Created attachment 43440
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440&action=edit
Parallel build of make all-host on 128 core EPYC machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-16 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

  Attachment #43432 is obsolete|0   |1

--- Comment #12 from Martin Liška  ---
Created attachment 43439
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43439&action=edit
Parallel build of make all-host on 8 core Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #11 from Martin Liška  ---
(In reply to Martin Liška from comment #10)
> Created attachment 43432 [details]
> Parallel build of make all-host on 8 core Haswell machine

This was generated with a slightly modified make (being able to run fully in
parallel):
https://github.com/marxin/make/tree/timestamp-v2

The output is then parsed and a 'stacked' graph is generated:
https://github.com/marxin/script-misc/blob/master/parse-make-log-parallel.py
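The core of a stacked view like that is counting how many jobs run at each moment. Here is a minimal sweep-line sketch over (start, end) intervals. It is not the linked script, and the sample intervals are made up.

```python
def concurrency_steps(jobs):
    """Return [(time, running_jobs)] at every point where the count changes.

    jobs -- list of (start, end) times in seconds, one pair per job
    """
    events = []
    for start, end in jobs:
        events.append((start, 1))
        events.append((end, -1))
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a job that ends is counted out before the next one starts.
    events.sort()

    steps, running = [], 0
    for moment, delta in events:
        running += delta
        if steps and steps[-1][0] == moment:
            steps[-1] = (moment, running)   # merge events at the same time
        else:
            steps.append((moment, running))
    return steps


# Hypothetical jobs: three overlap early, then one long job runs alone.
jobs = [(0, 4), (0, 2), (1, 3), (4, 10)]
for moment, running in concurrency_steps(jobs):
    print(f"t={moment:>2}s  running={running}")
```

Long stretches where the count stays at one are the serial bottlenecks that the graphs in this PR are meant to expose.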

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

Martin Liška  changed:

   What|Removed |Added

  Attachment #43428 is obsolete|0   |1

--- Comment #10 from Martin Liška  ---
Created attachment 43432
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43432&action=edit
Parallel build of make all-host on 8 core Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #9 from Martin Liška  ---
Created attachment 43428
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43428&action=edit
Parallel build of make all-host on 8 core Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #8 from Martin Liška  ---
I forgot to note that the minimum time threshold is 0.5 s for the wall time reports.

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #7 from Martin Liška  ---
Created attachment 43426
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43426&action=edit
wall time report: bootstrap stage3 on Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #6 from Martin Liška  ---
Created attachment 43425
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43425&action=edit
wall time report: bootstrap stage2 on Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #4 from Martin Liška  ---
Created attachment 43423
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43423&action=edit
wall time report: make (for configure --disable-bootstrap) on Haswell machine
(system compiler -O2 -g)

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #5 from Martin Liška  ---
Created attachment 43424
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43424&action=edit
wall time report: bootstrap stage1 on Haswell machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #2 from Martin Liška  ---
Created attachment 43421
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43421&action=edit
make all-host -j128 on 128 core EPYC machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #3 from Martin Liška  ---
Created attachment 43422
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43422&action=edit
make (for configure --disable-bootstrap) -j128 on 128 core EPYC machine

[Bug bootstrap/84402] [meta] GCC build system: parallelism bottleneck

2018-02-15 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402

--- Comment #1 from Martin Liška  ---
Created attachment 43420
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43420&action=edit
make all-host -j8 on 8 core Haswell machine