Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-26 Thread James Greenhalgh
On Mon, May 16, 2016 at 11:38:04AM +0100, Wilco Dijkstra wrote: > GCC expands switch statements in a very simplistic way and tries to use a table expansion even when it is a bad idea for performance or codesize. GCC typically emits extremely sparse tables that contain mostly default

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-24 Thread Evandro Menezes
On 05/23/16 15:32, Evandro Menezes wrote: I'm fine with this patch, as it achieves in part what I intended before: going beyond the default_case_values_threshold, too conservative for Exynos M1. My concern is particularly what happens to in-order targets, like the ubiquitous A53. I'll

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-24 Thread Evandro Menezes
On 05/24/16 07:08, Wilco Dijkstra wrote: Jim Wilson wrote: It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see about a 0.37% loss on the integer benchmarks, and no significant change on the FP benchmarks. The integer loss is mainly due to 458.sjeng which drops 2%. We had

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-24 Thread Wilco Dijkstra
Jim Wilson wrote: > It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see about a 0.37% loss on the integer benchmarks, and no significant change on the FP benchmarks. The integer loss is mainly due to 458.sjeng which drops 2%. We had tried various values for

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-23 Thread Evandro Menezes
On 05/18/16 20:03, Jim Wilson wrote: Though I see that the original patch from Samsung that added the max_case_values field has the -O3 check, so there was apparently some reason why they wanted it to work that way. The value that the exynos-m1 is using, 48, looks pretty large, so maybe they

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-18 Thread Jim Wilson
On Mon, May 16, 2016 at 4:30 AM, James Greenhalgh wrote: > As this change will change code generation for all cores (except Exynos-M1), I'd like to hear from those with more detailed knowledge of ThunderX, X-Gene and qdf24xx before I take this patch. It looks like a

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-16 Thread Wilco Dijkstra
James Greenhalgh wrote: > As this change will change code generation for all cores (except Exynos-M1), I'd like to hear from those with more detailed knowledge of ThunderX, X-Gene and qdf24xx before I take this patch. > Let's give it another week or so for comments, and expand the CC list.

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-16 Thread James Greenhalgh
Cortex-A53 built for generic, but there is no difference in perlbench. Where were these changes if not perlbench? Thanks, James > > From: Wilco Dijkstra > Sent: 22 April 2016 17:15 > To: gcc-patches@gcc.gnu.org > Cc: nd > Subject: [PATCH][AArch64] Improv

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-16 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 22 April 2016 17:15 To: gcc-patches@gcc.gnu.org Cc: nd Subject: [PATCH][AArch64] Improve aarch64_case_values_threshold setting GCC expands switch statements in a very simplistic way and tries to use a table expansion even

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-28 Thread Wilco Dijkstra
Kyrill Tkachov wrote: > On 25/04/16 20:21, Wilco Dijkstra wrote: > > The GCC switch expansion is awful, so even with a good indirect predictor it is better to use conditional branches. > In what way is it awful? If there's something we can do better at can you file a bug report with a

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-27 Thread Kyrill Tkachov
Hi Wilco, On 25/04/16 20:21, Wilco Dijkstra wrote: Evandro Menezes wrote: I assume that you mean that such improvements are true for -mcpu=generic, yes? On which target, A53 or A57 or other? It's true for any CPU setting. The SPEC results are for Cortex-A57 however I wrote a microbenchmark

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-26 Thread Evandro Menezes
On 04/26/16 11:14, Wilco Dijkstra wrote: Evandro Menezes wrote: True, but the results when running on A53 could be quite different. GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no difference in perlbench. Looks good, then. Fine by me. Thanks for your patience, --

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-26 Thread Wilco Dijkstra
Evandro Menezes wrote: > > True, but the results when running on A53 could be quite different. GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no difference in perlbench. Wilco

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes
On 04/25/16 14:58, Wilco Dijkstra wrote: Evandro Menezes wrote: I agree with your assessment, but I'm more curious to understand how this change affects code built with the default -mcpu=generic when run on both A53 and A57, the typical configuration of big.LITTLE machines. I wouldn't expect

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Wilco Dijkstra
Evandro Menezes wrote: > I agree with your assessment, but I'm more curious to understand how this change affects code built with the default -mcpu=generic when run on both A53 and A57, the typical configuration of big.LITTLE machines. I wouldn't expect the result to be any different as the

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes
On 04/25/16 14:21, Wilco Dijkstra wrote: Evandro Menezes wrote: I assume that you mean that such improvements are true for -mcpu=generic, yes? On which target, A53 or A57 or other? It's true for any CPU setting. The SPEC results are for Cortex-A57 however I wrote a microbenchmark that shows

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Wilco Dijkstra
Evandro Menezes wrote: > I assume that you mean that such improvements are true for -mcpu=generic, yes? On which target, A53 or A57 or other? It's true for any CPU setting. The SPEC results are for Cortex-A57 however I wrote a microbenchmark that shows improvements on all targets I have

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes
On 04/22/16 11:15, Wilco Dijkstra wrote: This patch fixes that by setting the default aarch64_case_values_threshold to 16 when the per-CPU tuning is not set. On SPEC2006 this improves the switch heavy benchmarks GCC and perlbench both in performance (1-2%) as well as size (0.5-1% smaller). I

[PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-22 Thread Wilco Dijkstra
GCC expands switch statements in a very simplistic way and tries to use a table expansion even when it is a bad idea for performance or codesize. GCC typically emits extremely sparse tables that contain mostly default entries (something which currently cannot be tuned by backends). Additionally
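
For readers following the thread, the selection rule discussed above (use the per-CPU max_case_values when it is set, otherwise fall back to a default of 16) can be sketched roughly as below. This is only an illustration, not the patch itself: struct cpu_tuning, DEFAULT_CASE_VALUES and case_values_threshold() are hypothetical names made up for this sketch, and the opt_level > 2 test is an assumption based on the -O3 check mentioned in the replies. The value 48 is the Exynos M1 setting quoted in the thread.

```c
/* Illustrative sketch only; this is not GCC source code.
   All identifiers below are hypothetical stand-ins. */

/* Per-CPU tuning record; max_case_values == 0 means "not set". */
struct cpu_tuning {
    unsigned int max_case_values;
};

/* Fallback threshold proposed in the thread when no per-CPU value exists. */
enum { DEFAULT_CASE_VALUES = 16 };

/* Return the case-values threshold used when deciding between compare
   and branch sequences and a jump table for a switch statement.
   Assumption: the per-CPU value is honoured only above -O2, mirroring
   the -O3 check that the replies mention. */
static unsigned int
case_values_threshold(const struct cpu_tuning *tune, int opt_level)
{
    if (opt_level > 2 && tune->max_case_values != 0)
        return tune->max_case_values;
    return DEFAULT_CASE_VALUES;
}
```

With a tuning record of { 48 } the sketch returns 48 at -O3 and 16 at -O2; with { 0 } it returns 16 at any level.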