https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
janus at gcc dot gnu.org changed:
What|Removed |Added
CC||janus at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
sergey.shalnov at intel dot com changed:
What|Removed |Added
Status|NEW |RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #37 from uros at gcc dot gnu.org ---
Author: uros
Date: Thu Feb 8 22:31:15 2018
New Revision: 257505
URL: https://gcc.gnu.org/viewcvs?rev=257505&root=gcc&view=rev
Log:
PR target/83008
* config/i386/x86-tune-costs.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #36 from sergey.shalnov at intel dot com ---
The patch that fixes the issue for SKX is in
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00405.html
I will close the PR after the patch has been merged.
Thank you very much to all involved.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #35 from Christophe Lyon ---
Author: clyon
Date: Wed Feb 7 09:12:48 2018
New Revision: 257438
URL: https://gcc.gnu.org/viewcvs?rev=257438&root=gcc&view=rev
Log:
[testsuite] Fix gcc.dg/cse_recip.c for AArch64 after r257181.
2018-02-07
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #34 from Richard Biener ---
Author: rguenth
Date: Tue Jan 30 11:19:47 2018
New Revision: 257181
URL: https://gcc.gnu.org/viewcvs?rev=257181&root=gcc&view=rev
Log:
2018-01-30 Richard Biener
PR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #33 from sergey.shalnov at intel dot com ---
Richard,
I'm not sure whether it is a regression or not. I see the code has been visibly refactored
in this commit
https://github.com/gcc-mirror/gcc/commit/ee6e9ba576099aed29f1097195c649fc796ecf5e
in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #32 from rguenther at suse dot de ---
On Fri, 26 Jan 2018, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #31 from sergey.shalnov at intel dot com ---
> Richard,
> Thank
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #31 from sergey.shalnov at intel dot com ---
Richard,
Thank you for your latest patch. This patch is exactly what
I discussed in this issue.
I tested it with SPEC20[06|17] and see no performance/stability degradation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
Richard Biener changed:
What|Removed |Added
Attachment #43084|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #29 from sergey.shalnov at intel dot com ---
Richard,
Thank you for your latest patch. I would like to clarify
the multiple_p() function usage in if() clause.
First of all, I assume that architectures with fixed
size of HW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #28 from sergey.shalnov at intel dot com ---
Richard,
Thank you for your comments.
I see that TYPE_VECTOR_SUBPARTS is constant for the test case but
multiple_p (group_size, const_nunits) returns 1 in the code:
if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #27 from rguenther at suse dot de ---
On Wed, 10 Jan 2018, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #26 from sergey.shalnov at intel dot com ---
> Sorry, did you
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #26 from sergey.shalnov at intel dot com ---
Sorry, did you mean "arm_sve.h" on ARM?
In this case we have machine-specific code in the common part of the GCC code.
Should we make it a machine-dependent callback function, because having
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #25 from rguenther at suse dot de ---
On Wed, 10 Jan 2018, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #24 from sergey.shalnov at intel dot com ---
> Richard,
> The
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #24 from sergey.shalnov at intel dot com ---
Richard,
The latest "SLP costing for constants/externs improvement" patch generates the
same code as baseline for the test example.
Are you sure that "num_vects_to_check" should be 1 if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #22 from rguenther at suse dot de ---
On Wed, 10 Jan 2018, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #21 from sergey.shalnov at intel dot com ---
> Thanks Richard
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #23 from Richard Biener ---
Created attachment 43084
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43084&action=edit
SLP costing for constants/externs improvement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #21 from sergey.shalnov at intel dot com ---
Thanks Richard for your comments.
Based on our discussion I've produced the attached patch and
ran it on SPEC2017 intrate/fprate on a Skylake server (with [-Ofast -flto
-march=skylake-avx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #20 from sergey.shalnov at intel dot com ---
Richard,
I did a quick static analysis of your latest patch.
Using the command line “-g -Ofast -mfpmath=sse -funroll-loops -march=znver1”, your
latest patch
doesn’t affect the issue I discussed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #19 from rguenther at suse dot de ---
On Sun, 24 Dec 2017, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #18 from sergey.shalnov at intel dot com ---
> Yes, I agree that
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #18 from sergey.shalnov at intel dot com ---
Yes, I agree that the vector_store stage has its own vectorization cost.
And each vector_store has a vector_construction stage. These stages are different
in GCC SLP (as you know).
To better
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #17 from rguenther at suse dot de ---
On Fri, 15 Dec 2017, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #16 from sergey.shalnov at intel dot com ---
> «it's one
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #16 from sergey.shalnov at intel dot com ---
«it's one vec_construct operation - it's the task of the target to turn this
into a cost comparable to vector_store»
I agree that vec_construct operation cost is based on the target cost
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #15 from rguenther at suse dot de ---
On Fri, 15 Dec 2017, sergey.shalnov at intel dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> --- Comment #14 from sergey.shalnov at intel dot com ---
> " we have a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #14 from sergey.shalnov at intel dot com ---
" we have a basic-block vectorizer. Do you propose to remove it? "
Definitely not! SLP vectorizer is very good to have!
“What's the rationale for not using vector registers”
I just tried
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #13 from rguenther at suse dot de ---
On Fri, 8 Dec 2017, sergey.shalnov at intel dot com wrote:
> And it uses xmm+ vpbroadcastd to spill tmp[] to stack
> ...
> 1e7: 62 d2 7d 08 7c c9 vpbroadcastd %r9d,%xmm1
> 1ed: c4 c1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #12 from sergey.shalnov at intel dot com ---
Richard,
Your last proposal changed the generated code a bit.
Currently it shows:
test_bugzilla1.c:6:5: note: Cost model analysis:.
Vector inside of loop cost: 62576
Vector prologue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #11 from sergey.shalnov at intel dot com ---
Richard,
“Is this about the "stupid" attempt to use as little AVX512 as possible”
No, it is not.
I provided the asm listing at the beginning with zmm only to illustrate the issue
more
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #10 from Richard Biener ---
Just to note this is _basic block vectorization_ triggering. Of course we do
vectorize basic blocks even when we do not vectorize any loop.
Is this about the "stupid" attempt to use as little AVX512 as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #9 from sergey.shalnov at intel dot com ---
Created attachment 42813
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42813&action=edit
New reproducer
Slightly changed first loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #8 from sergey.shalnov at intel dot com ---
Richard,
These are great changes, and I see that the first loop is now vectorized for the test
example I provided with gcc-8.0 main trunk.
But I think the issue is a bit more complicated. Vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #7 from Richard Biener ---
Note the first loop is now vectorized fine thus the strange code is gone.
-> fixed? (probably by the fix for PR83202)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #6 from sergey.shalnov at intel dot com ---
I found the bug report related to the vectorization issues in the second loop
(reduction uint->int).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65930
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #5 from sergey.shalnov at intel dot com ---
(In reply to Richard Biener from comment #2)
> The strange code is because we perform basic-block vectorization resulting in
>
> vect_cst__249 = {_251, _251, _251, _251, _334, _334, _334,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #4 from rguenther at suse dot de ---
On Sun, 19 Nov 2017, hubicka at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
>
> Jan Hubicka changed:
>
>What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
Jan Hubicka changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
Richard Biener changed:
What|Removed |Added
Keywords||missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #1 from sergey.shalnov at intel dot com ---
Created attachment 42616
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42616&action=edit
reproducer