https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #57 from Hongtao Liu ---
> For dg-do run testcases I really think we should avoid those -march=
> options, because it means a lot of other stuff, BMI, LZCNT, ...
Makes sense.
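One way to follow that advice in a run test is to gate the AVX-512 code on a run-time CPU check rather than relying on a broad -march= option. A minimal sketch, assuming GCC or Clang on x86 and their __builtin_cpu_supports builtin; have_avx512f, run_if_avx512f and demo_body are hypothetical names used for illustration only, not helpers from the GCC testsuite:

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the running CPU reports AVX512F,
   0 otherwise (including on non-x86 targets or other compilers).  */
int
have_avx512f (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx512f") ? 1 : 0;
#else
  return 0;
#endif
}

/* Hypothetical driver: call BODY only when AVX512F is available.
   Returns BODY's result, or 0 if the body was skipped.  */
int
run_if_avx512f (int (*body) (void))
{
  assert (body != 0);
  if (!have_avx512f ())
    return 0;
  return body ();
}

/* Placeholder for the real test body; always reports success (1).  */
int
demo_body (void)
{
  return 1;
}
```

With a guard like this, the test can still be compiled with only the specific -mavx512* options it needs, and it degrades to a no-op on hardware without the feature.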
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #56 from Uroš Bizjak ---
The testcase is fixed with g:430c772be3382134886db33133ed466c02efc71c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #55 from Uroš Bizjak ---
(In reply to Jakub Jelinek from comment #53)
> Comment on attachment 57424 [details]
> Proposed testsuite patch
>
> As skylake-avx512 is -mavx512{f,cd,bw,dq,vl}, requiring just avx512f
> effective target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #54 from Richard Biener ---
Please also verify the bug reproduced with the altered set of options.
What's the reason to have avx512-check.h in addition to tree-vect.h?
At least for the vectorizer testsuite the latter is the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #53 from Jakub Jelinek ---
Comment on attachment 57424
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57424
Proposed testsuite patch
As skylake-avx512 is -mavx512{f,cd,bw,dq,vl}, requiring just avx512f effective
target and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #52 from Uroš Bizjak ---
Created attachment 57424
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57424&action=edit
Proposed testsuite patch
This patch fixes the failure for me (+ some other dg.exp/vect inconsistencies).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #51 from Jakub Jelinek ---
From the -mavx* options I think -march=skylake-avx512 implies
-mavx512{f,cd,vl,bw,dq} but -mavx512f is implied by any of the latter 4.
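The implication described here can be observed through the compiler's predefined macros. A minimal sketch, assuming the usual __AVX512F__, __AVX512CD__, __AVX512VL__, __AVX512BW__ and __AVX512DQ__ macros; the enum and function names are hypothetical and exist only for this illustration:

```c
#include <assert.h>

/* Hypothetical bit assignments for the five AVX-512 subsets named above.  */
enum
{
  AVX512_F = 1 << 0,
  AVX512_CD = 1 << 1,
  AVX512_VL = 1 << 2,
  AVX512_BW = 1 << 3,
  AVX512_DQ = 1 << 4,
  AVX512_ALL = (1 << 5) - 1
};

/* Bitmask of the subsets enabled at compile time, taken from the
   compiler's predefined macros.  */
unsigned
avx512_enabled (void)
{
  unsigned m = 0;
#ifdef __AVX512F__
  m |= AVX512_F;
#endif
#ifdef __AVX512CD__
  m |= AVX512_CD;
#endif
#ifdef __AVX512VL__
  m |= AVX512_VL;
#endif
#ifdef __AVX512BW__
  m |= AVX512_BW;
#endif
#ifdef __AVX512DQ__
  m |= AVX512_DQ;
#endif
  return m;
}

/* Nonzero if mask M is consistent with the rule above: enabling any of
   CD, VL, BW or DQ means F is enabled as well.  */
int
f_implied (unsigned m)
{
  assert ((m & ~(unsigned) AVX512_ALL) == 0);
  return (m & (AVX512_CD | AVX512_VL | AVX512_BW | AVX512_DQ)) == 0
	 || (m & AVX512_F) != 0;
}
```

Compiling this with only one of the four narrower options and checking f_implied (avx512_enabled ()) is a quick way to confirm what a given option pulls in.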
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #50 from Jakub Jelinek ---
(In reply to Richard Biener from comment #49)
> (In reply to Uroš Bizjak from comment #48)
> > The runtime testcase fails on non-AVX512F x86 targets due to:
> >
> > /* { dg-do run } */
> > /* { dg-options
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #49 from Richard Biener ---
(In reply to Uroš Bizjak from comment #48)
> The runtime testcase fails on non-AVX512F x86 targets due to:
>
> /* { dg-do run } */
> /* { dg-options "-O3" } */
> /* { dg-additional-options
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #48 from Uroš Bizjak ---
The runtime testcase fails on non-AVX512F x86 targets due to:
/* { dg-do run } */
/* { dg-options "-O3" } */
/* { dg-additional-options "-march=skylake-avx512" { target { x86_64-*-*
i?86-*-* } } } */
but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Richard Biener changed:
What|Removed |Added
Status|REOPENED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #46 from GCC Commits ---
The master branch has been updated by Richard Biener:
https://gcc.gnu.org/g:5352ede92483b949e811cbdcdfaec5378f3e06d6
commit r14-8975-g5352ede92483b949e811cbdcdfaec5378f3e06d6
Author: Richard Biener
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #45 from Hongtao Liu ---
> > There's do_store_flag to fixup for uses not in branches and
> > do_compare_and_jump for conditional jumps.
>
> reasonable enough for me.
I mean we only handle it at consumers where the upper bits matter.
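The consumer-side approach can be modelled in plain C. This is only a sketch of the idea, assuming a mask of up to 8 single-bit lanes stored in one byte; it is not the code in do_store_flag or do_compare_and_jump, and all function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Low-bit pattern covering NLANES lanes of a mask held in one byte.  */
uint8_t
lane_bits (unsigned nlanes)
{
  assert (nlanes >= 1 && nlanes <= 8);
  return (uint8_t) ((1u << nlanes) - 1);
}

/* Producers work on the whole byte and leave the padding bits alone.  */
uint8_t
mask_not (uint8_t m)
{
  return (uint8_t) ~m;
}

uint8_t
mask_ior (uint8_t a, uint8_t b)
{
  return (uint8_t) (a | b);
}

/* Consumer: "are all NLANES lanes set?"  The padding is cleared here,
   once, at the point where the upper bits would change the answer.  */
int
all_lanes_set (uint8_t m, unsigned nlanes)
{
  uint8_t bits = lane_bits (nlanes);
  return (m & bits) == bits;
}

/* Consumer: "is any of the NLANES lanes set?"  */
int
any_lane_set (uint8_t m, unsigned nlanes)
{
  return (m & lane_bits (nlanes)) != 0;
}
```

The trade-off in this model is that a chain of producer operations needs no extra AND instructions, while every consumer has to know the lane count.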
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #44 from Hongtao Liu ---
>
> Note the AND is removed by combine if I add it:
>
> Successfully matched this instruction:
> (set (reg:CCZ 17 flags)
> (compare:CCZ (and:HI (not:HI (subreg:HI (reg:QI 102 [ tem_3 ]) 0))
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #43 from Hongtao Liu ---
> Well, yes, the discussion in this bug was whether to do this at consumers
> (that's sth new) or with all mask operations (that's how we handle
> bit-precision integer operations, so it might be relatively
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #42 from Richard Biener ---
And the do_store_flag part:
diff --git a/gcc/expr.cc b/gcc/expr.cc
index fc5e998e329..44d64274071 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -13693,6 +13693,19 @@ do_store_flag (sepops ops, rtx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #41 from Richard Biener ---
(In reply to Hongtao Liu from comment #38)
> > I think we should also mask off the upper bits of variable mask?
> >
> > notl %esi
> > orl %esi, %edi
> > notl %edi
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #40 from Jakub Jelinek ---
For unsigned _BitInt(4) or unsigned _BitInt(2) we mask it whenever loading from
memory or function argument or whatever other ABI specific spot (and also when
storing because that is how RTL expects it;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #39 from Hongtao Liu ---
> > the question is whether that matches the semantics of GIMPLE (the padding
> > is inverted, too), whether it invokes undefined behavior (don't do it - it
> > seems for people using intrinsics that's what
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #38 from Hongtao Liu ---
> I think we should also mask off the upper bits of variable mask?
>
> notl %esi
> orl %esi, %edi
> notl %edi
> andl $15, %edi
> je .L3
with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #37 from Hongtao Liu ---
(In reply to Richard Biener from comment #36)
> For example with AVX512VL and the following, using -O -fgimple -mavx512vl
> we get simply
>
> notl %esi
> orl %esi, %edi
> cmpb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #36 from Richard Biener ---
For example with AVX512VL and the following, using -O -fgimple -mavx512vl
we get simply
notl %esi
orl %esi, %edi
cmpb $15, %dil
je .L6
typedef long v4si
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Richard Biener changed:
What|Removed |Added
CC||ams at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #34 from Richard Sandiford ---
(In reply to Richard Biener from comment #32)
> Btw, AVX512 knotb will invert all 8 bits and there's no knot just affecting
> the lowest 4 or 2 bits.
>
> It all feels like a disaster waiting to happen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #32 from Richard Biener ---
Btw, AVX512 knotb will invert all 8 bits and there's no knot just affecting
the lowest 4 or 2 bits.
It all feels like a disaster waiting to happen ;)
For example BIT_NOT_EXPR is RTL expanded like
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #31 from rguenther at suse dot de ---
On Tue, 30 Jan 2024, rsandifo at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
>
> --- Comment #30 from Richard Sandiford ---
> (In reply to Richard Biener from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #30 from Richard Sandiford ---
(In reply to Richard Biener from comment #29)
> But that's just for CONSTRUCTORs, we got the VIEW_CONVERT_EXPR path for
> VECTOR_CSTs. But yeah, that _might_ argue we should perform the same
> masking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #29 from Richard Biener ---
(In reply to Hongtao Liu from comment #28)
> I saw we already mask off integral modes for vector masks in store_constructor
>
> /* Use sign-extension for uniform boolean vectors with
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #28 from Hongtao Liu ---
I saw we already mask off integral modes for vector masks in store_constructor
/* Use sign-extension for uniform boolean vectors with
integer modes and single-bit mask entries.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #27 from Richard Biener ---
(In reply to Hongtao Liu from comment #25)
> (In reply to Tamar Christina from comment #24)
> > Just to avoid confusion, are you still working on this one Richi?
>
> I'm working on a patch to add a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #26 from Tamar Christina ---
Ah great, just checking it wasn't left unattended :)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #25 from Hongtao Liu ---
(In reply to Tamar Christina from comment #24)
> Just to avoid confusion, are you still working on this one Richi?
I'm working on a patch to add a target hook as #c18 mentioned.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #24 from Tamar Christina ---
Just to avoid confusion, are you still working on this one Richi?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Tamar Christina changed:
What|Removed |Added
CC||acoplan at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #22 from Hongtao Liu ---
typedef unsigned long mp_limb_t;
typedef long mp_size_t;
typedef unsigned long mp_bitcnt_t;
typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;
#define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #21 from Hongtao Liu ---
typedef unsigned long mp_limb_t;
typedef long mp_size_t;
typedef unsigned long mp_bitcnt_t;
typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;
#define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #20 from Hongtao Liu ---
> Note that I wonder how to eliminate redundant maskings? I suppose
> eventually combine tracking nonzero bits where obvious would do
> that? For example for cmp:V4SI we know the bits will be zero but
> I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #19 from rguenther at suse dot de ---
On Thu, 25 Jan 2024, rsandifo at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
>
> --- Comment #18 from Richard Sandiford ---
> (In reply to Tamar Christina
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #18 from Richard Sandiford ---
(In reply to Tamar Christina from comment #17)
> Well the mid-end has generated the right precision. The type it generates is
> vector(4) vexit_reduc_67;
> so it does say it's a single bit boolean.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #17 from Tamar Christina ---
Well the mid-end has generated the right precision. The type it generates is
vector(4) vexit_reduc_67;
so it does say it's a single bit boolean.
Isn't this just an expand problem?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #16 from Richard Sandiford ---
(In reply to Richard Biener from comment #15)
> I think the problem is the cbranch pattern which looks at all of the
> QImode mask - but of course it doesn't know it's really V4BImode it's
> working on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #15 from Richard Biener ---
(In reply to Richard Sandiford from comment #13)
> I don't think there's any principle that upper bits must be zero.
> How do we end up with a pattern that depends on that being the case?
I think the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Hongtao Liu changed:
What|Removed |Added
Resolution|FIXED |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #13 from Richard Sandiford ---
I don't think there's any principle that upper bits must be zero.
How do we end up with a pattern that depends on that being the case?
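A small model of how such a dependency can arise, assuming a 4-lane mask kept in an 8-bit integer. The functions are hypothetical stand-ins for a whole-byte inversion and an exact compare against 15; they are not GCC internals:

```c
#include <assert.h>
#include <stdint.h>

/* Whole-byte inversion: all 8 bits flip, including the 4 bits that are
   only padding for a 4-lane mask.  */
uint8_t
invert_byte (uint8_t m)
{
  return (uint8_t) ~m;
}

/* Naive "all four lanes true" check: an exact compare against 0x0f,
   which silently assumes the upper 4 bits are zero.  */
int
naive_all_true4 (uint8_t m)
{
  return m == 0x0f;
}

/* Shows the mismatch: inverting an all-false 4-lane mask sets all four
   real lanes, yet the naive check rejects the result because the
   padding bits were inverted too.  */
void
show_mismatch (void)
{
  uint8_t all_false = 0x00;
  uint8_t inverted = invert_byte (all_false);	/* 0xff */
  assert ((inverted & 0x0f) == 0x0f);
  assert (!naive_all_true4 (inverted));
}
```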
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #12 from rguenther at suse dot de ---
On Thu, 25 Jan 2024, liuhongt at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
>
> --- Comment #7 from Hongtao Liu ---
> diff --git a/gcc/fold-const.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Richard Biener changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #9 from GCC Commits ---
The master branch has been updated by Richard Biener:
https://gcc.gnu.org/g:578c7b91f418ebbef1bf169117815409e06f5197
commit r14-8413-g578c7b91f418ebbef1bf169117815409e06f5197
Author: Richard Biener
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #8 from Hongtao Liu ---
maybe
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1fd957288d4..6d321f9baef 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8035,6 +8035,9 @@ native_encode_vector_part (const_tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #7 from Hongtao Liu ---
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1fd957288d4..33a8d539b4d 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8032,7 +8032,7 @@ native_encode_vector_part (const_tree expr,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #6 from Hongtao Liu ---
Another potentially buggy place is
240 vexit_reduc_67 = mask_patt_43.28_62 & mask_patt_43.28_63;
241 if (vexit_reduc_67 == { -1, -1, -1, -1 })
242 goto ; [94.50%]
243 else
is expanded to
319 (insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #5 from Richard Biener ---
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index fe631252dc2..28ad03e0b8a 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -991,8 +991,12 @@ vec_init_loop_exit_info (class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Richard Biener changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #3 from Richard Biener ---
So the change enables early exit vectorization since may_be_zero is _10 == 0
here, resulting in an overall
number_of_iterationsm1 == _10 != 0 ? _10 + 4294967295 : 0
and
number_of_iterations = MAX_EXPR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Richard Biener changed:
What|Removed |Added
Target Milestone|--- |14.0
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #1 from Hongtao Liu ---
int
__attribute__((noinline))
sbitmap_first_set_bit (const_sbitmap bmap)
{
unsigned int n = 0;
sbitmap_iterator sbi;
EXECUTE_IF_SET_IN_SBITMAP (bmap, 0, n, sbi)
return n;
return -1;
}
hangs on
57 matches