https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
Tamar Christina changed:
What|Removed |Added
Summary|[13/14 regression] jump |[13 regression] jump
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
Jakub Jelinek changed:
What|Removed |Added
Priority|P1 |P2
--- Comment #53 from Jakub Jelinek
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #52 from CVS Commits ---
The master branch has been updated by Jakub Jelinek :
https://gcc.gnu.org/g:de0ee9d14165eebb3d31c84e98260c05c3b33acb
commit r13-7192-gde0ee9d14165eebb3d31c84e98260c05c3b33acb
Author: Jakub Jelinek
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #51 from Jakub Jelinek ---
Dumb untested patch which saves 2 instructions from each of those testcases:
--- gcc/tree-if-conv.cc.jj 2023-04-12 08:53:58.264496474 +0200
+++ gcc/tree-if-conv.cc 2023-04-14 21:02:42.403826690 +0200
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #50 from Jakub Jelinek ---
Anyway, given that after sorting the last entry has the maximum number of
occurrences, I think the best approach for now, without trying anything
smarter, is to avoid evaluating that last condition.
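To illustrate the idea, here is a minimal C sketch. It is not the actual tree-if-conv.cc code; the function and parameter names are hypothetical. It models a three-argument PHI whose incoming predicates are mutually exclusive and exhaustive, so the last value can serve as the default and its predicate never has to be computed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a 3-argument PHI after if-conversion.
   c0, c1, c2 are the predicates of the incoming edges; exactly one holds.  */

/* Naive form: every predicate is evaluated, three selects are emitted.  */
int select_all_conds (bool c0, bool c1, bool c2, int v0, int v1, int v2)
{
  assert (c0 + c1 + c2 == 1);
  int t = c0 ? v0 : 0;
  t = c1 ? v1 : t;
  t = c2 ? v2 : t;
  return t;
}

/* Same result with two selects: v2 is the default, so c2 is never needed.
   Precondition (not checked here): exactly one of c0, c1, c2 would hold.  */
int select_skip_last (bool c0, bool c1, int v0, int v1, int v2)
{
  int t = c1 ? v1 : v2;
  return c0 ? v0 : t;
}
```

Since the last entry is the one with the most occurrences, making it the default drops the predicate that would presumably be the most expensive to build.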
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #49 from Jakub Jelinek ---
Plus for 4+ args_len, if we don't find some smart sorting, we should still
consider at least some reassociation between the COND_EXPRs, instead of
emitting for 4 args_len
3 COND_EXPRs where second depends
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #48 from Jakub Jelinek ---
For PHIs with 3+ arguments, unless all the arguments but one are the same, it
seems we emit one more COND_EXPR than we need to, even when not doing anything
smart.
The
/* Common case. */
case loop emits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #47 from Jakub Jelinek ---
The testcase then doesn't have to be floating point, say on x86 -O3 -mavx512f
void
foo (int *f, int d, int e)
{
for (int i = 0; i < 1024; i++)
{
int a = f[i];
int t;
if (a < 0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #46 from rguenther at suse dot de ---
On 13.04.2023 at 18:54, jakub at gcc dot gnu.org wrote:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
>
> --- Comment #45 from Jakub Jelinek ---
> So, would
> void
> foo (float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #45 from Jakub Jelinek ---
So, would
void
foo (float *f, float d, float e)
{
if (e >= 2.0f && e <= 4.0f)
;
else
__builtin_unreachable ();
for (int i = 0; i < 1024; i++)
{
float a = f[i];
f[i] = (a <
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #44 from Richard Biener ---
The larger testcase:
typedef struct __attribute__((__packed__)) _Atom { float x, y, z; int type; }
Atom;
typedef struct __attribute__((__packed__)) _FFParams { int hbtype; float
radius; float hphb; float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
ktkachov at gcc dot gnu.org changed:
What|Removed |Added
Priority|P3 |P1
--- Comment #43 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #42 from Tamar Christina ---
Thanks for all the work so far folks!
Just to clarify the current state, it looks like the first reduced testcase is
now correct.
But the larger example as in c26 is still suboptimal, but slightly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #41 from CVS Commits ---
The master branch has been updated by Andrew Macleod :
https://gcc.gnu.org/g:429a7a88438cc80e7c58d9f63d44838089899b12
commit r13-6945-g429a7a88438cc80e7c58d9f63d44838089899b12
Author: Andrew MacLeod
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #40 from Andrew Macleod ---
> There is no problem with adding --params, and those are always better than
> magic numbers.
>
> Btw, I originally wondered why we don't re-compute zone1_12 because it's
> in the imports of the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #39 from Richard Biener ---
(In reply to Andrew Macleod from comment #37)
> Created attachment 54780 [details]
> in progress patch
>
> Well call me a liar.
>
> It took me a while to understand why, but if we leave it to single
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #38 from CVS Commits ---
The master branch has been updated by Richard Biener :
https://gcc.gnu.org/g:c9954996cd647daf0ba03e34dd279b97982f671f
commit r13-6923-gc9954996cd647daf0ba03e34dd279b97982f671f
Author: Richard Biener
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #37 from Andrew Macleod ---
Created attachment 54780
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54780&action=edit
in progress patch
Well call me a liar.
It took me a while to understand why, but if we leave it to single
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #36 from Andrew Macleod ---
(In reply to Jakub Jelinek from comment #35)
> (In reply to Andrew Macleod from comment #34)
> > I will poke at whether it's possible to cheaply handle a second (or third)
> > level for single dependency
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #35 from Jakub Jelinek ---
(In reply to Andrew Macleod from comment #34)
> I will poke at whether it's possible to cheaply handle a second (or third)
> level for single dependency defs.
Will those include also binary ops which have
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #34 from Andrew Macleod ---
(In reply to Jakub Jelinek from comment #33)
> (In reply to Andrew Macleod from comment #32)
> > We could in theory expand it to look at 2 levels if it's a single operand...
>
> Yeah, that would help here
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #33 from Jakub Jelinek ---
(In reply to Andrew Macleod from comment #32)
> We could in theory expand it to look at 2 levels if it's a single operand...
Yeah, that would help here and could be worth it.
> which will help with some
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #32 from Andrew Macleod ---
The issue here is pruning to avoid significant time growth.
_1 = (float) l_11(D);
_2 = _1 < 0.0;
zone1_12 = (int) _2;
if (_1 < 0.0)
goto ; [INV]
_1 is an export from the block. In
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #31 from Jakub Jelinek ---
On the #c28 testcase, my #c23 patch seems to improve something only visible in
the details of the evrp dump:
zone1_12 : [irange] int [0, 1] NONZERO 0x1
-2->3 (T) _1 : [frange] float [-Inf, -0.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #30 from Jakub Jelinek ---
But at least the zone1_100 stuff is unused in #c26, so improving #c28 there
wouldn't help.
distbb_99 = distij_98 - radij_82;
_27 = distbb_99 < 0.0;
# RANGE [irange] const int [0, 1] NONZERO 0x1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #29 from Richard Biener ---
For the testcase in comment#26 we see that if-conversion from
if (distbb_170 >= 0.0)
goto ; [59.00%]
else
goto ; [41.00%]
[local count: 311875831]:
...
if (distbb_170 < iftmp.0_97)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #28 from Richard Biener ---
So as for what ranger should get, the testcase in comment#2 after EVRP still
sees
:
_1 = (float) l_10;
_2 = _1 < 0.0;
zone1_17 = (int) _2;
if (_1 < 0.0)
goto ; [INV]
else
goto ;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #27 from Richard Biener ---
I've added heuristics to threading to PR109048 but I think it's too strong to
reject them.
For the testcase in this PR ranger could fix up if it managed to properly
propagate the singleton range early.
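As a rough illustration of what "singleton range" means here, the following C sketch mirrors the shape of the GIMPLE quoted elsewhere in this thread (_1, _2, zone1_17). It is a hypothetical reconstruction: the function name, the parameter type, and the returned constants are assumptions, not the PR's testcase.

```c
/* Hypothetical reconstruction; names follow the GIMPLE dump loosely.  */
int classify (int l)
{
  float f = (float) l;      /* _1 = (float) l_10          */
  int zone1 = f < 0.0f;     /* zone1_17 = (int) (_1 < 0.0) */

  if (f < 0.0f)
    /* On this edge zone1 can only be 1, so the result is always 11.  */
    return 10 + zone1;

  /* On this edge zone1 can only be 0, so the result is always 20.  */
  return 20 + zone1;
}
```

If the value of zone1 is known on each edge early enough, later passes see constants instead of a value that still depends on the comparison.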
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #26 from Jakub Jelinek ---
The above slightly simplified (dead var removal, preprocessing etc.):
typedef struct __attribute__((__packed__)) _Atom { float x, y, z; int type; }
Atom;
typedef struct __attribute__((__packed__))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #25 from Tamar Christina ---
Created attachment 54777
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54777&action=edit
extracted codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #24 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to Richard Biener from comment #11)
> > _1 should be [-Inf, nextafter (0.0, -Inf)], not [-Inf, -0.0]
> The reduced testcase is invalid because it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #23 from CVS Commits ---
The master branch has been updated by Jakub Jelinek :
https://gcc.gnu.org/g:ce3974e5962b0e1f72a1f71ebda39d53a77b7cc9
commit r13-6898-gce3974e5962b0e1f72a1f71ebda39d53a77b7cc9
Author: Jakub Jelinek
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #22 from Richard Biener ---
(In reply to Jakub Jelinek from comment #21)
> Created attachment 54770 [details]
> gcc13-pr109154.patch
>
> So what about this then?
> It matches the x86 FTZ behavior, because FTZ is a masked reaction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #21 from Jakub Jelinek ---
Created attachment 54770
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54770&action=edit
gcc13-pr109154.patch
So what about this then?
It matches the x86 FTZ behavior, because FTZ is a masked reaction to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #20 from Jakub Jelinek ---
(In reply to Richard Biener from comment #18)
> Hmm, I guess if users enable FTZ we could instruct them to tell that to
> the compiler, but requiring -funsafe-math-optimizations is quite a
> difficult
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #19 from Jakub Jelinek ---
Though, we likely use set also when just copying ranges and the like, so we'd
probably
need to move the flush_denormals_to_zero calls from set to somewhere else,
perhaps
range_operator_float::fold_range?
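For context on why flushing matters for ranges around zero, here is a small C sketch. It does not reproduce GCC's flush_denormals_to_zero; flush_endpoint and largest_negative are hypothetical helpers that only model the effect on a single range endpoint. The largest float strictly below zero is a subnormal, so flushing it turns an upper bound of nextafter (0.0, -Inf) into -0.0.

```c
#include <float.h>
#include <math.h>

/* Hypothetical stand-in for flushing one range endpoint:
   a subnormal value collapses to a zero of the same sign.  */
float flush_endpoint (float x)
{
  if (fpclassify (x) == FP_SUBNORMAL)
    return x < 0.0f ? -0.0f : 0.0f;
  return x;
}

/* Largest float strictly below zero, i.e. nextafterf (0.0f, -INFINITY).
   FLT_TRUE_MIN (C11) is the smallest positive subnormal float.  */
float largest_negative (void)
{
  return -FLT_TRUE_MIN;
}
```

With these helpers, flush_endpoint (largest_negative ()) compares equal to zero, which is why a flushed range for "x < 0.0" appears to include -0.0 even though -0.0 < 0.0 is false.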
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #18 from Richard Biener ---
(In reply to Jakub Jelinek from comment #17)
> (In reply to rguent...@suse.de from comment #15)
> > I think flushing denormals makes sense for "forward" propagation,
>
> Well, it still hurts quite a lot
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #17 from Jakub Jelinek ---
(In reply to rguent...@suse.de from comment #15)
> I think flushing denormals makes sense for "forward" propagation,
Well, it still hurts quite a lot exactly for the ranges around zero.
Given that most
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #16 from Jakub Jelinek ---
Created attachment 54766
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54766&action=edit
gcc13-pr109154-denorm.patch
Untested patch to honor denormals if floating point mode has them, unless
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #15 from rguenther at suse dot de ---
On Mon, 27 Mar 2023, jakub at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
>
> --- Comment #12 from Jakub Jelinek ---
> (In reply to Richard Biener from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #14 from Jakub Jelinek ---
(In reply to Jakub Jelinek from comment #12)
> We definitely should add range-ops for conversions from integral to floating
> point and from floating to integral and their reverses.
Do we have range-ops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #13 from Aldy Hernandez ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to Aldy Hernandez from comment #10)
> > BTW, I don't think it helps at all here, but casting from l_10 to a float,
> > we know _1 can't be either
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #12 from Jakub Jelinek ---
(In reply to Richard Biener from comment #11)
> _1 should be [-Inf, nextafter (0.0, -Inf)], not [-Inf, -0.0]
Well, that is a consequence of the decision to always flush denormals to zero
in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
Richard Biener changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #10 from Aldy Hernandez ---
(In reply to Andrew Macleod from comment #9)
> (In reply to Richard Biener from comment #7)
> > (In reply to Richard Biener from comment #6)
> > > ah, probably it's the missing CSE there:
> > >
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #9 from Andrew Macleod ---
(In reply to Richard Biener from comment #7)
> (In reply to Richard Biener from comment #6)
> > ah, probably it's the missing CSE there:
> >
> > :
> > _1 = (float) l_10;
> > _2 = _1 < 0.0;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #8 from Aldy Hernandez ---
(In reply to avieira from comment #5)
> I'm slightly confused here, on entry to BB 5 we know the opposite of _1 < 0.0
> no? if we branch to BB 5 we know !(_1 < 0.0) so we can't fold _1 <= 1.0, we
> just
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #7 from Richard Biener ---
(In reply to Richard Biener from comment #6)
> ah, probably it's the missing CSE there:
>
> :
> _1 = (float) l_10;
> _2 = _1 < 0.0;
> zone1_17 = (int) _2;
> if (_1 < 0.0)
>
> we are
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #6 from Richard Biener ---
We have
if (_1 < 0.0)
# PHI < .., ..> // the if above only controls which PHI arg we take
... code ...
if (_1 < 1.0e+0)
# PHI < .., ...> // likewise
and are threading _1 < 0.0 -> _1 <
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #5 from avieira at gcc dot gnu.org ---
I'm slightly confused here: on entry to BB 5 we know the opposite of _1 < 0.0,
no? If we branch to BB 5 we know !(_1 < 0.0), so we can't fold _1 <= 1.0; we
just know that the range of _1 is >= 0.0
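The point can be checked with two witness values. The sketch below uses hypothetical helper names; on_false_edge models reaching the block via the false edge of "_1 < 0.0", and le_one is the later comparison. Both 0.5 and 2.0 take the false edge but disagree on "<= 1.0", so that comparison cannot be folded from the edge alone. For floats, a NaN would also take the false edge, since every ordered comparison with NaN is false.

```c
#include <stdbool.h>

/* True when x reaches the block on the false edge of "x < 0.0f".  */
bool on_false_edge (float x)
{
  return !(x < 0.0f);
}

/* The later comparison that one might hope to fold.  */
bool le_one (float x)
{
  return x <= 1.0f;
}
```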
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
Aldy Hernandez changed:
What|Removed |Added
CC||amacleod at redhat dot com,