https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113474
JuzheZhong changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115093
JuzheZhong changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115104
--- Comment #1 from JuzheZhong ---
I wonder whether RIVOS CI has already found which commit caused this regression?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115104
Bug ID: 115104
Summary: RISC-V: GCC-14 can combine vsext+vadd -> vwadd but
Trunk GCC (GCC 15) Failed
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115068
Bug ID: 115068
Summary: RISC-V: Illegal instruction of vfwadd
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114988
--- Comment #2 from JuzheZhong ---
Li Pan is going to work on it.
Hi, kito and Jeff.
Can this fix be backported to GCC-14?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114988
--- Comment #1 from JuzheZhong ---
Ideally, it should be reported as (-march=rv64gc):
https://godbolt.org/z/3P76YEb9s
: In function 'test_vfwsub_wf_f32mf2':
:4:15: error: return type 'vfloat32mf2_t' requires the V ISA extension
4 |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114988
Bug ID: 114988
Summary: RISC-V: ICE in intrinsic __riscv_vfwsub_wf_f32mf2
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114887
--- Comment #2 from JuzheZhong ---
I think there is a too conservative analysis here:
note: _1: type = float, start = 1, end = 6
note: _5: type = float, start = 6, end = 8
note: _3: type = float, start = 3, end = 7
note: _4: type =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114887
--- Comment #1 from JuzheZhong ---
The "vect" cost model analysis:
https://godbolt.org/z/qbqzon8x1
note: Maximum lmul = 8, At most 40 number of live V_REG at program point 6
for bb 3
It seems that we count one more variable in program
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639
--- Comment #18 from JuzheZhong ---
(In reply to Li Pan from comment #17)
> According to the V abi, looks like the asm code tries to save/restore the
> callee-saved registers when there is a call in function body.
>
> | Name| ABI Mnemonic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639
--- Comment #16 from JuzheZhong ---
This issue is not fully fixed, since the fix patch only fixes the ICE, but there is
a regression in codegen:
https://godbolt.org/z/4nvxeqb6K
Terrible codegen:
test(__rvv_uint64m4_t):
addi sp,sp,-16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114809
JuzheZhong changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714
JuzheZhong changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114749
--- Comment #4 from JuzheZhong ---
Hi, Patrick.
It seems that Richard didn't append the testcase in the patch.
Could you send a patch to add the testcase for the RISC-V port?
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729
JuzheZhong changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114686
JuzheZhong changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639
--- Comment #6 from JuzheZhong ---
Definitely it is a regression:
https://compiler-explorer.com/z/e68x5sT9h
GCC 13.2 is ok, but GCC 14 ICE.
I think you should bisect first.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114476
--- Comment #7 from JuzheZhong ---
Hi, Robin.
Will you fix this bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114506
JuzheZhong changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
--- Comment #19 from JuzheZhong ---
I think it's better to add pr114396.c to the vect testsuite instead of the x86 target
tests, since this bug does not happen only on x86.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
--- Comment #28 from JuzheZhong ---
The original cost model I wrote worked for all cases, but after some middle-end
changes the cost model failed.
I don't have time to figure out what's going on here.
Robin may be interested in it.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109
--- Comment #3 from JuzheZhong ---
(In reply to Robin Dapp from comment #2)
> It is vectorized with a higher zvl, e.g. zvl512b, refer
> https://godbolt.org/z/vbfjYn5Kd.
OK. I see. But Clang generates many slide instructions, which are expensive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109
--- Comment #1 from JuzheZhong ---
It seems RISC-V Clang didn't vectorize it ?
https://godbolt.org/z/G4han6vM3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113913
--- Comment #2 from JuzheZhong ---
It's a known issue that we are trying to fix in GCC-15.
My colleague Lehua is taking care of it.
CCing Lehua.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #16 from JuzheZhong ---
The FMA is generated in widening_mul PASS:
Before widening_mul (fab1):
_5 = 3.33314829616256247390992939472198486328125e-1 - _4;
_6 = _5 *
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #15 from JuzheZhong ---
(In reply to rguent...@suse.de from comment #14)
> On Wed, 7 Feb 2024, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
> >
> > --- Comment #13 from JuzheZhong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #13 from JuzheZhong ---
Ok. I found the optimized tree:
_5 = 3.33314829616256247390992939472198486328125e-1 - _4;
_8 = .FMA (_5, 1.229982236431605997495353221893310546875e-1, _4);
Let CST0 =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #12 from JuzheZhong ---
Ok. I found it even without vectorization:
GCC is worse than Clang:
https://godbolt.org/z/addr54Gc6
GCC (14 instructions inside the loop):
fld fa3,0(a0)
fld fa5,8(a0)
fld
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #11 from JuzheZhong ---
Hi, I think this RVV compiler codegen is the optimal codegen we want for RVV:
https://repo.hca.bsc.es/epic/z/P6QXCc
.LBB0_5:# %vector.body
sub a4, t0, a3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134
--- Comment #22 from JuzheZhong ---
I have done the following experiment.
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index bf017137260..8c36cc63d3b 100644
--- a/gcc/tree-ssa-loop-ivcanon.cc
+++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113608
--- Comment #2 from JuzheZhong ---
vuint16m2_t vadd(vuint16m2_t a, vuint8m1_t b) {
int vl = __riscv_vsetvlmax_e8m1();
vuint16m2_t c = __riscv_vzext_vf2_u16m2(b, vl);
return __riscv_vadd_vv_u16m2(a, c, vl);
}
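For readers without an RVV toolchain, the computation in the test case above can be modelled in scalar C. This is only a sketch: the helper names zext_then_add and widening_add are hypothetical, and the element-wise loops stand in for the vzext.vf2 + vadd.vv pair and for a single widening add (vwaddu.wv style) respectively. It shows only that the two forms produce the same values; it says nothing about how GCC performs the combine or allocates registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Models vzext.vf2 followed by vadd.vv: widen b to 16 bits into a
   temporary, then add at 16-bit element width (wrapping modulo 2^16). */
void zext_then_add(uint16_t *dst, const uint16_t *a, const uint8_t *b, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        uint16_t widened = (uint16_t)b[i];
        dst[i] = (uint16_t)(a[i] + widened);
    }
}

/* Models a single widening add: 16-bit a plus zero-extended 8-bit b,
   with no separate widened temporary. */
void widening_add(uint16_t *dst, const uint16_t *a, const uint8_t *b, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = (uint16_t)(a[i] + b[i]);
}
```

Calling both helpers on the same inputs should give identical output arrays, including for elements where the 16-bit sum wraps around.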
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134
--- Comment #21 from JuzheZhong ---
Hi, Richard. I looked into ivcanon.
I found that:
/* If the loop has more than one exit, try checking all of them
for # of iterations determinable through scev. */
if (!exit)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492
--- Comment #11 from JuzheZhong ---
Hi, Tamar.
We are interested in supporting saturating and rounding.
We may need to support scalar first.
Do you have any suggestions ?
Or you are already working on it?
Thanks.
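As a concrete picture of what "saturating" means here, the following is a minimal scalar sketch of unsigned saturating subtraction, the operation LLVM names llvm.usub.sat. The function names usub_sat_u32 and slide_entries are illustrative only and are not taken from GCC or from any benchmark source; the loop shape is an assumption based on a typical "subtract a window size, clamp at zero, store as 16-bit" pattern.

```c
#include <stdint.h>

/* Unsigned saturating subtract: clamps at 0 instead of wrapping around. */
uint32_t usub_sat_u32(uint32_t a, uint32_t b)
{
    return a > b ? a - b : 0;
}

/* Hypothetical loop shape: subtract wsize from each 16-bit entry,
   clamped at 0, and store the result back as 16 bits. */
void slide_entries(uint16_t *p, int n, uint32_t wsize)
{
    for (int i = 0; i < n; i++)
        p[i] = (uint16_t)usub_sat_u32(p[i], wsize);
}
```

A compiler that recognizes the conditional in usub_sat_u32 as one saturating operation can map the loop to a vector saturating subtract where the target has one.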
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492
--- Comment #10 from JuzheZhong ---
Hi, Tamar.
We are interested in supporting saturating and rounding.
We may need to support scalar first.
Do you have any suggestions ?
Or you are already working on it?
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492
--- Comment #9 from JuzheZhong ---
Ok. After investigation of LLVM:
Before loop vectorizer:
%cond12 = tail call i32 @llvm.usub.sat.i32(i32 %conv5, i32 %wsize)
%conv13 = trunc i32 %cond12 to i16
After loop vectorizer:
%10 = call <16 x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492
--- Comment #8 from JuzheZhong ---
Missing saturate vectorization causes RVV Clang 20% performance better than RVV
GCC during recent benchmark evaluation.
In coremark pro zip-test, I believe other targets should be the same.
I wonder how we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113695
--- Comment #1 from JuzheZhong ---
Since both operands are input operands, the early-clobber "&" constraint cannot
help.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113695
Bug ID: 113695
Summary: RISC-V: Sources with different EEW must use different
registers
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134
--- Comment #19 from JuzheZhong ---
The loop is:
bb 3 -> bb 4 -> bb 5
| |__⬆
|__⬆
The condition in bb 3 is if (i_21 == 1001).
The condition in bb 4 is if (N_13(D) > i_18).
Look into lsplit:
This loop doesn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #18 from JuzheZhong ---
(In reply to rguent...@suse.de from comment #17)
> On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> >
> > --- Comment #16 from JuzheZhong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #16 from JuzheZhong ---
(In reply to rguent...@suse.de from comment #15)
> On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> >
> > --- Comment #14 from JuzheZhong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #14 from JuzheZhong ---
Thanks Richard.
It seems that we can't fix this issue for now. Is that right ?
If I understand correctly, do you mean we should wait after SLP representations
are finished and then revisit this PR?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #12 from JuzheZhong ---
OK. It seems it has data dependency issue:
missed: not vectorized, possible dependence between data-refs a[i_15] and
a[_4]
a[i_15] = _3; STMT 1
_4 = i_15 + 2;
_5 = a[_4];STMT 2
STMT2 should not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #11 from JuzheZhong ---
It seems that we should fix this case (Richard gave) first which I think it's
not the SCEV or value-numbering issue:
double a[1024];
void foo ()
{
for (int i = 0; i < 1022; i += 2)
{
double tem =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #10 from JuzheZhong ---
I think the root cause is we think i_16 and _1 are alias due to scalar
evolution:
(get_scalar_evolution
(scalar = i_16)
(scalar_evolution = {0, +, 2}_1))
(get_scalar_evolution
(scalar = _1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
JuzheZhong changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #20 from JuzheZhong ---
(In reply to Robin Dapp from comment #19)
> What seems odd to me is that in fre5 we simplify
>
> _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0,
> ... });
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
--- Comment #8 from JuzheZhong ---
Hi, Richard.
Now I have found the time to look at GCC vectorization optimization.
I find this case:
_2 = a[_1];
...
a[i_16] = _4;
...
_7 = a[_1];---> This load should be eliminated and re-use _2.
Am I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113166
--- Comment #3 from JuzheZhong ---
#include
#include
template
inline vuint8m1_t tail_load(void const* data);
template<>
inline vuint8m1_t tail_load(void const* data) {
uint64_t const* ptr64 = reinterpret_cast<uint64_t const*>(data);
#if 1
const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113666
Bug ID: 113666
Summary: RISC-V: Cost model test regression due to recent
middle-end loop vectorizer changes
Product: gcc
Version: 14.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #15 from JuzheZhong ---
Hi, Robin.
I tried to disable vec_extract, then the case passed.
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 3b32369f68c..b61b886ef3d 100644
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #13 from JuzheZhong ---
Ok. I found a regression between rvv-next and trunk.
I believe it is GCC-12 vs GCC-14:
rvv-next:
...
.L11:
li t1,31
mv a2,a1
bleu a7,t1,.L12
bne a6,zero,.L13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #11 from JuzheZhong ---
(In reply to Robin Dapp from comment #10)
> The compile farm machine I'm using doesn't have SVE.
> Compiling with -march=armv8-a -O3 pr113607.c -fno-vect-cost-model and
> running it returns 0 (i.e. ok).
>
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #9 from JuzheZhong ---
Hi, Robin.
Could you try this case on latest ARM SVE ?
with -march=armv8-a+sve -O3 -fno-vect-cost-model.
I want to make sure first that it is not a middle-end bug.
The RVV vectorized IR is the same as ARM SVE.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #8 from JuzheZhong ---
Ok. I can reproduce it too.
I am gonna work on fixing it.
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113608
Bug ID: 113608
Summary: RISC-V: Vector spills after enabling vector abi
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #3 from JuzheZhong ---
I tried trunk GCC to run your case with SPIKE, still didn't reproduce this
issue.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #2 from JuzheZhong ---
I can't reproduce this issue.
Could you test it with this patch applied ?
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643934.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
--- Comment #1 from JuzheZhong ---
I can reproduce this issue.
Could you test it with this patch applied ?
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643934.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #7 from JuzheZhong ---
(In reply to rguent...@suse.de from comment #6)
> On Thu, 25 Jan 2024, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
> >
> > --- Comment #5 from JuzheZhong ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570
--- Comment #5 from JuzheZhong ---
It seems that we don't have any bugs in current SPEC 2017 testing.
So I strongly suggest "full coverage" testing on SPEC 2017 which I mentioned
in PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #5 from JuzheZhong ---
Both ICC and Clang X86 can vectorize SPEC 2017 lbm:
https://godbolt.org/z/MjbTbYf1G
But I am not sure X86 ICC is better or X86 Clang is better.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #4 from JuzheZhong ---
OK. Confirmed that X86 GCC fails to vectorize it, whereas X86 Clang can vectorize
it.
https://godbolt.org/z/EaTjGbPGW
X86 Clang and RISC-V Clang IR are the same:
%12 = tail call <8 x double>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087
--- Comment #44 from JuzheZhong ---
(In reply to Patrick O'Neill from comment #43)
> (In reply to Patrick O'Neill from comment #42)
> > I kicked off a run roughly 10 hours ago with your memory-hog fix patch
> > applied to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #3 from JuzheZhong ---
Ok I see.
If we change NN into 8, then we can vectorize it with load_lanes/store_lanes
with group size = 8:
https://godbolt.org/z/doe9c3hfo
We will use vlseg8e64 which is RVVM1DF[8] == RVVM1x8DFmode.
Here
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #1 from JuzheZhong ---
It's interesting, for Clang only RISC-V can vectorize it.
I think there are 2 topics:
1. Support vectorization of this code in the loop vectorizer.
2. Transform gather/scatter into strided load/store for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087
--- Comment #41 from JuzheZhong ---
Hi, Patrick.
Could you trigger the test again based on the latest trunk GCC?
We have a recent memory-hog fix patch:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3132d2d36b4705bb762e61b1c8ca4da7c78a8321
I want to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #24 from JuzheZhong ---
(In reply to Richard Biener from comment #19)
> (In reply to Richard Biener from comment #18)
> > (In reply to Tamar Christina from comment #17)
> > > Ok, bisected to
> > >
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #14 from JuzheZhong ---
I just tried again both GCC-13.2 and GCC-14 with -fno-vect-cost-model.
https://godbolt.org/z/enEG3qf5K
GCC-14 requires scalar epilogue loop, whereas GCC-13.2 doesn't.
I believe it's not a cost model issue.
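To make the term "scalar epilogue loop" concrete, here is a sketch of the two loop shapes in plain C, with a fixed chunk width VF standing in for the vectorization factor. VF, add_with_epilogue and add_predicated are names made up for this illustration, and a real length-controlled RVV loop would obtain the per-iteration length from vsetvl rather than the min computation shown. Which shape a given GCC version picks for the test case is what the Compiler Explorer link shows; this sketch only illustrates the terms.

```c
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor */

/* Main loop handles full chunks; leftover elements go to a scalar epilogue. */
void add_with_epilogue(int *dst, const int *a, const int *b, size_t n)
{
    size_t i = 0;
    for (; i + VF <= n; i += VF)          /* "vector" body: VF elements per trip */
        for (size_t j = 0; j < VF; j++)
            dst[i + j] = a[i + j] + b[i + j];
    for (; i < n; i++)                    /* scalar epilogue: the n % VF leftovers */
        dst[i] = a[i] + b[i];
}

/* Length-controlled form: the last trip processes fewer elements,
   so no separate epilogue loop is needed. */
void add_predicated(int *dst, const int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i += VF) {
        size_t vl = (n - i < VF) ? n - i : VF;  /* models the vsetvl result */
        for (size_t j = 0; j < vl; j++)
            dst[i + j] = a[i + j] + b[i + j];
    }
}
```

Both functions compute the same result; the difference is only whether the remainder is handled by extra scalar code after the main loop.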
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #12 from JuzheZhong ---
(In reply to Richard Biener from comment #11)
> (In reply to Tamar Christina from comment #9)
> > There is a weird costing going on in the PHI nodes though:
> >
> > m_108 = PHI 1 times vector_stmt costs 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #10 from JuzheZhong ---
(In reply to Tamar Christina from comment #9)
> So on SVE the change is cost modelling.
>
> Bisect landed on g:33c2b70dbabc02788caabcbc66b7baeafeb95bcf which changed
> the compiler's defaults to using the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #31 from JuzheZhong ---
machine dep reorg : 403.69 ( 56%) 23.48 ( 93%) 427.17 ( 57%)
5290k ( 0%)
Confirmed: after removing RTL DF checking, LICM is no longer a compile-time hog.
The VSETVL pass accounts for 56% of compile time.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #30 from JuzheZhong ---
Ok. I believe m_avl_def_in && m_avl_def_out can be removed with a better
algorithm.
Then the memory-hog should be fixed soon.
I am gonna rewrite avl_vl_unmodified_between_p and trigger full coverage
testingl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #8 from JuzheZhong ---
I believe a change between Nov and Dec causes the regression.
But I did not continue the bisection.
I hope this information can help with your bisection.
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #7 from JuzheZhong ---
(In reply to Tamar Christina from comment #6)
> Hello,
>
> I can bisect it if you want. it should only take a few seconds.
Ok. Thanks a lot ...
I take 2 hours to bisect it manually but still didn't locate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #5 from JuzheZhong ---
Confirm at Nov, 1. The regression is gone.
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eac0917bd3d2ead4829d56c8f2769176087c7b3d
This commit is ok, which has no regressions.
Still bisecting manually.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #28 from JuzheZhong ---
(In reply to Robin Dapp from comment #27)
> Following up on this:
>
> I'm seeing the same thing Patrick does. We create a lot of large non-sparse
> sbitmaps that amount to around 33G in total.
>
> I did
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113420
JuzheZhong changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #25 from JuzheZhong ---
RISC-V backend memory-hog issue is fixed.
But the compile-time hog in LICM is still there, so keep this PR open.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #22 from JuzheZhong ---
(In reply to Richard Biener from comment #21)
> I once tried to avoid df_reorganize_refs and/or optimize this with the
> blocks involved but failed.
I am considering whether we should disable LICM for RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #19 from JuzheZhong ---
(In reply to JuzheZhong from comment #18)
> Hi, Robin.
>
> I have fixed patch for memory-hog:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643418.html
>
> I will commit it after the testing.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #18 from JuzheZhong ---
Hi, Robin.
I have a fix patch for the memory hog:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643418.html
I will commit it after the testing.
But compile-time hog still exists which is loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #17 from JuzheZhong ---
Ok. Confirm the original test 33383M -> 4796k now.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #16 from JuzheZhong ---
(In reply to Andrew Pinski from comment #15)
> (In reply to JuzheZhong from comment #14)
> > Oh. I known the reason now.
> >
> > The issue is not RISC-V backend VSETVL PASS.
> >
> > It's memory bug of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #14 from JuzheZhong ---
Oh. I know the reason now.
The issue is not the RISC-V backend VSETVL pass.
It's a memory bug of rtx_equal_p, I think.
We are calling rtx_equal_p which is very costly.
For example, has_nonvlmax_reg_avl is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #13 from JuzheZhong ---
So I think we should investigate why calling has_nonvlmax_reg_avl costs so much
memory.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #12 from JuzheZhong ---
Ok. Here is a simple fix which gives some hints:
diff --git a/gcc/config/riscv/riscv-vsetvl.cc
b/gcc/config/riscv/riscv-vsetvl.cc
index 2067073185f..ede818140dc 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #11 from JuzheZhong ---
It should be compute_lcm_local_properties. The memory usage is reduced by 50% after I
remove this function. I am still investigating.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #10 from JuzheZhong ---
No, it's not caused here. I removed the whole function compute_avl_def_data.
The memory usage doesn't change.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #6 from JuzheZhong ---
(In reply to Andrew Pinski from comment #5)
> Note "loop invariant motion" is the RTL based loop invariant motion pass.
So you mean it should still be a RISC-V issue, right?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #4 from JuzheZhong ---
Also, building the original file with -fno-move-loop-invariants reduces compile time from
60 minutes to 7 minutes:
real 7m12.528s
user 6m55.214s
sys 0m17.147s
machine dep reorg : 75.93 (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #3 from JuzheZhong ---
Ok. The reduced case:
# 1 "module_first_rk_step_part1.fppized.f90"
# 1 ""
# 1 ""
# 1 "module_first_rk_step_part1.fppized.f90"
!WRF:MEDIATION_LAYER:SOLVER
MODULE module_first_rk_step_part1
CONTAINS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #2 from JuzheZhong ---
To build the attachment file, we need the following files from SPEC2017:
module_big_step_utilities_em.mod module_cumulus_driver.mod
module_fddagd_driver.mod module_model_constants.mod
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #1 from JuzheZhong ---
Created attachment 57149
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57149&action=edit
spec2017 wrf
spec2017 wrf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
Bug ID: 113495
Summary: RISC-V: Time and memory awful consumption of SPEC2017
wrf benchmark
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113166
--- Comment #2 from JuzheZhong ---
#include
#if TO_16
# define uintOut_t uint16_t
# define utf8_to_utf32_scalar utf8_to_utf16_scalar
# define utf8_to_utf32_rvv utf8_to_utf16_rvv
#else
# define uintOut_t uint32_t
#endif
size_t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113474
--- Comment #2 from JuzheZhong ---
Oh. It's a pretty simple fix. I am not sure whether the Richards will allow it since it's
stage4, but it's worth a try.
Could you send a patch?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113474
Bug ID: 113474
Summary: RISC-V: Fail to use vmerge.vim for constant vector
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113429
--- Comment #10 from JuzheZhong ---
I have committed the V3 patch, rebased because the V2 patch conflicts with trunk.
I think you can use trunk GCC to validate CAM4 directly now.