...@rivai.ai; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On Thu, Jan 11, 2024 at 10:52 AM Robin Dapp wrote:
>
> On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> > Oh. I see I think I have done wrong here.
> >
ocal count: 359464610]:
goto ; [100.00%]
}
Final ASM:
main:
lui a5,%hi(a)
li a4,19
sb a4,%lo(a)(a5)
li a0,0
ret
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 20:56
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH
> 32872 spends 2 scalar instructions + 1 scalar_to_vec cost:
>
> li a4,-32768
> addiw a4,a4,104
> vmv.v.x v16,a4
>
> It seems reasonable, but it only fixes the test with -march=rv64gcv_zvl256b;
> it still fails with -march=rv64gcv_zvl4096b.
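For readers following the arithmetic, here is a minimal C sketch of why 32872 needs two scalar instructions before the broadcast. It assumes the vector elements are 16 bits wide (SEW=16), in which case vmv.v.x keeps only the low 16 bits of the scalar register. The helper names are hypothetical and do not come from the patch or from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): mirrors the scalar
   sequence "li a4,-32768; addiw a4,a4,104" in plain C.  */
static int32_t
materialize_a4 (void)
{
  int32_t a4 = -32768;   /* li a4,-32768 */
  a4 += 104;             /* addiw a4,a4,104 */
  return a4;
}

/* Hypothetical helper: the value each element holds after vmv.v.x,
   assuming SEW=16, so only the low 16 bits of a4 are kept.  */
static uint16_t
broadcast_element (int32_t a4)
{
  return (uint16_t) a4;
}

/* Sanity check of the arithmetic above.  */
static void
check_constant (void)
{
  assert (materialize_a4 () == -32664);
  assert (broadcast_element (materialize_a4 ()) == 32872);
}
```

Calling check_constant () passes silently: -32664 and 32872 are congruent modulo 65536, so the two-instruction scalar sequence plus one broadcast yields the 16-bit constant discussed above.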
The scalar version also needs both instructions:
li
e later pass failed
to CSE it...
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 19:15
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> I think we shouldn't vectorize it with any vlen, since the non-vectorized
> codegen is much better.
> And also, I have tested -msve-vector-bits=2048, ARM SVE doesn't vectorize it.
> -zvl65536b, RVV Clang also doesn't vectorize it.
Of course I agree that optimizing everything to return 0 is
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-01-11 18:40
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On 1/11/24 11:20, juzhe.zh...@rivai.ai wrote:
> Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the
> loop we have these 2 scalar_to_vec:
>
> 1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue
>
> This scalar_to_vec cost should be 0 or 1 since it only generate
all invariants are represented by SLP nodes
which we can hand down.
> >
> > juzhe.zh...@rivai.ai
> >
> >
> > From: Robin Dapp
> > Date: 2024-01-11 18:14
> > To: juzhe.zh...@rivai.ai; Richard Biener
> > CC: rdapp.gcc; gcc-pa
don't.
>
> juzhe.zh...@rivai.ai
>
>
> From: Robin Dapp
> Date: 2024-01-11 18:14
> To: juzhe.zh...@rivai.ai; Richard Biener
> CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
> Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_c
n Dapp
Date: 2024-01-11 18:14
To: juzhe.zh...@rivai.ai; Richard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
> Yeah... I just noticed. I should set it to 4 to fix it with the biggest VLEN
> size, that is, -march=rv64gcv_zvl4096b --param=riscv-autovec-lmul=m8...
>
> I am now confused about how to fix this case.
4 is definitely too high compared to a regular instruction.
vmv.vx could even be zero-cost for
ichard Biener
CC: rdapp.gcc; juzhe.zh...@rivai.ai; gcc-patches; kito.cheng; Kito.cheng;
jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
>> The slidedown/vmv.x.s part is of course vec_extract but we indeed
>> don't seem to cost it as vec_to_scalar here.
>
> It looks like a vectorized live operation as it's not in the loop body
> (and thus really irrelevant for costing in practice). This has
>
> /* ??? Enable for loop
On Thu, Jan 11, 2024 at 10:52 AM Robin Dapp wrote:
>
> On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> > Oh. I see I think I have done wrong here.
> >
> > I should adjust cost for VEC_EXTRACT not VEC_SET.
> >
> > But it's odd, I didn't see loop vectorizer is scanning scalar_to_vec
> > cost in
ichard Biener
CC: rdapp.gcc; gcc-patches; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> Oh. I see I think I have done wrong here.
>
> I should adjust cost for VEC_EXTRACT not VEC_SET.
>
> But it's odd, I didn't see loop vectorizer is scanning scalar_to_vec
> cost in vect.dump.
The slidedown/vmv.x.s part is of course vec_extract but
7:18
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
On Thu, Jan 11, 2024 at 9:24 AM Juzhe-Zhong wrote:
>
> This patch fixes the following inefficient vectorized codes:
>
> vsetvli a5,zero,e8,mf2,ta,ma
> li a2,17
> vid.v v1
> li a4,-32768
> vsetvli zero,zero,e16,m1,ta,ma
> addiw
This patch fixes the following inefficient vectorized codes:
vsetvli a5,zero,e8,mf2,ta,ma
li a2,17
vid.v v1
li a4,-32768
vsetvli zero,zero,e16,m1,ta,ma
addiw a4,a4,104
vmv.v.i v3,15
lui a1,%hi(a)
li