RE: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support
Committed, thanks Richard.

Pan

-----Original Message-----
From: Gcc-patches On Behalf Of Richard Biener via Gcc-patches
Sent: Thursday, May 25, 2023 9:06 PM
To: Richard Sandiford
Cc: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support

On Thu, 25 May 2023, Richard Sandiford wrote:

> This looks good to me.  Just a couple of very minor cosmetic things:
>
> juzhe.zh...@rivai.ai writes:
> > @@ -753,17 +846,35 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
> >           continue;
> >         }
> >
> > -      /* See whether zero-based IV would ever generate all-false masks
> > -         or zero length before wrapping around.  */
> > -      bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, rgc);
> > -
> > -      /* Set up all controls for this group.  */
> > -      test_ctrl = vect_set_loop_controls_directly (loop, loop_vinfo,
> > -                                                   _seq,
> > -                                                   _seq,
> > -                                                   loop_cond_gsi, rgc,
> > -                                                   niters, niters_skip,
> > -                                                   might_wrap_p);
> > +      if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) || !iv_rgc
> > +          || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
> > +              != rgc->max_nscalars_per_iter * rgc->factor))
>
> Coding style is to put each subcondition on a separate line when the
> whole condition doesn't fit on a single line.  So:
>
>       if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)
>           || !iv_rgc
>           || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
>               != rgc->max_nscalars_per_iter * rgc->factor))
>
> > @@ -2725,6 +2726,17 @@ start_over:
> >        && !vect_verify_loop_lens (loop_vinfo))
> >      LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
> >
> > +  /* If we're vectorizing an loop that uses length "controls" and
>
> s/an loop/a loop/
>
> (Sorry for not noticing earlier.)
>
> OK for trunk from my POV with those changes; no need to repost unless
> your policies require it.  Please give Richi a chance to comment too
> though.

LGTM as well.

Thanks,
Richard.
Re: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support
On Thu, 25 May 2023, Richard Sandiford wrote:

> This looks good to me.  Just a couple of very minor cosmetic things:
>
> juzhe.zh...@rivai.ai writes:
> > @@ -753,17 +846,35 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
> >           continue;
> >         }
> >
> > -      /* See whether zero-based IV would ever generate all-false masks
> > -         or zero length before wrapping around.  */
> > -      bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, rgc);
> > -
> > -      /* Set up all controls for this group.  */
> > -      test_ctrl = vect_set_loop_controls_directly (loop, loop_vinfo,
> > -                                                   _seq,
> > -                                                   _seq,
> > -                                                   loop_cond_gsi, rgc,
> > -                                                   niters, niters_skip,
> > -                                                   might_wrap_p);
> > +      if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) || !iv_rgc
> > +          || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
> > +              != rgc->max_nscalars_per_iter * rgc->factor))
>
> Coding style is to put each subcondition on a separate line when the
> whole condition doesn't fit on a single line.  So:
>
>       if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)
>           || !iv_rgc
>           || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
>               != rgc->max_nscalars_per_iter * rgc->factor))
>
> > @@ -2725,6 +2726,17 @@ start_over:
> >        && !vect_verify_loop_lens (loop_vinfo))
> >      LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
> >
> > +  /* If we're vectorizing an loop that uses length "controls" and
>
> s/an loop/a loop/
>
> (Sorry for not noticing earlier.)
>
> OK for trunk from my POV with those changes; no need to repost unless
> your policies require it.  Please give Richi a chance to comment too
> though.

LGTM as well.

Thanks,
Richard.
Re: Re: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support
Thanks Richard so much.

I have sent V17 patch for commit (fix format as you suggested).
You don't need to reply that. I am waiting for Richi's final approval.

Thanks.

juzhe.zh...@rivai.ai

From: Richard Sandiford
Date: 2023-05-25 20:36
To: juzhe.zhong
CC: gcc-patches; rguenther
Subject: Re: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support

This looks good to me.  Just a couple of very minor cosmetic things:

juzhe.zh...@rivai.ai writes:
> @@ -753,17 +846,35 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>           continue;
>         }
>
> -      /* See whether zero-based IV would ever generate all-false masks
> -         or zero length before wrapping around.  */
> -      bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, rgc);
> -
> -      /* Set up all controls for this group.  */
> -      test_ctrl = vect_set_loop_controls_directly (loop, loop_vinfo,
> -                                                   _seq,
> -                                                   _seq,
> -                                                   loop_cond_gsi, rgc,
> -                                                   niters, niters_skip,
> -                                                   might_wrap_p);
> +      if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) || !iv_rgc
> +          || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
> +              != rgc->max_nscalars_per_iter * rgc->factor))

Coding style is to put each subcondition on a separate line when the
whole condition doesn't fit on a single line.  So:

      if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)
          || !iv_rgc
          || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
              != rgc->max_nscalars_per_iter * rgc->factor))

> @@ -2725,6 +2726,17 @@ start_over:
>        && !vect_verify_loop_lens (loop_vinfo))
>      LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>
> +  /* If we're vectorizing an loop that uses length "controls" and

s/an loop/a loop/

(Sorry for not noticing earlier.)

OK for trunk from my POV with those changes; no need to repost unless
your policies require it.  Please give Richi a chance to comment too
though.

Thanks for your patience with the review process.  The final result
seems pretty clean to me.

Richard
Re: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support
This looks good to me.  Just a couple of very minor cosmetic things:

juzhe.zh...@rivai.ai writes:
> @@ -753,17 +846,35 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>           continue;
>         }
>
> -      /* See whether zero-based IV would ever generate all-false masks
> -         or zero length before wrapping around.  */
> -      bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, rgc);
> -
> -      /* Set up all controls for this group.  */
> -      test_ctrl = vect_set_loop_controls_directly (loop, loop_vinfo,
> -                                                   _seq,
> -                                                   _seq,
> -                                                   loop_cond_gsi, rgc,
> -                                                   niters, niters_skip,
> -                                                   might_wrap_p);
> +      if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) || !iv_rgc
> +          || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
> +              != rgc->max_nscalars_per_iter * rgc->factor))

Coding style is to put each subcondition on a separate line when the
whole condition doesn't fit on a single line.  So:

      if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)
          || !iv_rgc
          || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
              != rgc->max_nscalars_per_iter * rgc->factor))

> @@ -2725,6 +2726,17 @@ start_over:
>        && !vect_verify_loop_lens (loop_vinfo))
>      LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>
> +  /* If we're vectorizing an loop that uses length "controls" and

s/an loop/a loop/

(Sorry for not noticing earlier.)

OK for trunk from my POV with those changes; no need to repost unless
your policies require it.  Please give Richi a chance to comment too
though.

Thanks for your patience with the review process.  The final result
seems pretty clean to me.

Richard
[PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support
From: Ju-Zhe Zhong

This patch adds support for a decrementing IV, following the flow designed by Richard:

(1) In vect_set_loop_condition_partial_vectors, for the first iteration of:
    call vect_set_loop_controls_directly.

(2) vect_set_loop_controls_directly calculates "step" as in your patch.
    If rgc has 1 control, this step is the SSA name created for that control.
    Otherwise the step is a fresh SSA name, as in your patch.

(3) vect_set_loop_controls_directly stores this step somewhere for later
    use, probably in LOOP_VINFO.  Let's use "S" to refer to this stored step.

(4) After the vect_set_loop_controls_directly call above, and outside
    the "if" statement that now contains vect_set_loop_controls_directly,
    check whether rgc->controls.length () > 1.  If so, use
    vect_adjust_loop_lens_control to set the controls based on S.

Then the only caller of vect_adjust_loop_lens_control is
vect_set_loop_condition_partial_vectors.  And the starting
step for vect_adjust_loop_lens_control is always S.

This patch has been well tested for single-rgroup and multiple-rgroup (SLP)
cases and passes all testcases in the RISC-V port.

gcc/ChangeLog:

	* tree-vect-loop-manip.cc (vect_adjust_loop_lens_control): New function.
	(vect_set_loop_controls_directly): Add decrement IV support.
	(vect_set_loop_condition_partial_vectors): Ditto.
	* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): New variable.
	* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-3.c: New test.
	* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-4.c: New test.
	* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-4.c: New test.
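For readers new to the idea, the flow described above can be pictured with a small scalar model. This is only a sketch under stated assumptions, not the GIMPLE the vectorizer emits: select_len, add_one_decrementing and split_step are hypothetical names, the step is modelled as MIN (remaining, vf), and the way S is divided across several controls is one plausible reading of "set the controls based on S".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-iteration step computation.
   Assumption: the step is MIN (remaining, vf).  */
static size_t
select_len (size_t remaining, size_t vf)
{
  return remaining < vf ? remaining : vf;
}

/* Scalar model of a loop controlled by a decrementing IV.  The IV
   starts at N and is decremented by a variable amount, the step "S",
   until it reaches zero.  Adds 1 to each of the N elements of X and
   returns the number of "vector" iterations executed.  */
size_t
add_one_decrementing (int8_t *x, size_t n, size_t vf)
{
  assert (vf > 0);
  size_t remaining = n;		/* The decrementing IV.  */
  size_t iters = 0;

  while (remaining > 0)
    {
      size_t step = select_len (remaining, vf);	/* "S".  */

      /* Models one length-controlled vector operation.  */
      for (size_t k = 0; k < step; k++)
	x[k] += 1;

      x += step;
      remaining -= step;	/* Decrement by a variable amount.  */
      iters++;
    }
  return iters;
}

/* Assumed illustration of deriving several controls from one step:
   split STEP across NCTRLS lengths, each covering at most PER_CTRL
   elements, filling the earlier controls first.  */
void
split_step (size_t step, size_t per_ctrl, size_t nctrls, size_t *lens)
{
  for (size_t i = 0; i < nctrls; i++)
    {
      lens[i] = step < per_ctrl ? step : per_ctrl;
      step -= lens[i];
    }
}
```

With n = 10 and vf = 4 the model runs three iterations with steps 4, 4 and 2; a step of 6 split over two controls of capacity 4 gives lengths 4 and 2.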
---
 .../rvv/autovec/partial/multiple_rgroup-3.c   | 288 ++
 .../rvv/autovec/partial/multiple_rgroup-4.c   |  75 +
 .../autovec/partial/multiple_rgroup_run-3.c   |  36 +++
 .../autovec/partial/multiple_rgroup_run-4.c   |  15 +
 gcc/tree-vect-loop-manip.cc                   | 135 +++-
 gcc/tree-vect-loop.cc                         |  12 +
 gcc/tree-vectorizer.h                         |   8 +
 7 files changed, 557 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-4.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-3.c
new file mode 100644
index 000..9579749c285
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-3.c
@@ -0,0 +1,288 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */
+
+#include
+
+void __attribute__ ((noinline, noclone))
+f0 (int8_t *__restrict x, int16_t *__restrict y, int n)
+{
+  for (int i = 0, j = 0; i < n; i += 4, j += 8)
+    {
+      x[i + 0] += 1;
+      x[i + 1] += 2;
+      x[i + 2] += 3;
+      x[i + 3] += 4;
+      y[j + 0] += 1;
+      y[j + 1] += 2;
+      y[j + 2] += 3;
+      y[j + 3] += 4;
+      y[j + 4] += 5;
+      y[j + 5] += 6;
+      y[j + 6] += 7;
+      y[j + 7] += 8;
+    }
+}
+
+void __attribute__ ((optimize (0)))
+f0_init (int8_t *__restrict x, int8_t *__restrict x2, int16_t *__restrict y,
+         int16_t *__restrict y2, int n)
+{
+  for (int i = 0, j = 0; i < n; i += 4, j += 8)
+    {
+      x[i + 0] = i % 120;
+      x[i + 1] = i % 78;
+      x[i + 2] = i % 55;
+      x[i + 3] = i % 27;
+      y[j + 0] = j % 33;
+      y[j + 1] = j % 44;
+      y[j + 2] = j % 66;
+      y[j + 3] = j % 88;
+      y[j + 4] = j % 99;
+      y[j + 5] = j % 39;
+      y[j + 6] = j % 49;
+      y[j + 7] = j % 101;
+
+      x2[i + 0] = i % 120;
+      x2[i + 1] = i % 78;
+      x2[i + 2] = i % 55;
+      x2[i + 3] = i % 27;
+      y2[j + 0] = j % 33;
+      y2[j + 1] = j % 44;
+      y2[j + 2] = j % 66;
+      y2[j + 3] = j % 88;
+      y2[j + 4] = j % 99;
+      y2[j + 5] = j % 39;
+      y2[j + 6] = j % 49;
+      y2[j + 7] = j % 101;
+    }
+}
+
+void __attribute__ ((optimize (0)))
+f0_golden (int8_t *__restrict x, int16_t *__restrict y, int n)
+{
+  for (int i = 0, j = 0; i < n; i += 4, j += 8)
+    {
+      x[i + 0] += 1;
+      x[i + 1] += 2;
+      x[i + 2] += 3;
+      x[i + 3] += 4;
+      y[j + 0] += 1;
+      y[j + 1] += 2;
+      y[j + 2] += 3;
+      y[j + 3] += 4;
+      y[j + 4] += 5;
+      y[j + 5] += 6;
+      y[j + 6] += 7;
+      y[j + 7] += 8;
+    }
+}
+
+void __attribute__ ((optimize (0)))
+f0_check (int8_t *__restrict x, int8_t *__restrict x2, int16_t