Hi!

On 2024-03-28T08:00:50+0100, I wrote:
> On 2024-03-22T15:54:48+0000, Andrew Stubbs <a...@baylibre.com> wrote:
>> This patch alters the default (preferred) vector size to 32 on RDNA devices to
>> better match the actual hardware.  64-lane vectors will continue to be
>> used where they are hard-coded (such as function prologues).
>>
>> We run these devices in wavefrontsize64 for compatibility, but they actually
>> only have 32-lane vectors, natively.  If the upper part of a V64 is masked
>> off (as it is in V32) then RDNA devices will skip execution of the upper part
>> for most operations, so this adjustment shouldn't leave too much performance on
>> the table.  One exception is memory instructions, so full wavefrontsize32
>> support would be better.
>>
>> The advantage is that we avoid the missing V64 operations (such as permute and
>> vec_extract).
>>
>> Committed to mainline.
>
> In my GCN target '-march=gfx1100' testing, this commit
> "amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
> number of execution test FAILs (that is, regressions compared to earlier
> '-march=gfx90a' etc. testing).
>
> This commit also resolves (for my '-march=gfx1100' testing) one
> pre-existing FAIL (that is, one already seen in earlier '-march=gfx90a'
> etc. testing):
>
> PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
> [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto loop niter:\tNo-overflow"
>
> That means, this test case specifically (or, just its 'scan-tree-dump'?)
> needs to be adjusted for GCN V64 testing?
>
> This commit, as you'd also mentioned elsewhere, however also causes a
> number of regressions in 'gcc.target/gcn/gcn.exp', see list below.
>
> Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
> similar) in the affected test cases (let me know if you'd like me to
> 'git push' that), but I suppose something more elaborate may be in order?
> (Conditionalize those on 'target { ! gcn_rdna }', and add respective
> scanning for 'target gcn_rdna'?  I can help with effective-target
> 'gcn_rdna' (or similar), if you'd like me to.)
>
> And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
> such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
> 'gcn_vectorize_preferred_simd_mode'?

The latter I have quickly implemented, see attached
"GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'".  OK to
push to trunk branch?

(This '--param' will also be useful for another bug/regression I'm about
to file.)

> Best, probably, both these things, to properly test both V32 and V64?

That part remains to be done, but is best done by someone who actually
knows GCN assembly and the GCC back end -- that is, not me.


Regards
 Thomas


> PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64df3_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64sf3_exec 3
> PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
> PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test
>
> PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64df3_exec 3
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64sf3_exec 3
> PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
> PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test
>
> PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times > movv64df_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times > movv64sf_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times > smaxv64sf3 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times > smaxv64sf3 3 > PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test > > PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times > movv64df_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times > movv64sf_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times > smaxv64sf3 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times > smaxv64sf3 3 > PASS: gcc.target/gcn/cond_fmaxnm_4_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fmaxnm_4_run.c execution test > > PASS: gcc.target/gcn/cond_fmaxnm_5.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times > smaxv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times > smaxv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fmaxnm_5_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fmaxnm_5_run.c execution test > > PASS: gcc.target/gcn/cond_fmaxnm_6.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. 
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times > smaxv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times > smaxv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fmaxnm_6_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fmaxnm_6_run.c execution test > > PASS: gcc.target/gcn/cond_fmaxnm_7.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times > smaxv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times > smaxv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fmaxnm_7_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fmaxnm_7_run.c execution test > > PASS: gcc.target/gcn/cond_fmaxnm_8.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times > smaxv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times > smaxv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fmaxnm_8_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fmaxnm_8_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_1.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times > sminv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times > sminv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fminnm_1_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_1_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_2.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. 
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times > sminv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times > sminv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fminnm_2_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_2_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_3.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times > movv64df_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times > movv64sf_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times > sminv64sf3 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times > sminv64sf3 3 > PASS: gcc.target/gcn/cond_fminnm_3_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_3_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_4.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times > movv64df_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times > movv64sf_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times > sminv64sf3 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times > sminv64sf3 3 > PASS: gcc.target/gcn/cond_fminnm_4_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_4_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_5.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. 
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times > sminv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times > sminv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fminnm_5_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_5_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_6.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times > sminv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times > sminv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fminnm_6_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_6_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_7.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times > sminv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times > sminv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fminnm_7_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_7_run.c execution test > > PASS: gcc.target/gcn/cond_fminnm_8.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_.. 
> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times > sminv64df3_exec 3 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times > sminv64sf3_exec 3 > PASS: gcc.target/gcn/cond_fminnm_8_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_fminnm_8_run.c execution test > > @@ -124634,12 +124634,12 @@ PASS: gcc.target/gcn/cond_shift_3.c > scan-assembler-not movv64di_exec/2 > PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not v_cndmask_b32 > PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times > \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1 > PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times > \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times > vashlv64di3_exec 2 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times > vashlv64si3_exec 18 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times > vashrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times > vashrv64si3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times > vlshrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times > vlshrv64si3_exec 1 > PASS: gcc.target/gcn/cond_shift_3_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_shift_3_run.c execution test > > PASS: gcc.target/gcn/cond_shift_4.c (test for excess errors) > @@ -124647,77 +124647,77 @@ PASS: gcc.target/gcn/cond_shift_4.c > scan-assembler-not movv64di_exec/2 > PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not v_cndmask_b32 > PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times > \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1 > PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times > \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times > vashlv64di3_exec 2 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times > 
vashlv64si3_exec 18 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times > vashrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times > vashrv64si3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times > vlshrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times > vlshrv64si3_exec 1 > PASS: gcc.target/gcn/cond_shift_4_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_shift_4_run.c execution test > > PASS: gcc.target/gcn/cond_shift_8.c (test for excess errors) > PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64di_exec/0 > PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64si_exec/0 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times > vashlv64di3_exec 2 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times > vashlv64si3_exec 18 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times > vashrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times > vashrv64si3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times > vlshrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times > vlshrv64si3_exec 1 > PASS: gcc.target/gcn/cond_shift_8_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_shift_8_run.c execution test > > PASS: gcc.target/gcn/cond_shift_9.c (test for excess errors) > PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64di_exec/1 > PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64si_exec/2 > PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not v_cndmask_b32 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times > vashlv64di3_exec 2 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times > vashlv64si3_exec 18 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times > vashrv64di3_exec 1 > [-PASS:-]{+FAIL:+} 
gcc.target/gcn/cond_shift_9.c scan-assembler-times > vashrv64si3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times > vlshrv64di3_exec 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times > vlshrv64si3_exec 1 > PASS: gcc.target/gcn/cond_shift_9_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_shift_9_run.c execution test > > PASS: gcc.target/gcn/cond_smax_1.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-not > \\ts_cmpk_lg_u32\\tvcc_lo, 0 > PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not > \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ > PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0 > PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0 > PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times > \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80 > PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times > \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10 > PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times > \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times > smaxv64si3_exec 30 > PASS: gcc.target/gcn/cond_smax_1_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_smax_1_run.c execution test > > PASS: gcc.target/gcn/cond_smin_1.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-not > \\ts_cmpk_lg_u32\\tvcc_lo, 0 > PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not > \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ > PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0 > PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0 > PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times > \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80 > PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times > \\tv_cmp_lt_i64\\tvcc, 
v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10 > PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times > \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times > sminv64si3_exec 30 > PASS: gcc.target/gcn/cond_smin_1_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_smin_1_run.c execution test > > PASS: gcc.target/gcn/cond_umax_1.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-not > \\ts_cmpk_lg_u32\\tvcc_lo, 0 > PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0 > PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times > \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56 > PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times > \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8 > PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times > \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times > umaxv64si3_exec 20 > PASS: gcc.target/gcn/cond_umax_1_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_umax_1_run.c execution test > > PASS: gcc.target/gcn/cond_umin_1.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-not > \\ts_cmpk_lg_u32\\tvcc_lo, 0 > PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not > \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0 > PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not uminv64si3/0 > [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times > \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56 > PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times > \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8 > PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times > \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8 > 
[-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times > uminv64si3_exec 20 > PASS: gcc.target/gcn/cond_umin_1_run.c (test for excess errors) > PASS: gcc.target/gcn/cond_umin_1_run.c execution test > > PASS: gcc.target/gcn/simd-math-1.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_acos" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_acosh" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_asin" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_asinh" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_atan" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_atan2" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_atanh" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_copysign" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_cos" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_cosh" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_erf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_exp" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_exp2" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_fmod" > XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_gamma" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_hypot" > XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_lgamma" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_log" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_log10" > XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log2" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c 
scan-tree-dump vect > "v64df_pow" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_remainder" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_rint" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_scalb" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_significand" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_sin" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_sinh" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_sqrt" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_tan" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_tanh" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64df_tgamma" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_acosf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_acoshf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_asinf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_asinhf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_atan2f" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_atanf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_atanhf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_copysignf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_cosf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_coshf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_erff" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_exp2f" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump 
vect > "v64sf_expf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_fmodf" > XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_gammaf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_hypotf" > XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_lgammaf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_log10f" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_log2f" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_logf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_powf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_remainderf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_rintf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_scalbf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_significandf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_sinf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_sinhf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_sqrtf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_tanf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_tanhf" > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect > "v64sf_tgammaf" > > @@ -125130,7 +125130,7 @@ PASS: gcc.target/gcn/simd-math-5-char-run.c > (test for excess errors) > PASS: gcc.target/gcn/simd-math-5-char-run.c execution test > PASS: gcc.target/gcn/simd-math-5-char.c (test for excess errors) > XFAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times > __divmodv64si4@rel32@lo 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-char.c scan-assembler-times > __divv64hi3@rel32@lo 1 > PASS: 
gcc.target/gcn/simd-math-5-char.c scan-assembler-times > __divv64qi3@rel32@lo 0 > FAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times > __modv64qi3@rel32@lo 1 > PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times > __udivv64qi3@rel32@lo 0 > > @@ -125171,8 +125171,8 @@ PASS: gcc.target/gcn/simd-math-5-long-run.c > (test for excess errors) > PASS: gcc.target/gcn/simd-math-5-long-run.c execution test > PASS: gcc.target/gcn/simd-math-5-long.c (test for excess errors) > XFAIL: gcc.target/gcn/simd-math-5-long.c scan-assembler-times > __divmodv64di4@rel32@lo 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times > __divv64di3@rel32@lo 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times > __modv64di3@rel32@lo 1 > PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times > __udivv64di3@rel32@lo 0 > PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times > __umodv64di3@rel32@lo 0 > > PASS: gcc.target/gcn/simd-math-5-short.c (test for excess errors) > XFAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times > __divmodv64si4@rel32@lo 1 > PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times > __divv64hi3@rel32@lo 0 > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-short.c > scan-assembler-times __divv64si3@rel32@lo 1 > FAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times > __modv64hi3@rel32@lo 1 > PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times > __udivv64hi3@rel32@lo 0 > PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times > __umodv64hi3@rel32@lo 0 > > PASS: gcc.target/gcn/simd-math-5.c (test for excess errors) > XFAIL: gcc.target/gcn/simd-math-5.c scan-assembler-times > __divmodv64si4@rel32@lo 1 > PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __divsi3@rel32@lo > 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times > __divv64si3@rel32@lo 1 > [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times > 
__modv64si3@rel32@lo 1 > PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times > __udivmodv64si4@rel32@lo 0 > PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times > __udivsi3@rel32@lo 0 > PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times > __udivv64si3@rel32@lo 0 > @@ -125242,13 +125242,13 @@ PASS: gcc.target/gcn/simd-math-5.c > scan-assembler-times __umodv64si3@rel32@lo 0 > > PASS: gcc.target/gcn/smax_1.c (test for excess errors) > PASS: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, > v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10 > FAIL: gcc.target/gcn/smax_1.c scan-assembler-times > \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80 > [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times > vec_cmpv64didi 10 > PASS: gcc.target/gcn/smax_1_run.c (test for excess errors) > PASS: gcc.target/gcn/smax_1_run.c execution test > > PASS: gcc.target/gcn/smin_1.c (test for excess errors) > PASS: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, > v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10 > FAIL: gcc.target/gcn/smin_1.c scan-assembler-times > \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80 > [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times > vec_cmpv64didi 10 > PASS: gcc.target/gcn/smin_1_run.c (test for excess errors) > PASS: gcc.target/gcn/smin_1_run.c execution test > > PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler > (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift) > > PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler > (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift) > > PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors) > [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler > (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift) > > PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors) > [-PASS:-]{+FAIL:+} 
gcc.target/gcn/sram-ecc-8.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
>
> PASS: gcc.target/gcn/umax_1.c (test for excess errors)
> PASS: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
> FAIL: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
> [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times vec_cmpv64didi 8
> PASS: gcc.target/gcn/umax_1_run.c (test for excess errors)
> PASS: gcc.target/gcn/umax_1_run.c execution test
>
> PASS: gcc.target/gcn/umin_1.c (test for excess errors)
> PASS: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
> FAIL: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
> [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times vec_cmpv64didi 8
> PASS: gcc.target/gcn/umin_1_run.c (test for excess errors)
> PASS: gcc.target/gcn/umin_1_run.c execution test
>
>
> Regards
>  Thomas
>
>
>> gcc/ChangeLog:
>>
>>         * config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
>>         RDNA devices.
>> ---
>>  gcc/config/gcn/gcn.cc | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
>> index 498146dcde9..efb73af50c4 100644
>> --- a/gcc/config/gcn/gcn.cc
>> +++ b/gcc/config/gcn/gcn.cc
>> @@ -5226,6 +5226,32 @@ gcn_vector_mode_supported_p (machine_mode mode)
>>  static machine_mode
>>  gcn_vectorize_preferred_simd_mode (scalar_mode mode)
>>  {
>> +  /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors
>> +     (in particular, permute operations are only available for cases that don't
>> +     span the 32-lane boundary).
>> +
>> +     From the RDNA3 manual: "Hardware may choose to skip either half if the
>> +     EXEC mask for that half is all zeros...".  This means that preferring
>> +     32-lanes is a good stop-gap until we have proper wave32 support.  */
>> +  if (TARGET_RDNA2_PLUS)
>> +    switch (mode)
>> +      {
>> +      case E_QImode:
>> +        return V32QImode;
>> +      case E_HImode:
>> +        return V32HImode;
>> +      case E_SImode:
>> +        return V32SImode;
>> +      case E_DImode:
>> +        return V32DImode;
>> +      case E_SFmode:
>> +        return V32SFmode;
>> +      case E_DFmode:
>> +        return V32DFmode;
>> +      default:
>> +        return word_mode;
>> +      }
>> +
>>    switch (mode)
>>      {
>>      case E_QImode:
>> --
>> 2.41.0
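
For readers following along without the attachment, the override idea can be sketched in isolation.  This is only a minimal, standalone model and not the code from the attached patch: the 'ScalarMode' and 'LaneWidthParam' enums, the 'preferred_lanes' and 'preferred_simd_mode' helpers, and the string return values are all hypothetical stand-ins for GCC's 'machine_mode' machinery, and the non-RDNA default of 64 lanes (with 'word_mode' as fallback) is an assumption based on the quoted diff.

```cpp
#include <string>

// Hypothetical stand-ins for GCC's scalar modes; not the real machine_mode enum.
enum class ScalarMode { QI, HI, SI, DI, SF, DF, Other };

// Hypothetical model of '--param=gcn-preferred-vector-lane-width=[default,32,64]'.
enum class LaneWidthParam { Default, W32, W64 };

// Pick the lane count: an explicit parameter value wins; otherwise RDNA
// devices get 32 lanes and all other devices are assumed to get 64.
int preferred_lanes(bool is_rdna, LaneWidthParam param)
{
  switch (param)
    {
    case LaneWidthParam::W32:
      return 32;
    case LaneWidthParam::W64:
      return 64;
    case LaneWidthParam::Default:
      break;
    }
  return is_rdna ? 32 : 64;
}

// Return the name of the preferred vector mode for a scalar mode, for
// example "V32SImode".  Unhandled scalar modes fall back to "word_mode",
// mirroring the 'default:' case in the quoted diff.
std::string preferred_simd_mode(ScalarMode mode, bool is_rdna,
                                LaneWidthParam param)
{
  const char *suffix;
  switch (mode)
    {
    case ScalarMode::QI: suffix = "QI"; break;
    case ScalarMode::HI: suffix = "HI"; break;
    case ScalarMode::SI: suffix = "SI"; break;
    case ScalarMode::DI: suffix = "DI"; break;
    case ScalarMode::SF: suffix = "SF"; break;
    case ScalarMode::DF: suffix = "DF"; break;
    default:
      return "word_mode";
    }
  return "V" + std::to_string(preferred_lanes(is_rdna, param)) + suffix + "mode";
}
```

In this model, 'preferred_simd_mode (ScalarMode::SI, true, LaneWidthParam::Default)' yields "V32SImode", whereas passing 'LaneWidthParam::W64' for the same RDNA device yields "V64SImode", which is the behavior the 64-lane-specific 'scan-assembler' test cases would rely on.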
From 9282ea8b064bc22866edb11fc422a85d3298a6b3 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tschwi...@baylibre.com>
Date: Sat, 24 Feb 2024 00:29:14 +0100
Subject: [PATCH] GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'

..., and specify '--param=gcn-preferred-vector-lane-width=64' for
'gcc.target/gcn/[...]' test cases with 'scan-assembler' directives that are
specific to 64-lane vectors.  This resolves regressions introduced in
commit 6dedafe166cc02ae87b6a0699ad61ce3ffc46803
"amdgcn: Prefer V32 on RDNA devices".

        gcc/
        * config/gcn/gcn.opt (--param=gcn-preferred-vector-lane-width): New.
        * config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Use it.
        * doc/invoke.texi (Optimize Options): Document it.
        gcc/testsuite/
        * gcc.target/gcn/cond_fmaxnm_1.c: Specify
        '--param=gcn-preferred-vector-lane-width=64'.
        * gcc.target/gcn/cond_fmaxnm_2.c: Likewise.
        * gcc.target/gcn/cond_fmaxnm_3.c: Likewise.
        * gcc.target/gcn/cond_fmaxnm_4.c: Likewise.
        * gcc.target/gcn/cond_fmaxnm_5.c: Likewise.
        * gcc.target/gcn/cond_fmaxnm_6.c: Likewise.
        * gcc.target/gcn/cond_fmaxnm_7.c: Likewise.
        * gcc.target/gcn/cond_fmaxnm_8.c: Likewise.
        * gcc.target/gcn/cond_fminnm_1.c: Likewise.
        * gcc.target/gcn/cond_fminnm_2.c: Likewise.
        * gcc.target/gcn/cond_fminnm_3.c: Likewise.
        * gcc.target/gcn/cond_fminnm_4.c: Likewise.
        * gcc.target/gcn/cond_fminnm_5.c: Likewise.
        * gcc.target/gcn/cond_fminnm_6.c: Likewise.
        * gcc.target/gcn/cond_fminnm_7.c: Likewise.
        * gcc.target/gcn/cond_fminnm_8.c: Likewise.
        * gcc.target/gcn/cond_shift_3.c: Likewise.
        * gcc.target/gcn/cond_shift_4.c: Likewise.
        * gcc.target/gcn/cond_shift_8.c: Likewise.
        * gcc.target/gcn/cond_shift_9.c: Likewise.
        * gcc.target/gcn/cond_smax_1.c: Likewise.
        * gcc.target/gcn/cond_smin_1.c: Likewise.
        * gcc.target/gcn/cond_umax_1.c: Likewise.
        * gcc.target/gcn/cond_umin_1.c: Likewise.
        * gcc.target/gcn/simd-math-1.c: Likewise.
        * gcc.target/gcn/simd-math-5-char.c: Likewise.
        * gcc.target/gcn/simd-math-5-long.c: Likewise.
* gcc.target/gcn/simd-math-5-short.c: Likewise. * gcc.target/gcn/simd-math-5.c: Likewise. * gcc.target/gcn/smax_1.c: Likewise. * gcc.target/gcn/smin_1.c: Likewise. * gcc.target/gcn/umax_1.c: Likewise. * gcc.target/gcn/umin_1.c: Likewise. --- gcc/config/gcn/gcn.cc | 14 +++++++++++++- gcc/config/gcn/gcn.opt | 16 ++++++++++++++++ gcc/doc/invoke.texi | 8 ++++++++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_shift_3.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_shift_4.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_shift_8.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_shift_9.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_smax_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_smin_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_umax_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/cond_umin_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/simd-math-1.c | 3 ++- gcc/testsuite/gcc.target/gcn/simd-math-5-char.c | 3 +++ gcc/testsuite/gcc.target/gcn/simd-math-5-long.c | 3 +++ gcc/testsuite/gcc.target/gcn/simd-math-5-short.c | 3 +++ gcc/testsuite/gcc.target/gcn/simd-math-5.c | 3 +++ gcc/testsuite/gcc.target/gcn/smax_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/smin_1.c | 2 ++ gcc/testsuite/gcc.target/gcn/umax_1.c | 
2 ++ gcc/testsuite/gcc.target/gcn/umin_1.c | 2 ++ 36 files changed, 107 insertions(+), 2 deletions(-) diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc index 700e554855e..666f9fdebb2 100644 --- a/gcc/config/gcn/gcn.cc +++ b/gcc/config/gcn/gcn.cc @@ -5231,6 +5231,14 @@ gcn_vector_mode_supported_p (machine_mode mode) static machine_mode gcn_vectorize_preferred_simd_mode (scalar_mode mode) { + bool v32; + if (gcn_preferred_vector_lane_width == 32) + v32 = true; + else if (gcn_preferred_vector_lane_width == 64) + v32 = false; + else if (gcn_preferred_vector_lane_width != -1) + gcc_unreachable (); + else if (TARGET_RDNA2_PLUS) /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors (in particular, permute operations are only available for cases that don't span the 32-lane boundary). @@ -5238,7 +5246,11 @@ gcn_vectorize_preferred_simd_mode (scalar_mode mode) From the RDNA3 manual: "Hardware may choose to skip either half if the EXEC mask for that half is all zeros...". This means that preferring 32-lanes is a good stop-gap until we have proper wave32 support. */ - if (TARGET_RDNA2_PLUS) + v32 = true; + else + v32 = false; + + if (v32) switch (mode) { case E_QImode: diff --git a/gcc/config/gcn/gcn.opt b/gcc/config/gcn/gcn.opt index 1067b45f294..4c0ab2b19ee 100644 --- a/gcc/config/gcn/gcn.opt +++ b/gcc/config/gcn/gcn.opt @@ -116,3 +116,19 @@ Compile for devices requiring XNACK enabled. Default \"any\" if USM is supported msram-ecc= Target RejectNegative Joined ToLower Enum(hsaco_attr_type) Var(flag_sram_ecc) Init(HSACO_ATTR_ANY) Compile for devices with the SRAM ECC feature enabled, or not. Default \"any\". + +-param=gcn-preferred-vector-lane-width= +Target Joined Enum(gcn_preferred_vector_lane_width) Var(gcn_preferred_vector_lane_width) Init(-1) Param +--param=gcn-preferred-vector-lane-width=[default,32,64] Preferred vector lane width. 
+ +Enum +Name(gcn_preferred_vector_lane_width) Type(int) + +EnumValue +Enum(gcn_preferred_vector_lane_width) String(default) Value(-1) + +EnumValue +Enum(gcn_preferred_vector_lane_width) String(32) Value(32) + +EnumValue +Enum(gcn_preferred_vector_lane_width) String(64) Value(64) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index e5acab24cae..9b9085f7167 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -17016,6 +17016,14 @@ loop. The default value is four. @end table +The following choices of @var{name} are available on GCN targets: + +@table @gcctabopt +@item gcn-preferred-vector-lane-width +Preferred vector lane width: @samp{default}, @samp{32}, @samp{64}. + +@end table + The following choices of @var{name} are available on i386 and x86_64 targets: @table @gcctabopt diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c index 17c49bdc518..c36967015e2 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c index 406df48962a..21afb77a785 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c index 45b8b7883ba..ee36f28be35 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c index 416aea89e6e..a73e6e7c017 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c index a4d7ab991de..73a1a736b87 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include "cond_fmaxnm_1.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c index 6c64a01bcbb..9ba5f5b318f 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include "cond_fmaxnm_2.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c index bdb3f2f99ef..68646edeb58 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include "cond_fmaxnm_3.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c index c11633b5236..f3c9e5fc097 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include "cond_fmaxnm_4.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c index bb456887568..66f45a33852 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_1.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c index 502f8987494..87c788a59ec 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_2.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c index 2ea1eb2ec2c..d24a2ab1a80 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_3.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c index 3673ecafc2d..a4fa6bfaf07 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_4.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c index ac98941a373..a55d241a1cb 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_1.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c index 7f4dba0d314..b34ed424914 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_2.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c index 5faf0c5cc59..7c8bd8ce14a 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_3.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c index 89d93ac596a..ecd18812caa 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c +++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #define FN(X) __builtin_fmin##X #include "cond_fmaxnm_4.c" diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_3.c b/gcc/testsuite/gcc.target/gcn/cond_shift_3.c index 983386c1464..d41796ece9c 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_shift_3.c +++ b/gcc/testsuite/gcc.target/gcn/cond_shift_3.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_4.c b/gcc/testsuite/gcc.target/gcn/cond_shift_4.c index c610363d9df..bb22375b6b8 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_shift_4.c +++ b/gcc/testsuite/gcc.target/gcn/cond_shift_4.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_8.c b/gcc/testsuite/gcc.target/gcn/cond_shift_8.c index 0749e2e5e53..634b3c0a7e5 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_shift_8.c +++ b/gcc/testsuite/gcc.target/gcn/cond_shift_8.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_9.c b/gcc/testsuite/gcc.target/gcn/cond_shift_9.c index 61aba27504e..77a43fecf08 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_shift_9.c +++ b/gcc/testsuite/gcc.target/gcn/cond_shift_9.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_smax_1.c b/gcc/testsuite/gcc.target/gcn/cond_smax_1.c index 342b5e827d2..bc80edf2a79 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_smax_1.c +++ b/gcc/testsuite/gcc.target/gcn/cond_smax_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_smin_1.c b/gcc/testsuite/gcc.target/gcn/cond_smin_1.c index ad8b583448b..3df0d8beac5 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_smin_1.c +++ b/gcc/testsuite/gcc.target/gcn/cond_smin_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_umax_1.c b/gcc/testsuite/gcc.target/gcn/cond_umax_1.c index 389228f9e4a..f573bbe27ae 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_umax_1.c +++ b/gcc/testsuite/gcc.target/gcn/cond_umax_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/cond_umin_1.c b/gcc/testsuite/gcc.target/gcn/cond_umin_1.c index 65759d695ad..73b4089adcd 100644 --- a/gcc/testsuite/gcc.target/gcn/cond_umin_1.c +++ b/gcc/testsuite/gcc.target/gcn/cond_umin_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-1.c b/gcc/testsuite/gcc.target/gcn/simd-math-1.c index 6868ccb2c54..a4346f866b7 100644 --- a/gcc/testsuite/gcc.target/gcn/simd-math-1.c +++ b/gcc/testsuite/gcc.target/gcn/simd-math-1.c @@ -2,7 +2,8 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -fno-math-errno -mstack-size=3000000 -fdump-tree-vect" } */ - +/* The 'scan-tree-dump' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #undef PRINT_RESULT #define VERBOSE 0 diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c b/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c index 2321c8390c6..5011839e29f 100644 --- a/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c +++ b/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c @@ -1,3 +1,6 @@ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ + #define TYPE char #include "simd-math-5.c" diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c b/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c index 37b6cef691e..baf76171ded 100644 --- a/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c +++ b/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c @@ -1,3 +1,6 @@ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ + #define TYPE long #include "simd-math-5.c" diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c b/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c index 84cdc9b5fdd..fd87bf51ffc 100644 --- a/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c +++ b/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c @@ -1,3 +1,6 @@ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ + #define TYPE short #include "simd-math-5.c" diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5.c b/gcc/testsuite/gcc.target/gcn/simd-math-5.c index bc181b45e1b..65916ddfdda 100644 --- a/gcc/testsuite/gcc.target/gcn/simd-math-5.c +++ b/gcc/testsuite/gcc.target/gcn/simd-math-5.c @@ -1,6 +1,9 @@ /* Test that the auto-vectorizer uses the libgcc vectorized division and modulus functions. */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ + /* Setting it this way ensures the run tests use the same flag as the compile tests. */ #pragma GCC optimize("O2") diff --git a/gcc/testsuite/gcc.target/gcn/smax_1.c b/gcc/testsuite/gcc.target/gcn/smax_1.c index 46c21f73132..9ce2064b7d2 100644 --- a/gcc/testsuite/gcc.target/gcn/smax_1.c +++ b/gcc/testsuite/gcc.target/gcn/smax_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/smin_1.c b/gcc/testsuite/gcc.target/gcn/smin_1.c index 8d6edfaa3d1..2df1490381d 100644 --- a/gcc/testsuite/gcc.target/gcn/smin_1.c +++ b/gcc/testsuite/gcc.target/gcn/smin_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. 
+ { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/umax_1.c b/gcc/testsuite/gcc.target/gcn/umax_1.c index dc4b9842d9a..4f6217dd6e4 100644 --- a/gcc/testsuite/gcc.target/gcn/umax_1.c +++ b/gcc/testsuite/gcc.target/gcn/umax_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> diff --git a/gcc/testsuite/gcc.target/gcn/umin_1.c b/gcc/testsuite/gcc.target/gcn/umin_1.c index d07f7ec083b..01b9ba1a225 100644 --- a/gcc/testsuite/gcc.target/gcn/umin_1.c +++ b/gcc/testsuite/gcc.target/gcn/umin_1.c @@ -1,5 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2 -ftree-vectorize -dp" } */ +/* The 'scan-assembler' directives are specific to 64-lane vectors. + { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */ #include <stdint.h> -- 2.34.1
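For reference, the precedence that the patch above gives to the new param in 'gcn_vectorize_preferred_simd_mode' can be sketched standalone as below. This is only an illustration, not GCC code: 'preferred_lanes', the 'LANE_WIDTH_*' names, and the 'is_rdna' flag (standing in for 'TARGET_RDNA2_PLUS') are made up for the example, and it assumes that the non-V32 path keeps returning the 64-lane modes, as before the RDNA change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the values mirror the EnumValue entries that the
   patch adds to gcn.opt (default = -1, 32, 64).  */
enum lane_width_param
{
  LANE_WIDTH_DEFAULT = -1,
  LANE_WIDTH_32 = 32,
  LANE_WIDTH_64 = 64
};

/* Return the preferred number of vector lanes.  An explicit param value
   always wins; only 'default' falls back to the per-device choice.
   'is_rdna' is an assumption standing in for TARGET_RDNA2_PLUS.  */
int
preferred_lanes (int param, bool is_rdna)
{
  if (param == LANE_WIDTH_32)
    return 32;
  if (param == LANE_WIDTH_64)
    return 64;

  /* Anything else must be 'default'; the patch uses gcc_unreachable ()
     for other values.  */
  assert (param == LANE_WIDTH_DEFAULT);

  /* RDNA devices prefer 32 lanes; everything else stays at 64.  */
  return is_rdna ? 32 : 64;
}
```

So a test case that passes '--param=gcn-preferred-vector-lane-width=64' corresponds to 'preferred_lanes (LANE_WIDTH_64, true)' in this sketch, which yields 64 even for an RDNA '-march', which is what the 64-lane-specific 'scan-assembler' directives rely on.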