The current generic cost model makes dynamic-lmul2-7.c generate inferior codegen.

I found that if I tweak the cost a little bit, the dynamic-lmul2-7.c codegen can be recovered. However, that makes other tests fail... It's a complicated story. So I'd rather set it as the default cost and switch to it. Then we can tune the cost gradually: not only fix the issues we face (e.g. SHA256), but also guarantee that no matter how we tweak the costs later, we won't hurt the codegen of the current tests.

It's true that we could keep the current default cost model (default_builtin_vectorization_cost) and tweak the generic cost model instead, for example by adding a testcase for SHA256 and testing it with -mtune=generic-ooo. But then the question is: how do you know whether there is a regression on the current testsuite with -mtune=generic-ooo? Note that we can tweak the generic vector cost model to fix the SHA256 issue easily, but we should also make sure we don't introduce regressions on the current testsuite with the new cost model. So I switched the cost model.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2024-01-10 23:04
To: 钟居哲; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng; Jeff Law
Subject: Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

On 1/10/24 15:40, 钟居哲 wrote:
> I need to add these costs for segment load/stores:
>
> /* Generic costs for VLA vector operations.  */
> static const scalable_vector_cost generic_vla_vector_cost = {
>   {
>     1, /* int_stmt_cost */
>     1, /* fp_stmt_cost */
>     1, /* gather_load_cost */
>     1, /* scatter_store_cost */
>     1, /* vec_to_scalar_cost */
>     1, /* scalar_to_vec_cost */
>     1, /* permute_cost */
>     1, /* align_load_cost */
>     1, /* align_store_cost */
>     2, /* unalign_load_cost */
>     2, /* unalign_store_cost */
>   },
>   2, /* vlseg2_vsseg2_permute_cost */
>   2, /* vlseg3_vsseg3_permute_cost */
>   3, /* vlseg4_vsseg4_permute_cost */
>   3, /* vlseg5_vsseg5_permute_cost */
>   4, /* vlseg6_vsseg6_permute_cost */
>   4, /* vlseg7_vsseg7_permute_cost */
>   4, /* vlseg8_vsseg8_permute_cost */
> };
>
> to fix the SLP issues in the following patches.
>
> If you don't allow me to switch to the generic vector cost model and tune it,
> how can I fix the FAILs of the slp-*.c cases?
>
> Currently, I let all the slp-*.c tests XFAIL, which is definitely incorrect.

Of course we don't want those XFAILs. It's not a matter of "allowing" or not, but rather that I'd like to understand the reasoning. The patch itself seems reasonable to me apart from my not really getting the intention. Your main point seems to be

> +  const cpu_vector_cost *costs = tune_param->vec_costs;
> +  if (!costs)
> +    return &generic_vector_cost;

and that is fine. What's not clear is whether changing the actual costs is a temporary thing or whether it is supposed to be another fallback. If they are going to be changed anyway, why do we need to revert to the default model now?

As discussed yesterday, increased permute costs and vec_to_scalar costs make sense, to first order. Is that because of dynamic-lmul2-7.c?

Generally we need to make the costs dependent on the type or mode, of course, just as we started to do with the latencies. Permute is particularly sensitive, as you already gathered.

Regards
 Robin