paulwalker-arm added a comment.

In D112406#3087191 <https://reviews.llvm.org/D112406#3087191>, @dmgreen wrote:

> Thanks. If the cpu has a 512 bit total vector bandwidth, should the
> VScaleForTuning be 1 or 2 (or higher)? llvm doesn't usually deal with total
> bandwidth a lot, perhaps not as much as it should.
>
> @david-arm any thoughts?

The total vector bandwidth includes unrolling so currently having `VScaleForTuning=1` and `MaxInterleaveFactor=4` implies 512 tvb. If the target has >128bit vectors then vector loops will likely have more work than they can handle in parallel but as long as that does not negatively affect register pressure it shouldn't be a problem.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112406/new/

https://reviews.llvm.org/D112406

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
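
For readers checking the arithmetic in the comment above, here is a minimal sketch of how the 512-bit figure can be read. It assumes total vector bandwidth is the 128-bit SVE granule multiplied by the vscale value and the interleave factor. The helper `totalVectorBandwidthBits` and the constant `kSVEGranuleBits` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cstdint>

// An SVE vector register is vscale x 128 bits wide, so 128 bits is the
// width when vscale == 1.
constexpr std::uint32_t kSVEGranuleBits = 128;

// Hypothetical helper (not part of LLVM): bits of vector work per loop
// iteration, assuming bandwidth = granule * vscale * interleave factor.
constexpr std::uint32_t totalVectorBandwidthBits(std::uint32_t VScale,
                                                 std::uint32_t InterleaveFactor) {
  return kSVEGranuleBits * VScale * InterleaveFactor;
}

// VScaleForTuning=1 with MaxInterleaveFactor=4, as in the comment.
static_assert(totalVectorBandwidthBits(1, 4) == 512,
              "tuning for vscale 1 and interleaving by 4 covers 512 bits");

// Same interleave factor on hardware whose vectors are actually 256 bits
// wide (vscale 2): each iteration now carries 1024 bits of work.
static_assert(totalVectorBandwidthBits(2, 4) == 1024,
              "wider vectors mean more work per iteration than 512 bits");
```

Under this reading, a target with vectors wider than 128 bits is handed more work per iteration than a 512-bit machine can execute in parallel, which matches the comment's point that this is acceptable as long as register pressure does not suffer.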