ro-i wrote: I realized I had a testing issue. I had carried the result verification over from my reduction tests into every test run (I always wanted extra guards against race conditions etc.). In my new non-reduction test cases, however, that hurts testing speed because the checks are O(n). As a result, I had previously only tested small N (4,096 or 65,535) instead of my usual default of 177,777,777.
With that N, we get the following perf change for **non-reduction** workloads:

```
misc_stencil double
  change for 208 teams: +47.45%
  change for 10400 teams: -30.69%
misc_elem_func double
  change for 208 teams: +42.02%
  change for 10400 teams: +63.05%
misc_elem_loop double
  change for 208 teams: +32.79%
  change for 10400 teams: -19.86%
misc_linalg double
  change for 208 teams: +31.96%
  change for 10400 teams: -20.79%
misc_particle double
  change for 208 teams: +13.89%
  change for 10400 teams: +3.09%
misc_stencil uint
  change for 208 teams: +36.12%
  change for 10400 teams: -0.79%
misc_elem_func uint
  change for 208 teams: +117.16%
  change for 10400 teams: +16.39%
misc_elem_loop uint
  change for 208 teams: +37.26%
  change for 10400 teams: +26.12%
misc_linalg uint
  change for 208 teams: +36.46%
  change for 10400 teams: +23.19%
misc_particle uint
  change for 208 teams: +10.88%
  change for 10400 teams: -0.22%
misc_stencil ulong
  change for 208 teams: +45.55%
  change for 10400 teams: -31.35%
misc_elem_func ulong
  change for 208 teams: +39.18%
  change for 10400 teams: +66.38%
misc_elem_loop ulong
  change for 208 teams: +38.42%
  change for 10400 teams: -23.92%
misc_linalg ulong
  change for 208 teams: +37.38%
  change for 10400 teams: -24.18%
misc_particle ulong
  change for 208 teams: +13.16%
  change for 10400 teams: +1.76%
misc_stencil Value
  change for 208 teams: -3.31%
  change for 10400 teams: -0.76%
misc_elem_func Value
  change for 208 teams: +0.55%
  change for 10400 teams: +1.42%
misc_elem_loop Value
  change for 208 teams: -1.26%
  change for 10400 teams: -2.87%
misc_linalg Value
  change for 208 teams: -0.73%
  change for 10400 teams: -15.36%
misc_particle Value
  change for 208 teams: -1.15%
  change for 10400 teams: -0.12%
```

There is probably potential, but I'll change this PR to only handle the reduction cases for now. The other cases would need more analysis to get the most out of it and I need to focus on cross-team reduction for the moment.
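To illustrate the O(n) verification cost mentioned above, here is a minimal sketch of one way a harness could make the check optional and keep it out of the timed region. It is not the actual test code from this PR: `RunResult`, `elemKernel`, `verifyElem`, and `runOnce` are hypothetical names, and the kernel is a plain host loop standing in for an offloaded element-wise workload.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result of a single benchmark run.
struct RunResult {
  double seconds; // wall time of the kernel only, verification excluded
  bool verified;  // true if the O(n) check was executed
  bool correct;   // meaningful only when verified is true
};

// Stand-in for a non-reduction workload (element-wise function).
inline void elemKernel(const std::vector<double> &in,
                       std::vector<double> &out) {
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = 2.0 * in[i] + 1.0;
}

// O(n) check: recompute every element on the host and compare.
inline bool verifyElem(const std::vector<double> &in,
                       const std::vector<double> &out) {
  for (std::size_t i = 0; i < in.size(); ++i)
    if (out[i] != 2.0 * in[i] + 1.0)
      return false;
  return true;
}

// Runs the kernel once for problem size n. The verification pass is
// skipped entirely when verify is false, so large n stays cheap.
inline RunResult runOnce(std::size_t n, bool verify) {
  std::vector<double> in(n), out(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    in[i] = static_cast<double>(i % 1024); // small exact values

  auto start = std::chrono::steady_clock::now();
  elemKernel(in, out);
  auto stop = std::chrono::steady_clock::now();

  RunResult r{std::chrono::duration<double>(stop - start).count(), verify,
              false};
  if (verify)
    r.correct = verifyElem(in, out); // outside the timed region
  return r;
}
```

With a harness shaped like this, a call such as `runOnce(4096, true)` could serve as a correctness guard, while `runOnce(177777777, false)` would measure performance at the default N without paying for the check.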
https://github.com/llvm/llvm-project/pull/201670
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
