felipecrv commented on issue #43687: URL: https://github.com/apache/arrow/issues/43687#issuecomment-2289496101
> Seems this would not generate one?

Sorry, the unnecessary template expansions only exist when you pass high SIMD levels as template parameters to template functions that don't actually use instructions from those SIMD levels.

`SumArray` is an example of this problem: https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/kernels/aggregate_internal.h#L148

Functions like `MeanImpl` shouldn't pass the AVX512 SIMD level to `SumArray<>` when the `SumArray` implementation never uses those SIMD instructions. We are generating the same code for functions with different mangled names.

> > The SIMD level parameter should be the smallest level that the kernel requires.
>
> This reminds me of this. So generally, the SIMD level is just the "smallest level", not the "exact level"? And the compiler can generate SIMD code if users decide to compile with a SIMD flag themselves? I'm a bit confused about this.

It can't be the "exact level" because you could have a kernel that uses AVX512 for one step while also using AVX2 (or no SIMD at all) for other phases.

In other words: a SIMD level Y kernel can call any SIMD level X kernel iff X <= Y.
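To make the two points above concrete, here is a minimal sketch. It is not the Arrow code: the `SimdLevel` enum, `CanCall`, `SumArrayPerLevel`, `SumArrayScalar`, and `Mean` are hypothetical names invented for illustration, and the enum ordering is an assumption. It shows a template whose level parameter is never used (so every level instantiates the same body under a different mangled name) and a caller that instead invokes a helper tagged with the smallest level it requires.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a SIMD level enum, ordered from least to most capable.
enum class SimdLevel : int { NONE = 0, AVX2 = 1, AVX512 = 2 };

// The rule from the comment: a level Y kernel may call a level X kernel when X <= Y.
constexpr bool CanCall(SimdLevel caller, SimdLevel callee) {
  return static_cast<int>(callee) <= static_cast<int>(caller);
}

// Problematic shape: Level is a template parameter but the body never uses it,
// so each level instantiates identical code under a different mangled name.
template <SimdLevel Level>
double SumArrayPerLevel(const std::vector<double>& values) {
  double sum = 0.0;
  for (double v : values) sum += v;
  return sum;
}

// Preferred shape in this sketch: the scalar sum requires no SIMD level at all,
// so it is not templated on one and is compiled exactly once.
inline double SumArrayScalar(const std::vector<double>& values) {
  double sum = 0.0;
  for (double v : values) sum += v;
  return sum;
}

// A kernel compiled for some Level may still call the level-NONE helper.
template <SimdLevel Level>
double Mean(const std::vector<double>& values) {
  static_assert(CanCall(Level, SimdLevel::NONE),
                "callee level must not exceed caller level");
  assert(!values.empty());  // precondition of this sketch
  return SumArrayScalar(values) / static_cast<double>(values.size());
}
```

With this layout, `Mean<SimdLevel::AVX2>` and `Mean<SimdLevel::AVX512>` share a single `SumArrayScalar`, whereas calling `SumArrayPerLevel<Level>` from each would emit one copy of the same loop per level.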
