jianxind commented on pull request #7314:
URL: https://github.com/apache/arrow/pull/7314#issuecomment-641109277


   I just found a paper (https://dl.acm.org/doi/pdf/10.1145/3178433.3178435, 
page 7, Table 6: Comparison of various SIMD wrappers). Some SIMD helpers 
(simdpp/xsimd) have performance issues, at least on some workloads.
   
   Another thing is that most SIMD helpers have no runtime dispatch support, 
which means we would still have to build the same code many times in Arrow 
itself (assuming we can find a common code path for a function) to match the 
CPU capability detected at runtime.
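
   To illustrate the point about building the same code several times, here is 
a minimal sketch, not Arrow's actual dispatch code. It assumes GCC or Clang on 
x86-64 (the target attribute and __builtin_cpu_supports), and every name in it 
(SKETCH_DEFINE_SUM, SumBaseline, SumAvx2, SumAvx512, ResolveSum, Sum) is 
hypothetical. One kernel body is stamped out once per instruction set and a 
function pointer is chosen once at runtime:

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

// The same source-level kernel, emitted once per target. `attr` is the
// per-function target attribute (empty for the baseline build).
#define SKETCH_DEFINE_SUM(name, attr)                                  \
  attr std::int64_t name(const std::int64_t* values, std::size_t n) {  \
    std::int64_t sum = 0;                                              \
    for (std::size_t i = 0; i < n; ++i) sum += values[i];              \
    return sum;                                                        \
  }

SKETCH_DEFINE_SUM(SumBaseline, )
#if SKETCH_X86
// The compiler is allowed (not required) to use wider instructions here.
SKETCH_DEFINE_SUM(SumAvx2, __attribute__((target("avx2"))))
SKETCH_DEFINE_SUM(SumAvx512, __attribute__((target("avx512f"))))
#endif

using SumFn = std::int64_t (*)(const std::int64_t*, std::size_t);

// Pick the widest variant the running CPU supports.
SumFn ResolveSum() {
#if SKETCH_X86
  if (__builtin_cpu_supports("avx512f")) return SumAvx512;
  if (__builtin_cpu_supports("avx2")) return SumAvx2;
#endif
  return SumBaseline;
}

// Public entry point: resolves once, then calls through the pointer.
std::int64_t Sum(const std::int64_t* values, std::size_t n) {
  static const SumFn fn = ResolveSum();
  return fn(values, n);
}
```

   Callers only ever see Sum(values, n); the per-ISA copies and the CPU check 
stay inside the library, which is the part a SIMD helper without runtime 
support would leave to us.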
   
   Also, I have been working on the sparse part of the aggregate sum recently, 
and the data flow is totally different for AVX2 and AVX512. AVX512 has 
_mm512_mask_add_pd(__m512d src, __mmask8 k, __m512d a, __m512d b), so it can 
apply the SIMD add directly under the validity bitmap. AVX2 has to use a lookup 
table to build a mask, plus a SIMD AND operation to zero the invalid values, 
before passing them to the SIMD add. The same difference will apply to other 
future SIMD functions as well, since all Arrow data is represented with a 
validity bitmap.
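
   The two data flows can be sketched roughly as follows. This is an 
illustration under assumptions, not the code from this PR: it assumes GCC or 
Clang on x86-64, an LSB-first validity bitmap, and double values; MaskTable, 
SumScalar, SumAvx2, SumAvx512 and SumValid are hypothetical names. Only the 
intrinsics themselves are real.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar reference and tail handler: bit i (LSB first) marks values[i] valid.
double SumScalar(const double* values, const std::uint8_t* bitmap,
                 std::size_t n, std::size_t begin = 0, double sum = 0.0) {
  for (std::size_t i = begin; i < n; ++i)
    if ((bitmap[i / 8] >> (i % 8)) & 1) sum += values[i];
  return sum;
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAVE_X86 1

// AVX2 has no mask registers, so each 4-bit nibble of the bitmap is expanded
// into four 64-bit lanes (all ones = valid, zero = invalid) via a lookup table.
struct MaskTable {
  alignas(32) std::int64_t rows[16][4];
  MaskTable() {
    for (int m = 0; m < 16; ++m)
      for (int lane = 0; lane < 4; ++lane)
        rows[m][lane] = ((m >> lane) & 1) ? -1 : 0;
  }
};

__attribute__((target("avx2")))
double SumAvx2(const double* values, const std::uint8_t* bitmap, std::size_t n) {
  static const MaskTable table;
  __m256d acc = _mm256_setzero_pd();
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    const std::uint8_t bits = bitmap[i / 8];
    const __m256d lo = _mm256_castsi256_pd(_mm256_load_si256(
        reinterpret_cast<const __m256i*>(table.rows[bits & 0x0F])));
    const __m256d hi = _mm256_castsi256_pd(_mm256_load_si256(
        reinterpret_cast<const __m256i*>(table.rows[bits >> 4])));
    // Zero the invalid lanes with AND, then do a plain SIMD add.
    acc = _mm256_add_pd(acc, _mm256_and_pd(_mm256_loadu_pd(values + i), lo));
    acc = _mm256_add_pd(acc, _mm256_and_pd(_mm256_loadu_pd(values + i + 4), hi));
  }
  alignas(32) double lanes[4];
  _mm256_store_pd(lanes, acc);
  return SumScalar(values, bitmap, n, i,
                   lanes[0] + lanes[1] + lanes[2] + lanes[3]);
}

__attribute__((target("avx512f")))
double SumAvx512(const double* values, const std::uint8_t* bitmap, std::size_t n) {
  __m512d acc = _mm512_setzero_pd();
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    // One bitmap byte is used directly as the mask for eight doubles:
    // lanes with a 0 bit keep their old accumulator value.
    const __mmask8 k = bitmap[i / 8];
    acc = _mm512_mask_add_pd(acc, k, acc, _mm512_loadu_pd(values + i));
  }
  alignas(64) double lanes[8];
  _mm512_store_pd(lanes, acc);
  double sum = 0.0;
  for (double lane : lanes) sum += lane;
  return SumScalar(values, bitmap, n, i, sum);
}
#endif

// Runtime selection between the two data flows, with a scalar fallback.
double SumValid(const double* values, const std::uint8_t* bitmap, std::size_t n) {
#ifdef SKETCH_HAVE_X86
  if (__builtin_cpu_supports("avx512f")) return SumAvx512(values, bitmap, n);
  if (__builtin_cpu_supports("avx2")) return SumAvx2(values, bitmap, n);
#endif
  return SumScalar(values, bitmap, n);
}
```

   The AVX512 loop body is a single masked add, while the AVX2 loop needs two 
table lookups and two AND operations per bitmap byte, so the two versions do 
not share a natural common code path.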
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

