jianxind commented on pull request #7314:
URL: https://github.com/apache/arrow/pull/7314#issuecomment-639285090


   > This looks generally quite complicated. If we need 500 additional lines of 
code to micro-optimize the Sum kernel for a single SIMD instruction set 
(nevermind that we may also want versions for AVX2, Neon, SVE, and whatnot), 
things will quickly get out of hand.
   > 
   > If we want to go the way of per-kernel SIMD optimizations, it may be 
useful to investigate SIMD helper libraries (such as 
[libsimdpp](https://github.com/p12tic/libsimdpp), 
[xsimd](https://xsimd.readthedocs.io/en/latest/)...).
   
   libsimdpp says it supports runtime dispatch on a per-function basis. 
But I don't know whether these libraries are mature enough to rely on, or 
whether they have been validated on every target; if there is a bug in one of 
them, debugging will be harder for me. Another concern is performance: will 
the wrapper introduce extra cost? SIMD code and its parameters are usually 
tuned carefully for each architecture to get the best performance, so I doubt 
that one common code path can fit all targets.
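
For reference, per-function runtime dispatch can be sketched by hand roughly as below. This is only an illustration under stated assumptions: `SumScalar`, `SumAvx2`, `ResolveSum` and `Sum` are hypothetical names, not code from this PR and not libsimdpp's or xsimd's API, and the CPU check relies on the GCC/Clang builtin `__builtin_cpu_supports` on x86-64. The implementation is picked once, so the cost the dispatch itself adds is one indirect call per kernel invocation.

```cpp
#include <cstddef>
#include <cstdint>

// Portable scalar fallback, always available.
static int64_t SumScalar(const int32_t* values, size_t n) {
  int64_t total = 0;
  for (size_t i = 0; i < n; ++i) total += values[i];
  return total;
}

#if defined(__GNUC__) && defined(__x86_64__)
#include <immintrin.h>

// Hypothetical AVX2 variant: widens four int32 values to int64 per step.
// The target attribute lets this one function use AVX2 without compiling
// the whole file with -mavx2.
__attribute__((target("avx2")))
static int64_t SumAvx2(const int32_t* values, size_t n) {
  __m256i acc = _mm256_setzero_si256();  // four int64 partial sums
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    __m128i v32 =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(values + i));
    acc = _mm256_add_epi64(acc, _mm256_cvtepi32_epi64(v32));
  }
  int64_t lanes[4];
  _mm256_storeu_si256(reinterpret_cast<__m256i*>(lanes), acc);
  int64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
  for (; i < n; ++i) total += values[i];  // scalar tail
  return total;
}
#endif

using SumFn = int64_t (*)(const int32_t*, size_t);

// Chooses an implementation from what the running CPU reports.
static SumFn ResolveSum() {
#if defined(__GNUC__) && defined(__x86_64__)
  if (__builtin_cpu_supports("avx2")) return SumAvx2;
#endif
  return SumScalar;
}

// Public entry point: resolves once, then calls through a function pointer.
int64_t Sum(const int32_t* values, size_t n) {
  static const SumFn fn = ResolveSum();
  return fn(values, n);
}
```

Each additional instruction set would need its own variant and its own branch in `ResolveSum`, which is where the per-architecture tuning would live.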


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

