jianxind edited a comment on pull request #7029:
URL: https://github.com/apache/arrow/pull/7029#issuecomment-618855696


   cc @emkornfield 
   
   The AVX512 path is straightforward thanks to the mask_compress/mask_expand
intrinsics that AVX512 provides. For SSE/AVX2, as you pointed out in the Jira,
a solution based on a fixed lookup table may help. I will look into that, but
it will definitely take more time, so I am committing the finished part first.
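
   For readers unfamiliar with the two operations, here is a minimal scalar
   sketch of what "compress" (spaced encoding) and "expand" (spaced decoding)
   compute. It is not the code in this PR: the names `CompressSpaced` and
   `ExpandSpaced` are hypothetical, and the validity bitmap is assumed to be
   LSB-first, one bit per value. AVX512 intrinsics such as
   `_mm512_mask_compress_ps` and `_mm512_mask_expand_ps` perform the same
   packing and unpacking on 16 float lanes at a time.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if bit i of an LSB-first bitmap is set.
inline bool BitIsSet(const uint8_t* bits, std::size_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

// Compress: copy only the valid slots of `src` densely into `dst`.
// Returns the number of values written.
template <typename T>
std::size_t CompressSpaced(const T* src, std::size_t num_slots,
                           const uint8_t* valid_bits, T* dst) {
  std::size_t out = 0;
  for (std::size_t i = 0; i < num_slots; ++i) {
    if (BitIsSet(valid_bits, i)) dst[out++] = src[i];
  }
  return out;
}

// Expand: scatter densely packed `src` values into the valid slots of `dst`.
// Null slots are zero-filled here (an assumption of this sketch).
// Returns the number of values consumed from `src`.
template <typename T>
std::size_t ExpandSpaced(const T* src, std::size_t num_slots,
                         const uint8_t* valid_bits, T* dst) {
  std::size_t in = 0;
  for (std::size_t i = 0; i < num_slots; ++i) {
    dst[i] = BitIsSet(valid_bits, i) ? src[in++] : T{};
  }
  return in;
}
```

   The scalar loops branch on every slot, which is the cost the masked
   intrinsics avoid by handling a whole register per instruction.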
   
   Below is the benchmark data on an AVX512 machine before and after adding the intrinsics:
   
   Before:
   BM_PlainEncodingSpacedFloat/1024          1471 ns         1469 ns       476373 bytes_per_second=2.59603G/s
   BM_PlainEncodingSpacedDouble/1024         1498 ns         1496 ns       468131 bytes_per_second=5.09834G/s
   BM_PlainDecodingSpacedFloat/1024          1266 ns         1265 ns       554320 bytes_per_second=3.01623G/s
   BM_PlainDecodingSpacedDouble/1024          920 ns          919 ns       759151 bytes_per_second=8.30509G/s
   
   After:
   BM_PlainEncodingSpacedFloat/1024           513 ns          512 ns      1374561 bytes_per_second=7.44416G/s
   BM_PlainEncodingSpacedDouble/1024          635 ns          634 ns      1108739 bytes_per_second=12.0322G/s
   BM_PlainDecodingSpacedFloat/1024           217 ns          217 ns      3233406 bytes_per_second=17.613G/s
   BM_PlainDecodingSpacedDouble/1024          309 ns          309 ns      2267740 bytes_per_second=24.7257G/s



