Youwei Wang has posted comments on this change.

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSE or 
AVX2.
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3081/3/be/src/util/bit-util.inline.h
File be/src/util/bit-util.inline.h:

Line 140:   const __m128i mask = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
11, 12, 13, 14,
> Hi Jim. Thank you for providing this pseudocode. Actually, the macro "        
> #ifn
Hi Jim. I have run some simple tests.
To keep the description short, I define the following test items:
1. SCALAR: call the function ByteSwapScalar(void* dest, const void* source, 
int len);
2. SSE4.2[OUTSIDE]: define the variable "const __m128i mask" outside the 
function;
3. SSE4.2[INSIDE-STATIC]: define the variable "const __m128i mask" inside the 
function WITH the static modifier;
4. SSE4.2[INSIDE-NOT-STATIC]: define the variable "const __m128i mask" inside 
the function WITHOUT the static modifier;
5. AVX2[INSIDE-STATIC]: define the variable "const __m256i mask" inside the 
function WITH the static modifier;
6. AVX2[INSIDE-NOT-STATIC]: define the variable "const __m256i mask" inside the 
function WITHOUT the static modifier;
Note: GCC's support for AVX2 is not yet good enough, so defining the variable 
"const __m256i mask" outside the function does not compile. That is why there 
is no "outside" item for AVX2.
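
For reference, the three SSE mask placements differ only in where the shuffle 
mask is defined. The following is a minimal sketch for a single 16-byte swap, 
not the code from the patch. It assumes GCC or Clang on x86-64, and the names 
Swap16Outside, Swap16InsideStatic, Swap16InsideNotStatic and CheckVariants are 
hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <immintrin.h>

// "Outside": the mask is a namespace-scope constant, initialized once.
// _mm_set_epi8 takes bytes from index 15 down to index 0, so this mask
// maps source byte 15 to destination byte 0, byte 14 to byte 1, and so on.
static const __m128i kOutsideMask =
    _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);

// The target attribute enables SSSE3 (_mm_shuffle_epi8) for these functions
// without requiring -mssse3 on the command line (GCC/Clang assumption).
__attribute__((target("ssse3")))
void Swap16Outside(void* dest, const void* source) {
  __m128i v = _mm_loadu_si128(static_cast<const __m128i*>(source));
  _mm_storeu_si128(static_cast<__m128i*>(dest),
                   _mm_shuffle_epi8(v, kOutsideMask));
}

// "Inside, static": the mask is a function-local static.
__attribute__((target("ssse3")))
void Swap16InsideStatic(void* dest, const void* source) {
  static const __m128i mask =
      _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
  __m128i v = _mm_loadu_si128(static_cast<const __m128i*>(source));
  _mm_storeu_si128(static_cast<__m128i*>(dest), _mm_shuffle_epi8(v, mask));
}

// "Inside, not static": the mask is an ordinary local constant.
__attribute__((target("ssse3")))
void Swap16InsideNotStatic(void* dest, const void* source) {
  const __m128i mask =
      _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
  __m128i v = _mm_loadu_si128(static_cast<const __m128i*>(source));
  _mm_storeu_si128(static_cast<__m128i*>(dest), _mm_shuffle_epi8(v, mask));
}

// Sanity check: every variant must reverse the 16 input bytes.
void CheckVariants() {
  uint8_t src[16], out[16];
  for (int i = 0; i < 16; ++i) src[i] = static_cast<uint8_t>(i);
  Swap16Outside(out, src);
  for (int i = 0; i < 16; ++i) assert(out[i] == 15 - i);
  Swap16InsideStatic(out, src);
  for (int i = 0; i < 16; ++i) assert(out[i] == 15 - i);
  Swap16InsideNotStatic(out, src);
  for (int i = 0; i < 16; ++i) assert(out[i] == 15 - i);
}
```

All three produce the same result. Any performance difference comes from how 
the compiler materializes the mask, which can vary by compiler version and 
optimization level.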

Test approach:
1. Prepare a uint8_t array of 10000000 elements with randomly generated 
values;
2. Run each of the 6 approaches over this array 1000 times and measure the 
elapsed time;
3. The SSE4.2 items call ByteSwapSIMD<16>;
4. The AVX2 items call ByteSwapSIMD<32>;
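
The measurement loop described above can be sketched roughly as follows. This 
is an illustration only, not the harness used for the numbers in this comment: 
ByteSwapScalarSketch is a stand-in for the real ByteSwapScalar, and 
MakeRandomBytes, TimeSwaps and Speedup are hypothetical helpers. Whether the 
real test swapped the whole array in one call or in 16/32-byte chunks is not 
stated, so the sketch simply passes the whole buffer to the function under 
test.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Stand-in for the scalar baseline: reverses len bytes from source into dest.
void ByteSwapScalarSketch(void* dest, const void* source, int len) {
  uint8_t* dst = static_cast<uint8_t*>(dest);
  const uint8_t* src = static_cast<const uint8_t*>(source);
  for (int i = 0; i < len; ++i) dst[i] = src[len - 1 - i];
}

// Common signature for every variant under test.
using SwapFn = void (*)(void*, const void*, int);

// Fills a buffer of n bytes with pseudo-random values from a fixed seed.
std::vector<uint8_t> MakeRandomBytes(size_t n, uint32_t seed) {
  std::mt19937 gen(seed);
  std::vector<uint8_t> bytes(n);
  for (auto& b : bytes) b = static_cast<uint8_t>(gen() & 0xFF);
  return bytes;
}

// Runs fn over the input `iterations` times and returns elapsed seconds.
double TimeSwaps(SwapFn fn, const std::vector<uint8_t>& input, int iterations) {
  assert(iterations > 0);
  std::vector<uint8_t> output(input.size());
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    fn(output.data(), input.data(), static_cast<int>(input.size()));
  }
  const auto end = std::chrono::steady_clock::now();
  // Read one output byte so the compiler cannot drop the loop entirely.
  volatile uint8_t sink = output.empty() ? 0 : output[0];
  (void)sink;
  return std::chrono::duration<double>(end - start).count();
}

// Speedup of a candidate relative to the baseline (baseline is 1x).
double Speedup(double baseline_seconds, double candidate_seconds) {
  return baseline_seconds / candidate_seconds;
}
```

A run matching the setup above would call MakeRandomBytes(10000000, seed) 
once, then TimeSwaps(fn, input, 1000) for each variant, and report 
Speedup(scalar_time, variant_time).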

CPU info: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz

Performance results (speedup relative to the scalar version):
SCALAR PERF: 1.00x
SSE4.2[OUTSIDE] PERF: 3.00x
SSE4.2[INSIDE-STATIC] PERF: 2.75x
SSE4.2[INSIDE-NOT-STATIC] PERF: 2.89x

AVX2[INSIDE-STATIC] PERF: 2.90x
AVX2[INSIDE-NOT-STATIC] PERF: 3.27x

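Conclusion: for SSE4.2, the "const __m128i mask" initializer should be placed 
outside the function (3.00x versus 2.75x and 2.89x inside). 
For AVX2, the mask should be declared inside the function without the static 
modifier (3.27x versus 2.90x with static).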


-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Youwei Wang <[email protected]>
Gerrit-HasComments: Yes
