Youwei Wang has posted comments on this change. Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSE or AVX2. ......................................................................
Patch Set 15: (2 comments) http://gerrit.cloudera.org:8080/#/c/3081/15/be/src/benchmarks/bswap-benchmark.cc File be/src/benchmarks/bswap-benchmark.cc: > Please also add tests to bit-util-test.cc Hi Jim. Our current modification to the interface BitUtil::ByteSwap only happens inside its body. So there has already existed some test cases in TEST(BitUtil, ByteSwap). So do you mean adding extra test cases in this function TEST(BitUtil, ByteSwap)? Or maybe you mean adding some test cases for the functions like ByteSwapSSE_Unit/ByteSwapAVX2_Unit? Line 124: void TestSSE42Swap(int batch_size, void* d) { > When I asked earlier if we care about FIXED_LEN_SIZE, I wasn't trying to be Hi Jim. Thank you so much for providing this precise information here. After some investigation, I think my previous understanding the functionality of the API “inline void BitUtil::ByteSwap(void* dest, const void* source, int len)“ is quite different as it actually turns out to be. As I can tell from the code, BitUtil::ByteSwap seems to serve the callers mostly DecimalUtil::DecodeFromFixedLenByteArray(buffer, fixed_len_size, v); DecimalUtil::EncodeFromFixedLenByteArray(buffer, fixed_len_size, v); Obviously, in this case, the input parament “len” of this function will be 4bytes corresponding to Decimal4Value, 8bytes corresponding to Decimal8Value and 16bytes corresponding to Decimal16Value. Actually, my current code “inline void BitUtil::ByteSwap(void* dest, const void* source, int len)“ will not utilize the SIMD feature to deal with the input array if the len < 16. So we can expect no performance improvement in this case. Actually, concluded from previous patch, the code has a priori knowledge about what it is going to deal with. So this is the very meaning of this function in the benchmark: template <int num_bytes, int fixed_len_size> inline void ByteSwapAVX2(void* dest, const void* source, int len) int num_bytes: tell which kind of Decimal values to deal with. That is, 4bytes corresponding to Decimal4Value, 8bytes corresponding to Decimal8Value and 16bytes corresponding to Decimal16Value. int fixed_len_size: the internal bytes of corresponding Decimal values to swap; just like you have mentioned previously: After some digging, it looks to me like we sometimes do care about unusual FIXED_LEN_SIZE. See parquet-common.h, which includes the comment: /// fixed_len_size can be less than sizeof(Decimal*Value) for space savings. This means /// that the value in the in-memory format has leading zeros or negative 1's. /// For example, precision 2 fits in 1 byte. All decimals stored as Decimal4Value /// will have 3 bytes of leading zeros, we will only store the interesting byte. Dest: the pointer to the destination memory address; Source: the pointer to the source memory address; Len: the number of Decimal values; If my understanding is true, I afraid the AVX2 optimization can not be used inside the BitUtil::ByteSwap since it will only deal with at most 128Bit data. And SSE4.2 optimization will only be used to deal with the Decimal16Value. Our previous patch shows the high efficiency of using SIMD approach when dealing with a long DecimalXValue array. So my idea now is to find the callers of EncodeToFixedLenByteArray and DecodeToFixedLenByteArray and use our previous patch to substitute these two functions if they are used to encode/decode a very long DecimalXValue array. And I believe this is the essential idea of previous patch. I am looking forward to hearing your idea. :) -- To view, visit http://gerrit.cloudera.org:8080/3081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1 Gerrit-PatchSet: 15 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Youwei Wang <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Youwei Wang <[email protected]> Gerrit-HasComments: Yes
