Youwei Wang has posted comments on this change.

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSE or AVX2.
......................................................................

Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3081/5/be/src/util/bit-util.inline.h
File be/src/util/bit-util.inline.h:

Line 139: __attribute__((target("sse4.2")))
> You have explained it clearly; thank you!

Hi Jim. These are my own conclusions: I believe the answer to this question is YES. ByteSwapSSE_Unit is referenced by ByteSwapSIMD, and ByteSwapSIMD is referenced by BitUtil::ByteSwap, so the compiler has to pick it up; otherwise there would be an undefined-symbol error in the body of BitUtil::ByteSwap.

Actually, we can verify this with an experiment: ByteSwapSSE_Unit and ByteSwapSIMD can be separated out into a standalone program, so we can compile that program and dump its assembly. If you are interested, would you please take a look at an example here?
https://godbolt.org/g/HHi9ui

Line 174: static const FuncPointer SIMDFuncTable[2] = {ByteSwapSSE_Unit, ByteSwapAVX2_Unit};
> I don't know what "polynomial", you are referring to. I think you should ch

Hi Jim. Sorry for the confusion I have caused. These two statements:

const int func_idx = data_width/16 - 1;
SIMDFuncTable[func_idx](dst, src);

were previously branches. The equivalent form is:

if (data_width == 32) ByteSwapAVX2_Unit
else if (data_width == 16) ByteSwapSSE_Unit

This trick is known as a jump table: data_width of 16 maps to index 0 and data_width of 32 maps to index 1. Some compilers do this themselves during the optimization phase; I do it manually here just in case. So essentially, the overhead of the branch is replaced by an arithmetic operation plus a table lookup. Yes, the arithmetic still costs some CPU cycles at run time, but we believe it is cheaper than the branch and less harmful to the instruction pipeline.

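A minimal sketch of the table dispatch described in the Line 174 comment, next to the branch form it replaces. To keep it compilable on any machine, the two unit functions here are hypothetical scalar stand-ins (ByteSwap16_Unit, ByteSwap32_Unit) for ByteSwapSSE_Unit and ByteSwapAVX2_Unit; the uint8_t* signature, the FuncTable name, and the SwapUnitBranch / SwapUnitTable wrappers are assumptions made for illustration and are not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-ins for ByteSwapSSE_Unit / ByteSwapAVX2_Unit:
// each reverses one fixed-size unit (16 or 32 bytes) from src into dst.
// src and dst must not overlap.
static void ByteSwap16_Unit(uint8_t* dst, const uint8_t* src) {
  std::reverse_copy(src, src + 16, dst);
}
static void ByteSwap32_Unit(uint8_t* dst, const uint8_t* src) {
  std::reverse_copy(src, src + 32, dst);
}

using FuncPointer = void (*)(uint8_t* dst, const uint8_t* src);

// Index 0 handles 16-byte units, index 1 handles 32-byte units.
static const FuncPointer FuncTable[2] = {ByteSwap16_Unit, ByteSwap32_Unit};

// Branch form: pick the unit function with an explicit comparison.
inline void SwapUnitBranch(int data_width, uint8_t* dst, const uint8_t* src) {
  assert(data_width == 16 || data_width == 32);
  if (data_width == 32) {
    ByteSwap32_Unit(dst, src);
  } else {
    ByteSwap16_Unit(dst, src);
  }
}

// Table form: compute the index arithmetically (16 -> 0, 32 -> 1)
// and call through the function-pointer table.
inline void SwapUnitTable(int data_width, uint8_t* dst, const uint8_t* src) {
  assert(data_width == 16 || data_width == 32);
  const int func_idx = data_width / 16 - 1;
  FuncTable[func_idx](dst, src);
}
```

Both wrappers produce the same bytes for the two supported widths. Note that the call through the table is an indirect call, which the CPU also has to predict, so whether the table form actually beats the branch form is something to confirm with a benchmark rather than assume.
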
--
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Youwei Wang <[email protected]>
Gerrit-HasComments: Yes
