Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 )
Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions ...................................................................... Patch Set 40: (13 comments) The main obstacle to merging this is the number of #ifdefs - I think it will be difficult to maintain because so much code is conditionally compiled. I left some suggestions in specific places, but I think generally I would prefer: * To use regular if/else with IsAarch64() where possible - when the code can compile on either architectures * If not possible, to keep the implementations separate - in different functions, for example. I.e. have few #ifdef statements in the middle of functions. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/codegen/llvm-codegen-test.cc File be/src/codegen/llvm-codegen-test.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/codegen/llvm-codegen-test.cc@537 PS40, Line 537: #ifdef __aarch64__ Can you modify this so that less of this code is within the ifdef. I think you can simplify it to: EXPECT_EQ(std::unordered_set<string>( {"-dummy1", "-dummy2", "-dummy3", "-dummy4", base::IsAarch64() ? "-sse2" : "+sse2", "-lzcnt"}), LlvmCodeGen::ApplyCpuAttrWhitelist( {"+dummy1", "+dummy2", "-dummy3", "+dummy4", "+sse2", "-lzcnt"})); I think all the #ifdefs are going to be difficult to maintain without breaking aarch so would be good to minimise them. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/exec/delimited-text-parser.inline.h File be/src/exec/delimited-text-parser.inline.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/exec/delimited-text-parser.inline.h@242 PS40, Line 242: __m128i xmm_buffer, xmm_delim_mask, xmm_escape_mask; I think you can move this into the #ifndef. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/kudu/util/block_bloom_filter.cc File be/src/kudu/util/block_bloom_filter.cc: PS40: For all the files under kudu/util, we need to pull in the changes from Kudu, e.g. from https://gerrit.cloudera.org/#/c/14964/ We don't want to diverge this code. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util-test.cc File be/src/util/bit-util-test.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util-test.cc@137 PS40, Line 137: #ifdef __aarch64__ Does this need to be an ifdef? Can we use base::IsAArch64()? I prefer regular control flow over ifdefs because we will then always be at least compiling the aarch64 code. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util-test.cc@187 PS40, Line 187: #ifdef __aarch64__ Same things here and below - use regular if statements where possible. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.h File be/src/util/bit-util.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.h@132 PS40, Line 132: #ifndef __aarch64__ This is really ugly. It would be better just to define a separate version of the function so that we don't have the ifdefs interspersed with the code. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc File be/src/util/bit-util.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc@144 PS40, Line 144: #ifdef __aarch64__ Can you group together the aarch64 implementations after the x86 implementations. This has too many #ifdefs and is pretty unreadable. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc@196 PS40, Line 196: if (TEMPLATE_DATA_WIDTH == 16) { The #ifdefs mixed with control flow is not readable, can you factor out into a helper function or something to reduce the number of ifdefs http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc@237 PS40, Line 237: #ifndef __aarch64__ Same here, too many #ifdefs mixed in with the control flow. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/cpu-info.cc File be/src/util/cpu-info.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/cpu-info.cc@160 PS40, Line 160: hardware_flags_ |= ParseCPUFlags("sse3"); This seems confusing. I think we should define a new constant for neon to make it less confusing. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/hash-util.h@42 PS40, Line 42: #ifndef __aarch64__ We need to find a better solution than all these #ifdefs. Can't you do something like DCHECK(CpuInfo::IsSupported(CpuInfo::SSE4_2) || base::IsAArch64()); Maybe we want to define IsAarch64() in a header so that it can be inlined and does not require a function call. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/sse-util.h File be/src/util/sse-util.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/sse-util.h@86 PS40, Line 86: /// support (e.g. IMPALA-6882). Can you add a comment here like: // On AARCH64 we define alternative implementations that emulate the intrinsics. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/sse2neon.h File be/src/util/sse2neon.h: PS40: Where did this version of the file come from? I would like to know so I can check the license and also in case we want to refresh the version. -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 40 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 12 Aug 2020 00:34:31 +0000 Gerrit-HasComments: Yes
