Jim Apple has posted comments on this change. Change subject: Add a build flag for the undefined behavior sanitizer, aka "ubsan". ......................................................................
Patch Set 1: (15 comments) http://gerrit.cloudera.org:8080/#/c/5082/1/be/CMakeLists.txt File be/CMakeLists.txt: Line 40: # -fwrapv: force signed integer arithmetic operations to wrap This was already assumed, and a number of our tests depend on it holding. http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/exec/aggregation-node.cc File be/src/exec/aggregation-node.cc: Line 263 This is UB when conjunct_ctxs_ is empty. http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/exec/data-source-scan-node.cc File be/src/exec/data-source-scan-node.cc: Line 151: memset(cols_next_val_idx_.data(), 0, sizeof(int) * cols_next_val_idx_.size()); If the first paramer is nullptr, this is UB http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/exec/hdfs-avro-scanner-ir.cc File be/src/exec/hdfs-avro-scanner-ir.cc: Line 276: *decimal >>= bytes_to_fill * 8; If bytes_to_fill >= sizeof(decimal), this is UB http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: Line 1379: double percent = total_rows * 100 / std::max(1L, num_input_rows); num_input_rows is sometimes 0. http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/exec/read-write-util.cc File be/src/exec/read-write-util.cc: Line 100: memcpy(&uinteger, &integer, sizeof(integer)); type punning is UB. Line 101: uinteger = (uinteger << 1) ^ (integer >> (CHAR_BIT * sizeof(uinteger) - 1)); lshift signed integers can put a 1 in the sign bit, which is UB. http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 819: DCHECK(needle != NULL); If hay_len is 0, haystack can be nullptr http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS1, Line 1684: accumulators::count(completion_times) accumulators::count(completion_times) can be zero, making mean and variance undefined. http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/runtime/row-batch-serialize-test.cc File be/src/runtime/row-batch-serialize-test.cc: PS1, Line 164: len int buf[x] where x is a 0-length variable at run-time is UB http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/runtime/string-value.inline.h File be/src/runtime/string-value.inline.h: Line 58: const int result = len ? strncmp(s1, s2, len) : 0; if either s1 or s2 is nullptr, this is UB http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/thirdparty/mustache/mustache.cc File be/src/thirdparty/mustache/mustache.cc: Line 361: bool is_triple = true; not all paths through FindNextTag set is_triple. http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/util/bit-packing-test.cc File be/src/util/bit-packing-test.cc: Line 36: if (bit_width >= CHAR_BIT * sizeof(uint32_t)) return ~0; lshift past the end is UB http://gerrit.cloudera.org:8080/#/c/5082/1/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: Line 387: ::max(1L, total_time_counter()->value()); sometimes divides by 0: UB http://gerrit.cloudera.org:8080/#/c/5082/1/bin/run-backend-tests.sh File bin/run-backend-tests.sh: Line 41: export PATH="${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin:${PATH}" http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#stack-traces-and-report-symbolization llvm-symbolizer on PATH -- To view, visit http://gerrit.cloudera.org:8080/5082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I88c7234bd7c5eb7404490a0913d90470c10835e7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-HasComments: Yes
