Jim Apple has posted comments on this change. Change subject: IMPALA-4123: Fast bit unpacking ......................................................................
Patch Set 4: (15 comments) http://gerrit.cloudera.org:8080/#/c/4494/4/be/src/benchmarks/bit-packing-benchmark.cc File be/src/benchmarks/bit-packing-benchmark.cc: PS4, Line 296: bool const PS4, Line 303: UnpackValues Unpack32Values PS4, Line 312: + offset This could go off the end of out_buffer if NUM_OUT_VALUES is not a multiple of 32, right? PS4, Line 322: int64_t const http://gerrit.cloudera.org:8080/#/c/4494/4/be/src/util/bit-packing-test.cc File be/src/util/bit-packing-test.cc: PS4, Line 30: compute_mask CamelCaseName PS4, Line 35: smaller smaller than what? Line 116: const int num_in_values = 64 * 1024; constexpr, and all caps name Line 119: in[i] = rand(); std::generate http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.h File be/src/util/bit-packing.h: PS2, Line 57: ast one return > A batch size of 32 is the smallest that works for all supported bit widths. I don't understand yet why you have to choose a batch size of 8 for a bit width of 3. http://gerrit.cloudera.org:8080/#/c/4494/4/be/src/util/bit-packing.h File be/src/util/bit-packing.h: PS4, Line 66: bytes bits? PS4, Line 66: values values of type OutType PS4, Line 70: in_bytes Can you add to the comment a description of how this parameter is used? http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.inline.h File be/src/util/bit-packing.inline.h: PS3, Line 143: undef > I'm not quite sure that I understand the intent or how I'd do that, aside f https://gcc.gnu.org/onlinedocs/gcc/Push_002fPop-Macro-Pragmas.html#Push_002fPop-Macro-Pragmas http://gerrit.cloudera.org:8080/#/c/4494/4/be/src/util/bit-packing.inline.h File be/src/util/bit-packing.inline.h: PS4, Line 125: return lower_bits | (trailing_bits << TRAILING_BITS_SHIFT); This comes out to two shifts, a bitwise or, and a bitwise and. If you did unaligned 64-bit reads to fit the full value inside of a single word, it could be done in one shift and one bitwise and, yes? Line 142: BOOST_PP_REPEAT_FROM_TO(0, 33, UNPACK_VALUE_CALL, ignore); So, even after turning this into a templated function, a loop with static bounds doesn't cause inlining? Maybe leave a note on why you do it this way, in that case. -- To view, visit http://gerrit.cloudera.org:8080/4494 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Jim Apple <jbap...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes