Jim Apple has posted comments on this change. Change subject: IMPALA-4123: Fast bit unpacking ......................................................................
Patch Set 3: (34 comments) http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/benchmarks/bit-packing-benchmark.cc File be/src/benchmarks/bit-packing-benchmark.cc: Line 296: } > We need to do this to start the reader reading from the start of the buffer Should offset advance in this loop, then? http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/benchmarks/bit-packing-benchmark.cc File be/src/benchmarks/bit-packing-benchmark.cc: Line 270: Why the blank line here? PS3, Line 277: NUM_VALUES Maybe NUM_OUT_VALUES PS3, Line 305: * d const uint8_t * const data_end ... Line 334: data[i] = static_cast<uint8_t>(i); std::iota PS3, Line 336: ( I think you can even elide the parens http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing-test.cc File be/src/util/bit-packing-test.cc: PS3, Line 29: compute_mask ComputeMask, and in an anonymous namespace PS3, Line 33: UnpackSubset With a comment here, if I'm going to use it in PackUnpack. PS3, Line 37: with memory that is this phrase can be omitted PS3, Line 39: int also unsigned? PS3, Line 39: int unsigned? DCHECK <= something? PS3, Line 39: num_values num_in_values? PS3, Line 45: int const, here and elsewhere PS3, Line 51: & compute_mask(bit_width) AKA mask; already hoisted PS3, Line 58: becase because PS3, Line 58: the input buffer size "the input buffer size (num_values)" PS3, Line 60: // Size buffer exactly so that ASAN can detect buffer overruns. I think manual canaray checks might be called for here PS3, Line 65: ASSERT_EQ Can you add more " << error message " to your ASSERTs? http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.h File be/src/util/bit-packing.h: Line 36: /// little-endian order). E.g. the values 1, 2, 3, 4 packed with bit width 4 results > We don't do this anywhere else, seems unnecessarily verbose. Yeah, the verbosity is frustrating. However, cstdint doesn't necessarily pt these in the global namespace, and stdint.h is deprecated. What color do you think we should paint this bikeshed? Maybe a using declaration at the top? PS2, Line 57: Type> > Added some explanation to the class comment for the magic number. It works So, this boundary condition happens for any multiple of 8, and you chose 32 to amortize the loop some more? Did you do any testing of 16 and 64 to see how they compare? http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.h File be/src/util/bit-packing.h: PS3, Line 55: after the last byte of 'in' that was read But the last byte of 'in' that was read might only have been partially used, right? http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.inline.h File be/src/util/bit-packing.inline.h: PS2, Line 108: \ : if (end_bit_offset < load_bit_width) { > Reworded this. I did some experiments early on looking at the assembly and Did you try always_inline and template parameters? If yes, what else did you try? http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.inline.h File be/src/util/bit-packing.inline.h: PS3, Line 56: == 0 can be omitted if the branches are swapped Line 56: const int64_t max_input_values = BIT_WIDTH == 0 ? num_values : (in_bytes * CHAR_BIT) / BIT_WIDTH; long line PS3, Line 90: early : // with I think you accidentally a word PS3, Line 93: LOAD_TYPE I think "LOAD_TYPE" could be "LoadType" to more clearly match the lexical standard for other type names PS3, Line 93: i If 'i' is a compile-time constant, maybe call it "I" or "NTH"? PS3, Line 96: load_bit_width I think constexprs should get the static const treatment of all caps PS3, Line 106: shifted Can you add a comment describing what this is for? PS3, Line 108: 32 How is this 32 derived? PS3, Line 109: end_bit_offset so, this case al the bits we need are in one word? PS3, Line 119: Make non-negative to avoid spurious compiler warning Maybe give it an unsigned type? PS3, Line 124: Special-case impossible BIT_WIDTH 32 to avoid spurious compiler warning this could use a bit more explanation PS3, Line 143: define Check not already defined? -- To view, visit http://gerrit.cloudera.org:8080/4494 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Jim Apple <jbap...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes