Jim Apple has posted comments on this change. Change subject: IMPALA-4123: Fast bit unpacking ......................................................................
Patch Set 2: (34 comments) First pass, didn't get to all of it yet but I thought you might want to see what was done http://gerrit.cloudera.org:8080/#/c/4494/2//COMMIT_MSG Commit Message: PS2, Line 12: was "want" PS2, Line 18: possible long line http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/benchmarks/bit-packing-benchmark.cc File be/src/benchmarks/bit-packing-benchmark.cc: PS2, Line 259: include I think cmath, cstdlib, cstdio are the non-deprecated versions for C++ Line 260: #include <stdlib.h> Can you arrange these in three blocks: 1. C standard headers 2. C++ standard headers 3. Other alphabetized in each block Line 276: #define NUM_VALUES 1024 * 1024 constexpr int PS2, Line 281: BenchmarkParams You can leave the constructor out and use aggregate initialization: BenchmarkParams bp {p,q,r}; Line 290: BenchmarkParams* p = reinterpret_cast<BenchmarkParams*>(data); maybe const auto p = reinterpret_cast<const BenchmarkParams *> PS2, Line 294: int64_t const PS2, Line 294: i * 32 % NUM_VALUES + j; Shouldn't this while thing be modulo NUM_VALUES, not just the non-+j part? Line 296: reader.Reset(p->data, p->data_len); Why Reset? PS2, Line 333: uint8_t* data = new uint8_t[data_len]; unique_ptr to clean up after PS2, Line 334: int int64_t PS2, Line 337: BenchmarkParams This can be a local var, right? http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.h File be/src/util/bit-packing.h: Line 18: #include <stdint.h> cstdint PS2, Line 29: Unpack bit-packed values What does this phrase mean? PS2, Line 30: outputted unpacked PS2, Line 30: is are PS2, Line 31: 0 What does a zero bit_width mean in this context? PS2, Line 30: 'out' must have : /// enough space for 'num_values'. And in must point to at least in_bytes of addressable memory? PS2, Line 33: the byte a pointer to the byte PS2, Line 33: plus "And also" PS2, Line 32: utType. : /// Retu extra line break here, please PS2, Line 33: the number of : /// values that were read. I can infer that by just subtracting, right? I think the pair might be overkill PS2, Line 36: const This const has no effect, I believe. Line 36: static const std::pair<const uint8_t*, int64_t> UnpackValues(int bit_width, std::uint8_t, etc. PS2, Line 57: Unpack32Values Why not up to 64? http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.inline.h File be/src/util/bit-packing.inline.h: Line 35: case 0: return UnpackValues<OutType, 0>(in, in_bytes, num_values, out); BOOST_PP_REPEAT_FROM_TO would be worth it here, I think PS2, Line 70: NULL nullptr PS2, Line 70: <const uint8_t*, int64_t> can be omitted PS2, Line 78: DCHECK_GE static_assert PS2, Line 79: 8 CHAR_BIT, here and below PS2, Line 80: in_bytes * 8 / BIT_WIDTH I think this line would be clearer with more parens PS2, Line 99: unsigned integer type static_check with http://en.cppreference.com/w/cpp/types/is_unsigned? PS2, Line 108: inlined as : // soon as possible and subject to all optimizations Can you elaborate on this? Are there benchmarks on that? -- To view, visit http://gerrit.cloudera.org:8080/4494 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-HasComments: Yes
