Dan Hecht has posted comments on this change.

Change subject: IMPALA-1619: Support 64-bit allocations.
......................................................................


Patch Set 7:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/2781/7//COMMIT_MSG
Commit Message:

PS7, Line 10: bug decompressor
garbled


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/delimited-text-parser.cc
File be/src/exec/delimited-text-parser.cc:

Line 123:           row_end_locations, field_locations, num_tuples, num_fields, 
next_column_start));
did you run a quick text parsing benchmark to see what effect, if any, these 
checks have?


Line 161:         DCHECK(status.ok());
why is that the case?


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

PS7, Line 168: byte_size
any reason to not just make byte_size() return int64?


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS7, Line 62: 1GB only as StringValue and StringVal use 32-bit
this doesn't quite explain it given that 32-bit can go up to one before 2GB.  
Would it be accurate to say:
... up to 1GB as that's the limit for StringValue and StringVal.  All other ...

and then leave it up to the StringValue code to document why the 1GB limit is 
in place.


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

PS7, Line 558: >
how did this get reversed?


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/runtime/collection-value-builder.h
File be/src/runtime/collection-value-builder.h:

PS7, Line 40: static_cast<int64_t>
should we just make initial_tuple_capacity be an int64_t?


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

Line 114:   uint8_t* EmptyAllocPtr() { return reinterpret_cast<uint8_t 
*>(&zero_length_region_); }
maybe make this static?  Seems more like a global constant rather than a per 
pool thing.


Line 215:     if (UNLIKELY(size < 0)) return NULL;
when would this happen legitimately? wondering if it should be a DCHECK instead?


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/runtime/string-buffer.h
File be/src/runtime/string-buffer.h:

PS7, Line 31: size_
buffer_size_?


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/util/bit-util.h
File be/src/util/bit-util.h:

PS7, Line 206: positive
shouldn't this say "non-negative". i.e. 0 is okay, right?


PS7, Line 208: Is32BitPositive
if so, how about IsNonNegative32Bit()?  (Also switched the order since that 
seems to read easier).


Line 209:     return (value & ~((1ULL << 31) - 1)) == 0;
A more standard and IMO readable (and just as fast) way is:

return (uint64_t)value <= numeric_limits<int32_t>::max();


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/util/decompress.cc
File be/src/util/decompress.cc:

Line 446:         return 
Status(TErrorCode::SNAPPY_DECOMPRESS_INVALID_COMPRESSED_LENGTH);
why did the old code have uncompressed_total_len? was that just wrong?


http://gerrit.cloudera.org:8080/#/c/2781/7/infra/python/bootstrap_virtualenv.py
File infra/python/bootstrap_virtualenv.py:

Line 123:   impala_home = os.environ.get("IMPALA_HOME", "")
is this variable needed?


-- 
To view, visit http://gerrit.cloudera.org:8080/2781
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7ed28083d809a86d801a9c063a0aa32c50d32b20
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to