Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1619: Support 64-bit allocations.
......................................................................


Patch Set 5:

(11 comments)

Can we also add a test to udf-test that exercises the corner cases: zero-length 
and negative allocations?

http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/exec/delimited-text-parser.h
File be/src/exec/delimited-text-parser.h:

Line 120:   Status FillColumns(int64_t len, char** last_column, int* num_fields,
There are a couple of places in DelimitedTextParser::ParseFieldLocations() 
where we're not checking the return status of this function.


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

Line 757:         if (state_->abort_on_error()) {
We should probably implement this abort_on_error=false behaviour where we just 
skip the row for the long string case. The multi-GB string is kind of a corner 
case and we're handling it much better now, but it would be good to be 
consistent in error handling.


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

Line 234: stream
Stream? Buffer, no?


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/runtime/buffered-block-mgr.cc
File be/src/runtime/buffered-block-mgr.cc:

Line 306:   int64_t buffers_needed = BitUtil::Ceil(size, max_block_size());
We should just check here that buffers_needed fits in a 32-bit int, rather than 
propagating the type change.


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/runtime/buffered-block-mgr.h
File be/src/runtime/buffered-block-mgr.h:

Line 342:   bool TryAcquireTmpReservation(Client* client, int64_t num_buffers);
We shouldn't need to change this, this is denominated in 8MB buffers.

I think the only change we really need is to add a check in ConsumeMemory that 
the number of buffers fits in a 32-bit int.


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/runtime/free-pool.h
File be/src/runtime/free-pool.h:

Line 68:     if (UNLIKELY(size == 0)) return mem_pool_->EmptyAllocPtr();
Should we be checking for < 0 like in the MemPool? I couldn't see any handling 
of that case if a rogue UDF called Allocate(-1).


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/runtime/string-buffer.h
File be/src/runtime/string-buffer.h:

Line 77:   /// the buffer will exceed memory limit.
Mention that this doesn't free the previous buffer?


http://gerrit.cloudera.org:8080/#/c/2781/5/be/src/util/codec.cc
File be/src/util/codec.cc:

Line 188: (output_len64 & ~((1LL << 31) - 1)) != 0)
What the benefit of this compared to the straightforward comparison? Because it 
handles the negative case too?

Can we factor this check out into something in BitUtil?


http://gerrit.cloudera.org:8080/#/c/2781/5/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

Line 242: length supported
should also include the maximum length in the error message. E.g.

Length of column $0 exceeds maximum supported length of $1


http://gerrit.cloudera.org:8080/#/c/2781/5/infra/python/deps/requirements.txt
File infra/python/deps/requirements.txt:

Line 45: python-snappy == 0.5.0
We currently add the egg/tarball/whatever for the dependencies here (rather 
than having download_requirements hammer third-party servers all the time)


http://gerrit.cloudera.org:8080/#/c/2781/5/tests/query_test/test_compressed_formats.py
File tests/query_test/test_compressed_formats.py:

Line 1: # Copyright (c) 2012 Cloudera, Inc. All rights reserved.
Can we also extend the test_large_strings test to exercise some of the cases 
that now work?


-- 
To view, visit http://gerrit.cloudera.org:8080/2781
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7ed28083d809a86d801a9c063a0aa32c50d32b20
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to