Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3671: Add query option to limit scratch space usage
......................................................................


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/buffered-block-mgr-test.cc
File be/src/runtime/buffered-block-mgr-test.cc:

Line 157:     BufferedBlockMgr* mgr = CreateMgr(query_id, max_buffers, 
block_size, &state, query_options);
Long line


Line 1448: TEST_F(BufferedBlockMgrTest, ScratchLimitZero) {
Nit: put this next to the NoTmpDirs test since the tests are variations of each 
other. It'll be easier for maintainers.


http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/buffered-block-mgr.cc
File be/src/runtime/buffered-block-mgr.cc:

Line 509:     status = WriteUnpinnedBlock(src);
As discussed, can you file a bug for executing this code when disable_spilling_ 
is true. I think you had a way to reproduce the problem.


http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/buffered-block-mgr.h
File be/src/runtime/buffered-block-mgr.h:

Line 646:   /// Amount of Scratch Space Allocated
Nit: inconsistent capitalisation compared to surrounding comments.


http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

Line 140:     const TUniqueId& query_id, File** new_file) {
Add a DCHECK(file_group != NULL) to document/enforce the invariant.


Line 225:       file_group_->current_space_allocated_ + write_size
Does this fit on a single line now?


Line 282: TmpFileMgr::File* TmpFileMgr::FileGroup::GetFileAt(int index) {
Nit: I think it's a bit more readable if we define these one-line accessors in 
the header: GetFileAt(), NumFiles(), SetSpaceAllocationLimit().

Also it's probably worth adding bounds checks here for debug mode.

DCHECK_GE(index, 0);
DCHECK_LT(index, NumFiles());


http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

Line 112:   /// responsible for calling the Close method to delete all the 
files in the group
Nit: put a . on the end of comments for consistency.


Line 115:     FileGroup(TmpFileMgr* tmp_file_mgr, int64_t max_write_size = -1);
Can you add a destructor that DCHECKS that the group was closed, to make it 
easier to find any resource leaks.


Line 142:     boost::ptr_vector<File> tmp_files_;
Ah, I misunderstood ptr_vector I think. Didn't realise the old code used it.


Line 148:     int64_t max_space_limit;
Limit makes sense to me. Just needs an _ on the end.

Actually I think we should make this consistent with SetSpaceAllocationLimit(). 
I.e. SetSpaceLimit() and space_limit_.


http://gerrit.cloudera.org:8080/#/c/3938/2/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

Line 285:   ("SCRATCH_LIMIT_EXCEEDED", 91, "Scratch space limit of $0 bytes 
exceeded.")
I'm changing my mind on this. I think we should mention spilling/spill-to-disk 
in this error message to give an additional clue to users. How about:

  Scratch space limit of $0 bytes exceeded for query while spilling data to 
disk.


-- 
To view, visit http://gerrit.cloudera.org:8080/3938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3e9039ad1c0df978ab71b956af33fd997714e73
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to