Tim Armstrong has posted comments on this change. Change subject: IMPALA-3671: Add query option to limit scratch space usage ......................................................................
Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/buffered-block-mgr-test.cc File be/src/runtime/buffered-block-mgr-test.cc: Line 157: BufferedBlockMgr* mgr = CreateMgr(query_id, max_buffers, block_size, &state, query_options); Long line Line 1448: TEST_F(BufferedBlockMgrTest, ScratchLimitZero) { Nit: put this next to the NoTmpDirs test since the tests are variations of each other. It'll be easier for maintainers. http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/buffered-block-mgr.cc File be/src/runtime/buffered-block-mgr.cc: Line 509: status = WriteUnpinnedBlock(src); As discussed, can you file a bug for executing this code when disable_spilling_ is true. I think you had a way to reproduce the problem. http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/buffered-block-mgr.h File be/src/runtime/buffered-block-mgr.h: Line 646: /// Amount of Scratch Space Allocated Nit: inconsistent capitalisation compared to surrounding comments. http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: Line 140: const TUniqueId& query_id, File** new_file) { Add a DCHECK(file_group != NULL) to document/enforce the invariant. Line 225: file_group_->current_space_allocated_ + write_size Does this fit on a single line now? Line 282: TmpFileMgr::File* TmpFileMgr::FileGroup::GetFileAt(int index) { Nit: I think it's a bit more readable if we define these one-line accessors in the header: GetFileAt(), NumFiles(), SetSpaceAllocationLimit(). Also it's probably worth adding bounds checks here for debug mode. DCHECK_GE(index, 0); DCHECK_LT(index, NumFiles()); http://gerrit.cloudera.org:8080/#/c/3938/2/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: Line 112: /// responsible for calling the Close method to delete all the files in the group Nit: put a . on the end of comments for consistency. Line 115: FileGroup(TmpFileMgr* tmp_file_mgr, int64_t max_write_size = -1); Can you add a destructor that DCHECKS that the group was closed, to make it easier to find any resource leaks. Line 142: boost::ptr_vector<File> tmp_files_; Ah, I misunderstood ptr_vector I think. Didn't realise the old code used it. Line 148: int64_t max_space_limit; Limit makes sense to me. Just needs an _ on the end. Actually I think we should make this consistent with SetSpaceAllocationLimit(). I.e. SetSpaceLimit() and space_limit_. http://gerrit.cloudera.org:8080/#/c/3938/2/common/thrift/generate_error_codes.py File common/thrift/generate_error_codes.py: Line 285: ("SCRATCH_LIMIT_EXCEEDED", 91, "Scratch space limit of $0 bytes exceeded.") I'm changing my mind on this. I think we should mention spilling/spill-to-disk in this error message to give an additional clue to users. How about: Scratch space limit of $0 bytes exceeded for query while spilling data to disk. -- To view, visit http://gerrit.cloudera.org:8080/3938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie3e9039ad1c0df978ab71b956af33fd997714e73 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
