Dan Hecht has posted comments on this change. Change subject: IMPALA-3671: Add query option to limit scratch space usage ......................................................................
Patch Set 4: (16 comments) http://gerrit.cloudera.org:8080/#/c/3938/4/be/src/runtime/buffered-block-mgr-test.cc File be/src/runtime/buffered-block-mgr-test.cc: Line 1199: query_options.scratch_limit = 0; should this case be calling InitMultipleTmpDirs() with num_dirs > 0, so that the reason for spilling disabled is the scratch_limit and not the lack of tmpdir? Line 1206: could we have a test case that shows that the scratch limit calculation is correct. i.e. can allocate up to the limit file space and then a byte over gives the error? http://gerrit.cloudera.org:8080/#/c/3938/4/be/src/runtime/buffered-block-mgr.cc File be/src/runtime/buffered-block-mgr.cc: PS4, Line 1326: ScratchSpace let's make this clearer what it's accounting. http://gerrit.cloudera.org:8080/#/c/3938/4/be/src/runtime/buffered-block-mgr.h File be/src/runtime/buffered-block-mgr.h: PS4, Line 646: allocated is the units bytes? clarify comment and consider making the name clearer: scratch_space_bytes_used_ or similar. http://gerrit.cloudera.org:8080/#/c/3938/4/be/src/runtime/test-env.cc File be/src/runtime/test-env.cc: PS4, Line 72: ( space before http://gerrit.cloudera.org:8080/#/c/3938/4/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: PS4, Line 280: emplace_back why not just push_back(tmp_file)? Line 287: file->Remove(); drops returned status on the floor. http://gerrit.cloudera.org:8080/#/c/3938/4/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: Line 32: /// client owns the File handle and can use it to expand the file. explain the relationship between TmpFileMgr and FileGroup (or how FileGroup is used to organize files). PS4, Line 54: if it crosses the allocation limit of its associated FileGroup this makes it sound like we allow the allocation to cross the limit and then fail sometime later. reword to clarify that it errors before limit is reached. Line 63: Status Remove(); should this still be exposed or should we only do this through file group close()? if exposed seems like we should update the FileGroup's accounting, but that seems messy. PS4, Line 116: DCHECK DCHECK_EQ PS4, Line 135: allocated in bytes? PS4, Line 136: max_size max_size makes it sound like this is the maximum file size, or something like that. how about bytes_limit or similar. PS4, Line 137: space_limit_ bytes_limit_? PS4, Line 155: space bytes PS4, Line 190: GetFile How about calling it NewFile() (to help differentiate with GetFileAt())? -- To view, visit http://gerrit.cloudera.org:8080/3938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie3e9039ad1c0df978ab71b956af33fd997714e73 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
