Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
......................................................................


Patch Set 27:

(19 comments)

Another round of comments. I understand the local buffer file logic now and I 
think it makes sense, just need to do another pass to look for bugs.

http://gerrit.cloudera.org:8080/#/c/16318/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16318/27//COMMIT_MSG@77
PS27, Line 77:    The first available local directory is used for the local 
buffer
Can you give an example of how you might set this up in practice to allow both 
local and remote spilling? Would you configure two local directories with 
different limits so that one was the local buffer directory and the other was 
the main scratch directory?


http://gerrit.cloudera.org:8080/#/c/16318/27//COMMIT_MSG@91
PS27, Line 91: * Unit Tests added to tmp-file-mgr-test/disk-io-mgr-test.
I think it'd be good to add a buffer pool test that exercises the remote 
spilling code paths. We have some randomized tests like  
BufferPoolTest::TestRandomInternalMulti() - I think you could add another one 
with a FileGroup configured to use remote scratch.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/disk-file.h@137
PS27, Line 137:   void SetActualFileSize(int64_t size) {
This is only called once, right? Can we check that invariant:

  DCHECK_EQ(0, actual_file_size_.Load());


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/disk-file.h@138
PS27, Line 138: DCHECK
nit: use DCHECK_LE - it automatically prints the two values if it fails so it's 
a little nicer for debugging.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/disk-io-mgr.cc@252
PS27, Line 252:     is_full_ = written_bytes != 0 && written_bytes == 
disk_file_->actual_file_size();
Add

  DCHECK_LE(written_bytes, disk_file_->actual_file_size())

Just so we catch it in testing if there's a bug.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/disk-io-mgr.cc@329
PS27, Line 329:     int bytes = (file_size - offset < buffer_size) ? (file_size 
- offset) : buffer_size;
Maybe use min() instead?


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/request-context.cc
File be/src/runtime/io/request-context.cc:

http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/io/request-context.cc@243
PS27, Line 243:     (static_cast<WriteRange*>(range))->callback()(status);
nit: I think the parentheses around static_cast here and on l246 are not needed.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr-internal.h@326
PS27, Line 326:   /// The number of the tmp files in the pool. For debug and 
test.
We don't seem to use this variable any more


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.h@194
PS27, Line 194:     std::unique_ptr<TmpFileBufferPool> tmp_file_pool_;
When is this null or non-null?


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@427
PS24, Line 427:       *need_local_buffer_dir = false;
> Currently local buffer directory won't be inserted into tmp_dirs_, the dire
I left a comment on the commit message. It seems like it would be helpful to 
have an example of how to configure this in a reasonable way.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@512
PS27, Line 512:   if (tmp_dirs_remote_ctrl_.tmp_file_pool_ == nullptr) {
What [


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@917
PS27, Line 917:     if (tmp_file_mgr_->tmp_dirs_remote_ctrl_.tmp_file_pool_ != 
nullptr) {
Which case is this? Can we add a boolean helper method to TmpDirsRemoteCtrl() 
that makes it self-describing? It's sometimes a little hard to follow code when 
null/non-null implies certain modes.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@967
PS27, Line 967:   if (!tmp_files_remote_.empty()) {
It would be helpful to add a couple of comments to explain the code flow. E.g.

  // First, check to see if we can allocate more space in the previous file.

then below

 // No space in previous file, we need to set up a new remote file.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@1019
PS27, Line 1019:     return Status("Page size can't be larger than the 
temporary file size.");
I think this error isn't accurate any more?


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@1073
PS27, Line 1073:       if ((*tmp_file)->is_blacklisted()) {
nit: there seems to be some unnecessary formatting changes here that it would 
be better to avoid.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@1112
PS27, Line 1112:     if (!alloc_full || !status.ok()) return status;
nit: maybe reverse the condition so status.ok() comes first. It's basically 
equivalent but I prefer this pattern because in other places there are 
functions where the values of output arguments are undefined if they return a 
non-error status.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@1705
PS27, Line 1705:     if (cur_write_range_ == nullptr) {
               :       DCHECK(shut_down_);
               :       return;
               :     }
Couldn't we just return on line 1699 in this case?


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@1753
PS27, Line 1753:             auto key_range_it = find(write_ranges_.begin(), 
write_ranges_.end(), range);
I think this ends up being O(n^2) in the number of write ranges, which isn't 
ideal. Generally I like to avoid premature optimization, but it would be good 
to avoid the quadratic behaviour.

Can we store an iterator from write_ranges_ in write_ranges_to_add_ instead? 
std::list iterators are stable and remain valid as long as the item is present 
in the list.


http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.cc@1862
PS27, Line 1862: DCHECK
nit: DCHECK_EQ



--
To view, visit http://gerrit.cloudera.org:8080/16318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89
Gerrit-Change-Number: 16318
Gerrit-PatchSet: 27
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Fri, 15 Jan 2021 08:03:49 +0000
Gerrit-HasComments: Yes

Reply via email to