Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18219 )

Change subject: IMPALA-11064 Optimizing Temporary File Structure for Batch 
Reading
......................................................................


Patch Set 9:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/18219/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18219/8//COMMIT_MSG@67
PS8, Line 67: a block with spill id 1 and sequence number 1. Instead, if step
> It's not really clear to me what would happen if step number is other than
Added examples. Simply, the idea is to specify which future block is going to 
fetch.
For example, if we are reading the data of block 1-1, then will try to fetch 
block 1-2 if step is 1, and will try to get block 1-3 if step is 2.
Block 1-3 means the spill id is 1 and sequence id is 3. We expect the data 
would be read sequentially by the sequence id within the same spill id. The 
main purpose is that it may have chances that not fast enough to fetch the 
closest block before we move forward to read the next block, so we can set a 
larger step to prefetch a block a little far away.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/disk-file.h@256
PS8, Line 256:     // written_bytes() in FileWriter is thread-safe.
> Any chance this should also be under lock_guard? Looks like probably not be
The life cycle of file_writer_ should be the same as the DiskFile, nobody else 
is able to set it to nullptr. The written_bytes() is protected inside the 
FileWriter's lock_. So it should be fine.
Added some comments.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/disk-io-mgr.cc@291
PS8, Line 291:     DCHECK_LE(written_bytes, actual_file_size);
> Why call actual_content_bytes() again (which uses a lock) rather than compa
Done


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/disk-io-mgr.cc@413
PS8, Line 413: Status RemoteOperRange::DoFetch() {
> Why is io_ctx added? It doesn't seem to be used at all.
Thanks for pointing this out. I think it was used within the function, but no 
longer needed. Removed.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/disk-io-mgr.cc@435
PS8, Line 435:     // Return okay because the remote file is deleted only when 
the query is going to
> nit: "the the", and I'd remove "probably" unless there's some sort of race
That makes sense. Changed.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/request-context.h
File be/src/runtime/io/request-context.h:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/io/request-context.h@195
PS8, Line 195:   void set_read_use_batch_counter(RuntimeProfile::Counter* 
read_use_batch_counter) {
> nit: it'd be nice if order of these functions added order of the counter va
Done


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr-internal.h@67
PS8, Line 67:       is_full_ = true;
> Why does this set is_full_ when it doesn't actually allocate the space? If
Yeah, that is reason to make it easier to manage the blocks and have a fixed 
sequence. Also, in most cases I can see, the page size (num_bytes) spilled in 
the same partition should be the same unless the last few pages if my 
understanding is correct.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr-test.cc
File be/src/runtime/tmp-file-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr-test.cc@334
PS8, Line 334:       TmpFileGroup& file_group, int idx, bool batch_read = 
false) {
> Planning to add tests where batch_read is false?
Added one old testcase to use it, also changed the new testcase to use both 
batch_read true and false.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr-test.cc@2148
PS8, Line 2148:     WaitForCallbacks(1);
> I wonder if this whole test should be duplicated, so we can have both batch
That is a good idea. Added.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.h@521
PS8, Line 521:   /// compression). The caller should not modify the data in 
'buffer' until the write
> These default values are only used in testing, and remote_batch_read_prefet
Changed to default value.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@457
PS8, Line 457: Status TmpFileMgr::CreateUploadTmpFileThread() {
> I'm not really sure why this is needed. Reading the function comments doesn
I guess the function is helping the turn the buffer size into a form of files 
in the pool during initialization, which is the form that will eventually 
happen, because all the released buffer files should be returned to the pool. 
However it may not be necessary, removed for now.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@545
PS8, Line 545:   DCHECK(file_group != nullptr);
> This function is only called one place, where we know tmp_file is a TempFil
On second thought, I moved the function to TmpFileRemote, and name it as 
ReleaseReadBuffers().


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@549
PS8, Line 549:
> This check redundant because we called it in TempFileRemote right before th
Removed the check of the caller, and keep the check inside the implementation.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@608
PS8, Line 608:
> I'm not clear what this pattern is doing. I see it here and somewhat simila
There is a context that we can reserve the local disk space for a new file from 
two ways, one is from the metrics local_buff_dir_bytes_high_water_mark before 
it reaches the dir bytes_limit, the other is from the pool, an evictable file 
will be returned to the pool without changing the metrics. For long running, 
the latter way could be more common.

Because this patch introduces a new file size for using batch read, it could be 
smaller than the default value. Also, previously the files in the pool are with 
default file size. To reduce complexity using different sizes in the pool, it 
would be good to remain the previous pattern, so this patch uses a way when 
returning the buffer space, which changes the metrics, we will check the 
remaining size, if larger than a default file size, then return a dummy file 
(with default file size) to the pool instead, otherwise, it could be a case 
that the thread in the pool might be starving waiting for the evictable file.

Added some comments.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@831
PS8, Line 831:   *path = parsed_raw_path;
> nit: "by default" seems misleading here. I don't see any way to change this
Done.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@1362
PS8, Line 1362:   }
> Personally I'd rather call actual_content_bytes once and store it locally f
Done


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/service/query-options.h@279
PS8, Line 279:   QUERY_OPT_FN(remote_batch_read_prefetch_step, 
REMOTE_BATCH_READ_PREFETCH_STEP,         \
> When will docs for these new query options be added?
That is a good point. Filed a Jira IMPALA-11656 to follow up.


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/service/query-options.cc@1255
PS8, Line 1255:                          "Only integer value larger than or 
equal to 0 is allowed.",
> nit: "equal" instead of "equals"
Done



--
To view, visit http://gerrit.cloudera.org:8080/18219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If913785cac9e2dafa20013b6600c87fcaf3e2018
Gerrit-Change-Number: 18219
Gerrit-PatchSet: 9
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Mon, 17 Oct 2022 03:00:10 +0000
Gerrit-HasComments: Yes

Reply via email to