Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
......................................................................


Patch Set 24:

(39 comments)

Still going through some of the finer details but I thought I'd push out 
another round of comments.

http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/io/disk-io-mgr.cc@339
PS24, Line 339:     if (disk_file_src_->is_to_delete()) {
I think we should simplify the concurrency by removing the is_to_delete() 
logic. It looks like it's an optimisation to cut short the uploading, but it's 
not necessary for correctness, since the thread needs to finish reading/writing 
the current block anyway. I.e. now we have to wait for 1 block to finish, 
without this optimisation we have to wait for 1 file.

I'd prefer to simplify the code and have this particular case be slower for now 
- query cancellation generally isn't on the critical path of queries.

I might think about this differently if we had a way to cancel 
hdfsRead()/hdfsWrite(), since then we could guarantee it returns fast, but we 
don't.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@170
PS24, Line 170: /// TmpFileRemote is a derived class of TmpFile to provide 
methods to handle a
I really like this class comment.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@274
PS24, Line 274: done_
Maybe rename to full_ or at_capacity_? I think that is more specific than 
"done_".


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@311
PS24, Line 311:   /// Function called when a TmpFileGroup is closing to remove 
the write ranges
Can you describe more precisely what this does with its input argument. E.g. 
"Removes all enqueued write ranges belonging to 'io_ctx'".


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@375
PS24, Line 375: destorying
nit: destroying


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h@101
PS24, Line 101: /// Locking:
Thanks for this.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h@162
PS24, Line 162:   /// A configuration for the control parameters of remote 
temporary directories.
Can you mention the ownership lifecycle of this? It looks like it's owned by 
TmpFileMgr and has the same lifecycle.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h@167
PS24, Line 167: tmporary
nit: temporary. I think I see a couple more instances of this typo so it's best 
to do a grep and find them all.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@100
PS24, Line 100: Should
nit: should


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@132
PS24, Line 132: const string TMP_DIR_HDFS_LOCAL_DEFAULT = 
"hdfs://localhost:20500/tmp";
Why do we need this? Outside of tests, shouldn't we be always getting the HDFS 
scratch dir path from config flags? I wouldn't expect this to be a valid path 
outside of our development environments.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@300
PS24, Line 300:       if (tmp_dirs_remote_ != nullptr) continue;
In this case, add a WARNING log that the temporary directory is being ignored.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@305
PS24, Line 305: TMP_DIR_HDFS_LOCAL_DEFAULT
why wouldn't we use tmp_path here?


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@357
PS24, Line 357:       // Later we might allow to use s3 fast upload directly 
without
nit: the can fit on one line within the 90 char limit


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@405
PS24, Line 405:     vector<bool>& is_tmp_dir_on_disk, bool& has_remote_dir, int 
disk_id) {
Our convention is to use pointers for output arguments


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@411
PS24, Line 411:       has_remote_dir = false;
The naming is confusing - in this case we still have a remote dir, but don't 
need to add a local buffer directory.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@427
PS24, Line 427:       return Status::OK();
Do we want to exit early in this case? Don't want want to still use the local 
buffer directory as a temporary directory?


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@507
PS24, Line 507:   if (tmp_file->disk_type() == io::DiskFileType::DUMMY) {
nit: just move this to the top (line 504) and make it a one-liner so that it's 
clearly not part of the main control flow of this function, i.e.

   if (tmp_file->disk_type() == io::DiskFileType::DUMMY) return Status::OK()


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@515
PS24, Line 515:   auto tmp_file_remote = static_cast<TmpFileRemote*>(tmp_file);
nit: I'd prefer to declare with the full type instead of auto.

This is a matter of taste, but I think it makes the code easier to read knowing 
the types.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@520
PS24, Line 520: boost
nit: use std::unique_lock instead of boost, we switch over a while ago. In .cc 
files that import common/names.h you can just use unique_lock since 
std::unique_lock is imported.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@524
PS24, Line 524:     if (status.ok()) {
nit: could remove one layer of nesting here, e.g.

if (status.ok() && remote_file->GetFileStatus() != 
io::DiskFileStatus::PERSISTED) {

}


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@561
PS24, Line 561:   if (tmp_file == nullptr) {
nit: I'd prefer if this was on a single line to make code denser.

  if (tmp_file == nullptr) return TMP_FILE_MGR_NO_AVAILABLE_FILE_TO_EVICT;


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@571
PS24, Line 571: Evction
nit: Eviction


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@717
PS24, Line 717:   DCHECK(local_buffer_path != nullptr);
Pass local_buffer_path by const& if it's always non-null


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@722
PS24, Line 722:     hdfs_conn_ = hdfsConnect("default", 0);
I think this connects to the wrong filesystem if hdfs_url references a 
different HDFS instance. Can't we use GetConnection() like in the S3 example 
below? I think maybe the code should be outside the if statement since 
GetConnection() works for all supported filesystems, just the arguments would 
be different (e.g. s3a_options)


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@774
PS24, Line 774:   SetToDelete();
I was a little suspicious of this because it's not signalling a conditional 
variable or doing anything else to wake up blocked threads (that's the usual 
pattern for waking up blocked threads).

My suggestion (see disk-io-mgr.cc) is to remove this logic to simplify.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@778
PS24, Line 778:   boost::unique_lock<shared_mutex> 
buffer_file_lock(*(disk_buffer_file_->GetFileLock()));
nit: mentioned this elsewhere, but drop boost:: so we're using std::unique_lock.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@789
PS24, Line 789:     } else {
nit: can merge the else and if into an "else if" and eliminate a level of 
indentation.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@807
PS24, Line 807:   // Return the file to the pool if it hasn't been enqueued.
nit: could fit this condition onto one line, we've usually favoured that 
because it's denser and you can fit more code onto a screen.

  if (to_enqueue) file_group_->tmp_file_mgr()->EnqueueTmpFilesPool(this, true);


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@906
PS24, Line 906: the
nit: extra the


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1025
PS24, Line 1025:     // It is not supposed to have a remote directory other 
than HDFS or S3.
We can remove a layer of indentation by removing the if and just having a 
DCHECK at the top:

  DCHECK(IsHdfsPath(dir.c_str(), false) || IsS3APath(dir.c_str(), false))


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1100
PS24, Line 1100:     Status status = AllocateLocalSpace(
If there's an error status here, it looks like we're dropping it. I think 
there's a bug and we need to handle the various error statuses from 
AllocateLocalSpace differently (e.g. SCRATCH_LIMIT_EXCEEDED).


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1134
PS24, Line 1134:  due to
nit: because instead of "due to"


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1194
PS24, Line 1194:     auto tmp_file = static_cast<TmpFileRemote*>(handle->file_);
nit: i'd prefer spelling out the type instead of auto, so long as it's not too 
verbose.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1466
PS24, Line 1466:   if (!tmp_file->GetWriteFile()->IsSpaceReserved()) {
Need to come back and understand this logic better.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1545
PS24, Line 1545:   if (upload_status.ok()) {
Should we log a warning if it failed?


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1731
PS24, Line 1731: Status TmpFileBufferPool::AddWriteRangesHelper(DiskFile* 
disk_file,
Maybe rename this to MoveWriteRangesHelper, since it's really moving the ranges 
from one state/data structure to another. I found the "Add" part confusing.


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1774
PS24, Line 1774:     if (range->offset() == 0) {
nit: could fit on 1 line


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1818
PS24, Line 1818:       if (tmp_file_remote->is_enqueued()) {
nit: the usual style in the codebase is to put these short conditionals on a 
single line to make the code denser


http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1841
PS24, Line 1841:   while (tmp_files_avail_pool_.empty()) {
If remote spilling is enabled, we should add a timer that tracks the time spent 
waiting for a buffer, so that we can tell if this is a bottleneck.



--
To view, visit http://gerrit.cloudera.org:8080/16318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89
Gerrit-Change-Number: 16318
Gerrit-PatchSet: 24
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 07 Jan 2021 05:59:35 +0000
Gerrit-HasComments: Yes

Reply via email to