Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16318 )
Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 ...................................................................... Patch Set 24: (39 comments) Still going through some of the finer details but I thought I'd push out another round of comments. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/io/disk-io-mgr.cc@339 PS24, Line 339: if (disk_file_src_->is_to_delete()) { I think we should simplify the concurrency by removing the is_to_delete() logic. It looks like it's an optimisation to cut short the uploading, but it's not necessary for correctness, since the thread needs to finish reading/writing the current block anyway. I.e. now we have to wait for 1 block to finish, without this optimisation we have to wait for 1 file. I'd prefer to simplify the code and have this particular case be slower for now - query cancellation generally isn't on the critical path of queries. I might think about this differently if we had a way to cancel hdfsRead()/hdfsWrite(), since then we could guarantee it returns fast, but we don't. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@170 PS24, Line 170: /// TmpFileRemote is a derived class of TmpFile to provide methods to handle a I really like this class comment. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@274 PS24, Line 274: done_ Maybe rename to full_ or at_capacity_? I think that is more specific than "done_". http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@311 PS24, Line 311: /// Function called when a TmpFileGroup is closing to remove the write ranges Can you describe more precisely what this does with its input argument. E.g. "Removes all enqueued write ranges belonging to 'io_ctx'". http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@375 PS24, Line 375: destorying nit: destroying http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h@101 PS24, Line 101: /// Locking: Thanks for this. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h@162 PS24, Line 162: /// A configuration for the control parameters of remote temporary directories. Can you mention the ownership lifecycle of this? It looks like it's owned by TmpFileMgr and has the same lifecycle. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.h@167 PS24, Line 167: tmporary nit: temporary. I think I see a couple more instances of this typo so it's best to do a grep and find them all. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@100 PS24, Line 100: Should nit: should http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@132 PS24, Line 132: const string TMP_DIR_HDFS_LOCAL_DEFAULT = "hdfs://localhost:20500/tmp"; Why do we need this? Outside of tests, shouldn't we be always getting the HDFS scratch dir path from config flags? I wouldn't expect this to be a valid path outside of our development environments. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@300 PS24, Line 300: if (tmp_dirs_remote_ != nullptr) continue; In this case, add a WARNING log that the temporary directory is being ignored. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@305 PS24, Line 305: TMP_DIR_HDFS_LOCAL_DEFAULT why wouldn't we use tmp_path here? http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@357 PS24, Line 357: // Later we might allow to use s3 fast upload directly without nit: the can fit on one line within the 90 char limit http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@405 PS24, Line 405: vector<bool>& is_tmp_dir_on_disk, bool& has_remote_dir, int disk_id) { Our convention is to use pointers for output arguments http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@411 PS24, Line 411: has_remote_dir = false; The naming is confusing - in this case we still have a remote dir, but don't need to add a local buffer directory. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@427 PS24, Line 427: return Status::OK(); Do we want to exit early in this case? Don't want want to still use the local buffer directory as a temporary directory? http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@507 PS24, Line 507: if (tmp_file->disk_type() == io::DiskFileType::DUMMY) { nit: just move this to the top (line 504) and make it a one-liner so that it's clearly not part of the main control flow of this function, i.e. if (tmp_file->disk_type() == io::DiskFileType::DUMMY) return Status::OK() http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@515 PS24, Line 515: auto tmp_file_remote = static_cast<TmpFileRemote*>(tmp_file); nit: I'd prefer to declare with the full type instead of auto. This is a matter of taste, but I think it makes the code easier to read knowing the types. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@520 PS24, Line 520: boost nit: use std::unique_lock instead of boost, we switch over a while ago. In .cc files that import common/names.h you can just use unique_lock since std::unique_lock is imported. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@524 PS24, Line 524: if (status.ok()) { nit: could remove one layer of nesting here, e.g. if (status.ok() && remote_file->GetFileStatus() != io::DiskFileStatus::PERSISTED) { } http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@561 PS24, Line 561: if (tmp_file == nullptr) { nit: I'd prefer if this was on a single line to make code denser. if (tmp_file == nullptr) return TMP_FILE_MGR_NO_AVAILABLE_FILE_TO_EVICT; http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@571 PS24, Line 571: Evction nit: Eviction http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@717 PS24, Line 717: DCHECK(local_buffer_path != nullptr); Pass local_buffer_path by const& if it's always non-null http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@722 PS24, Line 722: hdfs_conn_ = hdfsConnect("default", 0); I think this connects to the wrong filesystem if hdfs_url references a different HDFS instance. Can't we use GetConnection() like in the S3 example below? I think maybe the code should be outside the if statement since GetConnection() works for all supported filesystems, just the arguments would be different (e.g. s3a_options) http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@774 PS24, Line 774: SetToDelete(); I was a little suspicious of this because it's not signalling a conditional variable or doing anything else to wake up blocked threads (that's the usual pattern for waking up blocked threads). My suggestion (see disk-io-mgr.cc) is to remove this logic to simplify. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@778 PS24, Line 778: boost::unique_lock<shared_mutex> buffer_file_lock(*(disk_buffer_file_->GetFileLock())); nit: mentioned this elsewhere, but drop boost:: so we're using std::unique_lock. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@789 PS24, Line 789: } else { nit: can merge the else and if into an "else if" and eliminate a level of indentation. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@807 PS24, Line 807: // Return the file to the pool if it hasn't been enqueued. nit: could fit this condition onto one line, we've usually favoured that because it's denser and you can fit more code onto a screen. if (to_enqueue) file_group_->tmp_file_mgr()->EnqueueTmpFilesPool(this, true); http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@906 PS24, Line 906: the nit: extra the http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1025 PS24, Line 1025: // It is not supposed to have a remote directory other than HDFS or S3. We can remove a layer of indentation by removing the if and just having a DCHECK at the top: DCHECK(IsHdfsPath(dir.c_str(), false) || IsS3APath(dir.c_str(), false)) http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1100 PS24, Line 1100: Status status = AllocateLocalSpace( If there's an error status here, it looks like we're dropping it. I think there's a bug and we need to handle the various error statuses from AllocateLocalSpace differently (e.g. SCRATCH_LIMIT_EXCEEDED). http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1134 PS24, Line 1134: due to nit: because instead of "due to" http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1194 PS24, Line 1194: auto tmp_file = static_cast<TmpFileRemote*>(handle->file_); nit: i'd prefer spelling out the type instead of auto, so long as it's not too verbose. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1466 PS24, Line 1466: if (!tmp_file->GetWriteFile()->IsSpaceReserved()) { Need to come back and understand this logic better. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1545 PS24, Line 1545: if (upload_status.ok()) { Should we log a warning if it failed? http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1731 PS24, Line 1731: Status TmpFileBufferPool::AddWriteRangesHelper(DiskFile* disk_file, Maybe rename this to MoveWriteRangesHelper, since it's really moving the ranges from one state/data structure to another. I found the "Add" part confusing. http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1774 PS24, Line 1774: if (range->offset() == 0) { nit: could fit on 1 line http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1818 PS24, Line 1818: if (tmp_file_remote->is_enqueued()) { nit: the usual style in the codebase is to put these short conditionals on a single line to make the code denser http://gerrit.cloudera.org:8080/#/c/16318/24/be/src/runtime/tmp-file-mgr.cc@1841 PS24, Line 1841: while (tmp_files_avail_pool_.empty()) { If remote spilling is enabled, we should add a timer that tracks the time spent waiting for a buffer, so that we can tell if this is a bottleneck. -- To view, visit http://gerrit.cloudera.org:8080/16318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89 Gerrit-Change-Number: 16318 Gerrit-PatchSet: 24 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Thu, 07 Jan 2021 05:59:35 +0000 Gerrit-HasComments: Yes
