Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/16318 )
Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 ...................................................................... Patch Set 22: (63 comments) http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h File be/src/runtime/io/disk-file.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@19 PS20, Line 19: > nit: prefer #pragma once instead of include guards for new files. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@39 PS20, Line 39: /// LOCAL indicates the file is in the local filesystem. > Can you document this state machine in the class comment for DiskFile. E.g. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@48 PS20, Line 48: /// DELETED indicates the file is deleted from the filesystem. > Briefly doc what the constructors do and what the difference is. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@66 PS20, Line 66: > what does persisted mean? written to the remote scratch? The diskfile can be in the local or remote filesystem. Persisted indicates that all of the contents have been written to the file and the file is closed. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@68 PS20, Line 68: // Delete the physical file. Caller should hold the exclusive file lock. > The usual pattern is to have a pair of functions, e.g. IsPersisted() and Is Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@142 PS20, Line 142: const std::string path_; > could be const since it's set in the constructor always Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@145 PS20, Line 145: const int64_t file_size_ = 0; > could be const since it's set in the constructor always Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@152 PS20, Line 152: DiskFileStatus file_status_; > This variable is unused and can be removed. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@163 PS20, Line 163: > When is this held in shared/exclusive mode? By who? Added comment. The mutex is to protect the file from deleting, so the one who is going to delete the file need to hold an exclusive lock, for example, if it is a local buffer file and is to be evicted, the caller needs to acquire a unique lock on the mutex. In other cases, like reading or writing, shared lock would be fine. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@174 PS20, Line 174: ask the > nit:internal Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.h@177 PS20, Line 177: /// The hdfs connection used to connect to the remote scratch path. > It would be good if this checked the validity of the state transitions, i.e Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.cc File be/src/runtime/io/disk-file.cc: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.cc@25 PS20, Line 25: #include "util/spinlock.h" > nit: include common/names.h here and remove some of the std::/boost:: names Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.cc@36 PS20, Line 36: unique_lock<SpinLock> status_l(status_lock_); > What does this mean? Do we have a JIRA for this? Is there a gap in function The function is to delete a single file, but for now the remote files are deleted by removing the entire remote directory when deconstructs the TmpFileGroup, so not necessary to implement the function for a single remote file deletion. May remove the comment. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.cc@50 PS20, Line 50: > nit: the usual pattern is to common/names.h in .cc files and then drop the Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-file.cc@70 PS20, Line 70: } else { > Is this reachable? It's unclear from here whether this means that we'll cra Fixed. It is a wrong assertion in this patch. But still doesn't need to create a file writer for the remote files, because the remote files are written by RemoteOperRange::DoUpload(), instead of file writer. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-io-mgr.h@89 PS20, Line 89: /// called. > I think the class comment should mention the new class of operations in add Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-io-mgr.cc@296 PS20, Line 296: uint8 > uint8_t is the standard type, I think uint8 comes from a random library. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-io-mgr.cc@854 PS20, Line 854: uint8_t* buffer = (uint8_t*)malloc(size); > This isn't ideal but probably OK since there's only a bounded number of thr Will issue a jira to optimize this part. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/disk-io-mgr.cc@859 PS20, Line 859: size))); > We should try to handle this gracefully with an error status. Done and added a test case DiskIoMgrTest::WriteToRemoteFailMallocBlock. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/local-file-writer.h File be/src/runtime/io/local-file-writer.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/local-file-writer.h@38 PS20, Line 38: within > nit: within Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/local-file-writer.cc File be/src/runtime/io/local-file-writer.cc: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/local-file-writer.cc@91 PS20, Line 91: > nit: whitespace Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/local-file-writer.cc@91 PS20, Line 91: // for writing, instead of sharing the same file handle. > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/request-ranges.h File be/src/runtime/io/request-ranges.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/request-ranges.h@279 PS20, Line 279: /// 'disk_file' and 'disk_buffer_file' provides methods to confirm and guranttee the > Mention when these are non-null - i.e. when spilling to a remote filesystem Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/request-ranges.h@586 PS20, Line 586: > Maybe mention when this is set? Is it when doing the read? Yes, it is set in the DoRead() function. It chooses whether reading from the buffer or to read a page from the remote. Added comment. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/request-ranges.h@716 PS20, Line 716: /// block buffer. Caller must not hold 'lock_'. > DoUpload and DoFetch can be private, I think Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/scan-range.cc File be/src/runtime/io/scan-range.cc: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/scan-range.cc@272 PS20, Line 272: shared_lock<shared_mutex> local_file_lock(*(disk_buffer_file_->GetFileLock())); > It looks like there's a lock order such that the local file lock is always Added a comment in the member "lock_" of disk-file.h. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/scan-range.cc@278 PS20, Line 278: && disk_file_->is_deleted(file_lock)) { > This case probably deserve a single line comment Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/io/scan-range.cc@281 PS20, Line 281: return ReadOutcome::CANCELLED; > This bit is kind of subtle and deserves a comment. This is when the buffer Yes, if the buffer exists and is not in fetching(reading from the remote and writing back to the local buffer). Added a comment. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr-internal.h@167 PS20, Line 167: /// used to manage the status of the temporary file. For each DiskFile, there are > Can you add a brief comment describing what this class represents. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr-internal.h@178 PS20, Line 178: /// Most of the state transitions are done in the DiskIoMgr when an IO operation is done. > Can you add a brief comment describing what this class represents. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr-internal.h@269 PS20, Line 269: > Can we have a DCHECK that the state transition is valid? Same applies to an Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr-test.cc File be/src/runtime/tmp-file-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr-test.cc@1370 PS20, Line 1370: MemRange data_mem_range(reinterpret_cast<uint8_t*>(&data[0]), data.size()); > Maybe I missed it, but how does this guarantee that the data is flushed to The testcases for reading from a remote file are in DiskIoMgrTest, e.g. DiskIoMgrTest::WriteToRemoteSuccess, because it may be easier to handle the file status changing in DiskIoMgrTest(need an upload and remove the local buffer before reading from remote). The testcases in TmpFileMgrTest involing remote directory mostly work with the local buffer and test if the configuration works well with remote directories. Added comments in TmpFileMgrTest. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@93 PS20, Line 93: /// across all files. Writes that would exceed the limit fail with an error status. > Can you document the locking here. It's now pretty non-trivial, so I think Added comments about the locking of remote temporary files. May add more comments on relevant locks. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@131 PS20, Line 131: is_local_dir(is_local_dir) {} > I find the name a little odd - we use "expected_local" elsewhere to refer t Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@188 PS20, Line 188: / sha > Enqueue (yes english is a ridiculous language sometimes) Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@194 PS20, Line 194: inLoc > Dequeue Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@197 PS20, Line 197: TmpFileMgr(); > Can you document when this will return? I.e. what method or event needs to Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@275 PS20, Line 275: cker() c > temporary (it looks like this spelling mistake is in a few places, so proba Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@290 PS20, Line 290: /// filesystem typically using 4kb or smaller blocks > nit: these comments are wrapping before 90 characters in a lot of places, c Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@297 PS20, Line 297: friend class TmpFileMgrTest; > This is adding a lot of members to a class that already has a lot of member Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@313 PS20, Line 313: ctrl_.tmp_files_remo > We've usually included something in the name indicating that it's a conditi Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@432 PS20, Line 432: /// 'handle'. > This seems wrong or confusing. Spilling queries can use pages larger than d Fixed with "max_row_size" rounded to power of 2. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@473 PS20, Line 473: it is exceeded. Mus > Should this be Reserve? Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@474 PS20, Line 474: Status AllocateSpace( > I'd probably prefer to *not* have so many boolean arguments with default va Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.h@488 PS20, Line 488: us Reser > available Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@102 PS20, Line 102: DEFINE_string(remote_tmp_file_block_size, "1M", > Flag description needs to be fixed. Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@105 PS20, Line 105: DEFINE_bool(remote_tmp_files_avail_pool_lifo, false, > Do we need the read by file mode? I'm skeptical that any perf benefit is wo Removed in this patch. May implement it later when there is a concrete perf benefit. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@203 PS20, Line 203: "Invalid value of remote_tmp_file_size '$0'", FLAGS_remote_tmp_file_size)); > I would prefer just returning an error if it doesn't parse correctly or is Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@208 PS20, Line 208: MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB * 1024 * 1024; > Same here Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@224 PS20, Line 224: > nit:trimmed Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@282 PS20, Line 282: vector<bool> is_tmp_dir_on_disk(DiskInfo::num_disks(), false); > Consider combining some of the shared logic here, the control flow is not t Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@298 PS20, Line 298: scratch_subdir_path_str = TMP_DIR_HDFS_LOCAL_DEFAULT + "/" + TMP_SUB_DIR_NAME; > Modifying the vector we're iterating over is pretty confusing and I don't s Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@336 PS20, Line 336: tmp_dirs[i], metrics, is_tmp_dir_on_disk, has_remote_dir, disk_id)); > This loop and function in general is growing to be pretty big, would be goo Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@344 PS20, Line 344: > temporary Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@459 PS20, Line 459: n; > nit:Enqueue Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@465 PS20, Line 465: > We use c++-style cases, i.e. static_cast<TmpFileRemote*>. I think you shoul Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@513 PS20, Line 513: auto buffer_file = tmp_file_remote->DiskBufferFile(); > Same comment about static_cast Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@526 PS20, Line 526: } > Use braces for multi-line if statement Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@621 PS20, Line 621: RWFileOptions opts; > Needs braces Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@1178 PS20, Line 1178: handle->read_range_->Reset(nullptr, handle->write_range_->file(), > I'm surprised based on the function name that it does all this stuff - it s Removed due to removing the read_by_file. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@1308 PS20, Line 1308: > create? Done http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/runtime/tmp-file-mgr.cc@1732 PS20, Line 1732: > Looks like we don't do anything with add_status? Is that intended? Added status handle. http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/util/hdfs-util.h File be/src/util/hdfs-util.h: http://gerrit.cloudera.org:8080/#/c/16318/20/be/src/util/hdfs-util.h@54 PS20, Line 54: /// Returns true iff the path refers to a location on an HDFS filesystem. > can you add a sentence explaining what check_default_fs does? Done -- To view, visit http://gerrit.cloudera.org:8080/16318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89 Gerrit-Change-Number: 16318 Gerrit-PatchSet: 22 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Sat, 28 Nov 2020 15:39:58 +0000 Gerrit-HasComments: Yes
