Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/22378 )
Change subject: IMPALA-13677: Support remote scratch directory cleanup at Impala daemon startup
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/22378/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22378/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-13677: Support remote scratch directory cleanup at Impala daemon startup
> Maybe rephrase this to:
Done


http://gerrit.cloudera.org:8080/#/c/22378/3//COMMIT_MSG@21
PS3, Line 21: removed entirely.
> Maybe we should also mention other assumptions:
Done


http://gerrit.cloudera.org:8080/#/c/22378/3/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/22378/3/be/src/runtime/io/disk-io-mgr.cc@406
PS3, Line 406: VLOG(2) << "File upload succeeded. File name: " << remote_file_path;
> Should this be at INFO level? Logging at each 256M (default) file upload co
Yeah, it could be. Changed to VLOG(2), hope it helps.


http://gerrit.cloudera.org:8080/#/c/22378/2/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/22378/2/be/src/runtime/tmp-file-mgr.cc@882
PS2, Line 882: if (hdfsCreateDirectory(hdfs_conn, path_.c_str()) != 0) {
> I asked because it doesn't seem to me to be consistent with the comment abo
Thanks for this comment. I tested this with a remote hdfs. If we try to upload a file to a path like hdfs://xxxx:port/not_existing/impala-scratch, it can fail with a 'Permission denied' error. The hdfsCreateDirectory is more of a validation step here to check if the path is writable when the impala-scratch directory doesn't exist. If the path already exists, the upload process can successfully create directories and files under the impala-scratch directory without any issues.
I've updated the comment accordingly; please see if it's clearer.


http://gerrit.cloudera.org:8080/#/c/22378/3/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/22378/3/be/src/runtime/tmp-file-mgr.cc@475
PS3, Line 475: =
Yeah, it comes from FLAGS_hostname in https://github.com/apache/impala/blob/988d353e02430731a212371ad3c37310ad58a07a/be/src/runtime/exec-env.cc#L257, which is initialized from GetHostname() in https://github.com/apache/impala/blob/988d353e02430731a212371ad3c37310ad58a07a/be/src/util/network-util.cc#L57C8-L57C19. It should always be the hostname rather than the IP.



-- 
To view, visit http://gerrit.cloudera.org:8080/22378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadd49b7384d52bac5ddab4e86cd9f39dc2c88e1b
Gerrit-Change-Number: 22378
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Sat, 25 Jan 2025 12:32:17 +0000
Gerrit-HasComments: Yes
