Yida Wu has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/22378 )
Change subject: IMPALA-13677: Add startup cleanup for remote scratch
......................................................................

IMPALA-13677: Add startup cleanup for remote scratch

This patch introduces a new feature for cleaning up remote scratch
files during Impala daemon startup, ensuring that potential leftover
files from abnormal shutdowns are removed.

To allow efficient cleanup, this patch also refines the remote scratch
directory hierarchy by adding a host-level directory, changing it
from:
<base_dir>/<backend_id>_<query_id>/<file_name>
to:
<base_dir>/<hostname>/<backend_id>_<query_id>/<file_name>

During startup, if the host-level directory exists, it will be removed
entirely. This design assumes one Impala daemon per host.

Also added one flag remote_scratch_cleanup_on_startup to control
whether the host-level directory is cleaned during Impala daemon
startup. By default, this feature is enabled.

Tests:
Passed exhaustive tests.
Adds testcase test_scratch_dirs_remote_spill_leftover_files_removal.

Change-Id: Iadd49b7384d52bac5ddab4e86cd9f39dc2c88e1b
---
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M tests/custom_cluster/test_scratch_disk.py
5 files changed, 94 insertions(+), 14 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/22378/2
--
To view, visit http://gerrit.cloudera.org:8080/22378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadd49b7384d52bac5ddab4e86cd9f39dc2c88e1b
Gerrit-Change-Number: 22378
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
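
For illustration only, the directory layout and startup cleanup described in the commit message could be sketched roughly as follows. This is not the code from the patch (the real change lives in be/src/runtime/tmp-file-mgr.cc). The function names remote_scratch_path and cleanup_on_startup are hypothetical, and a local directory stands in for the remote scratch store. The enabled parameter mirrors the remote_scratch_cleanup_on_startup flag and defaults to on, as the message states.

```python
import shutil
import tempfile
from pathlib import Path


def remote_scratch_path(base_dir, hostname, backend_id, query_id, file_name):
    """Build <base_dir>/<hostname>/<backend_id>_<query_id>/<file_name>.

    Hypothetical helper: it only mirrors the layout named in the commit
    message and is not Impala's actual API.
    """
    return Path(base_dir) / hostname / f"{backend_id}_{query_id}" / file_name


def cleanup_on_startup(base_dir, hostname, enabled=True):
    """Remove the whole host-level directory if it exists.

    Assumes one daemon per host, as the commit message does, so everything
    under <base_dir>/<hostname> is treated as leftover from a previous run.
    Returns True if a directory was removed, False otherwise.
    """
    host_dir = Path(base_dir) / hostname
    if not enabled or not host_dir.is_dir():
        return False
    shutil.rmtree(host_dir)
    return True


if __name__ == "__main__":
    # Local temp directory used as a stand-in for the remote base directory.
    base = tempfile.mkdtemp()
    leftover = remote_scratch_path(base, "host-a", "backend1", "query1", "spill.bin")
    leftover.parent.mkdir(parents=True)
    leftover.write_bytes(b"leftover")
    print(cleanup_on_startup(base, "host-a"))
    shutil.rmtree(base)
```

Because every query directory for a host sits under one host-level directory, the cleanup is a single recursive delete and never touches directories owned by other hosts.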
