Yida Wu has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/22378 )

Change subject: IMPALA-13677: Add startup cleanup for remote scratch
......................................................................

IMPALA-13677: Add startup cleanup for remote scratch

This patch introduces a new feature for cleaning up remote scratch
files during Impala daemon startup, ensuring that potential leftover
files from abnormal shutdowns are removed.

To allow efficient cleanup, this patch also refines the remote
scratch directory hierarchy by adding a host-level directory,
changing it from:
<base_dir>/<backend_id>_<query_id>/<file_name>
to:
<base_dir>/<hostname>/<backend_id>_<query_id>/<file_name>
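
As an illustration only, the new layout can be sketched in Python. The
function names below are hypothetical and are not taken from the patch,
whose implementation is in be/src/runtime/tmp-file-mgr.cc.

```python
import posixpath


def host_level_dir(base_dir, hostname):
    """Return <base_dir>/<hostname>, the directory owned by a single host."""
    return posixpath.join(base_dir, hostname)


def remote_scratch_path(base_dir, hostname, backend_id, query_id, file_name):
    """Return <base_dir>/<hostname>/<backend_id>_<query_id>/<file_name>."""
    # Hypothetical helper: it only mirrors the path template described above.
    query_dir = "{}_{}".format(backend_id, query_id)
    return posixpath.join(host_level_dir(base_dir, hostname), query_dir, file_name)
```

Because every file written by a host sits under its host-level directory,
cleanup needs only one recursive delete of that directory and does not
have to enumerate per-query directories.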

During startup, if the host-level directory exists, it will be
removed entirely. This design assumes one Impala daemon per host.

This patch also adds a flag, remote_scratch_cleanup_on_startup, that
controls whether the host-level directory is cleaned during Impala
daemon startup. The feature is enabled by default.
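
The startup behavior described above can be sketched roughly as follows.
This is a simplified illustration, not the patch's code: it uses a local
directory in place of remote storage, and the function and parameter
names are hypothetical (cleanup_enabled stands in for the
remote_scratch_cleanup_on_startup flag).

```python
import os
import shutil


def cleanup_host_scratch_on_startup(base_dir, hostname, cleanup_enabled=True):
    """Remove <base_dir>/<hostname> if it exists and cleanup is enabled.

    Returns True if the host-level directory was removed, False otherwise.
    Assumes one daemon per host, so nothing else is using the directory.
    """
    host_dir = os.path.join(base_dir, hostname)
    if not cleanup_enabled:
        # Flag disabled: leave any leftover files in place.
        return False
    if not os.path.isdir(host_dir):
        # Nothing left over from a previous run.
        return False
    # Remove the whole host-level directory, including every
    # <backend_id>_<query_id> subdirectory beneath it.
    shutil.rmtree(host_dir)
    return True
```

Directories belonging to other hosts under the same base directory are
left untouched, since only this host's directory is removed.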

Tests:
Passed exhaustive tests.
Added test case test_scratch_dirs_remote_spill_leftover_files_removal.

Change-Id: Iadd49b7384d52bac5ddab4e86cd9f39dc2c88e1b
---
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M tests/custom_cluster/test_scratch_disk.py
5 files changed, 94 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/22378/2
--
To view, visit http://gerrit.cloudera.org:8080/22378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadd49b7384d52bac5ddab4e86cd9f39dc2c88e1b
Gerrit-Change-Number: 22378
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
