Hello Jason Fehr, Michael Smith, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/20386
to look at the new patch set (#13).
Change subject: IMPALA-12389: Use -skipTrash to avoid accumulating trash
......................................................................
IMPALA-12389: Use -skipTrash to avoid accumulating trash
By default, deleting files on Hadoop moves them to a trash
folder. The trash can be aged out, but Impala's developer
environment configures the trash to live a long time. As a
result, the trash contents keep accumulating and consume
disk space.
This combines multiple changes to avoid accumulating trash:
1. This changes HadoopFsCommandLineClient's delete_file_dir
to use -skipTrash, so files deleted through this client are
removed immediately instead of being moved to the trash.
This helps on non-HDFS test environments.
2. This changes the unique_database fixture to delete the
database directory before dropping the database. Non-external
tables deleted as part of DROP DATABASE .. CASCADE are
moved to the trash. Deleting the database directory ourselves
avoids sending these files to the trash.
3. "hdfs dfs -expunge -immediate" can recover the disk space, but
it is very slow. This increases dfs.block.invalidate.limit
to allow HDFS to delete more blocks in a single heartbeat.
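As a rough illustration of items 1 and 2, the sketch below builds a
recursive delete command that passes -skipTrash and shows the teardown
ordering (delete the database directory, then drop the database). The
function names build_delete_command and cleanup_database, and the
injected delete_dir / run_sql callables, are hypothetical and are not
taken from the patch; only the "hadoop fs -rm -r -skipTrash" flags are
standard Hadoop shell options.

```python
def build_delete_command(path, skip_trash=True):
    """Return the argv list for a recursive Hadoop delete of `path`.

    Hypothetical helper for illustration; the real client in
    tests/util/hdfs_util.py may assemble its command differently.
    """
    cmd = ["hadoop", "fs", "-rm", "-r"]
    if skip_trash:
        # -skipTrash removes the files immediately instead of moving
        # them into the user's trash folder.
        cmd.append("-skipTrash")
    cmd.append(path)
    return cmd


def cleanup_database(db_name, db_path, delete_dir, run_sql):
    """Teardown ordering described in item 2 (assumed shape).

    `delete_dir` and `run_sql` are caller-supplied callables, so this
    sketch does not depend on a running cluster.
    """
    # Remove the files ourselves first, so DROP DATABASE .. CASCADE
    # has nothing left to move to the trash.
    delete_dir(db_path)
    run_sql("DROP DATABASE IF EXISTS `{0}` CASCADE".format(db_name))
```

Injecting the two callables keeps the ordering testable without a
filesystem or an Impala connection.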
To support this change, there were other test-only changes:
- This updates a few tests that placed tables outside of the
unique_database. In particular, Iceberg tests using
create_iceberg_table_from_directory() were putting tables
outside the database.
- TestHdfsEncryption and TestHdfsPermissions used WebHDFS-style
paths without the leading slash. This is harmless, but this
cleans them up to use normal paths. This is safe, because the
delegating client converts them internally when using the
WebHDFS client.
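The path conversion mentioned above could look roughly like the
following. This is a minimal sketch under the assumption that the
WebHDFS-style form is simply the normal path with its leading slash
removed; to_webhdfs_path and to_normal_path are hypothetical names, not
functions from the Impala test utilities.

```python
def to_webhdfs_path(path):
    """Convert a normal path to the slash-less WebHDFS-style form.

    Hypothetical helper: assumes the only difference is the leading
    slash, as described in the commit message.
    """
    return path.lstrip("/")


def to_normal_path(path):
    """Convert either form to a normal absolute path."""
    return "/" + path.lstrip("/")
```

Both helpers accept either form, so a test can always pass a normal
path and leave the conversion to the client layer.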
Testing:
- Ran tests locally and examined the trash directory
Change-Id: I2d304113596aaf70a122202a33276fc7c3d599e8
---
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M tests/common/file_utils.py
M tests/conftest.py
M tests/metadata/test_hdfs_encryption.py
M tests/metadata/test_hdfs_permissions.py
M tests/query_test/test_scanners.py
M tests/query_test/test_udfs.py
M tests/util/hdfs_util.py
M tests/util/iceberg_metadata_util.py
9 files changed, 35 insertions(+), 25 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20386/13
--
To view, visit http://gerrit.cloudera.org:8080/20386
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d304113596aaf70a122202a33276fc7c3d599e8
Gerrit-Change-Number: 20386
Gerrit-PatchSet: 13
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>