Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20386 )
Change subject: IMPALA-12389: Use -skipTrash to avoid accumulating trash
......................................................................

IMPALA-12389: Use -skipTrash to avoid accumulating trash

The default behavior for deleting files on Hadoop is to move them
to a trash folder. The trash folder can be aged out, but Impala's
developer environment sets the trash to live a long time. This is
a problem, because the trash contents will continue to accumulate.

This combines multiple changes to avoid accumulating trash:
1. This changes HadoopFsCommandLineClient's delete_file_dir to
   use -skipTrash to avoid accumulating the trash for this case.
   This helps on non-HDFS test environments.
2. This changes the unique_database fixture to delete the
   database directory before dropping the database. Non-external
   tables deleted as part of DROP DATABASE .. CASCADE are moved to
   the trash. Deleting the database directory ourselves avoids
   sending these files to the trash.
3. "hdfs dfs -expunge -immediate" can recover the disk space, but
   it is very slow. This increases the dfs.block.invalidate.limit
   to allow HDFS to delete more blocks in a single heartbeat.

To support this change, there were other test-only changes:
 - This updates a few tests that placed tables outside of the
   unique_database. In particular, Iceberg tests using
   create_iceberg_table_from_directory() were putting tables
   outside the database.
 - TestHdfsEncryption and TestHdfsPermissions used WebHDFS-style
   paths without the leading slash. This is harmless, but this
   cleans them up to use normal paths. This is safe, because the
   delegating client converts it internally when using the
   WebHDFS client.
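
For illustration only, item 1 can be sketched as follows. This is a
minimal sketch, not the code from tests/util/hdfs_util.py: it assumes
the client shells out to "hadoop fs", and the helper name
build_delete_command and the standalone delete_file_dir function are
hypothetical. Only the -rm, -r and -skipTrash options of the Hadoop
filesystem shell are relied on.

```python
import subprocess


def build_delete_command(path, recursive=False, skip_trash=True):
    """Return the argv for deleting `path` with the Hadoop filesystem shell.

    Hypothetical helper for illustration; not part of the Impala test code.
    """
    cmd = ["hadoop", "fs", "-rm"]
    if recursive:
        cmd.append("-r")
    if skip_trash:
        # -skipTrash deletes the files immediately instead of moving
        # them to the trash folder, so nothing accumulates there.
        cmd.append("-skipTrash")
    cmd.append(path)
    return cmd


def delete_file_dir(path, recursive=False):
    """Run the delete and return True when the command exits with status 0.

    Assumes a `hadoop` executable is available on PATH.
    """
    result = subprocess.run(build_delete_command(path, recursive),
                            capture_output=True, text=True)
    return result.returncode == 0
```

Building the argument list in a separate function keeps the flag
handling checkable without a running cluster; only delete_file_dir
needs a Hadoop installation.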

Testing:
 - Ran tests locally and examined the trash directory

Change-Id: I2d304113596aaf70a122202a33276fc7c3d599e8
Reviewed-on: http://gerrit.cloudera.org:8080/20386
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
---
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M tests/common/file_utils.py
M tests/conftest.py
M tests/metadata/test_hdfs_encryption.py
M tests/metadata/test_hdfs_permissions.py
M tests/query_test/test_scanners.py
M tests/query_test/test_udfs.py
M tests/util/hdfs_util.py
M tests/util/iceberg_metadata_util.py
9 files changed, 35 insertions(+), 25 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Smith: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/20386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2d304113596aaf70a122202a33276fc7c3d599e8
Gerrit-Change-Number: 20386
Gerrit-PatchSet: 14
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
