Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14311 )
Change subject: IMPALA-8950: Add -d, -f options to hdfs copyFromLocal, put, cp
......................................................................

IMPALA-8950: Add -d, -f options to hdfs copyFromLocal, put, cp

Add the -d option and -f option to the following commands:

`hdfs dfs -copyFromLocal <localsrc> URI`
`hdfs dfs -put [ - | <localsrc1> .. ]. <dst>`
`hdfs dfs -cp URI [URI ...] <dest>`

The -d option "Skip[s] creation of temporary file with the suffix
._COPYING_.", which improves performance of these commands on S3 since
S3 does not support metadata-only renames.

The -f option "Overwrites the destination if it already exists".
Combined with HADOOP-13884, this reduces the S3 consistency issues seen
by avoiding a HEAD request to check whether the destination file exists.

Adds a new filesystem client to impala_test_suite.py called 'cli_client',
which is an instance of HadoopFsCommandLineClient.
HadoopFsCommandLineClient now has methods that wrap 'copyFromLocal' and
'put'. Refactored most usages of the aforementioned HDFS commands to use
the new cli_client. Some usages were not appropriate for or worth
refactoring, so in those places this patch just adds the '-d' and '-f'
options explicitly.

Testing:
* Ran core tests on HDFS and S3

Change-Id: I0d45db1c00554e6fb6bcc0b552596d86d4e30144
---
M testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_coordinators.py
M tests/custom_cluster/test_hive_parquet_timestamp_conversion.py
M tests/custom_cluster/test_parquet_max_page_header.py
M tests/custom_cluster/test_udf_concurrency.py
M tests/metadata/test_hidden_files.py
M tests/metadata/test_refresh_partition.py
M tests/metadata/test_stale_metadata.py
M tests/query_test/test_compressed_formats.py
M tests/query_test/test_hdfs_file_mods.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_multiple_filesystems.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_udfs.py
M tests/util/hdfs_util.py
18 files changed, 110 insertions(+), 66 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14311/1
--
To view, visit http://gerrit.cloudera.org:8080/14311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0d45db1c00554e6fb6bcc0b552596d86d4e30144
Gerrit-Change-Number: 14311
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <[email protected]>
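
Illustration (not part of the patch): the change description says that
HadoopFsCommandLineClient wraps 'copyFromLocal' and 'put' and passes the
-d and -f options. A minimal sketch of what such a wrapper might look
like follows. The names build_copy_command and run_copy are hypothetical
and are not taken from tests/util/hdfs_util.py; only the `hdfs dfs`
subcommands and the -d / -f flags come from the description above.

```python
import subprocess

# Subcommands named in the change description.
_SUPPORTED = ("copyFromLocal", "put", "cp")


def build_copy_command(subcommand, sources, dest,
                       skip_temp_file=True, overwrite=True):
    """Build the argv list for an `hdfs dfs` copy command.

    Hypothetical helper for illustration only. skip_temp_file adds -d
    (no ._COPYING_ temporary file), overwrite adds -f.
    """
    if subcommand not in _SUPPORTED:
        raise ValueError("unsupported subcommand: %s" % subcommand)
    argv = ["hdfs", "dfs", "-" + subcommand]
    if skip_temp_file:
        argv.append("-d")
    if overwrite:
        argv.append("-f")
    argv.extend(sources)
    argv.append(dest)
    return argv


def run_copy(subcommand, sources, dest, **options):
    """Run the command and return its exit code (assumes `hdfs` is on PATH)."""
    return subprocess.call(build_copy_command(subcommand, sources, dest, **options))
```

Keeping the argument construction separate from the subprocess call
means the flag handling can be checked without a running cluster.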
