Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14311 )


Change subject: IMPALA-8950: Add -d, -f options to hdfs copyFromLocal, put, cp
......................................................................

IMPALA-8950: Add -d, -f options to hdfs copyFromLocal, put, cp

Add the -d option and -f option to the following commands:

`hdfs dfs -copyFromLocal <localsrc> URI`
`hdfs dfs -put [ - | <localsrc1> .. ]. <dst>`
`hdfs dfs -cp URI [URI ...] <dest>`

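The -d option "Skip[s] creation of temporary file with the suffix
._COPYING_.", which improves performance of these commands on S3. Without
-d, the data is written to a temporary file and then renamed, and since S3
does not support metadata-only renames, that rename is a full copy.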

The -f option "Overwrites the destination if it already exists". Combined
with HADOOP-13884, this mitigates S3 consistency issues by avoiding a HEAD
request to check whether the destination file exists.

Adds a new filesystem client to impala_test_suite.py called 'cli_client',
which is an instance of HadoopFsCommandLineClient.
HadoopFsCommandLineClient now has methods that wrap 'copyFromLocal' and
'put'. Refactored most usages of these HDFS commands to use the new
cli_client. Some usages were not appropriate for, or not worth,
refactoring; in those cases this patch just adds the '-d' and '-f'
options explicitly.
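
For readers unfamiliar with the pattern, here is a minimal sketch of how a
command-line wrapper could attach '-d' and '-f' to these commands. It is not
the code from this patch: the function names build_copy_command and
run_copy_command and their parameters are hypothetical, and the real
HadoopFsCommandLineClient in tests/util/hdfs_util.py may be structured
differently. Only the 'hdfs dfs' subcommands and the two flags come from the
description above.

```python
import subprocess


# Hypothetical helper, not taken from the patch: builds the argv list for one
# of the copy-style 'hdfs dfs' subcommands named in the commit message.
def build_copy_command(subcommand, sources, dest,
                       skip_tmp_file=True, overwrite=True):
    if subcommand not in ("copyFromLocal", "put", "cp"):
        raise ValueError("unsupported subcommand: %s" % subcommand)
    argv = ["hdfs", "dfs", "-" + subcommand]
    if skip_tmp_file:
        argv.append("-d")  # skip creation of the ._COPYING_ temporary file
    if overwrite:
        argv.append("-f")  # overwrite the destination if it already exists
    argv.extend(sources)
    argv.append(dest)
    return argv


# Hypothetical runner: executes the command and reports whether it succeeded.
# Requires an 'hdfs' binary on the PATH, so it is not exercised here.
def run_copy_command(argv):
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    return proc.returncode == 0, stdout, stderr
```

Keeping command construction separate from execution means the flag handling
can be checked without a running cluster, for example by comparing the result
of build_copy_command("put", ["/tmp/f"], "/dst") against the expected list.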

Testing:
* Ran core tests on HDFS and S3

Change-Id: I0d45db1c00554e6fb6bcc0b552596d86d4e30144
---
M testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_coordinators.py
M tests/custom_cluster/test_hive_parquet_timestamp_conversion.py
M tests/custom_cluster/test_parquet_max_page_header.py
M tests/custom_cluster/test_udf_concurrency.py
M tests/metadata/test_hidden_files.py
M tests/metadata/test_refresh_partition.py
M tests/metadata/test_stale_metadata.py
M tests/query_test/test_compressed_formats.py
M tests/query_test/test_hdfs_file_mods.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_multiple_filesystems.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_udfs.py
M tests/util/hdfs_util.py
18 files changed, 110 insertions(+), 66 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14311/1
--
To view, visit http://gerrit.cloudera.org:8080/14311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0d45db1c00554e6fb6bcc0b552596d86d4e30144
Gerrit-Change-Number: 14311
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <[email protected]>
