Sailesh Mukil has uploaded a new patch set (#2).

Change subject: IMPALA-5383: Fix PARQUET_FILE_SIZE option for ADLS
......................................................................

IMPALA-5383: Fix PARQUET_FILE_SIZE option for ADLS

The PARQUET_FILE_SIZE query option doesn't work with ADLS because the
AdlFileSystem has no notion of block sizes. Impala depends on the
filesystem remembering the block size, which is then used as the target
Parquet file size (this is done for HDFS so that the Parquet file size
and block size match even if parquet_file_size isn't a valid block
size).

We special-case ADLS, just as we do for S3, to bypass the FileSystem
block size and instead use the requested PARQUET_FILE_SIZE as the
output partition's block_size (and consequently the Parquet file target
size).

Testing:
Re-enabled test_insert_parquet_verify_size() for ADLS.
Also fixed a miscellaneous bug with the ADLS client listing helper
function.

Change-Id: I474a913b0ff9b2709f397702b58cb1c74251c25b
---
M be/src/exec/hdfs-table-sink.cc
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
M tests/query_test/test_insert_parquet.py
M tests/util/adls_util.py
5 files changed, 17 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/7018/2
--
To view, visit http://gerrit.cloudera.org:8080/7018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I474a913b0ff9b2709f397702b58cb1c74251c25b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <[email protected]>
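
For readers without the diff at hand, the special-casing described in the commit message can be sketched roughly as below. This is an illustration only, not the code in the patch: the helper names (HasPrefix, IsS3Path, IsAdlsPath, ChooseBlockSize) and the URI scheme prefixes are assumptions made for the example, and the real logic in be/src/exec/hdfs-table-sink.cc may be structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: true if 'path' starts with 'prefix'.
bool HasPrefix(const std::string& path, const std::string& prefix) {
  return path.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical stand-ins for the path checks kept in be/src/util/hdfs-util.h.
// The scheme prefixes used here are assumptions for this sketch.
bool IsS3Path(const std::string& path) { return HasPrefix(path, "s3a://"); }
bool IsAdlsPath(const std::string& path) { return HasPrefix(path, "adl://"); }

// Picks the block size to record for an output partition.
// For S3 and ADLS the filesystem does not remember a block size, so the
// requested target size (PARQUET_FILE_SIZE) is used directly. For other
// filesystems, such as HDFS, the size reported back by the filesystem is
// used so that the Parquet file size and the block size match.
int64_t ChooseBlockSize(const std::string& path,
                        int64_t requested_target_size,
                        int64_t fs_reported_block_size) {
  // This sketch assumes the caller has already resolved a positive target.
  assert(requested_target_size > 0);
  if (IsS3Path(path) || IsAdlsPath(path)) return requested_target_size;
  return fs_reported_block_size;
}
```

Under these assumptions, a path such as adl://store/tbl/file.parq gets the requested size as its block size, while an hdfs:// path keeps whatever block size the filesystem reports.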
