Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14621 )


Change subject: IMPALA-8557: Add '.txt' to text files, remove '.' at end of 
filenames
......................................................................

IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames

Writes to text tables on ABFS are failing because HADOOP-15860 recently
changed the ABFS behavior when writing files / folders that end with a
'.'. ABFS explicitly does not allow files / folders that end with a dot.
From the ABFS docs: "Avoid blob names that end with a dot (.), a forward
slash (/), or a sequence or combination of the two."

The behavior prior to HADOOP-15860 was to simply drop any trailing dots
when writing files or folders, but that can lead to various issues
because clients may try to read back a file that should exist on ABFS,
but doesn't. HADOOP-15860 changed the behavior so that any attempt to
write a file or folder with a trailing dot fails on ABFS.

Impala writes all text files with a trailing dot because of how
hdfs-table-sink.cc builds filenames. The table sink always appends a
'.' followed by a "file extension" that depends on the file type. For
example, Parquet files have a file extension of ".parq". Text files had
no file extension, so only the '.' was appended and Impala would try to
write text files of the following form:
"244c5ee8ece6f759-8b1a1e3b00000000_45513034_data.0.".

This patch adds the ".txt" extension to all written text files and
modifies hdfs-table-sink.cc so that it no longer adds a trailing dot to
a filename when there is no file extension.
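
The naming rule described above can be sketched roughly as follows. This
is only an illustration, not the code in hdfs-table-sink.cc: the helper
name BuildOutputFileName and its parameters are hypothetical, and it
assumes the extension is passed without a leading dot.

```cpp
#include <string>

// Hypothetical helper (not the actual Impala code): joins a base name, a
// sequence number, and an optional file extension into an output filename.
// Assumption: 'ext' carries no leading dot, e.g. "txt" rather than ".txt".
std::string BuildOutputFileName(const std::string& base, int seq,
                                const std::string& ext) {
  std::string name = base + "." + std::to_string(seq);
  // Add the '.' separator only when an extension exists, so the resulting
  // name can never end with a trailing dot.
  if (!ext.empty()) name += "." + ext;
  return name;
}
```

With an extension of "txt", the example name above would become
"244c5ee8ece6f759-8b1a1e3b00000000_45513034_data.0.txt"; with an empty
extension the sketch stops at "..._data.0" instead of "..._data.0.".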

Testing:
* Ran core tests
* Re-ran affected ABFS tests

Change-Id: I2a9adacd45855cde86724e10f8a131e17ebf46f8
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-text-table-writer.cc
2 files changed, 5 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/14621/1
--
To view, visit http://gerrit.cloudera.org:8080/14621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2a9adacd45855cde86724e10f8a131e17ebf46f8
Gerrit-Change-Number: 14621
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <[email protected]>
