Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2700: ASCII NUL characters are doubled on insert into 
text tables
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3703/5/be/src/exec/hdfs-text-table-writer.cc
File be/src/exec/hdfs-text-table-writer.cc:

PS5, Line 208:     if (escape_char_ == '\0') {
             :       rowbatch_stringstream_ << str_val->ptr[i];
> not sure about how this works.. if escape_char_ is '\0' but field_delim_ is
We're replicating Hive's behaviour: an escape character of '\0' means that 
escaping is disabled. The read path already implements this.

This does mean that if data containing the escape char is inserted, we won't 
read back the same results.
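
As a rough illustration of the write-path rule, here is a minimal standalone 
sketch. It is not the code from this patch: EscapeField is a hypothetical 
helper, and the assumption that only the field delimiter and the escape char 
itself get escaped is mine. The point is the '\0' check, without which every 
NUL byte in the data would match the escape char and be doubled.

```cpp
#include <string>

// Hypothetical helper for illustration only; not part of Impala.
// Copies 'val' into the result, prefixing special characters with
// 'escape_char'. An 'escape_char' of '\0' is treated as "escaping
// disabled", so the value is copied through byte for byte.
std::string EscapeField(const std::string& val, char field_delim,
                        char escape_char) {
  std::string out;
  out.reserve(val.size());
  for (char c : val) {
    // Assumption: only the field delimiter and the escape char are escaped.
    const bool needs_escape =
        escape_char != '\0' && (c == field_delim || c == escape_char);
    if (needs_escape) out.push_back(escape_char);
    out.push_back(c);
  }
  return out;
}
```

With escape_char set to '\0', a value such as "a,b" is written unchanged, so 
a reader splitting on ',' sees two fields rather than one. That is the same 
kind of round-trip loss as the one described above.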


-- 
To view, visit http://gerrit.cloudera.org:8080/3703
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia30fa314d1ee1e99f9e7598466eb1570ca7940fc
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: anujphadke <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: anujphadke <[email protected]>
Gerrit-HasComments: Yes
