Dan Hecht has posted comments on this change.

Change subject: IMPALA-2069: add USE_UTF8_PARQUET_STRINGS query option
......................................................................
Patch Set 1:

> Oh yea sorry, was working on the resolution patch and got confused
> :)
>
> In practice it's possible no one would notice if we changed the
> default, but technically it could affect someone's workload. e.g. I
> think this determines what kind of object spark creates when it
> reads the file, and technically Impala doesn't know it's writing
> UTF8 strings, i.e. you could be writing raw binary data or a
> different encoding. At the very least I think we should wait until
> C6 to change the default.

Okay. I think it would be worth having a short comment explaining why this is off by default in the code. And then if we think it should be changed when we can break compatibility, please file a JIRA with targetversion = Impala 3.0 to remember to do it.

--
To view, visit http://gerrit.cloudera.org:8080/2531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I030c9f5c6272e09c1ce133f66234e3cfb26b68d4
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-HasComments: No
