Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14197 )
Change subject: IMPALA-5092 Add support for VARCHAR in Kudu tables ...................................................................... Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/14197/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14197/12//COMMIT_MSG@12 PS12, Line 12: VARCHAR length is enforced in the Kudu client, Impala doesn't have to : enforce it further. The length of the VARCHAR is given in characters : (UTF8) so a VARCHAR(n) is stored in "n" to "4n" bytes. > Relatedly, are there any test cases that cover strings with multi byte char Yeah we need to test the hell out of this, cause it's violating an existing invariant and can affect a bunch of other code. This patch is assuming a lot of stuff is just gonna work and I don't have much confidence in that. E.g. here's a DCHECK assuming that VARCHARs are https://github.com/apache/impala/blob/b5805de/be/src/exprs/anyval-util.h#L302 I also wouldn't be surprised at all if Impala inserts casts to varchar in various places, e.g. when inserting into tables. I also have no idea whether we might be using the VARCHAR length to truncate. In general the idea of having a different definition of length in the storage engine vs the rest of query execution seems extremely messy, but I understand where it coming from so maybe we can live with it if we're clear about the behaviour in a bunch of different circumstances and have a lot of testing. -- To view, visit http://gerrit.cloudera.org:8080/14197 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d4959410fdd882bfa980cb55e8a7837c7823da8 Gerrit-Change-Number: 14197 Gerrit-PatchSet: 12 Gerrit-Owner: Attila Bukor <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tamas Mate <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Tue, 10 Mar 2020 18:30:48 +0000 Gerrit-HasComments: Yes
