Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13928 )

Change subject: KUDU-1938 Add non-copy setters to partial row pt 3
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13928/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13928/3//COMMIT_MSG@14
PS3, Line 14: to already be truncated (which it is in Impala's case) and only check
Is this a safe assumption? Last I was aware, Impala's treatment of "string" is 
not actually UTF-8, so its CHAR(8) is 8 bytes, not 8 Unicode characters. Based 
on the rest of this commit message, it sounds like we treat CHAR(8) as 8 Unicode 
characters, which might be more than 8 bytes.
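
To illustrate the byte-vs-character mismatch described above, here is a
minimal sketch. CountUtf8CodePoints is a hypothetical helper, not a Kudu or
Impala function, and it assumes the input is already valid UTF-8:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Kudu or Impala).
// Counts UTF-8 code points by counting every byte that is NOT a
// continuation byte; continuation bytes have the bit pattern 10xxxxxx.
// Assumes the input is valid UTF-8.
size_t CountUtf8CodePoints(const std::string& s) {
  size_t count = 0;
  for (unsigned char c : s) {
    if ((c & 0xC0) != 0x80) {
      ++count;
    }
  }
  return count;
}
```

A string of eight two-byte characters (for example, "é" repeated eight times)
has size() == 16 but a code point count of 8, so it would fit a limit of 8
characters while exceeding a limit of 8 bytes.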


http://gerrit.cloudera.org:8080/#/c/13928/3//COMMIT_MSG@17
PS3, Line 17: to avoid having to count each character manually.
Is the Unicode character counting not already fast-pathed for the ASCII subset 
of UTF-8? It seems like that should be a pretty easy optimization. It's still 
O(n), but it can probably handle several bytes per cycle (e.g. load 8 bytes and 
AND them with 0x8080808080808080 to check for high bits).
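
A rough sketch of the fast path suggested above, under the assumption that
the input is valid UTF-8. CountUtf8CodePointsFast is a hypothetical name, not
an existing Kudu function, and this is not necessarily how the patch or any
library implements it:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper illustrating an ASCII fast path for counting
// UTF-8 code points. Assumes the input is valid UTF-8.
size_t CountUtf8CodePointsFast(const char* data, size_t len) {
  // One high bit per byte; any set bit means a non-ASCII byte is present.
  constexpr uint64_t kHighBits = 0x8080808080808080ULL;
  size_t count = 0;
  size_t i = 0;
  while (i < len) {
    if (i + 8 <= len) {
      uint64_t word;
      // memcpy avoids unaligned-access and strict-aliasing problems.
      std::memcpy(&word, data + i, sizeof(word));
      if ((word & kHighBits) == 0) {
        // All 8 bytes are ASCII, so each one is a full code point.
        count += 8;
        i += 8;
        continue;
      }
    }
    // Slow path: count the byte unless it is a continuation byte (10xxxxxx).
    if ((static_cast<unsigned char>(data[i]) & 0xC0) != 0x80) {
      ++count;
    }
    ++i;
  }
  return count;
}
```

The high-bit test does not depend on byte order, so the same mask works on
little- and big-endian machines. For pure-ASCII input the loop stays on the
8-byte path except for a tail of fewer than 8 bytes.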



--
To view, visit http://gerrit.cloudera.org:8080/13928
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1f2aba098d649eb94e0314f6606cc33600e8d766
Gerrit-Change-Number: 13928
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Fri, 26 Jul 2019 22:52:42 +0000
Gerrit-HasComments: Yes
