Todd Lipcon has posted comments on this change.

Change subject: KUDU-981 (part 1): validate identifiers as UTF8 with no null 
bytes
......................................................................


Patch Set 1:

> Forcing everyone to use UTF-8 is pragmatic, but it's also restrictive for 
> users who want to use their own random encoding (granted, the Linux kernel 
> policy is incompatible with many encodings). What do you think of adopting a 
> similar policy for Kudu?

I think the days of non-UTF8 international encodings are starting to wane, no?

More importantly, we currently use the 'string' type in our .protos for such 
identifiers, and protobuf prescribes that strings be UTF-8. That has a 
specific impact on the Java APIs, which expose these identifiers as 'String' 
and hard-code UTF-8. So, I think in practice we are already enforcing UTF-8 
for any apps that use the Java client, and this change just adds server-side 
validation that prevents C++ users from creating tables that can't be read 
from Java.
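
For illustration, here is a minimal sketch of the kind of check described in 
the change subject: reject identifiers that contain a null byte or are not 
well-formed UTF-8. The function name IsValidIdentifier is hypothetical and 
this is not the code in the patch; it only assumes the identifier arrives as 
a raw std::string of bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (an assumption for illustration, not the patch's code).
// Returns true if 's' contains no NUL byte and is well-formed UTF-8.
bool IsValidIdentifier(const std::string& s) {
  const size_t n = s.size();
  size_t i = 0;
  while (i < n) {
    const uint8_t b0 = static_cast<uint8_t>(s[i]);
    if (b0 == 0x00) return false;      // embedded null byte
    if (b0 < 0x80) { ++i; continue; }  // single-byte (ASCII) character

    // Work out the sequence length from the lead byte, plus the smallest
    // code point that legitimately needs that many bytes.
    size_t len = 0;
    uint32_t cp = 0;
    uint32_t min_cp = 0;
    if ((b0 & 0xE0) == 0xC0) {
      len = 2; cp = b0 & 0x1F; min_cp = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
      len = 3; cp = b0 & 0x0F; min_cp = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
      len = 4; cp = b0 & 0x07; min_cp = 0x10000;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (n - i < len) return false;  // sequence truncated at end of input

    // Every following byte must look like 10xxxxxx.
    for (size_t k = 1; k < len; ++k) {
      const uint8_t b = static_cast<uint8_t>(s[i + k]);
      if ((b & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min_cp) return false;                   // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
    i += len;
  }
  return true;
}
```

With a check like this on the server, a C++ client passing an arbitrary byte 
string as a table name would get an error at creation time, rather than 
producing a table whose name the Java client cannot represent as a 'String'. 
Note that the sketch accepts the empty string; whether empty identifiers 
should be allowed is a separate policy question.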

-- 
To view, visit http://gerrit.cloudera.org:8080/5296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3b05e2882c9c2ce9b47c16450d9d54a04d3e38b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-HasComments: No
