Todd Lipcon has posted comments on this change.

Change subject: KUDU-981 (part 1): validate identifiers as UTF8 with no null bytes
......................................................................
Patch Set 1:

> Forcing everyone to use UTF-8 is pragmatic, but it's also restrictive for
> users who want to use their own random encoding (granted, the Linux kernel
> policy is incompatible with many encodings). What do you think of adopting a
> similar policy for Kudu?

I think the days of non-UTF8 international encodings are starting to wane, no?

More importantly, we currently use the 'string' type in our .protos for such identifiers, and protobuf prescribes that strings should be utf8. That has specific impact on the Java APIs which expose these things as 'String' and hard-code UTF8.

So, I think in practice we are already enforcing UTF8 for any apps that use the Java client, and this is just adding server-side validation that prevents C++ users from creating tables that can't be read from Java.

--
To view, visit http://gerrit.cloudera.org:8080/5296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3b05e2882c9c2ce9b47c16450d9d54a04d3e38b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-HasComments: No
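For illustration only, here is a minimal Python sketch of the check named in the change subject (valid UTF-8, no null bytes), and of why an identifier that fails it cannot round-trip through a client that hard-codes UTF-8. This is not the Kudu patch itself: the function names `validate_identifier` and `lossy_decode` are hypothetical, and the lossy decode is only an assumption about how a UTF-8-only client might treat malformed bytes.

```python
def validate_identifier(raw: bytes) -> str:
    """Return the identifier as text, or raise ValueError if it is rejected.

    Hypothetical helper: mirrors the rule in the change subject, not Kudu's code.
    """
    if b"\x00" in raw:
        raise ValueError("identifier contains a null byte")
    try:
        # bytes.decode is strict by default, so malformed UTF-8 raises.
        return raw.decode("utf-8")
    except UnicodeDecodeError as e:
        raise ValueError(f"identifier is not valid UTF-8: {e}") from e


def lossy_decode(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for malformed bytes.

    Assumption: stands in for a client that always interprets bytes as UTF-8.
    """
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    good = "café".encode("utf-8")
    bad = "café".encode("latin-1")  # b"caf\xe9", not valid UTF-8

    print(validate_identifier(good))

    try:
        validate_identifier(bad)
    except ValueError as e:
        print("rejected:", e)

    # Without validation, the name the client sees no longer maps back
    # to the bytes that were stored.
    seen_by_client = lossy_decode(bad)
    print(seen_by_client.encode("utf-8") == bad)
```

Rejecting such names at creation time keeps every stored identifier readable by any client that treats it as a UTF-8 string.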