Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13760 )
Change subject: KUDU-1938 Add support for CHAR/VARCHAR pt 1 ...................................................................... Patch Set 26: (10 comments) http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/partial_row-test.cc File src/kudu/common/partial_row-test.cc: http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/partial_row-test.cc@267 PS26, Line 267: EXPECT_OK(row.Unset(4)); Does it make sense to add some more cases in here for CHAR: 1) all 10 spaces 2) leading 10 spaces and one non-space 3) just 1 space 4) '<space><non-space>' 5) 9 spaces followed by non-space (both non-unicode and unicode non-space) 6) patterns with symbols encoded in 2 and 3 bytes as well, and mix of those ? http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/partial_row.cc File src/kudu/common/partial_row.cc: http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/partial_row.cc@195 PS26, Line 195: ; nit: drop the semicolon http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/partial_row.cc@198 PS26, Line 198: ; nit: drop the semicolon http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/schema.h File src/kudu/common/schema.h: http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/schema.h@126 PS26, Line 126: RDMBMS nit: RDMS ? http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/common/schema.h@127 PS26, Line 127: characters Maybe, explicitly say 'characters/symbols (not bytes)'. http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.h File src/kudu/util/char_util.h: http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.h@20 PS26, Line 20: #include <stddef.h> : #include <stdint.h> nit: why not to use C++ headers cstddef and cstdint ? http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.h@35 PS26, Line 35: static nit: maybe, using constexpr would be more idiomatic here? http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.h@43 PS26, Line 43: max_utf8_length Could you add description of the 'max_utf8_length' parameter into the in-line doc? http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.cc File src/kudu/util/char_util.cc: http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.cc@30 PS26, Line 30: (*str & 0xc0) != 0x80 BTW, where is the UTF-8 validation w.r.t. the number of bytes 2,3,4 as specified by byte 1 supposed to happen? Maybe, document that this function doesn't validate the input w.r.t. correctness of the UTF-8 encoding. http://gerrit.cloudera.org:8080/#/c/13760/26/src/kudu/util/char_util.cc@31 PS26, Line 31: (num_bytes) why in parentheses? -- To view, visit http://gerrit.cloudera.org:8080/13760 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I998982dba93831db91c43a97ce30d3e68c2a4a54 Gerrit-Change-Number: 13760 Gerrit-PatchSet: 26 Gerrit-Owner: Attila Bukor <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Mon, 19 Aug 2019 18:20:37 +0000 Gerrit-HasComments: Yes
