Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14354 )
Change subject: KUDU-1938 Make UTF-8 truncation faster pt 2 ...................................................................... Patch Set 12: (3 comments) http://gerrit.cloudera.org:8080/#/c/14354/11//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14354/11//COMMIT_MSG@22 PS11, Line 22: [ RUN ] CharUtilTest.StressTestUtf8 : [ OK ] CharUtilTest.StressTestUtf8 (10599 ms) > I think it's due to the fact that this way we can only fast runs of 16 ASCI It'll be really hard to provide guidance on when to enable that optimization. And if the code runs in the client, it'll have to be configurable via function call rather than a flag. What about reusing the existing code too? Something like this: if next 16 bytes are all ASCII: use 16 byte fast path else if next 8 bytes are all ASCII: use original 8 byte fast path else: use one byte slow path http://gerrit.cloudera.org:8080/#/c/14354/11/src/kudu/util/char_util-test.cc File src/kudu/util/char_util-test.cc: http://gerrit.cloudera.org:8080/#/c/14354/11/src/kudu/util/char_util-test.cc@97 PS11, Line 97: Slice data; : : data = "ááááááááááááááááááááááááááááááááaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"; > I'm not sure what you mean. Can you do it in one line? Slice data = "..."; http://gerrit.cloudera.org:8080/#/c/14354/9/src/kudu/util/char_util.cc File src/kudu/util/char_util.cc: PS9: > I'm still thinking how to do this exactly. This will be used by the clients Bear in mind that we already have CheckCPUFlags() to verify that the CPU supports SSE 4.2 instructions. It's called in the servers as well as in the C++ client. So this really just boils down to ensuring we don't run this code in tests on old CPUs. Maybe it'd be good enough to call CheckCPUFlags() in KuduTest::KuduTest or KuduTest::Setup? -- To view, visit http://gerrit.cloudera.org:8080/14354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a491157dd5c8b4815030bbda921a0afc0bafd28 Gerrit-Change-Number: 14354 Gerrit-PatchSet: 12 Gerrit-Owner: Attila Bukor <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 12 Nov 2019 08:49:34 +0000 Gerrit-HasComments: Yes
