Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/14353 )
Change subject: [WIP] KUDU-1938 Performance tuning
......................................................................

Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14353/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14353/1//COMMIT_MSG@41
PS1, Line 41: [ OK ] CharUtilTest.StressTestAscii (902 ms)
> It looks like this patch makes ascii handling ~3x faster and utf8 is the sa

yep, basically it adds a fast-path for chunks that contain only ASCII characters in which case we don't need to examine each byte individually.

http://gerrit.cloudera.org:8080/#/c/14353/1/src/kudu/util/char_util.cc
File src/kudu/util/char_util.cc:

http://gerrit.cloudera.org:8080/#/c/14353/1/src/kudu/util/char_util.cc@24
PS1, Line 24: Slice UTF8Truncate(Slice val, size_t max_utf8_length) {
> Was this copied from a reference implementation somewhere or written from s

This was written from scratch. Todd's comment on https://gerrit.cloudera.org/c/13928/3//COMMIT_MSG#17 gave me the idea, then I brainstormed with two friends (Istvan Farmosi & Zoltan Chovan) and we came up with this optimization.

--
To view, visit http://gerrit.cloudera.org:8080/14353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iebb98e18a3619029d9b0bc224c7dead89a3d7374
Gerrit-Change-Number: 14353
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 03 Oct 2019 06:59:00 +0000
Gerrit-HasComments: Yes
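
For readers following the thread, the ASCII fast path described in the first reply can be illustrated roughly as below. This is only a sketch of the general idea, not the code in the patch: the function name Utf8PrefixBytes, the 8-byte chunk size, and the assumption that the input is valid UTF-8 are all hypothetical choices made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the Kudu implementation): returns how many bytes
// of `data` hold its first `max_chars` UTF-8 code points.
// Assumes `data` is valid UTF-8 and starts on a code point boundary.
size_t Utf8PrefixBytes(const char* data, size_t size, size_t max_chars) {
  // One high bit per byte; a byte with this bit clear is plain ASCII.
  constexpr uint64_t kHighBits = 0x8080808080808080ULL;
  size_t pos = 0;
  size_t chars = 0;
  while (pos < size && chars < max_chars) {
    // Fast path: if the next 8 bytes are all ASCII, they are exactly
    // 8 characters, so no per-byte inspection is needed.
    if (size - pos >= 8 && max_chars - chars >= 8) {
      uint64_t chunk;
      std::memcpy(&chunk, data + pos, sizeof(chunk));  // avoids unaligned reads
      if ((chunk & kHighBits) == 0) {
        pos += 8;
        chars += 8;
        continue;
      }
    }
    // Slow path: consume one code point, i.e. a lead byte followed by any
    // continuation bytes (bit pattern 10xxxxxx).
    ++pos;
    while (pos < size &&
           (static_cast<unsigned char>(data[pos]) & 0xC0) == 0x80) {
      ++pos;
    }
    ++chars;
  }
  return pos;
}
```

A truncation routine could then keep only the first Utf8PrefixBytes(...) bytes of the value. The high-bit mask test is independent of byte order, so the sketch does not rely on endianness.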
