Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14354 )

Change subject: KUDU-1938 Make UTF-8 truncation faster pt 2
......................................................................


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14354/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14354/11//COMMIT_MSG@22
PS11, Line 22: [ RUN      ] CharUtilTest.StressTestUtf8
             : [       OK ] CharUtilTest.StressTestUtf8 (9285 ms)
> It'll be really hard to provide guidance on when to enable that optimizatio
yeah maybe allow setting it on SetSliceCopy? I tried having a fallback fast 
path the way you proposed as well, but it was unfortunately slower in both 
cases:

8 and 16 byte fast paths (aligned):

[ RUN      ] CharUtilTest.StressTestUtf8
[       OK ] CharUtilTest.StressTestUtf8 (11317 ms)
[ RUN      ] CharUtilTest.StressTestAscii
[       OK ] CharUtilTest.StressTestAscii (777 ms)

I believe this is due to the extra branches which I suspect was the case with 
my previous attempt at making the slow path faster (instead of counting 
everything that's not the first byte of an UTF8 character, check the first ones 
only and advance the pointer to the start of the next character based on the 
first byte).

Interestingly enough I tried again now without aligning to 8/16 bytes and it 
was faster this way even though aligned seemed better with the non-SSE 
instructions:

8 byte fast path only with (unaligned):

[ RUN      ] CharUtilTest.StressTestUtf8
[       OK ] CharUtilTest.StressTestUtf8 (7485 ms)
[ RUN      ] CharUtilTest.StressTestAscii
[       OK ] CharUtilTest.StressTestAscii (1160 ms)

16 byte fast path only (unaligned):

[ RUN      ] CharUtilTest.StressTestUtf8
[       OK ] CharUtilTest.StressTestUtf8 (11150 ms)
[ RUN      ] CharUtilTest.StressTestAscii
[       OK ] CharUtilTest.StressTestAscii (754 ms)

It seems unaligned two-lane fast path wins over all other cases though:

8 and 16 byte fast paths (unaligned):

[ RUN      ] CharUtilTest.StressTestUtf8
[       OK ] CharUtilTest.StressTestUtf8 (9285 ms)
[ RUN      ] CharUtilTest.StressTestAscii
[       OK ] CharUtilTest.StressTestAscii (708 ms)


http://gerrit.cloudera.org:8080/#/c/14354/11/src/kudu/util/char_util-test.cc
File src/kudu/util/char_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/14354/11/src/kudu/util/char_util-test.cc@97
PS11, Line 97: TEST_F(CharUtilTest, CorrectnessTestUtf8AndAscii) {
             :   Slice result;
             :   Slice data = 
"ááááááááááááááááááááááááááááááááaaaaaaaaaaaaaaaaaaaaaaaaaaaa
> Can you do it in one line?
Done


http://gerrit.cloudera.org:8080/#/c/14354/9/src/kudu/util/char_util.cc
File src/kudu/util/char_util.cc:

PS9:
> Bear in mind that we already have CheckCPUFlags() to verify that the CPU su
Done



--
To view, visit http://gerrit.cloudera.org:8080/14354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a491157dd5c8b4815030bbda921a0afc0bafd28
Gerrit-Change-Number: 14354
Gerrit-PatchSet: 13
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 12 Nov 2019 12:29:43 +0000
Gerrit-HasComments: Yes

Reply via email to