Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14354 )

Change subject: KUDU-1938 Make UTF-8 truncation faster pt 2
......................................................................


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14354/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14354/11//COMMIT_MSG@22
PS11, Line 22: [ RUN      ] CharUtilTest.StressTestUtf8
             : [       OK ] CharUtilTest.StressTestUtf8 (10599 ms)
> I think it's due to the fact that this way we can only fast runs of 16 ASCI
It'll be really hard to provide guidance on when to enable that optimization. 
And if the code runs in the client, it'll have to be configurable via function 
call rather than a flag.

What about reusing the existing code too? Something like this:

  if next 16 bytes are all ASCII:
    use 16 byte fast path
  else if next 8 bytes are all ASCII:
    use original 8 byte fast path
  else:
    use one byte slow path


http://gerrit.cloudera.org:8080/#/c/14354/11/src/kudu/util/char_util-test.cc
File src/kudu/util/char_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/14354/11/src/kudu/util/char_util-test.cc@97
PS11, Line 97:   Slice data;
             :
             :   data = 
"ááááááááááááááááááááááááááááááááaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
> I'm not sure what you mean.
Can you do it in one line?

  Slice data = "...";


http://gerrit.cloudera.org:8080/#/c/14354/9/src/kudu/util/char_util.cc
File src/kudu/util/char_util.cc:

PS9:
> I'm still thinking how to do this exactly. This will be used by the clients
Bear in mind that we already have CheckCPUFlags() to verify that the CPU 
supports SSE 4.2 instructions. It's called in the servers as well as in the C++ 
client.

So this really just boils down to ensuring we don't run this code in tests on 
old CPUs. Maybe it'd be good enough to call CheckCPUFlags() in 
KuduTest::KuduTest or KuduTest::Setup?



-- 
To view, visit http://gerrit.cloudera.org:8080/14354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a491157dd5c8b4815030bbda921a0afc0bafd28
Gerrit-Change-Number: 14354
Gerrit-PatchSet: 12
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 12 Nov 2019 08:49:34 +0000
Gerrit-HasComments: Yes

Reply via email to