Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17129 )

Change subject: KUDU-3248: Match C++ replica selection behavior of Java client
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17129/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17129/3//COMMIT_MSG@15
PS3, Line 15: This is expected to be a good balance
> How do we know?  What if a single process does a lot of full table scans?
At that point the single process is itself not distributed in nature, and if 
it is doing multiple full scans of the same tablet I would expect picking the 
already warm tablet replica to be a good choice.

IMPALA-10481 describes why our current choice is suboptimal, and this change 
aims to address that concern. We also plan to performance test and validate 
this change on the Impala side.

The one case I could see where a single process might want to use multiple 
tablet replicas is when the tablet splitting feature is used within the same 
process, though there hasn't been much performance evaluation of that yet. 
IMPALA-9792 is still open, tracking that evaluation from the Impala side 
(which is the only known usage today).


http://gerrit.cloudera.org:8080/#/c/17129/3/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/17129/3/src/kudu/client/client-internal.cc@191
PS3, Line 191: still benefit from spreading the load across replicas
> I meant we already have a recommendation to have just a single client per a
I didn't have a good reason to suspect that per-client would necessarily be a 
better choice than per-process, especially considering we often see a single 
client per process anyway.

The benefit of per-process is that it matches the current Java behavior, and 
it means that multiple clients can be used without impacting the selection 
affinity on a given process. I could see a world where both are beneficial, 
but I didn't want to overcomplicate the change.

I think the "ultimate" fix would be to allow the user to provide their own 
"seed" for replica selection randomness. I think that could be a useful 
follow-on change to track with a JIRA.
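
As a strawman for that follow-on, a caller-supplied seed might look roughly 
like the sketch below. SeededReplicaSelector is a hypothetical class for 
illustration only; no such API exists in Kudu today, and the hash-and-mix 
scheme is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical class (not part of Kudu): the caller chooses the seed, so it
// can decide whether clients, processes, or whole jobs share an affinity.
class SeededReplicaSelector {
 public:
  explicit SeededReplicaSelector(uint64_t seed) : seed_(seed) {}

  // Returns an index in [0, num_replicas). Assumes num_replicas > 0.
  size_t Select(const std::string& tablet_id, size_t num_replicas) const {
    const uint64_t h = std::hash<std::string>{}(tablet_id) ^ seed_;
    return static_cast<size_t>(h % num_replicas);
  }

 private:
  uint64_t seed_;
};
```

Two selectors built with the same seed agree on every tablet, so an 
application wanting per-client affinity could pass a distinct seed per client, 
and one wanting per-process affinity could pass a single shared seed.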



--
To view, visit http://gerrit.cloudera.org:8080/17129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa55e88b4a222fabfaa7fa521c24482cc6816b04
Gerrit-Change-Number: 17129
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 26 Feb 2021 20:09:00 +0000
Gerrit-HasComments: Yes
