Hello Will Berkeley, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12158

to look at the new patch set (#8).

Change subject: KUDU-2348: In the Java client, pick a random replica when no replica is local
......................................................................

KUDU-2348: In the Java client, pick a random replica when no replica is local

In RemoteTablet.java, 'getClosestServerInfo' always returns the last server
in hashmap iteration order, which may concentrate load on one server,
especially when the number of tablet servers is small.

This commit uses a more random approach to select the server.

The current situation is: when the client chooses a replica to scan,
and there are no local replicas, it chooses whichever server ends up
last in the map iteration order, every time. So the choice is determined
by the set of UUIDs of the tablet servers hosting replicas and by
implementation details: the map implementation (consider if we used a
TreeMap instead of a hashmap) and possibly the history of the map instance.
As the issue notes, this is bad because the choice could be the same
across many client instances talking to the cluster.
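
As a rough illustration of the behavior described above, and not the
actual RemoteTablet code, the sketch below keeps whichever key the loop
visits last. The class name, method name, and the UUID-to-address map of
strings are hypothetical stand-ins.

```java
import java.util.Map;

// Hypothetical stand-in for the pre-patch selection; not the Kudu source.
final class LastInIterationOrder {
  // Returns the UUID that comes last in the map's iteration order,
  // or null if the map is empty.
  static String pickLast(Map<String, String> tabletServers) {
    String last = null;
    for (String uuid : tabletServers.keySet()) {
      last = uuid;  // overwritten on every step, so the final entry wins
    }
    return last;
  }
}
```

Every client holding a map with the same iteration order gets the same
answer from 'pickLast', which is how load can end up on one server.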

The new situation is: when the client chooses a replica to scan,
and there are no local replicas, it chooses one based on the first
character of the tablet id and the map iteration order. For a fixed tablet,
different clients will make the same choice of server, assuming they have
the same map iteration order. This is an improvement over the current
situation, but there is still the possibility for clients to pile onto
one server for all the scans of a particular tablet.
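
A minimal sketch of this kind of selection follows. It assumes the first
character of the tablet id is reduced modulo the number of servers and
used as a position in the map iteration order; the exact arithmetic in
the patch may differ, and the class and method names are hypothetical.

```java
import java.util.Map;

// Hypothetical illustration; the modulo step is an assumption, not the patch.
final class TabletIdReplicaPicker {
  // Picks the server at position (first char of tabletId % server count)
  // in the map's iteration order. Returns null if there is nothing to pick.
  static String pick(String tabletId, Map<String, String> tabletServers) {
    if (tabletId.isEmpty() || tabletServers.isEmpty()) {
      return null;
    }
    int index = tabletId.charAt(0) % tabletServers.size();
    int position = 0;
    for (String uuid : tabletServers.keySet()) {
      if (position == index) {
        return uuid;
      }
      position++;
    }
    return null;  // not reached: index is always smaller than the map size
  }
}
```

The result is stable for a given tablet id and map state, so different
tablets can land on different servers while repeated calls for one tablet
keep returning the same server.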

A complete solution to the problem would remove the map iteration order
as a factor and choose a server at random without regard for tablet id,
so clients will scan different servers for the same tablet even if their
'tabletServers' map instances have exactly the same state.
Unfortunately, that's trickier, because right now the Java client
depends on 'getClosestServerInfo' returning the same choice for the
same set of tablet servers.
See 'testGetReplicaSelectedServerInfoDeterminism' in TestRemoteTablet.java.
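
For comparison, the fully random choice described above might look like
the sketch below. It is not part of this change; the class and method
names are hypothetical, and the Random is passed in only so callers can
control the seed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of a uniformly random pick; not in this patch.
final class RandomReplicaPicker {
  // Copies the keys to a list and returns one chosen uniformly at random,
  // so neither the tablet id nor the map iteration order decides the result.
  static String pick(Map<String, String> tabletServers, Random random) {
    if (tabletServers.isEmpty()) {
      return null;
    }
    List<String> uuids = new ArrayList<>(tabletServers.keySet());
    return uuids.get(random.nextInt(uuids.size()));
  }
}
```

Repeated calls can return different servers for the same map, which is
exactly the property that conflicts with the determinism the client
currently relies on.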

Change-Id: I3d70e45d4c9532bb32223c1dddd0936b4ff8fd99
---
M java/kudu-client/src/main/java/org/apache/kudu/client/RemoteTablet.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestRemoteTablet.java
2 files changed, 29 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/12158/8
--
To view, visit http://gerrit.cloudera.org:8080/12158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3d70e45d4c9532bb32223c1dddd0936b4ff8fd99
Gerrit-Change-Number: 12158
Gerrit-PatchSet: 8
Gerrit-Owner: Zhang Yifan <zhangyif...@xiaomi.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Zhang Yifan <zhangyif...@xiaomi.com>
