Hello Will Berkeley, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/12158

to look at the new patch set (#8).

Change subject: KUDU-2348: In the Java client, pick a random replica when no replica is local
......................................................................

KUDU-2348: In the Java client, pick a random replica when no replica is local

In RemoteTablet.java, 'getClosestServerInfo' always returns the last
server in hashmap iteration order, which may cause load to be
concentrated on one server, especially when the number of tablet servers
is small. This commit uses a more random approach to select a server.

The current situation is: when the client chooses a replica to scan, and
there are no local replicas, it chooses whichever server ends up last in
the map iteration order, every time. So the choice is determined by the
set of UUIDs of tablet servers hosting replicas and by implementation
details: the map implementation (consider if we used a TreeMap instead
of a hashmap) and possibly the history of the map instance. As the issue
said, this is bad because the choice could be the same across many
client instances talking to the cluster.

The new situation is: when the client chooses a replica to scan, and
there are no local replicas, it chooses one based on the first character
of the tablet id and the map iteration order. For a fixed tablet,
different clients will make the same choice of server, assuming they
have the same map iteration order. This is an improvement over the
current situation, but there is still the possibility for clients to
pile on to one server for all the scans of a particular tablet.

A complete solution to the problem would remove the map iteration order
as a factor and choose a server at random without regard for tablet id,
so clients will scan different servers for the same tablet even if their
'tabletServers' map instances have exactly the same state.
Unfortunately, that's trickier, because right now the Java client is
dependent on 'getClosestServerInfo' returning the same choice for the
same set of tablet servers. See
'testGetReplicaSelectedServerInfoDeterminism' in TestRemoteTablet.java.

Change-Id: I3d70e45d4c9532bb32223c1dddd0936b4ff8fd99
---
M java/kudu-client/src/main/java/org/apache/kudu/client/RemoteTablet.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestRemoteTablet.java
2 files changed, 29 insertions(+), 7 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/12158/8
--
To view, visit http://gerrit.cloudera.org:8080/12158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3d70e45d4c9532bb32223c1dddd0936b4ff8fd99
Gerrit-Change-Number: 12158
Gerrit-PatchSet: 8
Gerrit-Owner: Zhang Yifan <zhangyif...@xiaomi.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Zhang Yifan <zhangyif...@xiaomi.com>
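
For illustration only, the three selection strategies described in the
commit message can be sketched roughly as follows. This is not the code
from RemoteTablet.java: the class name 'ReplicaChoiceSketch', the method
names, the String-to-String map standing in for 'tabletServers', and the
exact "first character modulo replica count" formula are all assumptions
made for this sketch. Each method expects a non-empty map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch; names and formulas are assumptions, not Kudu code.
final class ReplicaChoiceSketch {

  // Old behavior as described: the last entry in map iteration order wins,
  // so every client with the same map state picks the same server.
  static String pickLast(Map<String, String> tabletServers) {
    String chosen = null;
    for (String uuid : tabletServers.keySet()) {
      chosen = uuid; // overwritten on every iteration; the last one survives
    }
    return chosen;
  }

  // New behavior as described: the choice depends on the first character of
  // the tablet id plus map iteration order. The modulo is an assumed formula.
  static String pickByTabletId(Map<String, String> tabletServers, String tabletId) {
    List<String> uuids = new ArrayList<>(tabletServers.keySet());
    int index = tabletId.charAt(0) % uuids.size();
    return uuids.get(index);
  }

  // The "complete solution" as described: ignore tablet id and iteration
  // order, and pick uniformly at random on every call.
  static String pickRandom(Map<String, String> tabletServers) {
    List<String> uuids = new ArrayList<>(tabletServers.keySet());
    return uuids.get(ThreadLocalRandom.current().nextInt(uuids.size()));
  }
}
```

In this sketch, 'pickByTabletId' spreads different tablets across servers
but still returns the same server for a given tablet on every call, which
is the determinism the existing test relies on. 'pickRandom' gives up
that determinism in exchange for spreading scans of a single tablet.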