[
https://issues.apache.org/jira/browse/KUDU-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956555#comment-16956555
]
Bankim Bhavsar commented on KUDU-2962:
--------------------------------------
I think there are 2 high-level approaches here:
# In {{FindTabletFollowers()}}, go through each supplied tablet server and
verify that it hosts the supplied tablet id. Then use the
reduced/filtered list in the existing implementation. This can be
accomplished using helper methods like {{GetConsensusState()}} from
{{cluster_itest_util.cc}}.
# The alternative is to query the master for the tablet servers hosting the
specified tablet id. This can be accomplished using the helper method
{{GetTabletLocations()}} and filtering out the leader replica. In this case, the
caller of {{FindTabletFollowers()}} need not even supply the tablet server map,
but instead needs the master proxy.
Method #2 looks better to me, unless there is a reason not to go through the
master and instead rely on/verify the state of the tablet servers directly.
[~aserbin] what do you think?
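To make the two options concrete, here is a rough sketch against a toy model rather than the real Kudu test utilities. Everything in it is an assumption for illustration: {{FakeTServer}}, {{FakeTServerMap}}, {{FakeTabletLocations}} and the three functions are hypothetical stand-ins, not the actual {{cluster_itest_util}} types or signatures. It contrasts the current complement-of-leader behavior with filtering the supplied map (approach 1) and with using a master-style location list (approach 2).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a tablet server handle: only the tablet ids it hosts.
struct FakeTServer {
  std::set<std::string> hosted_tablets;
};

// Hypothetical stand-in for the tablet server map: server uuid -> server.
using FakeTServerMap = std::map<std::string, FakeTServer>;

// Hypothetical stand-in for the master's view: tablet id -> replica server uuids.
using FakeTabletLocations = std::map<std::string, std::vector<std::string>>;

// Behavior described in the issue: every supplied server except the leader
// is reported as a follower, whether or not it hosts the tablet.
std::vector<std::string> NaiveFollowers(const FakeTServerMap& servers,
                                        const std::string& leader_uuid) {
  std::vector<std::string> followers;
  for (const auto& entry : servers) {
    if (entry.first != leader_uuid) followers.push_back(entry.first);
  }
  return followers;
}

// Approach 1: sanitize the supplied map by keeping only servers that host
// the tablet, then drop the leader.
std::vector<std::string> FollowersByFiltering(const FakeTServerMap& servers,
                                              const std::string& tablet_id,
                                              const std::string& leader_uuid) {
  assert(!tablet_id.empty());
  std::vector<std::string> followers;
  for (const auto& entry : servers) {
    if (entry.first == leader_uuid) continue;
    if (entry.second.hosted_tablets.count(tablet_id) > 0) {
      followers.push_back(entry.first);
    }
  }
  return followers;
}

// Approach 2: start from the replica list the master would return for the
// tablet and drop the leader; no tablet server map is needed.
std::vector<std::string> FollowersFromLocations(const FakeTabletLocations& locations,
                                                const std::string& tablet_id,
                                                const std::string& leader_uuid) {
  std::vector<std::string> followers;
  const auto it = locations.find(tablet_id);
  if (it == locations.end()) return followers;  // unknown tablet: no followers
  for (const auto& uuid : it->second) {
    if (uuid != leader_uuid) followers.push_back(uuid);
  }
  return followers;
}
```

With the 10-server, 3-replica case from the issue description, the naive version reports 9 followers, while both sanitized versions report the 2 servers that actually host follower replicas. In this model approach 2 needs less input from the caller, which matches the reasoning for preferring method #2.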
> Fix kudu::itest::FindTabletFollowers() test utility function
> ------------------------------------------------------------
>
> Key: KUDU-2962
> URL: https://issues.apache.org/jira/browse/KUDU-2962
> Project: Kudu
> Issue Type: Improvement
> Components: test
> Reporter: Alexey Serbin
> Assignee: Bankim Bhavsar
> Priority: Minor
> Labels: newbie
>
> The {{kudu::itest::FindTabletFollowers()}} function is unsafe: it uses
> {{kudu::itest::FindTabletLeader()}} to generate the result as a complement to
> tablet servers hosting the leader replica, but it doesn't sanitize the set of
> tablet servers to make sure it contains only tablet servers hosting replicas
> of the specified tablet.
> For example, if you have a cluster with 10 tablet servers, and a tablet with
> 3 tablet replicas, passing the map for all tablet servers in the 10-node
> cluster would result in {{FindTabletFollowers()}} reporting 9 followers.
> Whoops!
> It's necessary to either fix the implementation of this utility function to
> sanitize its first argument, or simply get rid of it.