[
https://issues.apache.org/jira/browse/KUDU-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469005#comment-17469005
]
ASF subversion and git services commented on KUDU-3346:
-------------------------------------------------------
Commit 5ef0168cf0ae4471632d63cad223d7301f415982 in kudu's branch
refs/heads/master from zhangyifan27
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=5ef0168 ]
KUDU-3346: fix rebalancer tool fails to run with '--ignored_tservers'
Prior to this patch, the validity of 'ignored_tservers' was checked in
'BuildClusterInfo', which led to a failure when 'raw_info' contained
information only about tservers in a specific location. This patch fixes it by
moving the parameter validity check into 'KsckResultsToClusterRawInfo', because
ksck results contain the original cluster information.
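To illustrate the ordering issue, here is a minimal sketch in Python (the real
code is C++). The function names, the dict-based data layout, and the error
text below are assumptions made for illustration and are not Kudu's actual
API. It only shows why the check has to see the whole cluster's tservers
before the information is narrowed to one location.

```python
def validate_ignored_tservers(ignored, known):
    """Raise ValueError if an ignored tserver UUID is not among the known ones."""
    unknown = sorted(set(ignored) - set(known))
    if unknown:
        # Hypothetical message, not the exact text printed by the kudu CLI.
        raise ValueError(f"ignored tserver {unknown[0]} is not a known tserver")


def ksck_results_to_raw_info(ksck_tservers, ignored, location=None):
    """Build per-location raw info from ksck results.

    ksck_tservers maps tserver UUID -> location (hypothetical layout).
    The validity check runs here, against the full cluster, before the
    result is filtered down to a single location.
    """
    validate_ignored_tservers(ignored, ksck_tservers)
    return {
        uuid
        for uuid, loc in ksck_tservers.items()
        if location is None or loc == location
    }


# Two racks; the tserver to decommission lives in /rack2.
cluster = {"ts-a": "/rack1", "ts-b": "/rack2"}
ignored = {"ts-b"}

# New ordering: validation sees the whole cluster, so it passes even
# though the per-location info for /rack1 does not contain ts-b.
rack1_info = ksck_results_to_raw_info(cluster, ignored, location="/rack1")

# Old ordering: validating against the per-location info rejects ts-b.
try:
    validate_ignored_tservers(ignored, rack1_info)
    old_ordering_failed = False
except ValueError:
    old_ordering_failed = True
```

In this sketch 'rack1_info' holds only "ts-a", so a check made after the
location filtering cannot find the ignored tserver, which mirrors the failure
described in the issue.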
I noticed that 'ClusterInfo::tservers_to_empty' does not need to be built in
'BuildClusterInfo', because we use this info only for printing the cluster's
stats and running IgnoredTserverRunner. This should be refactored in a
follow-up patch.
This patch adds a regression test for the issue and I also verified this fix on
a real cluster.
Change-Id: I1361f562f3e886077a79c3de8ea5fb2ebb8df6e9
Reviewed-on: http://gerrit.cloudera.org:8080/18114
Reviewed-by: Andrew Wong <[email protected]>
Tested-by: Andrew Wong <[email protected]>
> Rebalance fails when trying to decommission tserver on a rack-aware cluster
> ---------------------------------------------------------------------------
>
> Key: KUDU-3346
> URL: https://issues.apache.org/jira/browse/KUDU-3346
> Project: Kudu
> Issue Type: Bug
> Affects Versions: 1.15.0
> Reporter: Georgiana Ogrean
> Assignee: YifanZhang
> Priority: Major
> Attachments: rebalance_ignored_tserver_1c.log.Z, rebalance_v1.log.Z
>
>
> When following the steps [in the
> docs|https://docs.cloudera.com/runtime/7.2.0/administering-kudu/topics/kudu-decommissioning-or-permanently-removing-tablet-server-from-cluster.html]
> for decommissioning a tserver, the rebalance job fails with:
> {code:java}
> Invalid argument: ignored tserver <tserver_uuid> is not reported among know
> tservers
> {code}
> Steps followed:
> 1. Checked that ksck passes.
> 2. Put the tserver to be decommissioned in maintenance mode.
> {code:java}
> sudo -u kudu kudu tserver state enter_maintenance $MASTER_ADDRESSES
> 5ae499b1b870419daabb0e8da90ef233 {code}
> 3. Ran rebalance with {{-ignored_tservers}} and
> {{-move_replicas_from_ignored_tservers}} flags.
> {code:java}
> sudo -u kudu kudu cluster rebalance $MASTER_ADDRESSES
> -move_replicas_from_ignored_tservers
> -ignored_tservers=5ae499b1b870419daabb0e8da90ef233 -v=1{code}
> The logs for the rebalance command are attached.
>