[ https://issues.apache.org/jira/browse/KUDU-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hao Hao updated KUDU-2129:
--------------------------
    Description:
When a new replica is added and the process fails, ksck warns about the tablet being 'under-replicated' while there are three healthy replicas. We should improve the message in this case:
{noformat}Tablet 848cd6f04bed4f049915ddf04e960240 of table 'ssb_1000_lineorder' is under-replicated: 1 replica(s) not RUNNING
  3c305734ab9d4e0ebfbd0def74841a5d (vd0240.halxg.cloudera.com:7050): RUNNING [LEADER]
  70f7ee61ead54b1885d819f354eb3405 (vd0338.halxg.cloudera.com:7050): RUNNING
  72fcec63e96f4248ae39d114eb3cd7c9 (vd0340.halxg.cloudera.com:7050): bad state
    State:       INITIALIZED
    Data state:  TABLET_DATA_COPYING
    Last status: Tablet Copy: Downloading block 4611685954068824754 (13746/17499)
  cc32936bc8594948a04fd4240da36aed (vd0236.halxg.cloudera.com:7050): RUNNING
1 replicas' active configs differ from the master's.
  All the peers reported by the master and tablet servers are:
  A = 3c305734ab9d4e0ebfbd0def74841a5d
  B = 70f7ee61ead54b1885d819f354eb3405
  C = 72fcec63e96f4248ae39d114eb3cd7c9
  D = cc32936bc8594948a04fd4240da36aed
The consensus matrix is:
 Config source |      Voters      | Current term | Config index | Committed?
---------------+------------------+--------------+--------------+------------
 master        | A*  B   C   D    |              |              | Yes
 A             | A*  B   C   D    | 93           | 3418016      | Yes
 B             | A*  B   C   D    | 93           | 3418016      | Yes
 C             | A   B   C   D    | 93           | 3418016      | Yes
 D             | A*  B   C   D    | 93           | 3418016      | Yes
Table ssb_1000_lineorder has 1 under-replicated tablet(s){noformat}

  was:
When adding a new replica is not successful, ksck warns about the tablet being 'under-replicated' while there are three healthy replicas.
We should improve the message in this case:
{noformat}Tablet 848cd6f04bed4f049915ddf04e960240 of table 'ssb_1000_lineorder' is under-replicated: 1 replica(s) not RUNNING
  3c305734ab9d4e0ebfbd0def74841a5d (vd0240.halxg.cloudera.com:7050): RUNNING [LEADER]
  70f7ee61ead54b1885d819f354eb3405 (vd0338.halxg.cloudera.com:7050): RUNNING
  72fcec63e96f4248ae39d114eb3cd7c9 (vd0340.halxg.cloudera.com:7050): bad state
    State:       INITIALIZED
    Data state:  TABLET_DATA_COPYING
    Last status: Tablet Copy: Downloading block 4611685954068824754 (13746/17499)
  cc32936bc8594948a04fd4240da36aed (vd0236.halxg.cloudera.com:7050): RUNNING
1 replicas' active configs differ from the master's.
  All the peers reported by the master and tablet servers are:
  A = 3c305734ab9d4e0ebfbd0def74841a5d
  B = 70f7ee61ead54b1885d819f354eb3405
  C = 72fcec63e96f4248ae39d114eb3cd7c9
  D = cc32936bc8594948a04fd4240da36aed
The consensus matrix is:
 Config source |      Voters      | Current term | Config index | Committed?
---------------+------------------+--------------+--------------+------------
 master        | A*  B   C   D    |              |              | Yes
 A             | A*  B   C   D    | 93           | 3418016      | Yes
 B             | A*  B   C   D    | 93           | 3418016      | Yes
 C             | A   B   C   D    | 93           | 3418016      | Yes
 D             | A*  B   C   D    | 93           | 3418016      | Yes
Table ssb_1000_lineorder has 1 under-replicated tablet(s){noformat}


> Improve ksck 'under-replicated' message
> ---------------------------------------
>
>                 Key: KUDU-2129
>                 URL: https://issues.apache.org/jira/browse/KUDU-2129
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: Hao Hao
>
> When a new replica is added and the process fails, ksck warns about the tablet being
> 'under-replicated' while there are three healthy replicas.
> We should improve the message in this case:
> {noformat}Tablet 848cd6f04bed4f049915ddf04e960240 of table 'ssb_1000_lineorder' is under-replicated: 1 replica(s) not RUNNING
>   3c305734ab9d4e0ebfbd0def74841a5d (vd0240.halxg.cloudera.com:7050): RUNNING [LEADER]
>   70f7ee61ead54b1885d819f354eb3405 (vd0338.halxg.cloudera.com:7050): RUNNING
>   72fcec63e96f4248ae39d114eb3cd7c9 (vd0340.halxg.cloudera.com:7050): bad state
>     State:       INITIALIZED
>     Data state:  TABLET_DATA_COPYING
>     Last status: Tablet Copy: Downloading block 4611685954068824754 (13746/17499)
>   cc32936bc8594948a04fd4240da36aed (vd0236.halxg.cloudera.com:7050): RUNNING
> 1 replicas' active configs differ from the master's.
>   All the peers reported by the master and tablet servers are:
>   A = 3c305734ab9d4e0ebfbd0def74841a5d
>   B = 70f7ee61ead54b1885d819f354eb3405
>   C = 72fcec63e96f4248ae39d114eb3cd7c9
>   D = cc32936bc8594948a04fd4240da36aed
> The consensus matrix is:
>  Config source |      Voters      | Current term | Config index | Committed?
> ---------------+------------------+--------------+--------------+------------
>  master        | A*  B   C   D    |              |              | Yes
>  A             | A*  B   C   D    | 93           | 3418016      | Yes
>  B             | A*  B   C   D    | 93           | 3418016      | Yes
>  C             | A   B   C   D    | 93           | 3418016      | Yes
>  D             | A*  B   C   D    | 93           | 3418016      | Yes
> Table ssb_1000_lineorder has 1 under-replicated tablet(s){noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)