[jira] [Assigned] (KUDU-3082) tablets in "CONSENSUS_MISMATCH" state for a long time
[ https://issues.apache.org/jira/browse/KUDU-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin reassigned KUDU-3082:
-----------------------------------

    Assignee:     (was: Alexey Serbin)

> tablets in "CONSENSUS_MISMATCH" state for a long time
> ------------------------------------------------------
>
>                 Key: KUDU-3082
>                 URL: https://issues.apache.org/jira/browse/KUDU-3082
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.10.1
>            Reporter: YifanZhang
>            Priority: Major
>         Attachments: master_leader.log, ts25.info.gz, ts26.log.gz
>
>
> We recently found that a few tablets in one of our clusters are unhealthy. The ksck output looks like this:
>
> {code:java}
> Tablet Summary
> Tablet 7404240f458f462d92b6588d07583a52 of table '' is conflicted: 3 replicas' active configs disagree with the leader master's
>   7380d797d2ea49e88d71091802fb1c81 (kudu-ts26): RUNNING
>   d1952499f94a4e6087bee28466fcb09f (kudu-ts25): RUNNING
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
> All reported replicas are:
>   A = 7380d797d2ea49e88d71091802fb1c81
>   B = d1952499f94a4e6087bee28466fcb09f
>   C = 47af52df1adc47e1903eb097e9c88f2e
>   D = 08beca5ed4d04003b6979bf8bac378d2
> The consensus matrix is:
>  Config source | Replicas  | Current term | Config index | Committed?
> ---------------+-----------+--------------+--------------+------------
>  master        | A B C*    |              |              | Yes
>  A             | A B C*    | 5            | -1           | Yes
>  B             | A B C     | 5            | -1           | Yes
>  C             | A B C* D~ | 5            | 54649        | No
> Tablet 6d9d3fb034314fa7bee9cfbf602bcdc8 of table '' is conflicted: 2 replicas' active configs disagree with the leader master's
>   d1952499f94a4e6087bee28466fcb09f (kudu-ts25): RUNNING
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
>   5a8aeadabdd140c29a09dabcae919b31 (kudu-ts21): RUNNING
> All reported replicas are:
>   A = d1952499f94a4e6087bee28466fcb09f
>   B = 47af52df1adc47e1903eb097e9c88f2e
>   C = 5a8aeadabdd140c29a09dabcae919b31
>   D = 14632cdbb0d04279bc772f64e06389f9
> The consensus matrix is:
>  Config source | Replicas  | Current term | Config index | Committed?
> ---------------+-----------+--------------+--------------+------------
>  master        | A B* C    |              |              | Yes
>  A             | A B* C    | 5            | 5            | Yes
>  B             | A B* C D~ | 5            | 96176        | No
>  C             | A B* C    | 5            | 5            | Yes
> Tablet bf1ec7d693b94632b099dc0928e76363 of table '' is conflicted: 1 replicas' active configs disagree with the leader master's
>   a9eaff3cf1ed483aae84954d649a (kudu-ts23): RUNNING
>   f75df4a6b5ce404884313af5f906b392 (kudu-ts19): RUNNING
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
> All reported replicas are:
>   A = a9eaff3cf1ed483aae84954d649a
>   B = f75df4a6b5ce404884313af5f906b392
>   C = 47af52df1adc47e1903eb097e9c88f2e
>   D = d1952499f94a4e6087bee28466fcb09f
> The consensus matrix is:
>  Config source | Replicas  | Current term | Config index | Committed?
> ---------------+-----------+--------------+--------------+------------
>  master        | A B C*    |              |              | Yes
>  A             | A B C*    | 1            | -1           | Yes
>  B             | A B C*    | 1            | -1           | Yes
>  C             | A B C* D~ | 1            | 2            | No
> Tablet 3190a310857e4c64997adb477131488a of table '' is conflicted: 3 replicas' active configs disagree with the leader master's
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
>   f0f7b2f4b9d344e6929105f48365f38e (kudu-ts24): RUNNING
>   f75df4a6b5ce404884313af5f906b392 (kudu-ts19): RUNNING
> All reported replicas are:
>   A = 47af52df1adc47e1903eb097e9c88f2e
>   B = f0f7b2f4b9d344e6929105f48365f38e
>   C = f75df4a6b5ce404884313af5f906b392
>   D = d1952499f94a4e6087bee28466fcb09f
> The consensus matrix is:
>  Config source | Replicas  | Current term | Config index | Committed?
> ---------------+-----------+--------------+--------------+------------
>  master        | A* B C    |              |              | Yes
>  A             | A* B C D~ | 1            | 1991         | No
>  B             | A* B C    | 1            | 4            | Yes
>  C             | A* B C    | 1            | 4            | Yes{code}
> These tablets did not recover for a couple of days, until we restarted kudu-ts27.
> I found many duplicated log messages on kudu-ts27, such as:
> {code:java}
> I0314 04:38:41.511279 65731 raft_consensus.cc:937] T 740424
[jira] [Assigned] (KUDU-3082) tablets in "CONSENSUS_MISMATCH" state for a long time
[ https://issues.apache.org/jira/browse/KUDU-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin reassigned KUDU-3082:
-----------------------------------

    Assignee: Alexey Serbin

> tablets in "CONSENSUS_MISMATCH" state for a long time
> ------------------------------------------------------
>
>                 Key: KUDU-3082
>                 URL: https://issues.apache.org/jira/browse/KUDU-3082
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.10.1
>            Reporter: YifanZhang
>            Assignee: Alexey Serbin
>            Priority: Major
>         Attachments: master_leader.log, ts25.info.gz, ts26.log.gz