Jean-Daniel Cryans created KUDU-1860:
----------------------------------------

             Summary: ksck doesn't identify tablets that are evicted but still in config
                 Key: KUDU-1860
                 URL: https://issues.apache.org/jira/browse/KUDU-1860
             Project: Kudu
          Issue Type: Bug
          Components: util
    Affects Versions: 1.2.0
            Reporter: Jean-Daniel Cryans
            Priority: Critical


As reported by a user on Slack, ksck can produce misleading output such as:

{noformat}
ca199fafca544df2a1b2a01be9d5266d (server1:7250): RUNNING [LEADER]
a077957f627c4758ab5a989aca8a1ca8 (server2:7250): RUNNING
5c09a555c205482b8131f15b2c249ec6 (server3:7250): bad state
  State: NOT_STARTED
  Data state: TABLET_DATA_TOMBSTONED
  Last status: Tablet initializing...
{noformat}

The problem is that, based on reading the logs, server2 had already been evicted from the configuration, but the eviction was never committed in the config (which contains servers 1 and 3), since there is really only 1 server left out of 3.

Ideally, ksck should ask each server what it thinks the configuration is and check whether that differs from what's in the master. As it is, it looks like we're missing 1 replica, but in reality this is a broken tablet.
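
To make the suggested check concrete, here is a minimal Python sketch of the comparison, not actual ksck code. The function name `find_config_mismatches`, the `ConfigMismatch` type, and the per-server views in the example are all hypothetical and only loosely follow the report above. It assumes each server's view of the Raft config can be fetched as a set of server names, with `None` standing in for a server that reports no config.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional


class ConfigMismatch(NamedTuple):
    """One server whose view of the config differs from the master's (hypothetical type)."""
    server: str
    missing: FrozenSet[str]  # listed by the master, absent from the server's view
    extra: FrozenSet[str]    # listed by the server, absent from the master's view


def find_config_mismatches(
    master_config: FrozenSet[str],
    server_configs: Dict[str, Optional[FrozenSet[str]]],
) -> List[ConfigMismatch]:
    """Compare each server's reported config against the master's config."""
    mismatches = []
    for server in sorted(server_configs):
        view = server_configs[server]
        if view is None:
            # The server reported no config; a real tool would flag this separately.
            continue
        missing = master_config - view
        extra = view - master_config
        if missing or extra:
            mismatches.append(ConfigMismatch(server, missing, extra))
    return mismatches


if __name__ == "__main__":
    # Illustrative data only: the master still lists all three servers,
    # while server1 holds a config that no longer includes server2.
    master = frozenset({"server1", "server2", "server3"})
    views = {
        "server1": frozenset({"server1", "server3"}),
        "server2": frozenset({"server1", "server2", "server3"}),
        "server3": None,
    }
    for m in find_config_mismatches(master, views):
        print(f"{m.server}: missing={sorted(m.missing)} extra={sorted(m.extra)}")
```

Any mismatch found this way would be a hint that the tablet is in worse shape than "missing 1 replica", which is the distinction ksck does not currently make.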