[ 
https://issues.apache.org/jira/browse/KUDU-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849299#comment-15849299
 ] 

Todd Lipcon commented on KUDU-1860:
-----------------------------------

To clarify: by "evicted" you mean that there is a pending configuration that 
removes it, but that pending configuration has not yet been committed?

Agreed, it would be great to show pending config information here. IIRC we 
don't currently report the pending config to the master, but we could 
consider doing so, or fetching it from the tservers during ksck (which might 
be less error-prone).
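
A rough sketch of the second option, in Python for brevity (not ksck code).
All names here (ReplicaView, find_config_mismatches) are hypothetical, and
the shape of what a tserver would report is an assumption. The example data
is only loosely modeled on the report quoted below: the master lists three
replicas while one replica holds an uncommitted config that drops server2.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class ReplicaView:
    """What one tserver reports for a tablet (hypothetical shape)."""
    committed: FrozenSet[str]
    # None means the replica has no config change in flight.
    pending: Optional[FrozenSet[str]] = None


def find_config_mismatches(master_config: FrozenSet[str],
                           views: Dict[str, ReplicaView]) -> List[str]:
    """Compare the master's config with each replica's own view."""
    problems = []
    for server, view in sorted(views.items()):
        # A replica whose committed config differs from the master's.
        if view.committed != master_config:
            problems.append(
                f"{server}: committed config {sorted(view.committed)} "
                f"differs from master {sorted(master_config)}")
        # A replica holding a config change that has not been committed.
        if view.pending is not None:
            removed = sorted(view.committed - view.pending)
            problems.append(f"{server}: pending config removes {removed}")
    return problems


# Assumed scenario: the master still lists all three servers.
master = frozenset({"server1", "server2", "server3"})
views = {
    "server1": ReplicaView(committed=master,
                           pending=frozenset({"server1", "server3"})),
    "server2": ReplicaView(committed=master),
}
for line in find_config_mismatches(master, views):
    print(line)
```

Servers that cannot be reached or have no running replica are simply left
out of `views` in this sketch; a real check would need to report those too.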

> ksck doesn't identify tablets that are evicted but still in config
> ------------------------------------------------------------------
>
>                 Key: KUDU-1860
>                 URL: https://issues.apache.org/jira/browse/KUDU-1860
>             Project: Kudu
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 1.2.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>
> As reported by a user on Slack, ksck can give you misleading output such as:
> {noformat}
>   ca199fafca544df2a1b2a01be9d5266d (server1:7250): RUNNING [LEADER]
>   a077957f627c4758ab5a989aca8a1ca8 (server2:7250): RUNNING
>   5c09a555c205482b8131f15b2c249ec6 (server3:7250): bad state
>     State:       NOT_STARTED
>     Data state:  TABLET_DATA_TOMBSTONED
>     Last status: Tablet initializing...
> {noformat}
> The problem is that server2 had already been evicted from the configuration 
> (based on reading the logs), but the change wasn't committed in the config 
> (which contains server1 and server3) since there's really only 1 server left 
> out of 3.
> Ideally, ksck should check what each server thinks the configuration is and 
> see whether it differs from what's in the master. As it is, it looks like 
> we're missing 1 replica, but in reality this is a broken tablet.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
