[ https://issues.apache.org/jira/browse/KAFKA-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366459#comment-16366459 ]
Matthias J. Sax commented on KAFKA-6555:
----------------------------------------

I had a quick look at the PR and I am not sure if we want to address the issue like this. You allow querying restoring tasks. This seems questionable from a semantic point of view. I guess I now understand the difference to KAFKA-6144: there, the idea is to allow querying a StandbyTask in case the main task is restoring at the moment. The assumption is that a StandbyTask holds data that is almost up-to-date, while in your PR, you would allow querying very old data. Furthermore, as long as a StandbyTask is queried, the data would not change: because the main task is restoring, the StandbyTask just serves the latest state. If we allow querying a task that is restoring, you would see different (old) data until processing resumes. Thus, it's quite different. Is this a correct summary?

> Making state store queryable during restoration
> -----------------------------------------------
>
>                 Key: KAFKA-6555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Ashish Surana
>            Priority: Major
>
> State stores in Kafka Streams are currently only queryable when the
> StreamTask is in RUNNING state. The idea is to make them queryable even in
> the RESTORATION (PARTITION_ASSIGNED) state, as the time spent on restoration
> can be huge, and making the data inaccessible during this time could mean
> downtime that is not acceptable for many applications.
> When the active partition goes down, one of the following occurs:
> # One of the standby replica partitions gets promoted to active: The replica
> task has to restore the remaining state from the changelog topic before it
> can become RUNNING. The time taken for this depends on how much the replica
> is lagging behind. During this restoration time, the state store for that
> partition is currently not queryable, resulting in downtime for the
> partition. We can make the state store partition queryable for the data
> already present in the state store.
> # When there is no replica or standby task, the active task will be started
> on one of the existing nodes. That node has to build the entire state from
> the changelog topic, which can take a lot of time depending on how big the
> changelog topic is, and keeping the state store not queryable during this
> time means downtime for the partition.
> It's a very important improvement as it could improve the availability
> of microservices developed using Kafka Streams.
> I am working on a patch for this change. Any feedback or comments are welcome.
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)