Guozhang Wang created KAFKA-3672:
------------------------------------

             Summary: Introduce globally consistent checkpoint in Kafka Streams
                 Key: KAFKA-3672
                 URL: https://issues.apache.org/jira/browse/KAFKA-3672
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 0.10.0.0
            Reporter: Guozhang Wang


This originates from rethinking the condition under which the checkpoint file 
is created:

Today the checkpoint file containing the checkpointed offsets is written upon 
clean shutdown of a stream task, and is read and deleted upon stream task 
(re-)construction. The rationale is that if the checkpoint file is missing upon 
task re-construction, the state of the underlying persistent state store 
(RocksDB, for example) may not be consistent with the committed offsets, and 
hence we had better wipe out the possibly broken state store and rebuild it 
from the beginning of the offset.
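
To make the current behavior concrete, here is a minimal sketch of that 
lifecycle. It is not the Kafka Streams code: it is written in Python for 
brevity, and the names open_task, close_task and the ".checkpoint" file name 
are hypothetical and chosen only for illustration. It also assumes the state 
directory holds plain files only.

```python
import json
import os

CHECKPOINT_NAME = ".checkpoint"  # hypothetical file name for this sketch


def open_task(state_dir):
    """Return the offsets to resume from, or None if state must be rebuilt."""
    path = os.path.join(state_dir, CHECKPOINT_NAME)
    if not os.path.exists(path):
        # No checkpoint: the local store may not match the committed
        # offsets, so discard it and let the caller rebuild from scratch.
        for name in os.listdir(state_dir):
            os.remove(os.path.join(state_dir, name))
        return None
    with open(path) as f:
        offsets = json.load(f)
    # Delete the file so that a crash while running leaves no checkpoint.
    os.remove(path)
    return offsets


def close_task(state_dir, offsets):
    """Clean shutdown: record the offsets that the local store reflects."""
    with open(os.path.join(state_dir, CHECKPOINT_NAME), "w") as f:
        json.dump(offsets, f)
```

In this sketch, a task that crashes after open_task never reaches close_task, 
so the next open_task finds no checkpoint and wipes the state directory, even 
if the store happened to be consistent at the time of the crash.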

However, we may be able to do better than this if we can fully control the 
persistent store's flush time so that it is aligned with committing; then, as 
long as we commit, we are always guaranteed to get a clear checkpoint.
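
One way to picture this idea is the sketch below, in which the store is 
flushed and the checkpoint is rewritten on every commit rather than only on 
clean shutdown. This is an illustration under assumptions, not a proposed 
implementation: it is written in Python for brevity, the class 
CheckpointedTask, the restore function and the file names are hypothetical, 
and a JSON file stands in for the persistent store.

```python
import json
import os


def _atomic_write_json(path, data):
    # Write to a temp file, then rename, so readers never see a partial file.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


class CheckpointedTask:
    def __init__(self, state_dir):
        self.state_dir = state_dir
        self.store = {}    # writes buffered in memory until the next commit
        self.offsets = {}  # next offset to consume, per partition

    def process(self, partition, offset, key, value):
        self.store[key] = value
        self.offsets[partition] = offset + 1

    def commit(self):
        # Flush the store only here, so on-disk state changes only at commit.
        _atomic_write_json(os.path.join(self.state_dir, "store.json"), self.store)
        # Then record the offsets that the flushed state corresponds to.
        _atomic_write_json(os.path.join(self.state_dir, "checkpoint.json"), self.offsets)
        # A real task would also commit the consumer offsets at this point.


def restore(state_dir):
    """Return (store, offsets) as of the last commit, or empty dicts if none."""
    checkpoint = os.path.join(state_dir, "checkpoint.json")
    if not os.path.exists(checkpoint):
        return {}, {}
    with open(os.path.join(state_dir, "store.json")) as f:
        store = json.load(f)
    with open(checkpoint) as f:
        offsets = json.load(f)
    return store, offsets
```

Note that the two files are still written one after the other, so a crash 
between the two writes would leave the store ahead of the checkpoint. The 
sketch does not address that gap.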

This may be generalized to a "global state checkpoint" mechanism in Kafka 
Streams, which may also subsume KAFKA-3184 for non-persistent stores.



