[ https://issues.apache.org/jira/browse/KAFKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933650#comment-14933650 ]

Guozhang Wang commented on KAFKA-2592:
--------------------------------------

The problem is that when we flush the RocksDB store / change-log, we do not 
commit the offsets on the upstream Kafka topics at the same time, so the 
RocksDB store / change-log can get "ahead" of the committed offsets. If there 
is a failure in between, we are very likely to end up with duplicates. When 
the user calls commit(), or when we commit periodically, we do all three 
flushes together (though not atomically yet).
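The failure mode above can be illustrated with a minimal, self-contained sketch. This is not real Kafka Streams code; the store, records, and offsets here are hypothetical in-memory stand-ins, used only to show why flushing state ahead of the offset commit causes duplicates on restart.

```python
# Hypothetical stand-ins: a dict as the "durable" state store and an
# integer as the consumer's committed offset. Not real Kafka APIs.

def process(records, store, start_offset):
    """Apply each record (a key) to the store by counting occurrences."""
    for offset in range(start_offset, len(records)):
        key = records[offset]
        store[key] = store.get(key, 0) + 1
    return len(records)  # offset the consumer WOULD commit

records = ["a", "b", "a"]

# Run 1: the store is flushed (durable) after processing, but a simulated
# crash happens before the offset commit, so the committed offset stays 0.
durable_store = {}
process(records, durable_store, start_offset=0)
committed_offset = 0  # crash before commit -> offset never advanced

# Run 2 (restart): resume from the committed offset against the already
# flushed store -> the same records are applied a second time.
process(records, durable_store, start_offset=committed_offset)

print(durable_store)  # {'a': 4, 'b': 2} -- duplicated, not {'a': 2, 'b': 1}
```

Flushing the store and committing the offset together (as one unit of work) is what removes this window, which is why the periodic commit does all the flushes at the same point.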

The reason we do not commit upon writing to the stores is that we do not know 
whether the current record going through the topology has finished processing, 
and hence whether it should be considered complete and included in the 
committed offset.

> Stop Writing the Change-log in store.put() / delete() for Non-transactional 
> Store
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-2592
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2592
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Guozhang Wang
>            Assignee: Yasuhiro Matsuda
>             Fix For: 0.9.0.0
>
>
> Today we keep a dirty threshold and try to send to the change-log in 
> store.put() / delete() when the threshold has been exceeded. Doing this 
> largely increases the likelihood of inconsistent state upon unclean shutdown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
