hachikuji commented on a change in pull request #8812:
URL: https://github.com/apache/kafka/pull/8812#discussion_r583835401
##########
File path: core/src/main/scala/kafka/log/Log.scala
##########
@@ -826,8 +832,16 @@ class Log(@volatile private var _dir: File,
preallocate = config.preallocate))
}
- recoveryPoint = activeSegment.readNextOffset
Review comment:
I think it is a gap that there is no minimum replication factor required before a
write can be exposed to consumers. Any writes that end up seeing the
`NOT_ENOUGH_REPLICAS_AFTER_APPEND` error code are more vulnerable. These are
unacknowledged writes, and the producer is expected to retry them, but the consumer
can still read them once the ISR shrinks, and we would still view it as "data
loss" if the broker failed before they could be flushed to disk. With the
"strict min isr" proposal, the leader is not allowed to shrink the ISR below
some minimum replication factor, which helps to plug this hole.
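To make the "strict min isr" idea concrete, here is a minimal sketch (the class and method names are hypothetical, not Kafka's actual ISR management code): the leader may drop lagging replicas from the ISR, but must stop once removing another replica would take the ISR below the configured minimum, so any exposed write always has at least that many copies.

```java
import java.util.*;

// Hypothetical sketch of "strict min isr": the leader only removes lagging
// replicas from the ISR while the ISR stays above a configured minimum size.
// This is an illustration, not Kafka's actual implementation.
public class StrictMinIsr {
    static Set<Integer> shrinkIsr(Set<Integer> isr, Set<Integer> lagging, int minIsr) {
        Set<Integer> result = new HashSet<>(isr);
        for (int replica : lagging) {
            // Stop shrinking once removing another replica would violate the minimum.
            if (result.size() <= minIsr) break;
            result.remove(replica);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> isr = new HashSet<>(Arrays.asList(1, 2, 3));
        // Replicas 2 and 3 are lagging, but with minIsr = 2 only one may be dropped.
        Set<Integer> shrunk = shrinkIsr(isr, new HashSet<>(Arrays.asList(2, 3)), 2);
        System.out.println(shrunk.size()); // prints 2
    }
}
```

Under this constraint the `NOT_ENOUGH_REPLICAS_AFTER_APPEND` window still exists, but an exposed write can no longer end up replicated on fewer than `minIsr` brokers before the consumer sees it.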
Going back to @purplefox's suggestion, it does seem like a good idea to
flush segments beyond the recovery point during recovery. It serves to
constrain the initial state of the system, which makes it easier to reason about
(e.g. you only need to worry about the loss of unflushed data from the most
recent restart). Some of the flush weaknesses probably still exist regardless
of this change, though.
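A rough sketch of what flushing beyond the recovery point during recovery could look like (the `Segment` type and `baseOffset`/`flush` names are illustrative stand-ins, not the `Log.scala` API): flush every segment that may contain offsets at or after the recovery point, so after recovery everything below the log end offset is known to be on disk.

```java
import java.util.*;

// Hypothetical sketch: during log recovery, flush all segments that may hold
// data at or after the recovery point. Names are illustrative, not Kafka's.
public class RecoveryFlush {
    static class Segment {
        final long baseOffset;
        boolean flushed = false;
        Segment(long baseOffset) { this.baseOffset = baseOffset; }
        void flush() { flushed = true; } // stand-in for fsync of the segment file
    }

    // Returns how many segments were flushed; a segment is flushed when the
    // next segment's base offset is beyond the recovery point, i.e. this
    // segment covers some offsets >= recoveryPoint.
    static int flushFrom(List<Segment> segments, long recoveryPoint) {
        int flushedCount = 0;
        for (int i = 0; i < segments.size(); i++) {
            boolean last = (i == segments.size() - 1);
            long nextBase = last ? Long.MAX_VALUE : segments.get(i + 1).baseOffset;
            if (nextBase > recoveryPoint) {
                segments.get(i).flush();
                flushedCount++;
            }
        }
        return flushedCount;
    }

    public static void main(String[] args) {
        List<Segment> segments =
            Arrays.asList(new Segment(0), new Segment(100), new Segment(200));
        // Recovery point at offset 150: the segments based at 100 and 200 get flushed.
        System.out.println(flushFrom(segments, 150)); // prints 2
    }
}
```

After such a pass, advancing the recovery point to the log end offset (as the diff above does with `activeSegment.readNextOffset`) would be justified by having actually flushed that data, rather than merely assumed.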
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]