ijuma commented on a change in pull request #8812:
URL: https://github.com/apache/kafka/pull/8812#discussion_r583656545



##########
File path: core/src/main/scala/kafka/log/Log.scala
##########
@@ -826,8 +832,16 @@ class Log(@volatile private var _dir: File,
         preallocate = config.preallocate))
     }
 
-    recoveryPoint = activeSegment.readNextOffset

Review comment:
       @purplefox To clarify, the case you are talking about is:
   
   1. A catastrophic scenario where a partition is offline (i.e. all replicas 
are down)
   2. The OS was not shutdown
   3. The OS did not flush the data to disk (this typically happens 
irrespective of our flushes due to OS configurations)
   4. The replica that was the last member of the ISR comes back up, registers, 
to ZK and the Controller makes it the
   leader (since it was the last member of the ISR, if it was a different 
replica, it won't be given leadership without
   unclean leader election)
   5. The hw is beyond the flushed segments
   5. Consumer fetches the data beyond the flushed segments
   6. OS hard dies
   
   This is an interesting edge case, it seems incredibly unlikely, but possible 
if the hw can be beyond the flushed segments. @junrao & @hachikuji are we 
missing any detail that protects us against this?
   
   The following has a similar impact:
   1. Leader accepts a write
   2. Write is replicated, hw is incremented, but data is not flushed
   3. All replicas die, but the ISR is not shrunk yet
   4. Leader receives a write, accepts it, replication doesn't happen since 
replicas are gone
   5. ISR is shrunk, hw is incremented
   6. Producer won't receive a successful ack given min.isr=2, but the consumer 
reads data that is only in the leader
   7. Leader crashes and the unflushed data is gone (or hard disk dies and all 
the data in the leader is gone)
   
   Flushing the segments during recovery helps on some scenarios, but not the 
one I just mentioned (assuming I am not missing anything). @hachikuji had a 
"strict min isr" proposal where the ISR is never allowed to shrink below 
`min.isr`. I haven't thought about all the details, but perhaps that covers 
both issues. Thoughts @hachikuji?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to