[ https://issues.apache.org/jira/browse/KAFKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560410#comment-13560410 ]
Jay Kreps commented on KAFKA-727:
---------------------------------

Fantastic catch. I think another fix is to just save the size of the log prior to translating the hw mark and use this rather than dynamically checking log.sizeInBytes later in the method. This will effectively act as a valid lower bound.

It might also be worthwhile to write a throwaway torture test that has one thread do appends and another thread do reads, and check that this condition is not violated, in case there are any more of these subtleties. Happy to take this one on since it is my bad.

> broker can still expose uncommitted data to a consumer
> ------------------------------------------------------
>
>                 Key: KAFKA-727
>                 URL: https://issues.apache.org/jira/browse/KAFKA-727
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Priority: Blocker
>
> Even after KAFKA-698 is fixed, consumer clients still occasionally see
> uncommitted data. The following is how this can happen.
> 1. In Log.read(), we pass in startOffset < HW and maxOffset = HW.
> 2. Then we call LogSegment.read(), in which we call translateOffset on the
> maxOffset. The offset doesn't exist, and translateOffset returns null.
> 3. Continuing in LogSegment.read(), we then call messageSet.sizeInBytes()
> to fetch and return the data.
> What can happen is that between step 2 and step 3, a new message is
> appended to the log and is not yet committed. Now we have exposed
> uncommitted data to the client.
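The race and the proposed fix can be sketched with a toy model (hypothetical, simplified names; this is not Kafka's actual Log/LogSegment code). Offsets are modeled as byte positions, and the `betweenTranslateAndSize` hook deterministically injects an append at the exact point where the race occurs between steps 2 and 3:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the LogSegment.read race (hypothetical, simplified).
public class HwRaceSketch {
    final AtomicInteger sizeInBytes = new AtomicInteger(0);

    void append(int bytes) { sizeInBytes.addAndGet(bytes); }

    // Stand-in for translateOffset: returns the byte position for `offset`,
    // or null if the offset is at or beyond the current end of the segment
    // (as happens when maxOffset == HW).
    Integer translate(int offset) {
        return offset < sizeInBytes.get() ? offset : null;
    }

    // Buggy read: re-reads sizeInBytes AFTER translation returned null,
    // so an append in between widens the read past the high watermark.
    int readBuggy(int start, int maxOffset, Runnable betweenTranslateAndSize) {
        Integer maxPos = translate(maxOffset);
        betweenTranslateAndSize.run();             // concurrent append lands here
        int end = (maxPos == null) ? sizeInBytes.get() : maxPos;
        return end - start;                        // bytes exposed to the consumer
    }

    // Fixed read, per the comment: snapshot the size FIRST and use it as
    // the upper bound, so it acts as a valid lower bound on "now".
    int readFixed(int start, int maxOffset, Runnable betweenTranslateAndSize) {
        int snapshot = sizeInBytes.get();
        Integer maxPos = translate(maxOffset);
        betweenTranslateAndSize.run();             // same concurrent append
        int end = (maxPos == null) ? snapshot : Math.min(maxPos, snapshot);
        return end - start;
    }

    public static void main(String[] args) {
        final int hw = 100;
        HwRaceSketch buggy = new HwRaceSketch();
        buggy.append(100);                         // log committed up to HW
        int exposedBuggy = buggy.readBuggy(0, hw, () -> buggy.append(50));

        HwRaceSketch fixed = new HwRaceSketch();
        fixed.append(100);
        int exposedFixed = fixed.readFixed(0, hw, () -> fixed.append(50));

        // buggy exposes 150 bytes (past the HW); fixed exposes 100 bytes
        System.out.println("buggy=" + exposedBuggy + " fixed=" + exposedFixed);
    }
}
```

Because the snapshot is taken before translation, any append that lands afterward cannot raise the upper bound of the read, which is exactly the "valid lower bound" property described above.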
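A throwaway torture test of the kind suggested might look like the following (hypothetical, self-contained model; not Kafka's real Log API): one thread appends continuously while another reads up to a fixed high watermark, asserting that a read never exposes bytes past the HW. The reader uses the proposed fix of snapshotting the size before translating maxOffset:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Throwaway torture test sketch (hypothetical, simplified model).
public class HwTortureTest {
    static final AtomicInteger sizeInBytes = new AtomicInteger(0);

    static Integer translate(int offset) {
        return offset < sizeInBytes.get() ? offset : null;
    }

    // Read with the fix applied: snapshot first, use it as the upper bound.
    static int read(int start, int maxOffset) {
        int snapshot = sizeInBytes.get();
        Integer maxPos = translate(maxOffset);
        int end = (maxPos == null) ? snapshot : Math.min(maxPos, snapshot);
        return end - start;
    }

    public static void main(String[] args) throws InterruptedException {
        final int hw = 1_000;
        sizeInBytes.set(hw);                        // committed up to HW
        AtomicBoolean stop = new AtomicBoolean(false);
        AtomicBoolean violated = new AtomicBoolean(false);

        Thread appender = new Thread(() -> {
            while (!stop.get()) sizeInBytes.addAndGet(1);  // uncommitted appends
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                if (read(0, hw) > hw) violated.set(true);  // exposed past HW?
            }
            stop.set(true);
        });
        appender.start();
        reader.start();
        reader.join();
        appender.join();
        System.out.println(violated.get() ? "VIOLATION" : "ok: reads never exceeded HW");
    }
}
```

Since the segment size only grows, a null from translate implies the snapshot is at or below maxOffset, so the invariant holds regardless of how appends interleave with the read.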