Jason Gustafson created KAFKA-9835:
--------------------------------------

             Summary: Race condition with concurrent write allows reads above high watermark
                 Key: KAFKA-9835
                 URL: https://issues.apache.org/jira/browse/KAFKA-9835
             Project: Kafka
          Issue Type: Bug
            Reporter: Jason Gustafson
            Assignee: Jason Gustafson


Kafka's log implementation serializes all writes using a lock, but allows 
multiple concurrent reads while that lock is held. The `FileRecords` class 
contains the core implementation. Reads from the log create logical slices of 
`FileRecords`, which are then passed to the network layer for sending. An 
abridged version of the implementation of `slice` is provided below:

{code}
    public FileRecords slice(int position, int size) throws IOException {
        int end = this.start + position + size;
        // handle integer overflow or if end is beyond the end of the file
        if (end < 0 || end >= start + sizeInBytes())
            end = start + sizeInBytes();
        return new FileRecords(file, channel, this.start + position, end, true);
    }
{code}

The `size` parameter here is typically derived from the fetch size, but is 
upper-bounded with respect to the high watermark. The two calls to 
`sizeInBytes` here are problematic because the size of the file may change 
between them. Specifically, a concurrent write may increase the size of the 
file after the first call to `sizeInBytes` but before the second one. The 
bounds check then passes against the old size, while `end` is assigned from 
the new, larger size. In the worst case, when `size` defines the limit of the 
high watermark, this can lead to a slice containing uncommitted data.
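
One possible way to close this window, sketched here as an assumption rather than as the actual patch, is to read the size once and use that snapshot for both the check and the assignment. The class `SliceBounds`, its methods, and the `betweenReads` hook are hypothetical; they model only the bounds arithmetic of `slice`, not `FileRecords` itself. The hook stands in for a concurrent write so that the interleaving can be reproduced deterministically.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the slice bounds logic; not the real FileRecords class.
final class SliceBounds {
    private final int start;
    // Stand-in for the file size, which a concurrent writer may grow.
    private final AtomicInteger size;

    SliceBounds(int start, int initialSize) {
        this.start = start;
        this.size = new AtomicInteger(initialSize);
    }

    int sizeInBytes() {
        return size.get();
    }

    // Simulates an append that grows the file.
    void append(int bytes) {
        size.addAndGet(bytes);
    }

    // Mirrors the logic from the report: the size is read twice.
    // betweenReads runs after the check and before the second read,
    // simulating a write that lands in that window.
    int racyEnd(int position, int length, Runnable betweenReads) {
        int end = start + position + length;
        // handle integer overflow or if end is beyond the end of the file
        if (end < 0 || end >= start + sizeInBytes()) {
            betweenReads.run();
            end = start + sizeInBytes(); // may observe the larger size
        }
        return end;
    }

    // Sketch of a fix: take one snapshot of the size and reuse it.
    int snapshotEnd(int position, int length, Runnable betweenReads) {
        int currentSizeInBytes = sizeInBytes();
        betweenReads.run(); // a write here no longer affects the result
        int end = start + position + length;
        if (end < 0 || end > start + currentSizeInBytes)
            end = start + currentSizeInBytes;
        return end;
    }
}
```

With a 100-byte file and a request for bytes 0 through 100 (the high watermark in this scenario), a 50-byte append in the window makes `racyEnd` return 150, while `snapshotEnd` still returns 100.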



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
