[GitHub] [kafka] vvcephei commented on pull request #5693: KAFKA-7223: In-Memory Suppression Buffering

GitHub Mon, 01 Oct 2018 08:53:13 -0700

Yes. This is an optimization to support maximal efficiency in:
* removing some unknown number of records, each of which is the minimum 
in the buffer at the time it gets removed
* maintaining a correct value of `minTimestamp`.


As far as we know right now, we will only ever need to remove the min records 
from the buffer. I.e., I don't think we need to iterate for a while and *then* 
remove.

But we may need to remove more than one record, and we won't know if we need to 
remove the *next* record until after we remove *this* record.

Previously, I didn't have this guard, but in that case we can't just set 
`minTimestamp` to the buffer time of the next record upon removing, because we 
don't know whether the record we just removed was the leftmost record in the 
tree without traversing it again. Because of that, I had to defer updating 
`minTimestamp` until the iterator was closed (and therefore it had to be a 
`CloseableIterator`). This meant that the KTableSuppressProcessor couldn't just 
keep popping records while `minTimestamp` was less than the desired boundary; 
it had to get the "buffer time" from the TimeKey and make its decision from 
that.

All in all, it's way cleaner this way, at the expense of that one extra guard.
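
To make the guard concrete, here is a minimal sketch of the idea, not the code from the PR. The class name `MinTrackingBuffer`, the `TreeMap<Long, String>` keyed by buffer time, and the two boolean flags are all assumptions made for illustration. The iterator only allows `remove()` while every record it has returned so far has been removed, so the removed record is always the current minimum and `minTimestamp` can be refreshed immediately from the new first key.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the suppression buffer, for illustration only.
class MinTrackingBuffer {
    // Sorted by buffer time; one value per timestamp keeps the sketch short.
    private final TreeMap<Long, String> sorted = new TreeMap<>();
    private long minTimestamp = Long.MAX_VALUE;

    void put(long time, String value) {
        sorted.put(time, value);
        minTimestamp = Math.min(minTimestamp, time);
    }

    long minTimestamp() {
        return minTimestamp;
    }

    int size() {
        return sorted.size();
    }

    Iterator<Map.Entry<Long, String>> iterator() {
        final Iterator<Map.Entry<Long, String>> delegate = sorted.entrySet().iterator();
        return new Iterator<Map.Entry<Long, String>>() {
            // True while every record returned so far has also been removed.
            private boolean allRemovedSoFar = true;
            // True when a record was returned by next() but not yet removed.
            private boolean pendingRemoval = false;

            @Override
            public boolean hasNext() {
                return delegate.hasNext();
            }

            @Override
            public Map.Entry<Long, String> next() {
                if (pendingRemoval) {
                    // The previous record was skipped, so later records are not the minimum.
                    allRemovedSoFar = false;
                }
                final Map.Entry<Long, String> entry = delegate.next();
                pendingRemoval = true;
                return entry;
            }

            @Override
            public void remove() {
                // The guard: only the current minimum may be removed.
                if (!allRemovedSoFar) {
                    throw new IllegalStateException("can only remove the minimum record");
                }
                delegate.remove();
                pendingRemoval = false;
                // Safe because the removed record was the leftmost one.
                minTimestamp = sorted.isEmpty() ? Long.MAX_VALUE : sorted.firstKey();
            }
        };
    }
}
```

With this shape, a caller can keep calling `next()` and `remove()` while `minTimestamp()` is below its boundary, and no close step is needed to bring `minTimestamp` up to date.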

I could go one step further and make it like a "predicated, consuming 
iterator", which just pops records out as long as the predicate condition is 
true. Do you think this would be more straightforward?
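
For comparison, the "predicated, consuming iterator" could look roughly like the sketch below. Everything here is hypothetical: the class name `EvictingBuffer`, the method name `evictWhile`, and the choice of `LongPredicate` and `BiConsumer` are assumptions for illustration, not an existing API. The buffer itself pops the minimum record for as long as the predicate accepts that record's buffer time, so the caller never handles an iterator at all.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;
import java.util.function.LongPredicate;

// Hypothetical sketch of a buffer that evicts its minimum records on request.
class EvictingBuffer {
    // Sorted by buffer time; one value per timestamp keeps the sketch short.
    private final TreeMap<Long, String> sorted = new TreeMap<>();

    void put(long time, String value) {
        sorted.put(time, value);
    }

    // Long.MAX_VALUE signals an empty buffer in this sketch.
    long minTimestamp() {
        return sorted.isEmpty() ? Long.MAX_VALUE : sorted.firstKey();
    }

    // Pops records in timestamp order while the predicate accepts the current
    // minimum timestamp, handing each one to the callback. Returns the count.
    int evictWhile(LongPredicate shouldEvict, BiConsumer<Long, String> callback) {
        int evicted = 0;
        while (!sorted.isEmpty() && shouldEvict.test(sorted.firstKey())) {
            final Map.Entry<Long, String> min = sorted.pollFirstEntry();
            callback.accept(min.getKey(), min.getValue());
            evicted++;
        }
        return evicted;
    }
}
```

A caller would then write something like `buffer.evictWhile(t -> t < boundary, (t, v) -> forward(v))`, where `boundary` and `forward` are placeholders for whatever the processor uses. The trade-off is that the decision about the *next* record is made inside the buffer, which removes the need for the guard but fixes the removal order to "minimum first".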

[ Full content available at: https://github.com/apache/kafka/pull/5693 ]
