[
https://issues.apache.org/jira/browse/KAFKA-12314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281526#comment-17281526
]
Sagar Rao commented on KAFKA-12314:
-----------------------------------
hey [~ableegoldman], is this something I can take up?
> Leverage custom comparator for optimized range scans on RocksDB
> ---------------------------------------------------------------
>
> Key: KAFKA-12314
> URL: https://issues.apache.org/jira/browse/KAFKA-12314
> Project: Kafka
> Issue Type: Improvement
> Reporter: A. Sophie Blee-Goldman
> Priority: Major
>
> Currently our SessionStore performs poorly on range scans because of its
> byte layout and the possibility of variable-length keys. A session window
> consists of the key and two timestamps, the windowEnd and windowStart. This
> data is formatted as
> [key, windowEnd, windowStart]
> The default comparator in RocksDB is lexicographic, and so it compares
> bytes starting with the key. This means with the above format, the records
> are effectively sorted first by key and then by windowEnd. But if two keys
> are of different lengths, the comparator will start on the left and end up
> comparing the tail bytes of the longer key against the windowEnd timestamp of
> the shorter key. Due to this, we have to set the bounds on SessionStore range
> scans very conservatively, which means we end up reading way more data than
> we need.
> One way out of this would be to use a custom comparator which understands the
> window bytes format we use. So far we haven't done this because of the
> overhead of crossing the JNI boundary with a Java Comparator; we would need
> a native comparator to avoid a further performance hit.
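
To illustrate the ordering problem described above, here is a minimal sketch in plain Java. It is not Kafka or RocksDB code: the class name, the helper methods, and the scan bounds are all hypothetical, and it assumes both timestamps are serialized as 8-byte big-endian longs after the key. It compares a plain lexicographic ordering with a layout-aware ordering that treats the trailing 16 bytes as timestamps. A production comparator would presumably be written natively, as the description notes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; none of these names come from Kafka or RocksDB.
final class SessionKeyLayoutDemo {
    // Assumption: each timestamp occupies 8 bytes at the tail of the record key.
    static final int SUFFIX = 2 * Long.BYTES;

    // Serializes a record key as [key, windowEnd, windowStart].
    static byte[] sessionKey(String key, long windowEnd, long windowStart) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(k.length + SUFFIX)
                .put(k).putLong(windowEnd).putLong(windowStart).array();
    }

    // Plain lexicographic comparison on unsigned bytes (Java 9+).
    static int compareLex(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Layout-aware comparison: key bytes first, then windowEnd, then windowStart.
    static int compareLayoutAware(byte[] a, byte[] b) {
        int aKeyLen = a.length - SUFFIX;
        int bKeyLen = b.length - SUFFIX;
        int c = Arrays.compareUnsigned(a, 0, aKeyLen, b, 0, bKeyLen);
        if (c != 0) return c;
        ByteBuffer ba = ByteBuffer.wrap(a);
        ByteBuffer bb = ByteBuffer.wrap(b);
        c = Long.compare(ba.getLong(aKeyLen), bb.getLong(bKeyLen));
        if (c != 0) return c;
        return Long.compare(ba.getLong(aKeyLen + Long.BYTES),
                            bb.getLong(bKeyLen + Long.BYTES));
    }

    // True if lower <= candidate <= upper under the chosen ordering.
    static boolean inRange(byte[] lower, byte[] candidate, byte[] upper, boolean layoutAware) {
        int lo = layoutAware ? compareLayoutAware(lower, candidate) : compareLex(lower, candidate);
        int hi = layoutAware ? compareLayoutAware(candidate, upper) : compareLex(candidate, upper);
        return lo <= 0 && hi <= 0;
    }

    public static void main(String[] args) {
        // Illustrative bounds covering every session of key "a".
        byte[] lower = sessionKey("a", 0L, 0L);
        byte[] upper = sessionKey("a", Long.MAX_VALUE, Long.MAX_VALUE);
        // A record that belongs to a different, longer key.
        byte[] other = sessionKey("ab", 1_000L, 500L);

        System.out.println("lexicographic scan includes \"ab\": "
                + inRange(lower, other, upper, false));
        System.out.println("layout-aware scan includes \"ab\": "
                + inRange(lower, other, upper, true));
    }
}
```

Under the lexicographic ordering, the byte `b` of the longer key is compared against the first byte of the shorter key's windowEnd, so the record for "ab" falls inside the bounds for "a". Under the layout-aware ordering it falls outside them.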