[ 
https://issues.apache.org/jira/browse/KAFKA-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020911#comment-16020911
 ] 

Michal Borowiecki commented on KAFKA-5243:
------------------------------------------

Just a note: replacing the second argument is not an option IMO, as it would 
clash with the existing range(K from, K to) method. K is a type parameter that 
could itself be Integer, in which case range(K from, int limit) and 
range(K from, K to) would be very hard to tell apart at the call site, I think.
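
To illustrate the clash, here is a minimal sketch. The Store class below is a 
hypothetical stand-in, not the Kafka interface, and range(K from, int limit) is 
only my reading of the "replacement second argument" variant. With K bound to 
Integer, which overload runs depends solely on whether the second argument 
happens to be boxed:

```java
public class OverloadClashSketch {
    // Hypothetical stand-in for a store keyed by K (not the Kafka interface).
    public static class Store<K> {
        // Existing shape: both arguments are keys.
        public String range(K from, K to) {
            return "range(from, to)";
        }

        // Assumed "replacement second argument" shape: second argument is a row limit.
        public String range(K from, int limit) {
            return "range(from, limit)";
        }
    }

    public static void main(String[] args) {
        Store<Integer> store = new Store<>();
        Integer from = 1;
        Integer boxedTo = 5;
        int primitiveFive = 5;

        // Boxed second argument: resolves to the key-range overload.
        System.out.println(store.range(from, boxedTo));
        // Primitive second argument: resolves to the limit overload.
        System.out.println(store.range(from, primitiveFive));
    }
}
```

So for integer keys the meaning of the call would silently change with the 
static type of the second argument, which is easy to get wrong.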

Secondly, the existing range() and all() methods expressly do not guarantee 
the ordering of the returned iterator. I think the new range(from, to, limit) 
method would only make sense if the order of the returned iterator is 
consistent across invocations. This is probably not a problem for the built-in 
stores, but given that stores are meant to be pluggable, perhaps it would be 
better not to force other store implementations to take on those guarantees? 
Instead, a new interface with stronger guarantees could be added, e.g. 
ReadOnlyOrderedKeyValueStore extending ReadOnlyKeyValueStore and adding this 
extra method. It could also add the consistent-ordering promise to the 
inherited range(from, to) and all() methods. Just a thought.
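
A rough sketch of that idea, not a proposal for the actual API: the interfaces 
below are simplified, hypothetical stand-ins (they return lists rather than 
KeyValueIterator), the TreeMap-backed class is just an example implementation, 
and treating both range bounds as inclusive is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class OrderedStoreSketch {
    // Hypothetical simplified base interface: makes no ordering promise.
    public interface ReadOnlyStore<K, V> {
        List<Map.Entry<K, V>> range(K from, K to);
    }

    // Hypothetical extension: promises ascending key order, so a limit is meaningful.
    public interface ReadOnlyOrderedStore<K, V> extends ReadOnlyStore<K, V> {
        List<Map.Entry<K, V>> range(K from, K to, int limit);
    }

    // Example implementation backed by a sorted map (keys need a natural order).
    public static class TreeMapStore<K, V> implements ReadOnlyOrderedStore<K, V> {
        private final NavigableMap<K, V> data = new TreeMap<>();

        public void put(K key, V value) {
            data.put(key, value);
        }

        @Override
        public List<Map.Entry<K, V>> range(K from, K to) {
            // Both bounds inclusive (an assumption of this sketch).
            return new ArrayList<>(data.subMap(from, true, to, true).entrySet());
        }

        @Override
        public List<Map.Entry<K, V>> range(K from, K to, int limit) {
            List<Map.Entry<K, V>> page = new ArrayList<>();
            for (Map.Entry<K, V> entry : data.subMap(from, true, to, true).entrySet()) {
                if (page.size() >= limit) {
                    break; // stop once the requested number of rows is collected
                }
                page.add(entry);
            }
            return page;
        }
    }

    public static void main(String[] args) {
        TreeMapStore<String, Integer> store = new TreeMapStore<>();
        store.put("c", 3);
        store.put("a", 1);
        store.put("b", 2);
        store.put("d", 4);
        // First page of two rows in ascending key order.
        for (Map.Entry<String, Integer> entry : store.range("a", "d", 2)) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

Because the order is stable, a caller could fetch the next page by starting a 
new range just after the last key it received. Without the ordering promise, 
two calls with the same limit could return different subsets.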

It is probably best to raise a KIP and discuss it on the mailing list. Since 
this is a public API change, a KIP is required anyway:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals 

> Request to add row limit in ReadOnlyKeyValueStore range function
> ----------------------------------------------------------------
>
>                 Key: KAFKA-5243
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5243
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.1.1
>            Reporter: Joe Wood
>
> When using distributed queries across a cluster of stream stores, it's quite 
> common to use query pagination to limit the number of rows returned. The 
> {{range}} function on {{ReadOnlyKeyValueStore}} only accepts the {{from}} and 
> {{to}} keys. This means that a query must either unnecessarily retrieve the 
> entire range and limit the rows manually, or estimate the range based on the 
> key values. Neither option is ideal for processing distributed queries.
> This suggestion is to add an overload of the {{range}} function that takes a 
> third (or replacement second) argument as a suggested row limit. The number 
> of keys returned would then not exceed the supplied count.
> {code:java}
> // Get an iterator over a given range of keys, returning at most limit elements.
> KeyValueIterator<K, V> range(K from, K to, int limit);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)