Re: Iterating over state entries

Ken Krugler Mon, 19 Feb 2018 16:04:23 -0800

Hi Till,

> On Feb 19, 2018, at 8:14 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> 
> Hi Ken,
> 
> just for my clarification, the `RocksDBMapState#entries` method does not 
> satisfy your requirements? This method does not allow you to iterate across 
> different keys of your keyed stream of course. But it should allow you to 
> iterate over the different entries for a given key of your keyed stream.


As per my email to Fabian, I should have been more precise in my requirements.

I need to iterate over the entries incrementally, rather than in one complete pass.

And I'm assuming I can't keep the iterator around across calls to the function.
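
For illustration, here is a minimal sketch of how incremental iteration could work with the index-keyed layout quoted below. It is plain Java, not Flink code. The `HashMap` stands in for a `MapState<Integer, V>`, and the two `int` fields stand in for `ValueState<Integer>` instances holding the entry count and a saved cursor. The class name `IndexedEntries` and its methods `add`, `nextBatch`, `isDone` and `restart` are hypothetical names made up for this example. The idea is that each call resumes from the saved cursor using point lookups, so no iterator has to be kept across calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the state kept for ONE key of a keyed stream.
// Hypothetical illustration only; none of these names are Flink API.
final class IndexedEntries<V> {
    // Stand-in for MapState<Integer, V>: list index -> value.
    private final Map<Integer, V> entries = new HashMap<>();
    // Stand-in for a ValueState<Integer>: number of entries written so far.
    private int count = 0;
    // Stand-in for a second ValueState<Integer>: next index to visit.
    private int cursor = 0;

    // Appends a value under the next free index and bumps the count.
    void add(V value) {
        entries.put(count, value);
        count++;
    }

    // Returns up to maxEntries values starting at the saved cursor.
    // Only point lookups by index are used, so nothing but the cursor
    // has to be remembered between calls.
    List<V> nextBatch(int maxEntries) {
        int remaining = count - cursor;
        int n = Math.min(remaining, Math.max(0, maxEntries));
        List<V> batch = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            batch.add(entries.get(cursor));
            cursor++;
        }
        return batch;
    }

    // True once every entry written so far has been visited.
    boolean isDone() {
        return cursor >= count;
    }

    // Moves the cursor back to the first entry for another pass.
    void restart() {
        cursor = 0;
    }
}
```

In a real function the map, count and cursor would presumably all live in keyed state, so this only covers the entries for a single key and does not address iterating across keys.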

Regards,

— Ken


> On Mon, Feb 19, 2018 at 12:10 AM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
> Hi there,
> 
> I’ve got a MapState where I need to iterate over the entries.
> 
> This currently isn’t supported (at least for RocksDB), AFAIK, though there 
> is an issue/PR <https://issues.apache.org/jira/browse/FLINK-8297> to improve 
> this.
> 
> The best solution I’ve seen is what Fabian proposed, which involves keeping a 
> ValueState with a count of entries, and then having the key for the MapState 
> be the index.
> 
>> I cannot comment on the internal design, but you could put the data into a
>> RocksDBStateBackend MapState<Integer, X> where the value X is your data
>> type and the key is the list index. You would need another ValueState for
>> the current number of elements that you put into the MapState.
>> A MapState allows you to fetch and traverse the key, value, or entry set of
>> the Map without loading it completely into memory.
>> The sets are traversed in sort order of the key, so should be in insertion
>> order (given that you properly increment the list index).
> 
> 
> This effectively lets you iterate over all of the map entries for a given 
> (keyed) state - though it doesn’t solve the “I have to iterate over _every_ 
> entry” situation.
> 
> Is this currently the best option?

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
