Till,

Thanks again for putting this together.  It is certainly along the lines of
what I want to accomplish, but I see some problem with it.  In your code
you use a ValueStore to store the priority queue.  If you are expecting to
store a lot of values in the queue, then you are likely to be using RocksDB
as the state backend.  But if you use a ValueStore for the priority queue
with RocksDB as the backend, the whole priority queue will be deserialized
and serialized each time you add an item to it.  That will become a
crushing cost as the queue grows.

I could instead use a ListState with the RocksDB state, that way only the
single value being added is serialized on an add.  But the get() operation
in the RocksDBListState seems very inefficient, defeating the idea of
working with data sets that don't fit in memory.  It loads all values into
a List instead of returning an Iterable that returns values in the list by
iterating via the RockDB scan API.  Samza has the advantage here, as it
provides a ordered KV state API that allows you to truly iterate over the
values in RocksDB and a caching lager to batch writes into RocksDB. I am
surprised there is no OrderedKeyValueStore API Flink.

Given that only the RocksDB backend can store more state that can fit in
memory and the cost associated with its get() method when keeping track of
a list, it seems like there isn't a good solution keep track of large state
in the form of a list or ordered list in Flink right now.


On Wed, Apr 20, 2016 at 10:13 AM, Till Rohrmann <trohrm...@apache.org>
wrote:

> The stream operator would do the following: Collecting the inputs from
> (x,y) and z which are already keyed. Thus, we know that x=z holds true.
> Using a priority queue we order the elements because we don't know how the
> arrive at the operator. Whenever we receive a watermark indicating that no
> earlier events can arrive anymore, we can go through the two priority
> queues to join the elements. The queues are part of the operators state so
> that we don't lose information in case of a recovery.
>
> I've sketched such an operator here [1]. I hope this helps you to get
> started.
>

Reply via email to