Yes currently Kafka Streams does not provide natural construct for a list
store. Personally I'd still recommend considering option 2) if your
computational pattern falls in that category.


Guozhang

On Wed, Jul 25, 2018 at 1:43 AM, Andrea Spina <andrea.sp...@radicalbit.io>
wrote:

> Hi Guozhang,
> thanks for the answer. I meant that I was considering the idea to use a
> surrogate key X= (A,B) where A is the actual key and B is something let me
> understanding it belongs to the list B, which is the last list collected
> until the n-th emission.
> Anyway, 1) is far away from RocksDB optimization layer AFAIK, 2) comes
> little tricky, third approach sounds really interesting and I think I'll
> give a shot.
>
> So AFAIU kafka streams does not provide any construct allowing this task,
> is this right?
>
> Thanks,
>
> Andrea
>
> 2018-07-24 23:54 GMT+02:00 Guozhang Wang <wangg...@gmail.com>:
>
> > Hello Andrea,
> >
> > I do not fully understand what does `nth-id-before-emission` mean here,
> but
> > I can think of a couple of options here:
> >
> > 1) Just use a key-value store, with the value encoding the list of events
> > for that key. Whenever a new event of the key gets in, you retrieve the
> > current list for that key, update the list, and put it back into the
> > key-value store. This is programmably most simple, but may not be ideal
> in
> > performance since the default persistent store RocksDB is a
> log-structured
> > store.
> >
> > 2) If your computation is commutative and associative, you can just
> update
> > your computed result for a key whenever a new event of that key is being
> > received.
> >
> > 3) It is more complicated: you can use two stores, where the first store
> is
> > just a plain persistent buffer of all the events not processed / emitted
> so
> > far, and second store is an index from key to the locations of the list
> of
> > events for this key. Its efficiency should be better than 1) but also
> more
> > complicated to program.
> >
> >
> >
> > Guozhang
> >
> >
> >
> > On Tue, Jul 24, 2018 at 3:53 AM, Andrea Spina <
> andrea.sp...@radicalbit.io>
> > wrote:
> >
> > > Dear community,
> > > I'd add to my topology a stateful operator - graced with a Store -
> > demanded
> > > to save some compuation A.
> > >
> > > I'd like to implement it so that it can store, by the same key, a list
> of
> > > values by appending A by events come in. Something similar e.g. in
> Apache
> > > Flink, this can be achieved by the ListState construct. When the
> semantic
> > > of the events change somehow, I'll trigger the emission of what I
> stored
> > in
> > > my list state.
> > >
> > > I went through the window store code because I thought the base concept
> > was
> > > quite simliar (appending events as they come and computing updates with
> > the
> > > related iterator) but I wasn't able to find any inspiration.
> > >
> > > By now I'm considering a key value store, with a surrogate key like
> key =
> > > (event_key, nth-id-before-emission), which allows me to retrieve the
> list
> > > of computations when the trigger is fired.
> > >
> > > Are there better approaches by which achieving this task, either are
> > there
> > > construct already making this possible?
> > >
> > > Thank you everybody.
> > >
> > > --
> > > *Andrea Spina*
> > > Software Engineer @ Radicalbit Srl
> > > Via Borsieri 41, 20159, Milano - IT
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
>
>
> --
> *Andrea Spina*
> Software Engineer @ Radicalbit Srl
> Via Borsieri 41, 20159, Milano - IT
>



-- 
-- Guozhang

Reply via email to