Yes, currently Kafka Streams does not provide a natural construct for a list store. Personally I'd still recommend considering option 2) if your computational pattern falls into that category.
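
For illustration only, here is a minimal sketch of what option 2) could look like. It is not Kafka Streams code: a plain `HashMap` stands in for the state store, and the class and method names (`RunningAggregate`, `onEvent`, `emit`) are hypothetical. The idea is that each incoming event is folded into the single stored result for its key, so no list ever needs to be kept.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Hypothetical sketch of option 2); a HashMap plays the role of the state store. */
class RunningAggregate<K, V> {
    private final Map<K, V> store = new HashMap<>();
    // Assumed to be commutative and associative, e.g. sum, min, max.
    private final BinaryOperator<V> combine;

    RunningAggregate(BinaryOperator<V> combine) {
        this.combine = combine;
    }

    /** Fold one event into the stored result for its key. */
    void onEvent(K key, V value) {
        store.merge(key, value, combine);
    }

    /** Called when the trigger fires: return the result and clear the state for the key. */
    V emit(K key) {
        return store.remove(key);
    }
}
```

With `Integer::sum` as the combiner, events 1 and 2 for the same key leave a single stored value of 3, which `emit` returns and removes.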
Guozhang

On Wed, Jul 25, 2018 at 1:43 AM, Andrea Spina <andrea.sp...@radicalbit.io> wrote:

> Hi Guozhang,
> thanks for the answer. I meant that I was considering using a surrogate
> key X = (A, B), where A is the actual key and B is something that lets me
> understand it belongs to the list B, which is the last list collected
> until the n-th emission.
> Anyway, 1) is far away from the RocksDB optimization layer AFAIK, 2) is a
> little tricky, and the third approach sounds really interesting; I think
> I'll give it a shot.
>
> So AFAIU Kafka Streams does not provide any construct allowing this task,
> is this right?
>
> Thanks,
>
> Andrea
>
> 2018-07-24 23:54 GMT+02:00 Guozhang Wang <wangg...@gmail.com>:
>
> > Hello Andrea,
> >
> > I do not fully understand what `nth-id-before-emission` means here, but
> > I can think of a couple of options:
> >
> > 1) Just use a key-value store, with the value encoding the list of events
> > for that key. Whenever a new event for the key comes in, you retrieve the
> > current list for that key, update the list, and put it back into the
> > key-value store. This is programmatically the simplest, but may not be
> > ideal in performance since the default persistent store, RocksDB, is a
> > log-structured store.
> >
> > 2) If your computation is commutative and associative, you can just
> > update your computed result for a key whenever a new event of that key
> > is received.
> >
> > 3) This one is more complicated: you can use two stores, where the first
> > store is just a plain persistent buffer of all the events not processed /
> > emitted so far, and the second store is an index from key to the
> > locations of the list of events for this key. Its efficiency should be
> > better than 1), but it is also more complicated to program.
> >
> >
> > Guozhang
> >
> >
> > On Tue, Jul 24, 2018 at 3:53 AM, Andrea Spina <andrea.sp...@radicalbit.io>
> > wrote:
> >
> > > Dear community,
> > > I'd like to add to my topology a stateful operator, backed by a Store,
> > > whose job is to save some computation A.
> > >
> > > I'd like to implement it so that it can store, under the same key, a
> > > list of values, appending A as events come in. In Apache Flink, for
> > > example, something similar can be achieved with the ListState
> > > construct. When the semantics of the events change somehow, I'll
> > > trigger the emission of what I stored in my list state.
> > >
> > > I went through the window store code because I thought the base concept
> > > was quite similar (appending events as they come and computing updates
> > > with the related iterator), but I wasn't able to find any inspiration.
> > >
> > > For now I'm considering a key-value store with a surrogate key like
> > > key = (event_key, nth-id-before-emission), which allows me to retrieve
> > > the list of computations when the trigger is fired.
> > >
> > > Are there better approaches for achieving this task, or are there
> > > constructs that already make this possible?
> > >
> > > Thank you everybody.
> > >
> > > --
> > > *Andrea Spina*
> > > Software Engineer @ Radicalbit Srl
> > > Via Borsieri 41, 20159, Milano - IT
> > >
> >
> >
> > --
> > -- Guozhang
> >
>
>
> --
> *Andrea Spina*
> Software Engineer @ Radicalbit Srl
> Via Borsieri 41, 20159, Milano - IT
>

--
-- Guozhang
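
The surrogate-key idea from the original question, key = (event_key, nth-id-before-emission), might look roughly like the sketch below. It is only an illustration under stated assumptions: a `TreeMap` stands in for an ordered key-value store, the second key component is taken to be a per-key sequence number, and all names (`CompositeKey`, `SurrogateKeyListStore`, `append`, `emit`) are hypothetical. Against a real persistent store, the key serializer would also have to produce bytes that sort in the same order as the composite key, which this sketch does not address.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical composite key: (event key, per-key sequence number). */
final class CompositeKey implements Comparable<CompositeKey> {
    final String eventKey;
    final long seq;

    CompositeKey(String eventKey, long seq) {
        this.eventKey = eventKey;
        this.seq = seq;
    }

    @Override
    public int compareTo(CompositeKey other) {
        // Order by event key first, then by sequence number.
        int c = eventKey.compareTo(other.eventKey);
        return c != 0 ? c : Long.compare(seq, other.seq);
    }
}

/** Sketch of a per-key list emulated on top of a sorted map. */
class SurrogateKeyListStore<V> {
    private final TreeMap<CompositeKey, V> store = new TreeMap<>();
    private final Map<String, Long> lastSeq = new HashMap<>();

    /** Append a value for the key by writing it under the next sequence number. */
    void append(String eventKey, V value) {
        long seq = lastSeq.merge(eventKey, 1L, Long::sum); // 1, 2, 3, ...
        store.put(new CompositeKey(eventKey, seq), value);
    }

    /** On trigger: range-scan all entries of the key, return them in order, and delete them. */
    List<V> emit(String eventKey) {
        NavigableMap<CompositeKey, V> range = store.subMap(
                new CompositeKey(eventKey, 0L), true,
                new CompositeKey(eventKey, Long.MAX_VALUE), true);
        List<V> values = new ArrayList<>(range.values());
        range.clear();            // the view is backed by the store, so this deletes the entries
        lastSeq.remove(eventKey); // restart numbering after an emission
        return values;
    }
}
```

Each append is a single put rather than a read-modify-write of the whole list, at the cost of a range scan and deletes when the trigger fires.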