Re: Kafka Streams: How to best maintain changelog indices using the DSL?

2017-02-08 Thread Matthias J. Sax
If the mapping is 1-to-1, then you can get it done. That's good. As you observed yourself, with a non-unique mapping it's way harder (or maybe even impossible) to get this. Also, your KTable#groupBy(...)#aggregate(...) is a good solution. Thus, now I am just wondering what you mean by: >

Re: Kafka Streams: How to best maintain changelog indices using the DSL?

2017-02-08 Thread Dmitry Minkovsky
Actually... I've got the 1-to-1 variant doing wonders for me. I replaced the #aggregate() with #reduce((k, v) -> v, (k, v) -> null) and things are just lovely. Combining these indices with the various join operations, I am able to build up deeply nested structures, or eh, materialized views,
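[For readers following the archive: the adder/subtractor pair described above can be illustrated without Kafka Streams at all. A minimal sketch in plain Java, where a HashMap stands in for the reduce's state store and all names are hypothetical; the real Reducer callbacks receive (aggregate, value), which this simplifies:]

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the semantics behind KTable#groupBy(...)#reduce(adder, subtractor)
// for a 1-to-1 index: the adder keeps the new value, the subtractor returns
// null so the retracted entry is deleted.
public class ReduceIndexSketch {

    // adder: (k, v) -> v — the new value simply replaces whatever was there
    static String adder(String oldValue, String newValue) {
        return newValue;
    }

    // subtractor: (k, v) -> null — retraction deletes the indexed entry
    static String subtractor(String oldValue, String retractedValue) {
        return null;
    }

    // Apply one upstream KTable update the way Streams feeds the grouped
    // table: retract under the old key, then add under the new key.
    static void update(Map<String, String> index,
                       String oldKey, String newKey, String value) {
        if (oldKey != null) {
            String result = subtractor(index.get(oldKey), index.get(oldKey));
            if (result == null) index.remove(oldKey);
            else index.put(oldKey, result);
        }
        if (newKey != null) {
            index.put(newKey, adder(index.get(newKey), value));
        }
    }

    public static void main(String[] args) {
        Map<String, String> index = new HashMap<>();
        update(index, null, "alice", "item-1");     // first insert
        update(index, "alice", "alicia", "item-1"); // the indexed field changed
        System.out.println(index);                  // only "alicia" remains
        update(index, "alicia", null, null);        // upstream delete
        System.out.println(index);                  // empty
    }
}
```

The key property is that a change to the indexed field arrives as a retraction plus an addition, so the index never keeps a stale entry under the old key.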

Re: Kafka Streams: How to best maintain changelog indices using the DSL?

2017-02-08 Thread Dmitry Minkovsky
> And before we discuss deeper, a follow-up question: if you map from to new_key, is this mapping "unique", or could it be that two different k/v-pairs map to the same new_key? Yes, this has been central in my exploration so far. For some fields the mapping is unique; for others it is not.

Re: Kafka Streams: How to best maintain changelog indices using the DSL?

2017-02-08 Thread Matthias J. Sax
It's a difficult problem. And before we discuss deeper, a follow-up question: if you map from to new_key, is this mapping "unique", or could it be that two different k/v-pairs map to the same new_key? If there are overlaps, you end up with a different problem than if there are no overlaps,

Kafka Streams: How to best maintain changelog indices using the DSL?

2017-02-08 Thread Dmitry Minkovsky
I have a changelog that I'd like to index by some other key. So, something like this:

class Item { byte[] id; String name; }

KStreamBuilder topology = new KStreamBuilder();
KTable items = topology
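[Archive note: a sketch of where such a topology might go, using the 0.10.x-era DSL the snippet starts with and the adder/subtractor approach discussed elsewhere in the thread. The topic names, store names, and itemSerde are hypothetical, not from the original message, and the snippet requires the kafka-streams dependency.]

```java
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class ItemIndexSketch {
    public static void main(String[] args) {
        KStreamBuilder topology = new KStreamBuilder();

        Serde<Item> itemSerde = null; // hypothetical Item serde

        // Source table: items keyed by id.
        KTable<byte[], Item> items = topology.table(
            Serdes.ByteArray(), itemSerde, "items", "items-store");

        // Re-key by name. Assuming the name -> item mapping is 1-to-1,
        // the adder keeps the new value and the subtractor retracts the
        // old one by returning null.
        KTable<String, Item> itemsByName = items
            .groupBy((id, item) -> KeyValue.pair(item.name, item),
                     Serdes.String(), itemSerde)
            .reduce((agg, item) -> item,   // adder
                    (agg, item) -> null,   // subtractor
                    "items-by-name-store");
    }
}
```

With a non-unique mapping, the aggregate under each new_key would instead have to hold a collection of values, which is the harder case raised in the thread.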