Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2017-01-03 Thread Damian Guy
Thanks Guozhang - I'll remove the joins. I agree we need to refactor TopologyBuilder, but I think we'll need another KIP for that. Thanks, Damian On Fri, 30 Dec 2016 at 01:32 Guozhang Wang wrote: > 1/2: Sounds good, let's remove the joins within KGlobalTable for now. > > 3. I see, makes sense.

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-29 Thread Guozhang Wang
1/2: Sounds good, let's remove the joins within KGlobalTable for now. 3. I see, makes sense. Unfortunately, since TopologyBuilder is a public class, we cannot separate its internal-use-only functions like build / buildWithGlobalTables / etc. from other user functions like stream / table / etc. We

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-20 Thread Damian Guy
Hi Guozhang, Thanks for your input. Answers below, but I'm thinking we should remove joins from GlobalKTables for the time being and revisit if necessary in the future. 1. With a global table the joins are never really materialized (at least how I see it), rather they are just views on the exist

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-20 Thread Guozhang Wang
One more thing to add: 6. For KGlobalTable, it is always bootstrapped from the beginning while for other KTables, we are enabling users to override their resetting position as in https://github.com/apache/kafka/pull/2007 Should we consider doing the same for KGlobalTable as well? Guozhang On

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-20 Thread Guozhang Wang
Thanks for the very well-written proposal, and sorry for the very late review. I have a few comments here: 1. We are introducing a "queryableViewName" in the GlobalTable join results, while I'm wondering if we should just add a more general function like "materialize" to KTable and KGlobalTable wi

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-09 Thread Damian Guy
Thanks for the update, Michael. I just wanted to add that there is one crucial piece of information that I've failed to add (I apologise). To me, the join between 2 Global Tables just produces a view on top of the underlying tables (this is the same way it works for KTables today). So that means th

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-09 Thread Michael Noll
Damian and I briefly chatted offline (thanks, Damian!), and here's the summary of my thoughts and conclusion. TL;DR: Let's skip outer join support for global tables. In more detail: - We agreed that, technically, we can add OUTER JOIN support. However, outer joins only work if certain precondit

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-08 Thread Damian Guy
Hi Michael, I don't see how that helps? Let's say we have tables Person(id, device_id, name, ...), Device(id, person_id, type, ...), and both are keyed with the same type. And we have a stream that, for the sake of simplicity, has both person_id and device_id (I know this is a bit contrived!) so our

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-08 Thread Michael Noll
The key type returned by both KeyValueMappers (in the current trunk version, that type is named `R`) would need to be the same for this to work. On Wed, Dec 7, 2016 at 4:46 PM, Damian Guy wrote: > Michael, > > We can only support outerJoin if both tables are keyed the same way. Let's > say for e

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-07 Thread Damian Guy
Michael, We can only support outerJoin if both tables are keyed the same way. Let's say, for example, you can map both ways; however, the key for each table is of a different type. So t1 is long and t2 is string - what is the key type of the resulting GlobalKTable? So when you subsequently join to th
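The key-type concern raised in this message can be illustrated with a minimal sketch. It uses plain `java.util.Map` instances as stand-ins for tables and is not the Kafka Streams API; the class `OuterJoinKeySketch` and the method `outerJoin` are hypothetical names introduced only for illustration. An outer join has to emit a row for every key found on either side, so the result needs a single key type `K`:

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: plain maps stand in for tables.
final class OuterJoinKeySketch {

    // Both inputs must share the key type K, because the result is keyed
    // by the union of the keys from the left and the right side.
    static <K, V1, V2> Map<K, String> outerJoin(Map<K, V1> left, Map<K, V2> right) {
        Set<K> keys = new LinkedHashSet<>(left.keySet());
        keys.addAll(right.keySet());

        Map<K, String> result = new LinkedHashMap<>();
        for (K key : keys) {
            // A side without a match contributes null, as in an outer join.
            result.put(key, left.get(key) + "|" + right.get(key));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, String> t1 = new LinkedHashMap<>();
        t1.put(1L, "a");
        t1.put(2L, "b");

        Map<Long, String> t2 = new LinkedHashMap<>();
        t2.put(2L, "x");
        t2.put(3L, "y");

        System.out.println(outerJoin(t1, t2));
    }
}
```

If `t1` were a `Map<Long, String>` and `t2` a `Map<String, String>`, the call would compile only with `K` inferred as a common supertype of both key types and a caller expecting `Map<Long, String>` would be rejected, which mirrors the question of what the key type of the joined table would be.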

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-07 Thread Michael Noll
Damian, yes, that makes sense. But I am still wondering: In your example, there's no prior knowledge "can I map from t1->t2" that Streams can leverage for joining t1 and t2 other than blindly relying on the user to provide an appropriate KeyValueMapper for K1/V1 of t1 -> K2/V2 of t2. In other w

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-07 Thread Damian Guy
Hi Michael, Sure. Say we have 2 input topics t1 & t2 below: t1 { int key; string t2_id; ... } t2 { string key; ... } If we create global tables out of these we'd get: GlobalKTable t1; GlobalKTable t2; So the join can only go in 1 direction, i.e., from t1 -> t2 as in order to perform the join
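The one-directional join described in this message can be sketched as follows. This is a minimal illustration with plain `java.util.Map` instances standing in for the two tables, not the Kafka Streams API; the class `GlobalLookupJoinSketch`, the method `lookupJoin`, and the sample data are hypothetical. For brevity the value of each `t1` record is just its `t2_id`:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical illustration only: plain maps stand in for tables.
final class GlobalLookupJoinSketch {

    // Joins left -> right: the mapper derives the right-side key from each
    // left record, and the right table is then queried by that key.
    static <K1, V1, K2, V2> Map<K1, String> lookupJoin(
            Map<K1, V1> left,
            Map<K2, V2> right,
            BiFunction<K1, V1, K2> rightKeyMapper) {
        Map<K1, String> result = new LinkedHashMap<>();
        for (Map.Entry<K1, V1> record : left.entrySet()) {
            K2 rightKey = rightKeyMapper.apply(record.getKey(), record.getValue());
            V2 rightValue = right.get(rightKey);
            if (rightValue != null) {
                // The result keeps the left key; unmatched records are dropped.
                result.put(record.getKey(), record.getValue() + "+" + rightValue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // t1: int key, value is the t2_id it refers to.
        Map<Integer, String> t1 = new LinkedHashMap<>();
        t1.put(1, "a");
        t1.put(2, "b");
        t1.put(3, "zz"); // refers to a t2 key that does not exist

        // t2: string key; its records carry no reference back to t1.
        Map<String, String> t2 = new LinkedHashMap<>();
        t2.put("a", "alpha");
        t2.put("b", "beta");

        System.out.println(lookupJoin(t1, t2, (key, t2Id) -> t2Id));
        // prints {1=a+alpha, 2=b+beta}
    }
}
```

Going from `t2` to `t1` would need a mapper that turns a `t2` record into an `int` key of `t1`, and in this example a `t2` record holds nothing from which such a key could be derived.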

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-07 Thread Michael Noll
> There is no outer-join for GlobalKTables as the tables may be keyed > differently. So you need to use the key from the left side of the join > along with the KeyValueMapper to resolve the right side of the join. This > won't work the other way around. Care to elaborate why it won't work the other

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-07 Thread Damian Guy
Hi Matthias, Thanks for the feedback. There is no outer-join for GlobalKTables as the tables may be keyed differently. So you need to use the key from the left side of the join along with the KeyValueMapper to resolve the right side of the join. This won't work the other way around. On the bootst

Re: [DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-06 Thread Matthias J. Sax
Thanks for the KIP, Damian. Very nice motivating example! A few comments: - Why is there no outer-join for GlobalKTables? - On bootstrapping GlobalKTable, could it happen that this never finishes if the application fails before bootstrapping finishes and new data gets written at the same time? Do

[DISCUSS] KIP-99: Add Global Tables to Kafka Streams

2016-12-06 Thread Damian Guy
Hi all, I would like to start the discussion on KIP-99: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67633649 Looking forward to your feedback. Thanks, Damian