Re: Create KTable from two topics

2016-06-03 Thread Guozhang Wang
I tried to re-run your application code locally; you can find the sample I wrote here:
https://gist.github.com/guozhangwang/ed8936e5861378082e757d87a44916f1
If I do "joined.toStream().print();" this is the output:
127339 , null
131933 , null
128072 , null
128074 , null
*123507 , null
128073 , nul

RE: Create KTable from two topics

2016-06-03 Thread Philip Remsberg
Guozhang, The output I pasted doesn't strictly follow that definition. The key you mentioned (128073) is the only one with two records. I kept that intentionally to see the behavior. All other keys have onl

FW: Create KTable from two topics

2016-06-03 Thread Philip Remsberg
I believe that this should be going to Paul now. Philip Remsberg -Original Message- From: Srikanth

Re: Create KTable from two topics

2016-06-03 Thread Srikanth
Guozhang, The output I pasted doesn't strictly follow that definition. The key you mentioned (128073) is the only one with two records. I kept that intentionally to see the behavior. All other keys have only one record. Yet they are all printed twice. The data I pasted is all I had; it's not a sample.

Re: Create KTable from two topics

2016-06-02 Thread Guozhang Wang
Hello Srikanth, When involved in joins, a KTable needs to pass both the old value and the new value as a pair to the join operator, since it is "an ever updating table with the underlying changelog". For example, your topic1 stream has key "128073" with its value updated from 542361 to 560608. Th

Re: Create KTable from two topics

2016-06-02 Thread Srikanth
I did try approach 3 yesterday with the following sample data.
topic1:
127339 538433
131933 626026
128072 536012
128074 546262
*123507 517631*
128073 542361
128073 560608
topic2:
128074 100282
131933 100394
127339 100445
128073 100710
*123507 100226*
I joined these and printed the re
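For reference, here is a minimal plain-Java sketch of the final state a keyed table join would hold for the sample data above. It does not use the Kafka Streams API; the class and method names (KeyedJoinSketch, toTable, join) are hypothetical. It assumes each topic is read as a table that keeps only the latest value per key, and that the join is an inner join on the key. It says nothing about the intermediate records or the print order that Kafka Streams itself produces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyedJoinSketch {

    /** Collapse a changelog of {key, value} records into a table: the latest value per key wins. */
    public static Map<String, String> toTable(String[][] records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] record : records) {
            table.put(record[0], record[1]);
        }
        return table;
    }

    /** Inner join on key: keep only keys present in both tables, pairing the two values. */
    public static Map<String, String> join(Map<String, String> left, Map<String, String> right) {
        Map<String, String> joined = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : left.entrySet()) {
            String other = right.get(entry.getKey());
            if (other != null) {
                joined.put(entry.getKey(), entry.getValue() + "," + other);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Sample data copied from the message above.
        String[][] topic1 = {
            {"127339", "538433"}, {"131933", "626026"}, {"128072", "536012"},
            {"128074", "546262"}, {"123507", "517631"},
            {"128073", "542361"}, {"128073", "560608"}
        };
        String[][] topic2 = {
            {"128074", "100282"}, {"131933", "100394"}, {"127339", "100445"},
            {"128073", "100710"}, {"123507", "100226"}
        };
        join(toTable(topic1), toTable(topic2))
            .forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

In this sketch key 128072 drops out because it appears only in topic1, and key 128073 is joined using its later topic1 value.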

Re: Create KTable from two topics

2016-06-02 Thread Matthias J. Sax
I would not expect a performance difference.
-Matthias
On 06/02/2016 06:15 PM, Srikanth wrote:
> In terms of performance there is not going to be much difference to+table
> vs through+aggregateByKey rt?
>
> Srikanth
>
>
> On Thu, Jun 2, 2016 at 9:21 AM, Matthias J. Sax
> wrote:
>
>> Hi Srika

Re: Create KTable from two topics

2016-06-02 Thread Srikanth
In terms of performance, there is not going to be much difference between to+table and through+aggregateByKey, right?
Srikanth
On Thu, Jun 2, 2016 at 9:21 AM, Matthias J. Sax wrote:
> Hi Srikanth,
>
> your third approach seems to be the best fit. It uses only one shuffle
> of the data (which you cannot prev

Re: Create KTable from two topics

2016-06-02 Thread Matthias J. Sax
Hi Srikanth, your third approach seems to be the best fit. It uses only one shuffle of the data (which you cannot prevent in any case). If you want to put everything into a single application, you could use a "dummy" custom aggregation to convert the KStream into a KTable instead of writing into
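To illustrate the idea of a "dummy" aggregation in isolation, here is a small plain-Java sketch. It is not the Kafka Streams API; the names DummyAggregationSketch, aggregateByKey and KEEP_LATEST are hypothetical. The assumption is that "dummy" means an aggregator that ignores the running aggregate and returns the incoming value, so the resulting table holds the latest value per key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

public class DummyAggregationSketch {

    /** "Dummy" aggregator (an assumption): discard the old aggregate, keep the new value. */
    public static final BinaryOperator<String> KEEP_LATEST = (oldValue, newValue) -> newValue;

    /** Fold a stream of {key, value} records into a table, one aggregate per key. */
    public static Map<String, String> aggregateByKey(String[][] stream,
                                                     BinaryOperator<String> aggregator) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] record : stream) {
            // First record for a key is stored as-is; later ones go through the aggregator.
            table.merge(record[0], record[1], aggregator);
        }
        return table;
    }

    public static void main(String[] args) {
        String[][] stream = {
            {"128073", "542361"}, {"127339", "538433"}, {"128073", "560608"}
        };
        aggregateByKey(stream, KEEP_LATEST)
            .forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Passing a different aggregator (a hypothetical example would be string concatenation) turns the same fold into a real aggregation, which is why the keep-latest variant is only a "dummy".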

Create KTable from two topics

2016-06-01 Thread Srikanth
Hello, How do I build a KTable from two topics such that the key is in one topic and the value is in the other? For example, topic1 has a key called basekey and userId as its value. topic2 has the same basekey and locationId as its value.
topic1 = {"basekey":1,"userId":111}
topic1 = {"basekey":2,"userId":222}
topic2 = {"basekey":1