Hi,

if the workarounds that Xingcan and I mentioned are not an option for your
use case, then I think this is currently the better option. That said, I
would expect better support for stream joins in the near future.
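
For illustration only, here is a minimal sketch of one buffering workaround
of this kind, written as plain Java without any Flink dependency so that it
runs on its own. Both inputs remember what they have seen per key, so a pair
is emitted no matter which side arrives first. The class name BufferingJoin,
the method names and the string output format are all hypothetical. In an
actual job, the two methods would play the roles of flatMap1 and flatMap2 in
a CoFlatMapFunction on connected, keyed streams, and the maps would have to
be Flink keyed state instead of heap maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Symmetric buffering join: each side stores its elements per key. */
class BufferingJoin {
    // Assumption: in a Flink job these would be keyed state, not heap maps.
    private final Map<String, List<String>> leftSeen = new HashMap<>();
    private final Map<String, List<String>> rightSeen = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    /** Element from the first input (the role flatMap1 would play). */
    void onLeft(String key, String value) {
        leftSeen.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        // Join against everything already buffered from the other side.
        for (String right : rightSeen.getOrDefault(key, Collections.emptyList())) {
            output.add(key + ":" + value + "|" + right);
        }
    }

    /** Element from the second input (the role flatMap2 would play). */
    void onRight(String key, String value) {
        rightSeen.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        for (String left : leftSeen.getOrDefault(key, Collections.emptyList())) {
            output.add(key + ":" + left + "|" + value);
        }
    }

    /** Joined pairs emitted so far, in emission order. */
    List<String> results() {
        return output;
    }
}
```

Note that this keeps every element of both inputs in memory and never knows
when an input is complete, so it only fits if the buffered data is small
enough to hold as state.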

Best,
Stefan

> On 31.01.2018 at 07:04, Marchant, Hayden <hayden.march...@citi.com> wrote:
> 
> Stefan,
> 
> So are we essentially saying that in this case, for now, I should stick to 
> DataSet / Batch Table API?
> 
> Thanks,
> Hayden
> 
> -----Original Message-----
> From: Stefan Richter [mailto:s.rich...@data-artisans.com] 
> Sent: Tuesday, January 30, 2018 4:18 PM
> To: Marchant, Hayden [ICG-IT] <hm97...@imceu.eu.ssmb.com>
> Cc: user@flink.apache.org; Aljoscha Krettek <aljos...@apache.org>
> Subject: Re: Joining data in Streaming
> 
> Hi,
> 
> as far as I know, this is not easily possible. What would be required is 
> something like a CoFlatMap function in which one input stream is blocked 
> until the second stream is fully consumed, so that the state to join against 
> can be built up first. Maybe Aljoscha (in CC) can comment on future plans to 
> support this.
> 
> Best,
> Stefan
> 
>> On 30.01.2018 at 12:42, Marchant, Hayden <hayden.march...@citi.com> wrote:
>> 
>> We have a use case where we have 2 data sets: one reasonably large data set 
>> (a few million entities) and a smaller set of data. We want to join these 
>> data sets, and we will do the join after both data sets are available. In 
>> the world of batch processing, this is pretty straightforward: we'd load 
>> both data sets into an application and execute a join operator on them 
>> through a common key. Is it possible to do such a join using the DataStream 
>> API? I would assume that I'd use the connect operator, though I'm not sure 
>> exactly how I should do the join. Do I need the 'smaller' set to be 
>> completely loaded into state before I start flowing the large set? My 
>> concern is that if I read both data sets from streaming sources, I have no 
>> guarantee of the order in which the data is loaded, so I may lose lots of 
>> potential joined entities because their pairs might not have been read yet.
>> 
>> 
>> Thanks,
>> Hayden Marchant
>> 
>> 
> 
