Re: Joining data in Streaming

2018-02-05 Thread Steven Wu

Re: Joining data in Streaming

2018-01-31 Thread Stefan Richter

RE: Joining data in Streaming

2018-01-30 Thread Marchant, Hayden

Re: Joining data in Streaming

2018-01-30 Thread Xingcan Cui
Hi Hayden, a full-history join on two streams is not natively supported yet. As a workaround, you may implement a CoProcessFunction and cache the records from both sides in state until the stream with less data has been fully cached. Then you could safely clear the cache for
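
The caching workaround described above can be sketched without any Flink dependency. The following is a minimal, library-free Python illustration of the logic only, not Flink's CoProcessFunction API. The class name, the method names, and the explicit "small side complete" signal are all hypothetical, and the sketch assumes one reading of the truncated sentence: once the smaller stream is fully cached, the buffered records of the larger stream are no longer needed.

```python
from collections import defaultdict


class TwoSidedCachingJoin:
    """Hypothetical sketch of a keyed two-input join; not Flink API."""

    def __init__(self):
        self.small_cache = defaultdict(list)  # key -> records of the smaller stream
        self.large_cache = defaultdict(list)  # key -> records of the larger stream
        self.small_complete = False

    def on_small(self, key, value):
        # Join the new small-side record with every large-side record buffered so far.
        out = [(key, value, large) for large in self.large_cache.get(key, [])]
        self.small_cache[key].append(value)
        return out

    def on_large(self, key, value):
        # Join the new large-side record with every small-side record cached so far.
        out = [(key, small, value) for small in self.small_cache.get(key, [])]
        # Buffer it only while more small-side records may still arrive.
        if not self.small_complete:
            self.large_cache[key].append(value)
        return out

    def on_small_complete(self):
        # Assumed signal: the smaller stream is fully cached, so buffered
        # large-side records have already met every possible partner.
        self.small_complete = True
        self.large_cache.clear()


join = TwoSidedCachingJoin()
results = []
results += join.on_large("a", "L1")  # no partner yet, so it is buffered
results += join.on_small("a", "S1")  # joins with the buffered L1
results += join.on_small("b", "S2")
join.on_small_complete()
results += join.on_large("b", "L2")  # joined immediately, not buffered
results += join.on_large("a", "L3")
print(results)
```

In a real streaming job there is usually no explicit end-of-stream signal, so deciding when the smaller side counts as "fully cached" is the part this sketch leaves open.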

Re: Joining data in Streaming

2018-01-30 Thread Stefan Richter
Hi, as far as I know, this is not easily possible. What would be required is something like a CoFlatMap function, where one input stream blocks until the second stream is fully consumed to build up the state to join against. Maybe Aljoscha (in CC) can comment on future plans to support
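
The blocking behaviour described above amounts to a build-then-probe join. Here is a minimal plain-Python sketch of that idea, assuming both inputs are finite iterables of (key, value) pairs. The function name and the sample data are hypothetical, and nothing here is Flink API.

```python
from collections import defaultdict


def build_then_probe_join(build_side, probe_side):
    """Consume build_side completely, then stream probe_side against it."""
    # Build phase: the probe input is not touched until this loop finishes,
    # which corresponds to "blocking" the other stream.
    table = defaultdict(list)
    for key, value in build_side:
        table[key].append(value)

    # Probe phase: each probe record is joined on arrival and never stored.
    for key, value in probe_side:
        for match in table.get(key, ()):
            yield key, match, value


small = [("a", 1), ("b", 2), ("a", 3)]
large = iter([("a", "x"), ("c", "y"), ("b", "z")])
joined = list(build_then_probe_join(small, large))
print(joined)
```

Only the build side is held in memory, so the smaller data set is the natural choice for it.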

Joining data in Streaming

2018-01-30 Thread Marchant, Hayden
We have a use case with two data sets: one reasonably large data set (a few million entities) and a smaller one. We want to join these data sets, and we will do the join once both data sets are available. In the world of batch processing, this is pretty