Re: data enrichment with SQL use case

2018-05-15 Thread miki haiat
Hi guys, this is how I tried to solve my enrichment case: https://gist.github.com/miko-code/d615aa05b65579f4366ba9fe8a8275fd Currently we need to use *keyBy()* before the process function. My concern is if I have in flight N

Re: data enrichment with SQL use case

2018-04-26 Thread Fabian Hueske
Hi all, @Ken, the approach of telling the operator which input to read from would cause problems with the current checkpointing mechanism because checkpoint barriers are not allowed to overtake regular records. Chaining wouldn't be an issue, because operators with two inputs are not chained to

Re: data enrichment with SQL use case

2018-04-25 Thread Ken Krugler
Hi Michael, Windowing works when you’re joining timestamped metadata and non-metadata. The common case I’m referring to is where there’s some function state (e.g. rules to process data, machine learning models, or in my case clusters), where you want to process the non-metadata with the

Re: data enrichment with SQL use case

2018-04-25 Thread TechnoMage
I agree in the general case you need to operate on the stream data based on the metadata you have. The side input feature coming some day may help you, in that it would give you a means to receive inputs out of band. But, given changing metadata and changing stream data I am not sure this is

Re: data enrichment with SQL use case

2018-04-25 Thread TechnoMage
Using a flat map function, you can always buffer the non-meta data stream in the operator state until the metadata is aggregated, and then process any collected data. It would require a RichFlatMap to hold data. Michael > On Apr 25, 2018, at 1:20 PM, Ken Krugler
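
The buffering idea described above can be sketched without any framework. This is only an illustration in plain Python, not Flink code: the class `BufferingEnricher`, its method names, and the `last` flag that signals "metadata is fully loaded" are all hypothetical. In a real job the buffer and the metadata map would live in operator state so that they survive checkpoints.

```python
from typing import Any, Dict, List, Optional, Tuple

Enriched = Tuple[str, Any, Optional[Any]]


class BufferingEnricher:
    """Holds back data records until the metadata has been loaded (hypothetical helper)."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Any] = {}       # key -> metadata value
        self.pending: List[Tuple[str, Any]] = []  # records seen before metadata was ready
        self.metadata_ready = False

    def on_metadata(self, key: str, value: Any, last: bool = False) -> List[Enriched]:
        """Store one metadata entry; when `last` is set, release the buffered records."""
        self.metadata[key] = value
        if last and not self.metadata_ready:
            self.metadata_ready = True
            return self._flush()
        return []

    def on_record(self, key: str, payload: Any) -> List[Enriched]:
        """Buffer the record if metadata is not ready yet, otherwise enrich it immediately."""
        if not self.metadata_ready:
            self.pending.append((key, payload))
            return []
        return [self._enrich(key, payload)]

    def _enrich(self, key: str, payload: Any) -> Enriched:
        # Records without a matching metadata entry are emitted with None.
        return (key, payload, self.metadata.get(key))

    def _flush(self) -> List[Enriched]:
        out = [self._enrich(k, p) for k, p in self.pending]
        self.pending.clear()
        return out
```

One design question this sketch leaves open is how the operator learns that the metadata is complete; here the caller simply passes `last=True`, which is an assumption and not something the thread specifies.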

Re: data enrichment with SQL use case

2018-04-25 Thread Ken Krugler
Hi Fabian, > On Apr 24, 2018, at 3:01 AM, Fabian Hueske wrote: > > Hi Alex, > > An operator that has to join two input streams obviously requires two inputs. > In case of an enrichment join, the operator should first read the meta-data > stream and build up a data

Re: data enrichment with SQL use case

2018-04-24 Thread Fabian Hueske
Hi Alex, An operator that has to join two input streams obviously requires two inputs. In case of an enrichment join, the operator should first read the meta-data stream and build up a data structure as state against which the other input is joined. If the meta data is (infrequently) updated,

Re: data enrichment with SQL use case

2018-04-23 Thread Alexander Smirnov
Hi Fabian, please share the workarounds; they would be helpful for my case as well. Thank you, Alex On Mon, Apr 23, 2018 at 2:14 PM Fabian Hueske wrote: > Hi Miki, > > Sorry for the late response. > There are basically two ways to implement an enrichment join as in your > use

Re: data enrichment with SQL use case

2018-04-23 Thread Fabian Hueske
Hi Miki, Sorry for the late response. There are basically two ways to implement an enrichment join as in your use case. 1) Keep the meta data in the database and implement a job that reads the stream from Kafka and queries the database in an AsyncIO operator for every stream record. This should
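
As a rough illustration of option 1 (one asynchronous lookup per stream record), here is a framework-free sketch using Python's `asyncio`. It is not Flink's async I/O API. The names `FAKE_TABLE`, `lookup`, `enrich_all`, and `max_in_flight` are hypothetical, and the dictionary stands in for the real metadata table.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical stand-in for the metadata table; a real job would query the database.
FAKE_TABLE: Dict[str, str] = {"k1": "meta-1", "k2": "meta-2"}


async def lookup(key: str) -> Optional[str]:
    """Simulated asynchronous database query for a single key."""
    await asyncio.sleep(0.01)  # pretend network / database latency
    return FAKE_TABLE.get(key)


async def enrich_all(
    records: List[Tuple[str, Any]], max_in_flight: int = 10
) -> List[Tuple[str, Any, Optional[str]]]:
    """Enrich every record with one lookup each, bounding concurrent requests."""
    sem = asyncio.Semaphore(max_in_flight)

    async def enrich_one(rec: Tuple[str, Any]) -> Tuple[str, Any, Optional[str]]:
        key, payload = rec
        async with sem:  # at most max_in_flight lookups run at the same time
            meta = await lookup(key)
        return (key, payload, meta)

    # gather returns the results in the same order as the input records
    return await asyncio.gather(*(enrich_one(r) for r in records))
```

Usage would look like `asyncio.run(enrich_all([("k1", 1), ("k3", 2)], max_in_flight=2))`. Bounding the number of in-flight requests keeps the database from being flooded when the stream is fast; the limit of 10 is an arbitrary default chosen for the sketch.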

Re: data enrichment with SQL use case

2018-04-16 Thread Ken Krugler
Hi Miki, I haven’t tried mixing AsyncFunctions with SQL queries. Normally I’d create a regular DataStream workflow that first reads from Kafka, then has an AsyncFunction to read from the SQL database. If there are often duplicate keys in the Kafka-based stream, you could keyBy(key) before the

Re: data enrichment with SQL use case

2018-04-16 Thread miki haiat
Hi, thanks for the reply. I will try to break your reply down into the flow execution order. The first data stream will use AsyncIO and select the table, the second stream will be Kafka, and then I can join the streams and map it? If that is the case, then I will select the table only once on load? How can I

Re: data enrichment with SQL use case

2018-04-15 Thread Ken Krugler
If the SQL data is all (or mostly all) needed to join against the data from Kafka, then I might try a regular join. Otherwise it sounds like you want to use an AsyncFunction to do ad hoc queries (in parallel) against your SQL DB.

data enrichment with SQL use case

2018-04-15 Thread miki haiat
Hi, I have a case of meta data enrichment and I'm wondering if my approach is the correct way. 1. Input stream from Kafka. 2. Metadata (MD) in MS SQL. 3. Map to a new POJO. I need to extract a key from the Kafka stream and use it to select some values from the SQL table. So I thought to use