Kafka Sinking from DataSet

2019-02-05 Thread Jonny Graham
Hi, I'm using HadoopInputs.readHadoopFile() to read a Parquet file, which gives me a DataSource which (as far as I can see) is basically a DataSet. I want to write data from this source into Kafka, but the Kafka sink only works on a DataStream. There's no easy way to convert my DataSet to a …
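Flink ships no Kafka sink for the DataSet (batch) API, so the usual workaround is to iterate the finite batch result and publish each record with an ordinary producer. Below is a minimal, stdlib-only Python sketch of that pattern; `FakeProducer` is a hypothetical in-memory stand-in for a real Kafka client (the `send`/`flush` names mirror common producer APIs but are assumptions, not Flink or Kafka code):

```python
class FakeProducer:
    """In-memory stand-in for a Kafka producer (an assumption, not a real client)."""
    def __init__(self):
        self.topics = {}

    def send(self, topic, value):
        # Buffer the record under its topic, as a real producer would enqueue it.
        self.topics.setdefault(topic, []).append(value)

    def flush(self):
        pass  # a real producer would block here until buffered records are acknowledged


def sink_dataset_to_topic(records, producer, topic):
    """Iterate a finite batch (DataSet-like) result and publish each record."""
    for record in records:
        producer.send(topic, record)
    producer.flush()


# e.g. rows previously read from a Parquet file via the batch job
batch_result = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]

producer = FakeProducer()
sink_dataset_to_topic(batch_result, producer, "events")
print(len(producer.topics["events"]))  # -> 2
```

The same loop works with any real producer that exposes `send` and `flush`; the point is only that a bounded DataSet result can be drained record-by-record into a topic outside of Flink's sink machinery.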

RE: Kafka stream fed in batches throughout the day

2019-01-22 Thread Jonny Graham
would remain open for that time. Thanks, Jonny From: miki haiat [mailto:miko5...@gmail.com] Sent: Monday, January 21, 2019 5:07 PM To: Jonny Graham Cc: user@flink.apache.org Subject: Re: Kafka stream fed in batches throughout the day In Flink you can't read data from Kafka in the DataSet API (Batch …

Kafka stream fed in batches throughout the day

2019-01-21 Thread Jonny Graham
We have a Kafka stream of events that we want to process with a Flink DataStream job. However, the stream is populated by an upstream batch process that only executes every few hours, so the stream has very 'bursty' behaviour. We need a window based on event time to await the next events …
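The question above is about event-time windows that must stay open between bursts. The core mechanism is a watermark: windows close only when the maximum event time seen so far (minus an allowed lateness) passes the window end, regardless of how long wall-clock time the burst gap lasts. A minimal stdlib-only Python sketch of tumbling event-time windows with a watermark (not Flink code; `WINDOW` and `LATENESS` values are assumptions for illustration):

```python
from collections import defaultdict

WINDOW = 60    # tumbling window size in seconds (assumed)
LATENESS = 5   # allowed out-of-orderness in seconds (assumed)


def window_start(ts):
    # Align an event timestamp to the start of its tumbling window.
    return ts - (ts % WINDOW)


def process(events):
    """events: iterable of (event_time, payload) pairs in arrival order."""
    windows = defaultdict(list)
    max_ts = float("-inf")
    emitted = []
    for ts, payload in events:
        max_ts = max(max_ts, ts)
        windows[window_start(ts)].append(payload)
        # Watermark: no event older than this is expected any more.
        watermark = max_ts - LATENESS
        # Emit every window whose end the watermark has passed.
        for start in sorted(list(windows)):
            if start + WINDOW <= watermark:
                emitted.append((start, windows.pop(start)))
    return emitted, windows  # closed windows, plus still-open ones


closed, open_windows = process([(10, "a"), (20, "b"), (70, "c"), (130, "d")])
print(closed)  # -> [(0, ['a', 'b']), (60, ['c'])]
```

Note that the window [120, 180) stays open until a later event advances the watermark past 180, which mirrors the bursty-stream behaviour in the thread: between batches, no new events arrive, the watermark does not advance, and open windows simply wait.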