Re: NPE when using spring bean in custom input format

2019-01-21 Thread Piotr Nowojski
Hi, You have to use the `open()` method to handle initialisation of the things required by your code/operators. By the nature of the LocalEnvironment, the life cycle of the operators there differs from what happens when submitting a job to a real cluster. With remote environments
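The life-cycle point above can be sketched outside Flink. The class and interface below are simplified stand-ins, not the actual Flink API: the idea is that the constructor runs on the client (and the instance is serialized to the workers), while `open()` runs on the worker after deserialization, so non-serializable resources such as a Spring bean belong in `open()`.

```java
// Simplified stand-in for Flink's rich-function life cycle (not real Flink API).
// open() runs once per task on the worker, before any element is processed,
// which is why initialising resources there avoids the NPE described above.
public class LifecycleSketch {

    interface RichFunctionSketch<I, O> {
        void open();      // called once per task before any element
        O map(I value);   // called per element
        void close();     // called once per task at the end
    }

    static class EnrichingMapper implements RichFunctionSketch<String, String> {
        private transient StringBuilder resource; // stands in for a non-serializable bean

        @Override public void open() {
            // Safe place to initialise: runs on the worker, after deserialization.
            resource = new StringBuilder("ctx");
        }

        @Override public String map(String value) {
            return resource + ":" + value; // would throw NPE if open() never ran
        }

        @Override public void close() { resource = null; }
    }

    // Tiny driver that imitates the runtime calling sequence open/map*/close.
    static java.util.List<String> run(RichFunctionSketch<String, String> fn,
                                      java.util.List<String> input) {
        fn.open();
        java.util.List<String> out = new java.util.ArrayList<>();
        for (String s : input) out.add(fn.map(s));
        fn.close();
        return out;
    }
}
```

In a real job the same pattern is expressed by extending a rich function (e.g. `RichMapFunction`) and overriding its `open(Configuration)` method.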

Re: Query on retract stream

2019-01-21 Thread Piotr Nowojski
Hi, At the moment Flink Table API/SQL is missing support for retraction streams as input (or for converting an append stream to a retraction stream). With that feature your problem would simplify to a single `SELECT uid, count(*) FROM Changelog GROUP BY uid`. There is an
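What that query would compute over a retraction-aware input can be simulated in plain Java (this is an illustration of the semantics, not Flink code): each changelog record carries a delta, +1 for an insert and -1 for a retraction, and the per-uid count is the running sum of deltas.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of `SELECT uid, count(*) FROM Changelog GROUP BY uid`
// over a changelog where each record is an insert (+1) or a retraction (-1).
public class ChangelogCount {
    public static Map<String, Long> countByUid(List<Map.Entry<String, Integer>> changelog) {
        Map<String, Long> counts = new HashMap<>();
        for (Map.Entry<String, Integer> rec : changelog) {
            // Merge the delta into the running count for this uid.
            counts.merge(rec.getKey(), (long) rec.getValue(), Long::sum);
        }
        return counts;
    }
}
```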

Re: Query on retract stream

2019-01-21 Thread Jeff Zhang
I am thinking of another approach instead of a retract stream. Is it possible to define a custom window to do this? This window would be defined per order, and then you just need to analyze the events in this window. Piotr Nowojski wrote on Mon, Jan 21, 2019 at 8:44 PM: > Hi, > > There is a missing feature in

Re: Re: Is there a way to find the age of an element in a Global window?

2019-01-21 Thread Manjusha Vuyyuru
Hi Kostas, I have a similar scenario where I have to clear window elements upon reaching some count, or clear them if they are older than one hour. I'm using the below approach and just wanted to know if it's the right way: DataStream> out = mappedFields .map(new
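The clearing rule described above (purge on a count threshold OR on age) can be sketched as a plain predicate; in Flink this decision would live in a custom `Trigger`, but the code below is only an illustration of the rule itself, with an assumed count threshold.

```java
// Plain-Java sketch of the eviction rule (not Flink trigger code):
// clear buffered elements once a count threshold is reached, or once
// the oldest buffered element is more than one hour old.
public class CountOrAgeSketch {
    static final int MAX_COUNT = 100;               // assumed threshold for illustration
    static final long MAX_AGE_MS = 60 * 60 * 1000L; // one hour

    static boolean shouldClear(int bufferedCount, long oldestTimestampMs, long nowMs) {
        return bufferedCount >= MAX_COUNT
            || (nowMs - oldestTimestampMs) > MAX_AGE_MS;
    }
}
```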

[Flink 1.6] How to get current total number of processed events

2019-01-21 Thread Thanh-Nhan Vo
Hello all, I have a question, please! I'm using Flink 1.6 to process our data in streaming mode. I wonder if, at a given event, there is a way to get the current total number of processed events (before this event). If possible, I want to get this total number of processed events as a value
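The idea being asked about, attaching to each event the count of events processed before it, can be sketched in plain Java (not Flink code; in a real Flink job such a counter would typically be kept in operator state and would count per parallel instance, not globally):

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch: pair each event with the number of events
// processed before it, which is what the question above asks for.
public class RunningCountSketch {
    public static List<String> withPriorCount(List<String> events) {
        List<String> out = new ArrayList<>();
        long processed = 0;
        for (String e : events) {
            out.add(e + "@" + processed); // count of events seen before this one
            processed++;
        }
        return out;
    }
}
```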

Temporal tables not behaving as expected

2019-01-21 Thread Chris Miller
Hi all, I'm new to Flink so am probably missing something simple. I'm using Flink 1.7.1 and am trying to use temporal table functions, but I'm not getting the results I expect. With the example code below, I would expect 4 records to be output (one for each order), but instead I'm only seeing

Kafka stream fed in batches throughout the day

2019-01-21 Thread Jonny Graham
We have a Kafka stream of events that we want to process with a Flink datastream process. However, the stream is populated by an upstream batch process that only executes every few hours. So the stream has very 'bursty' behaviour. We need a window based on event time to await the next events

Re: Query on retract stream

2019-01-21 Thread Hequn Cheng
Hi Gagan, Yes, you can achieve this with Flink Table API/SQL. However, you have to pay attention to the following things: 1) Currently, Flink only ingests append streams. In order to ingest upsert streams (streams with keys), you can use groupBy with a user-defined LAST_VALUE aggregate function. For
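The LAST_VALUE idea above can be illustrated in plain Java (this is the semantics only, not a Flink user-defined aggregate): grouping an append stream by key and keeping the most recent value per key yields an upsert view.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the LAST_VALUE-per-key idea: turn an append
// stream of (key, value) records into an upsert view by keeping only the
// latest value seen for each key.
public class LastValueSketch {
    public static Map<String, String> upsertView(List<String[]> appendStream) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] kv : appendStream) {
            latest.put(kv[0], kv[1]); // later records overwrite earlier ones
        }
        return latest;
    }
}
```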

Re: Query on retract stream

2019-01-21 Thread Piotr Nowojski
@Jeff: It depends on whether the user can define a time window for his condition. As Gagan described his problem, it was about a “global” threshold of pending orders. I have just thought of another solution that should work without any custom code. Convert the “status” field to a status_value int: - "+1"
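The trick above can be sketched in plain Java (not Flink code). The mail's preview is cut off after the "+1" case, so the mapping below is a hedged guess at the intent: a pending status contributes +1, a resolving status contributes -1, and the number of currently pending orders is the running SUM of the deltas.

```java
import java.util.List;

// Plain-Java sketch of the status_value trick: map each status change
// to an integer delta, then the count of currently pending orders is
// just the sum of all deltas seen so far.
public class PendingOrdersSketch {

    // Hypothetical mapping; the original mail only shows the "+1" case.
    static int statusValue(String status) {
        return "PENDING".equals(status) ? +1 : -1; // -1 once the order resolves
    }

    static int pendingCount(List<String> statusEvents) {
        int sum = 0;
        for (String s : statusEvents) sum += statusValue(s);
        return sum;
    }
}
```

In SQL terms this would become a plain SUM aggregation over the mapped column, which is why no custom operator code is needed.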

Re: Kafka stream fed in batches throughout the day

2019-01-21 Thread miki haiat
In Flink you can't read data from Kafka with the DataSet API (batch), and you don't want to mess with starting and stopping your job every few hours. Can you elaborate more on your use case? Are you going to use keyBy? Is there any way to use a trigger ... ? On Mon, Jan 21, 2019 at 4:43 PM Jonny Graham wrote:

Re: Query on retract stream

2019-01-21 Thread Gagan Agrawal
Thank you guys. It's great to hear multiple solutions to achieve this. I understand that records once emitted to Kafka cannot be deleted, and that's acceptable for our use case, as the last updated value should always be correct. However, as I understand it, most of these solutions will work for global

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-21 Thread Ufuk Celebi
Hey Aaron, sorry for the late reply. (1) I think I was able to reproduce this issue using snappy-java. I've filed a ticket here: https://issues.apache.org/jira/browse/FLINK-11402. Can you check the ticket description whether it's in line with what you are experiencing? Most importantly, do you

Re: Query on retract stream

2019-01-21 Thread Hequn Cheng
Hi Gagan, > But I also have a requirement for event time based sliding window aggregation Yes, you can achieve this with Flink Table API/SQL. However, currently, sliding windows don't support early fire, i.e., they only output results when event time reaches the end of the window. Once the window fires,

Flink Jdbc streaming source support in 1.7.1 or in future?

2019-01-21 Thread Manjusha Vuyyuru
Hello, Does Flink 1.7.1 support connecting to a relational database (MySQL)? I want to use MySQL as my streaming source to read some configuration. Thanks, Manju

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-21 Thread Jeff Zhang
Thanks Chesnay for raising this discussion thread. I think there are 3 major use scenarios for the Flink binary distribution. 1. Use it to set up a standalone cluster 2. Use it to experience Flink features, such as via the scala-shell or sql-client 3. Downstream projects use it to integrate with their