Flink Window with multiple trigger condition

2020-05-20 Thread aj
Hello All, I am getting a lot of user events in a stream. There are different types of events, now I want to build some aggregation metrics for the user by grouping events in buckets. My condition for windowing is : 1. Start the window for the user when event_name: *"search" *arrived for the u

Re: Developing Beam applications using Flink checkpoints

2020-05-20 Thread Eleanore Jin
Hi Ivan, Beam coders are wrapped in Flink's TypeSerializers. So I don't think it will result in double serialization. Thanks! Eleanore On Tue, May 19, 2020 at 4:04 AM Ivan San Jose wrote: > Perfect, thank you so much Arvid, I'd expect more people using Beam on > top of Flink, but it seems is n

Re: How do I get the IP of the master and slave files programmatically in Flink?

2020-05-20 Thread Yangze Guo
Hi, Felipe Do you mean to get the Host and Port of the task executor where your operator is indeed running on? If that is the case, IIUC, two possible components that contain this information are RuntimeContext and the Configuration param of RichFunction#open. After reading the relevant code path

Re: Stream Iterative Matching

2020-05-20 Thread Guowei Ma
Hi, Marc I think the window operator might fulfill your needs. You could find the detailed description here[1] In general, I think you could choose the correct type of window and use the `ProcessWindowFunction` to emit the elements that match the best sum. [1] https://ci.apache.org/projects/flink

Re: Flink Dashboard UI Tasks hard limit

2020-05-20 Thread Vijay Balakrishnan
Hi, I have increased the number of slots available but the Job is not using all the slots but runs into this approximate 18000 Tasks limit. Looking into the source code, it seems to be opening file - https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io

Re: How do I get the IP of the master and slave files programmatically in Flink?

2020-05-20 Thread Alexander Fedulov
Hi Felippe, could you clarify in some more details what you are trying to achieve? Best regards, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 Follow us @VervericaData -- Join Flink Forward - The Apache Flink Conferenc

Stream Iterative Matching

2020-05-20 Thread ba
Hi All, I'm new to Flink but am trying to write an application that processes data from internet connected sensors. My problem is as follows: -Data arrives in the format: [sensor id] [timestamp in seconds] [sensor value] -Data can arrive out of order (between sensor IDs) by upto 5 minutes. -So a

Re: Question about My Flink Application

2020-05-20 Thread Alexander Fedulov
Hi Sara, do you have logs? Any exceptions in them? Best, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 Follow us @VervericaData -- Join Flink Forward - The Apache Flink Conference Stream Processing | Event Driven | Re

Re: Testing process functions

2020-05-20 Thread Alexander Fedulov
Hi Manas, I would recommend using TestHarnesses for testing. You could also use them prior to 1.10. Here is an example of setting the dependencies: https://github.com/afedulov/fraud-detection-demo/blob/master/flink-job/build.gradle#L113 You can see some examples of tests for a demo application he

Re: [EXTERNAL] Re: Memory growth from TimeWindows

2020-05-20 Thread Slotterback, Chris
What I've noticed is that heap memory ends up growing linearly with time indefinitely (past 24 hours) until it hits the roof of the allocated heap for the task manager, which leads me to believe I am leaking somewhere. All of my windows have an allowed lateness of 5 minutes, and my watermarks ar

Re: Flink operator throttle

2020-05-20 Thread Alexander Fedulov
Hi Chen, just a small comment regarding your proposition: this would work well when one does a complete message passthrough. If there is some filtering in the pipeline, which could be dependent on the incoming stream data itself, the output throughput (the goal of the throttling) would be hard to

How do I get the IP of the master and slave files programmatically in Flink?

2020-05-20 Thread Felipe Gutierrez
Hi all, I have my own operator that extends the AbstractUdfStreamOperator class and I want to issue some messages to it. Sometimes the operator instances are deployed on different TaskManagers and I would like to set some attributes like the master and slave IPs on it. I am trying to use these va

Re: CoFlatMap has high back pressure

2020-05-20 Thread sundar
Thanks a lot for all the help! I was able to figure out the bug just now. I had some extra code in the coFlatMap function(emitting stats) which was inefficient and causing high GC usage. Fixing that fixed the issue. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble

[DISCUSS] Remove dependency shipping through nested jars during job submission.

2020-05-20 Thread Kostas Kloudas
Hi all, I would like to bring the discussion in https://issues.apache.org/jira/browse/FLINK-17745 to the dev mailing list, just to hear the opinions of the community. In a nutshell, in the early days of Flink, users could submit their jobs as fat-jars that had a specific structure. More concretel