Re: Multiple output operations in a job vs multiple jobs

2018-08-02 Thread Fabian Hueske
Hi, Paul is right. Which and how much data is stored in state for a window depends on the type of the function that is applied on the windows:
- ReduceFunction: Only the reduced value is stored
- AggregateFunction: Only the accumulator value is stored
- WindowFunction or ProcessWindowFunction:
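To make the difference in state size concrete, here is a minimal sketch in plain Python. It is only a model of the idea, not Flink's API: the class names `ReducingWindowState`, `AggregatingWindowState`, and `BufferingWindowState` are hypothetical and exist only for this illustration. The buffering case reflects the documented Flink behaviour that a WindowFunction or ProcessWindowFunction used without incremental aggregation keeps every element of the window until it fires; the message above is cut off before it states this.

```python
# Plain-Python model of per-window state; NOT the Flink API.
# All class names here are hypothetical and used for illustration only.

class ReducingWindowState:
    """Models a reduce-style window: keeps a single running value."""

    def __init__(self, reduce_fn):
        self.reduce_fn = reduce_fn
        self.has_value = False
        self.value = None

    def add(self, element):
        if self.has_value:
            # Combine the stored value with the new element; the element
            # itself is not retained.
            self.value = self.reduce_fn(self.value, element)
        else:
            self.value = element
            self.has_value = True

    def stored_items(self):
        return 1 if self.has_value else 0


class AggregatingWindowState:
    """Models an aggregate-style window: keeps only an accumulator.

    The accumulator here is a (sum, count) pair used to compute an average.
    """

    def __init__(self):
        self.total = 0
        self.count = 0

    def add(self, element):
        self.total += element
        self.count += 1

    def result(self):
        return self.total / self.count

    def stored_items(self):
        # One accumulator, regardless of how many elements were added.
        return 1


class BufferingWindowState:
    """Models a window function without pre-aggregation: keeps every element."""

    def __init__(self):
        self.elements = []

    def add(self, element):
        self.elements.append(element)

    def stored_items(self):
        return len(self.elements)


# Feed the same made-up events into all three models.
events = [3, 1, 4, 1, 5, 9, 2, 6]

reducing = ReducingWindowState(max)
aggregating = AggregatingWindowState()
buffering = BufferingWindowState()

for e in events:
    reducing.add(e)
    aggregating.add(e)
    buffering.add(e)
```

With a window spanning weeks on a high-volume stream, the first two models stay at one stored item per window and key, while the buffering model grows with every event. That is why the choice of function matters more for state size than the window length itself.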

Re: Multiple output operations in a job vs multiple jobs

2018-08-02 Thread vino yang
Hi Paul, Yes, I am talking about the normal case, where Flink must store the window's data as state so that it can recover from failures. In some scenarios your understanding is also correct, and Flink uses window panes to optimize window calculations. So, if your scenario runs in that optimized mode, ignore this.

Re: Multiple output operations in a job vs multiple jobs

2018-07-31 Thread vino yang
Hi Anna,
1. The srcstream is a very high volume stream and the window size is 2 weeks and 4 weeks. Is the window size a problem? In this case, I think it is not a problem because I am using reduce, which stores only 1 value per window. Is that right?
*>> Window Size is based on your business