Re: Over Window Not Processing Messages

2018-06-29 Thread Gregory Fee
Thanks! I'm working on a way to deliver the data in order (or closer to in order) and deliver watermarks more often. I'll let you know my results. On Thu, Jun 28, 2018 at 5:36 AM, Fabian Hueske wrote: > In a nutshell the Over operator works as follows: > - When a row arrives it is put into a Map

Re: Over Window Not Processing Messages

2018-06-28 Thread Gregory Fee
I'm writing a custom S3 source in order to work around some issues with back pressure and checkpointing at scale in my bootstrap logic. I moved around the logic to assign timestamps and watermarks. As part of that I ended up generating watermarks earlier in the pipeline but having another operator

Re: Over Window Not Processing Messages

2018-06-28 Thread Fabian Hueske
In a nutshell the Over operator works as follows: - When a row arrives it is put into a MapState keyed on its timestamp and a timer is registered to process it when the watermark passes that timestamp. - All the heavy computation is done in the onTimer() method. For each unique timestamp, the Over

Re: Over Window Not Processing Messages

2018-06-27 Thread Hequn Cheng
Hi Gregory, What's the cause of your problem. It would be great if you can share your experience which I think will definitely help others. On Thu, Jun 28, 2018 at 11:30 AM, Gregory Fee wrote: > Yep, it was definitely a watermarking issue. I have that sorted out now. > Thanks! > > On Wed, Jun

Re: Over Window Not Processing Messages

2018-06-27 Thread Gregory Fee
Yep, it was definitely a watermarking issue. I have that sorted out now. Thanks! On Wed, Jun 27, 2018 at 6:49 PM, Hequn Cheng wrote: > Hi Gregory, > > As you are using the rowtime over window. It is probably a watermark > problem. The window only output when watermarks make a progress. You can >

Re: Over Window Not Processing Messages

2018-06-27 Thread Hequn Cheng
Hi Gregory, As you are using the rowtime over window. It is probably a watermark problem. The window only output when watermarks make a progress. You can use processing-time(instead of row-time) to verify the assumption. Also, make sure there are data in each of you source partition, the watermark

Re: Over Window Not Processing Messages

2018-06-27 Thread Gregory Fee
Thanks for your answers! Yes, it was based on watermarks. Fabian, the state does indeed grow quite a bit in my scenario. I've observed in the range of 5GB. That doesn't seem to be an issue in itself. However, in my scenario I'm loading a lot of data from a historic store that is only partitioned b

Re: Over Window Not Processing Messages

2018-06-27 Thread Fabian Hueske
Hi, The OVER window operator can only emit result when the watermark is advanced, due to SQL semantics which define that all records with the same timestamp need to be processed together. Can you check if the watermarks make sufficient progress? Btw. did you observe state size or IO issues? The O

Re: Over Window Not Processing Messages

2018-06-26 Thread Rong Rong
Hi Greg. Based on a quick test I cannot reproduce the issue, it is emitting messages correctly in the ITCase environment. can you share more information? Does the same problem happen if you use proctime? I am guessing this could be highly correlated with how you set your watermark strategy of your

Over Window Not Processing Messages

2018-06-26 Thread Gregory Fee
Hello User Community! I am running some streaming SQL that involves a union all into an over window similar to the below: SELECT user_id, count(action) OVER (PARTITION BY user_id ORDER BY rowtime RANGE INTERVAL '14' DAY PRECEDING) action_count, rowtime FROM (SELECT rowtime, user_id, thing as