Although things improved during bootstrapping and when even volume was
larger. As soon as the traffic slowed down the events are getting stuck
(buffered?) at the OVER operator for a very long time. Several hours.

On Fri, Aug 23, 2019 at 5:04 PM Vinod Mehra <vme...@lyft.com> wrote:

> (Forgot to mention that we are using Flink 1.4)
>
> Update: Earlier the OVER operator was assigned a parallelism of 64. I
> reduced it to 1 and the problem went away! Now the OVER operator is not
> filtering/buffering the events anymore.
>
> Can someone explain this please?
>
> Thanks,
> Vinod
>
> On Fri, Aug 23, 2019 at 3:09 PM Vinod Mehra <vme...@lyft.com> wrote:
>
>> We have a SQL based flink job which is consume a very low volume stream
>> (1 or 2 events in few hours):
>>
>>
>>
>>
>>
>>
>> *SELECT user_id,    COUNT(*) OVER (PARTITION BY user_id ORDER BY rowtime
>> RANGE INTERVAL '30' DAY PRECEDING) as count_30_days,
>> COALESCE(occurred_at, logged_at) AS latency_marker,    rowtimeFROM
>> event_fooWHERE user_id IS NOT NULL*
>>
>> The OVER operator seems to filter out events as per the flink dashboard
>> (records received = <non-zero-number> records sent = 0)
>>
>> The operator looks like this:
>>
>> *over: (PARTITION BY: $1, ORDER BY: rowtime, RANGEBETWEEN 2592000000
>> PRECEDING AND CURRENT ROW, select: (rowtime, $1, $2, COUNT(*) AS w0$o0)) ->
>> select: ($1 AS user_id, w0$o0 AS count_30_days, $2 AS latency_marker,
>> rowtime) -> to: Tuple2 -> Filter -> group_counter_count_30d.1.sqlRecords ->
>> sample_without_formatter*
>>
>> I know that the OVER operator can discard late arriving events, but these
>> events are not arriving late for sure. The watermark for all operators stay
>> at 0 because the output events is 0.
>>
>> We have an exactly same SQL job against a high volume stream that is
>> working fine. Watermarks progress in timely manner and events are delivered
>> in timely manner as well.
>>
>> Any idea what could be going wrong? Are the events getting buffered
>> waiting for certain number of events? If so, what is the threshold?
>>
>> Thanks,
>> Vinod
>>
>

Reply via email to