On Thu, 21 Feb 2019 at 14:00, Dawid Wysakowicz
wrote:
> If an event arrived at WindowOperator before the Watermark, then it will
> be accounted for window aggregation and put in state. Once that state gets
> checkpointed this same event won't be processed again. In other words if a
> checkpoint
If an event arrived at WindowOperator before the Watermark, then it will
be accounted for window aggregation and put in state. Once that state
gets checkpointed this same event won't be processed again. In other
words if a checkpoint succeeds elements that produced corresponding
state won't be
On Thu, 21 Feb 2019 at 13:36, Dawid Wysakowicz
wrote:
> It is definitely a solution ;)
>
> You should be aware of the downsides though:
>
>- you might get different results in case of reprocessing
>- you might drop some data as late, due to some delays in processing,
>if the events
It is definitely a solution ;)
You should be aware of the downsides though:
* you might get different results in case of reprocessing
* you might drop some data as late, due to some delays in processing,
if the events arrive later then the "ProcessingTime" threshold
Best,
Dawid
On
Yes, it was the "watermarks for event time when no events for that shard"
problem.
I am now investigating whether we can use a blended watermark of
max(lastEventTimestamp - 1min, System.currentTimeMillis() - 5min) to ensure
idle shards do not cause excessive data retention.
Is that the best
Hi Stephen,
Watermark for a single operator is the minimum of Watermarks received
from all inputs, therefore if one of your shards/operators does not have
incoming data it will not produce Watermarks thus the Watermark of
WindowOperator will not progress. So this is sort of an expected behavior.
Hi Stephen
If the window has not been triggered ever, maybe you could investigate the
watermark, maybe the doc[1][2] can be helpful.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/windows.html#interaction-of-watermarks-and-windows
[2]
>
>
> From: Stephen Connolly [mailto:stephen.alan.conno...@gmail.com
> <mailto:stephen.alan.conno...@gmail.com>]
> Sent: Tuesday, February 19, 2019 6:32 AM
> To: user mailto:user@flink.apache.org>>
> Subject: EXT :Re: How to debug difference between Kinesis
> function isn’t getting data you have to watch out for this.
>
>
>
> *From:* Stephen Connolly [mailto:stephen.alan.conno...@gmail.com]
> *Sent:* Tuesday, February 19, 2019 6:32 AM
> *To:* user
> *Subject:* EXT :Re: How to debug difference between Kinesis and Kafka
>
>
of a source function isn’t getting
data you have to watch out for this.
From: Stephen Connolly [mailto:stephen.alan.conno...@gmail.com]
Sent: Tuesday, February 19, 2019 6:32 AM
To: user
Subject: EXT :Re: How to debug difference between Kinesis and Kafka
Hmmm my suspicions are now quite high. I created
Hmmm my suspicions are now quite high. I created a file source that just
replays the events straight then I get more results
On Tue, 19 Feb 2019 at 11:50, Stephen Connolly <
stephen.alan.conno...@gmail.com> wrote:
> Hmmm after expanding the dataset such that there was additional data that
>
Hmmm after expanding the dataset such that there was additional data that
ended up on shard-0 (everything in my original dataset was coincidentally
landing on shard-1) I am now getting output... should I expect this kind of
behaviour if no data arrives at shard-0 ever?
On Tue, 19 Feb 2019 at
Hi, I’m having a strange situation and I would like to know where I should
start trying to debug.
I have set up a configurable swap in source, with three implementations:
1. A mock implementation
2. A Kafka consumer implementation
3. A Kinesis consumer implementation
>From injecting a log and
13 matches
Mail list logo