As per this link, it says that it only supports value_only for now as I am
using pyflink. Does it mean I can't extract the timestamp appended by Kafka
with pyflink as of now?
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/datastream/kafka/#deserializer
or does it mean
Hi Liu and Jinfeng,
I am trying to implement KafkaDeserializationSchema for Pyflink but am
unable to get any examples. Can you share some links or references using
which I can understand and try to implement myself?
Perez sid.
Cool thanks for the clarification.
Sid.
On Mon, Sep 11, 2023 at 9:22 AM liu ron wrote:
> Hi, Sid
>
> For the second question, I think it is not needed.
>
> Best,
> Ron
>
> Feng Jin 于2023年9月9日周六 21:19写道:
>
>> hi Sid
>>
>>
>> 1. You can customize KafkaDeserializationSchema[1], in the
Hi, Sid
For the second question, I think it is not needed.
Best,
Ron
Feng Jin 于2023年9月9日周六 21:19写道:
> hi Sid
>
>
> 1. You can customize KafkaDeserializationSchema[1], in the `deserialize`
> method, you can obtain the Kafka event time.
>
> 2. I don't think it's necessary to explicitly mention
hi Sid
1. You can customize KafkaDeserializationSchema[1], in the `deserialize`
method, you can obtain the Kafka event time.
2. I don't think it's necessary to explicitly mention the watermark
strategy.
[1].
Hello experts,
My source is Kafka and I am trying to generate records for which I have
FlinkKafkaConsumer class.
Now my first question is how to consume an event timestamp for the records
generated.
I know for a fact that for CLI, there is one property called
*print.timestamp=true* which gives
Watermarks are not included in checkpoints or savepoints.
See [1] for some head-scratchingly-complicated info about restarts,
watermarks, and unaligned checkpoints.
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#interplay
Hi Alexis
Currently I think checkpoint and savepoint will not save watermarks. I
think how to deal with watermarks at checkpoint/savepoint is a good
question, we can discuss this in dev mail list
Best,
Shammon FY
On Wed, Mar 15, 2023 at 4:22 PM Alexis Sarda-Espinosa <
sarda.espin...@gmail.
t;
>> Hi David, thanks for the answer. One follow-up question: will the
>> watermark be reset to Long.MIN_VALUE every time I restart a job with
>> savepoint?
>>
>> Am Di., 14. März 2023 um 05:08 Uhr schrieb David Anderson <
>> dander...@apache.org>:
>&g
pin...@gmail.com> wrote:
> Hi David, thanks for the answer. One follow-up question: will the
> watermark be reset to Long.MIN_VALUE every time I restart a job with
> savepoint?
>
> Am Di., 14. März 2023 um 05:08 Uhr schrieb David Anderson <
> dander...@apache.org>:
Hi David, thanks for the answer. One follow-up question: will the watermark
be reset to Long.MIN_VALUE every time I restart a job with savepoint?
Am Di., 14. März 2023 um 05:08 Uhr schrieb David Anderson <
dander...@apache.org>:
> Watermarks always follow the corresponding event(s). I'm
Watermarks always follow the corresponding event(s). I'm not sure why
they were designed that way, but that is how they are implemented.
Windows maintain this contract by emitting all of their results before
forwarding the watermark that triggered the results.
David
On Mon, Mar 13, 2023 at 5:28
Hi Alexis
Do you use both event-time watermark generator and TimerService for
processing time in your job? Maybe you can try using event-time watermark
first.
Best,
Shammon.FY
On Sat, Mar 11, 2023 at 7:47 AM Alexis Sarda-Espinosa <
sarda.espin...@gmail.com> wrote:
> Hello,
>
> I recently ran
Hello,
I recently ran into a weird issue with a streaming job in Flink 1.16.1. One
of my functions (KeyedProcessFunction) has been using processing time
timers. I now want to execute the same job based on a historical data dump,
so I had to adjust the logic to use event time timers in that case
DataStream time windows and Flink SQL make assumptions about the timestamps
and watermarks being milliseconds since the epoch. But the underlying
machinery does not. So if you limit yourself to process functions (for
example), then nothing will assign any semantics to the time values.
David
watermark is reached and that
required all sources to have reached at least that point in the recovery). Once
we have reached the startup datetime watermark the system seamlessly flips into
live processing mode. The watermarks still trigger my timers but now we are
processing the last ~1 minute
> concept of watermarks is more abstract, so I'll leave implementation
> details aside.
>
> Speaking generally, yes, there is a set of requirements that must be met
> in order to be able to generate a system that uses watermarks.
>
> The primary question is what are watermarks used
Hi,
I will not speak about details related to Flink specifically, the
concept of watermarks is more abstract, so I'll leave implementation
details aside.
Speaking generally, yes, there is a set of requirements that must be met
in order to be able to generate a system that uses watermarks
Hey everyone,
I'm wondering if anyone has done any experiments trying to use non-temporal
watermarks? For example, a dataset may contain some kind of virtual
timestamp / version field that behaves just like a regular timestamp
(monotonically increasing, etc.), but has a different scale / range
) number of DataStream
keyed/broadcast/plain and also to tap into
the meta-stream of watermark events. Each Input is set up separately and can
implement separate handlers for the events/watermarks/etc.
However, it is an operator implementation, you e.g. need to manually set up
timer manager
Hi,
I am trying to process watermarks in a BroadcastConnectedStream. However, I am
not able to find any direct way to handle watermark events, similar to what we
have in processWatermark1 in a KeyedCoProcessOperator. Following are further
details.
In the context of the example given
be interested (since obviously it
would help my thesis). Maybe you can tell how much effort it would be? I
imagine it would need support in the place where the watermarks are registered
(the one I sent) and in the place they are actually used (which I have not
checked yet at all).
-Theo
> On
Hi Theo,
The most logical reason is that nested attributes were added later than
watermarks were :) I agree that it's something that would be worthwhile to
improve. If you can and want to make a contribution on this, that would be
great.
Best regards,
Martijn
On Wed, Dec 14, 2022 at 9:24 AM
Actually, this behaviour is documented
<https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-table>
(See the Watermarks section, where it is stated that the column must be a
“top-level” column). So I suppose, there is a reason. Nevertheless it is
Hey everyone,
I have encountered a problem with my Table API Program. I am trying to use a
nested attribute as a watermark. The structure of my schema is a row, which
itself has 3 rows as attributes and they again have some attributes, especially
the Timestamp that I want to use as a
Hi David,
Thanks a lot for clarification.
Best, Peter
> On 21. Aug 2022, at 18:36, David Anderson wrote:
>
> If you have two watermark strategies in your job, the downstream
> TimestampsAndWatermarksOperator will absorb incoming watermarks and not
> forward
If you have two watermark strategies in your job, the
downstream TimestampsAndWatermarksOperator will absorb incoming watermarks
and not forward them downstream, but it will have no effect upstream.
The only exception to this is that watermarks equal to Long.MAX_VALUE are
forwarded downstream
Hi there,
While still struggling with events and watermarks out of order after sorting
with a buffer process function (compare [1]) I tired to solve the issue by
assigning a new watermark after the mentioned sorting function.
The Flink docs [2] are not very certain about the impact
son of the subsequent event times, the event time of
the current event is compared with the current watermark
(context.timerService().currentWatermark()). Result: Watermarks and event
timestamps are NOT ascending. On some events the event timestamp is lower than
the current watermark.
I somehow susp
gt; .withTimestampAssigner(new
>>>> SerializableTimestampAssigner[StarscreamEventCounter_V1] {
>>>> override def extractTimestamp(element: StarscreamEventCounter_V1,
>>>> recordTimestamp: Long): Long =
>>>> element.envelopeTimestamp
>>>&g
f(2, ChronoUnit.HOURS))
>>> .withTimestampAssigner(new
>>> SerializableTimestampAssigner[StarscreamEventCounter_V1] {
>>> override def extractTimestamp(element: StarscreamEventCounter_V1,
>>> recordTimestamp: Long): Long =
>>> element.env
.withTimestampAssigner(new
>> SerializableTimestampAssigner[StarscreamEventCounter_V1] {
>> override def extractTimestamp(element: StarscreamEventCounter_V1,
>> recordTimestamp: Long): Long =
>> element.envelopeTimestamp
>> })
>>
>> The Watermarks are correct
f(2, ChronoUnit.HOURS))
> .withTimestampAssigner(new
> SerializableTimestampAssigner[StarscreamEventCounter_V1] {
> override def extractTimestamp(element: StarscreamEventCounter_V1,
> recordTimestamp: Long): Long =
> element.envelopeTimestamp
> })
>
> The Watermarks ar
extractTimestamp(element: StarscreamEventCounter_V1,
recordTimestamp: Long): Long =
element.envelopeTimestamp
})
The Watermarks are correctly getting assigned.
However, when a reduce function is used the window never terminates
because the `ctx.getCurrentWatermark()` returns the default
);
out.collect(((ObjectNode) originalEvent).toString());
}
}
}
Op di 29 mrt. 2022 om 15:23 schreef Schwalbe Matthias <
matthias.schwa...@viseca.ch>:
> Hello Hans-Peter,
>
>
>
> I’m a little confused which version of your code you are tes
Hello Hans-Peter,
I’m a little confused which version of your code you are testing against:
* ProcessingTimeSessionWindows or EventTimeSessionWindows?
* did you keep the withIdleness() ??
As said before:
* for ProcessingTimeSessionWindows, watermarks play no role
* if you keep
ater on
>2. So far you used a session window to determine the point in time
>when to emit the stored/enriched/sorted events
>3. Watermarks are generated with bounded out of orderness
>4. You use session windows with a specific gap
>5. In your experiment you ever onl
Hi Dawid,
Thanks for the update, I also managed to work around it by adding another
watermark assignment operator between the join and the window. I’ll have to see
if it’s possible to assign watermarks at the source, but even if it is, I worry
that the different partitions created by all my
Hi Alexis,
I tried looking into your example. First of all, so far, I've spent only
a limited time looking at the WatermarkGenerator, and I have not
thoroughly understood how it works. I'd discourage assigning watermarks
anywhere in the middle of your pipeline. This is considered
ing of
the watermarks on single operators / per subtask useful:
Look for subtasks that don’t have watermarks, or too low watermarks for a
specific session window to trigger.
Thias
From: HG
Sent: Mittwoch, 16. März 2022 16:41
To: Schwalbe Matthias
Cc: user
Subject: Re: Watermarks event time
ific enough (please confirm or correct in your answer):
>
>1. You store incoming events in state per transaction_id to be
>sorted/aggregated(min/max time) by event time later on
>2. So far you used a session window to determine the point in time
>when to emit the stored/en
/aggregated(min/max time) by event time later on
2. So far you used a session window to determine the point in time when to
emit the stored/enriched/sorted events
3. Watermarks are generated with bounded out of orderness
4. You use session windows with a specific gap
5. In your experiment you
but somehow the processing did not
run as expected.
When I pushed 1000 events eventually 800 or so would appear at the output.
This was resolved by switching to ProcessingTimeSessionWindows .
My thought was then that I could remove the watermarkstrategies with
watermarks with timestamp assigners (handling
For completeness, this still happens with Flink 1.14.4
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Friday, March 11, 2022 12:21 AM
To: user@flink.apache.org
Cc: pnowoj...@apache.org
Subject: Re: Interval join operator is not forwarding watermarks
] https://github.com/asardaes/flink-interval-join-test
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, March 10, 2022 7:47 PM
To: user@flink.apache.org
Cc: pnowoj...@apache.org
Subject: RE: Interval join operator is not forwarding watermarks
: Interval join operator is not forwarding watermarks correctly
Hello,
I'm in the process of updating from Flink 1.11.3 to 1.14.3, and it seems the
interval join in my pipeline is no longer working. More specifically, I have a
sliding window after the interval join, and the window isn't firing. After
.connect(stream2)
.keyBy(selector1, selector2)
.transform("Interval Join", TypeInformation.of(Pojo.class), joinOperator);
---
Some more information in case it's relevant:
- stream2 is obtained from a side output.
- both stream1 and stream2 have watermarks assigned by custom str
en you should set
>> an idleness which after that, a watermark is produced.
>>
>> Idleness is
>>
>> On Fri, Feb 11, 2022 at 2:53 PM HG wrote:
>>
>>> Hi,
>>>
>>> I am getting a headache when thinking about watermarks and timestamp
couple of days nothing is produced), then you should set an
> idleness which after that, a watermark is produced.
>
> Idleness is
>
> On Fri, Feb 11, 2022 at 2:53 PM HG wrote:
>
>> Hi,
>>
>> I am getting a headache when thinking about watermarks and tim
uple of days nothing is produced), then you should set an
idleness which after that, a watermark is produced.
Idleness is
On Fri, Feb 11, 2022 at 2:53 PM HG wrote:
> Hi,
>
> I am getting a headache when thinking about watermarks and timestamps.
> My application reads events from
Hi,
I am getting a headache when thinking about watermarks and timestamps.
My application reads events from Kafka (they are in json format) as a
Datastream
Events can be keyed by a transactionId and have a event timestamp
(handlingTime)
All events belonging to a single transactionId will arrive
processFunction will just emit watermarks from upstream as they come. No
function/operator in Flink is a black hole w.r.t. watermarks. It's just
important to remember that watermark after a network shuffle is always the
min of all inputs (ignoring idle inputs). So if any connected upstream part
Thanks again Arvid,
I am getting closer to the culprit as I've found some interesting
scenarios. Still no exact answer yet. We are indeed also using
.withIdleness to mitigate slow/issues with partitions.
I did have a few clarifying questions though w.r.t watermarks if you don't
mind.
*Watermark
upstream.
A usual suspect when not seeing good watermarks is that the custom
watermark assigner is not working as expected. But you mentioned that with
a no-op, the process function is actually showing the watermark and that
leaves me completely puzzled.
I would dump down your example even more
progress.
So now:
Kafka -> Flink Kafka source -> flatMap (map & filter) ->
assignTimestampsAndWaterMarks -> map Function -> *process function (print
watermarks) *-> Key By -> Keyed Process Function -> *process function
(print watermarks)* -> Simple Sink
I am seeing similar
you please simply remove AsyncIO+Sink from your job and
check for print statements?
On Tue, Oct 12, 2021 at 3:23 AM Ahmad Alkilani wrote:
> Flink 1.11
> I have a simple Flink application that reads from Kafka, uses event
> timestamps, assigns timestamps and watermarks and then key'
Flink 1.11
I have a simple Flink application that reads from Kafka, uses event
timestamps, assigns timestamps and watermarks and then key's by a field and
uses a KeyedProcessFunciton.
The keyed process function outputs events from with the `processElement`
method using `out.collect`. No timers
r Nowojski
> *Sent:* 08 October 2021 13:17
> *To:* James Sandys-Lumsdaine
> *Cc:* user@flink.apache.org
> *Subject:* Re: Empty Kafka topics and watermarks
>
> Hi James,
>
> I believe you have encountered a bug that we have already fixed [1]. The
> small problem is t
Kafka topics and watermarks
Hi James,
I believe you have encountered a bug that we have already fixed [1]. The small
problem is that in order to fix this bug, we had to change some
`@PublicEvolving` interfaces and thus we were not able to backport this fix to
earlier minor releases
Hi James,
I believe you have encountered a bug that we have already fixed [1]. The
small problem is that in order to fix this bug, we had to change some
`@PublicEvolving` interfaces and thus we were not able to backport this fix
to earlier minor releases. As such, this is only fixed starting from
Hi everyone,
I'm putting together a Flink workflow that needs to merge historic data from a
custom JDBC source with a Kafka flow (for the realtime data). I have
successfully written the custom JDBC source that emits a watermark for the last
event time after all the DB data has been emitted but
ned today.
>> When we tried to shutdown a job with a savepoint, the watermarks became
>> equal to 2^63 - 1.
>>
>> This caused timers to fire indefinitely and crash downstream systems with
>> overloaded untrue data.
>>
>> We are using event time processing with Kafka
eak the infinite iteration over timers …
I believe the behavior exhibited by flink is intentional and no defect!
What do you think?
Thias
From: JING ZHANG
Sent: Freitag, 24. September 2021 12:25
To: Guowei Ma
Cc: Marco Villalobos ; user
Subject: Re: stream processing savepoints and watermarks quest
> could only be triggered when there is a watermark (except the "quiesce
> phase").
> I think it could not advance any watermarks after MAX_WATERMARK is
> received.
>
> Best,
> Guowei
>
>
> On Fri, Sep 24, 2021 at 4:31 PM JING ZHANG wrote:
>
>> Hi Guowei,
&
Hi, JING
Thanks for the case.
But I am not sure this would happen. As far as I know the event timer could
only be triggered when there is a watermark (except the "quiesce phase").
I think it could not advance any watermarks after MAX_WATERMARK is received.
Best,
Guowei
On Fri, Se
ps://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint
>>
>> Best,
>> JING ZHANG
>>
>> Marco Villalobos 于2021年9月24日周五 下午12:54写道:
>>
>>> Something strange happened today.
>>> W
ned today.
>> When we tried to shutdown a job with a savepoint, the watermarks became
>> equal to 2^63 - 1.
>>
>> This caused timers to fire indefinitely and crash downstream systems with
>> overloaded untrue data.
>>
>> We are using event time processing with Kafka as o
-a-final-savepoint
Best,
JING ZHANG
Marco Villalobos 于2021年9月24日周五 下午12:54写道:
> Something strange happened today.
> When we tried to shutdown a job with a savepoint, the watermarks became
> equal to 2^63 - 1.
>
> This caused timers to fire indefinitely and crash downstream systems wi
Something strange happened today.
When we tried to shutdown a job with a savepoint, the watermarks became
equal to 2^63 - 1.
This caused timers to fire indefinitely and crash downstream systems with
overloaded untrue data.
We are using event time processing with Kafka as our source.
It seems
I think what you are seeing is that the files have records with similar
timestamps. That means after reading file1 your watermarks are already
progressed to the end of your time range. When Flink picks up file2, all
records are considered late records and no windows fire anymore.
See [1
;
> Using FileProcessingMode.*PROCESS_CONTINUOUSLY*
>
> Into a streaming job that uses tumbling Windows and watermarks causes my
> streaming process to stop ad the reading files phase.
>
> Meanwhile if i delete my declarations of Windows and watermark the program
> works as expected.
&g
Hello, during my personal development of a Flink streaming Platform i found something that perplexes me.Using FileProcessingMode.PROCESS_CONTINUOUSLYInto a streaming job that uses tumbling Windows and watermarks causes my streaming process to stop ad the reading files phase.Meanwhile if i delete
Thanks for your example, Maciej
I can explain more about the design.
> Let's have events.
> S1, id1, v1, 1
> S2, id1, v2, 1
>
> Nothing is happening as none of the streams have reached the watermark.
> Now let's add
> S2, id2, v2, 101
> This should trigger join for id1 because we have all the
Hi Leonard,
Let's assume we have two streams.
S1 - id, value1, ts1 with watermark = ts1 - 1
S2 - id, value2, ts2 with watermark = ts2 - 1
Then we have following interval join
SELECT id, value1, value2, ts1 FROM S1 JOIN S2 ON S1.id = S2.id and
ts1 between ts2 - 1 and ts2
Let's have events.
Hello, Maciej
> I agree the watermark should pass on versioned table side, because
> this is the only way to know which version of record should be used.
> But if we mimics behaviour of interval join then main stream watermark
> could be skipped.
IIRC, rowtime interval join requires the watermark
sage is late or early. If we only
> use the watermark on versioned table side, we have no means to determine
> whether the event in the main stream is ready to emit.
>
> Best,
> Shengkai
>
> maverick 于2021年4月26日周一 上午2:31写道:
>>
>> Hi,
>> I'm curious why Event
us why Event Time Temporal Join needs watermarks from both sides
> to
> perform join.
>
> Shouldn't watermark on versioned table side be enough to perform join ?
>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>
Hi,
I'm curious why Event Time Temporal Join needs watermarks from both sides to
perform join.
Shouldn't watermark on versioned table side be enough to perform join ?
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
ssors using Flink 1.12, and tried to get them working on Amazon
> EMR. However Amazon EMR only supports Flink 1.11.2 at the moment. When I
> went to downgrade, I found, inexplicably, that watermarks were no longer
> propagating.
>
> There is only one partition on the topic, and p
Hi Mathieu,
The easiest way is to already emit several inputs on the source level. If
you use DeserializationSchema, try to use the method with the collector.
The watermarks should then be generated as if you would only receive one
element at a time.
On Sun, Apr 18, 2021 at 11:08 AM Mathieu D
aggregation
m2 in the 9:00-10:00 aggregation
What's the proper way to set the watermarks in such a case ?
Thanks for your insights !
Mathieu
Le sam. 17 avr. 2021 à 07:05, Lasse Nedergaard <
lassenedergaardfl...@gmail.com> a écrit :
> Hi
>
> One thing to remember is that Flinks wate
Hello,
I'm totally new to Flink, and I'd like to make sure I understand things
properly around watermarks.
We're processing messages from iot devices.
Those messages have a timestamp, and we have a first phase of processing
based on this timestamp. So far so good.
These messages actually "
Hi everyone,
I'm seeing some strange behavior from FlinkKafkaConsumer. I wrote up some
Flink processors using Flink 1.12, and tried to get them working on Amazon
EMR. However Amazon EMR only supports Flink 1.11.2 at the moment. When I
went to downgrade, I found, inexplicably, that watermarks were
Thank you.
Yuan Mei 于2021年3月4日周四 下午11:10写道:
> Hey Yidan,
>
> KafkaShuffle is initially motivated to support shuffle data
> materialization on Kafka, and started with a limited version supporting
> hash-partition only. Watermark is maintained and forwarded as part of
> shuffle data. So you are
Hey Yidan,
KafkaShuffle is initially motivated to support shuffle data materialization
on Kafka, and started with a limited version supporting hash-partition
only. Watermark is maintained and forwarded as part of shuffle data. So you
are right, watermark storing/forwarding logic has nothing to do
And do you know when kafka consumer/producer will be re implemented
according to the new source/sink api? I am thinking whether I should adjust
the code for now, since I need to re adjust the code when it is
reconstructed to the new source/sink api.
yidan zhao 于2021年3月4日周四 下午4:44写道:
> I
I uploaded a picture to describe that.
https://ftp.bmp.ovh/imgs/2021/03/2068f2e22045e696.png
>
One more question, If I only need watermark's logic, not keyedStream, why
not provide methods such as writeDataStream and readDataStream. It uses the
similar methods for kafka producer sink records and broadcast watermark to
partitions and then kafka consumers read it and regenerate the watermark.
Great :)
Just one more note. Currently FlinkKafkaShuffle has a critical bug [1] that
probably will prevent you from using it directly. I hope it will be fixed
in some next release. In the meantime you can just inspire your solution
with the source code.
Best,
Piotrek
[1]
Yes, you are right and thank you. I take a brief look at what
FlinkKafkaShuffle is doing, it seems what I need and I will have a try.
>
Hi,
Can not you write the watermark as a special event to the "mid-topic"? In
the "new job2" you would parse this event and use it to assign watermark
before `xxxWindow2`? I believe this is what FlinkKafkaShuffle is doing [1],
you could look at its code for inspiration.
Piotrek
[1]
I have a job which includes about 50+ tasks. I want to split it to multiple
jobs, and the data is transferred through Kafka, but how about watermark?
Is anyone have do something similar and solved this problem?
Here I give an example:
The original job: kafkaStream1(src-topic) => xxxProcess =>
Basically the only thing that Watermarks do is to trigger event time
timers. Event time timers are used explicitly in KeyedProcessFunctions, but
are also used internally by time windows, CEP (to sort the event stream),
in various time-based join operations, and within the Table/SQL API.
If you
> it is not clear to me if watermarks are also used by map/flatmpat
operators or just by window operators.
Watermarks are most liked only used by timing segmented aggregation
operator to trigger result materialization. In streaming, this “timing
segmentation” is usually called “windowing”,
Hi,
reading through the documentation regarding waterrmarks, it is not clear to me
if watermarks are also used by map/flatmpat operators or just by window
operators.
My application reads from a kafka topic (with multiple partitions) and extracts
assigns timestamp on each tuple based on some
With an AscendingTimestampExtractor, watermarks are not created for every
event, and as your job starts up, some events will be processed before the
first watermark is generated.
The impossible value you see is an initial value that's in place until the
first real watermark is available
My source is a Kafka topic.
I am using Event Time.
I assign the event time with an AscendingTimestampExtractor
I noticed when debugging that in the KeyedProcessFunction that
after my highest known event time of: 2020-06-23T00:46:30.000Z
the processElement method had a watermark with an
ling window operator has no
> effect on the watermarks of the resulting append stream - the watermarks of
> the input stream are propagated as-is.
>
> This seems to be a documented behavior
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#interaction
ling window operator has no
> effect on the watermarks of the resulting append stream - the watermarks of
> the input stream are propagated as-is.
>
> This seems to be a documented behavior
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#interaction-of-wate
that the tumbling window operator has no
effect on the watermarks of the resulting append stream - the watermarks of
the input stream are propagated as-is.
This seems to be a documented behavior
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#interaction
1 - 100 of 284 matches
Mail list logo