[ https://issues.apache.org/jira/browse/BEAM-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ajo Thomas updated BEAM-7240:
-----------------------------
Description:
Currently, watermarks in Kinesis IO are computed from the record arrival time
in a {{KinesisRecord}}. The arrival time is not always the right representation
of the event time, so the user of the IO should be able to specify how the
event time is extracted from the {{KinesisRecord}}.
As per the current logic, the end user of the IO cannot control watermark
computation in any way. A user should be able to control it through custom
heuristics or configurable parameters, such as a time duration after which the
watermark advances if no data was received. A lack of data could be due to a
stalled shard, and the watermark should advance rather than stall in that case.
was: Currently, watermarks in Kinesis IO are computed taking into account the
record arrival time in a {{KinesisRecord}}. The arrival time might not always
be the right representation of the event time. Also, as per the current logic,
the end user of the IO cannot control watermark computation in any way.
Summary: Kinesis IO Watermark Computation Improvements (was: Kinesis
IO Watermark Improvements)
> Kinesis IO Watermark Computation Improvements
> ---------------------------------------------
>
> Key: BEAM-7240
> URL: https://issues.apache.org/jira/browse/BEAM-7240
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kinesis
> Reporter: Ajo Thomas
> Priority: Minor
>
> Currently, watermarks in Kinesis IO are computed from the record arrival
> time in a {{KinesisRecord}}. The arrival time is not always the right
> representation of the event time, so the user of the IO should be able to
> specify how the event time is extracted from the {{KinesisRecord}}.
> As per the current logic, the end user of the IO cannot control watermark
> computation in any way. A user should be able to control it through custom
> heuristics or configurable parameters, such as a time duration after which
> the watermark advances if no data was received. A lack of data could be due
> to a stalled shard, and the watermark should advance rather than stall in
> that case.
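
To make the two requested knobs concrete, here is a minimal, self-contained
sketch in plain Java. It is not the Kinesis IO API: SketchRecord,
SketchWatermarkPolicy, eventTimeFn and idleThreshold are hypothetical names
used only for illustration. It assumes the watermark is tracked per shard,
follows the largest user-extracted event time seen so far, and is moved to
"now minus idleThreshold" once the shard has been silent for longer than
idleThreshold.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Function;

/** Hypothetical stand-in for a Kinesis record; NOT the real KinesisRecord class. */
final class SketchRecord {
  final Instant arrivalTime;      // when the stream received the record
  final Instant payloadEventTime; // event time carried by the payload (assumed field)

  SketchRecord(Instant arrivalTime, Instant payloadEventTime) {
    this.arrivalTime = arrivalTime;
    this.payloadEventTime = payloadEventTime;
  }
}

/** Hypothetical per-shard watermark tracker illustrating the two requested knobs. */
final class SketchWatermarkPolicy {
  private final Function<SketchRecord, Instant> eventTimeFn; // user-chosen event time extractor
  private final Duration idleThreshold;                      // silence allowed before advancing
  private Instant maxEventTime = Instant.MIN;                // largest event time seen so far
  private Instant watermark = Instant.MIN;                   // last reported watermark
  private Instant lastRecordSeenAt;                          // processing time of the last record

  SketchWatermarkPolicy(
      Function<SketchRecord, Instant> eventTimeFn, Duration idleThreshold, Instant startedAt) {
    this.eventTimeFn = eventTimeFn;
    this.idleThreshold = idleThreshold;
    this.lastRecordSeenAt = startedAt;
  }

  /** Call once per record read from the shard. */
  void update(SketchRecord record, Instant now) {
    Instant eventTime = eventTimeFn.apply(record);
    if (eventTime.isAfter(maxEventTime)) {
      maxEventTime = eventTime;
    }
    lastRecordSeenAt = now;
  }

  /** Returns a watermark that never moves backwards. */
  Instant getWatermark(Instant now) {
    Instant candidate = maxEventTime;
    Instant idleFloor = now.minus(idleThreshold);
    // Shard silent for longer than idleThreshold: advance instead of stalling.
    if (lastRecordSeenAt.isBefore(idleFloor) && idleFloor.isAfter(candidate)) {
      candidate = idleFloor;
    }
    if (candidate.isAfter(watermark)) {
      watermark = candidate;
    }
    return watermark;
  }
}
```

Passing r -> r.arrivalTime as eventTimeFn would correspond to the arrival-time
behaviour described above, while r -> r.payloadEventTime switches to an event
time taken from the payload. The idle rule used here is only one possible
heuristic; a real policy would also have to decide how late records that
arrive after an idle advance are handled.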