[ 
https://issues.apache.org/jira/browse/BEAM-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-2467:
----------------------------------

    Assignee: Paweł Kaczmarczyk

> KinesisIO watermark based on approximateArrivalTimestamp
> --------------------------------------------------------
>
>                 Key: BEAM-2467
>                 URL: https://issues.apache.org/jira/browse/BEAM-2467
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Paweł Kaczmarczyk
>            Assignee: Paweł Kaczmarczyk
>
> In Kinesis, reading can start from any point in the past within the stream's 
> retention period (up to 7 days). The current approach sets both the record 
> timestamp and the watermark to the current time (i.e. Instant.now()), so the 
> actual position in the stream cannot be observed.
> The idea is to change this behaviour and set the record timestamp based on 
> the 
> [ApproximateArrivalTimestamp|http://docs.aws.amazon.com/kinesis/latest/APIReference/API_Record.html#Streams-Type-Record-ApproximateArrivalTimestamp].
> The watermark will then be set according to the timestamp of the last record 
> read. ApproximateArrivalTimestamp is only an approximation, so records may 
> carry out-of-order timestamps, which in turn may cause some events to be 
> marked as late. This should not happen often, however, and when it does the 
> skew should be a matter of milliseconds or seconds, so it can be handled 
> even with a tiny allowedLateness setting.
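
The proposed behaviour can be illustrated with a small sketch. It is not the KinesisIO implementation: the class `ArrivalTimeWatermark` and its methods are hypothetical names, `java.time.Instant` is used only to keep the example free of dependencies, and keeping the watermark from moving backwards is an added assumption (the description only says the watermark follows the last read record). The lateness check is a simplified stand-in for Beam's window-based rule.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper illustrating the idea in this issue; not part of KinesisIO. */
final class ArrivalTimeWatermark {
    private Instant watermark;

    ArrivalTimeWatermark(Instant initialWatermark) {
        this.watermark = initialWatermark;
    }

    /**
     * Called for each record read. The record's event timestamp is its
     * approximate arrival timestamp rather than the current wall-clock time.
     */
    Instant onRecord(Instant approximateArrivalTimestamp) {
        // Assumption: only advance the watermark, never move it backwards,
        // so a slightly out-of-order record does not rewind it.
        if (approximateArrivalTimestamp.isAfter(watermark)) {
            watermark = approximateArrivalTimestamp;
        }
        return approximateArrivalTimestamp;
    }

    Instant getWatermark() {
        return watermark;
    }

    /**
     * Simplified lateness check: true when the record is behind the watermark
     * by more than the allowed lateness.
     */
    boolean isLateBeyond(Instant recordTimestamp, Duration allowedLateness) {
        return recordTimestamp.plus(allowedLateness).isBefore(watermark);
    }
}
```

In this sketch, a record that arrives two seconds behind the watermark is tolerated with an allowed lateness of ten seconds but not with one second, which matches the expectation that a small allowedLateness covers the skew.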



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
