je-ik commented on issue #25975:
URL: https://github.com/apache/beam/issues/25975#issuecomment-1776656656

   I finally had time to look into the code. KinesisIO looks good in terms 
of default watermarking. What is probably causing the issue is [this line][1] 
in your test application. A processing-time watermark policy will advance the 
watermark significantly on restore, so records written while the pipeline was 
down can end up behind the watermark and be dropped as late. It is actually 
strange (and possibly the result of some other coincidence) that you don't see 
data loss on *every restart* of the pipeline. I think the issue will disappear 
if you use the default watermarking policy and set allowed lateness, as 
[this comment][2] suggests.
   
    [1]: https://github.com/psolomin/beam-playground/blob/e0f1834d6ee76a796d6d21836e1d04140e6d9ca2/kinesis-io-with-enhanced-fan-out/src/main/java/com/psolomin/consumer/KinesisToFilePipeline.java#L52
    [2]: https://github.com/apache/beam/blob/bf5ded44e6ee7e9e752f44379df43a3aa453fc7e/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/KinesisReader.java#L127
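
To illustrate the reasoning, here is a rough toy model in plain Java. It is not Beam or KinesisIO code: the class name, the method names, the timestamps, and the simplified per-record lateness check (Beam actually decides lateness per window) are all assumptions made up for this sketch. It compares a watermark that jumps to wall-clock time on restore with one that trails the record timestamps, for a backlog written while the pipeline was down.

```java
import java.util.List;

/** Toy model (hypothetical, not Beam code) of late-data dropping after a restore. */
final class WatermarkRestoreSketch {

    /** Simplified rule: a record is late once the watermark passes its timestamp plus allowed lateness. */
    static boolean isDroppedAsLate(long recordTs, long watermark, long allowedLateness) {
        return recordTs + allowedLateness < watermark;
    }

    /** Processing-time style: the watermark is the wall clock at restore, regardless of the backlog. */
    static int droppedWithProcessingTimeWatermark(
            List<Long> backlog, long restoreTime, long allowedLateness) {
        int dropped = 0;
        for (long ts : backlog) {
            if (isDroppedAsLate(ts, restoreTime, allowedLateness)) {
                dropped++;
            }
        }
        return dropped;
    }

    /** Arrival-time style: the watermark trails the largest record timestamp read so far. */
    static int droppedWithArrivalTimeWatermark(List<Long> backlog, long allowedLateness) {
        int dropped = 0;
        long watermark = Long.MIN_VALUE;
        for (long ts : backlog) {
            if (isDroppedAsLate(ts, watermark, allowedLateness)) {
                dropped++;
            }
            watermark = Math.max(watermark, ts);
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Made-up timestamps (seconds) of records written during the downtime,
        // slightly out of order; the pipeline is restored at t = 400.
        List<Long> backlog = List.of(100L, 160L, 130L, 220L, 280L, 340L);
        long restoreTime = 400L;

        System.out.println("processing-time, lateness 0:  "
                + droppedWithProcessingTimeWatermark(backlog, restoreTime, 0L));
        System.out.println("arrival-time,    lateness 0:  "
                + droppedWithArrivalTimeWatermark(backlog, 0L));
        System.out.println("arrival-time,    lateness 60: "
                + droppedWithArrivalTimeWatermark(backlog, 60L));
    }
}
```

In this toy setup the processing-time variant drops the whole backlog (6 records), the arrival-time variant with zero allowed lateness drops only the single out-of-order record, and adding 60 seconds of allowed lateness drops none. That matches the suggestion above only in spirit; real pipelines differ in how windows and per-shard watermarks interact.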


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

