Why not just to create a custom watermark policy for that? Or you mean to make 
it as a default policy?

—
Alexey

> On 27 Oct 2023, at 10:25, Jan Lukavský <je...@seznam.cz> wrote:
> 
> 
> Hi, 
> 
> when discussing about [1] we found out, that the issue is actually caused by 
> processing time watermarks in KinesisIO. Enabling this watermark outputs 
> watermarks based on current processing time, _but event timestamps are 
> derived from ingestion timestamp_. This can cause unbounded lateness when 
> processing backlog. I think this setup is error-prone and will likely cause 
> data loss due to dropped elements. This can be solved in two ways: 
> 
>  a) deprecate processing time watermarks, or 
> 
>  b) modify KinesisIO's watermark policy so that is assigns event timestamps 
> as well (the processing-time watermark policy would have to derive event 
> timestamps from processing-time). 
> 
> I'd prefer option b) , but it might be a breaking change, moreover I'm not 
> sure if I understand the purpose of processing-time watermark policy, it 
> might be essentially ill defined from the beginning, thus it might really be 
> better to remove it completely. There is also a related issue [2]. 
> 
> Any thoughts on this? 
> 
>  Jan 
> 
> [1] https://github.com/apache/beam/issues/25975 
> 
> [2] https://github.com/apache/beam/issues/28760 
> 

Reply via email to