Can you explain more about your custom watermark policy?

On Sun, Aug 24, 2025 at 11:43 AM gaurav mishra <[email protected]>
wrote:

> Hi All,
> We have a Dataflow job which is consuming from hundreds of Kafka topics.
> This job has time-based window aggregations. From time to time we see the
> job dropping data due to lateness at pipeline start; once the job reaches a
> steady state, there is no data loss. We are using a custom timestamp policy
> which moves the watermark based on the message's header/payload to stay
> in the event-time domain.
> For the data loss, the following is my theory. I need your input on whether
> this is a valid possibility and, if so, whether there are any solutions to
> this problem -
>
> Say there are n topics - t1, t2, ..., t_withlag, ..., tn - which are being
> consumed from.
> All topics have almost zero backlog except for one, t_withlag.
> The backlog can be assumed to be substantial enough that it goes beyond
> the allowed lateness.
>
> Job starts -
> 1) Bundle 1 starts getting created, and let's say the bundle was filled
> only from the first few topics, so the runner could not reach the laggy
> topic.
> 2) The watermark policy is invoked for all the topic partitions which
> participated in the above bundle.
> 3) The runner computed the global watermark based on the above input.
> 4) Since the laggy topic did not contribute to the watermark calculation,
> the global watermark was set to ~now().
> 5) The next bundle starts processing, and in this bundle the runner
> started encountering messages from t_withlag, and hence started dropping
> data from those partitions until it caught up.
>
> Is this scenario possible? If yes, what are possible solutions to such a
> situation?
>
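If I follow the theory above, it comes down to the global watermark being the minimum over only the per-partition watermarks the runner has observed so far. Here is a minimal Python sketch of that model (the partition names, timestamps, and allowed lateness are made up, and this deliberately simplifies what the runner actually does):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-partition watermarks after the first bundle.
# t_withlag-0 is absent: the runner never reached that partition.
now = datetime(2025, 8, 24, 11, 0, tzinfo=timezone.utc)
first_bundle = {
    "t1-0": now,
    "t2-0": now - timedelta(seconds=5),
}

def global_watermark(partition_watermarks):
    """Simplified model: the global watermark is the minimum of the
    per-partition watermarks observed so far."""
    return min(partition_watermarks.values())

# Step 4 of the theory: with the laggy topic missing, the global
# watermark jumps to ~now.
wm1 = global_watermark(first_bundle)

# Step 5: the second bundle finally reads t_withlag, whose elements
# carry much older event timestamps.
element_ts = now - timedelta(hours=6)
allowed_lateness = timedelta(hours=1)

# An element is droppably late if it is behind the already-advanced
# watermark by more than the allowed lateness.
is_droppably_late = element_ts < wm1 - allowed_lateness
print(is_droppably_late)  # True in this scenario
```

Under this model the scenario is reproducible: because t_withlag contributed nothing to the first computation, the minimum is taken over only the caught-up partitions, and everything the laggy partition later emits looks late.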
