RE: Unbounded sources unable to recover from checkpointMark when withMaxReadTime() is used

2020-07-27 Thread Sunny, Mani Kolbe
Thank you Max. Is there reference code somewhere for implementing checkpointing for BoundedReadFromUnboundedSource? I couldn't quite figure out where to get the restored checkpointMark; it appears to be specific to the runner implementation.
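
A minimal sketch of the kind of pipeline this thread is about, assuming the Java SDK and using KafkaIO as the unbounded source (the broker address, topic, and duration are placeholders): withMaxReadTime() wraps the unbounded read in a BoundedReadFromUnboundedSource, which is where the checkpointMark handling discussed here lives.

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.joda.time.Duration;

    public class MaxReadTimeExample {
      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // withMaxReadTime() turns the unbounded Kafka read into a
        // BoundedReadFromUnboundedSource under the hood; as discussed in the
        // thread, the wrapper does not hand a restored checkpointMark back to
        // the source on recovery.
        p.apply(
            KafkaIO.<String, String>read()
                .withBootstrapServers("localhost:9092") // placeholder broker
                .withTopic("my-topic")                  // placeholder topic
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withMaxReadTime(Duration.standardMinutes(10)));

        p.run().waitUntilFinish();
      }
    }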

Exceptions: Attempt to deliver a timer to a DoFn, but timers are not supported in Dataflow.

2020-07-27 Thread Mohil Khare
Hello all, any idea how to debug this and find out which stage, which DoFn, or which side input is causing the problem? Do I need to implement OnTimer in every DoFn to avoid it? I thought some uncaught exceptions were causing this and added various checks and exception handling in
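
Not a fix for the Dataflow error, but for context, a minimal sketch (Java SDK, illustrative names) of how timers normally enter a pipeline: a keyed DoFn declaring a TimerSpec and an @OnTimer callback. Timers only work on keyed input, and composite transforms can use them internally, so the error can surface even without a hand-written @OnTimer method.

    import org.apache.beam.sdk.state.TimeDomain;
    import org.apache.beam.sdk.state.Timer;
    import org.apache.beam.sdk.state.TimerSpec;
    import org.apache.beam.sdk.state.TimerSpecs;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.values.KV;
    import org.joda.time.Duration;

    // Illustrative DoFn that sets an event-time timer per key and emits the
    // firing timestamp when the timer goes off.
    class FlushTimerFn extends DoFn<KV<String, Long>, Long> {

      @TimerId("flush")
      private final TimerSpec flushSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

      @ProcessElement
      public void process(ProcessContext ctx, @TimerId("flush") Timer flush) {
        // Schedule the timer one minute after the element's timestamp.
        flush.offset(Duration.standardMinutes(1)).setRelative();
      }

      @OnTimer("flush")
      public void onFlush(OnTimerContext ctx) {
        ctx.output(ctx.timestamp().getMillis());
      }
    }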

Re: Partition unbounded collection like Kafka source

2020-07-27 Thread Chamikara Jayalath
Probably you should apply the Partition[1] transform on the output PCollection of your read. Note, though, that the exact parallelization is runner dependent (for example, the runner might autoscale up, resulting in more writers). Did you run into issues when just reading from Kafka and writing to
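
A short sketch of this suggestion, assuming the Java SDK; the element type, partition count, and hash-based partition function are illustrative stand-ins for whatever the Kafka read actually produces:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.Partition;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.PCollectionList;

    public class PartitionExample {
      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // Stand-in for the PCollection produced by the Kafka read.
        PCollection<KV<String, String>> records =
            p.apply(Create.of(
                KV.of("user-1", "a"), KV.of("user-2", "b"), KV.of("user-3", "c")));

        // Split the collection into a fixed number of branches by key hash.
        // Partition only splits the data; how much parallelism the runner
        // assigns to each branch is still runner dependent.
        int numShards = 4;
        PCollectionList<KV<String, String>> shards =
            records.apply(
                Partition.of(
                    numShards,
                    (KV<String, String> kv, int n) ->
                        Math.floorMod(kv.getKey().hashCode(), n)));

        for (PCollection<KV<String, String>> shard : shards.getAll()) {
          // Each branch can get its own downstream write, e.g. shard.apply(...).
        }

        p.run().waitUntilFinish();
      }
    }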

Re: Unbounded sources unable to recover from checkpointMark when withMaxReadTime() is used

2020-07-27 Thread Maximilian Michels
Hi Mani, just to let you know, I think it would make sense to either (1) implement checkpointing for BoundedReadFromUnboundedSource, or (2) throw an error in case a checkpoint mark is provided. Like you pointed out, ignoring it as we currently do does not seem like a feasible solution. -Max
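
A hypothetical sketch of what option (2) could look like, not the actual Beam implementation: a thin UnboundedSource wrapper that fails loudly when the runner hands back a restored checkpoint mark instead of silently ignoring it. The class name and error message are made up for illustration.

    import java.io.IOException;
    import java.util.List;
    import org.apache.beam.sdk.coders.Coder;
    import org.apache.beam.sdk.io.UnboundedSource;
    import org.apache.beam.sdk.options.PipelineOptions;

    // Hypothetical delegating source: rejects a restored checkpoint mark so the
    // user learns that recovery is unsupported, rather than silently re-reading.
    class RejectCheckpointSource<T, C extends UnboundedSource.CheckpointMark>
        extends UnboundedSource<T, C> {

      private final UnboundedSource<T, C> delegate;

      RejectCheckpointSource(UnboundedSource<T, C> delegate) {
        this.delegate = delegate;
      }

      @Override
      public List<? extends UnboundedSource<T, C>> split(
          int desiredNumSplits, PipelineOptions options) throws Exception {
        return delegate.split(desiredNumSplits, options);
      }

      @Override
      public UnboundedReader<T> createReader(PipelineOptions options, C checkpointMark)
          throws IOException {
        if (checkpointMark != null) {
          // Option (2) from the thread: throw instead of dropping the mark.
          throw new UnsupportedOperationException(
              "Restoring from a checkpointMark is not supported when "
                  + "withMaxReadTime()/withMaxNumRecords() is used.");
        }
        return delegate.createReader(options, null);
      }

      @Override
      public Coder<C> getCheckpointMarkCoder() {
        return delegate.getCheckpointMarkCoder();
      }

      @Override
      public Coder<T> getOutputCoder() {
        return delegate.getOutputCoder();
      }
    }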