Gaurav,

My question is about why Async was made the default when it changed the semantics of operator callbacks. Your response doesn't answer that.
In a way we broke backward compatibility.

Chandni

On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta <[email protected]> wrote:

> The idea behind Async checkpointing is to unblock the operator while the
> state is getting transferred to HDFS.
> Just to clarify: beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
> is the ideal sequence, but if HDFS is slow, or transferring the state to
> HDFS is slow for some other reason, this sequence may not hold.
>
> Can your use case be addressed by
> <https://malhar.atlassian.net/browse/APEX-78>?
>
> Thanks
> - Gaurav
>
>
> On Nov 22, 2015, at 3:56 PM, Chandni Singh <[email protected]>
> wrote:
> >
> > With Async checkpointing the checkpointed callback in
> > CheckpointListener is called for a previous window, that is,
> > beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
> >
> > This feature was newly introduced. With synchronous checkpointing, the
> > behavior was always
> > beginWindow(x) -> endWindow(x) -> checkpointed(x)
> >
> > A lot of operators were written before asynchronous checkpointing was
> > introduced, and a few of them may rely on the sequencing guaranteed by
> > synchronous checkpointing.
> >
> > So why was Async checkpointing made the default?
> >
> > With how Async checkpointing works today, the complexity of handling
> > transient state in the checkpointed callback falls on every operator.
> > For example, let's say earlier I had a transient map which I cleared
> > every time checkpointed was called. With async checkpointing this
> > simple task becomes a lot more complicated.
> >
> > I think Async checkpointing broke the semantics of operator callbacks and
> > should NOT be the default.
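
To make the transient-map case from the thread concrete, here is a minimal sketch of one way an operator might cope with checkpointed(x-1) arriving after endWindow(x). It is plain Java and does not use the real Apex interfaces: the class WindowedTransientState and its methods put and pendingEntries are hypothetical names made up for illustration, and the callbacks only mimic the beginWindow / endWindow / checkpointed sequence discussed above. The idea is to group transient entries by the window that produced them and, on checkpointed(w), drop only the entries from windows up to and including w instead of clearing the whole map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: NOT the Apex API, only the callback order from the thread.
class WindowedTransientState {
    // Transient entries grouped by the window in which they were added.
    private final TreeMap<Long, Map<String, String>> pendingByWindow = new TreeMap<>();
    private long currentWindowId;

    void beginWindow(long windowId) {
        currentWindowId = windowId;
    }

    // Record a transient entry against the window currently being processed.
    void put(String key, String value) {
        pendingByWindow
            .computeIfAbsent(currentWindowId, id -> new HashMap<>())
            .put(key, value);
    }

    void endWindow() {
        // Nothing to do here in this sketch.
    }

    // With async checkpointing this may be called for an earlier window,
    // e.g. checkpointed(x - 1) after endWindow(x).
    void checkpointed(long windowId) {
        // Clear only the entries covered by the checkpoint (windows <= windowId).
        pendingByWindow.headMap(windowId, true).clear();
    }

    // Number of transient entries not yet covered by a checkpoint.
    int pendingEntries() {
        int count = 0;
        for (Map<String, String> entries : pendingByWindow.values()) {
            count += entries.size();
        }
        return count;
    }

    public static void main(String[] args) {
        WindowedTransientState op = new WindowedTransientState();
        op.beginWindow(1); op.put("a", "1"); op.endWindow();
        op.beginWindow(2); op.put("b", "2"); op.endWindow();
        op.checkpointed(1); // late callback for window 1; window 2's entry is kept
        System.out.println("pending after checkpointed(1): " + op.pendingEntries());
    }
}
```

With synchronous checkpointing, a plain clear() in checkpointed would have been enough. The per-window bookkeeping above is the extra complexity the thread is pointing at, and every operator holding transient state would need something similar.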
