Alternatively, I would ask why the checkpointed callback needs to wait until the data has been copied to HDFS instead of firing upon completion of the state serialization.
Thomas

On Sun, Nov 22, 2015 at 9:41 PM, Chandni Singh <[email protected]> wrote:

> Gaurav,
>
> My question is about why Async was made the default when it changed the
> semantics of operator callbacks. Your response doesn't answer that.
>
> In a way we broke backward compatibility.
>
> Chandni
>
> On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta <[email protected]>
> wrote:
>
> > The idea behind Async checkpointing is to unblock operator while the
> > state is getting transferred to HDFS.
> > Just to clarify that this beginWindow (x) -> endWindow(x) -> checkpointed
> > (x-1 ) should be an ideal sequence, but if the HDFS is slow or for some
> > other reason transferring the state to HDFS is slow this sequence may not
> > hold true.
> >
> > Can your use case be addressed by
> > https://malhar.atlassian.net/browse/APEX-78 <
> > https://malhar.atlassian.net/browse/APEX-78>?
> >
> > Thanks
> > - Gaurav
> >
> > > On Nov 22, 2015, at 3:56 PM, Chandni Singh <[email protected]>
> > > wrote:
> > >
> > > With Async checkpointing the checkpoint callback in CheckpointPoint
> > > listener is called for a previous window, that is,
> > > beginWindow (x) -> endWindow(x) -> checkpointed (x-1 )
> > >
> > > This feature was newly introduced. With synchronous checkpointing, the
> > > behavior was always
> > > beginWindow(x) -> endWindow(x) -> checkpointed (x)
> > >
> > > A lot of operators were written before asynchronous checkpointing was
> > > introduced and few of them can rely on the sequencing guaranteed by
> > > synchronous checkpointing.
> > >
> > > So why was Async Checkpointed made default?
> > >
> > > With how Async checkpoint is today, the complexity to handle transient
> > > state in checkpointed callback falls on every operator. For eg, lets say
> > > earlier I had a transient map which I cleared every time the checkpointed
> > > was called, with async checkpointing this simple task will be a lot more
> > > complicated.
> > >
> > > I think Async checkpointing broke the semantics of operator callbacks and
> > > should NOT be the default.
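
To make the transient-map case from the thread concrete, here is a minimal sketch of one way an operator might cope with checkpointed(x-1) arriving after endWindow(x): tag each transient entry with the window that produced it and clear only the windows covered by the checkpoint. The class and method names below are hypothetical stand-ins and do not use the real Apex operator or listener interfaces; it is an illustration under those assumptions, not how any existing operator is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an operator; it does not implement the real Apex interfaces.
class WindowedTransientState {
    // Transient entries grouped by the id of the window in which they were added.
    private final TreeMap<Long, Map<String, String>> byWindow = new TreeMap<>();
    private long currentWindow;

    void beginWindow(long windowId) {
        currentWindow = windowId;
    }

    void put(String key, String value) {
        byWindow.computeIfAbsent(currentWindow, w -> new HashMap<>()).put(key, value);
    }

    void endWindow() {
        // Nothing to do at the window boundary in this sketch.
    }

    // May be delivered late: checkpointed(x - 1) can arrive after endWindow(x).
    void checkpointed(long windowId) {
        // Drop only entries from windows up to and including the checkpointed one;
        // entries from later windows are not yet covered and must survive.
        byWindow.headMap(windowId, true).clear();
    }

    int pendingWindows() {
        return byWindow.size();
    }

    boolean contains(String key) {
        for (Map<String, String> entries : byWindow.values()) {
            if (entries.containsKey(key)) {
                return true;
            }
        }
        return false;
    }
}
```

With the synchronous sequence beginWindow(x) -> endWindow(x) -> checkpointed(x), this behaves like a plain clear(), since every entry belongs to a window at or before x. With the asynchronous sequence it keeps the entries added in window x until a later callback covers them, at the cost of the extra per-window bookkeeping the thread objects to.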
