If the requirement is that the order is always beginWindow() -> endWindow() -> checkpointed(), why pass windowId in the checkpointed() callback?
Thanks
- Gaurav

> On Nov 22, 2015, at 11:22 PM, Chandni Singh wrote:
>
> FYI,
>
> The HDHTWriter implementation depends on the older semantics and seems to
> be broken now:
> beginWindow(x) -> endWindow(x) -> checkpointed(x)
> In its checkpointed implementation, it copies certain (transient) state and
> transfers it to a checkpointedWriteCache with respect to window 'x'.
>
> With async checkpointing, the state that is transferred is much more
> recent than window 'x'.
>
> Chandni
>
> On Sun, Nov 22, 2015 at 11:04 PM, Chandni Singh wrote:
>
>> Agreed. Thomas's solution fixes the backward incompatibility. I think we
>> really need to fix this.
>>
>> On Sun, Nov 22, 2015 at 10:23 PM, Timothy Farkas wrote:
>>
>>> Gaurav,
>>>
>>> I think if the state copy fails then STRAM should roll back the operator
>>> to a checkpoint that is further back than the last checkpoint. If you are
>>> saying that you want to preserve the semantic that checkpointed is only
>>> called after a checkpoint is completed, I would argue that that guarantee
>>> is already pointless in the current implementation, since it is always
>>> possible for an operator to be rolled back to a checkpoint before its
>>> last completed checkpoint. So it is already possible for some database
>>> or file operation performed after a completed checkpoint to be redone
>>> after a failure. Because of this I think Thomas's solution makes the
>>> most sense. Thomas's solution would also address Chandni's original point
>>> that the semantics of the checkpointed callback have been violated.
>>> There are operators in our libraries that depend on the beginWindow(x),
>>> endWindow(x), checkpointed(x) call sequence, which is now broken. We
>>> should probably fix that.
>>>
>>> Tim
>>>
>>> On Sun, Nov 22, 2015 at 10:02 PM, Gaurav Gupta wrote:
>>>
>>>> Thomas,
>>>>
>>>> This was done to preserve the checkpointing semantics, that is, to tell
>>>> the operator that its state has been preserved. Say a database is updated
>>>> or files are moved in the checkpointed call but the state copy fails:
>>>> how do we address such scenarios?
>>>>
>>>> Thanks
>>>> - Gaurav
>>>>
>>>>> On Nov 22, 2015, at 9:44 PM, Thomas Weise wrote:
>>>>>
>>>>> Alternatively, I would ask why the checkpointed callback needs to wait
>>>>> until the data has been copied to HDFS instead of being invoked upon
>>>>> completion of the state serialization.
>>>>>
>>>>> Thomas
>>>>>
>>>>> On Sun, Nov 22, 2015 at 9:41 PM, Chandni Singh wrote:
>>>>>
>>>>>> Gaurav,
>>>>>>
>>>>>> My question is about why async was made the default when it changed
>>>>>> the semantics of operator callbacks. Your response doesn't answer that.
>>>>>>
>>>>>> In a way we broke backward compatibility.
>>>>>>
>>>>>> Chandni
>>>>>>
>>>>>> On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta wrote:
>>>>>>
>>>>>>> The idea behind async checkpointing is to unblock the operator while
>>>>>>> the state is being transferred to HDFS.
>>>>>>> Just to clarify: beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
>>>>>>> is the ideal sequence, but if HDFS is slow, or transferring the state
>>>>>>> to HDFS is slow for some other reason, this sequence may not hold.
>>>>>>>
>>>>>>> Can your use case be addressed by
>>>>>>> https://malhar.atlassian.net/browse/APEX-78?
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Gaurav
>>>>>>>
>>>>>>>> On Nov 22, 2015, at 3:56 PM, Chandni Singh wrote:
>>>>>>>>
>>>>>>>> With async checkpointing, the checkpointed callback in
>>>>>>>> CheckpointListener is called for a previous window, that is,
>>>>>>>> beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
>>>>>>>>
>>>>>>>> This feature was newly introduced. With synchronous checkpointing,
>>>>>>>> the behavior was always
>>>>>>>> beginWindow(x) -> endWindow(x) -> checkpointed(x)
>>>>>>>>
>>>>>>>> A lot of operators were written before asynchronous checkpointing
>>>>>>>> was introduced, and a few of them may rely on the sequencing
>>>>>>>> guaranteed by synchronous checkpointing.
>>>>>>>>
>>>>>>>> So why was async checkpointing made the default?
>>>>>>>>
>>>>>>>> With async checkpointing as it is today, the complexity of handling
>>>>>>>> transient state in the checkpointed callback falls on every
>>>>>>>> operator. For example, say I had a transient map that I cleared
>>>>>>>> every time checkpointed was called; with async checkpointing this
>>>>>>>> simple task becomes a lot more complicated.
>>>>>>>>
>>>>>>>> I think async checkpointing broke the semantics of operator
>>>>>>>> callbacks and should NOT be the default.
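
For readers following the thread, here is a minimal sketch of how an operator might keep its transient state correct when checkpointed(x) can arrive after later windows have already begun. It also hints at why the windowId argument matters in that case. This is an illustration only: WindowedWriteCache and all of its methods are hypothetical names, it does not use the Apex API, and it is not how HDHTWriter is implemented. The assumption is that transient state is keyed by the window in which it was produced, so that a late callback releases only the windows the checkpoint actually covers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an operator that buffers writes per window.
// The method names only mirror the callbacks discussed in this thread.
class WindowedWriteCache {
    // Uncommitted writes, keyed by the window in which they arrived.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    // Writes known to be covered by a completed checkpoint.
    private final List<String> committed = new ArrayList<>();
    private long currentWindowId;

    void beginWindow(long windowId) {
        currentWindowId = windowId;
    }

    void write(String record) {
        pending.computeIfAbsent(currentWindowId, k -> new ArrayList<>()).add(record);
    }

    void endWindow() {
        // Nothing to do: state stays in 'pending' until a checkpoint covers it.
    }

    // May be called late, that is, after later windows have already started.
    void checkpointed(long windowId) {
        // headMap(windowId, true) is a live view of all windows <= windowId.
        Map<Long, List<String>> covered = pending.headMap(windowId, true);
        for (List<String> records : covered.values()) {
            committed.addAll(records);
        }
        covered.clear(); // removes only the covered windows from 'pending'
    }

    List<String> committed() {
        return committed;
    }

    int pendingWindows() {
        return pending.size();
    }

    public static void main(String[] args) {
        WindowedWriteCache cache = new WindowedWriteCache();
        cache.beginWindow(1); cache.write("a"); cache.endWindow();
        cache.beginWindow(2); cache.write("b"); cache.endWindow();
        cache.checkpointed(1); // arrives one window late
        // prints "[a] pending windows: 1"
        System.out.println(cache.committed() + " pending windows: " + cache.pendingWindows());
    }
}
```

Compared with clearing a single transient map on every callback, keying by window costs some extra bookkeeping, but the result no longer depends on whether the callback is delivered synchronously or one or more windows late.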
