FYI,

HDHTWriter implementation is dependent on the older semantics and seems to
be broken now.
startWindow(x) -> endWindow(x) -> checkpointed(x)
In the checkpointed implementation, it copies certain state (transient) and
transfers it to a checkpointedWriteCache with respect to window 'x'.

With Async checkpointing it, the state that is transferred is much more
recent than window 'x'.

Chandni


On Sun, Nov 22, 2015 at 11:04 PM, Chandni Singh <[email protected]>
wrote:

> Agreed. Thomas's solution fixes the backward incompatibility. I think we
> really need to fix this.
>
> On Sun, Nov 22, 2015 at 10:23 PM, Timothy Farkas <[email protected]>
> wrote:
>
>> Gaurav,
>>
>> I think if the state copy fails then STRAM should roll back the operator
>> to
>> a checkpoint that is further back than the last checkpoint. If you are
>> saying that you want to preserve the semantic that checkpointed is only
>> called after a checkpoint is completed, I would argue that that guarantee
>> is already pointless in the current implementation since it is always
>> possible for an operator to be rolled back to a checkpoint before it's
>> last
>> completed checkpoint. So, it is already currently possible for some
>> database or file operation performed after a completed checkpoint to be
>> redone after a failure. Because of this I think Thomas's solution makes
>> the
>> most sense. Thomas's solution would also address Chandni's original point
>> that the semantics for the checkpointed call back have been violated.
>> There
>> are operators in our libraries that have depended on the beginWindow(x),
>> endWindow(x), and checkpointed(x) call sequence, which is now broken. We
>> should probably fix that.
>>
>> Tim
>>
>> On Sun, Nov 22, 2015 at 10:02 PM, Gaurav Gupta <[email protected]>
>> wrote:
>>
>> > Thomas,
>> >
>> > This was done to preserve checkpointing semantics that is to tell the
>> > operator that its state is preserved. Say if database is updated or
>> files
>> > are moved in checkpointed call but the state copy fails, how to address
>> > such scenarios?
>> >
>> > Thanks
>> > - Gaurav
>> >
>> > > On Nov 22, 2015, at 9:44 PM, Thomas Weise <[email protected]>
>> > wrote:
>> > >
>> > > Alternatively I would ask why the checkpointed callback needs to wait
>> > until
>> > > the data was copied to HDFS instead upon completion of the state
>> > > serialization.
>> > >
>> > > Thomas
>> > >
>> > >
>> > > On Sun, Nov 22, 2015 at 9:41 PM, Chandni Singh <
>> [email protected]>
>> > > wrote:
>> > >
>> > >> Gaurav,
>> > >>
>> > >> My question is about why Async was made the default when it changed
>> the
>> > >> semantics of operator callbacks. Your response doesn't answer that.
>> > >>
>> > >> In a way we broke backward compatibility.
>> > >>
>> > >> Chandni
>> > >>
>> > >> On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta <
>> [email protected]>
>> > >> wrote:
>> > >>
>> > >>> The idea behind Async checkpointing is to unblock operator while the
>> > >> state
>> > >>> is getting transferred to HDFS.
>> > >>> Just to clarify that this beginWindow (x) -> endWindow(x) ->
>> > checkpointed
>> > >>> (x-1 ) should be an ideal sequence, but if the HDFS is slow or for
>> some
>> > >>> other reason transferring the state to HDFS is slow this sequence
>> may
>> > not
>> > >>> hold true.
>> > >>>
>> > >>> Can your use case be addressed by
>> > >>> https://malhar.atlassian.net/browse/APEX-78 <
>> > >>> https://malhar.atlassian.net/browse/APEX-78>?
>> > >>>
>> > >>> Thanks
>> > >>> - Gaurav
>> > >>>
>> > >>>> On Nov 22, 2015, at 3:56 PM, Chandni Singh <
>> [email protected]>
>> > >>> wrote:
>> > >>>>
>> > >>>> With Async checkpointing the checkpoint callback in CheckpointPoint
>> > >>>> listener is called for a previous window, that is,
>> > >>>> beginWindow (x) -> endWindow(x) -> checkpointed (x-1 )
>> > >>>>
>> > >>>> This feature was newly introduced. With synchronous checkpointing,
>> the
>> > >>>> behavior was always
>> > >>>> beginWindow(x) -> endWindow(x) -> checkpointed (x)
>> > >>>>
>> > >>>> A lot of operators were written before asynchronous checkpointing
>> was
>> > >>>> introduced and few of them can rely on the sequencing guaranteed by
>> > >>>> synchronous checkpointing.
>> > >>>>
>> > >>>> So why was Async Checkpointed made default?
>> > >>>>
>> > >>>> With how Async checkpoint is today, the complexity to handle
>> transient
>> > >>>> state in checkpointed callback falls on every operator. For eg,
>> lets
>> > >> say
>> > >>>> earlier I had a transient map which I cleared every time the
>> > >> checkpointed
>> > >>>> was called, with async checkpointing this simple task will be a lot
>> > >> more
>> > >>>> complicated.
>> > >>>>
>> > >>>> I think Async checkpointing broke the semantics of operator
>> callbacks
>> > >> and
>> > >>>> should NOT be the default.
>> > >>>
>> > >>>
>> > >>
>> >
>> >
>>
>
>

Reply via email to