Re: Why is Async checkpointing made default?

Gaurav Gupta Sun, 22 Nov 2015 22:23:26 -0800

Thomas,

I did it to maintain the semantics of Checkpoint. If we need to change this 
then I can change the implementation.


Thanks
- Gaurav

> On Nov 22, 2015, at 10:11 PM, Thomas Weise <[email protected]> wrote:
> 
> You can only perform such an operation in committed. Anything done in
> checkpointed can be repeated (until it becomes a recovery checkpoint).
> 
> On Sun, Nov 22, 2015 at 10:02 PM, Gaurav Gupta <[email protected]>
> wrote:
> 
>> Thomas,
>> 
>> This was done to preserve checkpointing semantics, that is, to tell the
>> operator that its state has been preserved. Say a database is updated or
>> files are moved in the checkpointed call but the state copy fails; how do
>> we address such scenarios?
>> 
>> Thanks
>> - Gaurav
>> 
>>> On Nov 22, 2015, at 9:44 PM, Thomas Weise <[email protected]> wrote:
>>>
>>> Alternatively, I would ask why the checkpointed callback needs to wait
>>> until the data has been copied to HDFS instead of firing upon completion
>>> of the state serialization.
>>> 
>>> Thomas
>>> 
>>> 
>>> On Sun, Nov 22, 2015 at 9:41 PM, Chandni Singh <[email protected]>
>>> wrote:
>>> 
>>>> Gaurav,
>>>> 
>>>> My question is about why Async was made the default when it changed the
>>>> semantics of operator callbacks. Your response doesn't answer that.
>>>> 
>>>> In a way we broke backward compatibility.
>>>> 
>>>> Chandni
>>>> 
>>>> On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta <[email protected]>
>>>> wrote:
>>>> 
>>>>> The idea behind Async checkpointing is to unblock the operator while
>>>>> the state is being transferred to HDFS.
>>>>> Just to clarify: beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
>>>>> should be the ideal sequence, but if HDFS is slow, or transferring the
>>>>> state to HDFS is slow for some other reason, this sequence may not hold.
>>>>> 
>>>>> Can your use case be addressed by
>>>>> <https://malhar.atlassian.net/browse/APEX-78>?
>>>>> 
>>>>> Thanks
>>>>> - Gaurav
>>>>> 
>>>>>> On Nov 22, 2015, at 3:56 PM, Chandni Singh <[email protected]> wrote:
>>>>>> 
>>>>>> With Async checkpointing, the checkpointed callback in
>>>>>> CheckpointListener is called for a previous window, that is,
>>>>>> beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
>>>>>> 
>>>>>> This feature was newly introduced. With synchronous checkpointing, the
>>>>>> behavior was always
>>>>>> beginWindow(x) -> endWindow(x) -> checkpointed(x)
>>>>>> 
>>>>>> A lot of operators were written before asynchronous checkpointing was
>>>>>> introduced, and a few of them may rely on the sequencing guaranteed by
>>>>>> synchronous checkpointing.
>>>>>> 
>>>>>> So why was Async checkpointing made the default?
>>>>>> 
>>>>>> With Async checkpointing as it is today, the complexity of handling
>>>>>> transient state in the checkpointed callback falls on every operator.
>>>>>> For example, say I had a transient map that I cleared every time
>>>>>> checkpointed was called; with async checkpointing this simple task
>>>>>> becomes a lot more complicated.
>>>>>> 
>>>>>> I think Async checkpointing broke the semantics of operator callbacks
>>>>>> and should NOT be the default.
>>>>> 
>>>>> 
>>>> 
>> 
>>
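
For illustration, here is a minimal sketch of one way the transient-map case raised in the thread could cope with a late checkpointed call. It does not use the Apex API: the class WindowScopedBuffer, its process and pendingCount methods, and the String tuple type are all hypothetical, and only the callback names beginWindow, endWindow and checkpointed mirror the ones discussed above. The assumption is that transient entries are keyed by the window in which they arrived, so that checkpointed(x-1) arriving after endWindow(x) clears only what the checkpoint actually covers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for an operator; this is NOT the Apex API.
class WindowScopedBuffer {
    // Transient entries grouped by the window in which they arrived.
    private final TreeMap<Long, List<String>> pendingByWindow = new TreeMap<>();
    private long currentWindowId;

    void beginWindow(long windowId) {
        currentWindowId = windowId;
    }

    void process(String tuple) {
        pendingByWindow
            .computeIfAbsent(currentWindowId, k -> new ArrayList<>())
            .add(tuple);
    }

    void endWindow() {
        // Nothing to do here in this sketch.
    }

    // With async checkpointing this may be called for an earlier window,
    // e.g. checkpointed(x - 1) after endWindow(x). Instead of clearing the
    // whole map, drop only the entries for windows up to and including
    // the checkpointed one.
    void checkpointed(long windowId) {
        pendingByWindow.headMap(windowId, true).clear();
    }

    // Number of transient entries not yet covered by a checkpoint.
    int pendingCount() {
        int count = 0;
        for (List<String> tuples : pendingByWindow.values()) {
            count += tuples.size();
        }
        return count;
    }
}
```

With the synchronous ordering, checkpointed(x) directly follows endWindow(x) and the prune empties the map, which matches the old "clear everything" behavior. With the async ordering, entries from window x survive a checkpointed(x-1) call. Per Thomas's point in the thread, anything irreversible (database updates, file moves) would still belong in committed rather than checkpointed.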
