You can only perform such operations in the committed callback. Anything done in checkpointed can be repeated (until it becomes a recovery checkpoint).
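
As a rough sketch of that distinction (not actual Apex code): the CheckpointCallbacks interface, the PendingFileMover class, and its fields below are hypothetical stand-ins, and the assumption is that committed(windowId) covers every window up to and including windowId. Irreversible work such as moving files is deferred until committed, while checkpointed only does work that is safe to repeat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the operator callbacks discussed in this thread;
// it is NOT the real Apex interface.
interface CheckpointCallbacks {
    void checkpointed(long windowId);
    void committed(long windowId);
}

// Hypothetical operator fragment that defers irreversible side effects.
class PendingFileMover implements CheckpointCallbacks {
    // windowId -> files finalized in that window, kept until the window is committed
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    // Records the "moves" performed so far (stands in for a real file system call)
    final List<String> moved = new ArrayList<>();

    void endWindow(long windowId, List<String> files) {
        pending.put(windowId, new ArrayList<>(files));
    }

    @Override
    public void checkpointed(long windowId) {
        // Only repeatable work belongs here: the operator may still be
        // restored to an earlier checkpoint and replay these windows.
    }

    @Override
    public void committed(long windowId) {
        // Assumption: windows up to and including windowId will not be replayed,
        // so irreversible actions for them can be performed now.
        Map<Long, List<String>> done = pending.headMap(windowId, true);
        for (List<String> files : done.values()) {
            moved.addAll(files);
        }
        done.clear(); // clearing the view also removes the entries from 'pending'
    }
}
```

With this shape, a failed state copy after checkpointed has no external effect to undo, because nothing irreversible has happened yet.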

On Sun, Nov 22, 2015 at 10:02 PM, Gaurav Gupta <[email protected]> wrote:

> Thomas,
>
> This was done to preserve checkpointing semantics that is to tell the
> operator that its state is preserved. Say if database is updated or files
> are moved in checkpointed call but the state copy fails, how to address
> such scenarios?
>
> Thanks
> - Gaurav
>
> > On Nov 22, 2015, at 9:44 PM, Thomas Weise <[email protected]> wrote:
> >
> > Alternatively I would ask why the checkpointed callback needs to wait until
> > the data was copied to HDFS instead upon completion of the state
> > serialization.
> >
> > Thomas
> >
> >
> > On Sun, Nov 22, 2015 at 9:41 PM, Chandni Singh <[email protected]>
> > wrote:
> >
> >> Gaurav,
> >>
> >> My question is about why Async was made the default when it changed the
> >> semantics of operator callbacks. Your response doesn't answer that.
> >>
> >> In a way we broke backward compatibility.
> >>
> >> Chandni
> >>
> >> On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta <[email protected]>
> >> wrote:
> >>
> >>> The idea behind Async checkpointing is to unblock operator while the state
> >>> is getting transferred to HDFS.
> >>> Just to clarify that this beginWindow (x) -> endWindow(x) -> checkpointed
> >>> (x-1 ) should be an ideal sequence, but if the HDFS is slow or for some
> >>> other reason transferring the state to HDFS is slow this sequence may not
> >>> hold true.
> >>>
> >>> Can your use case be addressed by
> >>> https://malhar.atlassian.net/browse/APEX-78 <
> >>> https://malhar.atlassian.net/browse/APEX-78>?
> >>>
> >>> Thanks
> >>> - Gaurav
> >>>
> >>>> On Nov 22, 2015, at 3:56 PM, Chandni Singh <[email protected]> wrote:
> >>>>
> >>>> With Async checkpointing the checkpoint callback in CheckpointPoint
> >>>> listener is called for a previous window, that is,
> >>>> beginWindow (x) -> endWindow(x) -> checkpointed (x-1 )
> >>>>
> >>>> This feature was newly introduced. With synchronous checkpointing, the
> >>>> behavior was always
> >>>> beginWindow(x) -> endWindow(x) -> checkpointed (x)
> >>>>
> >>>> A lot of operators were written before asynchronous checkpointing was
> >>>> introduced and few of them can rely on the sequencing guaranteed by
> >>>> synchronous checkpointing.
> >>>>
> >>>> So why was Async Checkpointed made default?
> >>>>
> >>>> With how Async checkpoint is today, the complexity to handle transient
> >>>> state in checkpointed callback falls on every operator. For eg, lets say
> >>>> earlier I had a transient map which I cleared every time the checkpointed
> >>>> was called, with async checkpointing this simple task will be a lot more
> >>>> complicated.
> >>>>
> >>>> I think Async checkpointing broke the semantics of operator callbacks and
> >>>> should NOT be the default.
