A lot of things are different when it comes to async checkpointing. I was evaluating it this morning and expect that I will either be able to explain it or open JIRA issues. My partial observation is that with async checkpointing, checkpointed is not issued (Chandni, the last statement in the if block is "return"). I am digging into it, but feel free to chime in if someone else finds it first.
I also realized that my morning email applies as-is to committed, but checkpointed has deviated a little from that. I will post a revised response soon.

-- Chetan

On 11/10/15, 2:04 PM, "Chandni Singh" <[email protected]> wrote:

> Chetan,
>
> Looking at the checkpoint(windowId) in Node.java, I don't think the steps you mentioned are followed.
>
> *if (using AsyncFSStorageAgent) {*
> * asyncFSStorageAgent.copyToHdfs(...)*
> *}*
> *operator.checkpointed(windowId);*
>
> This means even if copyToHdfs fails, the operator is notified that the window is checkpointed.
>
> Are we saying that copyToHdfs will never fail with AsyncFSStorageAgent for a window since the operator is notified that the window is checkpointed?
>
> Chandni
>
> On Tue, Nov 10, 2015 at 11:33 AM, Timothy Farkas <[email protected]> wrote:
>
> > Will do
> >
> > On Tue, Nov 10, 2015 at 11:01 AM, Pramod Immaneni <[email protected]> wrote:
> >
> > > Is there a unit test covering it? Otherwise can you write one to test the hypothesis.
> > >
> > > On Tue, Nov 10, 2015 at 11:00 AM, Timothy Farkas <[email protected]> wrote:
> > >
> > > > That is what it is looking like to me. The task is submitted at GenericNode#checkpoint line 504, then at the end of GenericNode#checkpoint, line 531, checkpointed is called. I am likely missing something, just would like to know what :)
> > > >
> > > > Tim
> > > >
> > > > On Tue, Nov 10, 2015 at 10:51 AM, Pramod Immaneni <[email protected]> wrote:
> > > >
> > > > > Tim,
> > > > >
> > > > > Are you suggesting that checkpointed is called before the checkpoint is completely persisted in the storage?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Tue, Nov 10, 2015 at 10:49 AM, Timothy Farkas <[email protected]> wrote:
> > > > >
> > > > > > Chetan,
> > > > > >
> > > > > > I do not see the process of reporting the checkpoint to stram, receiving the ack, and then calling checkpointed. The logic I'm seeing in GenericNode line 484 is that the checkpoint method is called, it spawns another thread that writes to hdfs, and then checkpointed is immediately called afterwards. I am missing something, can you give me some pointers so that I can better understand the flow?
> > > > > >
> > > > > > Tim
> > > > > >
> > > > > > On Tue, Nov 10, 2015 at 10:33 AM, Munagala Ramanath <[email protected]> wrote:
> > > > > >
> > > > > > > Chetan's answer provides a good explanation as well as clarifying that the difference can be more than 1.
> > > > > > >
> > > > > > > Since checkpointing (i.e. "commit notification" as Thomas refers to it) is asynchronous, I'm curious about whether the window ids in the checkpointed call are guaranteed to be sequential or if they could be out of order, i.e. can the checkpointed call see window id 101 before it sees 100?
> > > > > > >
> > > > > > > Ram
> > > > > > >
> > > > > > > On Tue, Nov 10, 2015 at 10:27 AM, Bhupesh Chawda <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Tim,
> > > > > > > > Thanks for the detailed explanation.
> > > > > > > > I understand that the sequence would be
> > > > > > > >
> > > > > > > > beginWindow (x) -> endWindow (x) -> checkpointed (x) -> beginWindow (x+1)
> > > > > > > >
> > > > > > > > However when I try to print out the window ids in beginWindow, endWindow and checkpointed calls, I see x and x-1 respectively.
> > > > > > > > I.e. If the window just before checkpoint is 100, I see that the checkpointed call had window id 99.
> > > > > > > >
> > > > > > > > Note: This is observed in the local mode of Apex.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > -Bhupesh
> > > > > > > >
> > > > > > > > On 10-Nov-2015 11:25 pm, "Timothy Farkas" <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Bhupesh,
> > > > > > > > >
> > > > > > > > > The sequencing of checkpoint called in relation to beginWindow and endWindow depends on how your APPLICATION_WINDOW_COUNT and CHECKPOINT_WINDOW_COUNT are configured. If the two WINDOW_COUNTs are not configured to be the same then it is possible that checkpointed is called within an application window. So the sequence of events would be this:
> > > > > > > > >
> > > > > > > > > beginWindow -> checkpointed -> endWindow
> > > > > > > > >
> > > > > > > > > If the APPLICATION_WINDOW_COUNT and CHECKPOINT_WINDOW_COUNT are the same (default), then the sequence of calls would be this:
> > > > > > > > >
> > > > > > > > > beginWindow (100) -> endWindow (100) -> checkpointed (100) -> beginWindow (101)
> > > > > > > > >
> > > > > > > > > The numbers in the sequence represent possible streaming window Ids that each call would be associated with.
> > > > > > > > >
> > > > > > > > > The StateMachine which calls these callbacks for non-input operators is in GenericNode.java.
> > > > > > > > >
> > > > > > > > > Tim
> > > > > > > > >
> > > > > > > > > On Tue, Nov 10, 2015 at 3:36 AM, Bhupesh Chawda <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Chetan / Community,
> > > > > > > > > >
> > > > > > > > > > Can someone please elaborate on why the window id supplied to CheckpointListener and the Operator would differ?
> > > > > > > > > > I tried looking at the window ids of checkpointed() and the beginWindow() calls and they differ by 1. Don't know why this should be the case.
> > > > > > > > > >
> > > > > > > > > > Thanks.
> > > > > > > > > > -Bhupesh
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 17, 2015 at 5:56 AM, Chetan Narsude <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Short answer is yes.
> > > > > > > > > > >
> > > > > > > > > > > All the control tuples are scheduled to be delivered outside of the window. As checkpointed callback is triggered because of CHECKPOINT control tuple, it will happen after endWindow and before the next beginWindow.
> > > > > > > > > > >
> > > > > > > > > > > The windowId supplied to CheckpointListener and the one provided to Operator need not match even though the sequence is defined. So I am curious how you intend to use this knowledge.
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Chetan
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Sep 15, 2015 at 8:31 AM, Thomas Weise <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > It has not changed the operator execution model. State serialization is still synchronous, write to HDFS is taken out of the operator thread.
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Sep 15, 2015 at 8:18 AM, Amol Kekre <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Sent too soon. Has asynchronous checkpointing changed this?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Amol
> > > > > > > > > > > > >
> > > > > > > > > > > > > Sent from my iPhone
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sep 15, 2015, at 12:38 AM, Bhupesh Chawda <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Is it safe to assume that the checkpointed() and the beginWindow() calls are sequenced?
> > > > > > > > > > > > > > In other words, are these calls part of the same thread and may not run in parallel?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > -Bhupesh
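
The concern raised at the top of this thread (is the operator told a window is checkpointed before the asynchronous copy has finished, or even if it fails?) can be illustrated with a small, self-contained Java sketch. It does not use any Apex classes and is not the code in Node.java or GenericNode.java. The names CheckpointOrderSketch, Listener, copyAsync, notifyImmediately and notifyAfterCopy are hypothetical stand-ins chosen for illustration, and the copy failure is simulated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CheckpointOrderSketch {

    /** Hypothetical stand-in for an operator's checkpointed(windowId) callback. */
    interface Listener {
        void checkpointed(long windowId);
    }

    /** Hypothetical stand-in for an asynchronous copy to storage; fails when 'fail' is true. */
    static CompletableFuture<Void> copyAsync(ExecutorService pool, boolean fail) {
        return CompletableFuture.runAsync(() -> {
            if (fail) {
                throw new IllegalStateException("simulated copy failure");
            }
        }, pool);
    }

    /** The flow questioned in the thread: submit the copy, then notify right away. */
    static void notifyImmediately(ExecutorService pool, long windowId, boolean fail, Listener listener) {
        CompletableFuture<Void> copy = copyAsync(pool, fail);
        listener.checkpointed(windowId); // the listener is told before the copy outcome is known
        try {
            copy.join(); // wait only so that the demo finishes deterministically
        } catch (CompletionException e) {
            // The copy failed, but the notification has already gone out.
        }
    }

    /** One possible alternative (an assumption, not Apex behavior): notify only after success. */
    static void notifyAfterCopy(ExecutorService pool, long windowId, boolean fail, Listener listener) {
        try {
            copyAsync(pool, fail).join();
        } catch (CompletionException e) {
            return; // copy failed: no notification for this window
        }
        listener.checkpointed(windowId);
    }

    /** Runs one scenario for window 100 and returns the window ids the listener saw. */
    static List<Long> run(boolean immediate, boolean fail) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<Long> notified = new ArrayList<>();
        try {
            if (immediate) {
                notifyImmediately(pool, 100L, fail, notified::add);
            } else {
                notifyAfterCopy(pool, 100L, fail, notified::add);
            }
        } finally {
            pool.shutdown();
        }
        return notified;
    }

    public static void main(String[] args) {
        System.out.println("immediate, copy fails:  " + run(true, true));
        System.out.println("after copy, copy fails: " + run(false, true));
    }
}
```

With a failing copy, the first scenario still reports window 100 to the listener, while the second reports nothing. The second variant blocks the caller for simplicity; a real design would more likely notify from a completion callback so that the operator thread is not held up.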
