> On Feb. 12, 2014, 11:43 p.m., Jie Yu wrote:
> > We scanned all the log related code. There are a few places that need to be 
> > taken care of.
> > 
> > Log::Writer::append/truncate
> > Log::Reader::read
> > 
> > In the above functions, if timeout happens, we'll invoke 
> > 'future.discard()'. Should we wait for 'future' to become DISCARDED before 
> > we return None()? Maybe a TODO there?
> > 
> > LogReaderProcess/LogWriterProcess::recover()
> > 
> > Should we register 'onDiscard' callback on promise->future() and do 
> > 'promise->discard()' if we detect a discard attempt from the user?

We shouldn't need to do anything for Log::Writer::append/truncate or 
Log::Reader::read since those functions don't return a future. The underlying 
functions LogWriterProcess::append/truncate and LogReaderProcess::read just 
chain futures, so a Future::discard on whatever they return should propagate 
through (unlike the cgroups code, where we return a future from a promise and 
don't chain or associate that promise with any other asynchronous calls that 
are made).
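
To illustrate the difference between chaining and returning a future from an 
unassociated promise, here is a toy sketch. It is not libprocess: 'ToyFuture', 
'chained', and 'unassociated' are made-up names, and the model only tracks 
whether a discard request reaches the inner operation.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Toy stand-in for a future (an assumption for illustration only; this is
// NOT process::Future). It records discard requests and runs callbacks.
struct ToyFuture {
  bool discardRequested = false;
  std::vector<std::function<void()>> onDiscard;

  void discard() {
    if (discardRequested) return;  // Only the first request has an effect.
    discardRequested = true;
    for (auto& callback : onDiscard) callback();
  }
};

using FuturePtr = std::shared_ptr<ToyFuture>;

// Chained case: a discard on the returned future is forwarded to the
// inner asynchronous operation.
FuturePtr chained(const FuturePtr& inner) {
  auto outer = std::make_shared<ToyFuture>();
  outer->onDiscard.push_back([inner]() { inner->discard(); });
  return outer;
}

// Unassociated case: the returned future comes from a fresh promise that
// has no link to the inner operation, so a discard stops at the outer one.
FuturePtr unassociated(const FuturePtr& inner) {
  (void)inner;  // Deliberately not linked.
  return std::make_shared<ToyFuture>();
}

// Small usage example exercising both cases.
void example() {
  auto a = std::make_shared<ToyFuture>();
  chained(a)->discard();
  assert(a->discardRequested);   // The request reached the inner future.

  auto b = std::make_shared<ToyFuture>();
  unassociated(b)->discard();
  assert(!b->discardRequested);  // The request never reached it.
}
```

In the unassociated case the caller has to register a discard callback 
explicitly for the request to reach the inner operation.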

The reason I didn't wait for the future to complete in Log::Writer::* and 
Log::Reader::* after we do a Future::discard is that we weren't waiting 
before (well, technically we couldn't wait before, since the discard happened 
immediately!).

I've added a TODO to Log*Process::recover to register onDiscard callbacks.


> On Feb. 12, 2014, 11:43 p.m., Jie Yu wrote:
> > src/log/catchup.cpp, line 235
> > <https://reviews.apache.org/r/17686/diff/3/?file=470407#file470407line235>
> >
> >     We have a hard time understanding why you are changing the logic here. 
> > Seems that the timer you created here will get fired no matter what. What 
> > if the 'catching' operation succeeds?
> >     
> >     IIUC, except for the 'finalize' function, you don't have to do any 
> > change here.

With the old semantics, calling 'timedout' would do 'catching.discard()', which 
would cause 'discarded' to get invoked, which would restart 'catchup'. A lot 
about this was weird IMHO:

(1) We used 'catching.discard()' to imply a timeout (even the code in 
'discarded' mentions the timeout), but that discard could have come from 
'log::catchup'!
(2) Given (1), if 'log::catchup' actually discarded the future, we simply 
tried again! :(

So now, when we time out, we simply start another 'log::catchup'. Note that we 
don't wait for the old 'log::catchup' to complete, just as we didn't before. In 
addition, a discarded event now properly propagates, and in this case I chose 
to propagate it as an error.

I did notice here that I should really do 'catching.discard()' in 'timedout' 
and then skip old attempts at 'log::catchup' in 'discarded' so I'm keeping this 
issue open for you to take another look.
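
A minimal sketch of what "skip old attempts" could look like, assuming each 
attempt is tagged with an id. This is a toy model, not the code in 
src/log/catchup.cpp: 'ToyCatchup' and its members are hypothetical names, and 
the real callbacks are driven by futures rather than called by hand.

```cpp
#include <cassert>

// Toy model of the retry logic (hypothetical; not the actual Mesos code).
struct ToyCatchup {
  int attempt = 0;      // Id of the current 'log::catchup' attempt.
  int started = 0;      // How many attempts have been started so far.
  bool failed = false;  // Set when a discard is propagated as an error.

  // Start a new attempt and make it the current one.
  void start() {
    ++attempt;
    ++started;
  }

  // Timer callback for attempt 'id'. A timer belonging to an old attempt
  // is ignored. Otherwise a new attempt is started; the real code would
  // also discard the old future at this point.
  void timedout(int id) {
    if (id != attempt) return;
    start();
  }

  // Discard callback for attempt 'id'. A discard from an old attempt
  // (one we timed out ourselves) is skipped. A discard of the current
  // attempt came from elsewhere and is propagated as an error.
  void discarded(int id) {
    if (id != attempt) return;
    failed = true;
  }
};

// Small usage example walking through one timeout.
void exampleCatchup() {
  ToyCatchup c;
  c.start();
  int old = c.attempt;

  c.timedout(old);         // Times out: a second attempt starts.
  assert(c.started == 2);

  c.discarded(old);        // Stale discard from the first attempt.
  assert(!c.failed);

  c.discarded(c.attempt);  // Discard of the current attempt.
  assert(c.failed);
}
```

Tagging attempts keeps a timeout-induced discard distinguishable from a 
discard that originated in 'log::catchup' itself, which was issue (1) above.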


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17686/#review34246
-----------------------------------------------------------


On Feb. 5, 2014, 11:41 p.m., Benjamin Hindman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17686/
> -----------------------------------------------------------
> 
> (Updated Feb. 5, 2014, 11:41 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Ian Downes, Jie Yu, Niklas 
> Nielsen, TILL TOENSHOFF, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/java/jni/org_apache_mesos_state_AbstractState.cpp 
> 2ee0b1b631b80ec783e6bce683cdeaa77e56b2aa 
>   src/linux/cgroups.cpp 19ab1f348191ab0315271477b206aa8c6456fd5a 
>   src/log/catchup.cpp 4ee32f285f77eb2de661e22a301b743bb8a06f9c 
>   src/log/consensus.cpp b89673a3b8f233e901eaf9ae69a9979099f4eb73 
>   src/log/recover.cpp bb32e51e1172a8b32ac74be9848c1e72db5cefa0 
>   src/master/detector.cpp 2b169c551affe7acd2feac7806a27b46eb99bb88 
>   src/master/registrar.cpp 915885a160f790399e8185c28c6e6555af1ee76e 
>   src/sasl/authenticatee.hpp f1a677f8aed0979f958e51f85e0a8210a03bd343 
>   src/sasl/authenticator.hpp 1478f6771b424555c34586a0d61f208dc15b0e7d 
>   src/slave/gc.cpp 405350bf8f498d2e59e9e6b4c4c19b7bdaa974de 
>   src/zookeeper/contender.cpp 6710da4e64fc0a43c1eabfc0f39fb0133c13df14 
>   src/zookeeper/group.cpp ecb6c002e8194b8d67e262826d988f747414f9f3 
> 
> Diff: https://reviews.apache.org/r/17686/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>
