> On April 15, 2014, 9:12 p.m., Ben Mahler wrote:
> > src/log/recover.cpp, line 113
> > <https://reviews.apache.org/r/18600/diff/9/?file=549174#file549174line113>
> >
> >     If we're auto-initializing, shouldn't we 'watch' for the cluster size 
> > as opposed to the quorum size to ensure we don't get stuck?

Beyond just watching for the cluster size to appear, what happens if the 
watch is triggered but the replicas die (or are stopped by an operator) 
before any messages are sent out? We don't want the replicas to end up in a 
completely blocked state, so we need some way of retrying the 
auto-initialization after a timeout and bailing altogether after some 
number of retries.
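
To make the suggestion concrete, here is a minimal sketch of a bounded retry.
It is not code from this patch: `autoInitialize`, `InitResult`, and the
`attempt` callback are hypothetical names, and the loop is synchronous for
brevity, whereas the real recovery code would presumably chain asynchronous
operations instead of blocking.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical outcome of a bounded-retry auto-initialization.
enum class InitResult { Initialized, GaveUp };

// Hypothetical helper, not part of Mesos. `attempt` stands in for one round
// of "watch for the replicas, then send out the initialization messages";
// it is assumed to return true only if enough replicas responded within
// the given timeout.
InitResult autoInitialize(
    const std::function<bool(std::chrono::milliseconds)>& attempt,
    std::chrono::milliseconds timeout,
    int maxRetries)
{
  assert(maxRetries >= 0);

  // One initial attempt plus at most `maxRetries` retries.
  for (int i = 0; i <= maxRetries; ++i) {
    if (attempt(timeout)) {
      return InitResult::Initialized;
    }
  }

  // Every attempt timed out: bail out instead of blocking forever.
  return InitResult::GaveUp;
}
```

The point is only that each attempt is bounded by a timeout and the number of
attempts is bounded too, so replicas dying between the watch firing and the
messages going out leads to a retry or a clean failure rather than a hang.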


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18600/#review40464
-----------------------------------------------------------


On April 4, 2014, 7:12 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18600/
> -----------------------------------------------------------
> 
> (Updated April 4, 2014, 7:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/log/log.hpp 6787c80 
>   src/log/log.cpp 9dd992f 
>   src/log/recover.hpp 634bc06 
>   src/log/recover.cpp 688da5f 
>   src/tests/log_tests.cpp 4f08927 
> 
> Diff: https://reviews.apache.org/r/18600/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>
