> On Nov. 26, 2013, 3:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 350
> > <https://reviews.apache.org/r/15706/diff/3/?file=390859#file390859line350>
> >
> >     Ditto about adding braces.
> >     
> >     What if we're in connecting? Should this instead be a CHECK that we're 
> > in AUTHENTICATED?

Ditto. Added CHECK.


> On Nov. 26, 2013, 3:58 a.m., Ben Mahler wrote:
> > src/tests/group_tests.cpp, lines 334-336
> > <https://reviews.apache.org/r/15706/diff/3/?file=390857#file390857line334>
> >
> >     Is this guaranteed to cause retries?

Likely... and I have read logs to see the retries of authenticate() and 
create() *always* get executed.

This test is testing that code's robustness can tolerate many expirations (one 
type of retryable errors that we can control) but with authenticate() and 
create() not being called through dispatch() it is hard to control the exact 
timing. I think high iteration without flakiness should be good proof.


> On Nov. 26, 2013, 3:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 321
> > <https://reviews.apache.org/r/15706/diff/3/?file=390859#file390859line321>
> >
> >     Can you CHECK that state is ready here before you call sync()?
> >     
> >     Since this ignores the result of sync(), the assumption here is that if 
> > sync returns false, we must be imminently getting a callback from ZK, does 
> > this assumption hold? Can you add a note?

About the CHECK:
It's actually not true. The status might still be CONNECTING because during the 
execution of the initial connected() the network might go down and it might 
come back soon enough that connected(reconnect=true) is called instead of 
expired(). In this case it's reconnected but the group setup is not finished 
yet thus not READY.

About sync():
OK I actually think I overlooked this one. I should need to check the return 
value because the the errors are retryable (e.g. operation timed out) we might 
not be notified by ZK. However if authenticate, create or cache fails due to 
non-retryable error should have aborted and will be notified. (If joins, datas, 
cancels fail in such a way no retry/group abort will follow).

I rewrote GroupProcess::connected.


> On Nov. 26, 2013, 3:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 327
> > <https://reviews.apache.org/r/15706/diff/3/?file=390859#file390859line327>
> >
> >     Can you add braces?
> >     
> >     if (...) {
> >       return true;
> >     }
> >     
> >     Should this instead be a CHECK that we're in CONNECTING?

In this review, if a retry() is already timed out after the delay and enqueued 
when expire() happens, it cannot be cancelled. Thus it can be executed before 
connected() is called and in connected() when authenticate() and create() are 
invoked they MIGHT already unnecessary depending on the result of retry().

But now that I have refactored the code so that connected() and authenticate() 
are only called from one place and within if branches, I can add CHECKs now.


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15706/#review29414
-----------------------------------------------------------


On Nov. 28, 2013, 8:52 a.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15706/
> -----------------------------------------------------------
> 
> (Updated Nov. 28, 2013, 8:52 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
> 
> 
> Bugs: MESOS-814
>     https://issues.apache.org/jira/browse/MESOS-814
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/tests/group_tests.cpp fb4c2f103ab7436b0fa2d6276c1ec1011a9405f0 
>   src/zookeeper/group.hpp 04068e357cec95457d1f24c166d0b60f86d997d2 
>   src/zookeeper/group.cpp 12c781b29f4300ca8a29660adc3f1e55e03d5d04 
> 
> Diff: https://reviews.apache.org/r/15706/diff/
> 
> 
> Testing
> -------
> 
> make check & mesos-tests.sh 
> --gtest_filter=GroupTest*:ZooKeeperTest*:ZooKeeperMasterContenderDetectorTest*
>  with 100 iterations.
> 
> Added a test to exercise authenticate() and create() retries.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>

Reply via email to