This is an automatically generated e-mail. To reply, visit:

Fix it, then Ship it!

Thanks for taking this on, Neil!
As we found out, this code is not the easiest to reason through.
I left some issues where we may be able to make the state assertions easier
to follow for the next set of readers.

src/tests/group_tests.cpp (lines 445 - 446)

    Can we add a short comment as to the state we're trying to achieve here?
    I think it will help readers of the test.

src/tests/group_tests.cpp (lines 451 - 452)

    Maybe a comment explaining that we're triggering the timeout? Or is this
    too self-explanatory?

src/zookeeper/group.cpp (lines 128 - 137)

    Not yours:
    Can we add a comment that we don't need to clean up the `delay` `Timer`s
    because they won't be invoked if libprocess can no longer get a
    `ProcessReference` to this Actor?
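
    To make the reasoning concrete, here is a minimal sketch of the same idea in plain C++. It is an analogy only: `Actor`, `TimerQueue`, and `fireAll` are hypothetical names, and `std::weak_ptr` stands in for the reference lookup; none of this is the libprocess API.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical actor with a timeout handler (stand-in, not GroupProcess).
struct Actor {
  int fired = 0;
  void timedout() { ++fired; }
};

// Hypothetical timer queue. Each timer holds only a weak reference to
// its target, so an outstanding timer never keeps the actor alive.
struct TimerQueue {
  std::vector<std::function<void()>> pending;
  int dropped = 0;

  void delay(const std::shared_ptr<Actor>& actor) {
    std::weak_ptr<Actor> weak = actor;
    pending.push_back([this, weak]() {
      if (std::shared_ptr<Actor> strong = weak.lock()) {
        strong->timedout();  // Target is still alive: deliver.
      } else {
        ++dropped;           // Target is gone: the timer is a no-op.
      }
    });
  }

  // Fire every outstanding timer once.
  void fireAll() {
    for (auto& callback : pending) {
      callback();
    }
    pending.clear();
  }
};
```

    Under that assumption, a timer left behind after the actor goes away is harmless: it fires, fails to obtain a reference, and is dropped.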

src/zookeeper/group.cpp (line 154)

    Should we s/promptly/within the sessionTimeout/ to be more clear?

src/zookeeper/group.cpp (lines 154 - 159)

    Some places we refer to `ZK` as in ZooKeeper. Other places we refer to the
    handle `zk` as in the variable.
    This introduces a third, `Zk`. Can we keep the code consistent with just the
    2 names above?
    We could say either "the ZK handle" or "the `zk` handle".
    Here and elsewhere in your patch.

src/zookeeper/group.cpp (lines 365 - 366)

    Can we explain that a timer always exists during a fresh connection, and a 
    Maybe we can point to a top-level comment where you explain the DNS
    staleness problem.

src/zookeeper/group.cpp (lines 367 - 368)

    Comment along these lines:
    Once we are connected, we will be notified of a disconnect through the
    `reconnecting` callback, at which point we will re-establish a timer (per
    the DNS staleness issue).
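
    For readers who want the timer lifecycle spelled out, here is a rough sketch of the states as I understand them. `GroupSketch`, `openHandle`, `startTimer`, and the counters are hypothetical names for illustration, not the code in this patch, and the real timer is scheduled through libprocess rather than tracked with an id.

```cpp
#include <cassert>

// Hypothetical model of the connection states; not the real GroupProcess.
enum class State { CONNECTING, CONNECTED, RECONNECTING };

class GroupSketch {
public:
  GroupSketch() { openHandle(); }

  // The client reports that the session is established.
  void connected() {
    state = State::CONNECTED;
    timerPending = false;  // While connected, no timer is outstanding.
  }

  // The client reports a disconnect. It retries internally, but only
  // against already-resolved IPs, so we re-establish a timer.
  void reconnecting() {
    // Expect a single callback per connected -> disconnected transition.
    assert(state == State::CONNECTED);
    state = State::RECONNECTING;
    startTimer();
  }

  // The session timeout elapsed for the timer with the given id.
  void timedout(int id) {
    if (!timerPending || id != timerId) {
      return;  // Cancelled or stale timer: ignore.
    }
    openHandle();  // Still not connected: replace the handle.
  }

  State state;
  bool timerPending = false;
  int timerId = 0;
  int handlesOpened = 0;

private:
  // A fresh connection attempt always starts with a timer.
  void openHandle() {
    ++handlesOpened;
    state = State::CONNECTING;
    startTimer();
  }

  void startTimer() {
    timerPending = true;
    ++timerId;
  }
};
```

    The invariant a comment could state is visible here: a timer is pending exactly when we are not connected.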

src/zookeeper/group.cpp (line 464)

    Comment along the lines of:
    This assertion tests that we only receive a single `reconnecting` callback
    for the `connected -> disconnected` state transition in the ZooKeeper client.

- Joris Van Remoortere

On Jan. 30, 2016, 1:16 a.m., Neil Conway wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42988/
> -----------------------------------------------------------
> (Updated Jan. 30, 2016, 1:16 a.m.)
> Review request for mesos and Joris Van Remoortere.
> Bugs: MESOS-4546
>     https://issues.apache.org/jira/browse/MESOS-4546
> Repository: mesos
> Description
> -------
> The previous implementation of `GroupProcess` tried to establish a single
> ZooKeeper connection on startup, but didn't attempt to retry. ZooKeeper will
> retry internally, but it only retries by attempting to reconnect to a list of
> previously resolved IPs; it doesn't attempt to re-resolve those IPs to pick up
> updates to DNS configuration. Because DNS configuration can be quite dynamic,
> we now close the current Zk handle and open a new one if we've seen a
> successful `zookeeper_init` but haven't been connected within the ZooKeeper
> session timeout.
> Diffs
> -----
>   src/tests/group_tests.cpp 77349465e0163c8aa6bed6deefe3f98efb442f3d 
>   src/zookeeper/group.hpp cf82fec290a2fa9bec122539c2eb0f12b45c2fb2 
>   src/zookeeper/group.cpp 2ae3193e0e138c90b205d45400d80e80853e1b99 
>   src/zookeeper/zookeeper.cpp 3c4fdad972dcd1728c52a05970646c713dcf98c8 
> Diff: https://reviews.apache.org/r/42988/diff/
> Testing
> -------
> make check, on both OSX and Arch Linux. Manually configured a situation in 
> which the Mesos agent uses stale DNS information in a loop: validated that 
> without the patch, we don't pick up DNS changes, whereas with the patch, we do.
> Also added a new unit test. Verified that the test fails w/o this patch 
> applied and passes deterministically (`gtest_repeat=100`) with the patch 
> applied.
> Thanks,
> Neil Conway
