Could anyone review this patch? I have tested it for several days in our testing environment, where it has passed more than 120 test cases, and it works well for us.
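
To make the intended ordering easier to review, here is a minimal standalone sketch of the decision order in zk_queue_pop() after this patch. It is not the real zookeeper.c code: `struct pop_state`, `sketch_queue_pop()` and the `EV_*` values are hypothetical names used only for illustration, plain ints stand in for the uatomic counters, and the unblock flag lives in the struct instead of a global.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds, for illustration only. */
enum ev_type { EV_NONE, EV_LEAVE, EV_QUEUED };

/* Hypothetical stand-in for the globals used by the real code. */
struct pop_state {
	int nr_levents;     /* pending local LEAVE events */
	int notify_blocked; /* > 0 while a BLOCK event is unfinished */
	int nr_queued;      /* events waiting in the shared queue */
	bool is_unblock;    /* true only when called from the unblock path */
};

/* Returns 0 and sets *out on success, -1 if nothing may be popped now. */
static int sketch_queue_pop(struct pop_state *st, enum ev_type *out)
{
	assert(st != NULL && out != NULL);

	/* 1. LEAVE events go first, even while a BLOCK event is unfinished. */
	if (!st->is_unblock && st->nr_levents > 0) {
		st->nr_levents--;
		*out = EV_LEAVE;
		return 0;
	}

	/* 2. While blocked, only the unblock path may pop the shared queue. */
	if (!st->is_unblock && st->notify_blocked > 0)
		return -1;

	/* 3. Nothing queued. */
	if (st->nr_queued == 0)
		return -1;

	st->nr_queued--;
	*out = EV_QUEUED;
	return 0;
}
```

With one pending LEAVE event and one unfinished BLOCK event, the first call returns the LEAVE event, the second call returns -1 because the queue is still blocked, and a call with `is_unblock` set pops the queued event.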

On Mon, Jul 16, 2012 at 1:07 PM, Yunkai Zhang <[email protected]> wrote:
> From: Yunkai Zhang <[email protected]>
>
> V2:
> - fix zk_queue_pop() when it's called by zk_unblock():
>   continue to process block event when is_zk_unblock is True
> -------------------------------------------------------- >8
>
> As cluster request may retry infinitely when some sheeps left, than
> cluster_op_done could not to be called forever, so it will cause cluster
> hang problem.
>
> By giving priority to process LEAVE event when there is unfinished BLOCK
> event, we can fix this issue, but also comply with the rule which is very
> important for distributed system I think:
>
>     All sheeps should process all events in the same order.
>
> Signed-off-by: Yunkai Zhang <[email protected]>
> ---
>  sheep/cluster/zookeeper.c |   17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/sheep/cluster/zookeeper.c b/sheep/cluster/zookeeper.c
> index 7bd20bd..e03fd22 100644
> --- a/sheep/cluster/zookeeper.c
> +++ b/sheep/cluster/zookeeper.c
> @@ -71,6 +71,7 @@ static struct zk_event zk_levents[SD_MAX_NODES];
>  static int nr_zk_levents;
>  static unsigned zk_levent_head;
>  static unsigned zk_levent_tail;
> +static bool is_zk_unblock;
>
>  static void *zk_node_btroot;
>  static struct zk_node *zk_master;
> @@ -239,9 +240,11 @@ static int zk_queue_pop(zhandle_t *zh, struct zk_event *ev)
>  	struct zk_event *lev;
>  	eventfd_t value = 1;
>
> -	/* process leave event */
> -	if (uatomic_read(&zk_notify_blocked) <= 0 &&
> -	     uatomic_read(&nr_zk_levents)) {
> +	/*
> +	 * Continue to process LEAVE event even if
> +	 * we have an unfinished BLOCK event.
> +	 */
> +	if (!is_zk_unblock && uatomic_read(&nr_zk_levents)) {
>  		nr_levents = uatomic_sub_return(&nr_zk_levents, 1) + 1;
>  		dprintf("nr_zk_levents:%d, head:%u\n", nr_levents, zk_levent_head);
>
> @@ -282,6 +285,9 @@ static int zk_queue_pop(zhandle_t *zh, struct zk_event *ev)
>  		return 0;
>  	}
>
> +	if (!is_zk_unblock && uatomic_read(&zk_notify_blocked) > 0)
> +		return -1;
> +
>  	if (zk_queue_empty(zh))
>  		return -1;
>
> @@ -618,7 +624,9 @@ static void zk_unblock(void *msg, size_t msg_len)
>  	struct zk_event ev;
>  	eventfd_t value = 1;
>
> +	is_zk_unblock = 1;
>  	rc = zk_queue_pop(zhandle, &ev);
> +	is_zk_unblock = 0;
>  	assert(rc == 0);
>
>  	ev.type = EVENT_NOTIFY;
> @@ -656,9 +664,6 @@ static void zk_handler(int listen_fd, int events, void *data)
>  	if (ret < 0)
>  		return;
>
> -	if (uatomic_read(&zk_notify_blocked) > 0)
> -		return;
> -
>  	ret = zk_queue_pop(zhandle, &ev);
>  	if (ret < 0)
>  		goto out;
> --
> 1.7.10.4
>

--
Yunkai Zhang
Work at Taobao
--
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog
