> I'd need some time to dig into ZK-832, but from the description I don't think it is a blocker.
Agreed, it's not a blocker for the 3.4.7 release. This is not a common scenario. Will wait to see more consensus on the proposed algorithm and then push it in.

-Rakesh

On Fri, Oct 23, 2015 at 2:37 AM, Flavio Junqueira <f...@apache.org> wrote:

> Raul,
>
> I'd need some time to dig into ZK-832, but from the description I don't think it is a blocker. As I understand it, this is happening because the server lost persistent state, and this isn't a common scenario in a replicated deployment. I'm fine with downgrading it from blocker to major/critical.
>
> As for ZK-1029, if we know what the problem is, would it be difficult to provide a patch?
>
> -Flavio
>
> > On 22 Oct 2015, at 21:57, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:
> >
> > On 5 October 2015 at 11:01, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:
> >
> >> On 8 September 2015 at 23:15, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:
> >>
> >>> Hi,
> >>>
> >>> On 23 August 2015 at 14:51, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:
> >>>
> >>>> On 23 August 2015 at 14:44, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> sorry about dropping the ball here.
> >>>>> So going over the unresolved issues, I think these ones would be nice to tackle before cutting an RC:
> >>>>>
> >>>>> * ZOOKEEPER-1833: fix windows build (one sub-task still open: ZOOKEEPER-1868)
> >>>>> * ZOOKEEPER-1029: C client bug in zookeeper_init (if bad hostname is given)
> >>>>>   (no one has this assigned, I'll try to get a patch out by tomorrow)
> >>>>> * ZOOKEEPER-832: Invalid session id causes infinite loop during automatic reconnect
> >>>>>   (I've asked Rakesh if he can wrap it up; if anyone else can help, that would be great)
> >>>>> * ZOOKEEPER-2033: zookeeper follower fails to start after a restart immediately following a new epoch
> >>>>>   (pinged Flavio to get some feedback)
> >>>>>
> >>>>> Everything else can probably be punted for 3.4.8, unless anyone disagrees.
> >>>>
> >>>> One more, which needs to be back-ported from trunk:
> >>>>
> >>>> ZOOKEEPER-1506: Re-try DNS hostname -> IP resolution if node connection fails
> >>>
> >>> There's been some movement in the bug tracker, but ZOOKEEPER-1506 and ZOOKEEPER-832 still need reviews (hopefully tomorrow, unless someone can beat me to it) and I still need to get to ZOOKEEPER-1029.
> >>
> >> So ZOOKEEPER-1506 is done. Still waiting on ZOOKEEPER-832 and I am hoping to finally get to ZOOKEEPER-1029 this week (unless someone beats me to it, which would be much appreciated).
> >
> > Circling back, it turns out that ZOOKEEPER-1029 is actually not the cause of MESOS-2186. The fact that we are not properly checking whether the locks have been initialized before trying to acquire them is still wrong, but ignoring the return codes from pthread_cond_broadcast and pthread_mutex_lock (EINVAL) is not causing the reported crashers.
> >
> > I propose we punt ZOOKEEPER-1029 and ZOOKEEPER-832 for 3.4.8, so that we can keep moving with the release candidate.
> >
> > Any objections?
> >
> > -rgs