"and that's not
guaranteed currently unless one deletes the old first-in-line."

Yeah, that is what I said in the final step: ask n1 (the current leader) to
rejoin the election.
The rejoin command always makes a node join at the TAIL, and rejoinAtHead
makes a node join right behind the current HEAD.
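
To make the ordering concrete, here is a toy Python model of that sequence.
It is only a sketch, not the LeaderElector code: the class, the method names
and the session values are made up for illustration, and the tie-break on
session ID is an assumption taken from Erick's description, not something
verified against the real sort.

```python
class ElectionQueue:
    """Toy model of a leader-election line; NOT Solr's LeaderElector."""

    def __init__(self, sessions):
        self.sessions = dict(sessions)  # node name -> session ID (made-up values)
        self.seqs = {}                  # node name -> election sequence number
        for name in self.sessions:
            self.rejoin(name)           # initial joins all go to the tail

    def line(self):
        # Assumed ordering: sequence number first, session ID breaks ties.
        return sorted(self.seqs, key=lambda n: (self.seqs[n], self.sessions[n]))

    def rejoin(self, name, at_head=False):
        self.seqs.pop(name, None)       # leave the line before rejoining
        order = self.line()
        if at_head and len(order) >= 2:
            # Join right behind the current head: share the sequence number
            # of whoever is first in line after the leader.
            self.seqs[name] = self.seqs[order[1]]
        else:
            # Join at the tail: one past the highest sequence number in use.
            self.seqs[name] = max(self.seqs.values(), default=-1) + 1

    def delete(self, name):
        del self.seqs[name]


def delete_leader_only(sessions):
    """n3 rejoins at head, then n1 is simply deleted."""
    q = ElectionQueue(sessions)
    q.rejoin("n3", at_head=True)
    q.delete("n1")
    return q.line()[0]


def two_step(sessions):
    """n3 rejoins at head, n2 goes to the tail, then n1 goes to the tail."""
    q = ElectionQueue(sessions)
    q.rejoin("n3", at_head=True)
    q.rejoin("n2")
    q.rejoin("n1")
    return q.line()


if __name__ == "__main__":
    for n3_session in (30, 15):
        sessions = {"n1": 10, "n2": 20, "n3": n3_session, "n4": 40, "n5": 50}
        print(n3_session, delete_leader_only(sessions), two_step(sessions))
```

In this model, deleting n1 alone gives n2 or n3 depending on the made-up
session values, while sending n2 and then n1 to the tail leaves n3 at the
front either way.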


"If one doesn't have the former first-in-line go to the tail, "

I fail to understand this. There will always be a node that is first in
line (as long as there is a line).

"it all depends on the session ID that's associated"
Really? How?

On Fri, Dec 5, 2014 at 10:43 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Noble:
>
> Thanks, that'll probably solve my immediate problem, but it still
> seems flawed. I should be able to specify "rejoin at head" on a
> particular node and next time a leader is elected the node I told to
> rejoin at head _should_ be the one that comes up, and that's not
> guaranteed currently unless one deletes the old first-in-line.
>
> If one doesn't have the former first-in-line go to the tail, then
> depending on the sub-ordering the node I told to rejoin at head may or
> may not become leader, it all depends on the session ID that's
> associated. So in general any time anything rejoins at head the second
> call to delete the old first-in-line is required.
>
> That said, if this works it'll solve my immediate problem without
> getting into all the leader re-election code which, as you mentioned
> before, is pretty difficult to get right.
>
> Thanks!
> Erick
>
> On Thu, Dec 4, 2014 at 6:06 PM, Noble Paul <noble.p...@gmail.com> wrote:
> > "Let's say you have 5 nodes in n1, n2, n3, n4, n5.
> >
> > n1 is the leader, n2 watches n1 etc.
> >
> > Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
> > watching n1. So far, so good.
> >
> > My expectation is that deleting n1 would cause n3 to become leader,
> > but it isn't at all guaranteed. I have a test case illustrating this"
> >
> >
> > Deleting n1 is not enough.
> >
> > Before that, you should ask n2 to rejoin election (joinAtHead=false).
> > This will ensure that n2 is at tail now. Now the order is n1,n3,n4....
> > Now ask n1 to rejoin (not at head) and it will join back at tail and
> > n3 will become leader.
> >
> > On Wed, Dec 3, 2014 at 7:20 AM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >>
> >> Thanks! I somewhat remember seeing that conversation but I confess I
> >> didn't follow it that closely.
> >>
> >> I can't cope with looking at it any more tonight, but I'll check in
> >> the morning. The problem I see is I don't think there's any way, once
> >> a node is re-inserted in the queue, for another node to figure out
> >> that it's not supposed to be the leader if it's first in line after
> >> the nodes are sorted, but I may have missed that.
> >>
> >> Erick
> >>
> >> On Tue, Dec 2, 2014 at 5:34 PM, Jessica Mallet <mewmewb...@gmail.com>
> >> wrote:
> >> > This is reminiscent of my conversation with Noble on SOLR-6095,
> >> > starting at this comment:
> >> >
> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032386&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032386
> >> >
> >> > Unfortunately I dropped off following it and my memory is a bit vague
> >> > right now. Reading from the comments, I think Noble had in mind that
> >> > the tie-breaker can pick the wrong node (n2) to be the leader, but the
> >> > wrong node will then re-initiate the process to renounce leadership
> >> > and re-join (according to
> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032619
> >> > ).
> >> >
> >> > I then asked about when that renounce process will happen for n2
> >> > (
> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032659
> >> > ), and I'm not sure if that was ever specifically answered. Figuring
> >> > out if and how that happens might be key in moving forward?
> >> >
> >> > Jessica
> >> >
> >> > On Tue, Dec 2, 2014 at 4:30 PM, Erick Erickson <erickerick...@gmail.com>
> >> > wrote:
> >> >>
> >> >> I'm particularly interested in Noble and Mark's comments...
> >> >>
> >> >> Let's say you have 5 nodes in n1, n2, n3, n4, n5.
> >> >>
> >> >> n1 is the leader, n2 watches n1 etc.
> >> >>
> >> >> Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
> >> >> watching n1. So far, so good.
> >> >>
> >> >> My expectation is that deleting n1 would cause n3 to become leader,
> >> >> but it isn't at all guaranteed. I have a test case illustrating this.
> >> >>
> >> >> Incidentally, I think I should get the same result by calling
> >> >> retryElection on n1 with joinAtHead=false; n3 should become the
> >> >> leader.
> >> >>
> >> >> I was working on SOLR-6691 and slowly going crazy since everything I
> >> >> was trying would fail. Basically, to rebalance leaders (thanks Noble
> >> >> for pointing out how far off I was in my original approach) it seemed
> >> >> like it would be sufficient to
> >> >>
> >> >> 1> have the preferred leader retry the election at the head
> >> >> 2> tell the old leader to retry at the tail
> >> >>
> >> >> I expected the old node that was watching the leader to figure out
> >> >> that it wasn't really next in line and re-add itself to the end.
> >> >>
> >> >> But things went all to hell in a handbasket when I wrote a harness
> >> >> that exercised it, and it drove me a bit nuts. Especially since it
> >> >> would fail one way one time and another way the next. And it'd even
> >> >> succeed upon occasion....
> >> >>
> >> >> I figured out that my expectations weren't being met. Due to the way
> >> >> leader queues are sorted, if the two sequence numbers are identical
> >> >> then the tie-breaker does NOT pick the last node to join at head.  It
> >> >> picks the one with the lowest (highest? didn't track that down
> >> >> entirely) session ID. Either way, sometimes it picks the node newly
> >> >> added at the head and sometimes it picks the old one.
> >> >>
> >> >> If I _am_ on the right path, then I propose the following:
> >> >> 1> I'll raise a new JIRA for leader sequence sorting and take it on.
> >> >> I'm not quite sure how to fix it; the ideas I have are fairly hacky.
> >> >>
> >> >> 2> I'll back out the REBALANCELEADER stuff. Currently it'll break
> >> >> things badly and we're too close to 5.0 to try to do anything about
> >> >> <1> IMO. This just means that I'll comment out the collections API
> >> >> call in the code and update the ref guide.
> >> >>
> >> >> 3> When <1> is resolved, I'll put REBALANCELEADERs back in, but that
> >> >> won't be before 5.1
> >> >>
> >> >> Erick
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >> >>
> >> >
> >>
> >>
> >
> >
> >
> > --
> > -----------------------------------------------------
> > Noble Paul
>
>
>


-- 
-----------------------------------------------------
Noble Paul
