Noble:

Yep, but check me on this. So far every "flaw" I've found has been me shooting myself in the foot! Actually, I'll make a new comment on SOLR-6691 so we have a record there.

Thanks!
Erick

On Sun, Dec 7, 2014 at 11:15 PM, Noble Paul <noble.p...@gmail.com> wrote:
> "Ideally, I'd like to just tell a node to rejoin at head and have to do
> nothing else."
>
> The rejoin-at-head is an internal API which is used by other APIs (and not
> exposed to others). So, in that way it is a ready-to-cook API and not a
> ready-to-eat one. So, use it with caution.
>
> The entity that triggers the API should choreograph the entire sequence. Any
> failure in between should be handled properly
>
> On Sat, Dec 6, 2014 at 2:25 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>>
>> Ahhh, I wasn't too clear.
>>
>> Ideally, I'd like to just tell a node to rejoin at head and have to do
>> nothing else. Specifically, not have to tell the old first-in-line to
>> rejoin at tail.
>>
>> If I do _not_ do the second step, i.e. tell the old first-in-line to
>> rejoin at tail and _do_ tell the leader to rejoin at tail, both the
>> old first-in-line and node that rejoined at head's watchers get
>> triggered, and their sequence IDs are identical. So which one wins
>> relies on the fallback comparison of the entire election node which
>> starts with the session ID. Thus my comment that "it all depends on
>> the session ID that's associated".
>>
>> You're right in that there's always a first in line and it's a
>> determinate algorithm. And I can get the behavior I want by doing the
>> step I was omitting, i.e. tell the old first-in-line to rejoin at
>> tail. And to have the behavior I was hoping for (i.e. no need to tell
>> the old first-in-line to rejoin at tail) requires reworking the leader
>> election code, which as you well know isn't something to be approached
>> lightly.
>>
>> And I don't intend to even try that after looking at that code for a
>> while. I mean saving myself the "trouble" of issuing the rejoin at
>> tail isn't even close to worth the risk.....
>>
>> On Fri, Dec 5, 2014 at 9:32 AM, Noble Paul <noble.p...@gmail.com> wrote:
>> > "and that's not
>> > guaranteed currently unless one deletes the old first-in-line."
>> >
>> > Yeah, that is what I said in the final step. ask n1 (the current leader)
>> > to rejoin election.
>> > The rejoin command always makes a node join at TAIL and rejoinAtHead
>> > makes a node join right behind the current HEAD
>> >
>> > "If one doesn't have the former first-in-line go to the tail, "
>> >
>> > I fail to understand this. There will be always a node that is first in
>> > line (as long as there is a line)
>> >
>> > "it all depends on the session ID that's associated"
>> > really? how?
>> >
>> > On Fri, Dec 5, 2014 at 10:43 PM, Erick Erickson
>> > <erickerick...@gmail.com> wrote:
>> >>
>> >> Noble:
>> >>
>> >> Thanks, that'll probably solve my immediate problem, but it still
>> >> seems flawed. I should be able to specify "rejoin at head" on a
>> >> particular node and next time a leader is elected the node I told to
>> >> rejoin at head _should_ be the one that comes up, and that's not
>> >> guaranteed currently unless one deletes the old first-in-line.
>> >>
>> >> If one doesn't have the former first-in-line go to the tail, then
>> >> depending on the sub-ordering the node I told to rejoin at head may or
>> >> may not become leader, it all depends on the session ID that's
>> >> associated. So in general any time anything rejoins at head the second
>> >> call to delete the old first-in-line is required.
>> >>
>> >> That said, if this works it'll solve my immediate problem without
>> >> getting into all the leader re-election code which, as you mentioned
>> >> before, is pretty difficult to get right.
>> >>
>> >> Thanks!
>> >> Erick
>> >>
>> >> On Thu, Dec 4, 2014 at 6:06 PM, Noble Paul <noble.p...@gmail.com>
>> >> wrote:
>> >> > "Let's say you have 5 nodes in n1, n2, n3, n4, n5.
>> >> >
>> >> > n1 is the leader, n2 watches n1 etc.
>> >> >
>> >> > Now I retryElection for n3 with joinAtHead=true. Both n2 and n3 are
>> >> > watching n1. So far, so good.
>> >> >
>> >> > My expectation is that deleting n1 would cause n3 to become leader,
>> >> > but it isn't at all guaranteed. I have a test case illustrating this"
>> >> >
>> >> >
>> >> > deleting n1 is not enough
>> >> >
>> >> > before that, you should ask n2 to rejoin election (joinAtHead=false).
>> >> > This will ensure that n2 is at tail now. Now the order is n1,n3,n4....
>> >> > now ask n1 to rejoin (not at head) and it will join back at tail and
>> >> > n3 will become leader
>> >> >
>> >> > On Wed, Dec 3, 2014 at 7:20 AM, Erick Erickson
>> >> > <erickerick...@gmail.com> wrote:
>> >> >>
>> >> >> Thanks! I somewhat remember seeing that conversation but I confess I
>> >> >> didn't follow it that closely.
>> >> >>
>> >> >> I can't cope with looking at it any more tonight, but I'll check in
>> >> >> the morning. The problem I see is I don't think there's any way,
>> >> >> once a node is re-inserted in the queue, for another node to figure
>> >> >> out that it's not supposed to be the leader if it's first in line
>> >> >> after the nodes are sorted, but I may have missed that.
>> >> >>
>> >> >> Erick
>> >> >>
>> >> >> On Tue, Dec 2, 2014 at 5:34 PM, Jessica Mallet
>> >> >> <mewmewb...@gmail.com> wrote:
>> >> >> > This is reminiscent of my conversation with Noble on this
>> >> >> > SOLR-6095 starting at this comment:
>> >> >> >
>> >> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032386&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032386
>> >> >> >
>> >> >> > Unfortunately I dropped off following it and my memory is a bit
>> >> >> > vague right now.
>> >> >> > Reading from the comments, I think Noble had in mind that the
>> >> >> > tie-breaker can pick the wrong node (n2) to be the leader, but
>> >> >> > then the wrong node will then re-initiate the process to renounce
>> >> >> > leadership and re-join (according to
>> >> >> > https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032619).
>> >> >> >
>> >> >> > I then asked about when that renounce process will happen for n2
>> >> >> > (https://issues.apache.org/jira/browse/SOLR-6095?focusedCommentId=14032659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032659),
>> >> >> > and I'm not sure if that was ever specifically answered. Figuring
>> >> >> > if and how that happens might be key in moving forward?
>> >> >> >
>> >> >> > Jessica
>> >> >> >
>> >> >> > On Tue, Dec 2, 2014 at 4:30 PM, Erick Erickson
>> >> >> > <erickerick...@gmail.com> wrote:
>> >> >> >>
>> >> >> >> I'm particularly interested in Noble and Mark's comments...
>> >> >> >>
>> >> >> >> Let's say you have 5 nodes in n1, n2, n3, n4, n5.
>> >> >> >>
>> >> >> >> n1 is the leader, n2 watches n1 etc.
>> >> >> >>
>> >> >> >> Now I retryElection for n3 with joinAtHead=true. Both n2 and n3
>> >> >> >> are watching n1. So far, so good.
>> >> >> >>
>> >> >> >> My expectation is that deleting n1 would cause n3 to become
>> >> >> >> leader, but it isn't at all guaranteed. I have a test case
>> >> >> >> illustrating this.
>> >> >> >>
>> >> >> >> Incidentally, I think I should get the same result by calling
>> >> >> >> retryElection on n1 with joinAtHead=false; n3 should become the
>> >> >> >> leader.
>> >> >> >>
>> >> >> >> I was working on SOLR-6691 and slowly going crazy since
>> >> >> >> everything I was trying would fail. Basically, to rebalance
>> >> >> >> leaders (thanks Noble for pointing out how far off I was in my
>> >> >> >> original approach) it seemed like it would be sufficient to
>> >> >> >>
>> >> >> >> 1> have the preferred leader retry the election at the head
>> >> >> >> 2> tell the old leader to retry at the tail
>> >> >> >>
>> >> >> >> I expected the old node that was watching the leader to figure
>> >> >> >> out that it wasn't really next in line and re-add itself to the
>> >> >> >> end.
>> >> >> >>
>> >> >> >> But things went all to hell in a handbasket when I wrote a
>> >> >> >> harness that exercised it, and it drove me a bit nuts. Especially
>> >> >> >> since it would fail one way one time and another way the next.
>> >> >> >> And it'd even succeed upon occasion....
>> >> >> >>
>> >> >> >> I figured out that my expectations weren't being met. Due to the
>> >> >> >> way leader queues are sorted, if the two sequence numbers are
>> >> >> >> identical then the tie-breaker does NOT pick the last node to
>> >> >> >> join at head. It picks the one with the lowest (highest? didn't
>> >> >> >> track that down entirely) session ID. Either way, sometimes it
>> >> >> >> picks the node newly added at the head and sometimes it picks the
>> >> >> >> old one.
>> >> >> >>
>> >> >> >> If I _am_ on the right path, then I propose the following:
>> >> >> >> 1> I'll raise a new JIRA for leader sequence sorting and take it
>> >> >> >> on. I'm not quite sure how to fix it, the ideas I have are fairly
>> >> >> >> hacky.
>> >> >> >>
>> >> >> >> 2> I'll back out the REBALANCELEADER stuff. Currently it'll
>> >> >> >> break things badly and we're too close to 5.0 to try to do
>> >> >> >> anything about <1> IMO.
>> >> >> >> This just means that I'll comment out the collections API
>> >> >> >> call in the code and update the ref guide.
>> >> >> >>
>> >> >> >> 3> When <1> is resolved, I'll put REBALANCELEADERs back in, but
>> >> >> >> that won't be before 5.1
>> >> >> >>
>> >> >> >> Erick
>> >> >> >>
>> >> >> >> ---------------------------------------------------------------------
>> >> >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >> >> >> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >> >
>> >> > --
>> >> > -----------------------------------------------------
>> >> > Noble Paul
>> >
>> > --
>> > -----------------------------------------------------
>> > Noble Paul
>
> --
> -----------------------------------------------------
> Noble Paul
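
For the record, here is a minimal sketch of the tie-break discussed in this thread. It is written in Python for brevity and is not Solr's election code. It assumes election node names of the form `<sessionId>-<node>-n_<sequence>`, ascending order on the sequence number, and a fallback to comparing the whole name when two sequence numbers are equal. The session IDs, sequence numbers and helper names (`make_node`, `election_order`, `next_leader`) are all made up for illustration, and since the thread leaves open whether the lowest or highest session ID wins, the direction used here is an assumption too.

```python
def seq_of(node):
    """Sequence number after the final '-n_' in a (hypothetical) node name."""
    return int(node.rsplit("-n_", 1)[1])


def election_order(nodes):
    """Sort by sequence number; ties fall back to comparing the whole name,
    which starts with the session ID (assumed ascending)."""
    return sorted(nodes, key=lambda name: (seq_of(name), name))


def make_node(session_id, core, seq):
    """Build an illustrative election node name."""
    return f"{session_id}-{core}-n_{seq:010d}"


def queue_after_rejoin_at_head(n2_session, n3_session):
    """n1 leads; n3 has rejoined at head and so shares n2's sequence number."""
    return [
        make_node("0001", "n1", 0),
        make_node(n2_session, "n2", 1),
        make_node(n3_session, "n3", 1),  # same sequence number as n2
        make_node("0004", "n4", 3),
        make_node("0005", "n5", 4),
    ]


def next_leader(queue, old_leader_core="n1"):
    """First in line once the old leader's node is gone."""
    remaining = [n for n in queue if f"-{old_leader_core}-" not in n]
    return election_order(remaining)[0]


# Only n2's session ID differs between these two runs.
wins_a = next_leader(queue_after_rejoin_at_head(n2_session="0009", n3_session="0003"))
wins_b = next_leader(queue_after_rejoin_at_head(n2_session="0002", n3_session="0003"))

# Choreographed version: n2 is sent to the tail first, so the tie never arises.
queue = queue_after_rejoin_at_head(n2_session="0002", n3_session="0003")
queue = [n for n in queue if "-n2-" not in n] + [make_node("0002", "n2", 5)]
wins_c = next_leader(queue)
```

With these made-up IDs, `wins_a` is n3's node and `wins_b` is n2's, even though nothing but n2's session ID changed. `wins_c` is n3's node because n2 went to the tail before the old leader left, which is the sequence Noble describes.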