[ 
https://issues.apache.org/jira/browse/SOLR-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated SOLR-9555:
----------------------------
    Attachment: SOLR-9555-WIP.patch

Here's a work-in-progress patch. The rough outline of the changes is:
- When publishing an active state, also publish an Active into the LIR znode 
and put a watch on that. If the leader overwrites this as down, start recovery.
-- I tried to add a check here to ensure that nodes can only publish 
themselves as active, but I got the logic tangled. I'm not sure it's 
necessary for correctness, but it felt like a good safeguard.
- Leader no longer needs to send a request recovery command directly to the 
replica. The ZK watch should handle this.
- Leader no longer publishes the node's state. The node will update this itself 
when it starts the recovery process.
-- This means that there is a window, after the leader has encountered 
the first error and before the node puts itself into recovery, during which the 
leader may try to send additional updates and get additional errors. We might 
need a flag to mark the node as dead locally, or something like that.
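
As a rough illustration of the outline above, here is a minimal in-memory 
sketch of the intended flow. It is not code from the patch and uses no Solr or 
ZooKeeper classes: LirNode, Replica, Leader and the locallyDown set are all 
hypothetical names, and the one-shot watch is only modelled on how ZooKeeper 
watches behave.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical in-memory model; none of these types exist in Solr or ZooKeeper.
final class LirSketch {
  enum State { ACTIVE, DOWN, RECOVERING }

  // Stand-in for the per-replica LIR znode, with ZooKeeper-like one-shot watches.
  static final class LirNode {
    private State data;
    private final List<Consumer<State>> watches = new ArrayList<>();

    // Write new data, then fire and discard every watch registered so far.
    synchronized void setData(State newData) {
      data = newData;
      List<Consumer<State>> fired = new ArrayList<>(watches);
      watches.clear(); // one-shot: a watcher must re-register to keep listening
      for (Consumer<State> w : fired) {
        w.accept(newData);
      }
    }

    synchronized void watch(Consumer<State> w) { watches.add(w); }

    synchronized State getData() { return data; }
  }

  static final class Replica {
    final String name;
    private final LirNode lir;
    State published; // the state this replica has published for itself

    Replica(String name, LirNode lir) {
      this.name = name;
      this.lir = lir;
    }

    // Publish active, mirror it into the LIR node, and watch that node.
    void publishActive() {
      published = State.ACTIVE;
      lir.setData(State.ACTIVE);
      lir.watch(this::onLirChange);
    }

    // The replica, not the leader, moves itself into recovery.
    private void onLirChange(State newData) {
      if (newData == State.DOWN) {
        published = State.RECOVERING;
      } else {
        lir.watch(this::onLirChange); // re-arm the watch for later changes
      }
    }
  }

  static final class Leader {
    // Assumed local flag so the leader stops sending to a replica it marked down.
    private final Set<String> locallyDown = new HashSet<>();

    // Returns true if the update was delivered, false if it failed or was skipped.
    boolean forward(Replica replica, LirNode lir, boolean sendFails) {
      if (locallyDown.contains(replica.name)) {
        return false; // skip: avoids piling up further errors
      }
      if (sendFails) {
        locallyDown.add(replica.name);
        lir.setData(State.DOWN); // only the LIR node is written; the watch does the rest
        return false;
      }
      return true;
    }
  }

  public static void main(String[] args) {
    LirNode lir = new LirNode();
    Replica replica = new Replica("core_node2", lir);
    Leader leader = new Leader();
    replica.publishActive();
    leader.forward(replica, lir, true); // simulated failed update
    System.out.println(replica.published); // prints RECOVERING
  }
}
```

In this model the leader only ever writes the LIR node. The replica learns 
about the failure through its watch and updates its own state. Because the 
sketch is synchronous, the window between the first error and the start of 
recovery never actually opens here; the locallyDown set only shows where such 
a flag could sit.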


I've got about 5 test failures here, and I put an {{@Ignore}} on the 
TestLeaderInitiatedRecoveryThread class because the whole internals of that are 
changing. I think I somehow broke leader election with this change set, so any 
help would be appreciated.

> Leader incorrectly publishes state for replica when it puts replica into LIR.
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-9555
>                 URL: https://issues.apache.org/jira/browse/SOLR-9555
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Alan Woodward
>         Attachments: SOLR-9555-WIP.patch
>
>
> See 
> https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/17888/consoleFull 
> for an example



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

