[ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417558#comment-15417558
 ] 

Pushkar Raste edited comment on SOLR-9310 at 8/11/16 8:31 PM:
--------------------------------------------------------------

Here is a short description of the bug:
1. A node goes down in SolrCloud.
2. More documents are added (and maybe a commit is issued).
3. The node that was down comes back up.
4. The node gets the fingerprint from the leader, and the versions too.
5. The node calculates the diff for the missing versions and requests updates 
for them.
6. The node applies the updates and then checks its fingerprint against the 
leader's fingerprint.
7. The check in #6 always fails, because the fingerprint of the recovering node 
does not reflect the updates applied during PeerSync.
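
Steps 4 to 7 can be sketched with a toy model. This is only an illustration, not Solr code: `Node`, `toy_fingerprint` and `peer_sync` are hypothetical names, the fingerprint is a simple stand-in for the real one, and the sketch assumes (based on the fix proposed in this comment) that the recovering node buffers the updates it receives, so they never reach the state the fingerprint is computed from.

```python
from dataclasses import dataclass, field


def toy_fingerprint(versions):
    """Stand-in fingerprint: (count, max version, sum of versions)."""
    return (len(versions), max(versions, default=0), sum(versions))


@dataclass
class Node:
    """Hypothetical model of a replica; not a Solr class."""
    index: set = field(default_factory=set)       # versions the fingerprint sees
    buffered: list = field(default_factory=list)  # updates held back while recovering
    buffering: bool = False                       # assumed True after a restart

    def apply(self, version):
        # Assumption: while recovering, updates go to the buffer, not the index.
        if self.buffering:
            self.buffered.append(version)
        else:
            self.index.add(version)

    def fingerprint(self):
        return toy_fingerprint(self.index)


def peer_sync(replica, leader):
    """Steps 4-7: get fingerprint and versions, apply the diff, compare."""
    leader_fp = leader.fingerprint()                 # step 4
    missing = sorted(leader.index - replica.index)   # step 5
    for version in missing:                          # step 6: apply updates
        replica.apply(version)
    return replica.fingerprint() == leader_fp        # steps 6-7: compare


leader = Node(index={1, 2, 3, 4, 5})
restarted = Node(index={1, 2, 3}, buffering=True)
print(peer_sync(restarted, leader))  # False: versions 4 and 5 sit in the buffer
```

In this model a node that is not buffering passes the same check, which is consistent with the report that the failure shows up only after a restart.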


There are two proposed fixes:
* My fix is to not buffer update commands that have the PEER_SYNC flag set. 
I think the hesitation about my patch is that we don't know what other side 
effects it may have. (All test cases are passing, but we might not have a test 
case where my fix would break things.)

* Noble's fix is to check the fingerprint before we start applying updates.
In my opinion this is not really fixing the original issue; what really matters 
is whether the fingerprint matches after applying updates during PeerSync.

We don't know which approach is right. Maybe there is a third, better approach.
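
The first proposal (do not buffer commands carrying the PEER_SYNC flag) might look roughly like the sketch below. It is a toy model rather than the attached patch: `RecoveringNode`, `sync_and_check` and the `skip_buffer_for_peer_sync` switch are hypothetical, the numeric flag value is illustrative, and a plain set comparison stands in for the fingerprint check.

```python
PEER_SYNC = 0x04  # illustrative bit value for the PEER_SYNC flag


class RecoveringNode:
    """Hypothetical recovering replica that buffers incoming updates."""

    def __init__(self, index, skip_buffer_for_peer_sync):
        self.index = set(index)   # versions visible to the final check
        self.buffered = []        # updates held back during recovery
        self.skip_buffer_for_peer_sync = skip_buffer_for_peer_sync

    def apply(self, version, flags=0):
        from_peer_sync = bool(flags & PEER_SYNC)
        if from_peer_sync and self.skip_buffer_for_peer_sync:
            # Proposed behaviour: PeerSync updates bypass the buffer.
            self.index.add(version)
        else:
            # Everything else is still buffered while recovering.
            self.buffered.append(version)


def sync_and_check(node, leader_versions):
    """Apply the missing versions as PeerSync updates, then compare state."""
    for version in sorted(leader_versions - node.index):
        node.apply(version, PEER_SYNC)
    return node.index == leader_versions


leader_versions = {1, 2, 3, 4, 5}
before = RecoveringNode({1, 2, 3}, skip_buffer_for_peer_sync=False)
after = RecoveringNode({1, 2, 3}, skip_buffer_for_peer_sync=True)
print(sync_and_check(before, leader_versions))  # False
print(sync_and_check(after, leader_versions))   # True
```

The sketch only shows that the post-sync check can pass once PeerSync updates skip the buffer; it says nothing about the side effects the comment is worried about.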


> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -----------------------------------------------------------------
>
>                 Key: SOLR-9310
>                 URL: https://issues.apache.org/jira/browse/SOLR-9310
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Pushkar Raste
>            Assignee: Noble Paul
>         Attachments: PeerSync_Experiment.patch, SOLR-9310.patch, 
> SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch
>
>
> I found that PeerSync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the recovering 
> node applies updates. 
> This happens only when the node restarts, and not if the node just misses 
> updates for a reason other than being down.
> Please check the attached patch for the test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
