[jira] [Commented] (SOLR-10006) Cannot do a full sync (fetchindex) if the replica can't open a searcher

Erick Erickson (JIRA) Sat, 21 Jan 2017 20:37:02 -0800

    [ https://issues.apache.org/jira/browse/SOLR-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833285#comment-15833285 ]


Erick Erickson commented on SOLR-10006:
---------------------------------------

This doesn't seem to handle my case. The problem is that the core is not there 
after restart because of a core init failure. Once the failure is recorded in 
coreInitFailures, I don't think we can do much of anything with the core. It 
certainly doesn't recover on its own.

As a side note, CoreAdminOperation has several operations that silently swallow 
exceptions, so in the situation I described, where the core has an init failure, 
issuing a REQUESTRECOVERY fails silently. I'll raise a separate JIRA about that.
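
For anyone trying to confirm the state described above, here is a minimal Python sketch (not part of the patch). It only builds the CoreAdmin URLs for REQUESTRECOVERY and STATUS and looks up a core in the initFailures section of a STATUS response that has already been parsed from JSON. The base URL, the core name, and the sample response are hypothetical, and the assumption that STATUS lists failed cores under an "initFailures" key should be checked against the Solr version in use.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, core):
    """Build a CoreAdmin request URL for the given action and core."""
    query = urlencode({"action": action, "core": core, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def init_failure_for(status_response, core):
    """Return the recorded init failure for a core, or None if there is none.

    Assumes the parsed STATUS response carries an "initFailures" mapping
    from core name to error text.
    """
    return status_response.get("initFailures", {}).get(core)


# Hypothetical node and core name, for illustration only.
base = "http://localhost:8983/solr"
core = "test_shard1_replica1"

recovery_url = core_admin_url(base, "REQUESTRECOVERY", core)
status_url = core_admin_url(base, "STATUS", core)

# Hand-written stand-in for a parsed STATUS response, not real Solr output.
sample_status = {
    "initFailures": {"test_shard1_replica1": "hypothetical error text"},
    "status": {},
}

print(recovery_url)
print(status_url)
print(init_failure_for(sample_status, core))
```

Since REQUESTRECOVERY can return without reporting the problem, checking STATUS for an init failure first is one way to tell whether the request had any chance of doing something.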

> Cannot do a full sync (fetchindex) if the replica can't open a searcher
> -----------------------------------------------------------------------
>
>                 Key: SOLR-10006
>                 URL: https://issues.apache.org/jira/browse/SOLR-10006
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 5.3.1, 6.4
>            Reporter: Erick Erickson
>         Attachments: SOLR-10006.patch
>
>
> Doing a full sync or fetchindex requires an open searcher, and if you can't 
> open the searcher, those operations fail.
> For discussion. I've seen a situation in the field where a replica's index 
> became corrupt. When the node was restarted, the replica tried to do a full 
> sync but failed because the core couldn't open a searcher. The replica went 
> into an endless sync/fail/sync cycle.
> I couldn't reproduce that exact scenario, but it's easy enough to get into a 
> similar situation. Create a 2x2 collection and index some docs. Then stop one 
> of the instances, remove a couple of segment files, and restart.
> The replica stays in the "down" state, fine so far.
> Manually issue a fetchindex. That fails because the replica can't open a 
> searcher. Sure, issuing a fetchindex is abusive.... but I think it's the same 
> underlying issue: why should we care about the state of a replica's current 
> index when we're going to completely replace it anyway?
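
The manual fetchindex step in the description can be sketched as follows. This is a minimal Python helper that only builds the request URL for the replica's replication handler. The host, ports, and core names are hypothetical, and passing the source as a "masterUrl" parameter is an assumption about the replication handler in the affected versions.

```python
from urllib.parse import urlencode


def fetchindex_url(replica_core_url, source_replication_url=None):
    """Build a fetchindex request URL for a replica's replication handler.

    replica_core_url: core URL of the replica that should pull the index.
    source_replication_url: optional replication URL of the core to copy
        from, sent as "masterUrl" (an assumption, see the note above).
    """
    params = {"command": "fetchindex"}
    if source_replication_url is not None:
        params["masterUrl"] = source_replication_url
    return f"{replica_core_url.rstrip('/')}/replication?{urlencode(params)}"


# Hypothetical core URLs, for illustration only.
replica = "http://localhost:8983/solr/test_shard1_replica2"
leader = "http://localhost:7574/solr/test_shard1_replica1/replication"

plain_url = fetchindex_url(replica)
explicit_url = fetchindex_url(replica, leader)

print(plain_url)
print(explicit_url)
```

With the corrupted replica from the steps above, sending this request is the point where the failure shows up, because the handler cannot open a searcher on the existing index.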



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

