[jira] [Commented] (SOLR-10006) Cannot do a full sync (fetchindex) if the replica can't open a searcher

Erick Erickson (JIRA) Tue, 14 Feb 2017 07:46:27 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865990#comment-15865990
 ]


Erick Erickson commented on SOLR-10006:
---------------------------------------

Mike:

As for the admin UI, I agree that's a separate issue. No good reason to let 
this JIRA sprawl.

As for the force option, I'm a little fuzzy on when it would apply. I completely 
agree that adding expense to fetchindex (and by implication polling?) to handle 
this case isn't a good tradeoff.

Are you thinking that the force option would bypass the need for an open 
searcher? Well, if it works ;)

The scenario here is that somehow the index became corrupt _and_ the instance 
was restarted. There's no way to recover the replica without bouncing the server 
again, and usually that means going into the file system and deleting the 
data directory, which for some installations is prohibitively expensive and/or 
requires filling out forms and the like ;) If we can force the fetchindex to 
happen on a core that has never successfully opened a searcher, that's the end 
point I'm looking for. The rest is gravy.

Thanks!
Erick

> Cannot do a full sync (fetchindex) if the replica can't open a searcher
> -----------------------------------------------------------------------
>
>                 Key: SOLR-10006
>                 URL: https://issues.apache.org/jira/browse/SOLR-10006
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 5.3.1, 6.4
>            Reporter: Erick Erickson
>         Attachments: SOLR-10006.patch, SOLR-10006.patch, solr.log, solr.log
>
>
> Doing a full sync or fetchindex requires an open searcher and if you can't 
> open the searcher those operations fail.
> For discussion. I've seen a situation in the field where a replica's index 
> became corrupt. When the node was restarted, the replica tried to do a full 
> sync but failed because the core couldn't open a searcher. The replica went 
> into an endless sync/fail/sync cycle.
> I couldn't reproduce that exact scenario, but it's easy enough to get into a 
> similar situation. Create a 2x2 collection and index some docs. Then stop one 
> of the instances and go in and remove a couple of segments files and restart.
> The replica stays in the "down" state, fine so far.
> Manually issue a fetchindex. That fails because the replica can't open a 
> searcher. Sure, issuing a fetchindex is abusive.... but I think it's the same 
> underlying issue: why should we care about the state of a replica's current 
> index when we're going to completely replace it anyway?
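
For anyone trying the reproduction steps above, here is a rough sketch of how the manual fetchindex request might be built against the replication handler. The host names, ports, and core names are hypothetical placeholders, and passing masterUrl with a trailing /replication follows the usual solrconfig.xml convention; treat that as an assumption and check it against your Solr version.

```python
from urllib.parse import urlencode


def fetchindex_url(replica_core_url, leader_core_url):
    """Build a replication-handler URL asking the replica core to pull
    the full index from the leader core.

    Both arguments are base core URLs such as
    http://host:port/solr/core_name (hypothetical values here).
    """
    params = urlencode({
        "command": "fetchindex",
        # Assumption: the source is given as the leader's replication handler.
        "masterUrl": leader_core_url.rstrip("/") + "/replication",
        "wt": "json",
    })
    return replica_core_url.rstrip("/") + "/replication?" + params


if __name__ == "__main__":
    # Hypothetical cores from a local 2x2 test collection.
    url = fetchindex_url(
        "http://localhost:7574/solr/test_shard1_replica2",
        "http://localhost:8983/solr/test_shard1_replica1",
    )
    print(url)
```

The function only builds the URL; sending it with curl or urllib.request.urlopen is what triggers the fetch, and on a core that cannot open a searcher that request is the one that fails as described in this issue.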



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
