[ 
https://issues.apache.org/jira/browse/SOLR-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914110#comment-13914110
 ] 

Noble Paul commented on SOLR-5476:
----------------------------------

bq. I also think we need to add some defensive programming. If we try and remove 
a queue item that is not there, we should assume another overseer already ate 
it rather than letting the Overseer thread die.


We are trying to kill an overseer safely.

What is being done is to wait for STATE_UPDATE_DELAY+100 ms and then start the 
next Overseer. I agree that this is not 100% foolproof (but I have not seen 
OverseerRolesTest fail for quite some time).
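
For illustration, the delayed handover described above could look roughly like the following. This is a minimal sketch, not Solr's actual code: the class `OverseerHandoff`, the method `handOff`, the two `Runnable` hooks, and the constant values are all hypothetical stand-ins. It only shows the ordering: stop the current Overseer, wait one state-update cycle plus a 100 ms margin, then start the next one.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a delayed Overseer handover; not Solr's real implementation. */
class OverseerHandoff {
    // Assumed value for illustration only; the real STATE_UPDATE_DELAY is defined by Solr.
    static final long STATE_UPDATE_DELAY_MS = 1500;
    // The extra margin mentioned in the comment above (STATE_UPDATE_DELAY + 100 ms).
    static final long SAFETY_MARGIN_MS = 100;

    /**
     * Stops the current overseer, waits stateUpdateDelayMs + SAFETY_MARGIN_MS,
     * then starts the next one. Returns the elapsed time in milliseconds.
     */
    static long handOff(Runnable stopCurrent, Runnable startNext, long stateUpdateDelayMs) {
        long begin = System.nanoTime();
        stopCurrent.run();
        try {
            // Give the outgoing overseer time to finish its last state-update cycle.
            TimeUnit.MILLISECONDS.sleep(stateUpdateDelayMs + SAFETY_MARGIN_MS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and abort the handover instead of starting early.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("handover interrupted", e);
        }
        startNext.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
    }
}
```

A fixed delay like this only makes an overlap between two Overseers unlikely; it does not rule one out, which is why the approach is described as not 100% foolproof.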

We can definitely add more checks to ensure a proper handover of the overseer 
role. I didn't quite follow the above suggestion. Please elaborate; I'll 
be glad to implement it.

I think this should be a separate issue rather than being mixed up with the 
"Roles" issue.
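
One possible reading of the quoted "defensive programming" suggestion is sketched below, under stated assumptions. The in-memory `DefensiveQueue`, its `NoNodeException`, and the method names are hypothetical stand-ins for a ZooKeeper-backed work queue; the point is only that a missing item is treated as "already consumed by another overseer" instead of an error that kills the processing thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a distributed work queue; not Solr's queue class. */
class DefensiveQueue {
    /** Stand-in for a "node does not exist" error from the backing store. */
    static class NoNodeException extends Exception {
        NoNodeException(String id) { super("no such queue item: " + id); }
    }

    private final Map<String, String> items = new ConcurrentHashMap<>();

    void offer(String id, String payload) {
        items.put(id, payload);
    }

    /** Strict removal: fails if the item has already been removed. */
    void remove(String id) throws NoNodeException {
        if (items.remove(id) == null) {
            throw new NoNodeException(id);
        }
    }

    /**
     * Tolerant removal: a missing item is assumed to have been consumed by
     * another overseer, so the caller can carry on instead of dying.
     * Returns true if this call removed the item, false if it was already gone.
     */
    boolean removeIfPresent(String id) {
        try {
            remove(id);
            return true;
        } catch (NoNodeException e) {
            // Assumption: another overseer already ate this item; skip it.
            return false;
        }
    }
}
```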

bq. This is why I said on the phone that I would prefer we mark it as 
experimental 

Where do I need to sign?

BTW, I would say the users of this feature are very unlikely to hit this issue.

> Overseer Role for nodes
> -----------------------
>
>                 Key: SOLR-5476
>                 URL: https://issues.apache.org/jira/browse/SOLR-5476
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>             Fix For: 4.7, 5.0
>
>         Attachments: SOLR-5476.patch, SOLR-5476.patch, SOLR-5476.patch, 
> SOLR-5476.patch, SOLR-5476.patch, SOLR-5476.patch, SOLR-5476.patch
>
>
> In a very large cluster the Overseer is likely to be overloaded. If the same 
> node is also serving a few other shards, the Overseer can get slowed 
> down by GC pauses or simply by too much work. If the cluster is 
> really large, it is possible to dedicate high-end hardware to Overseers.
> It works as a new collection admin command:
> command=addrole&role=overseer&node=192.168.1.5:8983_solr
> This results in the creation of an entry in /roles.json in ZK, which would 
> look like the following:
> {code:javascript}
> {
> "overseer" : ["192.168.1.5:8983_solr"]
> }
> {code}
> If a node is designated for overseer, it gets preference over others when 
> overseer election takes place. If no designated servers are available, another 
> random node becomes the Overseer.
> Later on, if one of the designated nodes is brought up, it takes over 
> the Overseer role from the current Overseer.
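
The preference rule in the issue description above can be sketched as follows. This is a simplified illustration, not how Solr's ZooKeeper-based election is implemented: the class `OverseerPreference`, the method `pickOverseer`, and the plain lists standing in for the live-node set and the `"overseer"` array of /roles.json are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;

/** Hypothetical sketch of "designated nodes win, otherwise any live node". */
class OverseerPreference {
    /**
     * liveNodes:  nodes currently up, e.g. "192.168.1.5:8983_solr".
     * designated: the "overseer" list as it would appear in /roles.json.
     * Returns a designated live node if one exists, else a random live node,
     * else empty when no node is live.
     */
    static Optional<String> pickOverseer(List<String> liveNodes, List<String> designated, Random rnd) {
        if (liveNodes.isEmpty()) {
            return Optional.empty();
        }
        // Keep only the live nodes that were given the overseer role.
        List<String> preferred = new ArrayList<>(liveNodes);
        preferred.retainAll(designated);
        // Fall back to the full live set when no designated node is up.
        List<String> pool = preferred.isEmpty() ? liveNodes : preferred;
        return Optional.of(pool.get(rnd.nextInt(pool.size())));
    }
}
```

Re-running the same selection when a designated node comes up mirrors the takeover described in the last sentence of the description: the designated node enters the preferred pool and displaces the non-designated Overseer.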



