Re: Unable to set preferred leader

2020-06-23 Thread Erick Erickson
First of all, unless you have a lot of shards, worrying about which replica is
the leader is not worth the effort. That code was put in to deal with a
situation where there were hundreds of shards and, when the system was
cold-started, all of their leaders could end up on the same node.

The extra work a leader does is actually quite minimal, so it only matters if
you have a lot of leaders on the same node. I wouldn’t start to worry until
there are on the order of 20-30, and then I’d measure to be sure. FWIW, the
extra work happens during indexing, when the leader has to distribute the
updates to followers.

But to your question, I have no idea. I’d say “look at the logs”, but you’ve
already done that. What happens is the preferred leader gets inserted in the
overseer_election queue watching the current leader, then the current leader
is moved to the end of the election queue. This _should_ trigger the watch on
the preferred leader so that it takes over. I wouldn’t necessarily expect
error messages in the logs BTW; you’d need to look at the INFO level messages
for the preferred leader, the Overseer, and the current leader, in that order.
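
To make the intended sequence concrete, here is a toy model of the hand-off
described above. It is only a sketch: it uses plain Python lists rather than
ZooKeeper, the function name `rebalance` is made up, and the replica names
other than core_node9 are hypothetical. It does not reproduce what
RebalanceLeaders.java actually does internally.

```python
def rebalance(leader, waiting, preferred):
    """Toy model of a preferred-leader hand-off.

    leader    -- name of the replica that currently leads the shard
    waiting   -- replicas queued behind the leader, in election order
    preferred -- replica that carries the preferredLeader property

    Returns (new_leader, new_waiting).
    """
    if preferred == leader:
        return leader, list(waiting)  # nothing to do
    if preferred not in waiting:
        raise ValueError(f"{preferred} is not in the election queue")

    # Step 1: the preferred leader is re-queued directly behind the
    # current leader, so it is the one watching the leader.
    staged = [preferred] + [r for r in waiting if r != preferred]

    # Step 2: the current leader gives up leadership and rejoins at the
    # end of the election queue.
    staged.append(leader)

    # Step 3: the watch fires and the head of the queue takes over.
    new_leader = staged.pop(0)
    return new_leader, staged


if __name__ == "__main__":
    print(rebalance("core_node3", ["core_node5", "core_node9"], "core_node9"))
```

If the real cluster ends up with the old leader still in charge, the hand-off
presumably stalled somewhere between steps 2 and 3, which is where the INFO
messages would be worth reading.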

The other thing that’d be interesting is where the preferred leader is in the
leader election queue for that shard after it’s all done. It actually
shouldn’t be in the election queue at all on success.
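
One way to check that position is to list the children of the shard's election
node in ZooKeeper and sort them by sequence number. The sketch below only does
the sorting. It assumes the child names have the shape
`<sessionId>-<coreNodeName>-n_<sequence>`, which is an assumption about the
naming rather than something stated in this thread, and the helper names are
made up. Fetching the children (with whatever ZooKeeper client you use) is left
out.

```python
import re

# Assumed shape of an election node name: <sessionId>-<coreNodeName>-n_<sequence>
_ELECTION_NODE = re.compile(r"^(?P<session>[^-]+)-(?P<replica>.+)-n_(?P<seq>\d+)$")


def election_order(children):
    """Return replica names ordered by their election sequence number."""
    parsed = []
    for name in children:
        match = _ELECTION_NODE.match(name)
        if match is None:
            raise ValueError(f"unexpected election node name: {name!r}")
        parsed.append((int(match.group("seq")), match.group("replica")))
    return [replica for _, replica in sorted(parsed)]


def queue_position(children, replica):
    """Return the 0-based position of a replica in the queue, or None if absent."""
    order = election_order(children)
    return order.index(replica) if replica in order else None


if __name__ == "__main__":
    # Hypothetical child names, for illustration only.
    children = [
        "1-core_node9-n_0000000012",
        "1-core_node3-n_0000000010",
        "2-core_node5-n_0000000011",
    ]
    print(election_order(children))
    print(queue_position(children, "core_node9"))
```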

Not much help I know. The code is in RebalanceLeaders.java along with some
explanatory notes.

Best,
Erick


> On Jun 23, 2020, at 3:43 AM, Karl Stoney wrote:
> 
> Hey,
> We have a SolrCloud collection with 8 replicas, and one of those replicas
> has `property.preferredleader: true` set. However, when we perform a
> `REBALANCELEADERS` we get:
> 
> ```
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 62268
>   },
>   "Summary": {
>     "Failure": "Not all active replicas with preferredLeader property are leaders"
>   },
>   "failures": {
>     "shard1": {
>       "status": "failed",
>       "msg": "Could not change leder for slice shard1 to core_node9"
>     }
>   }
> }
> ```
> 
> There is nothing in the solr logs on any of the nodes to indicate the reason 
> for the failure.
> 
> What I have noticed is that 4 of the nodes briefly go orange in the GUI
> (i.e. “down”), and for a moment 9 of them go yellow (i.e. “recovering”),
> before all becoming active again with the same (incorrect) leader.
> 
> We use the same model on 4 other collections to set the preferred leader to a 
> particular replica and they all work fine.
> 
> Does anyone have any ideas?
> 
> Thanks
> Karl
> Unless expressly stated otherwise in this email, this e-mail is sent on 
> behalf of Auto Trader Limited Registered Office: 1 Tony Wilson Place, 
> Manchester, Lancashire, M15 4FN (Registered in England No. 03909628). Auto 
> Trader Limited is part of the Auto Trader Group Plc group. This email and any 
> files transmitted with it are confidential and may be legally privileged, and 
> intended solely for the use of the individual or entity to whom they are 
> addressed. If you have received this email in error please notify the sender. 
> This email message has been swept for the presence of computer viruses.



Unable to set preferred leader

2020-06-23 Thread Karl Stoney
Hey,
We have a SolrCloud collection with 8 replicas, and one of those replicas has
`property.preferredleader: true` set. However, when we perform a
`REBALANCELEADERS` we get:

```
{
  "responseHeader": {
    "status": 0,
    "QTime": 62268
  },
  "Summary": {
    "Failure": "Not all active replicas with preferredLeader property are leaders"
  },
  "failures": {
    "shard1": {
      "status": "failed",
      "msg": "Could not change leder for slice shard1 to core_node9"
    }
  }
}
```
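
Note that `responseHeader.status` is 0 even though the rebalance failed, so a
script has to look at the `failures` section rather than the status code. A
minimal sketch of that check follows. The function name `failed_shards` is made
up, and it assumes the response always has the shape shown in this message.

```python
import json

# Response body copied from this message.
SAMPLE = """
{
  "responseHeader": {"status": 0, "QTime": 62268},
  "Summary": {
    "Failure": "Not all active replicas with preferredLeader property are leaders"
  },
  "failures": {
    "shard1": {
      "status": "failed",
      "msg": "Could not change leder for slice shard1 to core_node9"
    }
  }
}
"""


def failed_shards(response_text):
    """Map each failed shard to its message; empty dict if nothing failed."""
    body = json.loads(response_text)
    failures = body.get("failures", {})
    return {
        shard: info.get("msg", "")
        for shard, info in failures.items()
        if info.get("status") == "failed"
    }


if __name__ == "__main__":
    for shard, msg in failed_shards(SAMPLE).items():
        print(f"{shard}: {msg}")
```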

There is nothing in the solr logs on any of the nodes to indicate the reason 
for the failure.

What I have noticed is that 4 of the nodes briefly go orange in the GUI (i.e.
“down”), and for a moment 9 of them go yellow (i.e. “recovering”), before
all becoming active again with the same (incorrect) leader.

We use the same model on 4 other collections to set the preferred leader to a 
particular replica and they all work fine.
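
For reference, the two Collections API calls involved can be built roughly like
this. It is a sketch only: the base URL and the collection name `mycoll` are
hypothetical placeholders, the helper `collections_url` is made up, and nothing
here sends a request.

```python
from urllib.parse import urlencode


def collections_url(base_url, action, params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical Solr base URL

    # Mark one replica as the preferred leader for its shard.
    print(collections_url(base, "ADDREPLICAPROP", {
        "collection": "mycoll",  # hypothetical collection name
        "shard": "shard1",
        "replica": "core_node9",
        "property": "preferredLeader",
        "property.value": "true",
    }))

    # Ask Solr to move leadership to the preferred leaders.
    print(collections_url(base, "REBALANCELEADERS", {"collection": "mycoll"}))
```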

Does anyone have any ideas?

Thanks
Karl