The current implementation removes non-live nodes from the set of
nodes to connect to. Getting the list of live nodes requires connecting
to a specific node in the cluster, so that node is necessarily live at
that moment.
Worst case, if only a single node is up in the cluster, the client
ends up with that single node in its list of connection candidates.
For the issue to manifest, that Solr node then has to go down.
From then on, even if other nodes are up, the client only has the
address of a down node and can't connect.

The fix is straightforward.
Nodes initially passed as configuration to the client should never be
removed from the set of candidate nodes to connect to, even if they
are not live.
Other live nodes could be added to that set (and, if we so desire,
removed from it once they are no longer live) to increase resiliency
in case the cluster does have live nodes but none of the initially
configured nodes is live.
The design issue is treating the configured set of nodes to connect to
and the set of live nodes as one thing.
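
To make the separation concrete, here is a minimal sketch of what I have
in mind. It is not the actual client code: the class name CandidateNodes
and its methods are hypothetical, and nodes are represented as plain
address strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical holder for connection candidates; not actual SolrJ code. */
final class CandidateNodes {
    // Nodes passed in the client configuration; never removed, live or not.
    private final Set<String> configured;
    // Extra nodes learned from the cluster's live-node list; replaced on each refresh.
    private Set<String> discovered = new LinkedHashSet<>();

    CandidateNodes(Collection<String> configuredNodes) {
        this.configured = new LinkedHashSet<>(configuredNodes);
    }

    /** Replaces the discovered set with the latest live nodes, leaving configured nodes alone. */
    synchronized void updateLiveNodes(Collection<String> liveNodes) {
        Set<String> next = new LinkedHashSet<>(liveNodes);
        next.removeAll(configured); // already tracked as configured, avoid duplicates
        discovered = next;
    }

    /** Returns configured nodes first, then discovered live nodes. */
    synchronized List<String> candidates() {
        List<String> all = new ArrayList<>(configured);
        all.addAll(discovered);
        return all;
    }
}
```

With this shape, an empty or stale live-node list can shrink only the
discovered part; the configured addresses are always retried.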

Ilan
