[ 
https://issues.apache.org/jira/browse/SOLR-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791438#action_12791438
 ] 

Yonik Seeley commented on SOLR-1277:
------------------------------------

While our designs shouldn't preclude load-based node selection, I don't think 
we should tackle it now - it's fraught with peril.

We should allow the configuration of "capacity" for a node (or host?) and 
eventually implement a load balancing mechanism that takes such capacity into 
account. If one node has half the capacity of another, it will be sent half 
as many requests. This type of static balancing is easier to predict and test.
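
A minimal sketch of what such static, capacity-based balancing could look 
like. The class name CapacityBalancer and its methods are hypothetical (they 
are not part of Solr or of the attached patch), and the selection scheme used 
here, smooth weighted round-robin, is only one possible choice.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical static selector: nodes are picked in proportion to configured capacity. */
final class CapacityBalancer {
    private final List<String> nodes = new ArrayList<>();
    private final List<Integer> capacities = new ArrayList<>();
    private final List<Integer> current = new ArrayList<>(); // running weights
    private int total = 0;

    /** Registers a node with a configured, positive capacity. */
    synchronized void addNode(String node, int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        nodes.add(node);
        capacities.add(capacity);
        current.add(0);
        total += capacity;
    }

    /** Returns the next node using smooth weighted round-robin. */
    synchronized String next() {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("no nodes configured");
        }
        int best = 0;
        for (int i = 0; i < nodes.size(); i++) {
            // Every node gains its capacity each round; the largest running weight wins.
            current.set(i, current.get(i) + capacities.get(i));
            if (current.get(i) > current.get(best)) {
                best = i;
            }
        }
        // The winner pays back the total, so picks stay proportional over time.
        current.set(best, current.get(best) - total);
        return nodes.get(best);
    }
}
```

Because the sequence is deterministic for a given set of capacities, a test 
can assert exact request counts per node, which is the predictability 
argument made above.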

The other issue with updating statistics is the write cost on ZooKeeper - we 
may not want to do it by default, and if we do, we wouldn't want to do it at 
a high frequency.
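
As an illustration only, statistics publishing could be off by default and 
rate limited when enabled. ThrottledStatsPublisher is a hypothetical name, and 
the Consumer passed in stands in for whatever would actually perform the 
ZooKeeper write; no ZooKeeper API is used in this sketch.

```java
import java.util.function.Consumer;

/** Hypothetical publisher: disabled unless asked for, and never writes more often than the interval. */
final class ThrottledStatsPublisher {
    private final boolean enabled;
    private final long minIntervalMillis;
    private final Consumer<String> writer; // stand-in for the ZooKeeper write
    private long lastWriteMillis;
    private boolean hasWritten = false;

    ThrottledStatsPublisher(boolean enabled, long minIntervalMillis, Consumer<String> writer) {
        this.enabled = enabled;
        this.minIntervalMillis = minIntervalMillis;
        this.writer = writer;
    }

    /** Writes the stats if publishing is enabled and the interval has elapsed; returns true if written. */
    synchronized boolean maybePublish(String stats, long nowMillis) {
        if (!enabled) {
            return false;
        }
        if (hasWritten && nowMillis - lastWriteMillis < minIntervalMillis) {
            return false; // too soon, skip this update
        }
        writer.accept(stats);
        lastWriteMillis = nowMillis;
        hasWritten = true;
        return true;
    }
}
```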

Some other considerations when choosing nodes for distributed search:
 - the same node should be used for a particular shard for the multiple phases 
of a distributed search, both for better consistency between phases, and better 
caching.
 - ZooKeeper could be used to take a node out of service (and other nodes 
should immediately stop making requests to that node), but each node also needs 
to be able to detect the failure of another node and retry on a different node 
independently of ZooKeeper.
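
The two considerations above could be combined roughly as follows. This is a 
sketch under assumptions: StickyShardRouter is a hypothetical per-request 
helper, the shard-to-replica map is whatever cluster layout the node last saw, 
and a failed call is modelled as a RuntimeException thrown by the phase.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical router: one node per shard for all phases of a request, with local failover. */
final class StickyShardRouter {
    private final Map<String, List<String>> replicasByShard;    // last known layout
    private final Map<String, String> pinned = new HashMap<>(); // shard -> node used so far
    private final Set<String> locallyFailed = new HashSet<>();  // failures seen by this node

    StickyShardRouter(Map<String, List<String>> replicasByShard) {
        this.replicasByShard = replicasByShard;
    }

    /** Runs one phase against a shard, reusing the pinned node unless it fails. */
    <T> T call(String shard, Function<String, T> phase) {
        RuntimeException last = null;
        while (true) {
            String node = pick(shard);
            if (node == null) {
                throw new IllegalStateException("no live replica for " + shard, last);
            }
            try {
                return phase.apply(node);
            } catch (RuntimeException e) {
                // Failure detected locally: remember it and retry on another replica.
                locallyFailed.add(node);
                pinned.remove(shard);
                last = e;
            }
        }
    }

    /** Returns the pinned node if still usable, else pins the first replica not known to have failed. */
    private String pick(String shard) {
        String node = pinned.get(shard);
        if (node != null && !locallyFailed.contains(node)) {
            return node;
        }
        List<String> replicas = replicasByShard.getOrDefault(shard, Collections.<String>emptyList());
        for (String candidate : replicas) {
            if (!locallyFailed.contains(candidate)) {
                pinned.put(shard, candidate);
                return candidate;
            }
        }
        return null;
    }
}
```

One router instance per distributed request keeps later phases on the same 
node as the first phase, and the retry path never consults ZooKeeper.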

Everything (search traffic) should keep working when disconnected from 
ZooKeeper, based on the last cluster configuration seen.
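
A rough sketch of the "last configuration seen" idea: LastSeenClusterState and 
its callback names are hypothetical, and in a real implementation they would 
presumably be driven by ZooKeeper watch and session events, which are left out 
here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for the most recent cluster layout; reads never depend on a live connection. */
final class LastSeenClusterState {
    private final AtomicReference<Map<String, List<String>>> lastSeen =
            new AtomicReference<>(Collections.<String, List<String>>emptyMap());
    private volatile boolean connected = false;

    /** Called whenever a fresh layout (shard -> replica nodes) is read from the coordinator. */
    void onClusterState(Map<String, List<String>> layout) {
        lastSeen.set(Collections.unmodifiableMap(new HashMap<>(layout)));
        connected = true;
    }

    /** Called on disconnect; the cached layout is deliberately kept. */
    void onDisconnected() {
        connected = false;
    }

    boolean isConnected() {
        return connected;
    }

    /** Search-time lookup: served from the cache whether or not we are connected. */
    List<String> replicasFor(String shard) {
        return lastSeen.get().getOrDefault(shard, Collections.<String>emptyList());
    }
}
```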


> Implement a Solr specific naming service (using Zookeeper)
> ----------------------------------------------------------
>
>                 Key: SOLR-1277
>                 URL: https://issues.apache.org/jira/browse/SOLR-1277
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: log4j-1.2.15.jar, SOLR-1277.patch, SOLR-1277.patch, 
> SOLR-1277.patch, SOLR-1277.patch, zookeeper-3.2.1.jar
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The goal is to give Solr server clusters self-healing attributes
> where if a server fails, indexing and searching don't stop and
> all of the partitions remain searchable. For configuration, the
> goal is the ability to centrally deploy a new configuration
> without servers going offline.
> We can start with basic failover and go from there?
> Features:
> * Automatic failover (i.e. when a server fails, clients stop
> trying to index to or search it)
> * Centralized configuration management (i.e. new solrconfig.xml
> or schema.xml propagates to a live Solr cluster)
> * Optionally allow shards of a partition to be moved to another
> server (i.e. if a server gets hot, move the hot segments out to
> cooler servers). Ideally we'd have a way to detect hot segments
> and move them seamlessly. With NRT this becomes somewhat more
> difficult but not impossible?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
