I've got a setup like yours -- lots of cores and replicas, but no need for
shards -- and here's what I've found so far:

1. ZooKeeper's footprint is tiny. I'd expect network I/O to be the biggest
concern.

2. I think this is more about high availability than performance. I've been
experimenting with taking down parts of my setup to see what happens. When
ZooKeeper goes down, the Solr instances still serve queries; updates and
replication, however, stop. Since I want to make frequent updates, this is a
big concern for me.
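For anyone who wants to repeat the drill, it's roughly the following (host,
port, and core name below are placeholders, not my actual setup):

```shell
# Failure drill, sketched with placeholder host/core names.
SOLR="http://localhost:8983/solr/collection1"
echo "probing $SOLR"

# 1. Stop ZooKeeper (the script path depends on your install):
#      zkServer.sh stop
# 2. Queries keep working:
#      curl -s "$SOLR/select?q=*:*&wt=json"
# 3. Updates fail until ZooKeeper comes back:
#      curl -s "$SOLR/update?commit=true" \
#           -H 'Content-Type: application/json' -d '[{"id":"x"}]'
```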

3. On EC2, I launch a server that is configured to register itself with my
ZooKeeper box on launch. Once the new servers are ready, I add them to my
load balancer. In theory, ZooKeeper could help balance them further, but
right now I find those distributed queries too slow. Since the load balancer
is already distributing the load, I add the parameter "distrib=false" to my
queries, which forces each request to stay on the box the load balancer
chose.
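Concretely, the queries I send through the balancer look roughly like this
(again, host and core names are placeholders):

```shell
# Build a query that stays on the node the load balancer picked;
# distrib=false stops SolrCloud from fanning the request out to
# other replicas. "localhost:8983" and "collection1" are placeholders.
NODE="http://localhost:8983/solr/collection1"
QUERY="$NODE/select?q=*:*&wt=json&distrib=false"
echo "$QUERY"

# Against a live node:
#   curl -s "$QUERY"
```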

4. This is interesting. I started down the path of wanting to maintain a
master, but I've moved toward a system where all of my update requests go
through my load balancer. Since ZooKeeper dynamically elects a leader, the
update reaches the leader no matter which box receives it. This is very nice
for me because I want all my Solr instances to be identical.
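So an update can go to any node behind the balancer; SolrCloud forwards it to
the current leader. A sketch (host, core, and field names are placeholders):

```shell
# POST an update to whichever node the load balancer picks; the
# receiving node forwards it to the elected leader, so every node
# is a valid target. All names here are placeholders.
BALANCER="http://localhost:8983/solr/collection1"
DOC='[{"id":"doc1","title_s":"hello"}]'
echo "POST $BALANCER/update?commit=true"
echo "$DOC"

# Live version:
#   curl -s -X POST -H 'Content-Type: application/json' \
#        -d "$DOC" "$BALANCER/update?commit=true"
```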

Since there's not a lot of documentation on this yet, I hope other people
share their findings, too.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/some-general-solr-4-0-questions-tp4009267p4009286.html
Sent from the Solr - User mailing list archive at Nabble.com.
