I think Mr. Erickson summarized the issue of hardware sizing quite well in the following article:
http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Best regards,
Primož

From: Henrik Ossipoff Hansen <h...@entertainment-trading.com>
To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
Date: 08.10.2013 14:59
Subject: Hardware dimension for new SolrCloud cluster

We're in the process of moving onto SolrCloud and have gotten to the point where we are considering how to do our hardware setup. We're limited to VMs running on our server cluster and storage system, so buying new physical servers is out of the question - the question is how we should dimension the new VMs.

Our document set is somewhat small: about 1.2 million orders (rising, of course), 75k products (divided into 5 countries, each of which will be its own collection/core), and some millions of customers. In our current master/slave setup we only index the products, with each country taking up about 35 MB of disk space. The indexing frequency is more or less 8 index updates per hour (mostly not full data, though, but atomic updates with new stock data, new prices, etc.). Our upcoming order and customer indexes, however, will more or less receive updates "on the fly" as they happen (soft commit), and we expect the same to be the case for products in the near future.

- For hardware, it's down to 1 or 2 cores - the current master runs with 2 cores.
- RAM - currently our master runs with only 6 GB.
- How much heap space should we allocate for max heap?

We currently plan on this setup:

- 1 machine for a simple load balancer
- 4 VMs in total for the Solr machines themselves (for both leaders and replicas; just one replica per shard is enough for our use case)
- A quorum of 3 ZKs

Question is - is this machine setup enough? And how exactly do we dimension the Solr machines? Any help, pointers or resources will be much appreciated :)

Thank you!
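For the "on the fly" updates with soft commits mentioned above, visibility and durability are typically tuned via the autoCommit and autoSoftCommit settings in each collection's solrconfig.xml. A minimal sketch follows - the interval values are illustrative assumptions, not recommendations, and should be tuned against your actual update rate and heap:

```xml
<!-- solrconfig.xml fragment (sketch; intervals are assumed values) -->
<updateHandler class="solr.DirectUpdateHandler2">

  <!-- Hard commit: flushes to stable storage and truncates the
       transaction log. openSearcher=false keeps it cheap, since
       visibility is handled by the soft commit below. -->
  <autoCommit>
    <maxTime>60000</maxTime>          <!-- every 60 s (assumed) -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes recent updates (e.g. stock and price
       changes) searchable without a full flush to disk. -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>           <!-- every 5 s (assumed) -->
  </autoSoftCommit>

</updateHandler>
```

Shorter soft-commit intervals mean fresher results but more frequent searcher reopens (and cache warm-ups), which directly affects how much heap the nodes need, so this setting is worth deciding before fixing the VM sizes.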