Jilani, you said "My team needs that option if at all possible". My first response would be "why?" Why do they want to limit the number of documents per shard? What's the rationale or use case behind that requirement? Once we understand that, we can explain why it's a bad idea. :)
I suspect I'm reiterating Jack's comments, but why are you sharding in the first place? 8 shards split across 4 machines, so 2 shards per machine. But you have 2 replicas of each shard, so you have 16 Solr cores, and hence 4 Solr cores per machine? Since you need an instance of all 8 shards to be up in order to service requests, you can get away with everything on 2 machines, but you still have 8 Solr cores to manage in order to have a fully functioning system.

What's the benefit of sharding in this scenario? Sharding adds complexity, so you normally only add sharding if your search times are too slow without it.

You need to work out how much disk space the whole 20m docs is going to take (maybe index 1m or 5m docs and extrapolate, if they are all roughly equivalent in size), then split it across 4 machines. But as Erick points out, you need to allow for merges to occur, so whatever the size of the "static" data set, you need to allow for double that from time to time while background merges are happening.

On 7 May 2015 at 16:05, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> A leader is also a replica - SolrCloud is not a master/slave architecture.
> Any replica can be elected to be the leader, but that is only temporary and
> can change over time.
>
> You can place multiple shards on a single node, but was that really your
> intention?
>
> Generally, number of nodes equals number of shards times the replication
> factor. But then divided by shards per node if you do place more than one
> shard per node.
>
> -- Jack Krupansky
>
> On Thu, May 7, 2015 at 1:29 AM, Jilani Shaik <jilani24...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Is it possible to restrict number of documents per shard in Solr cloud?
> >
> > Lets say we have Solr cloud with 4 nodes, and on each node we have one
> > leader and one replica. Like wise total we have 8 shards that includes
> > replicas. Now I need to index my documents in such a way that each shard
> > will have only 5 million documents. Total documents in Solr cloud should
> > be 20 million documents.
> >
> >
> > Thanks,
> > Jilani
> >
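
P.S. To make the sizing arithmetic in this thread concrete, here is a rough Python sketch. It is only back-of-the-envelope maths, not anything Solr provides: the function names are mine, the 2.5 GB sample index size is an invented figure you would replace with your own measurement, and counting every replica's copy towards disk usage is my assumption.

```python
import math


def nodes_needed(num_shards, replication_factor, cores_per_node=1):
    """Jack's rule of thumb: shards times replication factor,
    divided by how many cores you place on each node."""
    total_cores = num_shards * replication_factor
    return math.ceil(total_cores / cores_per_node)


def disk_per_node_gb(sample_docs, sample_index_gb, total_docs,
                     replication_factor, num_nodes, merge_headroom=2.0):
    """Extrapolate index size from a sample, then allow headroom for merges.

    Assumes documents are roughly equal in size and that every replica
    holds a full copy of its shard (my assumption, not stated in the thread).
    """
    one_copy_gb = sample_index_gb * total_docs / sample_docs
    all_copies_gb = one_copy_gb * replication_factor
    # Background merges can temporarily need about double the static size.
    return all_copies_gb * merge_headroom / num_nodes


# 8 shards, 2 copies of each, 4 cores per machine
print(nodes_needed(8, 2, cores_per_node=4))  # 4

# Hypothetical measurement: 1m docs indexed to 2.5 GB, scaled to 20m docs
print(disk_per_node_gb(1_000_000, 2.5, 20_000_000,
                       replication_factor=2, num_nodes=4))  # 50.0
```

The point is that the number to plan around is disk per machine including merge headroom, not a fixed document count per shard.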