Hi Neo,
Shard size determines query latency, so you split your index when queries
become too slow. Distributed search comes with some overhead, so oversharding
is not the way to go either. There is no hard rule for the best numbers,
but here are some thoughts on how to approach this:
http://www.od-bits.com/2018/01/solrelasticsearch-capacity-planning.html
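
As a rough illustration of that approach (a sketch only, not a rule): benchmark a single shard at a few sizes, pick the largest size that still meets your latency target, and derive the shard count from the total document count. The function names, the latency target, and all numbers below are made up for the example; real figures have to come from your own tests with your own data and queries.

```python
import math

def largest_shard_within_target(benchmarks, target_ms):
    """Return the largest benchmarked docs-per-shard value whose measured
    latency meets the target, or None if no benchmarked size qualifies."""
    ok = [docs for docs, latency_ms in benchmarks if latency_ms <= target_ms]
    return max(ok) if ok else None

def shards_needed(total_docs, docs_per_shard):
    """Smallest shard count that keeps every shard at or below docs_per_shard."""
    return max(1, math.ceil(total_docs / docs_per_shard))

# Hypothetical measurements: (documents in one shard, p95 query latency in ms).
benchmarks = [
    (10_000_000, 40),
    (25_000_000, 90),
    (50_000_000, 210),
    (100_000_000, 520),
]

limit = largest_shard_within_target(benchmarks, target_ms=100)  # assumed target
count = shards_needed(120_000_000, limit)  # assumed total document count
print(limit, count)
```

The sketch deliberately stops at the smallest shard count that meets the target rather than going higher, since every extra shard adds distributed-search overhead.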

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 11 Apr 2018, at 12:15, neotorand <neotor...@gmail.com> wrote:
> 
> Hi Team
> First of all, I take this opportunity to thank you all for creating a
> beautiful place where people can explore, learn, and debate.
> 
> I have been on my knees for a couple of days trying to decide on this.
> 
> When I create a SolrCloud ecosystem, I need to decide on the number of
> shards and collections.
> What are the best practices for making these decisions?
> 
> I believe heterogeneous data can be indexed into the same collection, and I
> can have multiple shards to partition the index. So what is the need for a
> second collection? Yes, when the collection size grows I should look at more
> collections. What exactly is that size? What KPI drives the decision to have
> more collections? Any pointers or links to best practices?
> 
> When should I go for multiple shards?
> When the shard size grows, right? What is that size, and how do I benchmark?
> 
> I am sorry if this question has already been asked, but I have googled the
> whole ecospace: Quora, Stack Overflow, Lucid.
> 
> Regards
> Neo
> 
> 
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
