I've seen single nodes handle 10M docs using 64G of heap (using Zing).
I've seen 300M docs in 12G of memory. There's no way to tell without testing.

See: 
https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
for a methodology to answer the question with _your_ data and _your_
query pattern...
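The back-of-the-envelope arithmetic behind that methodology can be sketched roughly as follows. All numbers here are hypothetical placeholders (corpus size, growth, and especially the per-shard capacity must come from your own load tests, as the article describes):

```python
import math

# Hypothetical inputs -- measure these with YOUR data and YOUR query pattern.
total_docs = 50_000_000        # expected corpus size today
yearly_growth = 1_000_000      # documents added per year
years_of_headroom = 3          # how far ahead to plan

# Maximum docs a single shard handled while still meeting your latency
# target under expected load (found empirically on real hardware).
max_docs_per_shard = 15_000_000

projected = total_docs + yearly_growth * years_of_headroom
shards = math.ceil(projected / max_docs_per_shard)
print(shards)  # 4 with these placeholder numbers
```

Leave margin on `max_docs_per_shard` for distributed-query overhead before committing to a shard count.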

Best,
Erick

On Thu, Mar 23, 2017 at 5:08 AM, Emir Arnautovic
<emir.arnauto...@sematext.com> wrote:
> Hi Vrindavda,
>
> It is hard to tell anything without testing, and without details on what
> is indexed and how, how it will be queried, and what the
> latency/throughput requirements are.
>
> 25M or 12.5M documents per shard might be too much if you have strict
> latency requirements, but testing is the only way to tell. I would suggest
> that you set up an index with a single shard and see how many documents you
> can put into it while still meeting latency requirements under the expected
> load (if you plan to have 2 replicas, each handles roughly half of the
> expected load). Leave some room for distributed query overhead. Once you
> have that number, you can work out how many shards you need.
>
> HTH,
> Emir
>
>
>
> On 23.03.2017 09:46, vrindavda wrote:
>>
>> Hello,
>>
>> My production index is expected to contain 50 million documents, with
>> around 1 million added every year.
>>
>> Should I go for 64GB RAM (4 Shards /4 Replicas) Or 128GB (2 Shards/ 2
>> Replicas) ?
>>
>> Please correct me if the above assumptions are wrong. What parameters
>> should I consider?
>>
>>
>> Thank you,
>> Vrinda Davda
>>
>>
>>
>
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
