The heap should be at least as big as your largest single shard,
irrespective of which index it belongs to or whether it is a primary or a
replica.
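The heap-sizing rule above can be sketched as a quick back-of-the-envelope check. This is a rough illustration, not an official formula; the helper name and the shard sizes are made-up assumptions (real sizes would come from the `_cat/shards` API).

```python
# Rough sketch: the JVM heap on a node should cover the single largest
# shard it hosts -- not the sum across all indexes or replicas.
# All shard sizes below are made-up illustrative numbers (in GB).

def min_heap_gb(shard_sizes_gb):
    """Rough lower bound on per-node JVM heap: the largest shard hosted."""
    return max(shard_sizes_gb)

# Four shards on one node: primaries (p) and replicas (r) of two indexes.
shards_on_node = {
    ("index-one", 0, "p"): 12.0,
    ("index-one", 3, "r"): 11.5,
    ("index-two", 1, "p"): 20.0,
    ("index-two", 4, "r"): 19.0,
}

print(min_heap_gb(shards_on_node.values()))  # 20.0 -- not the 62.5 GB sum
```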

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com


On 3 July 2014 05:50, mrno42 <[email protected]> wrote:

> When you say "- do not let a shard grow bigger than your JVM heap (this is
> really a rough estimation) so segment merging will work flawlessly"
>
> are we counting all the primary and replica shards of all indexes on that
> node? So for example, say we had two indexes on a 10-node cluster. Each
> index has 10 shards and 1 replica (40 shards total in the cluster).
>
> So per node, the heap size should be larger than:
>
> 1 shard for first index
> 1 shard for replica of first index
> 1 shard for second index
> 1 shard for replica of second index
>
> the four shards combined?
>
> Thanks again for your advice
>
>
>
> On Saturday, August 10, 2013 6:50:27 AM UTC-7, Jörg Prante wrote:
>>
>> Your concern is a single shard getting "too big". If you use a 64-bit
>> JVM and mmapfs (quite common), you can open even the largest files. So
>> from this point of view, there is no real limit to the shard size a node
>> can handle.
>>
>> Another question is throughput performance with large shard files. For
>> example, the more mixed read/write operations are in the workload, the
>> smaller the Lucene indexes should be, to allow the JVM/OS a better load
>> distribution.
>>
>> For selecting a total number of shards and shard size, here are some
>> general rules of thumb:
>>
>> - do not select fewer shards than the total number of nodes you will add
>> to the cluster. Each node should hold at least one shard.
>>
>> - do not let a shard grow bigger than your JVM heap (this is really a
>> rough estimate), so segment merging will work flawlessly
>>
>> - if you want fast recovery, or if you want to move shards around (not a
>> common case), the smaller a shard is, the faster the operation completes
>>
>> In case you are worried about shards getting out of bounds, you can
>> reindex with a higher number of shards (having the _source enabled is
>> always an advantage for reindexing) with your favorite custom tool.
>> Reindexing can take significant time, and may not be an option if you can't
>> stop indexing.
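Jörg's rules of thumb above can be turned into a quick sanity check. A minimal sketch; the function name and all numbers are illustrative assumptions, not Elasticsearch APIs or recommendations.

```python
# Rough sanity check of the shard-sizing rules of thumb quoted above.

def check_shard_plan(num_shards, num_nodes, max_shard_gb, heap_gb):
    """Return warnings for any rule of thumb the plan violates."""
    warnings = []
    # Rule: at least one shard per node, or some nodes hold nothing.
    if num_shards < num_nodes:
        warnings.append("fewer shards than nodes: some nodes will sit idle")
    # Rule (rough estimate): keep each shard no bigger than the JVM heap
    # so segment merging has headroom.
    if max_shard_gb > heap_gb:
        warnings.append("largest shard exceeds JVM heap: merging may struggle")
    return warnings

# Example: 10 shards on 10 nodes, largest shard 20 GB, 16 GB heap.
print(check_shard_plan(10, 10, 20.0, 16.0))
```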
>>
>> Jörg
>>
>>
>> On Fri, Aug 9, 2013 at 4:32 PM, David Arata <[email protected]> wrote:
>>
>>> My concern is: what would be the best strategy so that an index, or a
>>> single shard in an index, does not get too big for a node to handle, and
>>> if it's approaching that size, what can be done?
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3fc800b0-28d6-4ad0-8aa5-eb182d9b27ee%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/3fc800b0-28d6-4ad0-8aa5-eb182d9b27ee%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

