The heap should be at least as big as your largest shard, irrespective of which index it belongs to or whether it's a primary or a replica.
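If you want to sanity-check that on a running cluster, the _cat APIs will give you the numbers. Below is a rough Python sketch (not an official tool) that compares the largest shard's on-disk size against each node's configured heap. It assumes a node reachable on localhost:9200 and the requests library, so adjust for your own setup; as Jörg notes below, this is a rule of thumb, not a hard limit.

import requests

BASE = "http://localhost:9200"  # adjust for your cluster

# Store size in bytes for every shard, primaries and replicas alike.
resp = requests.get(BASE + "/_cat/shards",
                    params={"h": "index,shard,prirep,store", "bytes": "b"})
shards = []
for line in resp.text.splitlines():
    parts = line.split()
    if len(parts) == 4 and parts[3].isdigit():  # skips unassigned shards
        shards.append((parts[0], parts[1], parts[2], int(parts[3])))
index, shard, prirep, size = max(shards, key=lambda s: s[3])
print("largest shard: %s[%s] (%s) = %d bytes" % (index, shard, prirep, size))

# Each node's configured JVM heap, for comparison.
resp = requests.get(BASE + "/_cat/nodes",
                    params={"h": "name,heap.max", "bytes": "b"})
for line in resp.text.splitlines():
    parts = line.split()
    if len(parts) == 2 and parts[1].isdigit():
        note = "ok" if int(parts[1]) >= size else "heap smaller than largest shard"
        print("node %s: heap.max=%s bytes -> %s" % (parts[0], parts[1], note))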
Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com


On 3 July 2014 05:50, mrno42 <[email protected]> wrote:

> When you say "- do not let a shard grow bigger than your JVM heap (this is
> really a rough estimation) so segment merging will work flawlessly"
>
> are we counting all the primary and replica shards of all indexes on that
> node? So for example, say we had two indexes on a 10-node cluster. Each
> index has 10 shards and 1 replica (40 shards total in the cluster).
>
> So per node, the heap size should be larger than:
>
> 1 shard for the first index
> 1 shard for the replica of the first index
> 1 shard for the second index
> 1 shard for the replica of the second index
>
> the four shards combined?
>
> Thanks again for your advice
>
>
> On Saturday, August 10, 2013 6:50:27 AM UTC-7, Jörg Prante wrote:
>>
>> Your concern is a single shard getting "too big". If you use a 64-bit JVM
>> and mmapfs (quite common), you can open even the largest files. So from
>> this point of view, a node can handle the biggest files. There is no real
>> limit.
>>
>> Another question is throughput performance with large shard files. For
>> example, the more mixed read/write operations there are in the workload,
>> the smaller the Lucene indexes should be, to allow the JVM/OS a better
>> load distribution.
>>
>> For selecting a total number of shards and shard size, here are some
>> general rules of thumb:
>>
>> - do not select a smaller number of shards than the total number of nodes
>> you will add to the cluster. Each node should hold at least one shard.
>>
>> - do not let a shard grow bigger than your JVM heap (this is really a
>> rough estimation) so segment merging will work flawlessly
>>
>> - if you want fast recovery, or if you want to move shards around (not a
>> common case), the smaller a shard is, the faster the operation will get
>> done
>>
>> In case you are worried about shards getting out of bounds, you can
>> reindex with a higher number of shards (having the _source enabled is
>> always an advantage for reindexing) with your favorite custom tool.
>> Reindexing can take significant time, and may not be an option if you
>> can't stop indexing.
>>
>> Jörg
>>
>>
>> On Fri, Aug 9, 2013 at 4:32 PM, David Arata <[email protected]> wrote:
>>
>>> My concern is what would be the best strategy so that an index or a
>>> single shard in an index does not get too big for a node to handle, and
>>> if it's approaching that size, what can be done?
>>>
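On Jörg's reindexing suggestion quoted above: the usual "custom tool" is a scan/scroll read from the old index plus bulk writes into a new index created with more shards. Here is a minimal sketch along those lines using the elasticsearch-py client; the index names and shard count (logs-old, logs-new, 20 shards) are placeholders, so adjust them for your data. As Jörg says, it can take a long time, and it won't pick up documents written while the copy runs unless you stop indexing.

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["localhost:9200"])

# Create the target index with more shards than the source.
es.indices.create(index="logs-new", body={
    "settings": {"number_of_shards": 20, "number_of_replicas": 1}
})

def actions():
    # scan() walks the whole source index via scan/scroll; it relies on
    # _source being enabled, which is Jörg's point above.
    for hit in helpers.scan(es, index="logs-old",
                            query={"query": {"match_all": {}}}):
        yield {
            "_index": "logs-new",
            "_type": hit["_type"],
            "_id": hit["_id"],
            "_source": hit["_source"],
        }

# Stream the documents into the new index in bulk batches.
helpers.bulk(es, actions())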
