Just out of curiosity, are aggregations on multiple shards on a single node executed serially or in parallel? In my experience, it appears that they're executed serially (my CPU usage did not change when going from 1 shard to 2 shards per node, but I didn't test this extensively). I'm interested in maximizing the parallelism of an aggregation without creating a massive number of nodes.
On Friday, December 19, 2014 at 10:31:45 AM UTC-5, Jörg Prante wrote: > > Yes, I have 3 nodes and each index has 3 shards, on 32 core machines. > > Each shard contains many segments, which can be read and written > concurrently by Lucene. Since Lucene 4, there have been massive > improvements in that area. > > Maybe you have observed the effect that many shards on a node for a single > index show a different performance behavior when docs are added over long > periods of time. It simply takes longer before large segment merging begins > because docs are wider distributed and use smaller segment sizes for a > longer time. The downside is that huge segment counts may occur (and many > users encounter high file descriptor numbers). With the right > configuration, you can set up a single shard per index on a node, and > segment merging / segment count is not a real problem. > > You are right if you consider shard size as a factor for moving the shard > around (into snapshot/restore) or for export, or at node recovery when the > node starts up. I think shard sizes over 30 GB are a bit heavy, but this > also depends on the speed of the I/O subsystem. With SSD or RAID 0, I can > operate at I/O rates of over 1 GB/sec at sequential read. The shard size > factor has to be balanced out, either by using more than one index, or a > higher number of nodes, or faster I/O subsystem. > > Jörg > > > > On Fri, Dec 19, 2014 at 3:42 PM, AlexR <[email protected] <javascript:>> > wrote: > >> Jorg, if you have a single large index and a cluster with 3 nodes do you >> suggest to create just 3 shards even though each node has say 16 cores. >> With just three shards they will be very big and not much patallelism in >> computations will occur. >> am I missing something? >> >> -- >> You received this message because you are subscribed to the Google Groups >> "elasticsearch" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected] <javascript:>. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/elasticsearch/36d1a61a-e996-4bec-97b7-0842fc118cb2%40googlegroups.com >> . >> For more options, visit https://groups.google.com/d/optout. >> > > -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/813098da-f6e8-42ae-b162-7ac551f4be18%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
