Unfortunately, setting up hardware RAID is too hard/complicated. I have now enabled all 12 disks on each machine, so going forward new data should be "shared" across all of them. How will ES allocate new data across the disks?
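For reference, here is a minimal sketch of how the multi-disk setting could be generated. It assumes the disks are mounted as /mnt/disk1 to /mnt/disk12 (the naming from my first mail) and that the data paths go into elasticsearch.yml as a comma-separated `path.data` value. The helper name `build_path_data` is just something I made up for illustration.

```python
def build_path_data(num_disks, mount_prefix="/mnt/disk"):
    """Return an elasticsearch.yml line listing one data path per disk.

    Assumption: disks are mounted as <mount_prefix>1 .. <mount_prefix>N.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    # One path per disk, numbered from 1.
    paths = ["{}{}".format(mount_prefix, i) for i in range(1, num_disks + 1)]
    # Multiple data paths are given as a single comma-separated value.
    return "path.data: " + ",".join(paths)


if __name__ == "__main__":
    print(build_path_data(12))
```

The printed line would be pasted into elasticsearch.yml on each node before a restart.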
If I move a shard from one node to another that has the new 12-disk paths, will the receiving node "share" the data across its disks? That way, could I move all shards and get a redistribution of the existing data?

On Thursday, September 18, 2014 10:35:24 AM UTC+1, Mark Walkom wrote:
>
> Does your server have hardware RAID capabilities?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
> On 18 September 2014 19:30, <[email protected]> wrote:
>
>> Good point on heap, so I will bring that back down to 30GB
>>
>> Versions:
>> ES 1.3.2-1
>> java 1.7.0_67
>>
>> I definitely want to start using all 12 disks, rather than the 1 at the moment! If I add paths for the other 11 disks and restart, will ES do any 'rebalancing'? If it won't then is there any way to move the data around all 12 disks? I really don't want to re-index everything!!
>>
>> Thanks
>>
>> On Thursday, September 18, 2014 10:03:18 AM UTC+1, Mark Walkom wrote:
>>>
>>> Also given you're over 32GB heap your java pointers aren't going to be compressed, which means GC will suffer.
>>>
>>> You haven't mentioned what ES and java versions you are using, which would be useful.
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> On 18 September 2014 18:57, Michael McCandless <[email protected]> wrote:
>>>
>>>> Try disabling merge IO throttling, especially if your index is on SSDs. (It's on by default at a paltry 20 MB/sec.) Merge IO throttling causes merges to run slowly, which eventually causes them to back up to the point where indexing must be throttled...
>>>>
>>>> Also see the recent post about tuning to favor indexing throughput:
>>>> http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/
>>>>
>>>> Mike McCandless
>>>>
>>>> http://blog.mikemccandless.com
>>>>
>>>> On Thu, Sep 18, 2014 at 4:54 AM, <[email protected]> wrote:
>>>>
>>>>> Setup:
>>>>> 4 nodes
>>>>> Replication = 0
>>>>> ES_HEAP_SIZE = 75GB
>>>>> Number of Indices = 59 (using logstash one index per month)
>>>>> Total shards = 234 (each index is 4 shards, one per node)
>>>>> Total docs = 7.4 billion
>>>>> Total size = 4.7TB
>>>>>
>>>>> When I add a new file, which I do using logstash on all four nodes, the indexing immediately throttles. For instance:
>>>>>
>>>>> [2014-09-18 09:41:42,326][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>>> [2014-09-18 09:41:45,267][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>>>> [2014-09-18 09:41:45,303][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>>> [2014-09-18 09:41:51,273][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>>>> [2014-09-18 09:41:51,379][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>>> [2014-09-18 09:42:06,429][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now t
>>>>>
>>>>> Where should I be looking to tune the indexing performance? The query load on the cluster is very low as it is a research cluster, so I would sacrifice query performance for indexing.
>>>>>
>>>>> The 4 nodes all run logstash, listening on various ports.
>>>>> I use netcat to 'feed' the data to the 4 nodes from a hadoop cluster.
>>>>>
>>>>> hadoop1 netcat -------->
>>>>> hadoop2 netcat --------> ES1
>>>>> hadoop3 netcat -------->
>>>>>
>>>>> And so on.
>>>>>
>>>>> Each ES node has 24 disks but I am only using one at the moment. This is an obvious IO bottleneck, but I am unclear how to use all disks. If I add more disks, will ES share the data between them all? e.g. /mnt/disk1, /mnt/disk2, etc.
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3e85d65c-8001-4f90-bfa0-f7e63679feba%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
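PS: on the shard-move question at the top of this message, this is roughly the request I have in mind. It is only a sketch of the JSON body for a `move` command that would be POSTed to the `_cluster/reroute` endpoint. The index, shard number and source node are taken from the log lines quoted above, the target node name "hdp14" is a placeholder, and the helper name `build_move_command` is made up. It does not answer whether the receiving node spreads the shard across its data paths.

```python
import json


def build_move_command(index, shard, from_node, to_node):
    """Build the body of a cluster reroute request that moves one shard."""
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


if __name__ == "__main__":
    # "hdp14" is a hypothetical target node name used for illustration only.
    body = build_move_command("logstash-2014.09", 2, "hdp13", "hdp14")
    # This JSON would be sent as the body of POST /_cluster/reroute.
    print(json.dumps(body, indent=2))
```

The sketch only prints the body so nothing gets moved by accident. Sending it would be a separate HTTP call against one of the nodes.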
