Does your server have hardware RAID capabilities?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com

On 18 September 2014 19:30, <[email protected]> wrote:
> Good point on heap, so I will bring that back down to 30GB
>
> Versions:
> ES 1.3.2-1
> java 1.7.0_67
>
> I definitely want to start using all 12 disks, rather than the 1 at the
> moment! If I add paths for the other 11 disks and restart, will ES do any
> 'rebalancing'? If it won't, then is there any way to move the data around
> all 12 disks? I really don't want to re-index everything!!
>
> Thanks
>
> On Thursday, September 18, 2014 10:03:18 AM UTC+1, Mark Walkom wrote:
>>
>> Also given you're over 32GB heap your java pointers aren't going to be
>> compressed, which means GC will suffer.
>>
>> You haven't mentioned what ES and java versions you are using, which
>> would be useful.
>>
>> Regards,
>> Mark Walkom
>>
>> On 18 September 2014 18:57, Michael McCandless <[email protected]>
>> wrote:
>>
>>> Try disabling merge IO throttling, especially if your index is on SSD/s.
>>> (It's on by default at a paltry 20 MB/sec). Merge IO throttling causes
>>> merges to run slowly which eventually causes them to back up enough to the
>>> point where indexing must be throttled...
>>>
>>> Also see the recent post about tuning to favor indexing throughput:
>>> http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Thu, Sep 18, 2014 at 4:54 AM, <[email protected]> wrote:
>>>
>>>> Setup:
>>>> 4 nodes
>>>> Replication = 0
>>>> ES_HEAP_SIZE = 75GB
>>>> Number of Indices = 59 (using logstash one index per month)
>>>> Total shards = 234 (each index is 4 shards, one per node)
>>>> Total docs = 7.4 billion
>>>> Total size = 4.7TB
>>>>
>>>> When I add a new file, which I do using logstash on all four nodes, the
>>>> indexing immediately throttles. For instance:
>>>>
>>>> [2014-09-18 09:41:42,326][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>> [2014-09-18 09:41:45,267][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>>> [2014-09-18 09:41:45,303][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>> [2014-09-18 09:41:51,273][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>>> [2014-09-18 09:41:51,379][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>> [2014-09-18 09:42:06,429][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now t
>>>>
>>>> Where should I be looking to tune the indexing performance? The query
>>>> load on the cluster is very low as it is a research cluster and so I would
>>>> sacrifice query performance for indexing.
>>>>
>>>> The 4 nodes all run logstash, listening on various ports. I use netcat
>>>> to 'feed' the data to the 4 nodes from a hadoop cluster.
>>>>
>>>> hadoop1 netcat -------->
>>>> hadoop2 netcat --------> ES1
>>>> hadoop3 netcat -------->
>>>>
>>>> And so on.
>>>>
>>>> Each ES node has 24 disks but I am only using one at the moment. This
>>>> is an obvious IO bottleneck, but I am unclear how to use all disks. If I
>>>> add more disks, will ES share the data between them all? e.g. /mnt/disk1,
>>>> /mnt/disk2, etc.
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/3e85d65c-8001-4f90-bfa0-f7e63679feba%40googlegroups.com
>>>> For more options, visit https://groups.google.com/d/optout.
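For anyone wanting to try Mike's suggestion of disabling merge IO throttling, here is a minimal sketch in Python of how the request might be put together. It assumes the ES 1.x setting name `indices.store.throttle.type` (values `merge`, `none`, `all`) and the cluster update settings endpoint `/_cluster/settings`; the function names and the `http://localhost:9200` address are made up for illustration and are not from the thread. Check the setting against the docs for your own version before sending anything.

```python
import json
import urllib.request

# Setting name and allowed values as used by ES 1.x (assumption: verify for your version).
THROTTLE_SETTING = "indices.store.throttle.type"
ALLOWED_TYPES = ("merge", "none", "all")


def merge_throttle_body(throttle_type="none", persistent=False):
    """Build the JSON body for PUT /_cluster/settings."""
    if throttle_type not in ALLOWED_TYPES:
        raise ValueError("unknown throttle type: %r" % throttle_type)
    # "transient" settings are lost on a full cluster restart; "persistent" survive it.
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: {THROTTLE_SETTING: throttle_type}})


def apply_settings(body, host="http://localhost:9200"):
    """Send the body to the cluster. The host is a hypothetical address."""
    req = urllib.request.Request(
        host + "/_cluster/settings",
        data=body.encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the body; call apply_settings(...) yourself once it looks right.
    print(merge_throttle_body())
```

Using a transient setting first makes it easy to back out: setting the type back to `merge` restores throttling, which, per Mike's note, runs at 20 MB/sec by default.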
