Re: Indexing is being throttled

Mark Walkom Thu, 18 Sep 2014 02:35:57 -0700

Does your server have hardware RAID capabilities?

Regards,
Mark Walkom


Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com

On 18 September 2014 19:30, <[email protected]> wrote:

> Good point on heap, so I will bring that back down to 30GB
>
> Versions:
> ES 1.3.2-1
> java 1.7.0_67
>
> I definitely want to start using all 12 disks, rather than the 1 at the
> moment! If I add paths for the other 11 disks and restart, will ES do any
> 'rebalancing'? If it won't then is there any way to move the data around
> all 12 disks? I really don't want to re-index everything!!
>
> Thanks
>
>
> On Thursday, September 18, 2014 10:03:18 AM UTC+1, Mark Walkom wrote:
>>
>> Also, given you're over a 32GB heap, your Java pointers aren't going to be
>> compressed, which means GC will suffer.
>>
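As a rough illustration of the heap point above, here is a minimal sketch that flags heap sizes likely to lose compressed object pointers. The 32 GiB cutoff is an approximation (the exact limit depends on the JVM), and the helper names `parse_heap_size` and `likely_compressed_oops` are made up for this example.

```python
# Approximate cutoff only; the real limit sits slightly below this and varies by JVM.
COMPRESSED_OOPS_LIMIT_BYTES = 32 * 1024 ** 3

_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_size(value):
    """Convert a heap size such as '30g', '75GB' or '512m' to bytes."""
    text = value.strip().lower()
    if text.endswith("b"):  # tolerate '75GB' as written in the post
        text = text[:-1]
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # no unit suffix: treat as a plain byte count


def likely_compressed_oops(heap):
    """Return True if the heap is small enough that compressed pointers are likely."""
    return parse_heap_size(heap) < COMPRESSED_OOPS_LIMIT_BYTES


if __name__ == "__main__":
    # The two heap sizes discussed in this thread.
    for size in ("75GB", "30GB"):
        print(size, "compressed pointers likely:", likely_compressed_oops(size))
```
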
>> You haven't mentioned what ES and java versions you are using, which
>> would be useful.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: [email protected]
>> web: www.campaignmonitor.com
>>
>> On 18 September 2014 18:57, Michael McCandless <[email protected]>
>> wrote:
>>
>>> Try disabling merge IO throttling, especially if your index is on SSD/s.
>>>  (It's on by default at a paltry 20 MB/sec).  Merge IO throttling causes
>>> merges to run slowly which eventually causes them to back up enough to the
>>> point where indexing must be throttled...
>>>
>>> Also see the recent post about tuning to favor indexing throughput:
>>> http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/
>>>
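For reference, a hedged sketch of what disabling merge IO throttling could look like on the ES 1.x series. It assumes the cluster setting `indices.store.throttle.type` and the cluster update-settings endpoint; the node address `http://localhost:9200` and the function name `build_throttle_request` are placeholders for this example. The request is only built, not sent.

```python
import json
import urllib.request


def build_throttle_request(base_url, throttle_type="none", persistent=False):
    """Build (but do not send) a cluster-settings update for store throttling."""
    if throttle_type not in ("none", "merge", "all"):
        raise ValueError("unexpected throttle type: %r" % throttle_type)
    # 'transient' settings are lost on a full cluster restart; 'persistent' ones survive it.
    scope = "persistent" if persistent else "transient"
    body = {scope: {"indices.store.throttle.type": throttle_type}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_throttle_request("http://localhost:9200")  # assumed node address
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply it against a live node: urllib.request.urlopen(req)
```

Using a transient setting first makes it easy to back out if disk IO turns out to be saturated without the throttle.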
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Thu, Sep 18, 2014 at 4:54 AM, <[email protected]> wrote:
>>>
>>>> Setup:
>>>> 4 nodes
>>>> Replication            = 0
>>>> ES_HEAP_SIZE   = 75GB
>>>> Number of Indices = 59  (using logstash one index per month)
>>>> Total shards          = 234 (each index is 4 shards, one per node)
>>>> Total docs             = 7.4 billion
>>>> Total size               = 4.7TB
>>>>
>>>> When I add a new file, which I do using logstash on all four nodes, the
>>>> indexing immediately throttles. For instance:
>>>>
>>>> [2014-09-18 09:41:42,326][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>> [2014-09-18 09:41:45,267][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>>> [2014-09-18 09:41:45,303][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>> [2014-09-18 09:41:51,273][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>>> [2014-09-18 09:41:51,379][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>>> [2014-09-18 09:42:06,429][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] now t
>>>>
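To see how often each shard is being throttled, log lines like the ones above can be tallied with a short script. This is only a sketch: the regular expression is written against the lines quoted in this post and may not match other log formats, and the name `count_throttle_events` is made up for the example.

```python
import re
from collections import Counter

# Matches e.g. "[logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5"
THROTTLE_PATTERN = re.compile(
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s+"
    r"(?P<action>now|stop) throttling indexing: "
    r"numMergesInFlight=(?P<in_flight>\d+),\s*maxNumMerges=(?P<max_merges>\d+)"
)


def count_throttle_events(lines):
    """Count 'now throttling' events per (index, shard); 'stop' lines are ignored."""
    counts = Counter()
    for line in lines:
        match = THROTTLE_PATTERN.search(line)
        if match and match.group("action") == "now":
            counts[(match.group("index"), int(match.group("shard")))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[2014-09-18 09:41:42,326][INFO ][index.engine.internal    ] [hdp13] "
        "[logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5",
        "[2014-09-18 09:41:45,267][INFO ][index.engine.internal    ] [hdp13] "
        "[logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5",
    ]
    for (index, shard), total in count_throttle_events(sample).items():
        print(index, shard, total)
```
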
>>>> Where should I be looking to tune the indexing performance? The query
>>>> load on the cluster is very low as it is a research cluster and so I would
>>>> sacrifice query performance for indexing.
>>>>
>>>> The 4 nodes all run logstash, listening on various ports. I use netcat
>>>> to 'feed' the data to the 4 nodes from a hadoop cluster.
>>>>
>>>> hadoop1 netcat -------->
>>>> hadoop2 netcat -------->   ES1
>>>> hadoop3 netcat -------->
>>>>
>>>> And so on.
>>>>
>>>> Each ES node has 24 disks but I am only using one at the moment. This
>>>> is an obvious IO bottleneck, but I am unclear how to use all disks. If I
>>>> add more disks, will ES share the data between them all? e.g. /mnt/disk1
>>>> /mnt/disk2 etc
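On the multiple-disk question, ES 1.x accepts several directories in the `path.data` setting of elasticsearch.yml, separated by commas. The sketch below only generates that config line; the `/mnt/diskN` mount points follow the naming in the post, the count of three is arbitrary, and `data_path_setting` is a made-up helper name. It says nothing about whether existing data gets moved onto the new paths.

```python
def data_path_setting(mount_points):
    """Return an elasticsearch.yml line listing several data directories."""
    # Drop empty entries and trailing slashes so the paths are uniform.
    paths = [p.strip().rstrip("/") for p in mount_points if p.strip()]
    if not paths:
        raise ValueError("at least one data path is required")
    return "path.data: " + ",".join(paths)


if __name__ == "__main__":
    # Three hypothetical mounts, named as in the post.
    mounts = ["/mnt/disk%d" % n for n in range(1, 4)]
    print(data_path_setting(mounts))
```
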
>>>>
>>>> Thanks
>>>>
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624Y3Kn8pexvrFCMOK_B9mvM-T3fWdQ-jEh01qwSk-4zuUw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
