Good point on heap, so I will bring that back down to 30GB.

Versions:
ES 1.3.2-1
java 1.7.0_67

I definitely want to start using all 12 disks, rather than just the 1 I am 
using at the moment! If I add paths for the other 11 disks and restart, will 
ES do any 'rebalancing'? If it won't, then is there any way to move the data 
around all 12 disks? I really don't want to re-index everything!!
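
For concreteness, a sketch of what the multi-path setting might look like in 
elasticsearch.yml. The mount points are assumed to follow the /mnt/disk1, 
/mnt/disk2 naming from my original message, and the list is shortened to four 
entries for illustration:

```yaml
# elasticsearch.yml (sketch only; mount points are assumed, list shortened)
# Multiple data directories are given as a comma-separated list.
path.data: /mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4
```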

Thanks


On Thursday, September 18, 2014 10:03:18 AM UTC+1, Mark Walkom wrote:
>
> Also, given that your heap is over 32GB, your Java pointers aren't going to 
> be compressed, which means GC will suffer.
>
> You haven't mentioned which ES and Java versions you are using, which would 
> be useful.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
> On 18 September 2014 18:57, Michael McCandless <[email protected]> wrote:
>
>> Try disabling merge IO throttling, especially if your index is on SSD/s. 
>>  (It's on by default at a paltry 20 MB/sec).  Merge IO throttling causes 
>> merges to run slowly which eventually causes them to back up enough to the 
>> point where indexing must be throttled...
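>>
>> A minimal sketch of what that could look like in elasticsearch.yml, 
>> assuming the ES 1.x store throttle settings. The 100mb figure is only an 
>> illustrative value, not a recommendation:
>>
>> ```yaml
>> # Option 1: turn store (merge) IO throttling off entirely
>> indices.store.throttle.type: none
>>
>> # Option 2: keep merge throttling but raise the cap above the 20mb default
>> # indices.store.throttle.type: merge
>> # indices.store.throttle.max_bytes_per_sec: 100mb
>> ```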
>>
>> Also see the recent post about tuning to favor indexing throughput: 
>> http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Thu, Sep 18, 2014 at 4:54 AM, <[email protected]> wrote:
>>
>>> Setup:
>>> 4 nodes
>>> Replication            = 0
>>> ES_HEAP_SIZE   = 75GB
>>> Number of Indices = 59  (using logstash one index per month)
>>> Total shards          = 234 (each index is 4 shards, one per node)
>>> Total docs             = 7.4 billion
>>> Total size               = 4.7TB
>>>
>>> When I add a new file, which I do using logstash on all four nodes, the 
>>> indexing immediately throttles. For instance:
>>>
>>> [2014-09-18 09:41:42,326][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>> [2014-09-18 09:41:45,267][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>> [2014-09-18 09:41:45,303][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>> [2014-09-18 09:41:51,273][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
>>> [2014-09-18 09:41:51,379][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
>>> [2014-09-18 09:42:06,429][INFO ][index.engine.internal    ] [hdp13] [logstash-2014.09][2] now t
>>>
>>> Where should I be looking to tune indexing performance? The query load 
>>> on the cluster is very low, as it is a research cluster, so I would 
>>> happily sacrifice query performance for indexing throughput.
>>>
>>> The 4 nodes all run logstash, listening on various ports. I use netcat 
>>> to 'feed' the data to the 4 nodes from a hadoop cluster.
>>>
>>> hadoop1 netcat -------->
>>> hadoop2 netcat -------->   ES1     
>>> hadoop3 netcat -------->
>>>
>>> And so on.
>>>
>>> Each ES node has 24 disks but I am only using one at the moment. This is 
>>> an obvious IO bottleneck, but I am unclear how to use all the disks. If I 
>>> add more disks, will ES share the data between them all? e.g. /mnt/disk1, 
>>> /mnt/disk2, etc.
>>>
>>> Thanks
>>>
>>>  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/3e85d65c-8001-4f90-bfa0-f7e63679feba%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elasticsearch/3e85d65c-8001-4f90-bfa0-f7e63679feba%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>
>

