Hi Mr. Walkom,
Thank you for your kind response.
Actually, we have already set cluster.routing.allocation.disable_allocation
to true to avoid that issue, so there is no shard routing between nodes.
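For reference, this is roughly how we applied it (localhost:9200 is just our
local default, and we used the transient settings):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disable_allocation": true
  }
}'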
When the indexes start to open, they use gigabytes of memory. If I set the
heap below 32 GB, they do not open at all; GC keeps running and the node
effectively deadlocks. I suspect ES is loading something into a cache when
it starts opening them. Below is a capture of the threads taken while the
indexes were opening:
*********************************************
::: [SE03indicto][57GE4QrOQV-w-5OzVcd6mQ][inet[/10.0.0.63:9300]]
100.5% (502.7ms out of 500ms) cpu usage by thread
'elasticsearch[SE03indicto][warmer][T#6]'
3/10 snapshots sharing following 18 elements
sun.nio.ch.NativeThread.current(Native Method)
sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:695)
sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:684)
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:169)
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:271)
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:137)
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:113)
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(Lucene41PostingsReader.java:215)
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2410)
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadNextFloorBlock(BlockTreeTermsReader.java:2340)
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next(BlockTreeTermsReader.java:2115)
org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredTermsEnum.next(BloomFilterPostingsFormat.java:257)
org.elasticsearch.index.cache.id.simple.SimpleIdCache.refresh(SimpleIdCache.java:151)
org.elasticsearch.search.SearchService$FieldDataWarmer$2.run(SearchService.java:689)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:744)
4/10 snapshots sharing following 8 elements
org.elasticsearch.common.hppc.ObjectIntOpenHashMap.containsKey(ObjectIntOpenHashMap.java:702)
org.elasticsearch.index.cache.id.simple.SimpleIdCache$TypeBuilder.canReuse(SimpleIdCache.java:323)
org.elasticsearch.index.cache.id.simple.SimpleIdCache.checkIfCanReuse(SimpleIdCache.java:287)
org.elasticsearch.index.cache.id.simple.SimpleIdCache.refresh(SimpleIdCache.java:181)
org.elasticsearch.search.SearchService$FieldDataWarmer$2.run(SearchService.java:689)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:744)
2/10 snapshots sharing following 17 elements
sun.nio.ch.NativeThread.current(Native Method)
sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:695)
sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:684)
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:169)
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:271)
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:137)
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:113)
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2384)
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadNextFloorBlock(BlockTreeTermsReader.java:2340)
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next(BlockTreeTermsReader.java:2115)
org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredTermsEnum.next(BloomFilterPostingsFormat.java:257)
org.elasticsearch.index.cache.id.simple.SimpleIdCache.refresh(SimpleIdCache.java:151)
org.elasticsearch.search.SearchService$FieldDataWarmer$2.run(SearchService.java:689)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.Thread
****************************************
The manual workaround I found is clearing the caches while the indexes are
opening, so the heap does not go crazy. I also tried 0.90.9, which has a new
feature, index.codec.bloom.load, that controls whether the bloom filters are
loaded or not. But 0.90.9 left us with an inconsistent cluster that could
not open the indexes.
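For what it is worth, the manual clearing is just the cache clear API, and
on 0.90.9 we set the bloom option per index; roughly like this (the index
name here is only an example):

curl -XPOST 'localhost:9200/_cache/clear'

curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.codec.bloom.load": false
}'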
I would be very happy to run with a heap under 32 GB. How can I tell ES to
just initialize the index and not load any cache, assuming that is the
problem?
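Judging from the warmer threads in the capture above, I am guessing
something like disabling the warmers per index might be close to what I
want, though I am not sure it covers the id cache:

curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.warmer.enabled": false
}'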
We also used Solr for a couple of years, with very big indexes, and it
never took such a long time or used this much memory. I will keep digging
into the issue and keep you posted.
Thanks
On Wednesday, December 25, 2013 11:25:09 PM UTC+2, Mark Walkom wrote:
>
> Running > 32GB for the heap size brings a lot of inefficiencies into play,
> as Java pointers will not be compressed and your GC times will increase (as
> you can see).
>
> When you shut a node down, the shards allocated to it go into an
> unallocated state and the cluster will try to reallocate them. If you
> (re)join a node to the cluster, it will still initialise and reallocate
> them, even if it only ends up putting them back onto the same node. This is
> to ensure the cluster state is maintained and your shards are evenly
> distributed.
>
> If you are just restarting a node for whatever reason, you can
> set cluster.routing.allocation.disable_allocation to true, then restart the
> node; when it comes back it will simply reinitialise and reopen the
> index shards on the local machine, which is a much faster recovery. Make
> sure you set it back to false once things are green again.
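>
> For reference, turning allocation back on is just the same settings call
> with false, something like:
>
> curl -XPUT 'localhost:9200/_cluster/settings' -d '{
>   "transient": {
>     "cluster.routing.allocation.disable_allocation": false
>   }
> }'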
>
>
> Ideally you should add more nodes to your cluster: you could split those
> two machines into 4 VMs/containers, or just add more physical nodes. Just
> remember my first comment, though.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
>
> On 26 December 2013 00:13, Prometheus WillSurvive <
> [email protected]> wrote:
>
>> For the last couple of days I have been looking through the forums and
>> blogs for help or a clue about large indexes and heap usage similar to
>> our use case.
>>
>> Unfortunately I did not find a solution that helps in my case. Here is
>> the setup:
>>
>> We have two test servers, each with:
>>
>> 128 GB RAM
>> 24-core Xeon CPU
>> 3 TB disk
>>
>> We have 10 indexes, each with 5 shards and 0 replicas. Each index is
>> around 120 GB; two of them are 190 GB. The index mapping has a
>> parent-child relation.
>>
>> The ES heap is 80 GB.
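>>
>> For reference, that is just set through the standard environment variable
>> the startup script reads, something like:
>>
>> export ES_HEAP_SIZE=80g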
>>
>>
>> Our main problem is starting the ES server. When ES starts to open the
>> indexes, it always requires a recovery process, even when we shut the ES
>> server down cleanly beforehand.
>>
>> While ES is recovering the index shards, the heap keeps growing.
>> Especially when ES starts to recover/initialize the 190 GB indexes, the
>> heap is almost full and it goes into an endless GC process.
>>
>> Why does ES use so much heap to open/recover/initialize shards?
>>
>> Why does ES not release the heap it used once the shards have opened
>> successfully?
>>
>> What is the mechanism behind the index initialization process?
>>
>> Why does ES recover the indexes every time?
>>
>> What would you suggest?
>>
>> Thanks
>>
>>