Index recovery failure on node restart since v1.3.x

Ankush Jhalani Tue, 07 Oct 2014 13:57:13 -0700

We have a single-node ES instance that is restarted once a week. Every 
time it restarts, recovery of one specific index gets stuck at:


> [2014-10-06 22:47:48,107][DEBUG][index.translog           ] [testnode] [testindex_20140930][0] interval [5s], flush_threshold_ops [2147483647], flush_threshold_size [200mb], flush_threshold_period [30m]
> [2014-10-06 22:47:48,108][DEBUG][index.shard.service      ] [testnode] [testindex_20140930][0] state: [CREATED]->[RECOVERING], reason [from gateway]
> [2014-10-06 22:47:48,108][DEBUG][index.gateway            ] [testnode] [testindex_20140930][0] starting recovery from local ...
> [2014-10-06 22:47:48,203][DEBUG][index.engine.internal    ] [testnode] [testindex_20140930][0] starting engine

We have to delete that index for recovery to complete. A hot threads 
dump taken while recovery is stuck shows the following:

::: [testnode.node][ff9m9KnRSqWfkrTZiAMbsA][testnode][inet[/10.126.143.197:9301]]{datacenter=nj, master=true}

   102.9% (514.3ms out of 500ms) cpu usage by thread 'elasticsearch[testnode.node][generic][T#2]'
     10/10 snapshots sharing following 14 elements
       org.elasticsearch.index.engine.internal.InternalEngine$SearchFactory.newSearcher(InternalEngine.java:1574)
       org.apache.lucene.search.SearcherManager.getSearcher(SearcherManager.java:160)
       org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:122)
       org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
       org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
       org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:225)
       org.elasticsearch.index.engine.internal.InternalEngine.refresh(InternalEngine.java:779)
       org.elasticsearch.index.engine.internal.InternalEngine.delete(InternalEngine.java:686)
       org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:780)
       org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:250)
       org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       java.lang.Thread.run(Thread.java:722)


We started seeing this problem after upgrading to v1.3.2, and it still 
happens with v1.3.4. Could someone advise on what might be going on? Thanks.

