Thanks. It's difficult to replicate without the data, but I will try asking on GitHub.

On Wednesday, October 8, 2014 6:04:52 AM UTC-4, Thibaut wrote:
>
> Hi,
>
> I would open up an issue on GitHub. Even if it's just one node,
> elasticsearch should restart.
>
> Thanks,
> Thibaut
>
> On Tue, Oct 7, 2014 at 11:03 PM, Ankush Jhalani <[email protected]> wrote:
>
>> Well it's a shared resource (not prod), used for other stuff and due to
>> historical/enterprise reasons it's bounced every week. Though not ideal, I
>> expect ES to be able to restart without issues.
>>
>> On Tuesday, October 7, 2014 5:01:15 PM UTC-4, Mark Walkom wrote:
>>>
>>> Why are you restarting the node every week?
>>> That sounds like a problem you should solve to stop this one happening.
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: [email protected]
>>> web: www.campaignmonitor.com
>>>
>>> On 8 October 2014 07:56, Ankush Jhalani <[email protected]> wrote:
>>>
>>>> We have a single node ES instance, which is restarted once a week.
>>>> Every time it's restarted, one specific index recovery is always stuck at -
>>>>
>>>>> [2014-10-06 22:47:48,107][DEBUG][index.translog ] [testnode] [testindex_20140930][0] interval [5s], flush_threshold_ops [2147483647], flush_threshold_size [200mb], flush_threshold_period [30m]
>>>>> [2014-10-06 22:47:48,108][DEBUG][index.shard.service ] [testnode] [testindex_20140930][0] state: [CREATED]->[RECOVERING], reason [from gateway]
>>>>> [2014-10-06 22:47:48,108][DEBUG][index.gateway ] [testnode] [testindex_20140930][0] starting recovery from local ...
>>>>> [2014-10-06 22:47:48,203][DEBUG][index.engine.internal ] [testnode] [testindex_20140930][0] starting engine
>>>>
>>>> We have to delete that index for recovery to complete. Doing a hot
>>>> threads dump, we get the following logs -
>>>>
>>>> ::: [testnode.node][ff9m9KnRSqWfkrTZiAMbsA][testnode][inet[/10.126.143.197:9301]]{datacenter=nj, master=true}
>>>>
>>>> 102.9% (514.3ms out of 500ms) cpu usage by thread 'elasticsearch[testnode.node][generic][T#2]'
>>>> 10/10 snapshots sharing following 14 elements
>>>> org.elasticsearch.index.engine.internal.InternalEngine$SearchFactory.newSearcher(InternalEngine.java:1574)
>>>> org.apache.lucene.search.SearcherManager.getSearcher(SearcherManager.java:160)
>>>> org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:122)
>>>> org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
>>>> org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
>>>> org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:225)
>>>> org.elasticsearch.index.engine.internal.InternalEngine.refresh(InternalEngine.java:779)
>>>> org.elasticsearch.index.engine.internal.InternalEngine.delete(InternalEngine.java:686)
>>>> org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:780)
>>>> org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:250)
>>>> org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>> java.lang.Thread.run(Thread.java:722)
>>>>
>>>> We started seeing this error with the upgrade to v1.3.2, and it is still
>>>> happening with v1.3.4. Could someone advise what could be happening?
>>>> Thanks.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/584e7b07-0957-49ca-b67a-3f8dc281312a%40googlegroups.com
>>>> For more options, visit https://groups.google.com/d/optout.
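
For anyone collecting the same diagnostics before filing an issue: the hot threads dump quoted above comes from the `_nodes/hot_threads` endpoint, and its plain-text report can be summarized with a few lines of Python. This is only a rough sketch, not something posted in the thread. The URL assumes the default HTTP port 9200 on localhost (the thread only shows the transport address), and the helper names `fetch_hot_threads`, `busiest_threads` and `stuck_in_recovery` are made up for illustration.

```python
import re
import urllib.request

# Assumption: the node answers HTTP on localhost:9200 (the default port).
HOT_THREADS_URL = "http://localhost:9200/_nodes/hot_threads"

# Matches report lines such as:
#   102.9% (514.3ms out of 500ms) cpu usage by thread 'elasticsearch[testnode.node][generic][T#2]'
BUSY_THREAD = re.compile(
    r"^\s*(\d+(?:\.\d+)?)% \(.*?\) cpu usage by thread '(.+)'\s*$"
)


def fetch_hot_threads(url=HOT_THREADS_URL, timeout=10):
    """Return the plain-text hot threads report from a running node."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def busiest_threads(report):
    """Extract (cpu_percent, thread_name) pairs, busiest first."""
    found = []
    for line in report.splitlines():
        match = BUSY_THREAD.match(line)
        if match:
            found.append((float(match.group(1)), match.group(2)))
    return sorted(found, reverse=True)


def stuck_in_recovery(report):
    """True if any stack frame in the report is in local gateway recovery."""
    return "LocalIndexShardGateway.recover" in report
```

Calling `busiest_threads(fetch_hot_threads())` a few times, some seconds apart, shows whether the same `generic` thread stays pinned inside `LocalIndexShardGateway.recover`. That is the pattern the dump above suggests, and it would be useful detail to include in the issue.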
