We enlarged our cluster to 5 nodes and the QUORUM error message seems to have disappeared. "failed to process cluster event (acquire index lock) within 1s" messages are still happening, though.
:(

Tom;

On Fri, Jan 9, 2015 at 1:25 PM, [email protected] <[email protected]> wrote:

> Exactly, with 3 nodes, the error will be gone.
>
> Please, always use an odd number of data nodes, in particular with
> replicas > 0, in order not to confuse the ES quorum formula, and also to
> avoid split brains with minimum_master_nodes.
>
> Jörg
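For what it's worth, here is the quorum arithmetic as I understand it with our settings (this is only my reading of the formula, not the ES source; the class and variable names below are purely for illustration):

    // Illustrative only: my reading of the QUORUM write-consistency check.
    public class QuorumMath {
        public static void main(String[] args) {
            int numberOfReplicas = 2;                      // our index setting
            int copiesPerShard   = 1 + numberOfReplicas;   // primary + replicas = 3
            int quorum           = copiesPerShard / 2 + 1; // 3 / 2 + 1 = 2
            System.out.println("need " + quorum + " of " + copiesPerShard + " copies active");
        }
    }

That "needed 2" matches the UnavailableShardsException quoted further down. And if we follow the split-brain advice, with 5 master-eligible nodes we would presumably also set discovery.zen.minimum_master_nodes: 3 (nodes / 2 + 1) in elasticsearch.yml.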
> On Fri, Jan 9, 2015 at 3:17 PM, Tom <[email protected]> wrote:
>
>> Also, we have another cluster (for different purposes) that has 3 nodes,
>> but we didn't experience such errors with it (for this ES we create
>> indices on a daily basis).
>>
>> On Thursday, January 8, 2015 at 16:23:12 (UTC-3), Tom wrote:
>>
>>> 4
>>>
>>> On Thursday, January 8, 2015 at 16:19:50 UTC-3, Jörg Prante wrote:
>>>
>>>> How many nodes do you have in the cluster?
>>>>
>>>> Jörg
>>>>
>>>> On Thu, Jan 8, 2015 at 6:57 PM, Tom <[email protected]> wrote:
>>>>
>>>>> Hi, we've been using ES for a while now, specifically version 0.90.3.
>>>>> A couple of months ago we decided to migrate to the latest version,
>>>>> which was finally frozen at 1.4.1. No data migration was necessary
>>>>> because we have a redundant MongoDB, but yesterday we enabled data
>>>>> writing to the new ES cluster. Everything was running smoothly until
>>>>> we noticed that, on the hour, there were bursts of four or five log
>>>>> messages of the following kinds:
>>>>>
>>>>> Error indexing None into index ind-analytics-2015.01.08. Total elapsed time: 1065 ms.
>>>>> org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (acquire index lock) within 1s
>>>>>     at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.run(MetaDataCreateIndexService.java:148) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17]
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17]
>>>>>     at java.lang.Thread.run(Thread.java:722) ~[na:1.7.0_17]
>>>>>
>>>>> [ForkJoinPool-2-worker-15] c.d.i.p.ActorScatterGatherStrategy - Scattering to failed in 1043ms
>>>>> org.elasticsearch.action.UnavailableShardsException: [ind-2015.01.08.00][0] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1s], request: index {[ind-2015.01.08.00][search][...]}
>>>>>     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.retryBecauseUnavailable(TransportShardReplicationOperationAction.java:784) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
>>>>>     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.raiseFailureIfHaveNotEnoughActiveShardCopies(TransportShardReplicationOperationAction.java:776) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
>>>>>     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:507) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
>>>>>     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:419) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17]
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17]
>>>>>     at java.lang.Thread.run(Thread.java:722) ~[na:1.7.0_17]
>>>>>
>>>>> This happens on the hour because we write to hour-based indices. For
>>>>> example, all writes from 18:00:00 to 18:59:59 on 01/08 go to
>>>>> ind-2015.01.08.18; at 19:00:00 all writes go to ind-2015.01.08.19, and
>>>>> so on.
>>>>>
>>>>> With ES 0.90.3, automatic index creation worked flawlessly (with no
>>>>> complaints), but the new version doesn't seem to handle that feature
>>>>> very well. It looks like, when all those concurrent writes compete to
>>>>> be the first to create the index, all but one fail. Of course we could
>>>>> just create the indices manually to avoid this situation altogether,
>>>>> but that would only be a workaround for a feature that previously
>>>>> worked.
>>>>>
>>>>> Also, we use ES through the native Java client, and the configuration
>>>>> for all our indices is
>>>>>
>>>>> settings = {
>>>>>   number_of_shards = 5,
>>>>>   number_of_replicas = 2
>>>>> }
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Thanks in advance,
>>>>> Tom;
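In the meantime we will probably work around the race by pre-creating the next hour's index a little ahead of time from the Java client, roughly like this (only a sketch against the 1.4.1 native client: client stands for our already-built Client instance, and the class and method names, index name pattern and shard/replica values are just the ones described above, used for illustration):

    import java.text.SimpleDateFormat;
    import java.util.Date;

    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.settings.ImmutableSettings;
    import org.elasticsearch.indices.IndexAlreadyExistsException;

    // Illustrative helper, not production code.
    public class HourlyIndexCreator {

        // e.g. "ind-2015.01.08.19" for writes between 19:00:00 and 19:59:59
        static String hourlyIndexName(Date when) {
            return new SimpleDateFormat("'ind-'yyyy.MM.dd.HH").format(when);
        }

        // Create the index up front so concurrent writers never race on auto-create.
        static void ensureHourlyIndex(Client client, Date when) {
            try {
                client.admin().indices().prepareCreate(hourlyIndexName(when))
                        .setSettings(ImmutableSettings.settingsBuilder()
                                .put("number_of_shards", 5)
                                .put("number_of_replicas", 2)
                                .build())
                        .execute().actionGet();
            } catch (IndexAlreadyExistsException e) {
                // another writer got there first, which is fine for our purposes
            }
        }
    }

If there is a cleaner way to keep automatic creation working as it did in 0.90.3 (an index template for ind-* carrying the shard/replica settings, for instance), we'd be happy to hear it.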
--
Tom;
