I would like to know:
- What is the root cause?
- How do I fix it?
- If it is a memory problem, is there anything I can do (other than
upgrading the hardware)?
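
For context, this is a pre-1.0 cluster (the logs below mention
index.cache.field.data.resident and index.engine.robin), so the only knobs
I am aware of are the JVM heap size and the resident field cache. A sketch
of what I could try, assuming those settings still apply to this version;
the values are placeholders, not recommendations:

  # environment (e.g. /etc/default/elasticsearch): pin the JVM heap
  ES_HEAP_SIZE=4g

  # elasticsearch.yml: bound the field cache instead of letting it grow
  # until the heap is exhausted
  index:
    cache:
      field:
        type: soft       # soft references, evictable under memory pressure
        max_size: 50000  # hypothetical cap on cached entries (per segment)
        expire: 10m      # drop entries idle for 10 minutes

Is that the right direction?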
On Jun 5, 2014, at 9:54 AM, Mark Walkom <[email protected]> wrote:
> What do you want to know exactly?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
>
> On 5 June 2014 12:40, Quan Tong Anh <[email protected]> wrote:
> I'm running a 3-node cluster with 2 data nodes. My configuration:
>
> es1, es2:
>
> node:
>   name: elasticsearch-1   # elasticsearch-2 on es2
>   master: true
>   data: true
>
>
> discovery:
>   zen:
>     ping:
>       multicast:
>         enabled: false
>       unicast:
>         hosts: ["elasticsearch-1.domain.com:9300", "logs.domain.com:9300", "elasticsearch-2.domain.com:9300"]
>
>
>
> gl2:
>
> node:
>   name: graylog2
>   master: false
>   data: false
>
>
>
> Shinken sent me a notification saying there are only 2 nodes in the cluster
> (since gl2 is a client-only node, that means one of the two data nodes has
> dropped out; note "number_of_data_nodes" : 1 below):
>
> {
>   "cluster_name" : "domain.com",
>   "status" : "red",
>   "timed_out" : false,
>   "number_of_nodes" : 2,
>   "number_of_data_nodes" : 1,
>   "active_primary_shards" : 12,
>   "active_shards" : 12,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 12
> }
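>
> For reference, that JSON is the output of the cluster health API; I fetch
> it with something like the following (hostname assumed):
>
> curl -s 'http://elasticsearch-1.domain.com:9200/_cluster/health?pretty=true'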
>
>
> Log on the ES-1:
>
> [2014-06-04 15:51:09,281][WARN ][transport ] [elasticsearch-1] Received response for a request that has timed out, sent [61627ms] ago, timed out [30338ms] ago, action [discovery/zen/fd/masterPing], node [[elasticsearch-2][Vcvb6dtMQf-nfuB-wR9iew][inet[/107.170.x.y:9300]]{master=true}], id [272380]
> [2014-06-04 15:51:50,542][WARN ][index.cache.field.data.resident] [elasticsearch-1] [graylog2-graylog2_2] loading field [_date] caused out of memory failure
> java.lang.OutOfMemoryError: Java heap space
> [2014-06-04 15:55:16,351][DEBUG][action.admin.indices.stats] [elasticsearch-1] [graylog2-graylog2_5][2], node[Vcvb6dtMQf-nfuB-wR9iew], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.admin.indices.stats.IndicesStatsRequest@7631d2a2]
> org.elasticsearch.transport.RemoteTransportException: [elasticsearch-2][inet[/107.170.x.y:9300]][indices/stats/s]
> Caused by: org.elasticsearch.index.IndexShardMissingException: [graylog2-graylog2_5][2] missing
>     at org.elasticsearch.index.service.InternalIndexService.shardSafe(InternalIndexService.java:179)
>     at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:145)
>     at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:53)
>     at org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler.messageReceived(TransportBroadcastOperationAction.java:398)
>     at org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler.messageReceived(TransportBroadcastOperationAction.java:384)
>     at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:268)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> [2014-06-04 15:56:29,504][WARN ][index.engine.robin ] [elasticsearch-1] [graylog2_recent][0] failed engine
> java.lang.OutOfMemoryError: Java heap space
>
>
>
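> In case it helps, this is how I have been watching heap usage on each node
> since the crash. jstat ships with the JDK; the pgrep pattern is only a
> guess at how the ES process appears on your systems.
>
> # print GC/heap utilization for the ES java process every 5 seconds
> jstat -gcutil $(pgrep -f elasticsearch.bootstrap) 5s
>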
> Log on the ES-2:
>
> [2014-06-04 15:51:02,276][WARN ][transport.netty ] [elasticsearch-2] exception caught on transport layer [[id: 0x72906b9d, /107.170.z.t:52899 => /107.170.x.y:9300]], closing connection
> java.lang.OutOfMemoryError: Java heap space
>     at java.nio.DirectByteBuffer.duplicate(DirectByteBuffer.java:217)
>     at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool.acquire(SocketSendBufferPool.java:87)
>     at org.elasticsearch.common.netty.channel.socket.nio.SocketSendBufferPool.acquire(SocketSendBufferPool.java:46)
>     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:190)
>     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:150)
>     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:335)
>     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
>     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
>     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
>     at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>     at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>     at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
>
> [2014-06-04 15:51:27,143][WARN ][indices.cluster ] [elasticsearch-2] [graylog2-graylog2_5][2] master [[elasticsearch-2][Vcvb6dtMQf-nfuB-wR9iew][inet[/107.170.x.y:9300]]{master=true}] marked shard as started, but shard have not been created, mark shard as failed
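>
> To see exactly which shards ended up unassigned I dump the routing table
> from the cluster state API, roughly as below; the filter_* flags only trim
> the response and can be dropped if this version does not support them:
>
> curl -s 'http://elasticsearch-1.domain.com:9200/_cluster/state?pretty=true&filter_metadata=true&filter_blocks=true'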
>
>
> Log on the GL2:
>
> Jun 4 15:51:11 graylog2 graylog2-server: 2014-06-04 15:51:11,040 WARN : org.graylog2.buffers.processors.OutputBufferProcessor - Timeout reached. Not waiting any longer for writer threads to complete.
> Jun 4 15:51:14 graylog2 graylog2-server: 2014-06-04 15:51:14,694 WARN : org.elasticsearch.discovery.zen - [graylog2] master_left and no other node elected to become master, current nodes: {[graylog2][hHcLLZ2GTamMajmE-a5lXg][inet[/107.170.z.t:9300]]{client=true, data=false, master=false},}
> Jun 4 15:51:14 graylog2 graylog2-server: 2014-06-04 15:51:14,708 ERROR: org.graylog2.periodical.DeflectorManagerThread - Tried to check for number of messages in current deflector target but did not find index. Aborting.
> Jun 4 15:51:14 graylog2 graylog2-server: org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];[SERVICE_UNAVAILABLE/2/no master];
> Jun 4 15:51:14 graylog2 graylog2-server: 2014-06-04 15:51:14,709 ERROR: org.graylog2.periodical.DeflectorManagerThread - Couldn't delete outdated or empty indices
> Jun 4 15:52:57 graylog2 graylog2-server: 2014-06-04 15:52:57,339 ERROR: org.graylog2.indexer.EmbeddedElasticSearchClient - Could not read name of ES node.
> Jun 4 15:52:57 graylog2 graylog2-server: java.lang.NullPointerException
> Jun 4 15:52:57 graylog2 graylog2-server: at org.graylog2.indexer.EmbeddedElasticSearchClient.nodeIdToName(EmbeddedElasticSearchClient.java:135)
> Jun 4 15:52:57 graylog2 graylog2-server: at org.graylog2.indexer.DeflectorInformation.getShardInformation(DeflectorInformation.java:125)
> Jun 4 15:52:57 graylog2 graylog2-server: at org.graylog2.indexer.DeflectorInformation.getIndexInformation(DeflectorInformation.java:110)
> Jun 4 15:52:57 graylog2 graylog2-server: at org.graylog2.indexer.DeflectorInformation.getAsDatabaseObject(DeflectorInformation.java:84)
> Jun 4 15:52:57 graylog2 graylog2-server: at org.graylog2.periodical.DeflectorInformationWriterThread.run(DeflectorInformationWriterThread.java:72)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> Jun 4 15:52:57 graylog2 graylog2-server: at java.lang.Thread.run(Thread.java:744)
>
> Please let me know if you need further information.
>