Hi guys 

We have an issue with adding a Graylog server to an existing Graylog 
cluster. Basically, we need to add another server to scale out our logging 
capabilities. 
We're doing this on a test cluster for now, to understand what we will have 
to do in production. 

*Environment*
We have an existing Graylog cluster with the following machines: 

   - Graylog web interface box
   - Graylog server 
   - 3 Elasticsearch nodes


On the existing Graylog server (svr1), ismaster=true is set in its 
server.conf and its rest_listen_ip is set to its internal network IP of 
10.x.y.z. 
The new Graylog server (svr2) that I am adding to the cluster has 
ismaster=false and, like the primary Graylog server, its rest_listen_ip 
is set to its own internal network IP. 


*Errors*
The existing cluster was operating as expected until we added the second 
server - svr2.

When we start up svr2, the Graylog web interface server appears to see it, 
and when I log in to the Graylog admin page I can see it registered under 
System>Nodes. However, after a few minutes, svr2 drops off the Nodes page. 
On checking the logs of the Graylog web interface box, I see multiple API 
call failures for svr2 and a few for svr1:

 

2015-11-16 13:57:03,273 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
API call failed to execute.
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /SVR2_IP_ADDRESS:12900 to http://SVR2_IP_ADDRESS:12900/system/cluster/node
        at com.ning.http.client.providers.netty.NettyResponseFuture.abort(NettyResponseFuture.java:342) ~[com.ning.async-http-client-1.8.14.jar:na]
        at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:108) ~[com.ning.async-http-client-1.8.14.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:431) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:422) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:384) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[io.netty.netty-3.9.3.Final.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_11]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_11]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
Caused by: java.net.ConnectException: Connection refused: /SVR2_IP_ADDRESS:12900 to http://SVR2_IP_ADDRESS:12900/system/cluster/node
        at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:104) ~[com.ning.async-http-client-1.8.14.jar:na]
        ... 12 common frames omitted
Caused by: java.net.ConnectException: Connection refused: /SVR2_IP_ADDRESS:12900
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_11]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:712) ~[na:1.8.0_11]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) ~[io.netty.netty-3.9.3.Final.jar:na]
        ... 8 common frames omitted

2015-11-16 13:57:03,276 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
API call failed to execute.
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /SVR1_IP_ADDRESS:12900 to http://SVR1_IP_ADDRESS:12900/system/cluster/node
        at com.ning.http.client.providers.netty.NettyResponseFuture.abort(NettyResponseFuture.java:342) ~[com.ning.async-http-client-1.8.14.jar:na]
        at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:108) ~[com.ning.async-http-client-1.8.14.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:431) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:422) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:384) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[io.netty.netty-3.9.3.Final.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_11]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_11]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
Caused by: java.net.ConnectException: Connection refused: /SVR1_IP_ADDRESS:12900 to http://SVR1_IP_ADDRESS:12900/system/cluster/node
        at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:104) ~[com.ning.async-http-client-1.8.14.jar:na]
        ... 12 common frames omitted
Caused by: java.net.ConnectException: Connection refused: /SVR1_IP_ADDRESS:12900
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_11]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:712) ~[na:1.8.0_11]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) ~[io.netty.netty-3.9.3.Final.jar:na]
        ... 8 common frames omitted

2015-11-16 13:57:08,282 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
API call failed to execute.
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /SVR2_IP_ADDRESS:12900 to http://SVR2_IP_ADDRESS:12900/system/cluster/node
        at com.ning.http.client.providers.netty.NettyResponseFuture.abort(NettyResponseFuture.java:342) ~[com.ning.async-http-client-1.8.14.jar:na]
        at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:108) ~[com.ning.async-http-client-1.8.14.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:431) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:422) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:384) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[io.netty.netty-3.9.3.Final.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_11]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_11]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_11]
Caused by: java.net.ConnectException: Connection refused: /SVR2_IP_ADDRESS:12900 to http://SVR2_IP_ADDRESS:12900/system/cluster/node
        at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:104) ~[com.ning.async-http-client-1.8.14.jar:na]
        ... 12 common frames omitted
Caused by: java.net.ConnectException: Connection refused: /SVR2_IP_ADDRESS:12900
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_11]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:712) ~[na:1.8.0_11]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) ~[io.netty.netty-3.9.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) ~[io.netty.netty-3.9.3.Final.jar:na]
        ... 8 common frames omitted

2015-11-16 13:57:52,182 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in play-akka.actor.default-dispatcher-10
API call failed to execute.
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: No response received after 5000
        at com.ning.http.client.providers.netty.NettyResponseFuture.get(NettyResponseFuture.java:266) ~[com.ning.async-http-client-1.8.14.jar:na]
        at org.graylog2.restclient.lib.ApiClientImpl$ApiRequestBuilder.executeOnAll(ApiClientImpl.java:558) ~[org.graylog2.graylog2-rest-client-1.0.1.jar:na]
        at org.graylog2.restclient.models.ClusterService.getClusterJvmStats(ClusterService.java:157) [org.graylog2.graylog2-rest-client-1.0.1.jar:na]
        at controllers.NodesController.nodes(NodesController.java:61) [graylog-web-interface.graylog-web-interface-1.0.1.jar:1.0.1]
        at Routes$$anonfun$routes$1$$anonfun$applyOrElse$44$$anonfun$apply$496.apply(routes_routing.scala:1691) [graylog-web-interface.graylog-web-interface-1.0.1.jar:na]
        at Routes$$anonfun$routes$1$$anonfun$applyOrElse$44$$anonfun$apply$496.apply(routes_routing.scala:1691) [graylog-web-interface.graylog-web-interface-1.0.1.jar:na]
        at play.core.Router$HandlerInvokerFactory$$anon$4.resultCall(Router.scala:264) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.core.Router$HandlerInvokerFactory$JavaActionInvokerFactory$$anon$15$$anon$1.invocation(Router.scala:255) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:55) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.GlobalSettings$1.call(GlobalSettings.java:67) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.mvc.Security$AuthenticatedAction.call(Security.java:44) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.core.j.JavaAction$$anonfun$11.apply(JavaAction.scala:82) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.core.j.JavaAction$$anonfun$11.apply(JavaAction.scala:82) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) [org.scala-lang.scala-library-2.10.4.jar:na]
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) [org.scala-lang.scala-library-2.10.4.jar:na]
        at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:40) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Execution$trampoline$.execute(Execution.scala:46) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at play.core.j.HttpExecutionContext.execute(HttpExecutionContext.scala:32) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at scala.concurrent.impl.Future$.apply(Future.scala:31) [org.scala-lang.scala-library-2.10.4.jar:na]
        at scala.concurrent.Future$.apply(Future.scala:485) [org.scala-lang.scala-library-2.10.4.jar:na]
        at play.core.j.JavaAction$class.apply(JavaAction.scala:82) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.core.Router$HandlerInvokerFactory$JavaActionInvokerFactory$$anon$15$$anon$1.apply(Router.scala:252) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.utils.Threads$.withContextClassLoader(Threads.scala:21) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:129) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:128) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at scala.Option.map(Option.scala:145) [org.scala-lang.scala-library-2.10.4.jar:na]
        at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:128) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:121) [com.typesafe.play.play_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Iteratee$$anonfun$mapM$1.apply(Iteratee.scala:483) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Iteratee$$anonfun$mapM$1.apply(Iteratee.scala:483) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMapM$1.apply(Iteratee.scala:519) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMapM$1.apply(Iteratee.scala:519) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMap$1$$anonfun$apply$14.apply(Iteratee.scala:496) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at play.api.libs.iteratee.Iteratee$$anonfun$flatMap$1$$anonfun$apply$14.apply(Iteratee.scala:496) [com.typesafe.play.play-iteratees_2.10-2.3.6.jar:2.3.6]
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) [org.scala-lang.scala-library-2.10.4.jar:na]
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) [org.scala-lang.scala-library-2.10.4.jar:na]
        at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) [com.typesafe.akka.akka-actor_2.10-2.3.4.jar:na]
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [com.typesafe.akka.akka-actor_2.10-2.3.4.jar:na]
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [org.scala-lang.scala-library-2.10.4.jar:na]
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [org.scala-lang.scala-library-2.10.4.jar:na]
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [org.scala-lang.scala-library-2.10.4.jar:na]
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [org.scala-lang.scala-library-2.10.4.jar:na]
Caused by: java.util.concurrent.TimeoutException: No response received after 5000
        at com.ning.http.client.providers.netty.NettyResponseFuture.get(NettyResponseFuture.java:260) ~[com.ning.async-http-client-1.8.14.jar:na]
        ... 43 common frames omitted
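
To put rough numbers on "multiple failures for svr2 and a few for svr1", the web interface log could be tallied per node. The following is only a minimal sketch in Python: the function name count_refused is made up for illustration, and it assumes the log entries look like the excerpts above (one top-level ExecutionException line per failed call, possibly wrapped across lines).

```python
import re
from collections import Counter

# Matches only the top-level exception line of each failed API call, so the
# nested "Caused by:" lines of the same entry are not counted again.
REFUSED = re.compile(
    r"ExecutionException: java\.net\.ConnectException: "
    r"Connection refused: /([^:\s]+):(\d+)"
)


def count_refused(log_text: str) -> Counter:
    """Count 'Connection refused' API failures per host:port in a log dump."""
    # Collapse all whitespace first so that messages wrapped across several
    # lines (as in the pasted excerpts) still match the pattern.
    flat = re.sub(r"\s+", " ", log_text)
    return Counter(f"{host}:{port}" for host, port in REFUSED.findall(flat))


# Hypothetical usage, with a placeholder path to the web interface log:
# with open("/path/to/web-interface.log") as fh:
#     for node, hits in count_refused(fh.read()).most_common():
#         print(node, hits)
```

The timeout entries ("No response received after 5000") are deliberately not counted here, since they do not name a node.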


It seems like timeouts or API failures are causing node communication to 
fall over on the cluster. We tried increasing the heap size on the web 
interface, but no luck. 
This cluster is not under any load - it's still being built out right 
now - so message throughput should not be an issue. 
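
Since the errors are "Connection refused" on port 12900, one basic check is whether that port accepts TCP connections when tested from the web interface box. Below is a minimal sketch using only the Python standard library. The function names and the IP addresses in the usage comment are placeholders, not our real values, and the 5-second default timeout only loosely mirrors the "5000" in the timeout error, assuming that figure is in milliseconds.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake;
        # the connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def check_nodes(nodes: dict, port: int = 12900) -> dict:
    """Map each node name to whether its REST port is reachable."""
    return {name: port_is_open(ip, port) for name, ip in nodes.items()}


# Hypothetical usage, run from the web interface box (placeholder IPs):
# print(check_nodes({"svr1": "10.0.0.1", "svr2": "10.0.0.2"}))
```

A False result only shows that nothing accepted the connection on that address and port at that moment. It does not say whether the cause is the listen address, a firewall, or the server process not running.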

If you need any more info, let me know!
Any help you can give would be appreciated! 


Thanks
Alexia




