Hello,

We have a pair of GL nodes with a cluster of three ES servers at the backend. The occasional capacity problem aside, this has been working fine for the most part. Today one of the GL nodes decided to act up, though, and its behavior is pretty strange:
The process is up and I can connect to it via JMX, but it doesn't reply to any API calls, so as far as the web interface is concerned it's dead. It still listens on the relevant ports; it just doesn't reply to curl, for example, or to the web interface.

Checking with JMX, the process keeps allocating and GC'ing memory, with usage very slowly trending upward, and negligible CPU use. I can still see GL as a client node in the ES cluster.

I can see nothing suspicious about the process or the machine it's running on. It's a physical server, and there don't appear to be any hardware errors or the like. One of the ES nodes is running on the same server without issue. The config is exactly the same on the other GL server, which has the same hardware and runs both GL and ES without issue.

The graylog2-server.log file has quite a few of these lines (along with a stack trace):

2014-08-19 13:03:13,628 ERROR: org.graylog2.jersey.container.netty.NettyContainer - Uncaught exception during jersey resource handling
java.nio.channels.ClosedChannelException
2014-08-19 13:03:13,628 INFO : org.graylog2.jersey.container.netty.NettyContainer - Not writing any response, channel is already closed.
java.nio.channels.ClosedChannelException

I've looked through the other boxes, and aside from the other node being pretty heavily loaded, since it now has to handle the entire load, I can't find anything apparently wrong.

The setup is all Debian 7/amd64, running Sun JDK 7u67, and Graylog has a 12 GB heap size configured.

Any tips?

Regards,
Johan

--
You received this message because you are subscribed to the Google Groups "graylog2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
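[Editor's note: the API liveness check described in the post can be scripted roughly as below. This is a sketch, not the author's setup: the host variable is hypothetical, and port 12900 and the GET /system endpoint are the defaults for Graylog REST APIs of that era and may differ in a given installation.]

```shell
#!/bin/sh
# Probe the Graylog REST API with a short timeout. A healthy node answers
# GET /system with JSON node info; the failing node described above would
# accept the TCP connection but never reply, so curl times out.
GL_HOST="127.0.0.1"   # hypothetical; replace with the affected node's address
GL_PORT="12900"       # assumed default Graylog REST port

out=$(curl --silent --max-time 5 --fail "http://${GL_HOST}:${GL_PORT}/system" 2>/dev/null \
      || echo "API unresponsive: no reply within 5s")
printf '%s\n' "$out"
```

Running this against each node in a loop (e.g. from cron) makes the "listening but not answering" state visible before the web interface marks the node dead.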
