OK.
A separate filesystem doesn't resolve the issue; graylog2 just runs until the
message_cache_spool_dir fills up and then crashes.
I am writing a couple of scripts to catch when the disk is nearly full, stop
services, delete all the files in the message_cache_spool_dir, and start
services back up.
I am going to try a fresh install of graylog on another host and see if the
issue occurs.
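
For anyone hitting the same wall, the stopgap I'm scripting looks roughly like this. It's only a sketch: the service name, the threshold, and the df parsing are my own assumptions, not anything from the Graylog docs, so adjust for your installation.

```shell
#!/bin/sh
# Stopgap cleanup sketch (assumptions: service name "graylog2-server",
# spool path below, 90% threshold). Run it from cron every few minutes.

SPOOL_DIR=/var/lib/graylog2-server/message-cache-spool
THRESHOLD=90   # intervene when the filesystem is this percent full

# usage_pct: print the fill percentage (integer, no '%') of the
# filesystem holding the given path, using POSIX portable df output.
usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

if [ -d "$SPOOL_DIR" ] && [ "$(usage_pct "$SPOOL_DIR")" -ge "$THRESHOLD" ]; then
    # Stop graylog2, clear the spool, and bring it back up.
    service graylog2-server stop
    rm -rf "$SPOOL_DIR"/*
    service graylog2-server start
fi
```

Crude, but it beats a hard crash until there's a real answer on cleanup.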
On Monday, October 6, 2014 10:09:57 AM UTC-4, Dustin Tennill wrote:
>
> It crashed once the disk filled up.
>
> I am going to create a partition just for the message_cache_spool_dir to see
> if perhaps it is aware of a full disk and will resolve the issue itself.
>
> Does anyone have any specific information on this setting? The documentation
> doesn't mention it yet, and I can't see any way to handle it other than
> stop/delete files/start.
>
>
>
> On Sunday, October 5, 2014 3:22:10 PM UTC-4, Dustin Tennill wrote:
>>
>> The spool directory is growing at a steady rate - around 500M every five
>> minutes.
>>
>> root@myhost:/var/lib/graylog2-server/message-cache-spool# sleep 300; du -sh *; date; sleep 300; du -sh *; date; sleep 300; du -sh *; date;
>> 40K input-cache
>> 664K input-cache.p
>> 900K input-cache.t
>> 61M output-cache
>> 2.9G output-cache.p
>> 904K output-cache.t
>> Sun Oct 5 14:45:50 EDT 2014
>> 40K input-cache
>> 664K input-cache.p
>> 900K input-cache.t
>> 61M output-cache
>> 3.3G output-cache.p
>> 904K output-cache.t
>> Sun Oct 5 14:50:50 EDT 2014
>> 40K input-cache
>> 664K input-cache.p
>> 900K input-cache.t
>> 61M output-cache
>> 3.7G output-cache.p
>> 1.7M output-cache.t
>> Sun Oct 5 14:55:50 EDT 2014
>>
>> Based on past experience, this will grow until graylog2 crashes.
>>
>>
>> On Sunday, October 5, 2014 2:18:39 PM UTC-4, Dustin Tennill wrote:
>>>
>>> Apologies to the group - I didn't realize my posts were being moderated
>>> until I had attempted to post the same comment several times.
>>>
>>> I enabled the message_cache_off_heap setting and it seems to have
>>> resolved the slow GC crash issue.
>>> message_cache_off_heap = true
>>> message_cache_spool_dir = /var/lib/graylog2-server/message-cache-spool
>>> With this setting on, my 20G heap stays between 5G and 10G utilized.
>>>
>>> However, as far as I can tell the message_cache_spool_dir seems to grow
>>> until the disk fills up.
>>>
>>> Has anyone experienced this? Is there a cleanup operation I should be
>>> performing?
>>>
>>> Dustin
>>>
>>>
>>> On Wednesday, October 1, 2014 12:16:19 PM UTC-4, Dustin Tennill wrote:
>>>>
>>>> All,
>>>>
>>>> I recently upgraded to rc.1/ElasticSearch 1.3.2 and am having some
>>>> issues. We are not in production yet, and I understand that I should
>>>> expect
>>>> problems with the release candidate code.
>>>>
>>>> *Our Graylog Environment:*
>>>> A single Graylog Radio Server (0.91.0-rc.1)
>>>> A single Graylog Server (0.91.0-rc.1)
>>>> Java Settings: -Xmx20480M -Xms20480M -verbose:gc
>>>> -Xloggc:/var/log/grayloggc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>>>> A single Graylog-Web Server (0.91.0-rc.1)
>>>> Two ElasticSearch Nodes (1.3.2)
>>>> Statistics: 6000-7000 msgs per second when things are working correctly
>>>>
>>>> *1. "Cluster information currently unavailable" message shown when I
>>>> browse to the system page. *
>>>> Since upgrading to the current release, I note that the ElasticSearch
>>>> health indication page nearly always shows "Cluster information currently
>>>> unavailable".
>>>> My ElasticSearch cluster appears healthy to me. I am using the head
>>>> plugin, and can confirm all is "green" and both nodes are caught up.
>>>> At least once this has worked correctly - not sure why.
>>>>
>>>> This doesn't appear to mean anything; data is still coming in and being
>>>> processed correctly.
>>>>
>>>> *2. Graylog2-server eventually crashes due to slow garbage collection.*
>>>> I don't know for sure that this is why the crash happens, but the trend
>>>> seems to be that if GC takes longer than a few seconds, I start seeing
>>>> these message patterns.
>>>>
>>>> 2014-10-01 11:59:09,598 WARN : org.elasticsearch.monitor.jvm - [graylog2-server] [gc][old][2150][139] duration [1.1m], collections [1]/[1.1m], total [1.1m]/[1.9h], memory [17.8gb]->[17.9gb]/[19.1gb], all_pools {[young] [4.5gb]->[4.5gb]/[4.7gb]}{[survivor] [0b]->[0b]/[911mb]}{[old] [13.3gb]->[13.3gb]/[13.3gb]}
>>>> 2014-10-01 11:59:09,601 ERROR: org.graylog2.jersey.container.netty.NettyContainer - Uncaught exception during jersey resource handling
>>>> java.io.IOException: Broken pipe
>>>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>> at sun.nio.ch.IOUtil.write(IOUtil.java:51)
>>>> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>>>> at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:203)
>>>> at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:201)
>>>> at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:146)
>>>> at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:99)
>>>> at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:779)
>>>> at org.jboss.netty.channel.Channels.write(Channels.java:725)
>>>> at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
>>>> at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:784)
>>>> at org.jboss.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:280)
>>>> at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:121)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
>>>> at org.jboss.netty.channel.Channels.write(Channels.java:704)
>>>> at org.jboss.netty.channel.Channels.write(Channels.java:671)
>>>> at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
>>>> at org.graylog2.jersey.container.netty.NettyContainer$NettyResponseWriter$1.write(NettyContainer.java:142)
>>>> at org.glassfish.jersey.message.internal.CommittingOutputStream.write(CommittingOutputStream.java:229)
>>>> at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(WriterInterceptorExecutor.java:299)
>>>> at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>>>> at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
>>>> at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
>>>> at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
>>>> at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
>>>> at java.io.BufferedWriter.flush(BufferedWriter.java:254)
>>>> at org.glassfish.jersey.message.internal.ReaderWriter.writeToAsString(ReaderWriter.java:192)
>>>> at org.glassfish.jersey.message.internal.AbstractMessageReaderWriterProvider.writeToAsString(AbstractMessageReaderWriterProvider.java:129)
>>>> at org.glassfish.jersey.message.internal.StringMessageProvider.writeTo(StringMessageProvider.java:99)
>>>> at org.glassfish.jersey.message.internal.StringMessageProvider.writeTo(StringMessageProvider.java:59)
>>>> at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.invokeWriteTo(WriterInterceptorExecutor.java:265)
>>>> at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.aroundWriteTo(WriterInterceptorExecutor.java:250)
>>>> at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)
>>>> at org.glassfish.jersey.server.internal.JsonWithPaddingInterceptor.aroundWriteTo(JsonWithPaddingInterceptor.java:106)
>>>> at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)
>>>> at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:85)
>>>> at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)
>>>> at org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1154)
>>>> at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:621)
>>>> at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:377)
>>>> at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:367)
>>>> at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:274)
>>>> at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
>>>> at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
>>>> at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
>>>> at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
>>>> at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
>>>> at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:297)
>>>> at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:254)
>>>> at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1028)
>>>> at org.graylog2.jersey.container.netty.NettyContainer.messageReceived(NettyContainer.java:356)
>>>> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>> at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>>>> at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
>>>> at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
>>>> at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
>>>> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>>>> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>>>> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>>>> at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>>>> at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>>>> at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>>>> at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>>>> at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>>> at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>>> at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> I routinely get the java.io.IOException: Broken pipe message if I just
>>>> browse to the "System" page directly.
>>>>
>>>> Thoughts?
>>>>
>>>> Any information I didn't provide?
>>>>
>>>> Thanks !!
>>>>
>>>> Dustin Tennill
>>>>
--
You received this message because you are subscribed to the Google Groups
"graylog2" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.