OK. 

A separate filesystem doesn't resolve the issue; graylog2 just runs until the 
message_cache_spool_dir fills up and then crashes. 

I am writing a couple of scripts to catch when the disk is nearly full, 
stop the services, delete all the files in the message_cache_spool_dir, and 
start the services back up. 
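In case it is useful to anyone else, the watchdog amounts to something like the sketch below. This is a minimal, untested outline, not the exact script I am running: the threshold, the `graylog2-server` service name, and the spool path are assumptions you would adjust for your own setup, and it defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of a spool-dir watchdog. THRESHOLD, SPOOL_DIR, and the service
# name are assumptions -- adjust them for your environment.

THRESHOLD=90                                            # percent used that triggers cleanup
SPOOL_DIR=/var/lib/graylog2-server/message-cache-spool
DRY_RUN=${DRY_RUN:-1}                                   # default: print commands, don't run them

# Percent of the filesystem holding $1 that is in use (integer, no % sign).
usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Run a command, or just echo it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

if [ -d "$SPOOL_DIR" ]; then
    pct=$(usage_pct "$SPOOL_DIR")
    if [ "$pct" -ge "$THRESHOLD" ]; then
        run service graylog2-server stop
        run rm -rf "$SPOOL_DIR"/*       # wipe the spool files
        run service graylog2-server start
    fi
fi
```

Running it from cron every few minutes with DRY_RUN=0 is the idea; the dry-run default makes it safe to test first.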

I am going to try a fresh install of graylog on another host to see whether 
the issue recurs. 

On Monday, October 6, 2014 10:09:57 AM UTC-4, Dustin Tennill wrote:
>
> It crashed once the disk filled up. 
>
> I am going to create a partition just for the message_cache_spool_dir to see 
> if perhaps it is aware of a full disk and will resolve the issue itself. 
>
> Anyone have any specific information on this setting? The documentation 
> doesn't mention it yet, and I can't see any way to handle it other than 
> stop/delete files/start. 
>
>
>
> On Sunday, October 5, 2014 3:22:10 PM UTC-4, Dustin Tennill wrote:
>>
>> The spool directory is growing at a steady rate - around 500M every five 
>> minutes.
>>
>> root@myhost:/var/lib/graylog2-server/message-cache-spool# sleep 300; du 
>> -sh *; date;sleep 300; du -sh *; date;sleep 300; du -sh *; date;
>> 40K    input-cache
>> 664K    input-cache.p
>> 900K    input-cache.t
>> 61M    output-cache
>> 2.9G    output-cache.p
>> 904K    output-cache.t
>> Sun Oct  5 14:45:50 EDT 2014
>> 40K    input-cache
>> 664K    input-cache.p
>> 900K    input-cache.t
>> 61M    output-cache
>> 3.3G    output-cache.p
>> 904K    output-cache.t
>> Sun Oct  5 14:50:50 EDT 2014
>> 40K    input-cache
>> 664K    input-cache.p
>> 900K    input-cache.t
>> 61M    output-cache
>> 3.7G    output-cache.p
>> 1.7M    output-cache.t
>> Sun Oct  5 14:55:50 EDT 2014
>>
>> Based on past experience, this will grow until graylog2 crashes. 
>>
>>
>> On Sunday, October 5, 2014 2:18:39 PM UTC-4, Dustin Tennill wrote:
>>>
>>> Apologies to the group - I didn't realize my posts were being moderated 
>>> until I had attempted to post the same comment several times. 
>>>
>>> I enabled the message_cache_off_heap setting, and it seems to have 
>>> resolved the slow-GC crash issue.
>>> message_cache_off_heap = true 
>>> message_cache_spool_dir = /var/lib/graylog2-server/message-cache-spool
>>> With this setting on, my 20G heap stays between 5G and 10G utilized. 
>>>
>>> However, as far as I can tell the message_cache_spool_dir seems to grow 
>>> until the disk fills up. 
>>>
>>> Has anyone experienced this? Is there a cleanup operation I should be 
>>> performing? 
>>>
>>> Dustin
>>>
>>>
>>> On Wednesday, October 1, 2014 12:16:19 PM UTC-4, Dustin Tennill wrote:
>>>>
>>>> All,
>>>>
>>>> I recently upgraded to rc.1/ElasticSearch 1.3.2 and am having some 
>>>> issues. We are not in production yet, and I understand that I should 
>>>> expect 
>>>> problems with the release candidate code. 
>>>>
>>>> *Our Graylog Environment:*
>>>> A single Graylog Radio Server (0.91.0-rc.1)
>>>> A single Graylog Server (0.91.0-rc.1)
>>>> Java Settings:  -Xmx20480M -Xms20480M -verbose:gc 
>>>> -Xloggc:/var/log/grayloggc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>>>> A single Graylog-Web Server (0.91.0-rc.1)
>>>> Two ElasticSearch Nodes (1.3.2)
>>>> Statistics: 6000-7000 msgs per second when things are working correctly
>>>>
>>>> *1. "Cluster information currently unavailable" message shown when I 
>>>> browse to the system page. *
>>>> Since upgrading to the current release, I note that the ElasticSearch 
>>>> health indication page nearly always shows "Cluster information currently 
>>>> unavailable". 
>>>> My ElasticSearch cluster appears healthy to me. I am using the head 
>>>> plugin, and can confirm all is "green" and both nodes are caught up. 
>>>> At least once this has worked correctly - not sure why. 
>>>>
>>>> This doesn't appear to mean anything; data is still coming in and being 
>>>> processed correctly. 
>>>>
>>>> *2. Graylog2-server - crashes eventually due to slow garbage 
>>>> collection. *
>>>> I don't know for sure that this is WHY I seem to have a crash, but the 
>>>> trend seems to be that if GC takes longer than a few seconds, I start 
>>>> seeing these message patterns. 
>>>>
>>>> 2014-10-01 11:59:09,598 WARN : org.elasticsearch.monitor.jvm - 
>>>> [graylog2-server] [gc][old][2150][139] duration [1.1m], collections 
>>>> [1]/[1.1m], total [1.1m]/[1.9h], memory [17.8gb]->[17.9gb]/[19.1gb], 
>>>> all_pools {[young] [4.5gb]->[4.5gb]/[4.7gb]}{[survivor] 
>>>> [0b]->[0b]/[911mb]}{[old] [13.3gb]->[13.3gb]/[13.3gb]}
>>>> 2014-10-01 11:59:09,601 ERROR: 
>>>> org.graylog2.jersey.container.netty.NettyContainer - Uncaught exception 
>>>> during jersey resource handling
>>>> java.io.IOException: Broken pipe
>>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:51)
>>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:203)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:201)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:146)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:99)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:779)
>>>>     at org.jboss.netty.channel.Channels.write(Channels.java:725)
>>>>     at 
>>>> org.jboss.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
>>>>     at 
>>>> org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:784)
>>>>     at 
>>>> org.jboss.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:280)
>>>>     at 
>>>> org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:121)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
>>>>     at org.jboss.netty.channel.Channels.write(Channels.java:704)
>>>>     at org.jboss.netty.channel.Channels.write(Channels.java:671)
>>>>     at 
>>>> org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
>>>>     at 
>>>> org.graylog2.jersey.container.netty.NettyContainer$NettyResponseWriter$1.write(NettyContainer.java:142)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.CommittingOutputStream.write(CommittingOutputStream.java:229)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(WriterInterceptorExecutor.java:299)
>>>>     at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>>>>     at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
>>>>     at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
>>>>     at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
>>>>     at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
>>>>     at java.io.BufferedWriter.flush(BufferedWriter.java:254)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.ReaderWriter.writeToAsString(ReaderWriter.java:192)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.AbstractMessageReaderWriterProvider.writeToAsString(AbstractMessageReaderWriterProvider.java:129)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.StringMessageProvider.writeTo(StringMessageProvider.java:99)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.StringMessageProvider.writeTo(StringMessageProvider.java:59)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.invokeWriteTo(WriterInterceptorExecutor.java:265)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.aroundWriteTo(WriterInterceptorExecutor.java:250)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)
>>>>     at 
>>>> org.glassfish.jersey.server.internal.JsonWithPaddingInterceptor.aroundWriteTo(JsonWithPaddingInterceptor.java:106)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)
>>>>     at 
>>>> org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:85)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162)
>>>>     at 
>>>> org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1154)
>>>>     at 
>>>> org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:621)
>>>>     at 
>>>> org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:377)
>>>>     at 
>>>> org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:367)
>>>>     at 
>>>> org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:274)
>>>>     at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
>>>>     at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
>>>>     at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
>>>>     at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
>>>>     at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
>>>>     at 
>>>> org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:297)
>>>>     at 
>>>> org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:254)
>>>>     at 
>>>> org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1028)
>>>>     at 
>>>> org.graylog2.jersey.container.netty.NettyContainer.messageReceived(NettyContainer.java:356)
>>>>     at 
>>>> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>>     at 
>>>> org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>>     at 
>>>> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>>>>     at 
>>>> org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
>>>>     at 
>>>> org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
>>>>     at 
>>>> org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
>>>>     at 
>>>> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>>     at 
>>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>>>>     at 
>>>> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>>>>     at 
>>>> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>>>>     at 
>>>> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>>>     at 
>>>> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>>>     at 
>>>> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>>>>     at 
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at 
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> I routinely get the java.io.IOException: Broken pipe message if I just 
>>>> browse to the "System" page directly. 
>>>>
>>>> Thoughts? 
>>>>
>>>> Any information I didn't provide? 
>>>>
>>>> Thanks !!
>>>>
>>>> Dustin Tennill
>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"graylog2" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
