[ https://issues.apache.org/jira/browse/ARTEMIS-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795626#comment-16795626 ]

Justin Bertram commented on ARTEMIS-2180:
-----------------------------------------

Is this the same behavior you were seeing when this issue was originally 
reported? The description says, "the live broker starts leaking memory, runs 
out of memory and core dumps." However, the attached log indicates nothing of 
the sort. I don't see any evidence of a memory leak or a core dump. The log 
indicates there was a disk IO error which caused the broker to shut itself 
down, i.e.:

{noformat}
2019-03-12 07:22:55,920 WARN  [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=NIOSequentialFile /var/lib/Primary-172.31.2.21/data/large-messages/291747.msg, message=No space left on device: ActiveMQIOErrorException[errorType=IO_ERROR message=No space left on device]
        at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.internalWrite(NIOSequentialFile.java:286) [artemis-journal-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.writeDirect(NIOSequentialFile.java:255) [artemis-journal-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.addBytesToLargeMessage(JournalStorageManager.java:794) [artemis-server-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl.addBytes(LargeServerMessageImpl.java:129) [artemis-server-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.messageToLargeMessage(ServerSessionImpl.java:1405) [artemis-server-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.send(ServerSessionImpl.java:1420) [artemis-server-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.serverSend(AMQPSessionCallback.java:539) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.serverSend(AMQPSessionCallback.java:498) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerReceiverContext.onMessage(ProtonServerReceiverContext.java:254) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.onDelivery(AMQPConnectionContext.java:519) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.handler.Events.dispatch(Events.java:92) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.dispatch(ProtonHandler.java:494) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.flush(ProtonHandler.java:307) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.inputBuffer(ProtonHandler.java:272) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.inputBuffer(AMQPConnectionContext.java:158) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.protocol.amqp.broker.ActiveMQProtonRemotingConnection.bufferReceived(ActiveMQProtonRemotingConnection.java:147) [artemis-amqp-protocol-2.6.3.jar:]
        at org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$DelegatingBufferHandler.bufferReceived(RemotingServiceImpl.java:643) [artemis-server-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelRead(ActiveMQChannelHandler.java:73) [artemis-core-client-2.6.3.jar:2.6.3]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:808) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$1.run(AbstractEpollChannel.java:387) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:309) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) [netty-all-4.1.24.Final.jar:4.1.24.Final]
        at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.6.3.jar:2.6.3]
Caused by: java.io.IOException: No space left on device
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) [rt.jar:1.8.0_191]
        at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60) [rt.jar:1.8.0_191]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) [rt.jar:1.8.0_191]
        at sun.nio.ch.IOUtil.write(IOUtil.java:65) [rt.jar:1.8.0_191]
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211) [rt.jar:1.8.0_191]
        at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.doInternalWrite(NIOSequentialFile.java:301) [artemis-journal-2.6.3.jar:2.6.3]
        at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.internalWrite(NIOSequentialFile.java:282) [artemis-journal-2.6.3.jar:2.6.3]
        ... 31 more
{noformat}

The underlying error is coming from the JVM - {{java.io.IOException: No space left on device}}. I have no explanation as to why the disk would consistently report that there's no space left when you stop the slave. Presumably stopping the slave and the disk filling up are completely independent of one another. What kind of filesystem is being used here? Is it being accessed over the network, or is it local?
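For reference, here is a minimal, illustrative Java sketch (not broker code; the class name is arbitrary and the path is simply the one from the log above) that reports how much usable space the JVM sees on the partition backing the large-messages directory, and what kind of file store it is. If the directory is on a network mount, the reported store type should make that obvious.

{code:java}
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DiskSpaceCheck {
    public static void main(String[] args) throws IOException {
        // Directory taken from the log above; adjust for your environment.
        Path dataDir = Paths.get("/var/lib/Primary-172.31.2.21/data/large-messages");

        // Ask the JVM which file store (partition/mount) backs this directory
        // and how much space it reports as usable.
        FileStore store = Files.getFileStore(dataDir);
        long total = store.getTotalSpace();
        long usable = store.getUsableSpace();

        System.out.printf("File store: %s (type: %s)%n", store.name(), store.type());
        System.out.printf("Total:  %,d bytes%n", total);
        System.out.printf("Usable: %,d bytes (%.1f%% of total)%n",
                usable, 100.0 * usable / total);
    }
}
{code}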

Have you been able to reproduce this in any other environment?

I'm not sure if this is the topology you'd use for production, but for what 
it's worth, a single replicated live/backup pair is not recommended due to the 
risk of split brain.
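To illustrate the split-brain point with a generic sketch (this is the standard majority rule, not Artemis's quorum implementation): with only two participants, neither side can ever see a majority once the link between them drops, so a backup that activates on loss of its live cannot tell "live crashed" apart from "network split", and you can end up with two live brokers.

{code:java}
public class QuorumSketch {
    // Generic majority-quorum rule: a node may (stay or become) live only if
    // it can still reach a strict majority of the known voters.
    static boolean hasMajority(int reachableVoters, int totalVoters) {
        return reachableVoters > totalVoters / 2;
    }

    public static void main(String[] args) {
        // Single live/backup pair: when the link breaks, each side sees only
        // itself (1 of 2), so neither has a majority and neither can safely
        // decide whether the peer is dead or merely unreachable.
        System.out.println("1 of 2 reachable -> majority? " + hasMajority(1, 2));

        // With three or more voters, the side that still sees 2 of 3 can
        // proceed, while the isolated node knows it must not activate.
        System.out.println("2 of 3 reachable -> majority? " + hasMajority(2, 3));
        System.out.println("1 of 3 reachable -> majority? " + hasMajority(1, 3));
    }
}
{code}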

> Host and broker runs out of memory when stopping a backup in a cluster
> ----------------------------------------------------------------------
>
>                 Key: ARTEMIS-2180
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2180
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: AMQP
>    Affects Versions: 2.6.3
>            Reporter: Simon Chalmers
>            Priority: Critical
>         Attachments: artemis.log, hs_err_pid15534.log, 
> server2-primary-broker.xml, server3-backup-broker.xml
>
>
> When running a live-backup cluster pair and stopping the backup, the live 
> broker starts leaking memory, runs out of memory and core dumps.
> This occurs during a performance test when the broker is under load; the 
> slave has been running successfully but is then terminated whilst the broker 
> is still under load.
> Core dump attached.


