[ https://issues.apache.org/jira/browse/LOG4J2-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296089#comment-17296089 ]
Volkan Yazici commented on LOG4J2-2926:
---------------------------------------

Hello [~kaushik.vankayala]! Sorry to hear about your production problem – "shit happens"^(TM)^. A couple of tips from my side:

* You might want to consider using AsyncAppender with a suitable buffer size and error handling strategy.
* You might want to consider FailoverAppender with a file/console fallback in case of SocketAppender failures.
* At work, we shield ELK with a Redis cluster – yes, our ELK collapses occasionally too. This setup has been serving us pretty well for many years. We implement it using Log4j JsonTemplateLayout and [log4j2-redis-appender|https://github.com/vy/log4j2-redis-appender] (an external project). I can confidently recommend this setup.

> Application OUTAGE due to Unable to write to stream TCP
> -------------------------------------------------------
>
>                 Key: LOG4J2-2926
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-2926
>             Project: Log4j 2
>          Issue Type: Bug
>          Components: Appenders
>    Affects Versions: 2.13.3
>         Environment: Mulesoft, Linux, ELK (hosted service on AWS)
>            Reporter: Kaushik Vankayala
>            Assignee: Ralph Goers
>            Priority: Major
>              Labels: SocketAppender, beginner
>             Fix For: 2.13.3
>
>
> Hi Team, we have recently encountered an outage in our PRODUCTION
> application. We have custom logging using log4j2 and the remote server was
> out of storage.
> We suspect this is what caused the issue. The ERROR we faced is below:
>
> 2020-08-30 22:23:04,686 Log4j2-TF-17-AsyncLoggerConfig-9 ERROR Unable to write to stream TCP:api-manager-2623b9734249246e.elb.ap-southeast-1.amazonaws.com:8500 for appender SOCKET
> org.apache.logging.log4j.core.appender.AppenderLoggingException: Error sending to TCP:api-manager-2623b9734249246e.elb.ap-southeast-1.amazonaws.com:8500 for api-manager-2623b9734249246e.elb.ap-southeast-1.amazonaws.com/52.221.23.118:8500
>     at org.apache.logging.log4j.core.net.TcpSocketManager.write(TcpSocketManager.java:231)
>     at org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:190)
>     at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.writeByteArrayToManager(AbstractOutputStreamAppender.java:206)
>     at org.apache.logging.log4j.core.appender.SocketAppender.directEncodeEvent(SocketAppender.java:459)
>     at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:190)
>
> "http.listener.02 SelectorRunner" #76 prio=5 os_prio=0 tid=0x00007f314c52d800 nid=0xb19 waiting for monitor entry [0x00007f314a6fc000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.logging.log4j.core.async.AsyncLoggerConfigDisruptor.enqueue(AsyncLoggerConfigDisruptor.java:376)
>     - waiting to lock <0x0000000088b43a58> (a java.lang.Object)
>     at org.apache.logging.log4j.core.async.AsyncLoggerConfigDisruptor.enqueueEvent(AsyncLoggerConfigDisruptor.java:330)
>     at org.apache.logging.log4j.core.async.AsyncLoggerConfig.logInBackgroundThread(AsyncLoggerConfig.java:159)
>     at org.apache.logging.log4j.core.async.EventRoute$1.logMessage(EventRoute.java:46)
>
> We tried to follow the link
> ([https://help.mulesoft.com/s/article/Mule-instance-which-implements-a-log4j2-SocketAppender-complains-with-Broken-Pipe-Error]).
>
> Unlike Splunk, we have ELK in our architecture. Our Socket appender looks like this:
>
> {{<Socket name="SOCKET" host="${sys:tcp.host}" port="${sys:tcp.port}" reconnectDelayMillis="30000" immediateFail="false" bufferedIo="true" bufferSize="204800" protocol="TCP" immediateFlush="false">}}
>
> We have a couple of questions we would appreciate your help with:
> # With the current Socket appender, what additional settings may be needed to stream the logs independently, irrespective of the remote destination's status?
> # Our ELK server is a hosted service. The first point after Cloudhub is a load balancer, behind which there is an EC2 server where Logstash is running. Do we need to configure any keep-alive settings at the OS level?
> # Why should a storage issue at a remote destination cause a problem in the Socket appender and eventually bring down the running application? Logging through the Socket appender should ideally be an independent activity.
>
> Finally, we would ask you to recommend a solution for the case where the remote endpoint's storage is exhausted or TCP sockets are dead, and to advise how we can avoid an OUTAGE of the MuleSoft application caused by a Log4j2 logging problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
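
For illustration, the AsyncAppender and FailoverAppender tips from the comment could be combined roughly as in the sketch below. This is a minimal, untested configuration fragment, not the commenter's actual setup. The appender names ASYNC, FAILOVER and CONSOLE, the buffer size, the retry interval and the console pattern are all assumptions; the Socket attributes are copied from the reporter's snippet. The primary appender sets ignoreExceptions="false" so that write failures propagate to the Failover appender, and blocking="false" on the Async appender should make a full queue route events to the errorRef appender instead of blocking the calling thread.

```xml
<Configuration status="warn">
  <Appenders>
    <!-- Primary appender: attributes copied from the reporter's snippet.
         ignoreExceptions="false" lets failures reach the Failover appender. -->
    <Socket name="SOCKET" host="${sys:tcp.host}" port="${sys:tcp.port}"
            reconnectDelayMillis="30000" immediateFail="false" bufferedIo="true"
            bufferSize="204800" protocol="TCP" immediateFlush="false"
            ignoreExceptions="false">
      <!-- the layout element from the existing configuration goes here -->
    </Socket>

    <!-- Fallback appender (name and pattern are assumptions). -->
    <Console name="CONSOLE" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %p %c - %m%n"/>
    </Console>

    <!-- Falls back to CONSOLE when SOCKET throws; retries SOCKET after the
         interval (value is an assumption). -->
    <Failover name="FAILOVER" primary="SOCKET" retryIntervalSeconds="60">
      <Failovers>
        <AppenderRef ref="CONSOLE"/>
      </Failovers>
    </Failover>

    <!-- Bounded queue in front of the failover chain; with blocking="false",
         events go to errorRef when the queue is full (size is an assumption). -->
    <Async name="ASYNC" bufferSize="1024" blocking="false" errorRef="CONSOLE">
      <AppenderRef ref="FAILOVER"/>
    </Async>
  </Appenders>

  <Loggers>
    <Root level="info">
      <AppenderRef ref="ASYNC"/>
    </Root>
  </Loggers>
</Configuration>
```

The thread dump in the report shows threads blocked in AsyncLoggerConfig, so the existing configuration appears to use asynchronous loggers. Whether to keep those alongside an Async appender is a separate decision that this sketch does not address.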