[ 
https://issues.apache.org/jira/browse/GEODE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk Lund updated GEODE-8131:
-----------------------------
    Affects Version/s:     (was: 1.12.0)
                       1.0.0-incubating
                       1.1.0
                       1.2.0
                       1.3.0
                       1.4.0
                       1.5.0
                       1.6.0
                       1.7.0
                       1.8.0
                       1.9.0
                       1.10.0

> alert service hangs, blocking cache operations
> ----------------------------------------------
>
>                 Key: GEODE-8131
>                 URL: https://issues.apache.org/jira/browse/GEODE-8131
>             Project: Geode
>          Issue Type: Bug
>          Components: logging
>    Affects Versions: 1.0.0-incubating, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0, 
> 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0
>            Reporter: Bruce J Schuchardt
>            Assignee: Kirk Lund
>            Priority: Major
>              Labels: GeodeOperationAPI
>
> This v1.8 TcpConduit reader thread was blocked in a production system.  It 
> had experienced a deserialization error and was trying to log the exception.  
> A Manager was present in the cluster and had registered as an alert listener. 
>  Another thread was blocked sending something on the shared/unordered 
> connection that this alert should be sent on.  This persisted for over 6 
> hours and we never saw the serialization exception in the log file.  
> Consequently we had to recommend setting the alert level to None and have 
> them run into the serialization problem again.
> This is a serious flaw in the alerting system and it's caused us grief many 
> times.  The alerting system should not block other threads.  Maybe a 
> background thread could consume and transmit alerts to alert-listeners?
>  
> {noformat}
> "P2P message reader for 10.236.28.120(servername-removed)<v491>:56152 shared 
> unordered uid=9 port=41204" tid=0xd49 (in native)    java.lang.Thread.State: 
> RUNNABLE at sun.nio.ch.FileDispatcherImpl.write0(Native Method) at 
> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) at 
> sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at 
> sun.nio.ch.IOUtil.write(IOUtil.java:51) at 
> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) -  locked 
> java.lang.Object@24528b9b at 
> org.apache.geode.internal.tcp.Connection.nioWriteFully(Connection.java:3291) 
> -  locked java.lang.Object@42a1a79b at 
> org.apache.geode.internal.tcp.Connection.sendPreserialized(Connection.java:2527)
>  at org.apache.geode.internal.tcp.MsgStreamer.realFlush(MsgStreamer.java:319) 
> at 
> org.apache.geode.internal.tcp.MsgStreamer.writeMessage(MsgStreamer.java:244) 
> at 
> org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:393)
>  at 
> org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:250)
>  at 
> org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:615)
>  at 
> org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1717)
>  at 
> org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1898)
>  at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2878)
>  at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2798)
>  at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2837)
>  at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1531)
>  at 
> org.apache.geode.internal.alerting.AlertMessaging.sendAlert(AlertMessaging.java:75)
>  at 
> org.apache.geode.internal.logging.log4j.AlertAppender.sendAlertMessage(AlertAppender.java:188)
>  at 
> org.apache.geode.internal.logging.log4j.AlertAppender.doAppend(AlertAppender.java:163)
>  at 
> org.apache.geode.internal.logging.log4j.AlertAppender.lambda$append$0(AlertAppender.java:159)
>  at 
> org.apache.geode.internal.logging.log4j.AlertAppender$$Lambda$168/1102181662.run(Unknown
>  Source) at 
> org.apache.geode.internal.alerting.AlertingAction.execute(AlertingAction.java:29)
>  at 
> org.apache.geode.internal.logging.log4j.AlertAppender.append(AlertAppender.java:159)
>  at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
>  at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
>  at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
>  at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>  at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:464)
>  at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:448)
>  at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:431) 
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.logParent(LoggerConfig.java:455)
>  at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:450)
>  at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:431) 
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:406) 
> at 
> org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63)
>  at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:146) at 
> org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:217)
>  at 
> org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2170)
>  at 
> org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2125)
>  at 
> org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2108)
>  at 
> org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2002)
>  at 
> org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1974)
>  at 
> org.apache.logging.log4j.spi.AbstractLogger.fatal(AbstractLogger.java:1054) 
> at 
> org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3610)
>  at 
> org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1824) 
> at org.apache.geode.internal.tcp.Connection.run(Connection.java:1686) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to