Java Broker becomes unresponsive after BDB Null Pointer Exception encountered
-----------------------------------------------------------------------------

                 Key: QPID-3646
                 URL: https://issues.apache.org/jira/browse/QPID-3646
             Project: Qpid
          Issue Type: Bug
          Components: Java Broker, Java Broker BDB Store, Java Client
    Affects Versions: 0.5
            Reporter: Andrew MacBean


The Java Broker became unresponsive after a BDB NullPointerException appeared
in the logs.  The system had run out of disk space earlier, but this was
thought to have affected only the QPID logs, not BDB: the BDB store resides on
a different file system, which was healthy.
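
Since the report separates the log file system (which filled up) from the BDB
store file system (believed healthy), a check of both locations can help
confirm that separation.  The following is a minimal sketch using only the
JDK; the class name, the 10% threshold and the two default paths are
hypothetical and are not taken from the affected installation.

```java
import java.io.File;

public class DiskSpaceCheck {

    /**
     * Returns the usable fraction (0.0 to 1.0) of the file system holding
     * the given path.  Returns 0.0 when the path does not name a partition,
     * because File.getTotalSpace() reports 0 in that case.
     */
    static double usableFraction(String path) {
        File f = new File(path);
        long total = f.getTotalSpace();
        if (total <= 0L) {
            return 0.0;
        }
        return (double) f.getUsableSpace() / (double) total;
    }

    /** True when less than minFraction of the file system is usable. */
    static boolean isLow(String path, double minFraction) {
        return usableFraction(path) < minFraction;
    }

    public static void main(String[] args) {
        // Hypothetical locations; pass the real log and store paths as arguments.
        String[] paths = args.length > 0
                ? args
                : new String[] {"/var/log/qpid", "/data/qpid/bdbstore"};
        for (String p : paths) {
            System.out.printf("%s: %.1f%% usable%s%n",
                    p, 100.0 * usableFraction(p), isLow(p, 0.10) ? " (LOW)" : "");
        }
    }
}
```

Running it with the broker's log directory and BDB store directory as
arguments prints one line per path, so a full log volume and a healthy store
volume can be told apart at a glance.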

The error was observed first; later, clients had trouble connecting to the
broker, which had to be restarted.  The NPE seems to have been thrown by a
BDB cleaner thread that had an invalid offset.

The BDB store version is 4.0.103.


Recorded Timeline of Events

1) Nov 23 12:40:57 

The filesystem where the QPID code resides and the logs are written fills up
because of another process, and this message is observed in the QPID log:

log4j:ERROR Failed to flog4j:ERROR Failed to flush writer,
java.io.IOException: No space left on device

The disk space issue was resolved by 13:00.

2) Nov 23 15:27:41

The following message appears in the QPID log:

<DaemonThread name="Cleaner-1"/> caught exception: java.lang.NullPointerException
java.lang.NullPointerException
            at com.sleepycat.je.cleaner.OffsetList$Segment.get(OffsetList.java:192)
            at com.sleepycat.je.cleaner.OffsetList.contains(OffsetList.java:135)
            at com.sleepycat.je.cleaner.TrackedFileSummary.containsObsoleteOffset(TrackedFileSummary.java:169)
            at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:479)
            at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:241)
            at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:140)
            at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:160)
            at java.lang.Thread.run(Thread.java:662)

3) Around Nov 23 16:30 

Clients start reporting that QPID is not responding, with the following errors:

org.apache.qpid.client.JMSAMQException: Failed to commit: Server did not 
respond in a timely fashion [error code 408: Request Timeout]
and
Error creating session: org.apache.qpid.AMQTimeoutException: Server did not 
respond in a timely fashion [error code 408: Request Timeout]

The following errors are seen in the QPID log:

2011-11-23 16:29:12,715 ERROR [SocketAcceptorIoProcessor-0.3] 
protocol.AMQPFastProtocolHandler (AMQPFastProtocolHandler.java:223) - 
IOException caught in/169.93.5.123:56138(notifier_ods), session closed 
implictly: java.io.IOException: Connection reset by peer
and 
2011-11-23 16:33:41,495 WARN  [Queue-housekeeping-notifier] server.AMQChannel 
(AMQChannel.java:1071) - IDLE TRANSACTION ALERT 
[con:1,069(notifier_ods@/169.93.5.123:55879/notifier)/ch:3]  270932 ms

The QPID Management Console is frozen and shows no updates.

QPID JConsole does not show anything wrong.


4) Nov 23 17:15:26

QPID is bounced and no further issues are observed.
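
The timeline above contains two log signatures that preceded the hang by
hours: the log4j "No space left on device" error and the BDB DaemonThread
"caught exception" line.  The sketch below scans log lines for both so that
they could be flagged before clients start timing out.  It is only an
illustration: the class name, method name and alert wording are hypothetical,
and the patterns are derived solely from the log excerpts quoted in this
report.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BrokerLogScan {

    // Matches lines such as: <DaemonThread name="Cleaner-1"/> caught exception:
    private static final Pattern DAEMON_FAILURE =
            Pattern.compile("<DaemonThread name=\"([^\"]+)\"/> caught exception");

    // Text of the IOException reported by log4j when the disk filled up.
    private static final String DISK_FULL = "No space left on device";

    /** Returns one alert string per matching log line, in input order. */
    static List<String> findAlerts(List<String> lines) {
        List<String> alerts = new ArrayList<>();
        for (String line : lines) {
            Matcher m = DAEMON_FAILURE.matcher(line);
            if (m.find()) {
                alerts.add("BDB daemon thread failed: " + m.group(1));
            } else if (line.contains(DISK_FULL)) {
                alerts.add("Disk full reported");
            }
        }
        return alerts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: BrokerLogScan <path-to-qpid-log>");
            return;
        }
        // Reads the whole log into memory; adequate for a sketch only.
        for (String alert : findAlerts(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(alert);
        }
    }
}
```

Matching on the thread name rather than on the NullPointerException itself
means that a failure in any BDB daemon thread would be reported, not only the
cleaner failure seen here.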
