Alan Conway created QPID-4293:
---------------------------------

             Summary: HA broker crashes on startup
                 Key: QPID-4293
                 URL: https://issues.apache.org/jira/browse/QPID-4293
             Project: Qpid
          Issue Type: Bug
          Components: C++ Clustering
    Affects Versions: 0.18
            Reporter: Alan Conway
            Assignee: Alan Conway
             Fix For: 0.19


From Nitin Shah:

I tried to start version 0.18 of the C++ broker and got the following error 
in /var/log/messages, after which the broker dies. Any idea what we are doing 
wrong? We have been using version 0.16, and that starts fine.

10:29:35 nshah_1 qpidd[1550]: 2012-09-05 10:29:35 [Broker] notice SASL disabled: No Authentication Performed
Sep  5 10:29:35 nshah_1 qpidd[1550]: 2012-09-05 10:29:35 [Network] notice Listening on TCP/TCP6 port 5672
Sep  5 10:29:35 nshah_1 qpidd[1549]: 2012-09-05 10:29:35 [Broker] critical Unexpected error: Cannot read from child process.

I started investigating the new release, mainly because I could not see what 
(if anything) we were doing wrong. The broker would start executing and 
immediately hit an assertion, as seen in the output I generated by running it 
under GDB. The assertion that fails is in qpid/ha/types.cpp at line 38 
(assert(value < count)). It is triggered by the call from 
HaBroker::initialize() at line 90 of HaBroker.cpp, where QPID_LOG is invoked.

I believe the root of the problem is that the BrokerInfo class constructor does 
not initialize the private data member "BrokerStatus status", which is declared 
in BrokerInfo.h. BrokerStatus is defined in types.h as the following enum:

enum BrokerStatus {
    JOINING,                    ///< New broker, looking for primary
    CATCHUP,                    ///< Backup: Connected to primary, catching up on state.
    READY,                      ///< Backup: Caught up, ready to take over.
    RECOVERING,                 ///< Primary: waiting for backups to connect & sync
    ACTIVE,                     ///< Primary: actively serving clients.
    STANDALONE                  ///< Not part of a cluster.
};

It seems like the assert happens on the second call to EnumBase::str() in 
types.cpp. The count was 6 and the value was some large uninitialized value.
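
To illustrate why a garbage value trips that check, here is a minimal sketch of 
a name lookup guarded by a bounds assertion. It is not the actual Qpid code: 
STATUS_NAMES, STATUS_COUNT, inRange() and statusName() are hypothetical names, 
and the strings in the table are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same enumerators as the BrokerStatus enum quoted in the report.
enum BrokerStatus { JOINING, CATCHUP, READY, RECOVERING, ACTIVE, STANDALONE };

// Hypothetical name table indexed by enum value; the order must match the enum.
static const char* const STATUS_NAMES[] = {
    "joining", "catchup", "ready", "recovering", "active", "standalone"
};
static const std::size_t STATUS_COUNT =
    sizeof(STATUS_NAMES) / sizeof(STATUS_NAMES[0]);

// True if value is a usable index into STATUS_NAMES.
inline bool inRange(unsigned value) { return value < STATUS_COUNT; }

// Name lookup guarded by the same kind of check as assert(value < count).
// An uninitialized status holding a large value fails the assertion here
// instead of reading past the end of the table.
inline std::string statusName(unsigned value) {
    assert(inRange(value));
    return STATUS_NAMES[value];
}
```

With six enumerators the only valid indices are 0 through 5, so any large 
leftover value in an uninitialized member fails the check as soon as something 
(for example a log statement) asks for the status name.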

I initialized the status variable in the constructor to STANDALONE and the 
broker came up and worked fine.
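
A minimal sketch of that workaround, assuming a much simplified class: 
BrokerInfoSketch and its accessors are hypothetical stand-ins, not the real 
qpid::ha::BrokerInfo interface, and STANDALONE is simply the default chosen in 
the report above.

```cpp
// Same enumerators as the BrokerStatus enum quoted in the report.
enum BrokerStatus { JOINING, CATCHUP, READY, RECOVERING, ACTIVE, STANDALONE };

// Simplified, hypothetical stand-in for the class declared in BrokerInfo.h.
class BrokerInfoSketch {
  public:
    // Initialize status in the member initializer list so it never holds
    // an indeterminate value, even before the broker assigns a real status.
    BrokerInfoSketch() : status(STANDALONE) {}

    BrokerStatus getStatus() const { return status; }
    void setStatus(BrokerStatus s) { status = s; }

  private:
    BrokerStatus status;
};
```

A default-constructed object now reports STANDALONE until setStatus() is 
called, so converting the status to a string for logging stays within the 
valid range.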

