RE: FW: problems starting the version 18 broker and its Crash
You are welcome.

Nitin

-----Original Message-----
From: Alan Conway [mailto:acon...@redhat.com]
Sent: Monday, September 10, 2012 10:38 AM
To: dev@qpid.apache.org
Cc: Gordon Sim; Nitin Shah
Subject: RE: FW: problems starting the version 18 broker and its Crash

Thanks a lot for tracking this down. For some reason it happens on some machines but not others. I'll put up a JIRA and a patch and fix this on trunk.
RE: FW: problems starting the version 18 broker and its Crash
Thanks a lot for tracking this down. For some reason it happens on some machines but not others. I'll put up a JIRA and a patch and fix this on trunk.
RE: FW: problems starting the version 18 broker and its Crash
Hi,

Thanks for the information. I started investigating the new release, mainly because I could not see what (if anything) we were doing wrong with it. The broker would start executing and immediately hit an assert, as shown in the GDB output below.

It asserts because it fails the test at line 38 of types.cpp in qpid/ha (assert(value < count)). This happens as a result of the call from HaBroker::initialize(), line 90 in HaBroker.cpp, where QPID_LOG is invoked. I believe the root of the problem is that the BrokerInfo constructor does not initialize the private data member BrokerStatus status, which is declared in BrokerInfo.h. BrokerStatus is defined in types.h as the following enum:

enum BrokerStatus {
    JOINING,    /// New broker, looking for primary
    CATCHUP,    /// Backup: Connected to primary, catching up on state.
    READY,      /// Backup: Caught up, ready to take over.
    RECOVERING, /// Primary: waiting for backups to connect sync
    ACTIVE,     /// Primary: actively serving clients.
    STANDALONE  /// Not part of a cluster.
};

The assert seems to fire on the second call to EnumBase::str() in types.cpp. The count was 6 and the value was some large uninitialized value. I initialized the status variable in the constructor to STANDALONE and the broker came up and worked fine.

I assume we are going to need a patch or an update for the broker to be usable.

NOTE: If I start the broker with --log-enable warning (or, for that matter, other levels such as notice), the broker comes up and works fine. But if the broker is started with no log-enable parameters, IT CRASHES. This seems pretty strange, as I thought you guys would have seen this.

I am running the broker on a CentOS 6.2 system. We have been using version 16 to date. The images are created anew and loaded on a VM, so the VM has no remnants of version 16; it is a clean version 18 load.

If I wish to continue using this release, can someone tell me what the correct process for getting a patch is? Alternatively, what patch should I apply: can I add an initialization of the status member, and what is the CORRECT enum value to set it to?

Nitin

PS: Let me know if there is any other information you need.

Thanks

(gdb) where
#0 0x758b58a5 in raise () from /lib64/libc.so.6
#1 0x758b7085 in abort () from /lib64/libc.so.6
#2 0x758aea1e in __assert_fail_base () from /lib64/libc.so.6
#3 0x758aeae0 in __assert_fail () from /lib64/libc.so.6
#4 0x750464fb in qpid::ha::EnumBase::str (this=value optimized out) at qpid/ha/types.cpp:38
#5 0x75046533 in qpid::ha::operator (o=..., e=...) at qpid/ha/types.cpp:64
#6 0x75013a5a in qpid::ha::operator (o=value optimized out, b=value optimized out) at qpid/ha/BrokerInfo.cpp:99
#7 0x75029fb4 in operator qpid::ha::BrokerInfo (this=0x6507b0) at ../include/qpid/Msg.h:63
#8 qpid::ha::HaBroker::initialize (this=0x6507b0) at qpid/ha/HaBroker.cpp:90
#9 0x775d4861 in operator() (f=value optimized out) at /usr/include/boost/bind/mem_fn_template.hpp:162
#10 operator()boost::_mfi::mf1void, qpid::Plugin, qpid::Plugin::Target, boost::_bi::list1qpid::Plugin* const (f=value optimized out) at /usr/include/boost/bind/bind.hpp:306
#11 operator()qpid::Plugin* (f=value optimized out) at /usr/include/boost/bind/bind_template.hpp:47
#12 for_each__gnu_cxx::__normal_iteratorqpid::Plugin* const*, std::vectorqpid::Plugin*, std::allocatorqpid::Plugin* , boost::_bi::bind_tvoid, boost::_mfi::mf1void, qpid::Plugin, qpid::Plugin::Target, boost::_bi::list2boost::arg1, boost::reference_wrapperqpid::Plugin::Target(f=value optimized out) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algo.h:4200
#13 qpid::(anonymous namespace)::each_pluginboost::_bi::bind_tvoid, boost::_mfi::mf1void, qpid::Plugin, qpid::Plugin::Target, boost::_bi::list2boost::arg1, boost::reference_wrapperqpid::Plugin::Target (f=value optimized out) at qpid/Plugin.cpp:73
#14 0x775d48a2 in qpid::Plugin::initializeAll (t=value optimized out) at qpid/Plugin.cpp:91

-----Original Message-----
From: Gordon Sim [mailto:g...@redhat.com]
Sent: Wednesday, September 05, 2012 12:29 PM
To: dev@qpid.apache.org
Subject: Re: FW: problems starting the version 18 broker

Can you run it with full tracing on? i.e. --log-enable trace+

What platform are you running on? Did make check pass when building? Did you build and install from source on that machine; did you uninstall the 0.16 release first or are they both installed in different locations?

The second error appears to be coming from the HA module, possibly related to the options. Could you
Re: FW: problems starting the version 18 broker and its Crash
On 09/07/2012 02:43 PM, Nitin Shah wrote:
> Hi,
>
> Thanks for the information. I started investigating the new release, mainly because I could not see what (if anything) we were doing wrong with it. The broker would start executing and immediately hit an assert, as shown in the GDB output below.
>
> It asserts because it fails the test at line 38 of types.cpp in qpid/ha (assert(value < count)). This happens as a result of the call from HaBroker::initialize(), line 90 in HaBroker.cpp, where QPID_LOG is invoked. I believe the root of the problem is that the BrokerInfo constructor does not initialize the private data member BrokerStatus status, which is declared in BrokerInfo.h.

Good analysis; thanks!

> I assume we are going to need a patch or an update for the broker to be usable.
>
> NOTE: If I start the broker with --log-enable warning (or, for that matter, other levels such as notice), the broker comes up and works fine. But if the broker is started with no log-enable parameters, IT CRASHES. This seems pretty strange, as I thought you guys would have seen this.

It actually works fine for me for repeated restarts, though if I run under valgrind the uninitialised value is reported, and of course by inspection of the code you have correctly identified a problem. I guess we have just been unlucky in not hitting the problem...

> I am running the broker on a CentOS 6.2 system. We have been using version 16 to date. The images are created anew and loaded on a VM, so the VM has no remnants of version 16; it is a clean version 18 load.
>
> If I wish to continue using this release, can someone tell me what the correct process for getting a patch is? Alternatively, what patch should I apply: can I add an initialization of the status member, and what is the CORRECT enum value to set it to?

The attached patch should take care of the crash. I believe the status is correctly set later on, just not in time for the logging statement.

Index: cpp/src/qpid/ha/BrokerInfo.cpp
===================================================================
--- cpp/src/qpid/ha/BrokerInfo.cpp (revision 1382034)
+++ cpp/src/qpid/ha/BrokerInfo.cpp (working copy)
@@ -44,7 +44,7 @@
 using framing::FieldTable;
 BrokerInfo::BrokerInfo(const std::string host, uint16_t port_, const types::Uuid id) :
-    hostName(host), port(port_), systemId(id)
+    hostName(host), port(port_), systemId(id), status(JOINING)
 {
     updateLogId();
 }
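For readers following along, the effect of the patch above (giving the enum member a defined value in the constructor's initializer list so that a range-checked name lookup cannot see garbage) can be sketched in isolation. This is only a minimal stand-alone illustration, not the qpid sources: SketchBrokerInfo, statusName, STATUS_NAMES and STATUS_COUNT are hypothetical names, the name strings are assumptions, and only the enum values and the status(JOINING) initialization come from the thread.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Enum values as quoted in the thread.
enum BrokerStatus { JOINING, CATCHUP, READY, RECOVERING, ACTIVE, STANDALONE };

// Hypothetical name table; the actual strings used in types.cpp may differ.
static const char* const STATUS_NAMES[] = {
    "joining", "catchup", "ready", "recovering", "active", "standalone"
};
static const unsigned STATUS_COUNT = sizeof(STATUS_NAMES) / sizeof(STATUS_NAMES[0]);

// Range-checked lookup, similar in spirit to the check described for
// EnumBase::str(): an out-of-range value trips the assert.
std::string statusName(unsigned value) {
    assert(value < STATUS_COUNT);
    return STATUS_NAMES[value];
}

// Hypothetical, simplified stand-in for BrokerInfo (assumption).
class SketchBrokerInfo {
  public:
    // Every member, including status, gets a defined value here. Leaving
    // status out of this list would make describe() read an indeterminate
    // value, which is the kind of bug reported in the thread.
    SketchBrokerInfo(const std::string& host, unsigned short port_)
        : hostName(host), port(port_), status(JOINING) {}

    void setStatus(BrokerStatus s) { status = s; }

    // Formats the object the way a log statement might.
    std::string describe() const {
        std::ostringstream o;
        o << hostName << ":" << port << "(" << statusName(status) << ")";
        return o.str();
    }

  private:
    std::string hostName;
    unsigned short port;
    BrokerStatus status;
};
```

With the member initialized, describing a freshly constructed object is well defined even before the status is set to its real value later on. Reading an uninitialized member is undefined behaviour, which is consistent with the crash showing up on some machines and not others.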
Re: FW: problems starting the version 18 broker
Can you run it with full tracing on? i.e. --log-enable trace+

What platform are you running on? Did make check pass when building? Did you build and install from source on that machine; did you uninstall the 0.16 release first or are they both installed in different locations?

The second error appears to be coming from the HA module, possibly related to the options. Could you be picking up the 0.16 modules when starting the 0.18 broker? (Trying with --module-dir pointing at a 0.18 modules only directory would verify that).

On 09/05/2012 04:48 PM, Nitin Shah wrote:
> Ps if I run it in foreground then I get the following message
>
> 2012-09-05 11:39:45 [Broker] notice SASL disabled: No Authentication Performed
> 2012-09-05 11:39:45 [Network] notice Listening on TCP/TCP6 port 5672
> qpidd: qpid/ha/types.cpp:38: std::string qpid::ha::EnumBase::str() const: Assertion `value < count' failed.
> Aborted
>
> Thanks
> Nitin
>
> *From:* Nitin Shah
> *Sent:* Wednesday, September 05, 2012 10:37 AM
> *To:* dev@qpid.apache.org
> *Cc:* Nitin Shah
> *Subject:* problems starting the version 18 broker
>
> Hi,
>
> I tried to start the version 18 of the C++ broker and get the following error in /var/log/messages and the broker dies. Any idea what we are doing wrong. We have been using the version 16 and that starts fine.
>
> 10:29:35 nshah_1 qpidd[1550]: 2012-09-05 10:29:35 [Broker] notice SASL disabled: No Authentication Performed
> Sep 5 10:29:35 nshah_1 qpidd[1550]: 2012-09-05 10:29:35 [Network] notice Listening on TCP/TCP6 port 5672
> Sep 5 10:29:35 nshah_1 qpidd[1549]: 2012-09-05 10:29:35 [Broker] critical Unexpected error: Cannot read from child process.
>
> Nitin
Re: FW: problems starting the version 18 broker
This problem is very reminiscent of a recurrent make check error we've seen in the ha tests on some machines.

Andrew