On 11/14/2011 10:28 PM, Brandon Pedersen wrote:
On Mon, Nov 14, 2011 at 4:01 AM, Gordon Sim <[email protected]> wrote:
On 11/13/2011 07:45 PM, Brandon Pedersen wrote:
I have a durable federation link set up. When I start the broker that
initiates the connection, I sometimes get an error about the
connection receiving an invalid frame, after which the qpid daemon
dies. Is this expected behavior?
No, that is a bug. What version are you using and are you able to isolate a
reproducible test case?
Running version 0.12. I think I have narrowed it down to flakiness in
the link between the two brokers. I am doing this over a cellular
connection, and this seems to happen only when the cell connection has
just been brought up and has perhaps not been fully initialized yet. It
is somewhat tricky to reproduce, but what happens is I fire up the
broker (which has a durable route/link) and then fire up the cellular
connection. Sometimes the connection will succeed; other times it will
fail and the broker then segfaults.
Could you raise a JIRA for this? Sounds like a dangling pointer to a
failed session or similar, possibly related to a heartbeat-induced
connection abort, possibly related to push routes specifically.
I'll try and get some time to reproduce and find a fix, but am tied up
right at the minute.
It seems that if there is an error trying to connect, it should just
retry. Here is what I see in the log:
Nov 13 13:31:28 mtcdp daemon.err qpidd[1579]: 2011-11-13 13:31:28
error Connection local:59780-remote:5672 closed by error: Connection
not yet open, invalid frame received.(501)
Any idea how to fix this?
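For illustration only, the retry behaviour asked about here could look roughly like the sketch below. This is not the broker's actual link-recovery code: `connect_with_retry`, the `connect` callable, and the delay values are hypothetical names and numbers chosen for the example.

```python
import time


def connect_with_retry(connect, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially.

    `connect` is any zero-argument callable that raises OSError on a
    failed attempt (hypothetical; it stands in for opening the link).
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
```

The point of the sketch is that a failed attempt is treated as an ordinary, recoverable event rather than something that takes the whole process down.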
Can you enable core dumps on the broker and get a backtrace?
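One possible way to do this, sketched in Python under some assumptions: the process that launches the broker raises its own soft core-file limit (child processes inherit it), and gdb is then run in batch mode against the resulting core. The helper names and the paths in the usage note are hypothetical; where the core file actually lands depends on the system's configuration.

```python
import resource


def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    If the hard limit itself is 0, core dumps stay disabled and the
    limit has to be raised by an administrator instead.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def gdb_backtrace_command(binary, core_file):
    """Build a gdb command line that prints all thread backtraces and exits."""
    return ["gdb", "--batch", "-ex", "thread apply all bt", binary, core_file]
```

For example, calling `allow_core_dumps()` before starting the broker with `subprocess.Popen(["qpidd"])`, and later running `subprocess.run(gdb_backtrace_command("/usr/sbin/qpidd", "core"))`, would print a trace like the one requested, assuming those are the right binary and core paths on the machine in question.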
I enabled core dumps and got a couple; both have the following trace:
Core was generated by `qpidd'.
Program terminated with signal 11, Segmentation fault.
#0 0x403fdc98 in qpid::SessionState::disableReceiverTracking() ()
from /usr/lib/libqpidcommon.so.2
(gdb) backtrace
#0 0x403fdc98 in qpid::SessionState::disableReceiverTracking() ()
from /usr/lib/libqpidcommon.so.2
#1 0x4010e8a8 in
qpid::broker::Bridge::create(qpid::broker::Connection&) () from
/usr/lib/libqpidbroker.so.2
#2 0x4017ed18 in qpid::broker::Link::ioThreadProcessing() () from
/usr/lib/libqpidbroker.so.2
#3 0x40180920 in ?? () from /usr/lib/libqpidbroker.so.2
Cannot access memory at address 0x2d74c0f8
I am also a little suspicious of the log. The error message actually
appears twice, one right after the other, just before the broker dies.
So it looks like:
Nov 14 15:43:35 mtcdp daemon.err qpidd[6790]: 2011-11-14 15:43:35
error Connection local:55764-remote:5672 closed by error: Connection
not yet open, invalid frame received.(501)
Nov 14 15:43:35 mtcdp daemon.err qpidd[6790]: 2011-11-14 15:43:35
error Connection local:55764-remote:5672 closed by error: Connection
not yet open, invalid frame received.(501)
I'm not sure if that helps or not....
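As a rough aid for spotting this pattern in a longer log, the sketch below reports runs of identical adjacent lines. It assumes each log entry occupies a single line in the real log file (the entries above are only wrapped by the mail client); `consecutive_repeats` is a hypothetical helper name.

```python
from itertools import groupby


def consecutive_repeats(lines):
    """Return (line, count) for each run of identical adjacent lines, count > 1."""
    stripped = (raw.rstrip("\n") for raw in lines)
    repeats = []
    for line, run in groupby(stripped):
        count = sum(1 for _ in run)
        if count > 1:
            repeats.append((line, count))
    return repeats
```

Passing an open log file (or any iterable of lines) returns each doubled entry together with how many times it repeated, which may be useful detail to attach to the bug report.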
Yes, add all of that to the JIRA; it will certainly help.