Re: problem with qpid heartbeats when sending msgs with size over 1KB

Tom M Fri, 13 Jan 2012 06:08:19 -0800

Hi Gordon,
Sorry I didn't get back to you yesterday; I needed a chance to start a
separate broker, one that I could stop, on one of our systems.

We had seen these different results on our deployed system, and I wanted to
make sure that this test client behaved the same way.
(On our deployed system, most of the failover testing had been done with a
kill on the broker, and we would see the client detect the lost connection.
But later, when a host died, we found this other condition, where the client
did not detect it.)

So, I ran the test:
* started a separate broker (see below)
* started the same test client, as before, with the larger msg size
* did a kill on the broker:    kill -STOP <pid>
* saw the same results as you did: the client detected the lost connection
in about 2x the heartbeat interval
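
A minimal sketch of the suspend step used in the run above, assuming a Linux
host with /proc. A plain `sleep` process stands in for the broker here, and
the variable names are hypothetical. The point it illustrates: after
kill -STOP the process is still alive and still owns its sockets, but it no
longer runs, so it cannot answer heartbeats until it gets kill -CONT.

```shell
#!/bin/sh
# Stand-in for the broker: any long-running process works for this illustration.
sleep 300 &
pid=$!

# Suspend the process. It is not terminated, only stopped.
kill -STOP "$pid"
sleep 1
# On Linux, the State field in /proc/<pid>/status is "T" for a stopped process.
state_stopped=$(awk '/^State:/ {print $2}' "/proc/$pid/status")
echo "after STOP: $state_stopped"

# Resume the process, as one would to let the broker continue.
kill -CONT "$pid"
sleep 1
state_resumed=$(awk '/^State:/ {print $2}' "/proc/$pid/status")
echo "after CONT: $state_resumed"

# Clean up the stand-in process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

With a real broker, `pid` would be the qpidd process id instead of the
stand-in.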

Then, to verify my earlier results, I ran the exact same test, except this
time I pulled the network cable:
* started a separate broker (same as previous run above)
* started the same test client, as before, with the larger msg size
* pulled network cable
* saw the same results as in my previous tests: the client continued to
"send" well past the point where the heartbeat timeout should have fired
(showing the same trace messages), until, about 80 seconds later, it locked up.

Note that this result (the client sending the larger msg and failing to
detect the lost connection) also occurs if the broker host abruptly dies
(which is how we first detected the problem).

Note: I used the following to start the broker:
/usr/sbin/qpidd -p 18102 --log-to-syslog no \
    --log-to-file /export/hps/dda/qpidd_x/log/qpidd_x.log \
    --worker-threads 3 \
    --data-dir /export/hps/dda/qpidd_x/data \
    --pid-dir /export/hps/dda/qpidd_x/pid-dir \
    --auth no --config /dev/null
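
Where the cable cannot be pulled physically (remote boxes, for example),
silently dropping the broker's traffic with firewall rules may approximate
it. The sketch below is only an assumption about how such a test could be
set up, not something used in this thread: `run`, `block_broker`,
`unblock_broker` and `DRY_RUN` are hypothetical names, and the port is the
one from the qpidd command above. It is meant to run on the broker host. By
default it only prints the iptables commands; executing them needs root.
Whether a DROP rule reproduces the cable-pull behaviour exactly is not
verified here.

```shell
#!/bin/sh
# Hypothetical helper, not part of qpid.
PORT=${PORT:-18102}      # broker port from the qpidd command line
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands; 0 = execute them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Silently drop all TCP traffic to and from the broker port.
block_broker() {
    run iptables -I INPUT  -p tcp --dport "$PORT" -j DROP
    run iptables -I OUTPUT -p tcp --sport "$PORT" -j DROP
}

# Remove the two rules again (the equivalent of reconnecting the cable).
unblock_broker() {
    run iptables -D INPUT  -p tcp --dport "$PORT" -j DROP
    run iptables -D OUTPUT -p tcp --sport "$PORT" -j DROP
}

block_broker
# ... watch the client for several heartbeat intervals here ...
unblock_broker
```

Unlike kill -STOP, this leaves the broker process running but cuts it off
from the client without any TCP reset, which is closer to what a dead host
or a pulled cable looks like from the client side.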

So, please let me know if you can run the test again, this time pulling the
network cable. (I'm pulling the cable between the broker and the switch, but
I'm pretty sure I've seen the same when pulling it between the switch and the
client.)
Thanks,
Tom

On Thu, Jan 12, 2012 at 10:05 AM, Gordon Sim wrote:

> On 01/07/2012 01:26 AM, Tom M wrote:
>
>> I’ve created a simpler test client (based on our deployed application)
>> to test this problem.
>>
>
> I ran your test client against the same package versions you listed for
> qpidd and the client lib. I didn't pull a cable (as I was testing on remote
> boxes) but instead issued a kill -STOP against the broker, which should be
> similar from the perspective of the client (i.e. it will miss two
> heartbeats and abort the connection).
>
> However in all my attempts it did correctly detect the closed connection
> and issue an exception. The output was of the following form:
>
>>  01_12 15:51:34  TstConn:  sending msg: 45
>>  01_12 15:51:34  TstConn:      msg sent
>>  01_12 15:51:35  TstConn:  sending msg: 46
>>  01_12 15:51:35  TstConn:      msg sent
>> 2012-01-12 10:51:35 warning Connection [42787 mrg11:5672] closed
>>
>>    01_12 15:51:36 TstConn:  connection_.isOpen() detected lost connection
>>       note:  detected with isOpen() call,  not an exception....
>>
>>  01_12 15:51:36  TstConn:  sending msg: 47
>>  01_12 15:51:36TstConn: qpid::Exception: Connection [42787 mrg11:5672]
>> closed
>>
>>
>>   ...waiting on user (allow user to reconnect cable,  so can attempt to
>> close connection on broker)
>>
>>  To continue shutdown, enter:  1
>>
>
> This is the same whichever size I choose. Would you mind verifying if you
> see your problem with a kill -STOP in place of the network cable removal?
> If not, I'll try and get a setup to test on where I can pull a physical
> cable; if you can then it confirms there is something else different in our
> test setup.
>
>
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>
