Hi folks,

Just tried this and can recreate it on a 64-bit laptop running CentOS 6.6 natively too. The send application never exits if a receiver is not yet ready. Can anyone else see this, or am I going mad?
Cheers,
Frank

On Thu, Jun 4, 2015 at 3:51 PM, Frank Quinn <fquinn...@gmail.com> wrote:
> Thanks Darryl,
>
> Also note this is on a VMware VM, if that makes a difference (clock types
> available, perhaps?), and it is pinning the CPU at the time.
>
> After seeing that a recompile didn't fix the problem, I nuked the newly
> compiled libraries and reverted to the EPEL RPMs:
>
> Installed Packages
> Name        : qpid-proton-c-devel
> Arch        : x86_64
> Version     : 0.8
> Release     : 1.el6
> Size        : 300 k
> Repo        : installed
> From repo   : epel
> Summary     : Development libraries for writing messaging apps with Qpid Proton
> URL         : http://qpid.apache.org/proton/
> License     : ASL 2.0
> Description : Development libraries for writing messaging apps with Qpid Proton.
>
> The issue is with send not exiting, so here is the output of running the trace on send:
>
> [fquinn@omvmfq01 messenger]$ PN_TRACE_FRM=1 ./send
> [0x19ddf10]: -> SASL
> [0x19ddf10]:0 -> @sasl-init(65) [mechanism=:ANONYMOUS, initial-response=b""]
> recv: Connection refused
> [0x19ddf10]:0 -> @open(16) [container-id=""]
> [0x19ddf10]:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="SASL header mismatch: ''"]]
> [0x19ddf10]:ERROR amqp:connection:framing-error SASL header mismatch: ''
> [0x19ddf10]: <- EOS
> CONNECTION ERROR connection aborted (remote)
>
> Stack trace of the hung application:
>
> (gdb) bt
> #0  clock_gettime (clock_id=0, tp=0x7fffac992580) at ../sysdeps/unix/clock_gettime.c:94
> #1  0x00007f5a1c00912e in pn_i_now () at /usr/src/debug/qpid-proton-0.8/proton-c/src/platform.c:31
> #2  0x00007f5a1c0079bf in pn_selector_select (selector=0x19da280, timeout=-1) at /usr/src/debug/qpid-proton-0.8/proton-c/src/posix/selector.c:161
> #3  0x00007f5a1c0052f9 in pn_messenger_tsync (messenger=0x19d9960, predicate=0x7f5a1c0020e0 <pn_messenger_sent>, timeout=<value optimized out>) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1440
> #4  0x0000000000401335 in main ()
>
> Another stack trace from the same instance:
>
> (gdb) bt
> #0  0x00007f5a1b1e9e15 in clock_gettime (clock_id=0, tp=0x7fffac992560) at ../sysdeps/unix/clock_gettime.c:94
> #1  0x00007f5a1c00912e in pn_i_now () at /usr/src/debug/qpid-proton-0.8/proton-c/src/platform.c:31
> #2  0x00007f5a1c004dce in pni_messenger_tick (messenger=0x19d9960) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1330
> #3  pn_messenger_process (messenger=0x19d9960) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1367
> #4  0x00007f5a1c0052b8 in pn_messenger_tsync (messenger=0x19d9960, predicate=0x7f5a1c0020e0 <pn_messenger_sent>, timeout=<value optimized out>) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1423
> #5  0x0000000000401335 in main ()
>
> Cheers,
> Frank
>
> On Thu, Jun 4, 2015 at 3:38 PM, Darryl L. Pierce <dpie...@redhat.com> wrote:
>> On Thu, Jun 04, 2015 at 03:12:51PM +0100, Frank Quinn wrote:
>> > I hit a strange issue today when setting up a Qpid Proton development
>> > environment on a fresh CentOS 6 VM. I first found the issue in our
>> > application, but when I went a little deeper, I realized I could recreate
>> > the issue with the Qpid Proton send and recv example applications. All you
>> > need to do is run ‘send’ on its own and the pn_messenger_send call hangs
>> > indefinitely. If you start ‘recv’ first, it works fine, but ‘send’ on its
>> > own hangs every time.
>> >
>> > This is contrary to its behaviour on my Fedora 21 laptop (latest
>> > yum-provisioned 0.8 version), where it always attempts once, logs a
>> > failure, then exits (which is what I would expect).
>> >
>> > This effectively deadlocks our application. So far, I’ve tried compiling
>> > qpid-proton-c myself (both 0.8 and 0.9.1), setting the pn_messenger_send
>> > timeout to 1 (it was previously -1), turning off iptables entirely, and
>> > disabling SELinux and rebooting, but no luck.
>> > Is this something you folks have seen before?
>>
>> Hrm, this isn't something I've heard reported before. Does it do the
>> same if you use the Python recv.py example as well?
>>
>> Also, can you do the following:
>>
>> $ PN_TRACE_FRM=1 ./recv [options]
>>
>> and share the output displayed?
>>
>> Also, is this solely with binaries you've built, or did you install
>> the RPMs from EPEL for Proton?
>>
>> --
>> Darryl L. Pierce, Sr. Software Engineer @ Red Hat, Inc.
>> Delivering value year after year.
>> Red Hat ranks #1 in value among software vendors.
>> http://www.redhat.com/promo/vendor/