Hi folks,

Just tried this and can recreate it on a 64-bit laptop running CentOS 6.6 natively too. The send application never exits if a receiver is not yet ready. Can anyone else see this, or am I going mad?
Cheers,
Frank

On Thu, Jun 4, 2015 at 3:51 PM, Frank Quinn <fquinn...@gmail.com> wrote:
> Thanks Darryl,
>
> Also note this is on a VMware VM, if that makes a difference (clock types
> available, perhaps?), and it is pinning the CPU at the time.
>
> After seeing that a recompile didn't fix the problem, I nuked the newly
> compiled libraries and reverted to the EPEL RPMs:
>
> Installed Packages
> Name        : qpid-proton-c-devel
> Arch        : x86_64
> Version     : 0.8
> Release     : 1.el6
> Size        : 300 k
> Repo        : installed
> From repo   : epel
> Summary     : Development libraries for writing messaging apps with Qpid Proton
> URL         : http://qpid.apache.org/proton/
> License     : ASL 2.0
> Description : Development libraries for writing messaging apps with Qpid Proton.
>
> The issue is with send not exiting, so here is the output of running the trace on send:
>
> [fquinn@omvmfq01 messenger]$ PN_TRACE_FRM=1 ./send
> [0x19ddf10]: -> SASL
> [0x19ddf10]:0 -> @sasl-init(65) [mechanism=:ANONYMOUS, initial-response=b""]
> recv: Connection refused
> [0x19ddf10]:0 -> @open(16) [container-id=""]
> [0x19ddf10]:0 -> @close(24) [error=@error(29) [condition=:"amqp:connection:framing-error", description="SASL header mismatch: ''"]]
> [0x19ddf10]:ERROR amqp:connection:framing-error SASL header mismatch: ''
> [0x19ddf10]: <- EOS
> CONNECTION ERROR connection aborted (remote)
>
> Stack trace of the hung application:
>
> (gdb) bt
> #0  clock_gettime (clock_id=0, tp=0x7fffac992580) at ../sysdeps/unix/clock_gettime.c:94
> #1  0x00007f5a1c00912e in pn_i_now () at /usr/src/debug/qpid-proton-0.8/proton-c/src/platform.c:31
> #2  0x00007f5a1c0079bf in pn_selector_select (selector=0x19da280, timeout=-1) at /usr/src/debug/qpid-proton-0.8/proton-c/src/posix/selector.c:161
> #3  0x00007f5a1c0052f9 in pn_messenger_tsync (messenger=0x19d9960, predicate=0x7f5a1c0020e0 <pn_messenger_sent>, timeout=<value optimized out>) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1440
> #4  0x0000000000401335 in main ()
>
> Another stack trace from the same instance:
>
> (gdb) bt
> #0  0x00007f5a1b1e9e15 in clock_gettime (clock_id=0, tp=0x7fffac992560) at ../sysdeps/unix/clock_gettime.c:94
> #1  0x00007f5a1c00912e in pn_i_now () at /usr/src/debug/qpid-proton-0.8/proton-c/src/platform.c:31
> #2  0x00007f5a1c004dce in pni_messenger_tick (messenger=0x19d9960) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1330
> #3  pn_messenger_process (messenger=0x19d9960) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1367
> #4  0x00007f5a1c0052b8 in pn_messenger_tsync (messenger=0x19d9960, predicate=0x7f5a1c0020e0 <pn_messenger_sent>, timeout=<value optimized out>) at /usr/src/debug/qpid-proton-0.8/proton-c/src/messenger/messenger.c:1423
> #5  0x0000000000401335 in main ()
>
> Cheers,
> Frank
>
> On Thu, Jun 4, 2015 at 3:38 PM, Darryl L. Pierce <dpie...@redhat.com> wrote:
>> On Thu, Jun 04, 2015 at 03:12:51PM +0100, Frank Quinn wrote:
>> > I hit a strange issue today when setting up a Qpid Proton development
>> > environment on a fresh CentOS 6 VM. I first found the issue in our
>> > application, but when I went a little deeper, I realized I could recreate
>> > the issue with the Qpid Proton send and recv example applications. All you
>> > need to do is run ‘send’ on its own and the pn_messenger_send call hangs
>> > indefinitely. If you start ‘recv’ first, it works fine, but ‘send’ on its
>> > own hangs every time.
>> >
>> > This is contrary to its behaviour on my Fedora 21 laptop (latest
>> > yum-provisioned 0.8 version), where it always attempts once, logs a
>> > failure, then exits (which is what I would expect).
>> >
>> > This effectively deadlocks our application. So far, I’ve tried compiling
>> > qpid-proton-c myself (both 0.8 and 0.9.1), setting the pn_messenger_send
>> > timeout to 1 (it was previously -1), turning off iptables entirely, and
>> > disabling SELinux and rebooting, but no luck.
>> > Is this something you folks have seen before?
>>
>> Hrm, this isn't something I've heard reported before. Does it do the
>> same if you use the Python recv.py example as well?
>>
>> Also, can you do the following:
>>
>> $ PN_TRACE_FRM=1 ./recv [options]
>>
>> and share the output displayed?
>>
>> Also, is this solely with binaries you've built, or did you install
>> the RPMs from EPEL for Proton?
>>
>> --
>> Darryl L. Pierce, Sr. Software Engineer @ Red Hat, Inc.
>> Delivering value year after year.
>> Red Hat ranks #1 in value among software vendors.
>> http://www.redhat.com/promo/vendor/