Re: Deadlock in pn_messenger_stop? (C Qpid Library)

2013-06-24 Thread atarutin
Hi, Frank.

I just ran into a similar problem. In my case, I call pn_messenger_stop and
the thread hangs. My application has only one thread (main), so this looks
like a bug.





Re: Deadlock in pn_messenger_stop? (C Qpid Library)

2013-06-22 Thread Frank Quinn
Hi folks,

Can you confirm that this is an issue? Should I raise a ticket for this?

Cheers,
Frank


RE: Deadlock in pn_messenger_stop? (C Qpid Library)

2013-06-14 Thread Frank Quinn
Hmm - perhaps I spoke too soon...

This looked like it worked in that little test application, but when I plugged 
it into my actual application, it worked some of the time and not at other 
times. I then took your advice and introduced a join in the test application, 
and as soon as I did so, stopping the subscriber on the subscriber thread 
stopped working (perhaps it never worked properly and this just exposed it).

I have attached the latest source for it with the subscriber living entirely on 
its own thread, and with a join now introduced in main(). It deadlocks with the 
same backtrace as originally reported when trying to run pn_messenger_stop.

Cheers,
Frank

-Original Message-
From: Frank Quinn
Sent: 14 June 2013 13:49
To: proton@qpid.apache.org
Subject: RE: Deadlock in pn_messenger_stop? (C Qpid Library)

Thanks Ted,

I did try inserting a sleep of 1s between the send and the stop (I suspected 
the same thing you did), but the behaviour persisted so I don't think the 
problem was the time for the message to land.

However, your suggestion to stop it in the subscriber thread did seem to work 
(still unsure why, but I'll not question it for now), thanks!

Cheers,
Frank

-Original Message-
From: Ted Ross [mailto:tr...@redhat.com]
Sent: 14 June 2013 13:09
To: proton@qpid.apache.org
Subject: Re: Deadlock in pn_messenger_stop? (C Qpid Library)

I would suggest that you not stop the subscriber messenger in the main thread.  
Rather, stop it in the subscriber thread right before it exits.  Alternatively 
(preferably), you should pthread_join the thread in main before stopping the 
messengers.
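
The ordering suggested above (join the subscriber thread, then stop) can be sketched with plain pthreads. This is a minimal sketch only: `endpoint_t`, `subscriber_main`, `stop_endpoint` and `run_join_then_stop` are hypothetical stand-ins, and no Proton calls are used, so it illustrates the ordering rather than a fix for the reported hang.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a messenger; not a Proton type. */
typedef struct {
    pthread_mutex_t lock;
    int processed;   /* messages handled by the subscriber thread */
    bool stopped;    /* set once stop_endpoint() has run */
} endpoint_t;

static void *subscriber_main(void *arg)
{
    endpoint_t *ep = arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&ep->lock);
        /* The endpoint must still be live while this thread uses it. */
        assert(!ep->stopped);
        ep->processed++;
        pthread_mutex_unlock(&ep->lock);
    }
    return NULL;
}

static void stop_endpoint(endpoint_t *ep)
{
    pthread_mutex_lock(&ep->lock);
    ep->stopped = true;
    pthread_mutex_unlock(&ep->lock);
}

/* Returns the number of messages processed, or -1 on error. */
int run_join_then_stop(void)
{
    endpoint_t ep = { .processed = 0, .stopped = false };
    pthread_t tid;

    if (pthread_mutex_init(&ep.lock, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, subscriber_main, &ep) != 0) {
        pthread_mutex_destroy(&ep.lock);
        return -1;
    }

    /* Join first: after this returns, nothing else touches ep. */
    if (pthread_join(tid, NULL) != 0)
        return -1;
    stop_endpoint(&ep);

    int processed = ep.processed;
    pthread_mutex_destroy(&ep.lock);
    return processed;
}
```

Calling `run_join_then_stop()` from a `main` (built with `-pthread`) never trips the assertion, because the stop happens only after the join. Swapping the two calls would let the stop race with the loop.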

It looks to me that you've got a race condition where main is stopping mSub at 
the same time the thread is processing messages on mSub.

Keep in mind that pn_messenger_send is only going to block until the message 
has been pushed to the socket.  It will not wait for the message to be received 
and processed by the other thread.
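
The same effect can be seen with an ordinary socket pair, independent of Proton. In this sketch (the function name `bytes_pending_after_send` is made up for illustration), `send()` returns as soon as the bytes sit in the socket buffer, and a non-consuming peek from the other end shows they are still waiting to be read.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the bytes still queued for the receiver right after send()
 * has returned, or -1 on error. */
int bytes_pending_after_send(void)
{
    static const char msg[] = "destroy";   /* 8 bytes including the NUL */
    char buf[sizeof msg];
    int fds[2];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    /* send() returns once the bytes are in the socket buffer. */
    if (send(fds[0], msg, sizeof msg, 0) == (ssize_t)sizeof msg) {
        /* Peek from the receiving end without consuming anything. */
        ssize_t pending = recv(fds[1], buf, sizeof buf,
                               MSG_PEEK | MSG_DONTWAIT);
        if (pending > 0) {
            /* Only now does the "receiver" actually take the message. */
            ssize_t got = recv(fds[1], buf, sizeof buf, 0);
            assert(got == pending);
        }
        result = (int)pending;
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A successful send therefore says nothing about whether the other side has handled the message yet, so any shutdown step that depends on the receiver's progress needs its own synchronization.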

-Ted


On 06/14/2013 07:10 AM, Frank Quinn wrote:
 Hi Folks,

 See attached code: I'm encountering a deadlock when I try to stop
 messengers. The general workflow is:

 1. Create pub and sub Messengers
 2. Start the Messengers
 3. Thread sub off onto its own thread, as recv is a blocking call
 4. Publish a round trip from the pub messenger to the sub messenger with a
    destroy subject (recv is uninterruptible at the moment, so this is our
    only way to interrupt it)
 5. Stop the messengers
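
The "destroy message" step of this workflow can be sketched without Proton by letting a pipe play the role of the blocking recv. Everything named here (`MSG_DATA`, `MSG_DESTROY`, `sub_ctx_t`, `sub_loop`, `run_sentinel_shutdown`) is hypothetical and exists only for illustration; the attached reproducer is not shown in the thread and may be structured differently.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

enum { MSG_DATA = 'm', MSG_DESTROY = 'D' };   /* hypothetical one-byte messages */

typedef struct {
    int read_fd;   /* subscriber's end of the pipe */
    int handled;   /* data messages seen before the destroy message */
} sub_ctx_t;

static void *sub_loop(void *arg)
{
    sub_ctx_t *ctx = arg;
    char c;
    /* read() blocks until a byte arrives, much like a blocking recv. */
    while (read(ctx->read_fd, &c, 1) == 1) {
        if (c == MSG_DESTROY)
            break;             /* the sentinel ends the blocking loop */
        ctx->handled++;
    }
    return NULL;
}

/* Returns the number of data messages the subscriber handled, or -1. */
int run_sentinel_shutdown(int data_messages)
{
    const char data = MSG_DATA, destroy = MSG_DESTROY;
    int fds[2];
    int ok = 1;
    pthread_t tid;

    assert(data_messages >= 0);
    if (pipe(fds) != 0)
        return -1;
    sub_ctx_t ctx = { .read_fd = fds[0], .handled = 0 };
    if (pthread_create(&tid, NULL, sub_loop, &ctx) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (int i = 0; i < data_messages; i++)
        if (write(fds[1], &data, 1) != 1)
            ok = 0;
    if (write(fds[1], &destroy, 1) != 1)    /* ask the subscriber to exit */
        ok = 0;
    close(fds[1]);   /* fallback: read() sees EOF if the sentinel was lost */

    pthread_join(tid, NULL);                /* wait before tearing down */
    close(fds[0]);
    return ok ? ctx.handled : -1;
}
```

For example, `run_sentinel_shutdown(5)` returns 5: the subscriber handles the five data bytes, sees the destroy byte, and exits, after which the caller joins it and releases the pipe.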

 When I try to stop the messengers, the application deadlocks with the
 following backtrace (only one thread is running at this point, as the
 subscriber thread has already exited):

 Thread 1 (Thread 0x7f38181a4840 (LWP 6688)):
 #0  0x003518ce99ad in poll () at ../sysdeps/unix/syscall-template.S:81
 #1  0x00309c226a1c in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:46
 #2  pn_driver_wait_2 (d=d@entry=0x1a81140, timeout=<optimized out>, timeout@entry=-1)
     at /usr/src/debug/qpid-proton-0.4/proton-c/src/posix/driver.c:752
 #3  0x00309c226c42 in pn_driver_wait (d=0x1a81140, timeout=timeout@entry=-1)
     at /usr/src/debug/qpid-proton-0.4/proton-c/src/posix/driver.c:807
 #4  0x00309c2242d3 in pn_messenger_tsync (messenger=0x1a81050, predicate=0x309c222d80 <pn_messenger_stopped>, timeout=<optimized out>)
     at /usr/src/debug/qpid-proton-0.4/proton-c/src/messenger.c:623
 #5  0x00400ffb in main () at qpid_deadlock_repro.c:123
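
The trace ends in poll() with a timeout of -1, which for poll(2) means "wait indefinitely": if no event ever arrives on the watched descriptors, the call never returns. The sketch below shows the contrast on an idle pipe with a finite timeout. It is plain POSIX code, not Proton code, and `poll_idle_pipe` is a made-up name.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Polls the read end of a pipe that nobody writes to.
 * Returns poll()'s result (0 means "timed out"), or -1 on error. */
int poll_idle_pipe(int timeout_ms)
{
    int fds[2];

    /* A negative timeout would block forever here, as in the trace. */
    assert(timeout_ms >= 0);
    if (pipe(fds) != 0)
        return -1;

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

With a finite timeout, the caller gets control back and can decide what to do; with -1, it depends entirely on an event arriving.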

 Is this the correct workflow for this or am I missing a flush or
 unlock step somewhere along the way?

 Cheers,
 Frank
