RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
...

[Tim Peters]
>> asyncore gives me a headache.

[Tony Meyer]
> I think this is true for any value of me <0.5 wink>.

Not Sam Rushing -- he's asyncore's dad. He doesn't use threads, and, for that matter, doesn't use asyncore anymore either (possibly because it isn't confusing enough <wink>). But an _appropriate_ single-threaded asyncore-based wire-protocol implementation can be very easy to follow, and scale to thousands of clients even on feeble HW. It has its place. It just doesn't seem to be a place I usually go ...

[Tim, explains how ZEO's ThreadedAsync/LoopCallback.py monkey patches Python's asyncore.loop]

[Tony]
> Argh. This explains a lot. I couldn't understand why print statements in asyncore.loop didn't print, unless I renamed loop and called the renamed function (which would then have done bad things to ZEO, no doubt). Nasty indeed :)

Yup! It's hard to account for how many lost debugging hours this may have cost various people, including that LoopCallback also took over asyncore's poll() function (so debugging prints, breakpoints, etc in _that_ also got lost).

Yesterday I noticed that Python's asyncore.loop signature changed in Python 2.4 (a `count` argument was added), so LoopCallback's replacement is plain wrong for Python 2.4. The reason for replacing asyncore.poll() went away too, so I rewrote it all for ZODB 3.4.1 and 3.5 (neither released yet). It still replaces asyncore.loop, but with a wrapper that calls the _original_ asyncore.loop in the middle, so if anyone adds prints/breakpoints/etc to Python's asyncore.loop in the future, they'll still be effective despite ZEO's interference.

Sorry for the bother, but take comfort in knowing that whining about it helped get it fixed <wink>.

>> If the flow is like this:
>>
>>     asyncore mainloop invokes POP3 proxy code
>>     POP3 proxy code makes a synchronous ZEO call
>>
>> then I figure the app may well hang then: the thread running the asyncore mainloop is still running a POP3 proxy callback, waiting for a response that can never happen until the asyncore mainloop gets control back (in order to send and receive ZEO messages).

> This was definitely the problem. The easiest solution (partly because some of this work is already done <wink>), IMO, is to separate out the ZEO and asyncore-based proxy into separate asyncore maps and have two asyncore mainloop threads, one for each map. This follows Tim's comment about ZEO expecting the asyncore loop to be in a separate thread, too.

Excellent! That makes some sense. ZEO _may_ change to do a similar thing, but I need to find/make time to be sure of the details. For historical reasons, ZEO doesn't actually require an asyncore mainloop to be running, and it replaces asyncore.loop precisely so it can get notified if anyone else starts the asyncore loop. If someone does, then ZEO kinda reconfigures itself on the fly to exploit that an asyncore mainloop is running. I'm not clear on why ZEO didn't just start one itself (that's one of the details I'm still unclear about).

Anyway, there's a lot of weird internal complexity there trying to live both with and without asyncore, and to switch modes dynamically based on monkey-patching and callbacks, and I suspect it would go pretty easily to get rid of it all by having each ZEO client spin off a thread to run its own, dedicated-to-it asyncore mainloop -- effectively doing all the time what you've been provoked into doing by hand now. Then it could stop monkey-patching Python's asyncore too. Maybe that would just be a step on the way to removing asyncore dependence entirely.

> Anyway, this appears to have fixed the problem. Many thanks for the clues - you might not have understood why it was hanging, but your comments were enough to get it fixed anyway :)

Ya, that's par for the course <wink>. I'm very glad you got unstuck with relatively little pain!

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev

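A minimal sketch of the wrapper idea described in the message above: replace a module-level loop function with a wrapper that notifies an interested party and then calls the _original_ function, so prints or breakpoints added to the original still fire. This is not ZEO's LoopCallback code. It is written for modern Python 3, and because asyncore was removed from the standard library in Python 3.12, a stand-in namespace is used. The names `fake_asyncore`, `original_loop`, `install_wrapper` and `calls` are all hypothetical.

```python
import types

calls = []


def original_loop(timeout=30.0, use_poll=False, map=None, count=None):
    # Hypothetical stand-in for the real loop function; it only records
    # that it ran, which is what a debugging print would reveal.
    calls.append(("original", timeout, count))


# Hypothetical stand-in for the patched module.
fake_asyncore = types.SimpleNamespace(loop=original_loop)


def install_wrapper(module, on_start):
    """Replace module.loop with a wrapper that still runs the original."""
    wrapped = module.loop

    def loop(*args, **kwargs):
        on_start()                       # tell the interested party first
        return wrapped(*args, **kwargs)  # then defer to the original loop

    module.loop = loop
    return wrapped


install_wrapper(fake_asyncore, lambda: calls.append(("notified",)))
fake_asyncore.loop(timeout=1.0, count=3)
```

Forwarding `*args` and `**kwargs` unchanged means the wrapper does not have to track signature changes such as the added `count` argument.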
Re: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
Tim Peters wrote:
> ...
>
> [Tim Peters]
>>> asyncore gives me a headache.
>
> [Paul Boots]
>> Same here
>
> Then it's time to admit that ZEO's attempts to mix threads with asyncore give me migraine headaches <0.5 wink>.
>
>>> I wonder whether this could be the problem: Paul said he's calling ZEO from within the proxy code, but it sounds like the proxy code itself runs as a side effect of asyncore callbacks. If the flow is like this:
>>>
>>>     asyncore mainloop invokes POP3 proxy code
>>>     POP3 proxy code makes a synchronous ZEO call
>>>
>>> then I figure the app may well hang then: the thread running the asyncore mainloop is still running a POP3 proxy callback, waiting for a response that can never happen until the asyncore mainloop gets control back (in order to send and receive ZEO messages).
>
>> I think that's exactly how the Proxy runs, we use asynchat and the 'line_terminator' to trigger a callback, so it appears the code runs 'magically' at first glance.
>
> I never used asynchat (ZEO doesn't either), so can't guess whether it's contributing new complications. ZEO's control flow is murky to me too. I _think_ (but may well be wrong) that ZEO expects asyncore to be running in a different thread than the thread(s) application code using ZEO clients is(are) running in. Maybe someone who understands this better than I will jump in with a revelation.

ZEO has 2 modes, synchronous and asynchronous. In asynchronous mode, ZEO expects an asyncore main loop to be running in its own thread.

A basic constraint of asyncore (and Twisted) programming is that I/O handlers must perform their tasks very quickly. Generally, they are expected to do short tasks, moving small amounts of data around at a time. Generally, this means that they should not be calling application code directly. This is why Zope executes application logic in separate application threads and limits work done by asyncore handlers to simple I/O.

> ...
>
> IMO/IME, asyncore is a poor fit for applications where the callbacks are fancy, or even where they may just take a long time to complete (because the asyncore mainloop is unresponsive for the duration). So if I had to use asyncore (I've never done so on my own initiative <wink>), I'd gravitate toward a work-queue model anyway, where threads unfettered by asyncore worries do all the real work -- especially on Windows, which loves to run threads -- and where asyncore callbacks do as little as possible.

Agreed. This is exactly the model that Zope uses.

Jim

--
Jim Fulton           mailto:[EMAIL PROTECTED]       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

Re: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
On 6/22/05, Jim Fulton [EMAIL PROTECTED] wrote:
> Tim Peters wrote:
>> IMO/IME, asyncore is a poor fit for applications where the callbacks are fancy, or even where they may just take a long time to complete (because the asyncore mainloop is unresponsive for the duration). So if I had to use asyncore (I've never done so on my own initiative <wink>), I'd gravitate toward a work-queue model anyway, where threads unfettered by asyncore worries do all the real work -- especially on Windows, which loves to run threads -- and where asyncore callbacks do as little as possible.
>
> Agreed. This is exactly the model that Zope uses.

ZEO also runs several potentially slow operations in separate threads. I think we've wondered in the past whether the tpc vote should be another of those operations as the disk IO for a large transaction is non-trivial.

Jeremy

RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
Tim Peters wrote at 2005-6-21 14:56 -0400:
> ...
> [Dieter Maurer]
>> If you happen to run your application on Linux (and use the GDB), I can provide detailed instructions on how to find out where your code hangs...
>
> That would be helpful!

Although you already solved the main problem, I will share how I analyse hanging problems with GDB under Linux.

I attach to the hanging process with GDB:

    gdb python <process id>

"info threads" tells me about the process' threads, "thread <i>" allows me to switch into the context of the various threads. "bt" shows me the call trace in the current thread.

That's all C level information. To relate that to the Python level, I use a ".gdbinit" with the following definitions.

"ps" (for Print String) outputs the value of a Python string variable. "pfr" (for Print FRame) can be called in "eval_frame" C level stack frames and tells filename, function name and line number of the corresponding Python call.

With these commands I can reconstruct the Python call stack for the given thread (although it is a bit cumbersome). If I knew more about how Python stores its interpreter state per thread, this reconstruction would probably be even easier...

    def ps
      x/s ({PyStringObject}$arg0)->ob_sval
    end

    def pfr
      ps f->f_code->co_filename
      ps f->f_code->co_name
      #p f->f_lineno
      lineno
    end

    define lineno
      set $__co = f->f_code
      set $__lasti = f->f_lasti
      set $__sz = ((PyStringObject *)$__co->co_lnotab)->ob_size/2
      set $__p = (unsigned char *)((PyStringObject *)$__co->co_lnotab)->ob_sval
      set $__li = $__co->co_firstlineno
      set $__ad = 0
      while ($__sz-1 >= 0)
        set $__sz = $__sz - 1
        set $__ad = $__ad + *$__p
        set $__p = $__p + 1
        if ($__ad > $__lasti)
          # break -- interpreted as breakpoint
          set $__sz = -1
        end
        if ($__sz >= 0)
          set $__li = $__li + *$__p
          set $__p = $__p + 1
        end
      end
      printf "%d\n", $__li
    end

--
Dieter

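For readers on a current interpreter, the per-thread Python stacks that the GDB macros above reconstruct by hand can also be read from inside the process. The following is a hedged sketch in modern Python 3, not something that was available to the Python 2.4 setup discussed in this thread. It relies on `sys._current_frames()`, which CPython documents as an internal, special-purpose function. The helper `dump_thread_stacks`, the function `wait_forever` and the thread name "stuck" are hypothetical.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return {thread name: formatted Python stack} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, "thread-%d" % ident)
        report[name] = "".join(traceback.format_stack(frame))
    return report


def wait_forever(started, gate):
    started.set()   # signal that this thread is now inside wait_forever
    gate.wait()     # stands in for the call that never returns


started = threading.Event()
gate = threading.Event()
worker = threading.Thread(target=wait_forever, args=(started, gate), name="stuck")
worker.start()
started.wait(5)

stacks = dump_thread_stacks()   # stacks["stuck"] shows where the thread sits

gate.set()                      # release the thread so the example exits
worker.join()
```

This only helps when some thread is still able to run the dump. For a process that is wedged completely, attaching a debugger from outside, as described above, remains the fallback.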
Re: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
Hi,

On Tue, 21 Jun 2005 11:37:34 +0200, [EMAIL PROTECTED] said:
> Dear people,
>
> We have an application that makes use of a ZEO client and has other async socket code
>
> Somehow, a call to the ZEO client never returns, it just hangs and sits there.

Just a stab in the dark, but forget the async code for a minute. Have you tried just making the simplest possible connection from ZEO client to the server (across the network, I presume)? For example, at the interactive Python interpreter create a connection instance and see if you can get something from the root object.

I once had what might have been a similar problem. ZEO clients on the same machine as the server worked fine but when the client was on a different machine I could sort of connect but never got any response. Or rather, the client would hang indefinitely when I tried to connect. (Can't remember what the ZEO server log said.) I seem to recall that it had something to do with the server configuration and the fact that I was binding to 127.0.0.1 when I needed to bind to 192.whatever in order for remote machines to complete a request. Something like that.

Anyway, your problem may have nothing to do with the details of your application if something like this is happening, so you should verify that you can successfully connect at all.

> In an attempt to fix this problem we added code to use a different socket map for the proxy objects, the ZEO client uses the default asyncore socket_map. But that did NOT solve our problem.
>
> At this point we are clueless to the reason behind this behaviour and hope that anyone on this list has some ideas or similar experiences.
>
> I realize this is a very terse description of our application, when necessary I can elaborate.
>
> --
> Best regards,
> Paul Boots

RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
[EMAIL PROTECTED]
> We have an application that makes use of a ZEO client and has other async socket code that implements a POP3 proxy. The ZEO client is called (to query and store ZEO server) from within the proxy code when it runs during mail checks, so we have multiple async connections at the same time. Somehow, a call to the ZEO client never returns, it just hangs and sits there.

[Dieter Maurer]
> As long as you ensure that the asyncore mainloop is running, there should not be a problem to have more asyncore clients.
>
> If you happen to run your application on Linux (and use the GDB), I can provide detailed instructions on how to find out where your code hangs...

That would be helpful! asyncore gives me a headache.

I wonder whether this could be the problem: Paul said he's calling ZEO from within the proxy code, but it sounds like the proxy code itself runs as a side effect of asyncore callbacks. If the flow is like this:

    asyncore mainloop invokes POP3 proxy code
    POP3 proxy code makes a synchronous ZEO call

then I figure the app may well hang then: the thread running the asyncore mainloop is still running a POP3 proxy callback, waiting for a response that can never happen until the asyncore mainloop gets control back (in order to send and receive ZEO messages).

IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's SizedMessageAsyncConnection readable() and writable() methods, I bet they never trigger when the app appears to be hung (which would mean that the thread running asyncore's mainloop is in fact not getting a chance to run the asyncore loop anymore).

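The suspected hang can be reproduced in miniature without asyncore or ZEO. The sketch below is a toy model in modern Python 3, assuming only that one thread runs all callbacks one at a time: a callback queues the work that would deliver its reply on the same loop and then blocks waiting for that reply. A timeout is used so the example terminates instead of hanging. `ToyLoop`, `demo`, `proxy_callback` and `deliver_reply` are hypothetical names.

```python
import collections
import threading


class ToyLoop:
    """A single-threaded dispatcher: exactly one callback runs at a time."""

    def __init__(self):
        self.pending = collections.deque()

    def call_soon(self, callback):
        self.pending.append(callback)

    def run(self):
        while self.pending:
            self.pending.popleft()()


def demo(timeout=0.2):
    loop = ToyLoop()
    reply_arrived = threading.Event()
    outcome = {}

    def deliver_reply():
        # Stands in for the loop reading the server's answer off a socket.
        reply_arrived.set()

    def proxy_callback():
        # The reply can only be delivered by the same loop...
        loop.call_soon(deliver_reply)
        # ...but this callback blocks, so the loop cannot run deliver_reply
        # until the wait gives up and the callback returns.
        outcome["got_reply_in_time"] = reply_arrived.wait(timeout)

    loop.call_soon(proxy_callback)
    loop.run()
    outcome["got_reply_after_return"] = reply_arrived.is_set()
    return outcome
```

In this model the reply shows up only after the blocking callback has given up and returned control to the loop. With no timeout, the wait would never end.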
RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
Hi All,

A combined reply ...

[Stephen Masterman]
> Anyway, your problem may have nothing to do with the details of your application if something like this is happening, so you should verify that you can successfully connect at all.

Good advice, and yes, we did confirm the connection upfront in a simple unittest, we test the connection, storing and retrieving from the ZODB with ZEO client/server. The client and server run on different machines. (XP and linux respectively)

[Dieter Maurer]
>> As long as you ensure that the asyncore mainloop is running, there should not be a problem to have more asyncore clients.
>>
>> If you happen to run your application on Linux (and use the GDB), I can provide detailed instructions on how to find out where your code hangs...
>
> That would be helpful!

As said above, the client runs on Windows XP, the server on Linux (debian).

[Tim Peters]
> asyncore gives me a headache.

Same here

> I wonder whether this could be the problem: Paul said he's calling ZEO from within the proxy code, but it sounds like the proxy code itself runs as a side effect of asyncore callbacks. If the flow is like this:
>
>     asyncore mainloop invokes POP3 proxy code
>     POP3 proxy code makes a synchronous ZEO call

I think that's exactly how the Proxy runs, we use asynchat and the 'line_terminator' to trigger a callback, so it appears the code runs 'magically' at first glance.

> then I figure the app may well hang then: the thread running the asyncore mainloop is still running a POP3 proxy callback, waiting for a response that can never happen until the asyncore mainloop gets control back (in order to send and receive ZEO messages).
>
> IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's SizedMessageAsyncConnection readable() and writable() methods, I bet they never trigger when the app appears to be hung (which would mean that the thread running asyncore's mainloop is in fact not getting a chance to run the asyncore loop anymore).

You're right - I added the suggested comments as first line in the readable() and writable() methods; they never appear.

Could I do synchronous calls to the ZEO server? Another option to bypass the problem is to use Zope/XMLRPC to do what we want, I assume that will not suffer the same problem.

Your opinion would be much appreciated,

Thanks

--
Kind regards,
Paul

RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
...

[Tim Peters]
>> asyncore gives me a headache.

[Paul Boots]
> Same here

Then it's time to admit that ZEO's attempts to mix threads with asyncore give me migraine headaches <0.5 wink>.

>> I wonder whether this could be the problem: Paul said he's calling ZEO from within the proxy code, but it sounds like the proxy code itself runs as a side effect of asyncore callbacks. If the flow is like this:
>>
>>     asyncore mainloop invokes POP3 proxy code
>>     POP3 proxy code makes a synchronous ZEO call
>>
>> then I figure the app may well hang then: the thread running the asyncore mainloop is still running a POP3 proxy callback, waiting for a response that can never happen until the asyncore mainloop gets control back (in order to send and receive ZEO messages).

> I think that's exactly how the Proxy runs, we use asynchat and the 'line_terminator' to trigger a callback, so it appears the code runs 'magically' at first glance.

I never used asynchat (ZEO doesn't either), so can't guess whether it's contributing new complications. ZEO's control flow is murky to me too. I _think_ (but may well be wrong) that ZEO expects asyncore to be running in a different thread than the thread(s) application code using ZEO clients is(are) running in. Maybe someone who understands this better than I will jump in with a revelation.

>> IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's SizedMessageAsyncConnection readable() and writable() methods, I bet they never trigger when the app appears to be hung (which would mean that the thread running asyncore's mainloop is in fact not getting a chance to run the asyncore loop anymore).

> You're right - I added the suggested comments as first line in the readable() and writable() methods; they never appear.

The asyncore loop calls readable() and writable() on every object registered with asyncore, each time around the asyncore loop. So if those aren't getting called, the asyncore loop isn't running -- or it is running but the timeout on asyncore's select.select() call is so large that you didn't wait long enough to get output (I think that one's unlikely, but ...).

BTW, something that might help get more clues: ZEO does a nasty thing to asyncore. In ZEO's ThreadedAsync/LoopCallback.py, it reaches into Python's asyncore module and _replaces_ asyncore.loop with its own loop function. That shouldn't change the functionality of asyncore, but it means that if you, e.g., put print statements or debugger breakpoints in Python's asyncore loop, they'll never trigger. If you're working at that level, you need to put them in LoopCallback.py's functions instead.

> Could I do synchronous calls to the ZEO server? Another option to bypass the problem is to use Zope/XMLRPC to do what we want, I assume that will not suffer the same problem.
>
> Your opinion would be much appreciated,

*Someone's* might be -- like maybe Dieter's <wink>. I'm sorry, but I don't understand your application well enough to suggest something useful. I'm not familiar with Zope/XMLRPC either. For that matter, I don't really understand why your app is hanging now, although I seemed to get lucky with at least part of my guess last time.

The only vague idea I have is along the lines of spinning off another thread to talk with ZEO, and have the POP3 proxy code queue up work requests for the ZEO thread to process (e.g., via an instance of Python's Queue.Queue, which is designed for this purpose). That's based on the guess that there's no problem with the POP3 proxy and ZEO just sharing asyncore, the problem is in trying to invoke ZEO _from_ an asyncore callback.

IMO/IME, asyncore is a poor fit for applications where the callbacks are fancy, or even where they may just take a long time to complete (because the asyncore mainloop is unresponsive for the duration). So if I had to use asyncore (I've never done so on my own initiative <wink>), I'd gravitate toward a work-queue model anyway, where threads unfettered by asyncore worries do all the real work -- especially on Windows, which loves to run threads -- and where asyncore callbacks do as little as possible.

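A minimal sketch of the work-queue model suggested above, in modern Python 3, where the `Queue` module mentioned in the message is named `queue`. The loop callback only enqueues a request and returns; a dedicated thread does the slow work. Here `stored.append` stands in for the blocking storage call, and `start_worker`, `on_message` and the thread name "zeo-worker" are hypothetical.

```python
import queue
import threading


def start_worker(handle):
    """Run handle(item) for every queued item on a dedicated thread."""
    requests = queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:   # sentinel: shut the worker down
                break
            handle(item)       # the slow, blocking work happens here

    thread = threading.Thread(target=worker, name="zeo-worker")
    thread.start()
    return requests, thread


stored = []
# stored.append is a stand-in for the synchronous call to the storage server.
requests, thread = start_worker(stored.append)


def on_message(line):
    """What a loop callback would do: hand the work off and return at once."""
    requests.put(line)


for line in ("msg 1", "msg 2", "msg 3"):
    on_message(line)

requests.put(None)   # ask the worker to stop once the queue is drained
thread.join()
```

With a single worker thread the requests are processed in the order they were queued, so the callback never blocks the loop and ordering is still preserved.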
RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code
[Tim Peters]
> asyncore gives me a headache.

I think this is true for any value of me <0.5 wink>.

[Tim, later]
> BTW, something that might help get more clues: ZEO does a nasty thing to asyncore. In ZEO's ThreadedAsync/LoopCallback.py, it reaches into Python's asyncore module and _replaces_ asyncore.loop with its own loop function. That shouldn't change the functionality of asyncore, but it means that if you, e.g., put print statements or debugger breakpoints in Python's asyncore loop, they'll never trigger. If you're working at that level, you need to put them in LoopCallback.py's functions instead.

Argh. This explains a lot. I couldn't understand why print statements in asyncore.loop didn't print, unless I renamed loop and called the renamed function (which would then have done bad things to ZEO, no doubt). Nasty indeed :)

> If the flow is like this:
>
>     asyncore mainloop invokes POP3 proxy code
>     POP3 proxy code makes a synchronous ZEO call
>
> then I figure the app may well hang then: the thread running the asyncore mainloop is still running a POP3 proxy callback, waiting for a response that can never happen until the asyncore mainloop gets control back (in order to send and receive ZEO messages).

This was definitely the problem. The easiest solution (partly because some of this work is already done <wink>), IMO, is to separate out the ZEO and asyncore-based proxy into separate asyncore maps and have two asyncore mainloop threads, one for each map. This follows Tim's comment about ZEO expecting the asyncore loop to be in a separate thread, too.

Anyway, this appears to have fixed the problem. Many thanks for the clues - you might not have understood why it was hanging, but your comments were enough to get it fixed anyway :)

=Tony.Meyer

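The fix described above, one dispatcher per thread, can be modelled without asyncore. The sketch below is a toy stand-in in modern Python 3, not the actual code from this application: each `LoopThread` plays the role of one socket map with its own mainloop thread, so a callback that blocks on one loop can still be answered by the other. `LoopThread`, `demo`, "proxy-loop" and "storage-loop" are hypothetical names.

```python
import queue
import threading


class LoopThread:
    """A dispatcher with its own callback queue and its own thread."""

    def __init__(self, name):
        self.calls = queue.Queue()
        self.thread = threading.Thread(target=self._run, name=name)
        self.thread.start()

    def call_soon(self, callback):
        self.calls.put(callback)

    def stop(self):
        self.calls.put(None)   # sentinel: leave the loop
        self.thread.join()

    def _run(self):
        while True:
            callback = self.calls.get()
            if callback is None:
                break
            callback()


def demo(timeout=2.0):
    proxy_loop = LoopThread("proxy-loop")
    storage_loop = LoopThread("storage-loop")
    reply_arrived = threading.Event()
    finished = threading.Event()
    outcome = {}

    def proxy_callback():
        # The reply is delivered by the *other* loop, which keeps running
        # while this callback blocks.
        storage_loop.call_soon(reply_arrived.set)
        outcome["got_reply_in_time"] = reply_arrived.wait(timeout)
        finished.set()

    proxy_loop.call_soon(proxy_callback)
    finished.wait(timeout * 2)
    proxy_loop.stop()
    storage_loop.stop()
    return outcome
```

The proxy loop is still unresponsive to its own clients while the callback waits, which is why the work-queue approach discussed elsewhere in this thread may be preferable when the blocking calls are slow.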