RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-23 Thread Tim Peters
...

[Tim Peters]
 asyncore gives me a headache.

[Tony Meyer]
 I think this is true for any value of "me" <0.5 wink>.

Not Sam Rushing -- he's asyncore's dad.  He doesn't use threads, and, for
that matter, doesn't use asyncore anymore either (possibly because it isn't
confusing enough <wink>).  But an _appropriate_ single-threaded
asyncore-based wire-protocol implementation can be very easy to follow, and
scale to thousands of clients even on feeble HW.  It has its place.  It just
doesn't seem to be a place I usually go ...

[Tim, explains how ZEO's ThreadedAsync/LoopCallback.py monkey patches
 Python's asyncore.loop]

[Tony]
 Argh.  This explains a lot.  I couldn't understand why print statements
 in asyncore.loop didn't print, unless I renamed loop and called the
 renamed function (which would then have done bad things to ZEO, no
 doubt).  Nasty indeed :)

Yup!  It's hard to account for how many lost debugging hours this may have
cost various people, including that LoopCallback also took over asyncore's
poll() function (so debugging prints, breakpoints, etc in _that_ also got
lost).  Yesterday I noticed that Python's asyncore.loop signature changed
in Python 2.4 (a `count` argument was added), so LoopCallback's replacement
is plain wrong for Python 2.4.  The reason for replacing asyncore.poll()
went away too, so I rewrote it all for ZODB 3.4.1 and 3.5 (neither released
yet).  It still replaces asyncore.loop, but with a wrapper that calls the
_original_ asyncore.loop in the middle, so if anyone adds
prints/breakpoints/etc to Python's asyncore.loop in the future, they'll
still be effective despite ZEO's interference.
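
A minimal sketch of that wrapper idea, not ZEO's actual code: `fake_asyncore`, `original_loop`, and `install_wrapper` are hypothetical stand-ins so the example runs without asyncore. The wrapper forwards whatever arguments it receives, so an added argument such as `count` does not break it, and the original function still runs in the middle.

```python
import types

# Hypothetical stand-in for a module that exposes loop(); not the real asyncore.
fake_asyncore = types.SimpleNamespace()
calls = []

def original_loop(timeout=30.0, use_poll=False, map=None, count=None):
    # A print or breakpoint placed here still triggers, because the wrapper
    # below calls this function instead of reimplementing it.
    calls.append(("original loop", timeout, count))

fake_asyncore.loop = original_loop

def install_wrapper(module, before, after):
    """Replace module.loop with a wrapper that calls the original in the middle."""
    original = module.loop

    def wrapped_loop(*args, **kwargs):
        before()                              # e.g. notify that a mainloop started
        try:
            return original(*args, **kwargs)  # pass every argument through untouched
        finally:
            after()                           # e.g. notify that the mainloop exited

    module.loop = wrapped_loop
    return original

install_wrapper(
    fake_asyncore,
    before=lambda: calls.append("loop starting"),
    after=lambda: calls.append("loop finished"),
)
fake_asyncore.loop(timeout=1.0, count=3)
```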

Sorry for the bother, but take comfort in knowing that whining about it
helped get it fixed <wink>.

 If the flow is like this:

   asyncore mainloop invokes POP3 proxy code
   POP3 proxy code makes a synchronous ZEO call

 then I figure the app may well hang then:  the thread running the
 asyncore mainloop is still running a POP3 proxy callback, waiting for a
 response that can never happen until the asyncore mainloop gets control
 back (in order to send & receive ZEO messages).

 This was definitely the problem.  The easiest solution (partly because
 some of this work is already done <wink>), IMO, is to separate out the
 ZEO and asyncore-based proxy into separate asyncore maps and have two
 asyncore mainloop threads, one for each map.  This follows Tim's comment
 about ZEO expecting the asyncore loop to be in a separate thread, too.

Excellent!  That makes some sense.  ZEO _may_ change to do a similar thing,
but I need to find/make time to be sure of the details.  For historical
reasons, ZEO doesn't actually require an asyncore mainloop to be running,
and it replaces asyncore.loop precisely so it can get notified if anyone
else starts the asyncore loop.  If someone does, then ZEO kinda
reconfigures itself on the fly to exploit that an asyncore mainloop is
running.  I'm not clear on why ZEO didn't just start one itself (that's one
of the details I'm still unclear about).

Anyway, there's a lot of weird internal complexity there trying to live both
with and without asyncore, and to switch modes dynamically based on
monkey-patching and callbacks, and I suspect it would be pretty easy to
get rid of it all by having each ZEO client spin off a thread to run its
own, dedicated-to-it asyncore mainloop -- effectively doing all the time
what you've been provoked into doing by hand now.

Then it could stop monkey-patching Python's asyncore too.  Maybe that would
just be a step on the way to removing asyncore dependence entirely.
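
A minimal sketch of the dedicated-thread idea, assuming nothing about ZEO's internals: `LoopThread` and `synchronous_zeo_call` are hypothetical names, and a queue of handlers stands in for a socket map. Because the "ZEO" loop runs in its own thread, a blocking call made from another loop's callback can still be answered.

```python
import queue
import threading

class LoopThread(threading.Thread):
    """Hypothetical stand-in for a thread that runs one dedicated mainloop."""

    def __init__(self):
        super().__init__(daemon=True)
        self.events = queue.Queue()   # stands in for this loop's own socket map

    def run(self):
        while True:
            handler = self.events.get()
            if handler is None:       # sentinel: shut the loop down
                break
            handler()

    def stop(self):
        self.events.put(None)
        self.join()

zeo_loop = LoopThread()
zeo_loop.start()

def synchronous_zeo_call(timeout=2.0):
    """Block until the dedicated loop has handled the 'reply'."""
    reply_ready = threading.Event()
    zeo_loop.events.put(reply_ready.set)   # the reply arrives via the other loop
    return reply_ready.wait(timeout)

def pop3_callback():
    # Runs in the proxy's own loop thread (here: the main thread).
    return synchronous_zeo_call()

got_reply = pop3_callback()
zeo_loop.stop()
```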

 Anyway, this appears to have fixed the problem.  Many thanks for the
 clues - you might not have understood why it was hanging, but your
 comments were enough to get it fixed anyway :)

Ya, that's par for the course <wink>.  I'm very glad you got unstuck with
relatively little pain!

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-22 Thread Jim Fulton

Tim Peters wrote:
 ...

 [Tim Peters]
  asyncore gives me a headache.

 [Paul Boots]
  Same here

 Then it's time to admit that ZEO's attempts to mix threads with asyncore
 give me migraine headaches <0.5 wink>.

  I wonder whether this could be the problem:  Paul said he's calling
  ZEO from within the proxy code, but it sounds like the proxy code
  itself runs as a side effect of asyncore callbacks.  If the flow is
  like this:

    asyncore mainloop invokes POP3 proxy code
    POP3 proxy code makes a synchronous ZEO call

  then I figure the app may well hang then:  the thread running the
  asyncore mainloop is still running a POP3 proxy callback, waiting for a
  response that can never happen until the asyncore mainloop gets control
  back (in order to send & receive ZEO messages).

  I think that's exactly how the Proxy runs, we use asynchat and the
  'line_terminator' to trigger a callback, so it appears the code runs
  'magically' at first glance.

 I never used asynchat (& ZEO doesn't either), so can't guess whether it's
 contributing new complications.  ZEO's control flow is murky to me too.  I
 _think_ (but may well be wrong) that ZEO expects asyncore to be running in a
 different thread than the thread(s) application code using ZEO clients
 is(are) running in.  Maybe someone who understands this better than I will
 jump in with a revelation.

ZEO has 2 modes, synchronous and asynchronous.

In asynchronous mode, ZEO expects an asyncore main loop to be running
in its own thread.

A basic constraint of asyncore (and Twisted) programming is that
I/O handlers must perform their tasks very quickly.  Generally,
they are expected to do short tasks, moving small amounts of data
around at a time.  Generally, this means that they should not
be calling application code directly.  This is why Zope executes
application logic in separate application threads and limits work
done by asyncore handlers to simple I/O.

 ...

 IMO/IME, asyncore is a poor fit for applications where the callbacks are
 fancy, or even where they may just take a long time to complete (because
 the asyncore mainloop is unresponsive for the duration).  So if I had to use
 asyncore (I've never done so on my own initiative <wink>), I'd gravitate
 toward a work-queue model anyway, where threads unfettered by asyncore
 worries do all the real work -- especially on Windows, which loves to run
 threads -- and where asyncore callbacks do as little as possible.


Agreed.

This is exactly the model that Zope uses.

Jim

--
Jim Fulton           mailto:[EMAIL PROTECTED]       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


Re: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-22 Thread Jeremy Hylton
On 6/22/05, Jim Fulton [EMAIL PROTECTED] wrote:
 Tim Peters wrote:
  IMO/IME, asyncore is a poor fit for applications where the callbacks are
  fancy, or even where they may just take a long time to complete (because
  the asyncore mainloop is unresponsive for the duration).  So if I had to use
  asyncore (I've never done so on my own initiative <wink>), I'd gravitate
  toward a work-queue model anyway, where threads unfettered by asyncore
  worries do all the real work -- especially on Windows, which loves to run
  threads -- and where asyncore callbacks do as little as possible.
 
 Agreed.
 
 This is exactly the model that Zope uses.

ZEO also runs several potentially slow operations in separate threads.
I think we've wondered in the past whether the tpc vote should be
another of those operations, as the disk I/O for a large transaction is
non-trivial.

Jeremy


RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-22 Thread Dieter Maurer
Tim Peters wrote at 2005-6-21 14:56 -0400:
 ...
[Dieter Maurer]
 If you happen to run your application on Linux (and use the GDB), I
 can provide detailed instructions on how to find out where your code
 hangs...

That would be helpful!

Although you already solved the main problem, I will share
how I analyse hanging problems with GDB under Linux.

I attach to the hanging process with GDB:

 gdb python <process id>

"info threads" tells me about the process' threads,
"thread <i>" allows me to switch into the context
of the various threads. "bt" shows me the call trace in the current
thread. That's all C level information.

To relate that to the Python level, I use a .gdbinit
with the following definitions.
"ps" (for Print String) outputs the value of a Python
string variable.
"pfr" (for Print FRame) can be called in "eval_frame"
C level stack frames and tells filename, function name and line number
of the corresponding Python call.

With these commands I can reconstruct the Python call stack
for the given thread (although it is a bit cumbersome).

If I knew more about how Python stores its interpreter
state per thread, this reconstruction would probably be
even easier...


# Print the contents of a Python (2.x) string object.
define ps
x/s ((PyStringObject *)$arg0)->ob_sval
end

# Print filename, function name and line number for the frame "f".
define pfr
ps f->f_code->co_filename
ps f->f_code->co_name
#p f->f_lineno
lineno
end

# Walk co_lnotab to translate f_lasti into a source line number.
define lineno
set $__co = f->f_code
set $__lasti = f->f_lasti
set $__sz = ((PyStringObject *)$__co->co_lnotab)->ob_size/2
set $__p = (unsigned char *)((PyStringObject *)$__co->co_lnotab)->ob_sval
set $__li = $__co->co_firstlineno
set $__ad = 0
while ($__sz-1 >= 0)
  set $__sz = $__sz - 1
  set $__ad = $__ad + *$__p
  set $__p = $__p + 1
  if ($__ad > $__lasti)
    # break -- interpreted as breakpoint
    set $__sz = -1
  end
  if ($__sz >= 0)
    set $__li = $__li + *$__p
    set $__p = $__p + 1
  end
end
printf "%d\n", $__li
end
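
As a Python-level complement to the gdb macros above, the interpreter can report every thread's current call stack itself through `sys._current_frames()`. This is only a sketch: it has to run inside the hanging process (for example from a helper thread), whereas gdb attaches from outside, and `format_all_stacks` and `stuck_in_callback` are hypothetical names.

```python
import sys
import threading
import traceback

def format_all_stacks():
    """Return the current Python call stack of every thread as text."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (names.get(ident, "?"), ident))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)

# Demonstration: a thread that blocks the way a hung callback would.
started = threading.Event()
release = threading.Event()

def stuck_in_callback():
    started.set()
    release.wait()      # blocks until the demonstration releases it

hung = threading.Thread(target=stuck_in_callback, name="mainloop")
hung.start()
started.wait()
report = format_all_stacks()   # names the function each thread is sitting in
release.set()
hung.join()
```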


-- 
Dieter


Re: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-21 Thread Stephen Masterman
Hi, 
On Tue, 21 Jun 2005 11:37:34 +0200, [EMAIL PROTECTED] said:
 Dear people,
 
 
 We have an application that makes use of a ZEO client and has other async
 socket code

 Somehow, a call to the ZEO client never returns, it just hangs and sits
 there.

Just a stab in the dark, but forget the async code for a minute. Have
you tried just making the simplest possible connection from ZEO client
to the server (across the network, I presume)? For example, at the
interactive Python interpreter create a connection instance and see if
you can get something from the root object.

I once had what might have been a similar problem. ZEO clients on the
same machine as the server worked fine but when the client was on a
different machine I could sort of connect but never got any response. Or
rather, the client would hang indefinitely when I tried to connect.
(Can't remember what the ZEO server log said.) I seem to recall that it
had something to do with the server configuration and the fact that I
was binding to 127.0.0.1 when I needed to bind to 192.whatever in order
for remote machines to complete a request. Something like that.

Anyway, your problem may have nothing to do with the details of your
application if something like this is happening, so you should verify
that you can successfully connect at all.
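
A minimal sketch of such a first check, assuming only that the server listens on a TCP host and port: `can_connect` is a hypothetical helper, and it verifies TCP reachability only, not the ZEO protocol. A server bound to 127.0.0.1 accepts loopback connections only, so running the check from the remote client machine would expose that kind of misconfiguration.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demonstration against a throwaway local listener (not a ZEO server).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
reachable = can_connect("127.0.0.1", port)
listener.close()
```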

 
 In an attempt to fix this problem we added code to use a different socket
 map for the proxy objects,
 the ZEO client uses the default asyncore socket_map. But that did NOT
 solve our problem.
 
 At this point we are clueless to the reason behind this behaviour and
 hope that anyone on this
 list has some ideas or similar experiences.
 
 I realize this is a very terse description of our application, when
 necessary I can elaborate.
 
 
 -- 
 Best regards,
 Paul Boots
 
 


RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-21 Thread Tim Peters
[EMAIL PROTECTED]
 We have an application that makes use of a ZEO client and has other
 async socket code that implements a POP3 proxy. The ZEO client is
 called (to query and store ZEO server) from within the proxy code when
 it runs during mail checks, so we have multiple async connections at the
 same time.

 Somehow, a call to the ZEO client never returns, it just hangs and sits
 there.

[Dieter Maurer]
 As long as you ensure that the asyncore mainloop is running, there
 should not be a problem to have more asyncore clients.

 If you happen to run your application on Linux (and use the GDB), I
 can provide detailed instructions on how to find out where your code
 hangs...

That would be helpful!

asyncore gives me a headache.  I wonder whether this could be the problem:
Paul said he's calling ZEO from within the proxy code, but it sounds like
the proxy code itself runs as a side effect of asyncore callbacks.  If the
flow is like this:

   asyncore mainloop invokes POP3 proxy code
   POP3 proxy code makes a synchronous ZEO call

then I figure the app may well hang then:  the thread running the asyncore
mainloop is still running a POP3 proxy callback, waiting for a response that
can never happen until the asyncore mainloop gets control back (in order to
send & receive ZEO messages).

IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's
SizedMessageAsyncConnection readable() and writable() methods, I bet they
never trigger when the app appears to be hung (which would mean that the
thread running asyncore's mainloop is in fact not getting a chance to run
the asyncore loop anymore).
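
The hang described above can be reproduced in miniature without asyncore or ZEO. This is a sketch under stated assumptions: `events`, `pop3_callback`, `zeo_reply_handler`, and `mainloop_once` are hypothetical stand-ins, and a short timeout replaces the indefinite wait so the script terminates.

```python
import queue
import threading

# Hypothetical model: one queue of pending handlers stands in for the socket map.
events = queue.Queue()
reply_ready = threading.Event()

def zeo_reply_handler():
    """Stands in for the mainloop receiving the ZEO server's response."""
    reply_ready.set()

def pop3_callback(timeout=0.2):
    """Stands in for proxy code that calls ZEO synchronously."""
    events.put(zeo_reply_handler)      # the reply can only arrive via the loop
    return reply_ready.wait(timeout)   # but the loop is stuck right here

def mainloop_once():
    """Run exactly one pending handler, as a single-threaded loop would."""
    handler = events.get_nowait()
    return handler()

events.put(pop3_callback)
got_reply = mainloop_once()            # times out; a real call would block forever
mainloop_once()                        # only now does the loop deliver the reply
delivered_late = reply_ready.is_set()
```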



RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-21 Thread zodb
Hi All,

A combined reply ...

[Stephen Masterman]
Anyway, your problem may have nothing to do with the details of your
application if something like this is happening, so you should verify
that you can successfully connect at all.

Good advice, and yes, we did confirm the connection up front in a simple
unittest: we test the connection, and storing to and retrieving from the
ZODB with a ZEO client/server.
The client and server run on different machines (XP and Linux, respectively).

[Dieter Maurer]
  As long as you ensure that the asyncore mainloop is running, there
 should not be a problem to have more asyncore clients.

 If you happen to run your application on Linux (and use the GDB), I
 can provide detailed instructions on how to find out where your code
 hangs...

That would be helpful!

As said above, the client runs on Windows XP, the server on Linux (Debian).

[Tim Peters]
asyncore gives me a headache.

Same here

  I wonder whether this could be the problem:
Paul said he's calling ZEO from within the proxy code, but it sounds like
the proxy code itself runs as a side effect of asyncore callbacks.  If the
flow is like this:

   asyncore mainloop invokes POP3 proxy code
   POP3 proxy code makes a synchronous ZEO call

I think that's exactly how the Proxy runs, we use asynchat and the
'line_terminator' to trigger a callback, so it appears the code runs
'magically' at first glance.

then I figure the app may well hang then:  the thread running the asyncore
mainloop is still running a POP3 proxy callback, waiting for a response that
can never happen until the asyncore mainloop gets control back (in order to
send & receive ZEO messages).

IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's
SizedMessageAsyncConnection readable() and writable() methods, I bet they
never trigger when the app appears to be hung (which would mean that the
thread running asyncore's mainloop is in fact not getting a chance to run
the asyncore loop anymore).

You're right - I added the suggested print statements as the first line in the
readable() and writable() methods, and they never appear.

Could I do synchronous calls to the ZEO server?
Another option to bypass the problem is to use Zope/XMLRPC to do what we want;
I assume that will not suffer the same problem.

Your opinion would be much appreciated,

Thanks

-- 
Vriendelijke groet,
Paul


RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-21 Thread Tim Peters
...

[Tim Peters]
 asyncore gives me a headache.

[Paul Boots]
 Same here

Then it's time to admit that ZEO's attempts to mix threads with asyncore
give me migraine headaches <0.5 wink>.

 I wonder whether this could be the problem:  Paul said he's calling
 ZEO from within the proxy code, but it sounds like the proxy code
 itself runs as a side effect of asyncore callbacks.  If the flow is
 like this:

   asyncore mainloop invokes POP3 proxy code
   POP3 proxy code makes a synchronous ZEO call

 then I figure the app may well hang then:  the thread running the
 asyncore mainloop is still running a POP3 proxy callback, waiting for a
 response that can never happen until the asyncore mainloop gets control
 back (in order to send & receive ZEO messages).

 I think that's exactly how the Proxy runs, we use asynchat and the
 'line_terminator' to trigger a callback, so it appears the code runs
 'magically' at first glance.

I never used asynchat (& ZEO doesn't either), so can't guess whether it's
contributing new complications.  ZEO's control flow is murky to me too.  I
_think_ (but may well be wrong) that ZEO expects asyncore to be running in a
different thread than the thread(s) application code using ZEO clients
is(are) running in.  Maybe someone who understands this better than I will
jump in with a revelation.

 IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's
 SizedMessageAsyncConnection readable() and writable() methods, I bet
 they never trigger when the app appears to be hung (which would mean
 that the thread running asyncore's mainloop is in fact not getting a
 chance to run the asyncore loop anymore).

 You're right - I added the suggested print statements as the first line in
 the readable() and writable() methods, and they never appear.

The asyncore loop calls readable() and writable() on every object registered
with asyncore, each time around the asyncore loop.  So if those aren't
getting called, the asyncore loop isn't running -- or it is running but the
timeout on asyncore's select.select() call is so large that you didn't wait
long enough to get output (I think that one's unlikely, but ...).
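
A stripped-down sketch of one pass of such a loop, to show why silent readable()/writable() methods imply the loop is not turning. This is not asyncore's code: `Channel` and `poll_once` are hypothetical, and write handling is left out.

```python
import select
import socket

class Channel:
    """Hypothetical minimal channel; real asyncore dispatchers do much more."""

    def __init__(self, sock):
        self.sock = sock
        self.polled = 0          # how often the loop asked readable()/writable()
        self.received = b""

    def readable(self):
        self.polled += 1
        return True

    def writable(self):
        self.polled += 1
        return False             # nothing to send in this sketch

    def handle_read(self):
        self.received += self.sock.recv(1024)

def poll_once(channel_map, timeout=1.0):
    """One pass: ask every channel what it wants, then select() and dispatch."""
    readers = [fd for fd, ch in channel_map.items() if ch.readable()]
    writers = [fd for fd, ch in channel_map.items() if ch.writable()]
    if not readers and not writers:
        return
    ready_r, _ready_w, _ = select.select(readers, writers, [], timeout)
    for fd in ready_r:
        channel_map[fd].handle_read()

ours, peer = socket.socketpair()
channel = Channel(ours)
channel_map = {ours.fileno(): channel}
peer.sendall(b"ping")
poll_once(channel_map)
ours.close()
peer.close()
```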

BTW, something that might help get more clues:  ZEO does a nasty thing to
asyncore.  In ZEO's ThreadedAsync/LoopCallback.py, it reaches into Python's
asyncore module and _replaces_ asyncore.loop with its own loop function.
That shouldn't change the functionality of asyncore, but it means that if
you, e.g., put print statements or debugger breakpoints in Python's asyncore
loop, they'll never trigger.  If you're working at that level, you need to
put them in LoopCallback.py's functions instead.

 Could I do synchronous calls to the ZEO server?  Another option to
 bypass the problem is to use Zope/XMLRPC to do what we want; I assume
 that will not suffer the same problem.

 Your opinion would be much appreciated,

*Someone's* might be -- like maybe Dieter's <wink>.  I'm sorry, but I don't
understand your application well enough to suggest something useful.  I'm
not familiar with Zope/XMLRPC either.  For that matter, I don't really
understand why your app is hanging now, although I seemed to get lucky with
at least part of my guess last time.  The only vague idea I have is along
the lines of spinning off another thread to talk with ZEO, and have the POP3
proxy code queue up work requests for the ZEO thread to process (e.g., via
an instance of Python's Queue.Queue, which is designed for this purpose).
That's based on the guess that there's no problem with the POP3 proxy and
ZEO just sharing asyncore, the problem is in trying to invoke ZEO _from_
an asyncore callback.

IMO/IME, asyncore is a poor fit for applications where the callbacks are
fancy, or even where they may just take a long time to complete (because
the asyncore mainloop is unresponsive for the duration).  So if I had to use
asyncore (I've never done so on my own initiative <wink>), I'd gravitate
toward a work-queue model anyway, where threads unfettered by asyncore
worries do all the real work -- especially on Windows, which loves to run
threads -- and where asyncore callbacks do as little as possible.
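
A minimal sketch of that work-queue model, assuming Python 3, where the `Queue` module mentioned above is named `queue`. `handle_line` and `worker` are hypothetical names, and upper-casing a string stands in for the slow, blocking ZEO request.

```python
import queue
import threading

work = queue.Queue()    # jobs handed over by the callbacks
results = []            # filled in by the worker thread only

def worker():
    """Do the slow work outside the mainloop thread."""
    while True:
        job = work.get()
        if job is None:            # sentinel: no more work
            work.task_done()
            break
        results.append(job())      # e.g. a blocking ZEO request
        work.task_done()

worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

def handle_line(line):
    """Callback: do as little as possible, just enqueue and return."""
    work.put(lambda: line.upper())

handle_line("user paul")
handle_line("stat")
work.join()             # wait until both jobs have been processed
work.put(None)
worker_thread.join()
```

With a single worker the jobs complete in the order they were queued, so the callbacks never wait on ZEO and the mainloop stays responsive.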



RE: [ZODB-Dev] ZEO client hangs when combined with other asyncore code

2005-06-21 Thread Tony Meyer
[Tim Peters]
 asyncore gives me a headache.

I think this is true for any value of "me" <0.5 wink>.

[Tim, later]
 BTW, something that might help get more clues:  ZEO does a
 nasty thing to asyncore.  In ZEO's 
 ThreadedAsync/LoopCallback.py, it reaches into Python's 
 asyncore module and _replaces_ asyncore.loop with its own 
 loop function. That shouldn't change the functionality of 
 asyncore, but it means that if you, e.g., put print 
 statements or debugger breakpoints in Python's asyncore loop, 
 they'll never trigger.  If you're working at that level, you 
 need to put them in LoopCallback.py's functions instead.

Argh.  This explains a lot.  I couldn't understand why print statements in
asyncore.loop didn't print, unless I renamed loop and called the renamed
function (which would then have done bad things to ZEO, no doubt).  Nasty
indeed :)

 If the flow is like this:

   asyncore mainloop invokes POP3 proxy code
   POP3 proxy code makes a synchronous ZEO call
 then I figure the app may well hang then:  the thread running the
 asyncore mainloop is still running a POP3 proxy callback, 
 waiting for a response that can never happen until the asyncore
 mainloop gets control back (in order to send & receive ZEO messages).

This was definitely the problem.  The easiest solution (partly because some
of this work is already done <wink>), IMO, is to separate out the ZEO and
asyncore-based proxy into separate asyncore maps and have two asyncore
mainloop threads, one for each map.  This follows Tim's comment about ZEO
expecting the asyncore loop to be in a separate thread, too.

Anyway, this appears to have fixed the problem.  Many thanks for the clues -
you might not have understood why it was hanging, but your comments were
enough to get it fixed anyway :)

=Tony.Meyer
