Re: [Zope-dev] Runaway processes
Andreas Jung wrote:
>> About 6 months to 1 year ago, I have read reports about experiments on using Zope directly from mod_python in the Zope mailing list.
>
> Since Zope provides a WSGI interface you can run Zope within almost all WSGI-enabled environments.

Am I right in thinking this is exactly the kind of use case for Repoze?

cheers,

Chris

--
Simplistix - Content Management, Zope Python Consulting - http://www.simplistix.co.uk
___ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Runaway processes
Dieter Maurer wrote at 2007-12-7 19:53 +0100:
> Stephan Richter wrote at 2007-12-5 17:47 -0500:
>> ... requiring killing long running request processing ...
>
> After a little more thought, I think that the most promising and efficient way would be to let requests be handled by a persistent ZEO client activated from mod_python (or a similar technique).
>
> This would mean that the request is effectively handled by an Apache child process and this process could be killed if necessary.

About 6 months to 1 year ago, I read reports in the Zope mailing list about experiments on using Zope directly from mod_python.

--
Dieter
Re: [Zope-dev] Runaway processes
--On 8 December 2007 19:41:29 +0100 Dieter Maurer wrote:

> Dieter Maurer wrote at 2007-12-7 19:53 +0100:
>> Stephan Richter wrote at 2007-12-5 17:47 -0500:
>>> ... requiring killing long running request processing ...
>>
>> After a little more thought, I think that the most promising and efficient way would be to let requests be handled by a persistent ZEO client activated from mod_python (or a similar technique).
>>
>> This would mean that the request is effectively handled by an Apache child process and this process could be killed if necessary.
>
> About 6 months to 1 year ago, I have read reports about experiments on using Zope directly from mod_python in the Zope mailing list.

Since Zope provides a WSGI interface you can run Zope within almost all WSGI-enabled environments.

-aj
Re: [Zope-dev] Runaway processes
Stephan Richter wrote at 2007-12-5 17:47 -0500:
> ...
> On Unix-like systems, we can use `os.fork()`. The advantage of this approach is that I can use OS system calls to kill the process. However, ZODB database storages cannot be shared between processes. Nikolay Kim has done some preliminary experiments and found that `db.open()` locks the system (for both, `FileStorage` and `ZeoClientStorage`). I have not verified these results or tried to figure out why it is hanging, but I can see the problem for `FileStorage`. Are there any known side-effects on what happens, if I fork after the connection has been made?

We are using this kind of architecture to generate our newsletters:

A scheduling process periodically checks the ZODB for new work (newsletters to be published). It does this via ZCatalog queries.

If the scheduler finds a newsletter to publish, it forks and lets the child produce the newsletter.

I had to do some tricks to get it working -- and new ZODB versions tend to require more tricks.

My code currently looks like this:

    pid = fork()
    if not pid:
        # the line below is necessary to prevent a child from
        # stealing messages destined for the parent
        clearParentZODBState()
        config.setup()  # reopen storage in order not to confuse the ZEO protocol

clearParentZODBState looks like this (for ZODB 3.4):

    def clearParentZODBState():
        '''called in the forked child to clear the parent's ZODB state
        in order to prevent the child from intercepting messages
        destined for the parent.

        Almost surely dependent on the ZODB version.
        '''
        # necessary for ZODB 3.2
        from asyncore import socket_map
        socket_map.clear()  # get rid of any handlers for the parent's IO
        # necessary for ZODB 3.4
        try:
            from transaction import manager
        except ImportError:
            manager = None
        if manager is not None:
            manager._txns.clear()  # get rid of the parent's transactions
            manager._synchs.clear()  # get rid of the parent's synchronizers

config.setup looks like:

    s = ClientStorage((zeoServer, int(zeoPort)))
    db = DB(s, version_cache_size=2000, )
    db.setClassFactory(ClassFactory)
    c = db.open(temporary=1)

The approach is viable only when you have truly long-running processes (and not for quick requests), as opening a new connection is expensive (mainly because the cache is initially empty).

Currently, we occasionally have a non-deterministic LDAP problem. I expect that the LDAP connection is shared by the forked processes -- and, understandably, the LDAP server does not expect to get requests from different, unsynchronized sources on the same connection.

Apart from that, our solution is working (at least until the next ZODB upgrade).

--
Dieter
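The shared LDAP connection described in the message above is one case of a general hazard: a connection opened before `fork()` is inherited by the child, and both processes then talk over the same socket. One common way around it, sketched here in modern Python 3 rather than the Python 2 of the thread, is to remember which process opened a resource and to open a fresh one when the process ID changes. `PerProcess` and `factory` are hypothetical names for illustration; they are not part of ZODB, ZEO, or any LDAP library.

```python
import os


class PerProcess:
    """Hand out one resource per process (hypothetical helper).

    `factory` is any zero-argument callable that opens a new
    connection, for example to an LDAP or ZEO server.
    """

    def __init__(self, factory):
        self._factory = factory
        self._pid = None
        self._resource = None

    def get(self):
        pid = os.getpid()
        if self._pid != pid:
            # First use in this process (or first use after a fork):
            # open a private resource instead of reusing the parent's.
            # The inherited object is deliberately not closed here, since
            # closing it could send protocol messages on the shared socket.
            self._resource = self._factory()
            self._pid = pid
        return self._resource
```

A forked child that calls `get()` receives its own connection, while the parent keeps using the one it opened. As noted in the message above, opening a fresh connection is expensive, so this only pays off for long-running children.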
Re: [Zope-dev] Runaway processes
Some high-level drive-by comments:

- You should avoid runaway processes. :) I'm actually quite serious.

- You can run multiple processes and monitor their progress -- killing processes that are stuck. zc.z3monitor provides some output that makes this pretty straightforward.

- It might be interesting to see if Java or .Net give better control over threads. If they do, then this might make Zope ports to Jython or IronPython more interesting. (People who get upset by the Python GIL should already find these platforms interesting.)

Jim

--
Jim Fulton
Zope Corporation
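The "run the work in a separate process and kill it if it gets stuck" idea can be sketched roughly as follows. This is a minimal illustration for Unix-like systems in Python 3, not how zc.z3monitor or lovely.remotetask work; `run_with_timeout` is a hypothetical name, and the polling interval is an arbitrary choice.

```python
import os
import signal
import time


def run_with_timeout(func, timeout):
    """Run func() in a forked child and kill the child after `timeout` seconds.

    Returns the child's exit code, or None if it had to be killed.
    """
    pid = os.fork()
    if pid == 0:
        # Child: do the work, then leave without running the parent's
        # cleanup handlers (os._exit skips atexit and buffered output).
        try:
            func()
            os._exit(0)
        except BaseException:
            os._exit(1)

    # Parent: poll until the child finishes or the deadline passes.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        time.sleep(0.01)

    # SIGKILL cannot be caught or ignored, so it also stops a child
    # that is hanging inside a C call.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)  # reap the child so no zombie is left behind
    return None
```

For example, `run_with_timeout(lambda: time.sleep(30), 0.5)` returns `None` after about half a second, while a quick task returns exit code `0`. Anything the child opened or inherited (database connections in particular) still needs the care described elsewhere in this thread.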
Re: [Zope-dev] Runaway processes
Hi Jim,

first of all, thanks a lot for your quick response.

On Thursday 06 December 2007, Jim Fulton wrote:
> - You should avoid runaway processes. :) I'm actually quite serious.

Yes, we are trying hard. :-) We are currently using lovely.remotetask to export those calls even to a different server, where we run the code in forked subprocesses.

After having played with this problem for a day, I came to the following conclusion: Python assumes that all C libraries it uses are wrapped in such a way that they are non-blocking and cannot lock up. When forking, you had better know what you are doing. Based on your comment, I think you agree. :-)

> - You can run multiple processes and monitor their progress -- killing processes that are stuck.

I think this is a really good idea that requires little software and not much setup either.

> zc.z3monitor provides some output that makes this pretty straightforward.

Wow, zc.z3monitor -- and zc.ngi, which I looked at too -- are very cool. How useful! I have been following the checkins and had an idea of what it was about, but it is definitely cooler than I thought. :-) However, if some code calls a blocking operation, zc.z3monitor will be locked up as well. I guess then I only have to check whether I get connectivity or not. ;-)

> - It might be interesting to see if Java or .Net give better control over threads. If they do, then this might make Zope ports to Jython or IronPython more interesting. (People who get upset by the Python GIL should already find these platforms interesting.)

Java has a lot more control over threads, but I have still found complaints that some features were left out for portability reasons. I could not immediately find an answer on whether blocking, runaway C calls are handled correctly. I talked to Roy Mathew, a one-time Java guru, and he said that Java has a time limit it gives each thread for doing some work. (I am not supposed to quote him on that. ;-) So if that is true, then Java does not have the problem. Roy did mention, though, that debugging locked threads in Java is a common skill after you have reached a certain level of Java Zen.

Yesterday I looked quite a bit at Win32 threads. That part of the Win32 kernel seems pretty well thought out, and from what I can tell you can have quite a bit of control over the threads. Again, I am not sure how hanging threads are handled.

BTW, there is a nice article on Python threads here (probably nothing new for you, Jim):

http://linuxgazette.net/107/pai.html

Regards,
Stephan

--
Stephan Richter
CBU Physics Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training
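The "check whether I get connectivity" idea from the message above could look roughly like the probe below. Note that merely connecting is not enough: the kernel accepts connections on a listening socket even when the application behind it is stuck, so the probe sends a request and waits for a reply. The `ping` command is a placeholder; the thread does not show the actual zc.z3monitor protocol, and `probe` is a hypothetical helper name.

```python
import socket


def probe(host, port, command=b"ping\n", timeout=1.0):
    """Return True if the server answers `command` within `timeout` seconds.

    The command is a stand-in; a real monitor port would expect
    whatever its own protocol defines.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(command)
            # An empty read means the peer closed without answering.
            return bool(conn.recv(1024))
    except OSError:
        # Covers refused connections as well as timeouts.
        return False
```

A supervising process could call `probe()` periodically and kill and restart the Zope process after a few consecutive failures.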
[Zope-dev] Runaway processes
Hi everyone,

I have a problem and I am hoping that it has already been solved by someone, or that I will at least get some input on it. I apologize in advance for the lengthy e-mail, but I wanted to provide a detailed discussion as a starting point.

Zope is designed to have very short-lived transactions. If transactions are long-living, all sorts of problems arise, most notably:

1. We occupy one thread for a long time.

2. The chance of conflict errors increases.

Problem 1 can be addressed by increasing the number of allowed threads or by simply adding more Zope servers. But this clearly has its limits and is really just a work-around. Another way to solve the problem is to identify long-running operations and call them asynchronously. Many of us have implemented solutions for this, one of which is lovely.remotetask.

Problem 2 can only be addressed by identifying the long-running tasks beforehand and moving them into an async call, again via lovely.remotetask for example.

But what happens if something unexpected occurs and we have an unanticipated long-running process? The worst case is something that runs forever. Then, whenever this problem occurs, one thread will be locked forever, and we can have a total system lockdown in no time.

So how can this be solved? Effectively, from within Zope we cannot do anything, because (a) Zope makes no assumption about running in a thread, and (b) the application is stuck and won't have a hook to get unstuck. So we have to solve the problem from outside.

Currently, Zope is commonly run from an application thread. At least both WSGI servers that we commonly use, twisted and zserver, are implemented this way. This means that by some criterion, probably some timeout, the thread should be killed. But hold on! In Python, threads cannot be killed. :-(

I have done some research and found issue 221115 [1], which discusses the shortcoming of not being able to kill a thread. The discussion ended in a feature request in PEP 42 [2], which has not been implemented as far as I can tell. So I googled some more to find possible implementations. Here are two distinctly different solutions (others I have found are either obviously trivial and will not work, or are derivatives of these two):

1. A Python-only solution using sys.settrace [3]. Besides making everything very slow, the trace function set via sys.settrace() is only invoked while Python bytecode is being executed. So if a low-level call hangs the process, the trace intercept will never be called.

2. Use an exception to intercept execution on the C level [4]. This looked very promising, until I read the following comment on the page:

     The exception will be raised only when executing python bytecode. If your thread calls a native/built-in blocking function, the exception will be raised only when execution returns to the python code.

So my conclusion is that Python threads cannot be unconditionally killed.

BTW, if a low-level call is blocking and does not release the global interpreter lock, then all Python threads are blocked. From the Python `thread` library documentation [5]:

     Not all built-in functions that may block waiting for I/O allow other threads to run. (The most popular ones (time.sleep(), file.read(), select.select()) work as expected.)

In all fairness, though, those are very rare occurrences. Most libraries are non-blocking and the above solutions would be just fine. But in my case, I really need to find a way to kill a Zope execution environment when a C call hangs.

So what other choices do we have?

On Unix-like systems, we can use `os.fork()`. The advantage of this approach is that I can use OS system calls to kill the process. However, ZODB database storages cannot be shared between processes. Nikolay Kim has done some preliminary experiments and found that `db.open()` locks the system (for both, `FileStorage` and `ZeoClientStorage`). I have not verified these results or tried to figure out why it is hanging, but I can see the problem for `FileStorage`.

Are there any known side-effects if I fork after the connection has been made? Since I am using the original process merely as a control, I guess I should be fine. Of course, the interesting question is: what happens to the ZODB connection, not to mention the DB, if it is in the middle of writing? I guess the safest solution would be to fork within the constraint of the transaction. Any comments will be very much appreciated.

Once we decide on the forking approach, we of course have to solve the issue for Windows too. My googling was not immediately successful, but I think that if we use Windows' native threads, they will provide us with the necessary API, since such a thread can be exited at any time.

.. [1]: http://bugs.python.org/issue221115
.. [2]: http://www.python.org/dev/peps/pep-0042/
.. [3]: http://www.velocityreviews.com/forums/t330554-kill-a-thread-in-python.html
.. [4]:
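The second approach described in the message above, raising an exception in another thread from the C level, is commonly done through CPython's `PyThreadState_SetAsyncExc`, reachable from Python via `ctypes`. Whether the (truncated) reference [4] used exactly this recipe is an assumption. The sketch below is modern Python 3 and CPython-specific; `StopRequested`, `async_raise`, and `demo` are hypothetical names. It stops a worker that regularly returns to bytecode, and, as the quoted comment warns, it would not interrupt a thread that is stuck inside a blocking C call.

```python
import ctypes
import threading
import time


class StopRequested(Exception):
    """Hypothetical exception type used to interrupt the worker."""


def async_raise(thread, exc_type):
    """Ask CPython to raise exc_type in `thread`.

    The exception is only delivered once the thread executes
    Python bytecode again. Returns True if one thread was affected.
    """
    modified = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))
    return modified == 1


def demo():
    stopped = threading.Event()

    def worker():
        try:
            while True:
                time.sleep(0.01)  # short sleeps: control returns to bytecode often
        except StopRequested:
            stopped.set()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    time.sleep(0.05)              # give the worker time to enter its loop
    async_raise(thread, StopRequested)
    thread.join(timeout=2.0)
    return stopped.is_set() and not thread.is_alive()
```

Replacing the short sleeps with a single long blocking call would delay the exception until that call returns, which is exactly the limitation that motivates the process-based approaches in the rest of this thread.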
Re: [Zope-dev] Runaway processes
> On Unix-like systems, we can use `os.fork()`. The advantage of this approach is that I can use OS system calls to kill the process. However, ZODB database storages cannot be shared between processes. Nikolay Kim has done some preliminary experiments and found that `db.open()` locks the system (for both, `FileStorage` and `ZeoClientStorage`). I have not verified these results or tried to figure out why it is hanging, but I can see the problem for `FileStorage`.

I have to create a new ZEO storage for each child to make a forking server work. I think that for a forking server we need long-running children.
Re: [Zope-dev] Runaway processes
On Wednesday 05 December 2007, Stephan Richter wrote:
> On Unix-like systems, we can use `os.fork()`. The advantage of this approach is that I can use OS system calls to kill the process. However, ZODB database storages cannot be shared between processes. Nikolay Kim has done some preliminary experiments and found that `db.open()` locks the system (for both, `FileStorage` and `ZeoClientStorage`). I have not verified these results or tried to figure out why it is hanging, but I can see the problem for `FileStorage`.

Okay, I spent the rest of the day testing the waters. ;-) The results are somewhat discouraging, but the situation is not hopeless.

FileStorage - All the file handling works just right. However, the object index is kept in memory, and the original and forked processes do not share the same memory space. Thus, once the child process is done making its modifications to the database, the parent does not know about the updated index. I have done a small hack that reloads the index, and it works. However, loading the index can take a long time for large databases. To make this approach feasible, we would need to find a way to describe changes in the index and send the result to the parent via a file.

(ZEO) ClientStorage - I could not get this to work at all, because at various steps in the transaction process the code tries to acquire a lock but cannot get it, which causes infinite loops.

I have attached a small package that demonstrates the behavior. You can use the usual bootstrap/buildout dance. Before running the ZEO-based test, you need to start the ZEO server using `./bin/zeo-server fg`.

Regards,
Stephan

--
Stephan Richter
CBU Physics Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

[Attachment: forktest.tgz (application/tgz)]
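The idea of describing the index changes in the child and sending them to the parent, instead of reloading the whole index, can be illustrated with a toy model. In the sketch below a plain `dict` stands in for the in-memory index and a pipe stands in for the file; none of this is FileStorage's real index format or API, and `run_child_and_merge` and `work` are hypothetical names. It is modern Python 3 for Unix-like systems only.

```python
import os
import pickle


def run_child_and_merge(index, work):
    """Fork, let the child compute index changes, and merge them in the parent.

    `work(index_copy)` runs in the child and must return a dict holding
    only the added or changed entries (the "delta").
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute the delta and ship it back through the pipe.
        os.close(read_fd)
        try:
            delta = work(dict(index))
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(delta, out)
            os._exit(0)
        except BaseException:
            os._exit(1)

    # Parent: read until the child closes its end, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as src:
        data = src.read()
    _, status = os.waitpid(pid, 0)
    if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
        raise RuntimeError("child failed; index left unchanged")
    index.update(pickle.loads(data))
    return index
```

Only the changed entries cross the process boundary, so the cost grows with the size of the change rather than with the size of the database. A real implementation would also have to handle deletions and concurrent changes made by the parent, which this toy ignores.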