Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
I don't think that is_alive would be the cause of this, it looks like a simple enough view. But if your server uses a thread pool, and all the other threads are now occupied by something that got locked up, then it could be that the server is not answering your is_alive request at all because it is waiting for a thread to free up first. Ah, so it seems that, as I leave the server running longer longer, more more threads are taken up with a `.recv()` call. I think one of my clients opens requests and does not read them/close them. Eventually all the threads are blocking in that fashion. I will fix my clients. But, is there a server-side fix to this (again, using Repoze.BFG)? Something to time out the connection after 60 seconds or so if nothing has happened? Thanks, - Claudiu ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
Ah, so it seems that, as I leave the server running longer longer, more more threads are taken up with a `.recv()` call. I think one of my clients opens requests and does not read them/close them. Eventually all the threads are blocking in that fashion. I will fix my clients. But, is there a server-side fix to this (again, using Repoze.BFG)? Something to time out the connection after 60 seconds or so if nothing has happened? Additional info: in particular the blocked threads' stack dumps look like this: Thread 140605868680960: File /usr/lib/python2.6/threading.py, line 504, in __bootstrap self.__bootstrap_inner() File /usr/lib/python2.6/threading.py, line 532, in __bootstrap_inner self.run() File /usr/lib/python2.6/threading.py, line 484, in run self.__target(*self.__args, **self.__kwargs) File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 878, in worker_thread_callback runnable() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 1052, in lambda lambda: self.process_request_in_thread(request, client_address)) File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 1068, in process_request_in_thread self.finish_request(request, client_address) File /usr/lib/python2.6/SocketServer.py, line 322, in finish_request self.RequestHandlerClass(request, client_address, self) File /usr/lib/python2.6/SocketServer.py, line 617, in __init__ self.handle() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 442, in handle BaseHTTPRequestHandler.handle(self) File /usr/lib/python2.6/BaseHTTPServer.py, line 329, in handle self.handle_one_request() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 437, in handle_one_request self.wsgi_execute() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 287, in wsgi_execute self.wsgi_start_response) File /home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/connector.py, line 21, in __call__ result = self.next_app(environ, start_response) File /home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/cachecleanup.py, line 25, in __call__ return self.next_app(environ, start_response) File /home/tsa/env/lib/python2.6/site-packages/repoze/retry/__init__.py, line 65, in __call__ chunk = original_wsgi_input.read(rest) File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 474, in read data = self.file.read(length) File /usr/lib/python2.6/socket.py, line 377, in read data = self._sock.recv(left) Was my assessment of the situation accurate? ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
Le 20/04/2012 16:06, Claudiu Saftoiu a écrit : Ah, so it seems that, as I leave the server running longer longer, more more threads are taken up with a `.recv()` call. I think one of my clients opens requests and does not read them/close them. Eventually all the threads are blocking in that fashion. I will fix my clients. But, is there a server-side fix to this (again, using Repoze.BFG)? Something to time out the connection after 60 seconds or so if nothing has happened? Additional info: in particular the blocked threads' stack dumps look like this: Thread 140605868680960: -- [snip]-- File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 474, in read data = self.file.read(length) File /usr/lib/python2.6/socket.py, line 377, in read data = self._sock.recv(left) Was my assessment of the situation accurate? I would say so, and it seems that what makes the thread hang on is that the socket is not closed by your clients. (otherwise recv would exit with an exception). Alex ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
On Fri, Apr 20, 2012 at 10:06:40AM -0400, Claudiu Saftoiu wrote: Ah, so it seems that, as I leave the server running longer longer, more more threads are taken up with a `.recv()` call. I think one of my clients opens requests and does not read them/close them. Does not write, rather. Eventually all the threads are blocking in that fashion. I will fix my clients. But, is there a server-side fix to this (again, using Repoze.BFG)? Additional info: in particular the blocked threads' stack dumps look like this: Thread 140605868680960: File /usr/lib/python2.6/threading.py, line 504, in __bootstrap self.__bootstrap_inner() File /usr/lib/python2.6/threading.py, line 532, in __bootstrap_inner self.run() File /usr/lib/python2.6/threading.py, line 484, in run self.__target(*self.__args, **self.__kwargs) File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 878, in worker_thread_callback runnable() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 1052, in lambda lambda: self.process_request_in_thread(request, client_address)) File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 1068, in process_request_in_thread self.finish_request(request, client_address) File /usr/lib/python2.6/SocketServer.py, line 322, in finish_request self.RequestHandlerClass(request, client_address, self) File /usr/lib/python2.6/SocketServer.py, line 617, in __init__ self.handle() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 442, in handle BaseHTTPRequestHandler.handle(self) File /usr/lib/python2.6/BaseHTTPServer.py, line 329, in handle self.handle_one_request() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 437, in handle_one_request self.wsgi_execute() File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 287, in wsgi_execute self.wsgi_start_response) File /home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/connector.py, line 21, in __call__ result = self.next_app(environ, start_response) File /home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/cachecleanup.py, line 25, in __call__ return self.next_app(environ, start_response) File /home/tsa/env/lib/python2.6/site-packages/repoze/retry/__init__.py, line 65, in __call__ chunk = original_wsgi_input.read(rest) File /home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py, line 474, in read data = self.file.read(length) File /usr/lib/python2.6/socket.py, line 377, in read data = self._sock.recv(left) Was my assessment of the situation accurate? It appears so. Here's the code of repoze.retry: https://github.com/repoze/repoze.retry/blob/master/repoze/retry/__init__.py Line numbers differ somewhat; the original_wsgi_input.read(rest) is on line 71 on that github page right now. It does look like this is the bit that reads the HTTP request from the client. Something to time out the connection after 60 seconds or so if nothing has happened? You're using paste.httpserver. It has support for killing hung threads: http://pythonpaste.org/modules/httpserver.html#paste.httpserver.server_runner has a list of options. You could also put nginx in front, to take care of ill-behaved HTTP clients. Marius Gedminas -- Never be afraid to tell the world who you are. -- Anonymous signature.asc Description: Digital signature ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
On Thu, Apr 19, 2012 at 17:20, Claudiu Saftoiu csaft...@gmail.com wrote: My question is: what could possibly be causing the server to 'lock up', even on a simple view like 'is_alive', without using any memory or CPU? Is there some ZODB resource that might be getting gradually exhausted because I'm not handling it properly? I don't know, anything could lock up your site if something is waiting for a lock, for example. Use http://pypi.python.org/pypi/z3c.deadlockdebugger to figure out what the threads are doing at this time. Preferably, trigger the dump_threads() method of that module on SIGUSR1, like the Zope signalstack product does (see http://svn.plone.org/svn/collective/Products.signalstack/trunk/Products/signalstack/__init__.py for the exact code to bind the signal handler). That'll tell you exactly what each thread is busy with when you send the signal. -- Martijn Pieters ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
On Thu, Apr 19, 2012 at 11:33 AM, Martijn Pieters m...@zopatista.com wrote: On Thu, Apr 19, 2012 at 17:20, Claudiu Saftoiu csaft...@gmail.com wrote: My question is: what could possibly be causing the server to 'lock up', even on a simple view like 'is_alive', without using any memory or CPU? Is there some ZODB resource that might be getting gradually exhausted because I'm not handling it properly? I don't know, anything could lock up your site if something is waiting for a lock, for example. Are there locks that could possibly be used for the 'is_alive' function? Here is the definition in its entirety. In 'configure.zcml': view view=.views.is_alive name=is_alive renderer=json / in 'views.py': def is_alive(request): return True Whatever the problem is, it causes 'is_alive' to take forever, and the CPU is not spinning at 100%, and memory usage is not high. Could this be a lock problem? Use http://pypi.python.org/pypi/z3c.deadlockdebugger to figure out what the threads are doing at this time. Preferably, trigger the dump_threads() method of that module on SIGUSR1, like the Zope signalstack product does (see http://svn.plone.org/svn/collective/Products.signalstack/trunk/Products/signalstack/__init__.py for the exact code to bind the signal handler). That'll tell you exactly what each thread is busy with when you send the signal. Thanks, I'll check this out. - Claudiu ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
On Thu, Apr 19, 2012 at 18:22, Claudiu Saftoiu csaft...@gmail.com wrote: Are there locks that could possibly be used for the 'is_alive' function? Here is the definition in its entirety. In 'configure.zcml': view view=.views.is_alive name=is_alive renderer=json / in 'views.py': def is_alive(request): return True Whatever the problem is, it causes 'is_alive' to take forever, and the CPU is not spinning at 100%, and memory usage is not high. Could this be a lock problem? I have no idea; all you told is that you use the ZODB, not what server framework you use to register your views. Is this Grok, Bluebream, Repoze.BFG, Zope 2 or something else? I don't think that is_alive would be the cause of this, it looks like a simple enough view. But if your server uses a thread pool, and all the other threads are now occupied by something that got locked up, then it could be that the server is not answering your is_alive request at all because it is waiting for a thread to free up first. -- Martijn Pieters ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
I have no idea; all you told is that you use the ZODB, not what server framework you use to register your views. Is this Grok, Bluebream, Repoze.BFG, Zope 2 or something else? Ah yes, sorry about that. I'm using Repoze.BFG . Does that help any? I don't think that is_alive would be the cause of this, it looks like a simple enough view. But if your server uses a thread pool, and all the other threads are now occupied by something that got locked up, then it could be that the server is not answering your is_alive request at all because it is waiting for a thread to free up first. Yea, I don't think it is the 'is_alive'. I more mentioned that to help diagnose the problem, i.e., whatever the problem is, it also affects simple views that don't touch the database at all. Hmm, it might be the thread pool issue that you mention. That seems to make sense. I'll have to see if I have any views that are never finishing. Thanks for the pointer. Thanks again, - Claudiu ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev
Re: [ZODB-Dev] server stops handling requests - nowhere near 100% CPU or Memory used
I don't know, anything could lock up your site if something is waiting for a lock, for example. Use http://pypi.python.org/pypi/z3c.deadlockdebugger to figure out what the threads are doing at this time. Preferably, trigger the dump_threads() method of that module on SIGUSR1, like the Zope signalstack product does (see http://svn.plone.org/svn/collective/Products.signalstack/trunk/Products/signalstack/__init__.py for the exact code to bind the signal handler). That'll tell you exactly what each thread is busy with when you send the signal. That module seems to be just the trick for seeing if I have a deadlock issue. I set up the SIGUSR1 thing, but whenever I would send the signal to the server, it would stop, saying something about system call interrupting the 'select', so I just made it output the contents of dump_threads() to a file once a minute. If it happens again I'll look at the tail end of the file see what they're all busy with. Thanks, seems like this will likely do the trick, - Claudiu ___ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev