[Zope] Zope Patch for Zeo Client Connection.py
Hi Everyone,

We were experiencing problems with our ZEO client setup on Red Hat RHEL 4. The client would just quit responding. No memory or CPU increase was associated with this. The client would remain hung until it was restarted.

We looked on the client using netstat and the status was ESTABLISHED with the ZEO server. On the ZEO server, netstat said LISTENING. When running DeadlockDebugger, one thread was in asyncore wait. The others were normal actions such as folder listing or folder contents.

We implemented a couple of lines of code on line 641 of connection.py in the ZEO/zrpc package. We added an else clause to call self.close() if delay is over one second. We found that one second wasn't quite enough and moved it to 5 seconds.

We have now found that this drastically improved our performance. The servers now take about 1 second per page load. Before, they could take 5 seconds or more, if they loaded at all. Also, our servers used to crash several times a day and they now haven't crashed in almost a week.

I just wanted to put this out there and see if anyone has any comments at all. I need to get a more permanent solution than this, but it is what we have for now.

Thank you,
Paul Williams

___
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
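For readers trying to picture the change described in the message above, here is a minimal sketch of that kind of cutoff. It is not the ZEO source. It assumes a wait loop that polls with a delay that doubles on each pass, and every name in it (wait_for_reply, reply_ready, poll, close, give_up_delay) is hypothetical.

```python
def wait_for_reply(reply_ready, poll, close, give_up_delay=5.0):
    """Poll until reply_ready() is true, or close the connection and give up.

    Hypothetical stand-in for a blocking wait loop, not ZEO code.
    reply_ready: callable returning True once the reply has arrived.
    poll:        callable taking a timeout in seconds (e.g. a select-based poll).
    close:       callable that tears the connection down so the client can reconnect.
    Returns True if the reply arrived, False if the connection was closed.
    """
    delay = 0.001  # start at 1 ms, as the thread describes
    while not reply_ready():
        poll(delay)
        if delay < give_up_delay:
            delay *= 2  # back off: 1 ms, 2 ms, 4 ms, ...
        else:
            # The added "else" branch: the delay has grown past the
            # threshold, so stop waiting on a connection that looks dead.
            close()
            return False
    return True
```

With the default give_up_delay of 5.0 and a reply that never arrives, this sketch makes 14 polls before calling close(). Whether closing at that point is safe for in-flight transactions is exactly the open question in the thread, so treat this as an illustration of the idea rather than a recommended patch.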
[Zope] Re: zope unresponsive
Ok, here is what we have. I did a netstat on both machines, client and server. The client sees an established connection and the server does not. In the server log there is a disconnect.

As far as hardware between them, there is a switch (Dell PowerConnect 6024). Web Server Directors might get hold of it, but there are no hops on traceroute. Traceroute only shows the client machine and the server machine. So the client is just continuously polling the connection but getting nothing back.

What we are thinking about doing is changing the code in zrpc/connection.py to close the connection in wait (line 638, Zope version 2.9.5) if the wait time gets too large or the poll has happened too many times.

We are great at Plone development, but have very little backend Zope development experience. Would someone please advise me as to whether this is going to cause more problems?

Thanks,
Paul Williams

Paul Williams wrote:
> I have posted this several times, but have not until now been able to get DeadlockDebugger installed. I see several people have had this problem, but no one has posted a solution.
>
> zope 2.9.5 + zeo
> Python 2.4.3
> Red Hat RHEL 4
> Plone 2.5.1
>
> Our zeo clients hang intermittently. We have no way of reproducing the problem, but it occurs daily. The client hangs and a restart seems to fix the problem.
>
> In the event log with tracing on we get
>
> Trace zeo.zrpc.Connection(C) wait(16697) {server:8100} pending, async=0
>
> There are hundreds to thousands of these until the server is restarted. In the zeo log we get
>
> Error caught in asyncore asyncore.py error:(110,'Connection timed out')
>
> We have been trying to track this down and have had no luck. Does anyone have any suggestions?
>
> Below is our deadlock debugger output
>
> Threads traceback dump at 2007-02-23 15:26:50
>
> Thread -1269564496 (GET /VirtualHostBase/https/soawds:443/VirtualHostRoot/Content///training):
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 395, in publish_module
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 196, in publish_module_standard
>   File /apps1/zope2.9.5/navo_instance/Products/PlacelessTranslationService/PatchStringIO.py, line 34, in new_publish
>     x = Publish.old_publish(request, module_name, after_list, debug)
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 115, in publish
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/mapply.py, line 88, in mapply
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 41, in call_object
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/Shared/DC/Scripts/Bindings.py, line 311, in __call__
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/Shared/DC/Scripts/Bindings.py, line 348, in _bindAndExec
>   File /apps1/zope2.9.5/navo_instance/Products/CMFCore/FSPageTemplate.py, line 195, in _exec
>     result = self.pt_render(extra_context=bound_names)
>   File /apps1/zope2.9.5/navo_instance/Products/CacheSetup/patch_cmf.py, line 38, in FSPT_pt_render
>     result = FSPageTemplate.inheritedAttribute('pt_render')(
>   File /apps1/zope2.9.5/navo_instance/Products/CacheSetup/patch_cmf.py, line 92, in PT_pt_render
>     tal=not source, strictinsert=0)()
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 238, in __call__
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 281, in interpret
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 749, in do_useMacro
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 281, in interpret
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 457, in do_optTag_tal
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 442, in do_optTag
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 437, in no_tag
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 281, in interpret
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 749, in do_useMacro
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 281, in interpret
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/TAL/TALInterpreter.py, line 507, in do_setLocal_tal
>   File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/Products/PageTemplates/TALES.py, line 221, in evaluate
>   File
[Zope] Re: Exporting portal member data to csv
What does your log say?

Win Myint Aung wrote:
> When I run the external method in the Plone site, it shows an error:
>
> Site Error
> An error was encountered while publishing this resource.
> Error Type: AttributeError
> Error Value: portal_memberdata
>
> The following is the code used in the external method.
>
>     # make heading row
>     row = makeRow()
>     row[0] = 'member_id'
>     row[1] = 'password'
>     writer.writerow(row)
>     for member in self.portal_membership.listMembers():
>         # make row for each member full of blank values
>         row = makeRow()
>         member_id = member.getId()
>         user = acl_users.getUser(name=member_id)
>         password = user._getPassword()
>         row[0] = member_id
>         row[1] = password
>         writer.writerow(row)
>     request.RESPONSE.setHeader('Content-Type','application/csv')
>     request.RESPONSE.setHeader('Content-Length',len(text.getvalue()))
>     request.RESPONSE.setHeader('Content-Disposition','inline;filename=%smembers.csv' % time.strftime('%Y%m%d-%H%M%S-',time.localtime()))
>     return text.getvalue()
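The quoted fragment leaves out how writer and text are set up, so here is a small self-contained sketch of the CSV-building part only, using the standard csv module. It does not touch Plone and does not address the AttributeError itself. The function names (members_to_csv, export_filename) are hypothetical, and the second column is an e-mail address rather than the password column from the quoted code; that substitution is an assumption made for the example.

```python
import csv
import io
import time


def members_to_csv(members):
    """Return CSV text: a header row, then one row per (member_id, email) pair."""
    text = io.StringIO()
    writer = csv.writer(text)
    writer.writerow(["member_id", "email"])
    for member_id, email in members:
        writer.writerow([member_id, email])
    return text.getvalue()


def export_filename(now=None):
    """Build a timestamped file name in the same pattern as the quoted code."""
    if now is None:
        now = time.localtime()
    # The format string must be a quoted string literal.
    return time.strftime("%Y%m%d-%H%M%S-", now) + "members.csv"
```

In an external method, the returned text would be what gets sent back with the Content-Type and Content-Disposition headers set as in the quoted code.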
[Zope] Re: zope unresponsive
Tres Seaver wrote:
> Paul Williams wrote:
> > Ok, here is what we have. I did a netstat on both machines, client and server. The client sees an established connection and the server does not. In the server log there is a disconnect. As far as hardware between them, there is a switch (Dell PowerConnect 6024). Web Server Directors might get hold of it, but there are no hops on traceroute. Traceroute only shows the client machine and the server machine. So the client is just continuously polling the connection but getting nothing back.
>
> That sounds like some weird kernel / networking problem to me: I don't see how Zope could be able to keep calling 'select' on a socket after the other side has closed it.

We agree. This is a strange situation that none of us has seen before. However, we have until tomorrow to do something, and replacing hardware is not feasible.

> Is there any possibility that some kind of failover / IP takeover has happened, such that the storage server now running is not the same host / instance as the one to which the clients originally connected? Are you using LVS + heartbeat, or some kind of hardware load balancer to manage such redundancy?

We do have Web Services Directors that do load balancing, but in this particular case the storage server is not set up for load balancing. I am not aware of any features that make the ZODB capable of clustering except for replication services offered through Zope. We are not sure whether the traffic is going to the Web Services Directors or not. Even if it is, there are thousands of settings and there is no one available who knows what to change. The storage server is a simple NAS server with a static IP address.

> > What we are thinking about doing is changing the code in zrpc/connection.py to close the connection in wait (line 638, Zope version 2.9.5) if the wait time gets too large or the poll has happened too many times.
> >
> > We are great at Plone development, but have very little backend Zope development experience. Would someone please advise me as to whether this is going to cause more problems?
>
> According to the log message you posted earlier in the thread, your appservers are spewing thousands of log messages from the connection's 'pending' method, although your deadlock debugger output shows the one thread blocked on 'select' inside of the connection's 'wait' method. There should be lots of log messages at TRACE level for the wait call, including a doubling / backoff of the delay value from 1 ms to 1 sec. Do you see those log messages, as well?

These messages are there. You can see the time doubling. This is where we were thinking of breaking the connection once it gets to a certain point and making Zope reconnect. This solves our hung connection problem, we think. However, I am hoping someone can let me know if I am breaking something else by doing this.

> Tres.
> --
> Tres Seaver +1 540-429-0999 [EMAIL PROTECTED]
> Palladion Software   Excellence by Design   http://palladion.com
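To make the two cutoff ideas in this exchange concrete ("the wait time gets too large" or "the poll has happened too many times"), here is a small simulation of a doubling delay that starts at 1 ms. It only does arithmetic and never touches a socket. The names (backoff_delays, polls_until_cutoff) and the exact behaviour at the cap are assumptions for illustration, not taken from the ZEO code.

```python
def backoff_delays(first=0.001, cap=1.0):
    """Yield poll delays that double from `first` until they reach `cap`, then repeat."""
    delay = first
    while True:
        yield delay
        if delay < cap:
            delay *= 2


def polls_until_cutoff(max_seconds=None, max_polls=None, first=0.001, cap=1.0):
    """Return (polls_made, seconds_waited) at the moment a cutoff would fire.

    At least one of max_seconds / max_polls must be given.
    """
    if max_seconds is None and max_polls is None:
        raise ValueError("give max_seconds, max_polls, or both")
    waited = 0.0
    for count, delay in enumerate(backoff_delays(first, cap)):
        if max_polls is not None and count >= max_polls:
            return count, waited
        if max_seconds is not None and waited >= max_seconds:
            return count, waited
        waited += delay
```

Under these assumptions a 5 second wall-clock limit fires after 14 polls, which gives a feel for how quickly a hung client would be cut loose compared with logging thousands of one-second polls.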
[Zope] Re: zope unresponsive
No, we haven't done that yet. That is something else we may try.

Marco Bizzarri wrote:
> On 2/27/07, Paul Williams [EMAIL PROTECTED] wrote:
> > These messages are there. You can see the time doubling. This is where we were thinking of breaking the connection once it gets to a certain point and making Zope reconnect. This solves our hung connection problem, we think. However, I am hoping someone can let me know if I am breaking something else by doing this.
>
> I don't remember if you already mentioned it. However: did you try to monitor the outgoing and incoming traffic? I mean, setting some iptables rules and/or using something like tcpdump to monitor what is going on here?
>
> Regards
> Marco
[Zope] zope unresponsive
I know that there is a switch between ZEO and Zope and probably a firewall too, but how do I prove this is the problem? This is on a production server in a military installation. I have major problems getting any kind of troubleshooting support. First, we don't get access, and second, no kind of debugging is allowed. You couldn't imagine the paperwork and the three months it took for me to get DeadlockDebugger installed.

Thanks,
Paul
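One low-tech way to gather evidence, assuming netstat output can be captured on both machines at about the same time, is to compare the two listings and flag connections the client still shows as ESTABLISHED but the server no longer reports. The sketch below parses text in the usual Linux `netstat -tn` column layout. The function names and the sample addresses in the usage note are hypothetical, and the comparison would not work as written if addresses are rewritten (NAT) between the hosts.

```python
def established_pairs(netstat_text):
    """Return the set of (local, remote) address pairs in ESTABLISHED state."""
    pairs = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "ESTABLISHED":
            pairs.add((fields[3], fields[4]))
    return pairs


def half_open_on_client(client_netstat, server_netstat, server_port):
    """List client connections to server_port that the server does not report."""
    server_view = established_pairs(server_netstat)
    suffix = ":%d" % server_port
    return sorted(
        (local, remote)
        for local, remote in established_pairs(client_netstat)
        # The server sees the same connection with the two ends swapped.
        if remote.endswith(suffix) and (remote, local) not in server_view
    )
```

For example, if the client listing shows 10.0.0.5:43210 connected to 10.0.0.9:8100 and the server listing has no matching entry, that pair is returned. A non-empty result that persists across several samples would support the theory that something between the hosts is dropping connections without notifying the client.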
[Zope] zope unresponsive
/lib/python/ZODB/serialize.py, line 537, in load_oid
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 201, in get
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 746, in load
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 769, in loadEx
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ServerStub.py, line 192, in loadEx
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/zrpc/connection.py, line 531, in call
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/zrpc/connection.py, line 638, in wait
  File /var/tmp/python2.4-2.4.3-root/apps1/python/lib/python2.4/asyncore.py, line 122, in poll
    r, w, e = select.select(r, w, e, timeout)

Thread -1280054352 (GET /VirtualHostBase/https/soawds:443/VirtualHostRoot/Content/):
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 395, in publish_module
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 196, in publish_module_standard
  File /apps1/zope2.9.5/navo_instance/Products/PlacelessTranslationService/PatchStringIO.py, line 34, in new_publish
    x = Publish.old_publish(request, module_name, after_list, debug)
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 106, in publish
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/BaseRequest.py, line 366, in traverse
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 732, in setstate
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 786, in _setstate
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 604, in setGhostState
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 597, in getState
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 471, in _persistent_load
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 537, in load_oid
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 201, in get
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 746, in load
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 760, in loadEx

End of dump

Thank you,
Paul Williams
[Zope] Zeo Client Hanging Unresponsive
-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 201, in get
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 746, in load
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 769, in loadEx
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ServerStub.py, line 192, in loadEx
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/zrpc/connection.py, line 531, in call
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/zrpc/connection.py, line 638, in wait
  File /var/tmp/python2.4-2.4.3-root/apps1/python/lib/python2.4/asyncore.py, line 122, in poll
    r, w, e = select.select(r, w, e, timeout)

Thread -1280054352 (GET /VirtualHostBase/https/soawds:443/VirtualHostRoot/Content/):
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 395, in publish_module
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 196, in publish_module_standard
  File /apps1/zope2.9.5/navo_instance/Products/PlacelessTranslationService/PatchStringIO.py, line 34, in new_publish
    x = Publish.old_publish(request, module_name, after_list, debug)
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/Publish.py, line 106, in publish
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZPublisher/BaseRequest.py, line 366, in traverse
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 732, in setstate
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 786, in _setstate
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 604, in setGhostState
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 597, in getState
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 471, in _persistent_load
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/serialize.py, line 537, in load_oid
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZODB/Connection.py, line 201, in get
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 746, in load
  File /var/tmp/Zope-2.9.5-1-buildroot/apps1/zope2.9.5/lib/python/ZEO/ClientStorage.py, line 760, in loadEx

End of dump

Thank you,
Paul Williams
[Zope] Zeo Client Unresponsive
Hello,

We have been having problems with our production servers. We currently have:

zope 2.8.5 + zeo
Python 2.3.4
Red Hat RHEL 4
Plone 2.1.2

Our zeo clients hang intermittently. We have no way of reproducing the problem, but it occurs daily. The client hangs and a restart seems to fix the problem.

In the event log with tracing on we get

Trace zeo.zrpc.Connection(C) wait(16697) {server:8100} pending, async=0

There are hundreds to thousands of these until the server is restarted. In the zeo log we get

Error caught in asyncore asyncore.py error:(110,'Connection timed out')

We have been trying to track this down and have had no luck. Does anyone have any suggestions?

Thanks,
Paul
[Zope] Zeo Client hanging
Hello,

We have been having problems with our production servers. We currently have:

zope 2.8.5 + zeo
python 2.3.4 (red hat distribution)
Red Hat RHEL 4
Plone 2.1.2

Our zeo clients hang intermittently. We have no way of reproducing the problem, but it occurs daily. The client hangs and a restart seems to fix the problem.

In the event log with tracing on we get

Trace zeo.zrpc.Connection(C) wait(16697) {server:8100} pending, async=0

There are hundreds to thousands of these until the server is restarted. In the zeo log we get

Error caught in asyncore asyncore.py error:(110,'Connection timed out')

We have been trying to track this down and have had no luck. Does anyone have any suggestions?

Thanks,
Paul