[Zope] ZEO disconnects, Zope auto restarts (via zopectl)
Zope 2.9.0

We are seeing spontaneous restarts of Zope with no indication in any of the standard Zope logs. The ZEO log suggests that the restarts are due to a lost connection between Zope and ZEO, but it gives no other information. The logging level is set at the distribution default (INFO).

The restarts are a huge problem for our stateful implementation because session variables are not persistent, so all of the user state they contain is lost on restart.

I want to adjust the configuration so that the Zope/ZEO connection is stable. In our configuration, Zope and ZEO are linked via localhost on a dedicated port. I've Googled for information about tuning the ZEO/Zope interface, but have found little of substance. Some additional log detail would be helpful.

We are running a fairly vanilla setup, excerpted below:

zope.conf

    # ZEO client storage:
    #
    <zodb_db main>
      mount-point /
      # ZODB cache, in number of objects
      cache-size 5000
      <zeoclient>
        server localhost:8301
        storage 1
        var $INSTANCE/var
        # ZEO client cache, in bytes
        cache-size 20MB
        # Uncomment to have a persistent disk cache
        client group1-zeo
      </zeoclient>
    </zodb_db>

zeo.conf

    <zeo>
      address localhost:8301
      read-only false
      invalidation-queue-size 100
      pid-filename $INSTANCE/var/ZEO.pid
      # monitor-address PORT
      # transaction-timeout SECONDS
    </zeo>

    <runner>
      program $INSTANCE/bin/runzeo
      socket-name $INSTANCE/etc/zeo.zdsock
      daemon true
      forever false
      backoff-limit 10
      exit-codes 0, 2
      directory $INSTANCE
      default-to-interactive true
      # user zope
      python /usr/bin/python2.4
      zdrun /usr/local/src/zope/Zope2.9/lib64/python/zdaemon/zdrun.py
      # This logfile should match the one in the zeo.conf file.
      # It is used by zdctl's logtail command, zdrun/zdctl doesn't write it.
      logfile $INSTANCE/log/zeo.log
    </runner>

It's not clear what changes will lead to a more stable connection because it is not clear what's triggering the problem. Any advice would be appreciated.
Presumably the shotgun approach would work -- increase the cache sizes, raise the invalidation-queue-size, and increase the backoff-limit -- but it would be nice to have some guidance.

___
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] ZEO disconnects, Zope auto restarts (via zopectl)
On Fri, Feb 03, 2006 at 01:00:45AM -0800, Dennis Allison wrote:
> Zope 2.9.0
>
> We are seeing spontaneous restarts of Zope with no indication in any of
> the standard Zope logs. Looking at the ZEO log indicates that the restarts
> of Zope are due to a lost connection between Zope and ZEO but with no
> other information. The logging level is set at the distribution default
> (INFO).

Are you *sure* that is the cause, rather than the effect?

If zope restarts for any reason, I'd expect the zeo log to show a disconnect and reconnect as a result. Check the clocks on your zope and zeo boxes and make sure the timing of events in your logs is really what you think it is. (Systems that aren't running ntpd are the bane of my existence...)

Wild guess: Any chance your Zope process is running out of memory? I've had that on several occasions, when some naively-written software attempts to do something huge in memory that should really use a temp file on disk. (Zope itself used to have some code like that in the FTP server, don't know if it still does.)

I discovered this by looking in /var/log/messages. At least on linux, the kernel will log something there when it kills a process that consumes all available memory.

> We are running a fairly vanilla setup, excerpted below:
>
> zope.conf
>
> # ZEO client storage:
> #
> <zodb_db main>
>   mount-point /
>   # ZODB cache, in number of objects
>   cache-size 5000
>   <zeoclient>
>     server localhost:8301
>     storage 1
>     var $INSTANCE/var
>     # ZEO client cache, in bytes
>     cache-size 20MB

Unrelated to your problem, and maybe you know this, but depending on the size of your storage, I'd consider increasing the zeo client cache size. It's a disk cache and you can safely make it huge. But if you don't see cache flipping messages in your event log, it may not matter.
--
Paul Winkler
http://www.slinkp.com
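
The /var/log/messages check described in this message can be scripted. What follows is only a sketch: the default path comes from the message, but the match patterns are assumptions about how the kernel words its OOM-killer reports (the wording varies between kernel versions), and the function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: these substrings cover typical Linux OOM-killer reports.
# The exact wording differs between kernel versions, so adjust as needed.
OOM_PATTERN = re.compile(
    r"out of memory|oom[-_ ]killer|killed process", re.IGNORECASE
)


def find_oom_lines(lines):
    """Return the lines that look like kernel OOM-killer reports."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan one log file for OOM reports; reading it usually needs root."""
    with Path(path).open(errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling scan_log() with no arguments reads /var/log/messages. An empty result only means that nothing matched these patterns, not that memory pressure is ruled out.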
Re: [Zope] ZEO disconnects, Zope auto restarts (via zopectl)
Paul,

Thanks for the assist. My comments are interleaved with yours below. I have increased cache and other resources to see what the impact will be.

On Fri, 3 Feb 2006, Paul Winkler wrote:
> On Fri, Feb 03, 2006 at 01:00:45AM -0800, Dennis Allison wrote:
>> Zope 2.9.0
>>
>> We are seeing spontaneous restarts of Zope with no indication in any of
>> the standard Zope logs. Looking at the ZEO log indicates that the
>> restarts of Zope are due to a lost connection between Zope and ZEO but
>> with no other information. The logging level is set at the distribution
>> default (INFO).
>
> Are you *sure* that is the cause, rather than the effect?

No, I am not, and there's nothing in the logs which hints at why it restarted. We are running under load. The failures are silent. We do have a fairly high rate of conflict errors (which all get resolved finally!).

> If zope restarts for any reason, I'd expect the zeo log to show a
> disconnect and reconnect as a result. Check the clocks on your zope and
> zeo boxes and make sure the timing of events in your logs is really what
> you think it is. (Systems that aren't running ntpd are the bane of my
> existence...)

Timing correlates to the second. Zope and ZEO live on the same physical box.

> Wild guess: Any chance your Zope process is running out of memory? I've
> had that on several occasions, when some naively-written software
> attempts to do something huge in memory that should really use a temp
> file on disk. (Zope itself used to have some code like that in the FTP
> server, don't know if it still does.)

I doubt that I am hitting a limit. The box has nearly 8GB of memory, most of which (6GB) is used by linux as a cache. No messages in the logs.

> I discovered this by looking in /var/log/messages. At least on linux,
> the kernel will log something there when it kills a process that consumes
> all available memory.
>
>> We are running a fairly vanilla setup, excerpted below:

[snip...]

> Unrelated to your problem, and maybe you know this, but depending on the
> size of your storage, I'd consider increasing the zeo client cache size.
> It's a disk cache and you can safely make it huge. But if you don't see
> cache flipping messages in your event log, it may not matter.

Done, but I cannot report on the effect.
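
To see which side logged an event first, the two logs can be interleaved by timestamp. This is a rough sketch, not a Zope tool: it assumes each log entry starts with a timestamp of the form 2006-02-03T01:00:45 and that each file is already in chronological order; the function names are hypothetical.

```python
import heapq
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"  # assumed timestamp layout
TIMESTAMP_WIDTH = 19  # number of characters in such a timestamp


def parse_entries(source, lines):
    """Yield (timestamp, source, text) for lines that start with a timestamp."""
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_WIDTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # lines without a leading timestamp are skipped
        yield stamp, source, line.rstrip("\n")


def merge_logs(**logs):
    """Interleave several chronologically ordered logs by timestamp."""
    streams = [parse_entries(name, lines) for name, lines in logs.items()]
    # heapq.merge compares only the timestamps and keeps argument order
    # for entries that carry the same timestamp.
    return list(heapq.merge(*streams, key=lambda entry: entry[0]))
```

It could be called as merge_logs(zeo=open(zeo_log_path), zope=open(event_log_path)), where both paths are placeholders. With one-second timestamps, events from different files that fall in the same second cannot be ordered reliably; the merge simply lists them in argument order.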
Re: [Zope] ZEO disconnects, Zope auto restarts (via zopectl)
On Fri, Feb 03, 2006 at 08:35:11AM -0800, Dennis Allison wrote:
> Timing correlates to the second. Zope and ZEO live on the same physical
> box.

OK. Do you have more than one ZEO client? If not, I'd reevaluate whether you need ZEO at all. (It's great for zopectl debug on a live system, but otherwise it does nothing but add overhead if you're not using it to run multiple Zopes. But you probably knew that.)

> I doubt if I am hitting a limit. The box has nearly 8GB of memory most
> of which (6GB) is used by linux as a cache. No messages in the logs.

OK. It should be pretty obvious if you were hitting a limit. I don't think it's possible on linux to run out of memory without the kernel complaining somewhere in /var/log.

>> Unrelated to your problem, and maybe you know this, but depending on
>> the size of your storage, I'd consider increasing the zeo client cache
>> size. It's a disk cache and you can safely make it huge. But if you
>> don't see cache flipping messages in your event log, it may not matter.
>
> Done, but I cannot report on the effect.

Well, as I said, it's very unlikely to have any impact on your problem.

If I were in your shoes the first thing I'd do is bump up the log levels on both zope and zeo to BLATHER. Adds overhead I know, but you need to find the problem somehow... it's a weird one, I've never seen zope restart for no reason.

--
Paul Winkler
http://www.slinkp.com
Re: [Zope] ZEO disconnects, Zope auto restarts (via zopectl)
On Feb 3, 2006, at 1:06 PM, Paul Winkler wrote:
> If I were in your shoes the first thing I'd do is bump up the log levels
> on both zope and zeo to BLATHER. Adds overhead I know, but you need to
> find the problem somehow... it's a weird one, I've never seen zope
> restart for no reason.

This can be a symptom of a segfault if you've got zope running under a daemon manager like zopectl/zdaemon/supervisord.

- C
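
For background on why a segfault can look like a silent restart: on POSIX systems, a parent process can tell whether its child exited normally or was killed by a signal. The sketch below shows that general mechanism with Python's subprocess module. It is not how zdaemon or supervisord are implemented, and the function names are invented for illustration.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Explain a subprocess return code in human terms (POSIX semantics)."""
    if returncode < 0:
        # subprocess reports death by signal as the negated signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by %s" % name
    if returncode == 0:
        return "exited cleanly"
    return "exited with status %d" % returncode


def run_and_report(argv):
    """Run a child process to completion and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)
```

A child that dies from a segmentation fault gets no chance to write to its own log, which would fit the silent restarts described in this thread; only the supervising process sees the signal.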