Hello Mongrel gurus!

We are about to deploy our Rails application to production, and even though the last couple of weeks of testing went really well, today we are seeing a new critical issue - the browser sometimes just hangs while trying to load a page. When this happens, it seems that one or more of the mongrel_rails processes hang and stop responding, while the others keep working (so depending on which process the request is routed to, the page loads or not). The only remedy from this point onward appears to be a brutal kill and restart of the mongrel processes.

I found the following lines in mongrel.log:

Wed Aug 30 14:03:22 PDT 2006: Reaping 1 threads for slow workers because of 'shutdown'
Thread #<Thread:0xb79d382c sleep> is too old, killing.
Waiting for 1 requests to finish, could take 60 seconds.
Wed Aug 30 14:03:23 PDT 2006: Reaping 1 threads for slow workers because of 'shutdown'
Thread #<Thread:0xb78a03ac sleep> is too old, killing.

And the Apache error log has this:

[Wed Aug 30 14:46:59 2006] [error] [client 64.161.139.66] proxy: Error reading from remote server returned by /bookstore, referer: http://wp01.my-secret-domain.com/create/book

Apache's SSL error log has this:

[Wed Aug 30 14:42:35 2006] [error] [client 64.161.139.66] proxy: error reading status line from remote server 127.0.0.1
[Wed Aug 30 14:42:35 2006] [error] [client 64.161.139.66] proxy: Error reading from remote server returned by /my/account/save_edit_address/80

Restarting mongrel_cluster in the regular fashion leaves some of those processes running, and subsequent processes fail to start. Killing the mongrel_rails processes with 'kill -9' actually clears the cluster, and after a restart it appears to work fine.

This happened a couple of times today, and I am wondering if anyone has seen this behavior, and if there is a recommended fix?

- Should I upgrade to the latest unreleased version of Mongrel? I am a little worried that it's also unstable.
- We did not change anything about Rails session handling.
  Could mongrel be locking up because it is deadlocking on the session files?
- How do I tell what's really going on?
- Can I enable more mongrel logging that is acceptable for production use (i.e. not the full debugging info)?
- Can I turn on PID and date/time logging to mongrel.log so that I can see which process is having an issue?

Any advice on how to approach this issue is much appreciated.

Our environment:

Apache 2.2.3 + mod_proxy_balancer + mod_ssl, etc...
Mongrel 0.3.13.3
Ruby 1.8.4
Rails 1.1.6, SslRequirements, etc
Linux 2.6.9-42.ELsmp (RedHat ES4 with all up-to-date patches) on Intel

Mongrel Config:

---
cwd: /data/apps/app1/current
port: "5000"
environment: production
address: 127.0.0.1
pid_file: log/mongrel.pid
servers: 10

Apache Info:

# httpd -v
Server version: Apache/2.2.3
Server built: Aug 22 2006 10:26:14

wp01[root]# httpd -V
Server version: Apache/2.2.3
Server built: Aug 22 2006 10:26:14
Server's Module Magic Number: 20051115:3
Server loaded: APR 1.2.7, APR-Util 1.2.7
Compiled using: APR 1.2.7, APR-Util 1.2.7
Architecture: 32-bit
Server MPM: Prefork
threaded: no
forked: yes (variable process count)
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/prefork"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=128
-D HTTPD_ROOT="/usr/local/apache-2.2.3"
-D SUEXEC_BIN="/usr/local/apache-2.2.3/bin/suexec"
-D DEFAULT_PIDLOG="logs/httpd.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_LOCKFILE="logs/accept.lock"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="conf/mime.types"
-D SERVER_CONFIG_FILE="conf/httpd.conf"

Apache Module Info:

# httpd -l
Compiled in modules:
core.c
mod_authn_file.c
mod_authn_default.c
mod_authz_host.c
mod_authz_groupfile.c
mod_authz_user.c
mod_authz_default.c
mod_auth_basic.c
mod_include.c
mod_filter.c
mod_log_config.c
mod_env.c
mod_headers.c
mod_setenvif.c
mod_proxy.c
mod_proxy_connect.c
mod_proxy_ftp.c
mod_proxy_http.c
mod_proxy_ajp.c
mod_proxy_balancer.c
mod_ssl.c
prefork.c
http_core.c
mod_mime.c
mod_status.c
mod_autoindex.c
mod_asis.c
mod_cgi.c
mod_negotiation.c
mod_dir.c
mod_actions.c
mod_userdir.c
mod_alias.c
mod_rewrite.c
mod_so.c

Restarting using capistrano with a few hanging mongrel processes:

> cap restart
loading configuration /usr/local/lib/ruby/gems/1.8/gems/capistrano-1.1.0/lib/capistrano/recipes/standard.rb
loading configuration ./config/deploy.rb
loading configuration #<Proc:0x003566d0@/usr/local/lib/ruby/gems/1.8/gems/mongrel_cluster-0.2.0/lib/mongrel_cluster/recipes.rb:1>
* executing task restart
* executing task restart_mongrel_cluster
* executing task stop_mongrel_cluster
* executing "mongrel_rails cluster::stop -C /data/apps/app1/current/config/mongrel_cluster.yml"
servers: ["wp01.my-secret-domain.com"]
[wp01.my-secret-domain.com] executing command
** [out :: wp01.my-secret-domain.com] Stopping 10 Mongrel servers...
command finished
* executing task start_mongrel_cluster
* executing "mongrel_rails cluster::start -C /data/apps/app1/current/config/mongrel_cluster.yml"
servers: ["wp01.my-secret-domain.com"]
[wp01.my-secret-domain.com] executing command
** [out :: wp01.my-secret-domain.com] Starting 10 Mongrel servers...
** [out :: wp01.my-secret-domain.com] ** !!! PID file log/mongrel.5000.pid already exists. Mongrel could be running already. Check your log/mongrel.log for errors.
** [out :: wp01.my-secret-domain.com] ** !!! PID file log/mongrel.5001.pid already exists. Mongrel could be running already. Check your log/mongrel.log for errors.
** [out :: wp01.my-secret-domain.com] **
** [out :: wp01.my-secret-domain.com] !!! PID file log/mongrel.5004.pid already exists. Mongrel could be running alrea
** [out :: wp01.my-secret-domain.com] dy. Check your log/mongrel.log for errors.
** [out :: wp01.my-secret-domain.com] **
** [out :: wp01.my-secret-domain.com] !!! PID file log/mongrel.5005.pid already exists. Mongrel could be running alrea
** [out :: wp01.my-secret-domain.com] dy. Check your log/mongrel.log for errors.
** [out :: wp01.my-secret-domain.com] **
** [out :: wp01.my-secret-domain.com] !!! PID file log/mongrel.5007.pid already exists. Mongrel could be running alrea
** [out :: wp01.my-secret-domain.com] dy. Check your log/mongrel.log for errors.
command finished
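
To tell which of the PID files named in the output above still belong to a live process and which are just left over, something like the following could be run from the application directory before restarting. This is only a minimal sketch: the helper names (process_alive?, pid_file_report) are hypothetical and not part of Mongrel or mongrel_cluster, and the glob log/mongrel.*.pid is an assumption based on the file names in the messages above.

```ruby
# Hypothetical helper script, not part of Mongrel or mongrel_cluster.
# Reports whether the process named in each PID file is still alive.

# Returns true if a process with this PID exists.
def process_alive?(pid)
  Process.kill(0, pid)   # signal 0 sends nothing, it only checks existence
  true
rescue Errno::ESRCH
  false                  # no such process, so the PID file is stale
rescue Errno::EPERM
  true                   # process exists but belongs to another user
end

# Returns [pid_file, pid, alive] for every file matching the glob pattern.
def pid_file_report(pattern)
  Dir.glob(pattern).sort.map do |pid_file|
    pid = File.read(pid_file).strip.to_i
    # An empty or garbled file yields pid 0, which is treated as stale.
    [pid_file, pid, pid > 0 && process_alive?(pid)]
  end
end

if __FILE__ == $0
  # Assumed default, inferred from the PID file names in the output above.
  pattern = ARGV[0] || "log/mongrel.*.pid"
  pid_file_report(pattern).each do |pid_file, pid, alive|
    puts "#{pid_file}: pid #{pid} #{alive ? 'still running' : 'stale'}"
  end
end
```

A "still running" entry would point at a process that cluster::stop did not manage to stop, while a "stale" entry would be a leftover file that could probably be removed before cluster::start is attempted again.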
_______________________________________________
Mongrel-users mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/mongrel-users
