Hello all,
For the past couple of weeks I have been debugging two issues I ran
into with Mongrel under load:
1. Mongrel stopped responding as if in an endless loop.
2. Mongrel crashed when severely loaded.
I believe I have resolved both issues and have attached patches that
show the fix (simple as it is). An explanation of the patches is given
below.
The first problem is handled by the patch to sync.rb from the standard
library. When sync_unlock is called, Thread.critical is set to true. If
the calling thread is not the sync_ex_locker, an exception is raised
without Thread.critical being set back to false. As a result, the
mongrel_sleeper_thread (configurator.rb:270) was the only thread that
ever got back on the CPU, and Thread.critical stayed true. The patch
simply ensures that Thread.critical is set to false on that error path
before sync.rb raises.
I am not sure this is really the correct way to handle the issue,
though. As some famous programmers have been known to say, "select()
ain't broken", so I'm not really sure what to think of this.
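To illustrate the failure mode, here is a minimal sketch. It is not the attached patch: it uses a stand-in flag (the hypothetical Critical module) instead of the real Thread.critical, and hypothetical unlock_leaky / unlock_safe methods in place of sync_unlock. It shows the alternative shape of resetting the flag in an ensure clause, which covers every exit path rather than one branch at a time.

```ruby
# Stand-in for the interpreter-wide Thread.critical flag
# (hypothetical, for illustration only).
module Critical
  @flag = false
  class << self
    attr_accessor :flag
  end
end

class NotLocker < StandardError; end

# Mirrors the unpatched shape: the error path leaves the flag set.
def unlock_leaky(owner, caller_thread)
  Critical.flag = true
  raise NotLocker, "not the exclusive locker" unless owner == caller_thread
  Critical.flag = false
  :unlocked
end

# Alternative shape: the ensure clause resets the flag on every exit,
# whether the method returns normally or raises.
def unlock_safe(owner, caller_thread)
  Critical.flag = true
  raise NotLocker, "not the exclusive locker" unless owner == caller_thread
  :unlocked
ensure
  Critical.flag = false
end

begin
  unlock_leaky(:a, :b)
rescue NotLocker
  # the exception is expected; the interesting part is the flag
end
leaky_flag = Critical.flag

Critical.flag = false
begin
  unlock_safe(:a, :b)
rescue NotLocker
  # same exception, but the ensure clause has already run
end
safe_flag = Critical.flag
```

After the first call the stand-in flag is still set, which corresponds to the state where only one thread ever gets scheduled; after the second it has been cleared.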
The second problem stems from the fact that Mongrel sets
Thread#abort_on_exception on each worker thread. I'm not sure why this
is even in there, as the documentation says:

    When set to true, causes all threads (including the main
    program) to abort if an exception is raised in thr. The process
    will effectively exit(0).

The patch simply removes the abort_on_exception assignment from
mongrel.rb. After applying this patch I have been unable to make
Mongrel crash.
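As a rough sketch of the behaviour without abort_on_exception: an exception in one worker thread can be rescued inside that thread, so the other workers and the main program keep running. The spawn_worker helper and the errors queue below are hypothetical names for illustration and are not part of Mongrel.

```ruby
require "thread"

# Hypothetical worker wrapper: rescue inside the thread so a failing
# client is recorded instead of taking down the whole process.
def spawn_worker(client, errors)
  thread = Thread.new(client) do |c|
    begin
      yield c
    rescue StandardError => e
      errors << [c, e]   # record the failure; other workers keep running
      nil
    end
  end
  thread[:started_on] = Time.now
  thread
end

errors = Queue.new

# The second "client" triggers a ZeroDivisionError in its worker.
workers = [1, 0, 2].map do |n|
  spawn_worker(n, errors) { |c| 10 / c }
end

# Thread#value joins each thread and returns the block's result.
results = workers.map(&:value)
```

Here the failing worker yields nil and its exception lands in the queue, while the other two workers finish normally.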
Finally, I have provided a debug patch for the Sync library which simply
adds a lot of debug output to STDERR. I believe it might be of use in
future performance optimizations, as a lot of work seems to go into
managing the queued-up clients.
--
Cheers,
- Jacob Atzen
Index: lib/mongrel.rb
===================================================================
--- lib/mongrel.rb (revision 353)
+++ lib/mongrel.rb (working copy)
@@ -687,7 +687,6 @@
reap_dead_workers("max processors")
else
thread = Thread.new(client) {|c| process_client(c) }
- thread.abort_on_exception = true
thread[:started_on] = Time.now
@workers.add(thread)
--- sync.rb Sun Oct 1 21:02:28 2006
+++ sync.new.rb Sun Oct 1 21:05:28 2006
@@ -131,8 +131,10 @@
def sync_try_lock(mode = EX)
return unlock if sync_mode == UN
+ print_critical("sync_try_lock", "1", "true")
Thread.critical = true
ret = sync_try_lock_sub(sync_mode)
+ print_critical("sync_try_lock", "2", "false")
Thread.critical = false
ret
end
@@ -140,22 +142,27 @@
def sync_lock(m = EX)
return unlock if m == UN
- until (Thread.critical = true; sync_try_lock_sub(m))
+ until (print_critical("sync_lock", "1", "true"); Thread.critical = true; sync_try_lock_sub(m))
if sync_sh_locker[Thread.current]
sync_upgrade_waiting.push [Thread.current, sync_sh_locker[Thread.current]]
sync_sh_locker.delete(Thread.current)
else
+ STDERR.print "[sync_lock:2] Pushing #{Thread.current.inspect} behind #{sync_waiting.size} others\n"
sync_waiting.push Thread.current
end
+ print_critical("sync_lock", "3", "false")
Thread.stop
end
+ print_critical("sync_lock", "4", "false")
Thread.critical = false
self
end
def sync_unlock(m = EX)
+ print_critical("sync_unlock", "1", "true")
Thread.critical = true
if sync_mode == UN
+ print_critical("sync_unlock", "2", "false")
Thread.critical = false
Err::UnknownLocker.Fail(Thread.current)
end
@@ -165,6 +172,7 @@
runnable = false
case m
when UN
+ print_critical("sync_unlock", "3", "false")
Thread.critical = false
Err::UnknownLocker.Fail(Thread.current)
@@ -173,13 +181,17 @@
if (self.sync_ex_count = sync_ex_count - 1) == 0
self.sync_ex_locker = nil
if sync_sh_locker.include?(Thread.current)
+ STDERR.print "[sync_unlock] Setting sync_mode = SH\n"
self.sync_mode = SH
else
+ STDERR.print "[sync_unlock] Setting sync_mode = UN\n"
self.sync_mode = UN
end
runnable = true
end
else
+ # Patching criticalities when exceptions are thrown
+ print_critical("sync_unlock", "4", "false")
Thread.critical = false
Err::UnknownLocker.Fail(Thread.current)
end
@@ -191,6 +203,7 @@
if (sync_sh_locker[Thread.current] = count - 1) == 0
sync_sh_locker.delete(Thread.current)
if sync_sh_locker.empty? and sync_ex_count == 0
+ STDERR.print "[sync_unlock] Setting sync_mode = UN\n"
self.sync_mode = UN
runnable = true
end
@@ -205,6 +218,11 @@
end
wait = sync_upgrade_waiting
self.sync_upgrade_waiting = []
+ for w, v in wait
+ STDERR.print "[sync_unlock:5] Starting thread #{w.inspect}\n"
+ end
+
+ print_critical("sync_unlock", "6", "false")
Thread.critical = false
for w, v in wait
@@ -213,22 +231,31 @@
else
wait = sync_waiting
self.sync_waiting = []
+ print_critical("sync_unlock", "7", "false")
Thread.critical = false
for w in wait
+ STDERR.print "[sync_unlock:8] Running #{w.inspect}\n"
w.run
end
end
end
-
+ print_critical("sync_unlock", "9", "false")
Thread.critical = false
self
end
+ def print_critical(method, count, bool)
+ STDERR.print "[#{method}:#{count}] Thread.critical = #{bool} #{Thread.current.inspect}\n"
+ end
+
def sync_synchronize(mode = EX)
begin
+ STDERR.print "[sync_synchronize] Getting lock #{Thread.current.inspect}\n"
sync_lock(mode)
+ STDERR.print "[sync_synchronize] Yielding #{Thread.current.inspect}\n"
yield
ensure
+ STDERR.print "[sync_synchronize] Unlocking #{Thread.current.inspect}\n"
sync_unlock
end
end
@@ -292,6 +319,7 @@
ret = false
end
else
+ print_critical("sync_try_lock_sub", "1", "false")
Thread.critical = false
Err::LockModeFailer.Fail mode
end
--- sync.orig.rb Sun Oct 1 20:57:39 2006
+++ sync.rb Sun Oct 1 21:02:28 2006
@@ -180,6 +180,7 @@
runnable = true
end
else
+ Thread.critical = false
Err::UnknownLocker.Fail(Thread.current)
end