Bug in native timeout module causes hang when waiting on IO
-----------------------------------------------------------
Key: JRUBY-3799
URL: http://jira.codehaus.org/browse/JRUBY-3799
Project: JRuby
Issue Type: Bug
Affects Versions: JRuby 1.3.1
Reporter: Nick Sieger
Assignee: Thomas E Enebo
Priority: Critical
From a conversation with Charlie:
{quote}
Nick Sieger wrote:
> On Jul 2, 2009, at 10:18 , Nick Sieger wrote:
>> We had a problem in kenai.com production yesterday morning where all 5
>> worker threads in all our servers eventually became exhausted. All of them
>> were stuck on net/http rbuf_fill inside the Timeout yield. We just
>> upgraded to JRuby 1.3.1 last release. There are some other leads I need to
>> follow up on, but was just wondering if you can think of any reason why
>> the new JRuby native timeout code might not be timing these out? Attaching a
>> thread dump from one of the servers.
> I realized after staring at the Timeout code a little longer that it relies
> on RubyThread.internalRaise/receiveMail, which has this comment:
> // interrupt the target thread in case it's blocking or waiting
> // WARNING: We no longer interrupt the target thread, since this usually means
> // interrupting IO and with NIO that means the channel is no longer usable.
> // We either need a new way to handle waking a target thread that's waiting
> // on IO, or we need to accept that we can't wake such threads and must wait
> // for them to complete their operation.
> //threadImpl.interrupt();
> // new interrupt, to hopefully wake it out of any blocking IO
> this.interrupt();
> Could it be that the thread is not getting interrupted out of the IO wait
> state? The phrase "hopefully wake it" does not inspire confidence.
Yes, that's probably exactly what's happening. See my other email (if it sent
ok...). We could review what the code looks like in 1.2 to confirm it, but my
guess is that RubyThread.interrupt or one of the kill/raise paths did
eventually do a hard thread interrupt and doesn't now.
{quote}
See also JRUBY-3154.
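
For reference, the trade-off described in the quoted comment ("with NIO that means the channel is no longer usable") can be reproduced with plain JDK classes. The sketch below is not JRuby code: the class and method names are made up for illustration, and a java.nio.channels.Pipe stands in for the socket that net/http would be reading from. It blocks a thread in a channel read, interrupts it with Thread.interrupt(), and reports what the reader saw and whether the channel is still open. On a standard JDK the read should end with ClosedByInterruptException and the channel should be closed afterwards, which is why a hard interrupt wakes the thread but costs the connection.

```java
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not part of JRuby.
class InterruptedChannelDemo {

    // Returns { what the reader thread observed, "open" or "closed" }.
    static String[] interruptBlockedRead() throws Exception {
        Pipe pipe = Pipe.open();
        Pipe.SourceChannel source = pipe.source();
        AtomicReference<String> outcome = new AtomicReference<>("none");

        Thread reader = new Thread(() -> {
            try {
                // Nothing is ever written to the pipe, so this read blocks.
                source.read(ByteBuffer.allocate(16));
                outcome.set("read returned");
            } catch (ClosedByInterruptException e) {
                outcome.set("ClosedByInterruptException");
            } catch (Exception e) {
                outcome.set(e.getClass().getSimpleName());
            }
        });

        reader.start();
        Thread.sleep(200);   // give the reader a moment to block in read()
        reader.interrupt();  // the "hard" interrupt discussed in the comment
        reader.join(5000);

        // Record the channel state before cleaning up.
        String state = source.isOpen() ? "open" : "closed";
        pipe.sink().close();
        source.close();
        return new String[] { outcome.get(), state };
    }

    public static void main(String[] args) throws Exception {
        String[] result = interruptBlockedRead();
        System.out.println("reader saw: " + result[0]);
        System.out.println("channel is: " + result[1]);
    }
}
```

This only shows the JDK behavior behind the warning in the comment; it does not show what RubyThread.interrupt does in 1.3.1, which is the open question in this issue.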