On Feb 24, 2011, at 12:08 PM, Darren Nickerson wrote:
>
>> In your case, does the page eventually come out rendered at all,
>> like after 5min?
>
> I will try to let it wait longer ... I think I have waited as long as 10
> minutes in the past, but it's worth trying and reporting back to you.
So, we had two instances of the hang today, and they each followed a similar
pattern.
In at least one case the problem cleared itself after some time. I do not know
how long it took. What eventually cleared was a background apache thread
started by rt-mailgate, not a browser session.
Of the http worker threads, each one is blocked in a semop call:
[root@rt4 Plack-0.9970]# strace -p 30307
Process 30307 attached - interrupt to quit
semop(1802244, {{0, -1, SEM_UNDO}}, 1^C <unfinished ...>
Process 30307 detached
[root@rt4 Plack-0.9970]# strace -p 30308
Process 30308 attached - interrupt to quit
semop(1802244, {{0, -1, SEM_UNDO}}, 1^C <unfinished ...>
Process 30308 detached
[root@rt4 Plack-0.9970]# strace -p 30309
Process 30309 attached - interrupt to quit
semop(1802244, {{0, -1, SEM_UNDO}}, 1^C <unfinished ...>
Process 30309 detached
except for one, which is blocked in a read on fd 1:
[root@rt4 Plack-0.9970]# strace -p 30310
Process 30310 attached - interrupt to quit
read(1, ^C <unfinished ...>
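
The per-process strace checks above can be tallied mechanically. What follows is a minimal sketch in Python, assuming the first strace line for each PID has been captured as text. The names blocked_call and tally are hypothetical helpers, not part of strace, Apache, or RT.

```python
import re
from collections import Counter

# Matches the start of an strace line such as 'semop(1802244, {{0, -1, ...'
# and captures the syscall name plus its first argument.
_CALL = re.compile(r"^\s*(\w+)\(([^,\s)]*)")


def blocked_call(strace_line):
    """Return (syscall, first_arg) for one strace line, or None if it does not parse."""
    m = _CALL.match(strace_line)
    return (m.group(1), m.group(2)) if m else None


def tally(lines_by_pid):
    """Count how many processes are blocked in each (syscall, first_arg) pair."""
    calls = (blocked_call(line) for line in lines_by_pid.values())
    return Counter(c for c in calls if c is not None)


if __name__ == "__main__":
    # Lines copied from the strace sessions in this message.
    observed = {
        30307: "semop(1802244, {{0, -1, SEM_UNDO}}, 1^C <unfinished ...>",
        30308: "semop(1802244, {{0, -1, SEM_UNDO}}, 1^C <unfinished ...>",
        30309: "semop(1802244, {{0, -1, SEM_UNDO}}, 1^C <unfinished ...>",
        30310: "read(1, ^C <unfinished ...>",
    }
    for (name, arg), count in tally(observed).most_common():
        print(f"{count} process(es) blocked in {name}({arg}, ...)")
```

The first argument to semop is the semaphore set ID, so an identical value across workers (1802244 here) means they are all waiting on the same semaphore set.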
In process 30310, fd 1 is a network connection to our database server:
[root@rt4 Plack-0.9970]# ls -l /proc/30310/fd/1
lrwx------. 1 root root 64 Feb 24 17:44 /proc/30310/fd/1 -> socket:[281592]
[root@rt4 Plack-0.9970]# netstat -antep | grep 281592
tcp 0 5 10.0.12.149:49410 10.0.11.100:3306 ESTABLISHED 48 281592 30310/httpd
The database server no longer has any record of that TCP connection, and
mysqladmin processlist shows all threads sleeping.
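
The fd-to-connection lookup done above with ls and netstat can also be read straight from /proc. The sketch below is a rough Python equivalent, assuming Linux, an IPv4 socket, and a little-endian host such as x86. The helper names (socket_inode, decode_addr, find_tcp_by_inode, lookup) are hypothetical and only illustrate the steps.

```python
import os
import re
import socket
import struct


def socket_inode(link_target):
    """Extract the inode from an fd link target such as 'socket:[281592]'."""
    m = re.fullmatch(r"socket:\[(\d+)\]", link_target)
    return int(m.group(1)) if m else None


def decode_addr(hex_addr):
    """Decode a /proc/net/tcp address such as '950C000A:C102' to 'ip:port'."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host, where the IPv4 address is printed as a
    # native-endian 32-bit hex value. The port is plain hex.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def find_tcp_by_inode(proc_net_tcp_text, inode):
    """Return (local, remote, state_hex) for the row owning this inode, else None."""
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Column 9 is the socket inode; columns 1 to 3 are local, remote, state.
        if len(fields) > 9 and fields[9] == str(inode):
            return decode_addr(fields[1]), decode_addr(fields[2]), fields[3]
    return None


def lookup(pid, fd):
    """Resolve /proc/<pid>/fd/<fd> to its TCP endpoints (needs suitable privileges)."""
    inode = socket_inode(os.readlink(f"/proc/{pid}/fd/{fd}"))
    if inode is None:
        return None
    with open("/proc/net/tcp") as f:
        return find_tcp_by_inode(f.read(), inode)
```

Under those assumptions, lookup(30310, 1) on the host above should report the same 10.0.12.149:49410 to 10.0.11.100:3306 connection that netstat printed, with state 01 (ESTABLISHED).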
>> and can you see if it makes any difference if you change around line
>> 222 of RT::Interface::Web::Handler from:
>>
>> my $h = RT::Interface::Web::Handler::NewHandler(
>> 'HTML::Mason::PSGIHandler::Streamy');
>> to:
>>
>> my $h = RT::Interface::Web::Handler::NewHandler(
>> 'HTML::Mason::PSGIHandler');
I have not yet tried this. Given the new detailed information above, does it
still make sense to do so?
-Darren