LinuxInsight wrote:
> Juan José Amor wrote:
>> Hi!
>>
>> pub crawler wrote:
>>> Well I believe we have conquered the 503 Reverse Proxy issues finally
>>> and this is my information on the causes and how to resolve this issue
>>> and perhaps other similar issues going forward. I'll submit this
>>> information to bug tracking and recommend we provide this information
>>> in the docs and/or cookbooks.
>>
>> I have applied your suggested changes, although since our migration to
>> 0.99.24 the 5xx error has not been repeated (I guess).
>>
>
> I've also had "trouble" reproducing the dreaded 504 Gateway Timeout
> since I moved to 0.99.24 (actually the latest svn!). :)
>
> While I was running 0.99.22 I had numerous "lockups" where cherokee
> would return 504 to all clients during a 15-20 min window, and then
> magically recover all by itself. On all such occasions it would log 200
> 0 (size) in the log file, so it seems that logging could be improved to
> better reflect reality. I might report this to the bug tracker if I
> see it again.
>
> But, I'll still keep running with full tracing turned on, for another
> week or so, before I'm completely sure that the problem is gone.
Oh well, I really spoke too soon. The problem happened again: php pages
weren't served for ~17 min. It started with this:
{0xf7f4cd90} [08/09/2009 10:47:34.356] thread.c:0526 (process_polling_connections): conn 0x91556f8(fd=67): Time out
{0xf7f4cd90} [08/09/2009 10:47:34.356] socket.c:0985 (cherokee_socket_bufwrite): write fd=67 len=187 ret=0 written=187
{0xf7f4cd90} [08/09/2009 10:47:34.356] util.c:1310 (cherokee_fd_close): fd=67 re=0
{0xf7f4cd90} [08/09/2009 10:47:34.356] socket.c:0208 (cherokee_socket_close): fd=67 is_tls=0 re=0
{0xf7f4cd90} [08/09/2009 10:47:34.356] handler_fcgi.c:0317 (cherokee_handler_fcgi_free): fcgi handler free: 0x9081540
{0xf7f4cd90} [08/09/2009 10:47:34.356] util.c:1310 (cherokee_fd_close): fd=70 re=0
{0xf7f4cd90} [08/09/2009 10:47:34.356] socket.c:0208 (cherokee_socket_close): fd=70 is_tls=0 re=0
{0xf7f4cd90} [08/09/2009 10:47:34.356] connection.c:0357 (cherokee_connection_clean): conn 0x91556f8, has headers 0
conn 0x91556f8 was the first one to time out (after 60 seconds), and for
the next 17 minutes all connections that dispatched requests to php-cgi
timed out in the same manner and got a 504 Gateway Timeout. Serving
static content (icons, files...) worked normally during that time.
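
For anyone who wants to measure such a window from a trace file rather
than by eye, something like the following might do. It is only a sketch:
it assumes the trace format exactly as pasted above (including records
wrapped onto a second line), and the function names are made up for
illustration, they are not part of cherokee. Note that the trace shows
month 09 where the access log says Oct, so the dates are parsed as
written and only differences between timestamps should be trusted.

```python
import re
from datetime import datetime

# Start of a trace record, e.g. "{0xf7f4cd90} [08/09/2009 10:47:34.356] "
RECORD_START = re.compile(r"^\{0x[0-9a-f]+\} \[([^\]]+)\] ")
# The message logged when a connection times out (taken from the trace above).
TIMEOUT = re.compile(r"conn (0x[0-9a-f]+)\(fd=(\d+)\): Time out")


def join_records(lines):
    """Re-join records that were wrapped onto several lines."""
    record = None
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line):
            if record is not None:
                yield record
            record = line
        elif record is not None:
            # Continuation of the previous record.
            record += line.strip()
    if record is not None:
        yield record


def timeout_window(lines):
    """Return (first, last, count) over all 'Time out' records, or None."""
    stamps = []
    for record in join_records(lines):
        if TIMEOUT.search(record):
            stamp = RECORD_START.match(record).group(1)
            # Assumed field order: day/month/year, as the trace prints it.
            stamps.append(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S.%f"))
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)
```

Feeding it the open trace file, e.g. timeout_window(open(path)), gives
the first and last timeout and how many connections were affected.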
I've gone through the setup phase of the connection above and there's
nothing weird there: it gets matched against many rules, rewritten, and
finally a request is dispatched to php-cgi. It certainly looks like
php-cgi never answers, and this behavior continues for a long time after
the first failure.
cherokee.error has absolutely nothing in it.
The standard Apache-format log has this (IP obscured):
1x4.2y7.1z2.6 - - [08/Oct/2009:10:47:34 +0200] "GET /proc_sys_net_ipv4_route_secret_interval.html HTTP/1.1" 200 0 "-" "Mozilla/4.0 (compatible;)"
So it logs a 200 with zero content size, but in reality a 504 is
returned to the browser.
During those 17 minutes there isn't any kind of overload: requests come
in at a normal pace, and CPU utilization in fact drops because the
php-cgi processes are obviously not processing requests. Then at some
point everything automagically heals, php pages are served normally
once again, and exactly the same php-cgi processes are doing it.
This really looks like a tough problem. Even with full tracing turned on
it's not obvious who's responsible for such behavior, cherokee or
php-cgi, although I'm heavily inclined to think that php-cgi is the
culprit.
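
One way to gather more evidence the next time this happens would be to
record what the php-cgi processes are doing during the window. Below is
a rough, Linux-only sketch that reads the process state from
/proc/<pid>/stat. It assumes the workers show up with the command name
php-cgi; the helper names are hypothetical.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, tail = text.rsplit(")", 1)
    pid, comm = head.split(" (", 1)
    return int(pid), comm, tail.split()[0]


def worker_states(name="php-cgi", proc="/proc"):
    """Yield (pid, state) for every process whose comm equals name."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or entry is unreadable
        if comm == name:
            yield pid, state
```

State R means running, S interruptible sleep, D uninterruptible sleep.
Sampling this every few seconds during a lockup would at least show
whether the workers are sleeping or stuck, though it cannot say what
they are waiting for.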
--
http://www.linuxinsight.com/