I would look at setting a fail director. Restart if there is a 503, and if restarts > 0 select the patient director with very generous health checking. Your timeouts are reasonable, but try .timeout 20s and .threshold 1 for the patient director. Having a different view of the backends usually deals with occasional 503s.
Stefan Caunter Operations Torstar Digital m: (416) 561-4871 -----Original Message----- From: [email protected] [mailto:[email protected]] On Behalf Of Ronan Mullally Sent: March-08-11 3:38 PM To: [email protected] Subject: Varnish 503ing on ~1/100 POSTs I'm seeing intermittant 503s on POSTs to a fairly busy VBulletin website. The current load is light (up to a couple of thousand active sessions, peak is around five thousand). Varnish has a fairly simple config with a director consisting of two Apache backends: backend backend1 { .host = "1.2.3.4"; .port = "80"; .connect_timeout = 5s; .first_byte_timeout = 90s; .between_bytes_timeout = 90s; .probe = { .timeout = 5s; .interval = 5s; .window = 5; .threshold = 3; .request = "HEAD /favicon.ico HTTP/1.0" "X-Forwarded-For: 1.2.3.4" "Connection: close"; } } backend backend2 { .host = "5.6.7.8"; .port = "80"; .connect_timeout = 5s; .first_byte_timeout = 90s; .between_bytes_timeout = 90s; .probe = { .timeout = 5s; .interval = 5s; .window = 5; .threshold = 3; .request = "HEAD /favicon.ico HTTP/1.0" "X-Forwarded-For: 5.6.7.8" "Connection: close"; } } The numbers are modest, but significant - about 1 POST in a hundred fails. I've upped the backend timeouts to 90 seconds (first_byte / between_bytes) and I'm pretty confident they're responding in well under that time. varnishlog does not show any backend health changes. A typical event looks like: Varnish: a.b.c.d - - [08/Mar/2011:14:48:03 +0000] "POST http://www.sitename.net/newreply.php?do=postreply&t=285227 HTTP/1.1" 503 2623 Backend: a.b.c.d - - [08/Mar/2011:14:48:03 +0000] "POST /newreply.php?do=postreply&t=285227 HTTP/1.1" 200 2686 The POST appears to work fine on the backend but the user gets a 503 from Varnish. It's not unusual to see users getting the error several times in a row (presumably re-submitting the post): a.b.c.d - - [08/Mar/2011:18:21:23 +0000] "POST http://www.sitename.net/editpost.php?do=updatepost&p=9405408 HTTP/1.1" 503 2623 a.b.c.d - - [08/Mar/2011:18:21:36 +0000] "POST http://www.sitename.net/editpost.php?do=updatepost&p=9405408 HTTP/1.1" 503 2623 a.b.c.d - - [08/Mar/2011:18:21:50 +0000] "POST http://www.sitename.net/editpost.php?do=updatepost&p=9405408 HTTP/1.1" 503 2623 A typical request is below. The first attempt fails with: 33 FetchError c http first read error: -1 0 (Success) there is presumably a restart and the second attempt (sometimes to backend1, sometimes backend2) fails with: 33 FetchError c backend write error: 11 (Resource temporarily unavailable) This pattern has been the same on the few transactions I've examined in detail. The full log output of a typical request is below. I'm stumped. Has anybody got any ideas what might be causing this? -Ronan 33 RxRequest c POST 33 RxURL c /ajax.php 33 RxProtocol c HTTP/1.1 33 RxHeader c Accept: */* 33 RxHeader c Accept-Language: nl-be 33 RxHeader c Referer: http://www.redcafe.net/ 33 RxHeader c x-requested-with: XMLHttpRequest 33 RxHeader c Content-Type: application/x-www-form-urlencoded; charset=UTF-8 33 RxHeader c Accept-Encoding: gzip, deflate 33 RxHeader c User-Agent: Mozilla/4.0 (compatible; ...) 33 RxHeader c Host: www.sitename.net 33 RxHeader c Content-Length: 82 33 RxHeader c Connection: Keep-Alive 33 RxHeader c Cache-Control: no-cache 33 RxHeader c Cookie: ... 33 VCL_call c recv 33 VCL_return c pass 33 VCL_call c hash 33 VCL_return c hash 33 VCL_call c pass 33 VCL_return c pass 33 Backend c 44 backend backend1 44 TxRequest b POST 44 TxURL b /ajax.php 44 TxProtocol b HTTP/1.1 44 TxHeader b Accept: */* 44 TxHeader b Accept-Language: nl-be 44 TxHeader b Referer: http://www.sitename.net/ 44 TxHeader b x-requested-with: XMLHttpRequest 44 TxHeader b Content-Type: application/x-www-form-urlencoded; charset=UTF-8 44 TxHeader b User-Agent: Mozilla/4.0 (compatible; ...) 44 TxHeader b Host: www.sitename.net 44 TxHeader b Content-Length: 82 44 TxHeader b Cache-Control: no-cache 44 TxHeader b Cookie: ... 44 TxHeader b Accept-Encoding: gzip 44 TxHeader b X-Forwarded-For: a.b.c.d 44 TxHeader b X-Varnish: 657185708 * 33 FetchError c http first read error: -1 0 (Success) 44 BackendClose b backend1 33 Backend c 47 backend backend2 47 TxRequest b POST 47 TxURL b /ajax.php 47 TxProtocol b HTTP/1.1 47 TxHeader b Accept: */* 47 TxHeader b Accept-Language: nl-be 47 TxHeader b Referer: http://www.sitename.net/ 47 TxHeader b x-requested-with: XMLHttpRequest 47 TxHeader b Content-Type: application/x-www-form-urlencoded; charset=UTF-8 47 TxHeader b User-Agent: Mozilla/4.0 (compatible; ...) 47 TxHeader b Host: www.sitename.net 47 TxHeader b Content-Length: 82 47 TxHeader b Cache-Control: no-cache 47 TxHeader b Cookie: ... 47 TxHeader b Accept-Encoding: gzip 47 TxHeader b X-Forwarded-For: a.b.c.d 47 TxHeader b X-Varnish: 657185708 * 33 FetchError c backend write error: 11 (Resource temporarily unavailable) 47 BackendClose b backend2 33 VCL_call c error 33 VCL_return c deliver 33 VCL_call c deliver 33 VCL_return c deliver 33 TxProtocol c HTTP/1.1 33 TxStatus c 503 33 TxResponse c Service Unavailable 33 TxHeader c Server: Varnish 33 TxHeader c Retry-After: 0 33 TxHeader c Content-Type: text/html; charset=utf-8 33 TxHeader c Content-Length: 2623 33 TxHeader c Date: Tue, 08 Mar 2011 17:08:33 GMT 33 TxHeader c X-Varnish: 657185708 33 TxHeader c Age: 3 33 TxHeader c Via: 1.1 varnish 33 TxHeader c Connection: close 33 Length c 2623 33 ReqEnd c 657185708 1299604110.559967279 1299604113.447372913 0.000037670 2.887368441 0.000037193 33 SessionClose c error 33 StatSess c a.b.c.d 50044 3 1 1 0 1 0 235 2623 _______________________________________________ varnish-misc mailing list [email protected] http://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc _______________________________________________ varnish-misc mailing list [email protected] http://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
