Re: Maintenance message
In message <2caa2d26-7b56-4402-88e1-559361135...@gmail.com>, Brad Schick writes:

>I have a varnish server working well, but I'd like to have a standby
>server that does nothing but serve up "Sorry, we are performing
>maintenance". My thought was to write VCL code to check the health of
>the director, and if that was bad use a different server (something
>like the example below). But that doesn't work. Any suggestions?

Actually, it just takes a bit of a detour:

    sub vcl_recv {
        set req.backend = cluster;
        if (!req.backend.healthy) {
            set req.backend = maint;
        }
    }

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc
Child panics on OpenSolaris
Hello list,

Using the Letsgetdugg[1] article I've installed Varnish in an OpenSolaris zone. During testing it works as expected, but when it receives production traffic I'm seeing children die with three different types of panics[2][3][4] that look like this:

    Panic message: Assert error in TCP_nonblocking(), tcp.c line 172:
    Panic message: Assert error in TCP_blocking(), tcp.c line 163:
    Assert error in VCA_Prep(), cache_acceptor.c line 163:

I've tried both enabling and disabling KeepAlive on the backend server, but it doesn't seem to have any effect. I've also tried a 2GB and a 1GB malloc cache just in case it was a 32-bit issue (it's not, and I've since confirmed it's running as a 64-bit process).

The VCL I'm using is pretty simple[5]: it normalises the Host header and unsets the Cookie header if the request is for a static asset.

This is how I'm starting up Varnish at the moment:

    newtask -p highfile /opt/sbin/varnishd -f /opt/etc/varnish/firebox.vcl -F \
        -p cc_command='/opt/SunStudioExpress/bin/cc -Kpic -G -m64 -o %o %s' \
        -T 127.0.0.1:9001 \
        -s malloc,1G \
        -p sess_timeout=5s \
        -p max_restarts=12 \
        -p waiter=poll \
        -p connect_timeout=0s \
        -p sess_workspace=65536

Is there anything that jumps out as incorrect? Is there some additional configuration required for Solaris, or are these panics to be expected?

Cheers,
Paul.

[1] - http://letsgetdugg.com/2009/12/04/varnish-on-solaris/

[2] First panic type:

    Child (18997) died signal=6
    Child (18997) Panic message:
    Assert error in TCP_nonblocking(), tcp.c line 172:
      Condition((ioctl(sock, ((int)((uint32_t)(0x8000|(((sizeof (int))&0xff)<<16)|('f'<<8)|126))), &i)) == 0) not true.
      errno = 9 (Bad file number)
    thread = (cache-worker)
    ident = -smalloc,-hcritbit,poll
    Backtrace:
      44548b: /opt/sbin/varnishd'pan_backtrace+0x1b [0x44548b]
      445795: /opt/sbin/varnishd'pan_ic+0x1c5 [0x445795]
      fd7ff3e5dfec: /opt/lib/libvarnish.so.1.0.0'TCP_nonblocking+0x7c [0xfd7ff3e5dfec]
      419091: /opt/sbin/varnishd'vca_return_session+0x1b1 [0x419091]
      42675d: /opt/sbin/varnishd'cnt_wait+0x2bd [0x42675d]
      42b94a: /opt/sbin/varnishd'CNT_Session+0x4ba [0x42b94a]
      44801b: /opt/sbin/varnishd'wrk_do_cnt_sess+0x19b [0x44801b]
      447614: /opt/sbin/varnishd'wrk_thread_real+0x854 [0x447614]
      447b73: /opt/sbin/varnishd'wrk_thread+0x123 [0x447b73]
      fd7ff653acf5: /lib/amd64/libc.so.1'_thrp_setup+0x8d [0xfd7ff653acf5]
    sp = 866548 {
      fd = 25, id = 25, xid = 0,
      client = 92.41.40.169:2589,
      step = STP_WAIT,
      handling = deliver,
      restarts = 0, esis = 0
      ws = 8665b8 {
        id = sess,
        {s,f,r,e} = {8672c0,+18,+32786,+65536},
      },
      http[req] = {
        ws = 8665b8[sess]
        /i/template/2009/search_icon_1.gif,
        HTTP/1.1,
        Accept: */*,
        Referer: http://www.firebox.com/product/2579/Yurakoro-Lucky-Cats?aff=1781;,
        Accept-Language: en-gb,
        User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0),
        Accept-Encoding: gzip, deflate,
        Connection: Keep-Alive,
        host: media.firebox.com,
        X-Forwarded-For: 92.41.40.169,
      },
    },

[3] Second panic type:

    Child (12024) said Child starts
    Child (12024) died signal=6
    Child (12024) Panic message:
    Assert error in TCP_blocking(), tcp.c line 163:
      Condition((ioctl(sock, ((int)((uint32_t)(0x8000|(((sizeof (int))&0xff)<<16)|('f'<<8)|126))), &i)) == 0) not true.
      errno = 9 (Bad file number)
    thread = (cache-worker)
    ident = -smalloc,-hcritbit,poll
    Backtrace:
      44548b: /opt/sbin/varnishd'pan_backtrace+0x1b [0x44548b]
      445795: /opt/sbin/varnishd'pan_ic+0x1c5 [0x445795]
      fd7ff3e5df5c: /opt/lib/libvarnish.so.1.0.0'TCP_blocking+0x7c [0xfd7ff3e5df5c]
      42b686: /opt/sbin/varnishd'CNT_Session+0x1f6 [0x42b686]
      44801b: /opt/sbin/varnishd'wrk_do_cnt_sess+0x19b [0x44801b]
      447614: /opt/sbin/varnishd'wrk_thread_real+0x854 [0x447614]
      447b73: /opt/sbin/varnishd'wrk_thread+0x123 [0x447b73]
      fd7ff653acf5: /lib/amd64/libc.so.1'_thrp_setup+0x8d [0xfd7ff653acf5]
      fd7ff653afb0: /lib/amd64/libc.so.1'_lwp_start+0x0 [0xfd7ff653afb0]
    sp = 3491f88 {
      fd = 156, id = 156, xid = 0,
      client = ?.?.?.?:?,
      step = STP_FIRST,
      handling = deliver,
      restarts = 0, esis = 0
      ws = 3491ff8 {
        id = sess,
        {s,f,r,e} = {3492d00,3492d00,0,+65536},
      },
      http[req] = {
        ws = 3491ff8[sess]
        /pic/p2387_search.jpg,
        HTTP/1.1,
        User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.8.1.20) Gecko/20081217 Firefox/2.0.0.20 (.NET CLR 3.5.30729),
        Accept: image/png,*/*;q=0.5,
        Accept-Language: en-gb,en;q=0.5,
        Accept-Encoding: gzip,deflate,
        Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7,
        Keep-Alive: 300,
        Connection: keep-alive,
        Referer: http://www.firebox.com/admin/allproducts;,
        host: media.firebox.com,
        X-Forwarded-For: 94.196.164.41,
      },
      worker = fd7ff8e08d30 { ws =
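For readers trying to decode the asserts above: the Condition(...) is the ioctl that TCP_nonblocking()/TCP_blocking() use to flip the socket's blocking mode (the expanded FIONBIO request code), and errno 9 (EBADF) means the kernel no longer recognises the file descriptor at all. A minimal Python sketch (illustration only, not Varnish code; fcntl's F_SETFL stands in for the FIONBIO ioctl) reproduces the same errno by flipping the flag on a descriptor that has already been closed:

```python
import errno
import fcntl
import os
import socket

def set_nonblocking(fd):
    """Roughly what Varnish's TCP_nonblocking() does: flip the
    non-blocking flag on a descriptor (via fcntl here, rather than
    the FIONBIO ioctl used in tcp.c)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

s = socket.socket()
set_nonblocking(s.fileno())   # works fine on a live descriptor

fd = s.fileno()               # remember the fd number...
s.close()                     # ...then close the socket underneath it

raised = None
try:
    set_nonblocking(fd)       # same call, but the fd is gone
except OSError as e:
    raised = e.errno

print(raised == errno.EBADF)  # True: errno 9, as in the panic
```

If the descriptor really has been closed underneath the worker thread, this is exactly the errno the failing assert would see.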
Re: Error compiling VCL when using '%' in regexp
In message <dd929cd9109b0a4d8c34b732a6d35cc43c403...@mbx03.exg5.exghost.com>, Naama Bamberger writes:

>I already tried using the escaped %25. The compilation succeeded, but
>the regexp didn't find a match in the problematic URLs:
>
>    # If the URL ends with % and one digit (a broken hex value) -
>    # remove the last 2 characters.
>    if (req.url ~ "(.*)%25[0-9a-fA-F]$") {
>        set req.url = regsub(req.url, "(.*)%25[0-9a-fA-F]$", "\1");
>    }

I just whipped up a varnishtest case, and it seems to work in -trunk:

    test "random test"

    server s1 {
        rxreq
        expect req.url == "/foo"
        txresp
    } -start

    varnish v1 -vcl+backend {
        sub vcl_recv {
            if (req.url ~ "(.*)%25[0-9a-fA-F]$") {
                set req.url = regsub(req.url, "(.*)%25[0-9a-fA-F]$", "\1");
            }
        }
    } -start

    client c1 {
        txreq -url "/foo%a"
        rxresp
    } -run

    ### c1 Connect to 127.0.0.1:17621
    ### c1 Connected to 127.0.0.1:17621 fd is 9
    c1 txreq| GET /foo%a HTTP/1.1\r\n
    c1 txreq| \r\n
    ### c1 rxresp
    ### s1 Accepted socket fd is 4
    ### s1 rxreq
    s1 rxhdr| GET /foo HTTP/1.1\r\n
    s1 rxhdr| X-Forwarded-For: 127.0.0.1\r\n
    s1 rxhdr| X-Varnish: 1001\r\n
    s1 rxhdr| Host: 127.0.0.1\r\n
    s1 rxhdr| \r\n

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
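A side note on the pattern itself: in a plain PCRE engine '%' is an ordinary character, so "%25" matches the three literal characters '%', '2', '5', not a decoded percent sign. The intended transformation (strip a trailing, truncated percent-escape) is easy to sanity-check outside Varnish. Here is a quick Python sketch using a bare '%', on the assumption that the string being matched contains a literal percent sign; which of the two spellings is right depends on what Varnish actually hands to the regex:

```python
import re

# Hypothetical helper mirroring the regsub() call above: if the URL
# ends with "%" plus a single hex digit (a truncated percent-escape),
# drop those two characters; otherwise leave the URL untouched.
BROKEN_ESCAPE = re.compile(r'(.*)%[0-9a-fA-F]$')

def clean_url(url):
    m = BROKEN_ESCAPE.match(url)
    return m.group(1) if m else url

print(clean_url('/foo%a'))    # -> /foo       (broken escape stripped)
print(clean_url('/foo%ab'))   # -> /foo%ab    (complete escape left alone)
print(clean_url('/foo'))      # -> /foo       (no escape, no change)
```

The anchored `$` is what keeps a complete two-digit escape like `%ab` intact: the regex only fires when exactly one hex digit follows the percent sign at the end of the string.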
Re: Child panics on OpenSolaris
In message <282e72051002100305m5d7a0d1fj3a1afac6ea7dd...@mail.gmail.com>, Paul Wright writes:

Hi Paul,

We have a number of tickets on this issue already (626, 615, 588).

The problem is that the EBADF return indicates that Varnish has, by mistake, closed a file descriptor that should still be open.

The alternative explanation, that Solaris can return EBADF because the other end closed a TCP connection, does not seem to have any support in the Solaris documentation.

We are trying to get a Solaris box running so we can figure this out once and for all.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: Child panics on OpenSolaris
On 10 February 2010 13:00, Poul-Henning Kamp <p...@phk.freebsd.dk> wrote:

>In message <282e72051002100305m5d7a0d1fj3a1afac6ea7dd...@mail.gmail.com>, Paul Wright writes:
>
>Hi Paul,
>
>We have a number of tickets on this issue already (626, 615, 588).
>
>The problem is that the EBADF return indicates that Varnish by
>mistake has closed a file descriptor that should still be open.
>
>The alternative explanation, that Solaris can return EBADF because
>the other end closed a TCP connection, does not seem to have any
>support in Solaris documentation.
>
>We are trying to get a Solaris box running so we can figure this out
>once and for all.

Thanks for the explanation of what's going on. Looking at those tickets there are suggestions to try the poll waiter, which we're already using - are there any further tests we could try to help narrow down this issue? I'm happy to assist trying out patches.

Cheers,
Paul.
Re: Child panics on OpenSolaris
In message <282e72051002100615x701a37a8o416c9af6d1d7f...@mail.gmail.com>, Paul Wright writes:

>Thanks for the explanation of what's going on. Looking at those
>tickets there are suggestions to try the poll waiter, which we're
>already using - are there any further tests we could try to help
>narrow down this issue? I'm happy to assist trying out patches.

I can see three ways to nail this issue:

1. Catch a tcpdump, when it happens, showing that the client side did close, and Solaris (incorrectly) returns EBADF.

2. Catch a ktrace/systrace/dtrace, when it happens, that shows that Varnish incorrectly closes the fd.

3. Set up some synthetic test to show that Solaris returns EBADF when it shouldn't.

If any of those are within your reach, by all means go for it...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
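Option 3 does not need a production box: any language that can open sockets can stage the "peer closed, then we touch the fd" sequence. Below is a rough Python sketch of that synthetic test (not from the thread; fcntl's F_SETFL stands in for the FIONBIO ioctl used in tcp.c). On a conforming kernel the flag flip must succeed even after the peer has closed; if it ever raised EBADF here, that would support the "Solaris returns EBADF for peer-closed connections" theory:

```python
import errno
import fcntl
import os
import socket

# Stage a TCP connection over loopback, then have the "other end"
# close it while our side keeps the descriptor open.
srv = socket.socket()
srv.bind(('127.0.0.1', 0))
srv.listen(1)

cli = socket.socket()
cli.connect(srv.getsockname())
conn, _ = srv.accept()

conn.close()                  # the peer goes away (sends FIN)
assert cli.recv(16) == b''    # we observe EOF, not an error

# Now flip the non-blocking flag on our still-open descriptor,
# the operation the Varnish asserts are tripping over.
got_ebadf = False
try:
    flags = fcntl.fcntl(cli.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(cli.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)
except OSError as e:
    got_ebadf = (e.errno == errno.EBADF)

print(got_ebadf)              # False on a correctly behaving kernel
```

A half-closed connection is still a perfectly valid file descriptor, which is why an EBADF from this sequence would be a kernel bug rather than expected behaviour.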
Re: Maintenance message
On Feb 10, 2010, at 3:04 AM, Reinis Rozitis wrote:

>>I have a varnish server working well, but I'd like to have a standby
>>server that does nothing but serve up "Sorry, we are performing
>>maintenance". My thought was to write VCL code to check the health
>>of the director, and if that was bad use a different server
>>(something like the example below). But that doesn't work. Any
>>suggestions?
>
>Why not use vcl_error? Just customize the default HTML which is
>included in the sample config and you can have a nice error page
>without even the need of an extra server.

Thanks for the suggestion, but our error page isn't trivial and I don't like the idea of maintaining the site within a varnish configuration file. It actually won't be an extra server, it will just be on a port on the same machine as varnish. But served by a proper HTTP server.

-Brad
Re: Connections to backend not closing
Hello Poul-Henning,

thanks for your quick response. I am not sure that this behaviour is really harmless, at least it's not for me :) After 1 day running varnish I have 140 sockets of the backend webserver in FIN_WAIT2 state, which is quite a lot. (Btw, I don't know why FIN_WAIT2 sockets stay for such a long time in that state and don't time out...)

With a few more semi-open connections I can get my backend into a state where it stops responding because of "Too many open connections" (I think 256 connections is the limit at the moment). As you can imagine, that is quite annoying :)

Is there any way to tell varnish to close CLOSE_WAIT connections immediately? Or do you have other ideas?

Thanks in advance
Thimo

On 10.02.2010 11:04, Poul-Henning Kamp wrote:
>In message <4b71f7a0.2050...@digithi.de>, Thimo E. writes:
>
>>Dear all,
>>first of all, varnish is a really nice software! But... :)
>>...At the moment I have some problems with varnish and its backend
>>connection(s).
>[..]
>>Some time later (at least 5 minutes!) the last entry CLOSE_WAIT
>>disappears but the FIN_WAIT2 persists, so the webserver still has a
>>semi-open socket:
>
>This is actually per design, varnish keeps backend connections around
>if they look like they can be reused, and only revisits them when it
>tries to reuse them, so they may linger for quite a while before
>varnish discovers they have been closed by the backend.
>
>Apart from the socket hanging around, it is harmless.
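For context on why those FIN_WAIT2 sockets linger: the state pairing is mechanical. When the backend closes its side first, the backend's socket enters FIN_WAIT2 and the Varnish side of the same connection sits in CLOSE_WAIT until Varnish actually closes it, which, per the explanation above, only happens when the cached connection is revisited. A small Python sketch of that half-closed state (plain sockets, nothing Varnish-specific; the variable names are purely illustrative):

```python
import socket

# "backend" plays the origin server, "varnish_side" plays the cached
# backend connection that Varnish keeps around for reuse.
backend = socket.socket()
backend.bind(('127.0.0.1', 0))
backend.listen(1)

varnish_side = socket.socket()
varnish_side.connect(backend.getsockname())
conn, _ = backend.accept()

conn.close()   # backend closes first: its side enters FIN_WAIT2,
               # the cached connection's side enters CLOSE_WAIT

# The cached connection only learns about the close when it is next
# used: a read then returns EOF (empty bytes) rather than an error.
eof = varnish_side.recv(16)
print(eof)     # b''

varnish_side.close()   # only now can both TCP states resolve
```

Until that final close() happens, the backend's FIN_WAIT2 socket has nothing to pair with, which matches the "semi-open sockets hang around for minutes" observation in the thread.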
Re: Connections to backend not closing
On Wed, Feb 10, 2010 at 4:17 PM, Thimo E. <a...@digithi.de> wrote:

>After 1 day running varnish I have 140 sockets of the backend
>webserver in FIN_WAIT2 state, which is quite a lot.

Why do you believe this is a lot? Do you have evidence that this is causing your server to behave suboptimally? The impact should be no more than a bit of RAM.

--Michael