Re: Maintenance message

2010-02-10 Thread Poul-Henning Kamp
In message 2caa2d26-7b56-4402-88e1-559361135...@gmail.com, Brad Schick writes:
> I have a Varnish server working well, but I'd like to have a standby
> server that does nothing but serve up "Sorry, we are performing
> maintenance". My thought was to write VCL code to check the health
> of the director and, if that was bad, use a different server (something
> like the example below). But that doesn't work. Any suggestions?

Actually, it just takes a bit of a detour:

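sub vcl_recv {
    # Prefer the real cluster.
    set req.backend = cluster;
    # If the cluster reports unhealthy, fall back to the maintenance server.
    if (!req.backend.healthy) {
        set req.backend = maint;
    }
}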

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


Child panics on OpenSolaris

2010-02-10 Thread Paul Wright
Hello list,

Using the Letsgetdugg[1] article I've installed Varnish on an
OpenSolaris zone.  During testing it works as expected but when it
receives production traffic I'm seeing children die with three
different types of panics[2][3][4] that look like this:

Panic message: Assert error in TCP_nonblocking(), tcp.c line 172:
Panic message: Assert error in TCP_blocking(), tcp.c line 163:
Assert error in VCA_Prep(), cache_acceptor.c line 163:

I've tried both enabling and disabling KeepAlive on the backend server,
but that doesn't seem to have any effect.  I've also tried a 2GB and 1GB
malloc cache just in case it was a 32bit issue (it's not and I've
since confirmed it's running as a 64bit process).

The VCL I'm using is pretty simple[5], it normalises the host header
and unsets the cookie header if the request is for a static asset.
This is how I'm starting up Varnish at the moment:

newtask -p highfile /opt/sbin/varnishd -f /opt/etc/varnish/firebox.vcl -F \
-p cc_command='/opt/SunStudioExpress/bin/cc -Kpic -G -m64 -o %o %s' \
-T 127.0.0.1:9001 \
-s malloc,1G \
-p sess_timeout=5s \
-p max_restarts=12 \
-p waiter=poll \
-p connect_timeout=0s \
-p sess_workspace=65536

Is there anything that jumps out as incorrect?  Is there some
additional configuration required for Solaris or are these panics to
be expected?

Cheers,

Paul.

[1] - http://letsgetdugg.com/2009/12/04/varnish-on-solaris/

[2] First panic type:
Child (18997) died signal=6
Child (18997) Panic message: Assert error in TCP_nonblocking(), tcp.c line 172:
  Condition((ioctl(sock, ((int)((uint32_t)(0x8000|(((sizeof (int))&0xff)<<16)| ('f'<<8)|126))), &i)) == 0) not true.
errno = 9 (Bad file number)
thread = (cache-worker)
ident = -smalloc,-hcritbit,poll
Backtrace:
  44548b: /opt/sbin/varnishd'pan_backtrace+0x1b [0x44548b]
  445795: /opt/sbin/varnishd'pan_ic+0x1c5 [0x445795]
  fd7ff3e5dfec: /opt/lib/libvarnish.so.1.0.0'TCP_nonblocking+0x7c [0xfd7ff3e5dfec]
  419091: /opt/sbin/varnishd'vca_return_session+0x1b1 [0x419091]
  42675d: /opt/sbin/varnishd'cnt_wait+0x2bd [0x42675d]
  42b94a: /opt/sbin/varnishd'CNT_Session+0x4ba [0x42b94a]
  44801b: /opt/sbin/varnishd'wrk_do_cnt_sess+0x19b [0x44801b]
  447614: /opt/sbin/varnishd'wrk_thread_real+0x854 [0x447614]
  447b73: /opt/sbin/varnishd'wrk_thread+0x123 [0x447b73]
  fd7ff653acf5: /lib/amd64/libc.so.1'_thrp_setup+0x8d [0xfd7ff653acf5]
sp = 866548 {
  fd = 25, id = 25, xid = 0,
  client = 92.41.40.169:2589,
  step = STP_WAIT,
  handling = deliver,
  restarts = 0, esis = 0
  ws = 8665b8 {
id = sess,
{s,f,r,e} = {8672c0,+18,+32786,+65536},
  },
  http[req] = {
ws = 8665b8[sess]
  ,
  /i/template/2009/search_icon_1.gif,
  HTTP/1.1,
  Accept: */*,
  Referer:
http://www.firebox.com/product/2579/Yurakoro-Lucky-Cats?aff=1781;,
  Accept-Language: en-gb,
  User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1;
Trident/4.0; GTB6.4; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR
3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3;
OfficeLivePatch.0.0),
  Accept-Encoding: gzip, deflate,
  Connection: Keep-Alive,
  host: media.firebox.com,
  X-Forwarded-For: 92.41.40.169,
  },
},


[3] Second panic type:
Child (12024) said Child starts
Child (12024) died signal=6
Child (12024) Panic message: Assert error in TCP_blocking(), tcp.c line 163:
  Condition((ioctl(sock, ((int)((uint32_t)(0x8000|(((sizeof (int))&0xff)<<16)| ('f'<<8)|126))), &i)) == 0) not true.
errno = 9 (Bad file number)
thread = (cache-worker)
ident = -smalloc,-hcritbit,poll
Backtrace:
  44548b: /opt/sbin/varnishd'pan_backtrace+0x1b [0x44548b]
  445795: /opt/sbin/varnishd'pan_ic+0x1c5 [0x445795]
  fd7ff3e5df5c: /opt/lib/libvarnish.so.1.0.0'TCP_blocking+0x7c [0xfd7ff3e5df5c]
  42b686: /opt/sbin/varnishd'CNT_Session+0x1f6 [0x42b686]
  44801b: /opt/sbin/varnishd'wrk_do_cnt_sess+0x19b [0x44801b]
  447614: /opt/sbin/varnishd'wrk_thread_real+0x854 [0x447614]
  447b73: /opt/sbin/varnishd'wrk_thread+0x123 [0x447b73]
  fd7ff653acf5: /lib/amd64/libc.so.1'_thrp_setup+0x8d [0xfd7ff653acf5]
  fd7ff653afb0: /lib/amd64/libc.so.1'_lwp_start+0x0 [0xfd7ff653afb0]
sp = 3491f88 {
  fd = 156, id = 156, xid = 0,
  client = ?.?.?.?:?,
  step = STP_FIRST,
  handling = deliver,
  restarts = 0, esis = 0
  ws = 3491ff8 {
id = sess,
{s,f,r,e} = {3492d00,3492d00,0,+65536},
  },
  http[req] = {
ws = 3491ff8[sess]
  ,
  /pic/p2387_search.jpg,
  HTTP/1.1,
  User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB;
rv:1.8.1.20) Gecko/20081217 Firefox/2.0.0.20 (.NET CLR 3.5.30729),
  Accept: image/png,*/*;q=0.5,
  Accept-Language: en-gb,en;q=0.5,
  Accept-Encoding: gzip,deflate,
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7,
  Keep-Alive: 300,
  Connection: keep-alive,
  Referer: http://www.firebox.com/admin/allproducts;,
  host: media.firebox.com,
  X-Forwarded-For: 94.196.164.41,
  },
  worker = fd7ff8e08d30 {
ws = 

Re: Error compiling VCL when using '%' in regexp

2010-02-10 Thread Poul-Henning Kamp
In message dd929cd9109b0a4d8c34b732a6d35cc43c403...@mbx03.exg5.exghost.com,
Naama Bamberger writes:
> I already tried using the escaped %25.
> The compilation succeeded, but the regexp didn't find a match in the
> problematic URLs:
>
>    # If the URL ends with % and one hex digit (a broken hex value),
>    # remove the last 2 characters.
>    if (req.url ~ "(.*)%25[0-9a-fA-F]$") {
>        set req.url = regsub(req.url, "(.*)%25[0-9a-fA-F]$", "\1");
>    }

I just whipped up a varnishtest case, and it seems to work in -trunk:

test "random test"

server s1 {
    rxreq
    expect req.url == "/foo"
    txresp
} -start

varnish v1 -vcl+backend {

    sub vcl_recv {
        if (req.url ~ "(.*)%25[0-9a-fA-F]$") {
            set req.url = regsub(req.url, "(.*)%25[0-9a-fA-F]$", "\1");
        }
    }

} -start

client c1 {
    txreq -url "/foo%a"
    rxresp
} -run

###  c1   Connect to 127.0.0.1:17621
###  c1   Connected to 127.0.0.1:17621 fd is 9
 c1   txreq| GET /foo%a HTTP/1.1\r\n
 c1   txreq| \r\n
###  c1   rxresp
###  s1   Accepted socket fd is 4
###  s1   rxreq
 s1   rxhdr| GET /foo HTTP/1.1\r\n
 s1   rxhdr| X-Forwarded-For: 127.0.0.1\r\n
 s1   rxhdr| X-Varnish: 1001\r\n
 s1   rxhdr| Host: 127.0.0.1\r\n
 s1   rxhdr| \r\n
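
For readers who want to check the pattern outside Varnish, here is a minimal Python sketch of the same rewrite. It assumes, as the test above suggests, that VCL decodes %25 inside a string literal to a literal %, so the effective regular expression is (.*)%[0-9a-fA-F]$. The name strip_broken_escape is hypothetical, and Python's re module only stands in for the regex engine Varnish uses.

```python
import re

# Effective pattern once the VCL string escape %25 is decoded to "%".
# This decoding is an assumption based on the thread, not on Varnish source.
BROKEN_ESCAPE = re.compile(r"(.*)%[0-9a-fA-F]$")


def strip_broken_escape(url):
    """Drop a trailing '%' plus a single hex digit; leave other URLs alone."""
    return BROKEN_ESCAPE.sub(r"\1", url)


if __name__ == "__main__":
    for url in ("/foo%a", "/foo%2a", "/foo"):
        print(url, "->", strip_broken_escape(url))
```

A complete two-digit escape such as %2a at the end of the URL is left untouched, because the pattern requires the % to be immediately followed by exactly one hex digit and then the end of the string.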




Re: Child panics on OpenSolaris

2010-02-10 Thread Poul-Henning Kamp
In message 282e72051002100305m5d7a0d1fj3a1afac6ea7dd...@mail.gmail.com, Paul 
Wright writes:


Hi Paul,

We have a number of tickets on this issue already (626,615,588).

The problem is that an EBADF return indicates that Varnish has mistakenly
closed a file descriptor that should still be open.

The alternative explanation, that Solaris can return EBADF because
the other end closed a TCP connection, does not seem to have any
support in Solaris documentation.

We are trying to get a Solaris box running so we can figure this
out once and for all.



Re: Child panics on OpenSolaris

2010-02-10 Thread Paul Wright
On 10 February 2010 13:00, Poul-Henning Kamp p...@phk.freebsd.dk wrote:
> In message 282e72051002100305m5d7a0d1fj3a1afac6ea7dd...@mail.gmail.com, Paul
> Wright writes:
>
> Hi Paul,
>
> We have a number of tickets on this issue already (626,615,588).
>
> The problem is that an EBADF return indicates that Varnish has mistakenly
> closed a file descriptor that should still be open.
>
> The alternative explanation, that Solaris can return EBADF because
> the other end closed a TCP connection, does not seem to have any
> support in Solaris documentation.
>
> We are trying to get a Solaris box running so we can figure this
> out once and for all.

Thanks for the explanation of what's going on.  Looking at those
tickets there are suggestions to try the poll waiter which we're
already using - are there any further tests we could try to help
narrow down this issue?  I'm happy to assist trying out patches.

Cheers,

Paul.


Re: Child panics on OpenSolaris

2010-02-10 Thread Poul-Henning Kamp
In message 282e72051002100615x701a37a8o416c9af6d1d7f...@mail.gmail.com, Paul
Wright writes:

> Thanks for the explanation of what's going on.  Looking at those
> tickets there are suggestions to try the poll waiter which we're
> already using - are there any further tests we could try to help
> narrow down this issue?  I'm happy to assist trying out patches.

I can see three ways to nail this issue:

1. Catch a tcpdump, when it happens, showing that the client side
   did close, and Solaris (incorrectly) returns EBADF.

2. Catch a ktrace/systrace/dtrace, when it happens, that shows
   that Varnish incorrectly closes the fd.

3. Set up some synthetic test to show that Solaris returns EBADF
   when it shouldn't.

If any of those is within your reach, by all means go for it...
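
A synthetic test along the lines of option 3 could be sketched roughly as follows. The ioctl in the panic output appears to be the FIONBIO request issued by TCP_nonblocking() and TCP_blocking(), so the sketch issues that request in two situations: after the peer has closed the connection, and after the local descriptor has been closed. It is written in Python rather than C for brevity; fionbio_errno and probe_ebadf are hypothetical names, and the availability of termios.FIONBIO on the platform under test is an assumption.

```python
import errno
import fcntl
import os
import socket
import struct
import termios


def fionbio_errno(fd):
    """Issue the FIONBIO ioctl on fd; return 0 on success, else the errno."""
    try:
        fcntl.ioctl(fd, termios.FIONBIO, struct.pack("i", 1))
        return 0
    except OSError as exc:
        return exc.errno


def probe_ebadf():
    """Return (errno after peer close, errno after local close)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    accepted, _ = listener.accept()
    try:
        # Case 1: the remote end closes, but our descriptor is still open.
        client.close()
        after_peer_close = fionbio_errno(accepted.fileno())

        # Case 2: we close our own descriptor, then use the stale number.
        # This mimics the "closed by mistake" explanation from the thread.
        fd = accepted.detach()
        os.close(fd)
        after_local_close = fionbio_errno(fd)
    finally:
        listener.close()
    return after_peer_close, after_local_close


if __name__ == "__main__":
    peer, local = probe_ebadf()
    print("after peer close: ", errno.errorcode.get(peer, peer))
    print("after local close:", errno.errorcode.get(local, local))
```

If a Solaris box reported EBADF in the first case, that would support the alternative explanation; EBADF only in the second case would point back at Varnish closing the descriptor itself. A loopback test like this cannot reproduce every timing seen under production traffic, so a negative result would not be conclusive.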



Re: Maintenance message

2010-02-10 Thread Brad Schick

On Feb 10, 2010, at 3:04 AM, Reinis Rozitis wrote:

>> I have a Varnish server working well, but I'd like to have a standby server
>> that does nothing but serve up "Sorry, we are performing maintenance".
>> My thought was to write VCL code to check the health of the director and,
>> if that was bad, use a different server (something like the example below).
>> But that doesn't work. Any suggestions?
>
> Why not use vcl_error?
> Just customize the default HTML that is included in the sample config and
> you can have a nice error page without even needing an extra server.
 

Thanks for the suggestion, but our error page isn't trivial and I don't like
the idea of maintaining the site within a Varnish configuration file. It
won't actually be an extra server; it will just be on another port on the same
machine as Varnish, but served by a proper HTTP server.


-Brad


Re: Connections to backend not closing

2010-02-10 Thread Thimo E.
Hello Poul-Henning,

thanks for your quick response. I am not sure that this behaviour is
really harmless; at least it's not for me :)

After 1 day of running Varnish I have 140 sockets on the backend webserver
in FIN_WAIT2 state, which is quite a lot.
(By the way, I don't know why FIN_WAIT2 sockets stay in that state for such
a long time and don't time out...)

With a few more semi-open connections I can get my backend into a
state where it stops responding because of "Too many open connections" (I
think 256 connections is the limit at the moment). As you can imagine,
that is quite annoying :)

Is there any way to tell Varnish to close CLOSE_WAIT
connections immediately? Or do you have other ideas?

Thanks in advance
   Thimo

On 10.02.2010 11:04, Poul-Henning Kamp wrote:
> In message 4b71f7a0.2050...@digithi.de, Thimo E. writes:
>
>> Dear all,
>>
>> first of all, Varnish is really nice software! But... :)
>> ...At the moment I have some problems with Varnish and its backend
>> connection(s).
>>
>> [..]
>>
>> Some time later (at least 5 minutes!) the last entry CLOSE_WAIT
>> disappears but the FIN_WAIT2 persists, so the webserver still has a
>> semi-open socket:
>
> This is actually by design: Varnish keeps backend connections around
> if they look like they can be reused, and only revisits them when it
> tries to reuse them, so they may linger for quite a while before
> Varnish discovers they have been closed by the backend.
>
> Apart from the socket hanging around, it is harmless.
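
The design described in the quoted reply can be illustrated with a small sketch: idle connections are parked without being watched, and a close by the backend is only noticed when a connection is pulled out for reuse. This is not Varnish's code; LazyPool and peer_has_closed are hypothetical names, and the zero-timeout select plus MSG_PEEK check is just one common way to detect a pending end-of-file.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if a zero-byte read (EOF from the peer) is pending."""
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending: the connection looks idle
    try:
        # Peek so that any real data is left in the socket buffer.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except OSError:
        return True  # a socket error also means the connection is unusable


class LazyPool:
    """Keeps idle sockets and only inspects them when one is requested."""

    def __init__(self):
        self.idle = []

    def put(self, sock):
        # No monitoring happens here; the socket just sits in the list.
        self.idle.append(sock)

    def get(self):
        # Dead connections are discovered and closed only at this point.
        while self.idle:
            sock = self.idle.pop()
            if peer_has_closed(sock):
                sock.close()
                continue
            return sock
        return None
```

Until get() is called, a socket that the backend has already closed stays half-open on the pool's side, which would be consistent with the long-lived FIN_WAIT2 entries seen on the backend.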






Re: Connections to backend not closing

2010-02-10 Thread Michael Fischer
On Wed, Feb 10, 2010 at 4:17 PM, Thimo E. a...@digithi.de wrote:

> After 1 day running varnish I have 140 sockets of the backend webserver
> in FIN_WAIT2 state, this is quite a lot.

Why do you believe this is a lot?  Do you have evidence that this is
causing your server to behave suboptimally?  The impact should be no more
than a bit of RAM.

--Michael
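
One way to gather that kind of evidence is to tally socket states over time. The sketch below counts TCP states in text shaped like Linux netstat -ant output; the column layout (state in the sixth field) is an assumption that differs on other systems, tally_tcp_states is a hypothetical helper, and SAMPLE is made-up input rather than data from this thread.

```python
from collections import Counter


def tally_tcp_states(netstat_output):
    """Count TCP connection states in netstat-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local-addr remote-addr state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Made-up sample for illustration only.
SAMPLE = """\
tcp   0  0 10.0.0.1:80     10.0.0.2:40001  FIN_WAIT2
tcp   0  0 10.0.0.1:80     10.0.0.2:40002  FIN_WAIT2
tcp   0  0 10.0.0.1:80     10.0.0.3:51000  ESTABLISHED
tcp   0  0 0.0.0.0:80      0.0.0.0:*       LISTEN
"""

if __name__ == "__main__":
    for state, count in tally_tcp_states(SAMPLE).most_common():
        print(state, count)
```

Running the tally periodically on the backend would show whether the FIN_WAIT2 count keeps growing toward the connection limit or levels off.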