Re: varnish killing off the child process after a few minutes

2009-04-08 Thread Tung Nguyen
 01:19:24 ey03-s00344 last message repeated 2 times
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said , 3, 1)
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said Probe(GET
http://stagingbleacherreport.com/images/blank.gif HTTP/1.1
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said Host:
ey03-s00344
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said Connection:
close
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said
Apr  8 01:19:24 ey03-s00344 last message repeated 2 times
Apr  8 01:19:24 ey03-s00344 varnishd[3938]: Child (4002) said , 3, 1)
Apr  8 01:20:01 ey03-s00344 cron[4249]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  8 01:20:01 ey03-s00344 cron[4251]: (bleacherreport) CMD (ruby
/var/bleacherreport/current/bin/hit_counts.rb every_minute RAILS_ENV=staging
1 /var/bleacherreport/current/log/cron_wrapper.log 21)
Apr  8 01:20:03 ey03-s00344 varnishd[4002]:  vcl_recv.
Apr  8 01:20:03 ey03-s00344 varnishd[4002]: /mlb/scores
Apr  8 01:20:03 ey03-s00344 varnishd[4002]:  fallthrough.
Apr  8 01:20:04 ey03-s00344 varnishd[4002]:  vcl_recv.
Apr  8 01:20:04 ey03-s00344 varnishd[4002]:
/javascripts/base_1239147022.js?1239147049
Apr  8 01:20:04 ey03-s00344 varnishd[4002]:  assets.
Apr  8 01:20:06 ey03-s00344 varnishd[4002]:  vcl_recv.
Apr  8 01:20:06 ey03-s00344 varnishd[4002]:
/javascripts/core_1239147022.js?1239147049
Apr  8 01:20:06 ey03-s00344 varnishd[4002]:  assets.
Apr  8 01:20:06 ey03-s00344 varnishd[4002]:  vcl_recv.
Apr  8 01:20:06 ey03-s00344 varnishd[4002]: /images/hat/br_icon.jpg
Apr  8 01:20:06 ey03-s00344 varnishd[4002]:  assets.
Apr  8 01:20:06 ey03-s00344 varnishd[4002]:  vcl_recv.


Tung

On Wed, Apr 8, 2009 at 2:27 AM, Kristian Lyngstol <krist...@redpill-linpro.com> wrote:

 On Tue, Apr 07, 2009 at 01:01:51PM -0700, Tung Nguyen wrote:
  Hi guys,
 
  We're on varnish 2.0.3
 
  It looks like varnish restarts the child process for us every so often,
  causing 503s :(.  Was wondering if this is a known issue.

 Can you check the syslog for any more information?

 --
 Kristian Lyngstøl
 Redpill Linpro AS
 Tlf: +47 21544179
 Mob: +47 99014497




-- 
Tung Nguyen, Lead Developer
Bleacher Report, The Open Source Sports Network
(510) 928-0475
___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


varnish killing off the child process after a few minutes

2009-04-07 Thread Tung Nguyen
Hi guys,

We're on varnish 2.0.3

It looks like varnish restarts the child process for us every so often,
causing 503s :(.  Was wondering if this is a known issue.



##
 # ps auxww | grep varnishd
nobody    3385  0.0  0.3 8098024 7560 ?        Sl   19:49   0:00
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :6081 -T localhost:6082 -p
sess_timeout 10 -p obj_workspace 8192 -p sess_workspace 32768 -p
listen_depth 8192 -p connect_timeout 1s -p thread_pool_min 100 -f
/etc/varnish/default.vcl
root      3938  0.0  0.0 111584   904 ?        Ss   Apr03   0:01
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :6081 -T localhost:6082 -p
sess_timeout 10 -p obj_workspace 8192 -p sess_workspace 32768 -p
listen_depth 8192 -p connect_timeout 1s -p thread_pool_min 100 -f
/etc/varnish/default.vcl
root      3961  0.0  0.0   3884   672 pts/2    S+   19:55   0:00 grep
--colour=auto varnishd
 #



##
 # varnishstat -1
uptime                     472          .   Child uptime
client_conn                 40         0.08 Client connections accepted
client_req                 100         0.21 Client requests received
cache_hit                    1         0.00 Cache hits
cache_hitpass                0         0.00 Cache hits for pass
cache_miss                  23         0.05 Cache misses
backend_conn                99         0.21 Backend connections success
backend_unhealthy            0         0.00 Backend connections not attempted
backend_busy                 0         0.00 Backend connections too many
backend_fail                 0         0.00 Backend connections failures
backend_reuse               82         0.17 Backend connections reuses
backend_recycle             99         0.21 Backend connections recycles
backend_unused               0         0.00 Backend connections unused
n_srcaddr                    1          .   N struct srcaddr
n_srcaddr_act                0          .   N active struct srcaddr
n_sess_mem                  13          .   N struct sess_mem
n_sess                       1          .   N struct sess
n_object                    23          .   N struct object
n_objecthead                29          .   N struct objecthead
n_smf                       47          .   N struct smf
n_smf_frag                   0          .   N small free smf
n_smf_large                  1          .   N large free smf
n_vbe_conn                   6          .   N struct vbe_conn
n_bereq                      6          .   N struct bereq
n_wrk                      200          .   N worker threads
n_wrk_create               200         0.42 N worker threads created
n_wrk_failed                 0         0.00 N worker threads not created
n_wrk_max                13326        28.23 N worker threads limited
n_wrk_queue                  0         0.00 N queued work requests
n_wrk_overflow               0         0.00 N overflowed work requests
n_wrk_drop                   0         0.00 N dropped work requests
n_backend                    2          .   N backends
n_expired                    0          .   N expired objects
n_lru_nuked                  0          .   N LRU nuked objects
n_lru_saved                  0          .   N LRU saved objects
n_lru_moved                  1          .   N LRU moved objects
n_deathrow                   0          .   N objects on deathrow
losthdr                      0         0.00 HTTP header overflows
n_objsendfile                0         0.00 Objects sent with sendfile
n_objwrite                  65         0.14 Objects sent with write
n_objoverflow                0         0.00 Objects overflowing workspace
s_sess                      40         0.08 Total Sessions
s_req                      100         0.21 Total Requests
s_pipe                       0         0.00 Total pipe
s_pass                      76         0.16 Total pass
s_fetch                     99         0.21 Total fetch
s_hdrbytes               35654        75.54 Total header bytes
s_bodybytes             206284       437.04 Total body bytes
sess_closed                  0         0.00 Session Closed
sess_pipeline                0         0.00 Session Pipeline
sess_readahead               0         0.00 Session Read Ahead
sess_linger                  0         0.00 Session Linger
sess_herd                  100         0.21 Session herd
shm_records               9594        20.33 SHM records
shm_writes                1884         3.99 SHM writes
shm_flushes                  0         0.00 SHM flushes due to overflow
shm_cont                    10         0.02 SHM MTX contention
shm_cycles                   0         0.00 SHM cycles through buffer
sm_nreq                    186         0.39 allocator requests
sm_nobj                     46          .   outstanding allocations
sm_balloc               425984          .   bytes allocated
sm_bfree            6431973376          .   bytes free
sma_nreq                     0         0.00 SMA allocator

requests keep timing out

2009-04-01 Thread Tung Nguyen
Hey guys,

Our requests seem to be timing out.  I set all the timeout parameters
pretty high, so I'm wondering what could be causing it.

Also, how do I check the number of threads currently in use in
varnishstat?

Thanks,
Tung
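
On the thread-count question, here is a minimal sketch of pulling counters
out of `varnishstat -1` text. It assumes each line is laid out as name,
value, rate, description, and that n_wrk ("N worker threads") is the counter
of interest; n_wrk counts worker threads that exist, which may not be the
same as threads busy at that instant. The function name and the sample text
are illustrative only; the sample values are taken from a `varnishstat -1`
listing elsewhere in this archive.

```python
def parse_varnishstat(text):
    """Map counter name -> integer value from `varnishstat -1` style text.

    Assumed layout per line: name, value, rate (or '.'), description.
    Lines whose second field is not a plain integer are skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


# Illustrative sample, not live output.
sample = """\
n_wrk                      200          .   N worker threads
n_wrk_create               200         0.42 N worker threads created
n_wrk_queue                  0         0.00 N queued work requests
"""

stats = parse_varnishstat(sample)
print("worker threads:", stats["n_wrk"])
```

In practice the text would come from running `varnishstat -1` and capturing
its output; n_wrk_queue and n_wrk_overflow from the same listing are
probably worth watching alongside n_wrk.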

# ps auxww | grep varnish
root      2649  0.0  2.4 101776 82744 pts/0    S    13:48   0:00 varnishlog
-w varnish.log
root      4301  0.0  0.0 110552  1136 ?        Ss   14:05   0:00
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :6081 -T localhost:6082 -p
sess_timeout 25 -p obj_workspace 8192 -p sess_workspace 32768 -p
listen_depth 8192 -p connect_timeout 3s -p thread_pool_min 100 -f
/etc/varnish/default.vcl
nobody    4302  0.0  1.1 7781688 39468 ?       Sl   14:05   0:00
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :6081 -T localhost:6082 -p
sess_timeout 25 -p obj_workspace 8192 -p sess_workspace 32768 -p
listen_depth 8192 -p connect_timeout 3s -p thread_pool_min 100 -f
/etc/varnish/default.vcl
root      4872  0.0  0.0   3876   556 pts/0    R+   14:22   0:00 grep
--colour=auto varnish
#


0 VCL_call - prefetch
0 VCL_return   - fetch
0 Debug    - Attempt Prefetch 1804102565
0 Backend_health - A3 Still healthy 4--X-S-RH 5 3 5 0.00 0.00
HTTP/1.1 200 OK
0 Backend_health - A1 Still healthy 4--X-S-RH 5 3 5 0.00 0.001500
HTTP/1.1 200 OK
0 Backend_health - A6 Still healthy 4--X-S-RH 5 3 5 0.00 0.01
HTTP/1.1 200 OK
0 Backend_health - A2 Still healthy 4--X-S-RH 5 3 5 0.00 0.00
HTTP/1.1 200 OK
0 Backend_health - A4 Still healthy 4--X-S-RH 5 3 5 0.00 0.42
HTTP/1.1 200 OK
0 ExpPick  - 1804096341 ttl
0 VCL_call - timeout
0 VCL_return   - discard
0 ExpKill  - 1804096341 -30
0 ExpPick  - 1804096350 ttl
0 VCL_call - timeout
0 VCL_return   - discard
0 ExpKill  - 1804096350 -30
0 ExpPick  - 1804096354 ttl
0 VCL_call - timeout
0 VCL_return   - discard
0 ExpKill  - 1804096354 -30
0 ExpPick  - 1804096376 ttl
0 VCL_call - timeout
0 VCL_return   - discard
0 ExpKill  - 1804096376 -30
0 ExpPick  - 1804096384 ttl
0 VCL_call - timeout
0 VCL_return   - discard
0 ExpKill  - 1804096384 -30
0 CLI  - Rd ping
0 CLI  - Wr 0 200 PONG 1238593700 1.0
0 Backend_health - A5 Still healthy 4--X-S-RH 5 3 5 0.00 0.00
HTTP/1.1 200 OK
0 ExpPick  - 1804102829 prefetch
0 VCL_call - prefetch
0 VCL_return   - fetch
0 Debug    - Attempt Prefetch 1804102829
0 Backend_health - A3 Still healthy 4--X-S-RH 5 3 5 0.00 0.00
HTTP/1.1 200 OK
0 Backend_health - A1 Still healthy 4--X-S-RH 5 3 5 0.00 0.001125
HTTP/1.1 200 OK
0 Backend_health - A6 Still healthy 4--X-S-RH 5 3 5 0.00 0.01
HTTP/1.1 200 OK
0 Backend_health - A2 Still healthy 4--X-S-RH 5 3 5 0.00 0.00
HTTP/1.1 200 OK
0 Backend_health - A4 Still healthy 4--X-S-RH 5 3 5 0.00 0.32
HTTP/1.1 200 OK
0 CLI  - Rd ping
0 CLI  - Wr 0 200 PONG 1238593703 1.0
0 ExpPick  - 1803925354 ttl
0 VCL_call - timeout


param.show
200 2137
accept_fd_holdoff  50 [ms]
acceptor   default (epoll, poll)
auto_restart   on [bool]
backend_http11 on [bool]
between_bytes_timeout  60.00 [s]
cache_vbe_conns  off [bool]
cc_command exec cc -fpic -shared -Wl,-x -o %o %s
cli_buffer 8192 [bytes]
cli_timeout  5 [seconds]
client_http11  off [bool]
clock_skew 10 [s]
connect_timeout1.00 [s]
default_grace  10
default_ttl120 [seconds]
diag_bitmap0x0 [bitmap]
err_ttl  0 [seconds]
esi_syntax 0 [bitmap]
fetch_chunksize128 [kilobytes]
first_byte_timeout 60.00 [s]
group  nobody (65534)
listen_address :6081
listen_depth   8192 [connections]
log_hashstring off [bool]
log_local_address  off [bool]
lru_interval   2 [seconds]
max_esi_includes   5 [includes]
max_restarts   4 [restarts]
obj_workspace  8192 [bytes]
overflow_max   100 [%]
ping_interval  3 [seconds]
pipe_timeout   60 [seconds]
prefer_ipv6  off [bool]
purge_dups off [bool]
purge_hash on [bool]
rush_exponent  3 [requests per request]
send_timeout   600 [seconds]
sess_timeout   10 [seconds]
sess_workspace 32768 [bytes]
session_linger 0 [ms]
shm_reclen 255 [bytes]
shm_workspace  8192 [bytes]
srcaddr_hash   1049 [buckets]
srcaddr_ttl  30 [seconds]
thread_pool_add_delay  20 [milliseconds]
thread_pool_add_threshold  2 [requests]

Re: varnishd runtime parameters

2009-03-25 Thread Tung Nguyen
 added

Any response is appreciated.

Thanks,
Tung

On Tue, Mar 24, 2009 at 12:20 AM, Kristian Lyngstol <krist...@redpill-linpro.com> wrote:

 On Mon, Mar 23, 2009 at 05:58:58PM -0700, Tung Nguyen wrote:
  Hi guys,
  So, I'm reading over an archived email thread about Twitter's configuration.
 
 
 http://projects.linpro.no/pipermail/varnish-dev/2009-February/000968.html
 
  It looks like they had to adjust a lot of parameters... and I'm not finding
  all the parameter definitions in the varnishd man pages.  I'm wondering if,
  for most cases, running varnish with the defaults is fine?
 
  Any caveats here?  Which runtime parameters should I focus on?

 You can mostly run it with the defaults, yes. This depends on what sort of
 usage you have though.

 One notable exception is that I strongly recommend that you bring
 thread_pool_min up to a decent level (reflecting how many users you
 actually have; numbers in the hundreds are normal). You'll also want to
 adjust the cache size to your system, but that goes without saying.

 If you expect extremely high load, you might have to increase cli_timeout
 too. I've run tests where even setting it to 15 seconds is insufficient and
 causes children to be killed off. Though for most production sites, I'd guess
 5 seconds could work and 10 seconds would definitely work.

 --
 Kristian Lyngstøl
 Redpill Linpro AS
 Tlf: +47 21544179
 Mob: +47 99014497




-- 
Tung Nguyen, Lead Developer
Bleacher Report, The Open Source Sports Network
(510) 928-0475

Re: varnishd runtime parameters

2009-03-25 Thread Tung Nguyen
Hi guys,

So sometimes our backend is really slow in returning a response.  So
slow that it looks like varnish times out before the backend responds, and
so varnish eventually gives up and returns a 503.

There are 4 curl requests:
* the first 2 time out and give a 503
* the 3rd one is a cache miss but gives a 200
* the 4th one is a cache hit and gives a 200

https://gist.github.com/0452c374ee21dbba138d

Here's the varnishlog filtered for client requests

http://gist.github.com/279254e0f2452814bf46

Here's varnishstat

https://gist.github.com/219534d51a503b546070

Here are our current startup runtime parameter options:

https://gist.github.com/219534d51a503b546070

So, I'm pretty certain what is happening is that our backend takes too
long and varnish times out.  How can I set the timeout higher... is it
sess_timeout?

Thanks guys,
Tung
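
For the timeout question above, a small sketch of assembling `-p name=value`
options for varnishd. The parameter names first_byte_timeout and
between_bytes_timeout are taken from a param.show listing elsewhere in this
archive (both shown there at 60 s); whether they exist in the version
running here, and whether they are what produces these 503s, is an
assumption. As far as I know, sess_timeout is the idle timeout for client
sessions rather than a limit on waiting for the backend. The helper name and
the 120-second values are hypothetical.

```python
def build_param_args(params):
    """Turn {"name": value} into a flat ["-p", "name=value", ...] list."""
    args = []
    for name, value in params.items():
        args += ["-p", "{}={}".format(name, value)]
    return args


# Hypothetical values for illustration; tune against real backend latency.
backend_timeouts = {
    "connect_timeout": "3s",
    "first_byte_timeout": 120,
    "between_bytes_timeout": 120,
}

argv = ["/usr/sbin/varnishd", "-f", "/etc/varnish/default.vcl"]
argv += build_param_args(backend_timeouts)
print(" ".join(argv))
```

The same parameters can usually be changed on a running instance through the
management port (the -T address) with param.set, which avoids a restart
while experimenting.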


On Wed, Mar 25, 2009 at 10:37 AM, Tung Nguyen <tngu...@bleacherreport.com> wrote:

 Kristian, thank you.

 I'm glad to hear that most defaults are good.

 Yup, the default thread_pool_min = 1 seems kinda funny.  We'll set it to at
 least 100.

 I didn't even know about cli_timeout and will set it to a baseline of
 10 seconds to test.

 I'm wondering how you are testing.  I'm using ab (ApacheBench) with
 -c 10 -n 1000 on the varnished pages to see how things behave.





 Here are more specific questions about runtime parameters.  The general
 question I have is what to look for during testing: should I be looking at
 varnishstat, and what are the most important things to look for in that
 output?


 Our varnish stack will look like this:

 LB - Varnish x 2 - Nginx x 6 - Mongrel x 60

 Some questions about how best to configure the runtime parameters:

 -p obj_workspace=4096
 I can't find obj_workspace in the man page, but found it in the Twitter
 email post:
 http://projects.linpro.no/pipermail/varnish-dev/2009-February/000968.html

 Is obj_workspace how much space is preallocated for the object that gets
 returned from the backend?  So, if my nginx backend returns a web page that
 is over 4 KB, then -p obj_workspace is not enough; would that crash varnish,
 or log the error somewhere?

 -p sess_workspace=262144
 Same deal here with the man page and twitter post.
 What is the sess_workspace?

 http_workspace
 How do sess_workspace and obj_workspace relate to http_workspace?
 If we use obj_workspace=4096 and sess_workspace=262144, does the default
 http_workspace=8192 make sense?


 -p lru_interval=60
 Shows up in the Twitter post again, but no man notes yet.  What's the
 default for this?

 -p sess_timeout=10 \
 Default for this is 5.  If a response from the backend takes longer than
 5 seconds, what happens?  Sometimes we have really slow responses from the
 backend.

 -p shm_workspace=32768 \
 Is this the same as setting the command line flag -l shmlogsize?  The
 default is 80MB, so I don't know why Twitter did both and set it to less.

 -p thread_pools=4 \
 -p thread_pool_min=100 \
 thread_pool_max
 The defaults are 1, 1, 1000 respectively.  I'm wondering how best to determine
 these, or whether to just leave them as default.
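
 As a rough aid for sizing, a sketch of the arithmetic, assuming
 thread_pool_min applies to each pool separately so that the floor on worker
 threads is thread_pools times thread_pool_min. That per-pool reading is an
 assumption here, not something this thread confirms. A varnishstat listing
 elsewhere in this archive shows n_wrk at 200 with thread_pool_min 100, which
 would fit two pools, though the pool count on that host is not shown. The
 function name is hypothetical.

```python
def min_worker_threads(thread_pools, thread_pool_min):
    """Lower bound on worker threads, assuming the minimum is per pool."""
    return thread_pools * thread_pool_min


# Settings quoted from the Twitter post discussed above.
print(min_worker_threads(thread_pools=4, thread_pool_min=100))
```

 Comparing that floor with the n_wrk value reported by varnishstat after
 startup is one way to check whether the assumption holds on a given build.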





 ##
 # output of varnishstat, what is best to look at here?
 ##
 0+07:11:24
 Hitrate ratio:        4        4        4
 Hitrate avg:        nan      nan      nan

  400 0.00 0.02 Client connections accepted
  400 0.00 0.02 Client requests received
1 0.00 0.00 Cache hits
5 0.00 0.00 Cache misses
  399 0.00 0.02 Backend connections success
  399 0.00 0.02 Backend connections failures
1 0.00 0.00 Backend connections reuses
5 0.00 0.00 Backend connections recycles
6  ..   N struct srcaddr
   21  ..   N struct sess_mem
1  ..   N struct sess
1  ..   N struct object
1  ..   N struct objecthead
3  ..   N struct smf
1  ..   N small free smf
1  ..   N large free smf
1  ..   N struct vbe_conn
2  ..   N struct bereq
   10  ..   N worker threads
   23 0.00 0.00 N worker threads created
   76 0.00 0.00 N overflowed work requests
2  ..   N backends
5  ..   N expired objects
6 0.00 0.00 Objects sent with write
  400 0.00 0.02 Total Sessions
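
 One figure that is commonly read off this output is the cache hit ratio. A
 minimal sketch, assuming the ratio is defined as hits divided by hits plus
 misses (passes are not counted here); the function name is hypothetical and
 the inputs are the "Cache hits" and "Cache misses" values from the listing
 above.

```python
def hit_ratio(cache_hit, cache_miss):
    """Fraction of cache lookups that were hits; 0.0 when there were none."""
    lookups = cache_hit + cache_miss
    return cache_hit / lookups if lookups else 0.0


# Counters from the listing above: 1 hit, 5 misses.
print("hit ratio: {:.1%}".format(hit_ratio(1, 5)))
```

 With 400 client requests but only 6 cache lookups, most requests here
 apparently never reach a lookup, so the backend connection counters in the
 same listing are probably worth checking first.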

varnishd runtime parameters

2009-03-23 Thread Tung Nguyen
Hi guys,
So, I'm reading over an archived email thread about Twitter's configuration.

http://projects.linpro.no/pipermail/varnish-dev/2009-February/000968.html

It looks like they had to adjust a lot of parameters... and I'm not finding
all the parameter definitions in the varnishd man pages.  I'm wondering if,
for most cases, running varnish with the defaults is fine?

Any caveats here?  Which runtime parameters should I focus on?

For comparison, with the VCL the two main hooks to pay attention to are
vcl_recv and vcl_fetch.

Any advice is appreciated,
Thanks,
Tung

varnishlog output column meaning

2009-03-22 Thread Tung Nguyen
Hey guys,

In the varnishlog output

https://gist.github.com/8b163cc29fbd5e3141f6

Where can I find out more info on what the columns mean?
So what does the 7 vs 10 mean in the first column?
What does the b vs c mean in the second column?

 7 VCL_return   c pass
10 BackendOpen  b default 127.0.0.1 57141 0.0.0.0 3000
 7 Backend      c 10 default default
10 TxRequest    b HEAD

Thanks in advance for any responses,
Tung
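
For the column question above, a minimal sketch that splits varnishlog lines
into fields and groups them. It assumes each line is laid out as a leading
number, a tag, a c/b marker, and the logged data, and that lines sharing the
same leading number belong to the same session, with c marking client-side
records and b marking backend-side ones. That reading matches my
understanding of varnishlog but is not confirmed in this thread; the class
and function names are hypothetical.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    ident: int  # leading number, shared by lines of one session
    tag: str    # record type, e.g. VCL_return or TxRequest
    side: str   # 'c' (client), 'b' (backend) or '-'
    data: str   # rest of the line


def parse_line(line):
    """Split one varnishlog line into its four assumed fields."""
    ident, tag, side, *rest = line.split(None, 3)
    return LogRecord(int(ident), tag, side, rest[0] if rest else "")


def group_by_ident(lines):
    """Collect records under their leading number, skipping blank lines."""
    groups = {}
    for line in lines:
        if line.strip():
            record = parse_line(line)
            groups.setdefault(record.ident, []).append(record)
    return groups


lines = [
    " 7 VCL_return   c pass",
    "10 BackendOpen  b default 127.0.0.1 57141 0.0.0.0 3000",
    " 7 Backend      c 10 default default",
    "10 TxRequest    b HEAD",
]
for ident, records in sorted(group_by_ident(lines).items()):
    print(ident, [(r.tag, r.side) for r in records])
```

Read this way, the "Backend c 10 default default" record under 7 would be
the client side noting that it is using backend connection 10.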