I've been looking at this more, and no combination of NFILES and fs.file-max seems to fix the problem. When I run `varnishlog -i Debug` alongside the varnishd process, I get tons and tons of these:
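For what it's worth, errno 24 is EMFILE, which is the *per-process* descriptor limit (RLIMIT_NOFILE), not the system-wide fs.file-max, so raising the sysctl alone may never touch the limit that's actually being exhausted. Here's a rough sketch of how I've been sanity-checking both on Linux (the `count_fds` helper name is my own, and the pgrep pattern is an assumption; adjust to taste):

```shell
#!/bin/sh
# Sketch only: count open descriptors for a pid via Linux /proc.
# "count_fds" is a made-up helper, not anything shipped with varnish.
count_fds() {
    ls "/proc/$1/fd" | wc -l
}

# System-wide ceiling vs. this shell's per-process soft limit:
echo "fs.file-max:         $(cat /proc/sys/fs/file-max)"
echo "soft limit (ulimit): $(ulimit -n)"

# Point it at the varnishd *child* (the one running as the varnish user)
# to see how close it gets to the limit before the accept failures start:
pid=$(pgrep -u varnish varnishd | head -n 1)
if [ -n "$pid" ]; then
    echo "varnishd child $pid has $(count_fds "$pid") fds open"
fi
```

Note that `ulimit -n` reports the limit of the shell you run it in, not necessarily the child's, so the check really needs to happen in the same environment that starts varnishd (which is why I put the `ulimit -a` in /etc/init.d/varnish earlier in this thread).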
"Accept failed errno=24", which I believe is the same as the "Too many open files" error. Is anyone else seeing this? Here's a look at varnishstat (after about 8 minutes, which is on the high end of the time between crashes):

client_conn         26744        57.02 Client connections accepted
client_req          64444       137.41 Client requests received
cache_hit           30529        65.09 Cache hits
cache_hitpass           0         0.00 Cache hits for pass
cache_miss          33913        72.31 Cache misses
backend_conn        33914        72.31 Backend connections success
backend_fail            0         0.00 Backend connections failures
backend_reuse       31815        67.84 Backend connections reuses
backend_recycle     31935        68.09 Backend connections recycles
backend_unused          0         0.00 Backend connections unused
n_srcaddr            2145          .   N struct srcaddr
n_srcaddr_act         306          .   N active struct srcaddr
n_sess_mem            525          .   N struct sess_mem
n_sess                439          .   N struct sess
n_object            34061          .   N struct object
n_objecthead        34061          .   N struct objecthead
n_smf               67916          .   N struct smf
n_smf_frag              0          .   N small free smf
n_smf_large             1          .   N large free smf
n_vbe_conn             11          .   N struct vbe_conn
n_bereq               139          .   N struct bereq
n_wrk                 199          .   N worker threads
n_wrk_create          316         0.67 N worker threads created
n_wrk_failed            0         0.00 N worker threads not created
n_wrk_max               0         0.00 N worker threads limited
n_wrk_queue             0         0.00 N queued work requests
n_wrk_overflow        316         0.67 N overflowed work requests
n_wrk_drop              0         0.00 N dropped work requests
n_backend               1          .   N backends
n_expired               0          .   N expired objects
n_lru_nuked             0          .   N LRU nuked objects
n_lru_saved             0          .   N LRU saved objects
n_lru_moved         15941          .   N LRU moved objects
n_deathrow              0          .   N objects on deathrow
losthdr                 0         0.00 HTTP header overflows
n_objsendfile           0         0.00 Objects sent with sendfile
n_objwrite          63764       135.96 Objects sent with write
s_sess              26603        56.72 Total Sessions
s_req               64287       137.07 Total Requests
s_pipe                  0         0.00 Total pipe
s_pass                  0         0.00 Total pass
s_fetch             33831        72.13 Total fetch
s_hdrbytes       20767331     44280.02 Total header bytes
s_bodybytes    2076771265   4428083.72 Total body bytes
sess_closed          2658         5.67 Session Closed
sess_pipeline           0         0.00 Session Pipeline
sess_readahead         92         0.20 Session Read Ahead
sess_herd           61921       132.03 Session herd
shm_records       3795728      8093.24 SHM records
shm_writes         144882       308.92 SHM writes
shm_cont              142         0.30 SHM MTX contention
sm_nreq             67971       144.93 allocator requests
sm_nobj             67915          .   outstanding allocations
sm_balloc      1611931648          .   bytes allocated
sm_bfree      11272970240          .   bytes free
backend_req         33914        72.31 Backend requests made

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:varnish-misc-
> [EMAIL PROTECTED] On Behalf Of Andrew Knapp
> Sent: Monday, March 03, 2008 1:03 PM
> To: Michael S. Fischer
> Cc: [email protected]
> Subject: RE: Child dying with "Too many open files"
>
> I'm actually getting this a lot more frequently while running trunk
> (r2544). Every time the child dies it's cleaning out the cache and
> starting over. Right now it's happening about every 15 seconds, which
> seems crazy.
>
> Any ideas?
>
> -Andy
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:varnish-misc-
> > [EMAIL PROTECTED] On Behalf Of Andrew Knapp
> > Sent: Friday, February 29, 2008 12:54 PM
> > To: Michael S. Fischer
> > Cc: [email protected]
> > Subject: RE: Child dying with "Too many open files"
> >
> > I'm still getting the "Too many open files" error on the child.
> >
> > $ sudo sysctl -a | grep file
> > fs.file-max = 131072
> >
> > NFILES is also set to 131072. Any ideas?
> >
> > -Andy
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> > > Of Michael S. Fischer
> > > Sent: Thursday, February 28, 2008 3:51 PM
> > > To: Andrew Knapp
> > > Cc: [email protected]
> > > Subject: Re: Child dying with "Too many open files"
> > >
> > > I can't help but wonder if you'd set it too high. What happens when
> > > you set NFILES and fs.file-max both to 131072? I've tested that as a
> > > known good value.
> > >
> > > --Michael
> > >
> > > On Thu, Feb 28, 2008 at 2:58 PM, Andrew Knapp <[EMAIL PROTECTED]> wrote:
> > > > Yup, it is. Here's some output:
> > > >
> > > > $ ps auxwww | grep varnish
> > > > root 12036 0.0 0.0 27704 648 ? Ss 14:54 0:00
> > > > /usr/sbin/varnishd -a :80 -f /etc/varnish/photo.vcl -T <internalip>:6082
> > > > -t 120 -w 10,700,30 -s file,/c01/varnish/varnish_storage.bin,12G -u
> > > > varnish -g varnish -P /var/run/varnish.pid
> > > > varnish 12037 1.2 0.4 13119108 39936 ? Sl 14:54 0:00
> > > > /usr/sbin/varnishd -a :80 -f /etc/varnish/photo.vcl -T <internalip>:6082
> > > > -t 120 -w 10,700,30 -s file,/c01/varnish/varnish_storage.bin,12G -u
> > > > varnish -g varnish -P /var/run/varnish.pid
> > > >
> > > > -Andy
> > > >
> > > > > -----Original Message-----
> > > > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> > > > > Of Michael S. Fischer
> > > > > Sent: Thursday, February 28, 2008 1:57 PM
> > > > > To: Andrew Knapp
> > > > > Cc: [email protected]
> > > > > Subject: Re: Child dying with "Too many open files"
> > > > >
> > > > > Is varnishd being started as root? (even if it drops privileges
> > > > > later) Only root can have > 1024 file descriptors open, to my
> > > > > knowledge.
> > > > >
> > > > > --Michael
> > > > >
> > > > > On Thu, Feb 28, 2008 at 11:48 AM, Andrew Knapp <[EMAIL PROTECTED]>
> > > > > wrote:
> > > > > > Didn't really get an answer to this, so I'm trying again.
> > > > > >
> > > > > > I've done some testing with the NFILES variable, and I keep getting
> > > > > > the same error as before ("Too many open files"). I've also verified
> > > > > > that the limit is actually being applied by putting a ulimit -a in
> > > > > > the /etc/init.d/varnish script.
> > > > > >
> > > > > > Anyone have any ideas? I'm running the 1.1.2-5 rpms from sf.net on
> > > > > > CentOS 5.1.
> > > > > >
> > > > > > Thanks,
> > > > > > Andy
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: [EMAIL PROTECTED] [mailto:varnish-misc-
> > > > > > > [EMAIL PROTECTED] On Behalf Of Andrew Knapp
> > > > > > > Sent: Wednesday, February 20, 2008 5:52 PM
> > > > > > > To: Michael S. Fischer
> > > > > > > Cc: [email protected]
> > > > > > > Subject: RE: Child dying with "Too many open files"
> > > > > > >
> > > > > > > Here's the output:
> > > > > > >
> > > > > > > $ sysctl fs.file-max
> > > > > > > fs.file-max = 767606
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> > > > > > > > Of Michael S. Fischer
> > > > > > > > Sent: Wednesday, February 20, 2008 5:48 PM
> > > > > > > > To: Andrew Knapp
> > > > > > > > Cc: [email protected]
> > > > > > > > Subject: Re: Child dying with "Too many open files"
> > > > > > > >
> > > > > > > > What does 'sysctl fs.file-max' say? It should be >= the ulimit.
> > > > > > > >
> > > > > > > > --Michael
> > > > > > > >
> > > > > > > > On Wed, Feb 20, 2008 at 4:04 PM, Andrew Knapp <[EMAIL PROTECTED]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > I'm getting this error when running varnishd:
> > > > > > > > >
> > > > > > > > > >>
> > > > > > > > > Child said (2, 15369): <<Assert error in wrk_thread(),
> > > > > > > > > cache_pool.c line 217:
> > > > > > > > > Condition((pipe(w->pipe)) == 0) not true.
> > > > > > > > > errno = 24 (Too many open files)
> > > > > > > > > >>
> > > > > > > > > Cache child died pid=15369 status=0x6
> > > > > > > > >
> > > > > > > > > uname -a:
> > > > > > > > > Linux <hostname> 2.6.18-53.1.4.el5 #1 SMP Fri Nov 30 00:45:55
> > > > > > > > > EST 2007 x86_64 x86_64 x86_64 GNU/Linux
> > > > > > > > >
> > > > > > > > > command used to start varnish:
> > > > > > > > > /usr/sbin/varnishd -d -d -a :80 -f /etc/varnish/photo.vcl -T
> > > > > > > > > <internalIP>:6082 -t 120 -w 10,700,30 -s
> > > > > > > > > file,/c01/varnish/varnish_storage.bin,12G -u varnish -g
> > > > > > > > > varnish -P /var/run/varnish.pid
> > > > > > > > >
> > > > > > > > > I have NFILES=270000 set in /etc/sysconfig/varnish. Do I just
> > > > > > > > > need to up that value?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Andy

_______________________________________________
varnish-misc mailing list
[email protected]
http://projects.linpro.no/mailman/listinfo/varnish-misc
