On Friday 30 May 2008 14:01:35, Audun Ytterdal wrote:
> I run trunk in front of a site. I have 3 varnish servers, all with 32GB
> of RAM, serving only small pictures, thumbnails and profile pictures.
> The cacheset is pretty large (1.5 TB) and changes a lot over time.
> And before you all ask why I don't just serve partitioned data from
> several apache/nginx/lighttpd servers: it's because we're not there
> yet. The varnishes all fetch their content from one lighttpd server.
>
> I run into the thread pileup problem.
>
> I've set the thread limit to 1500 and the count usually lies between 80
> and 700. While restarting it hits the 1500 limit and stays there for a
> few minutes. Then it gradually manages to control traffic and ends up
> around 80 threads. It usually grows a bit, but not over 700-1000 ish.
> But suddenly, under high traffic, it goes up to the limit, be it 1500
> or 4000 or whatever I set it to. Then it stays there and usually never
> recovers without a restart.
> I guess it's because the backend at some point answers slowly. But is
> there an easier way to get out of this situation?
>
> Running varnish like this:
>
> (redhat 4.6 32 GB RAM)
>
> /usr/sbin/varnishd -a :80 -f /etc/varnish/nettby.vcl -T 127.0.0.1:82
> -t 120 -w 2,2000,30 -u varnish -g varnish -p client_http11 on -p
> thread_pools 4 -p thread_pool_max 4000 -p listen_depth 4096 -p
> lru_interval 3600 -h classic,500009 -s
> file,/var/varnish/varnish_storage.bin,30G -P /var/run/varnish.pid
>
> and for testing purposes
>
> (redhat 5.1 32 GB RAM)
>
> varnish 22959 17.1 45.2 20187068 14938024 ? Sl May29 160:40
> /usr/sbin/varnishd -a :80 -f /etc/varnish/nettby.vcl -T 127.0.0.1:82
> -t 120 -u varnish -g varnish -p thread_pools 4 -p thread_pool_max
> 2000 -p client_http11 on -p listen_depth 4096 -p lru_interval 3600 -h
> classic,500009 -s malloc,60G -P /var/run/varnish.pid
>
> Each varnish handles about 3000 req/s before it caves in.
>
> Any suggestions?

Not really sure if I can help, but at least I can tell you that we run a
similar setup. However, our images are never changed; they only get
deleted, and when that happens a PURGE request clears them off the
proxies. Therefore we take a slightly different route: a huge file-based
cache (over 500 GB) and a huge default_ttl (one year; only 404 errors
have a smaller ttl of 3 hours).

We also have 32 GB installed, and set thread_pool_max to 8000 and
-h classic,2500009, according to a hint in the wiki. I have not looked at
the thread count for a long time and don't really know whether 8000 is
good or bad. Since everything works nicely, I just keep it as it is.

Having said all this, our request pattern is not really like yours. The
cache file is only about 60 % used after 50 days, and peak traffic for
each of the three proxies is only about 300 req/s. One proxy would
probably cope well with our traffic; we run three as a simple HA measure.

Cheers,
Sascha

> Are the parameters sane?
>
> --
> Audun

_______________________________________________
varnish-misc mailing list
[email protected]
http://projects.linpro.no/mailman/listinfo/varnish-misc
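
As a rough aid for comparing the two setups described in this thread, the storage figures quoted above can be put side by side. This is only a back-of-the-envelope sketch: the helper name `cache_coverage` is made up for illustration, decimal units (1 TB = 1000 GB) are an assumption, and the result is an upper bound on how much of the working set one cache can hold, not a hit-rate prediction.

```python
# Figures are taken from the thread; decimal units are an assumption.
GB_PER_TB = 1000


def cache_coverage(cache_gb, working_set_gb):
    """Largest fraction of the working set that one cache could hold."""
    return min(1.0, cache_gb / working_set_gb)


# First setup: "The cacheset is pretty large (1.5 TB)".
working_set_gb = 1.5 * GB_PER_TB

file_cov = cache_coverage(30, working_set_gb)    # -s file,...,30G
malloc_cov = cache_coverage(60, working_set_gb)  # -s malloc,60G

print(f"30G file storage:   {file_cov:.1%} of the working set")
print(f"60G malloc storage: {malloc_cov:.1%} of the working set")

# Second setup: cache file of over 500 GB, about 60 % used after 50 days,
# so 500 GB is a lower bound on the file size here.
used_gb = 0.60 * 500
print(f"second setup holds at least about {used_gb:.0f} GB of objects")
```

Under these assumptions a single 30G or 60G store can hold only a few percent of the 1.5 TB set, while the second setup never fills its cache file, which is one concrete way the two request patterns differ.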
