Not lurker friendly at all, indeed. You'll need to avoid req.* expressions. The easiest way is to stash the host, user-agent and url in beresp.http.* and ban against those (unset them in vcl_deliver).
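A rough sketch of that approach, in case it helps. The x-ban-* header names, the BAN request method used as the trigger, and the backend address are all placeholders I made up for illustration; adapt them to however your bans are issued today (access control for the BAN method is left out here). One caveat: the stored user-agent is that of the client whose request triggered the fetch, so the Googlebot condition does not behave exactly like the req.* version.

```vcl
vcl 4.0;

# Placeholder backend so the file loads on its own (address is an assumption).
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Hypothetical trigger: a request with method BAN adds the ban.
    if (req.method == "BAN") {
        # Only obj.* fields are tested, so the ban lurker can evaluate
        # this ban in the background without a client request.
        ban("obj.http.x-ban-host == " + req.http.host +
            " && obj.http.x-ban-url ~ " + req.url +
            " && obj.http.x-ban-ua !~ Googlebot");
        return (synth(200, "Ban added"));
    }
}

sub vcl_backend_response {
    # Stash the request details on the object at fetch time.
    set beresp.http.x-ban-host = bereq.http.host;
    set beresp.http.x-ban-url = bereq.url;
    set beresp.http.x-ban-ua = bereq.http.User-Agent;
}

sub vcl_deliver {
    # Keep the helper headers out of client responses.
    unset resp.http.x-ban-host;
    unset resp.http.x-ban-url;
    unset resp.http.x-ban-ua;
}
```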
I don't think you need to expand the VSL at all.

--
Guillaume Quintard

On Jun 26, 2017 16:51, "Stefano Baldo" <[email protected]> wrote:

Hi Guillaume.

Thanks for answering.

I'm using an SSD. I've changed from ext4 to ext2 to increase performance, but it is still restarting.
Also, I checked the I/O performance of the disk and there is no sign of overhead.

I've moved /var/lib/varnish to a tmpfs and increased its 80m default size by passing "-l 200m,20m" to varnishd and using "nodev,nosuid,noatime,size=256M 0 0" for the tmpfs mount. There was a problem here. After a couple of hours Varnish died and I received a "no space left on device" message. Deleting /var/lib/varnish solved the problem and Varnish was up again, but it's weird because there was free memory on the host to be used by the tmpfs directory, so I don't know what could have happened. I will try to stop increasing the /var/lib/varnish size.

Anyway, I am worried about the bans. You asked me if the bans are lurker friendly. Well, I don't think so. My bans are created this way:

    ban("req.http.host == " + req.http.host + " && req.url ~ " + req.url + " && req.http.User-Agent !~ Googlebot");

Are they lurker friendly? I took a quick look at the documentation and it looks like they're not.

Best,
Stefano

On Fri, Jun 23, 2017 at 11:30 AM, Guillaume Quintard <[email protected]> wrote:

> Hi Stefano,
>
> Let's cover the usual suspects: I/Os. I think here Varnish gets stuck
> trying to push/pull data and can't make time to reply to the CLI. I'd
> recommend monitoring the disk activity (bandwidth and iops) to confirm.
>
> After some time, the file storage is terrible on a hard drive (SSDs take a
> bit more time to degrade) because of fragmentation. One solution to help
> the disks cope is to overprovision them if they're SSDs, and you can try
> different advice values in the file storage definition in the command line
> (last parameter, after granularity).
>
> Is your /var/lib/varnish mount on tmpfs? That could help too.
>
> 40K bans is a lot, are they ban-lurker friendly?
>
> --
> Guillaume Quintard
>
> On Fri, Jun 23, 2017 at 4:01 PM, Stefano Baldo <[email protected]>
> wrote:
>
>> Hello.
>>
>> I have been having a critical problem with Varnish Cache in production for
>> over a month and any help will be appreciated.
>> The problem is that the Varnish child process is repeatedly being restarted
>> after 10~20h of use, with the following message:
>>
>> Jun 23 09:15:13 b858e4a8bd72 varnishd[11816]: Child (11824) not responding to CLI, killed it.
>> Jun 23 09:15:13 b858e4a8bd72 varnishd[11816]: Unexpected reply from ping: 400 CLI communication error
>> Jun 23 09:15:13 b858e4a8bd72 varnishd[11816]: Child (11824) died signal=9
>> Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child cleanup complete
>> Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child (24038) Started
>> Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child (24038) said Child starts
>> Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child (24038) said SMF.s0 mmap'ed 483183820800 bytes of 483183820800
>>
>> The following link is the varnishstat output just 1 minute before a
>> restart:
>>
>> https://pastebin.com/g0g5RVTs
>>
>> Environment:
>>
>> varnish-5.1.2 revision 6ece695
>> Debian 8.7 - Debian GNU/Linux 8 (3.16.0)
>> Installed using pre-built package from official repo at packagecloud.io
>> CPU 2x2.9 GHz
>> Mem 3.69 GiB
>> Running inside a Docker container
>> NFILES=131072
>> MEMLOCK=82000
>>
>> Additional info:
>>
>> - I need to cache a large number of objects and the cache should last for
>> almost a week, so I have set up a 450G storage space. I don't know if this
>> is a problem;
>> - I use ban a lot. There were about 40k bans in the system just before the
>> last crash. I really don't know if this is too much or may have anything to
>> do with it;
>> - No registered CPU spikes (almost always around 30%);
>> - No panic is reported, the only info I can retrieve is from syslog;
>> - The whole time, even moments before the crashes, everything is
>> okay and requests are being answered very fast.
>>
>> Best,
>> Stefano Baldo
>>
>> _______________________________________________
>> varnish-misc mailing list
>> [email protected]
>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
