Hello. I have had a critical problem with Varnish Cache in production for over a month, and any help would be appreciated. The Varnish child process is repeatedly restarted after 10 to 20 hours of use, with the following messages:

Jun 23 09:15:13 b858e4a8bd72 varnishd[11816]: Child (11824) not responding to CLI, killed it.
Jun 23 09:15:13 b858e4a8bd72 varnishd[11816]: Unexpected reply from ping: 400 CLI communication error
Jun 23 09:15:13 b858e4a8bd72 varnishd[11816]: Child (11824) died signal=9
Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child cleanup complete
Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child (24038) Started
Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child (24038) said Child starts
Jun 23 09:15:14 b858e4a8bd72 varnishd[11816]: Child (24038) said SMF.s0 mmap'ed 483183820800 bytes of 483183820800

The following link has the varnishstat output from just 1 minute before a restart:
https://pastebin.com/g0g5RVTs

Environment:
varnish-5.1.2 revision 6ece695
Debian 8.7 - Debian GNU/Linux 8 (3.16.0)
Installed using pre-built package from official repo at packagecloud.io
CPU 2x2.9 GHz
Mem 3.69 GiB
Running inside a Docker container
NFILES=131072
MEMLOCK=82000

Additional info:
- I need to cache a large number of objects and the cache should last for almost a week, so I have set up a 450G storage space. I don't know if this is a problem;
- I use bans a lot. There were about 40k bans in the system just before the last crash. I really don't know if this is too many or whether it has anything to do with the problem;
- No CPU spikes were registered (usage is almost always around 30%);
- No panic is reported; the only info I can retrieve is from syslog;
- The whole time, even moments before the crashes, everything is okay and requests are answered very quickly.

Best,
Stefano Baldo
