Hi Lasse, I finally had some time to do some more debugging.
I tried many different setting, and I still run into the same bug. In order for you to reproduce it as easily and as fast as possible please find the following gist https://gist.github.com/huguesalary/fc59034d3a49745755f4 which contain: - information about the machine running varnish - the configuration snippet - a command line triggering the bug 100% of the time on my debian servers and my docker containers (that are hosted on a debian server too) - a syslog output - a dmesg output - the *full* output from varnishlog. I successfully reproduced the bug with the default varnish setting, but also with really high values for these settings: pcre_match_limit=1000000 pcre_match_limit_recursion=1000000 workspace_client=50M If you'd like I can file a bug report in the tracker. Thank you for your help. -Hugues On Thu, Nov 13, 2014 at 9:03 AM, Hugues Alary <[email protected]> wrote: > Hi Lasse, > > Thanks for you answer. I hadn't thought of looking in syslog, here's what > I found: > > Nov 13 04:12:55 varnishd[24316]: Child (26394) died signal=11 > Nov 13 04:12:55 varnishd[24316]: Child cleanup complete > Nov 13 04:12:55 varnishd[24316]: child (26614) Started > Nov 13 04:12:55 varnishd[24316]: Child (26614) said Child starts > Nov 13 04:12:59 kernel: varnishd[26830]: segfault at 6543feb43f00 ip > 0000654412dd27ed sp 00006543feb43ec0 error 6 in > libpcre.so.3.13.1[654412dbf000+3c000] > Nov 13 04:12:59 kernel: grsec: From 58.7.84.48: Segmentation fault > occurred at 00006543feb43f00 in /usr/sbin/varnishd[varnishd:26830] > uid/euid:65534/65534 gid/egid:65534/65534, parent > /usr/sbin/varnishd[varnishd:24316] uid/euid: > 0/0 gid/egid:0/0 > > libpcre? Sounds promising. > > I'll try to increment the prce_match_limit and report back. > > Side note: on my mac, I can reproduce the bug, but not consistently. I > noticed in my logs these lines: > > 11/13/14 8:58:13.507 AM varnishd[12948]: Child (12958) said HASHMAGIC: > <47bf313ea1f6e04499e1751a3b9ca878bb51faaaf5e2128eeed6ef5b465a8e6c> -> > <000000000000000000000000000000000000000000000000000000HASHMAGIC: > <003c006a00b100dc02a0> > 11/13/14 8:58:13.507 AM varnishd[12948]: Child (12958) said > 68ab0cfdc1acf821774561a994e85995e9527eae9720da768d1066> -> > <0000000000000000000000000000000000000000000000000000000000000040> > 11/13/14 8:58:13.508 AM varnishd[12948]: Child (12958) said HASHMAGIC: > <b365fe539a4392542d968e3a3256e9feac46227167564b320fe5f21c1b5467d5> -> > <0000000000000000000000000000000000000000000000000000000000000080> > 11/13/14 8:58:13.515 AM varnishd[12948]: Child (12958) said HASHMAGIC: > <1908209a8bdd9ae3690cd3c7937000311dd0df8546d152bf5c683ef17be0f086> -> > <0100000000000000000000000000000000000000000000000000000000000000> > 11/13/14 8:58:13.516 AM varnishd[12948]: Child (12958) said HASHMAGIC: > <92bf1ef1dc244bdf6e8b09b06576c4de2888f9c1aebba2270911e4c20a0ce569> -> > <0200000000000000000000000000000000000000000000000000000000000000> > 11/13/14 8:58:13.558 AM varnishd[12948]: Child (12958) said HASHMAGIC: > <43d7cc8e4b069561e6336eebca63b9766667f952648003d0ca35211d9ec07e55> -> > <8000000000000000000000000000000000000000000000000000000000000000> > 11/13/14 8:58:13.563 AM varnishd[12948]: Child (12958) said HASHMAGIC: > <33bcb999060165791f7b08a0d9cc25bca64e737ed4faea19ecbe1fdb955a0d4d> -> > <4000000000000000000000000000000000000000000000000000000000000000> > > Is this varnish somehow degrading gracefully when the bug occurs? (which > would explain why I can reproduce is 100% of the time) > > > > On Thu, Nov 13, 2014 at 6:31 AM, Lasse Karstensen < > [email protected]> wrote: > >> On Thu, Nov 13, 2014 at 01:37:48AM -0800, Hugues Alary wrote: >> > I just upgraded to from 3 (.0.5/6) to 4.0.2 on my debian system. >> > I converted my old varnish 3 code to 4.0.2. The varnish 3 code has been >> > working flawlessly for more than 2 years. >> > Now, in my varnish 4 VCL, everything works fine except for a very >> specific >> > piece of code that seem to crash(?) varnish. I say crash with a question >> > mark as I haven't been able to actually confirm it crashes. >> > The symptom of the bug is the following: when I connect with a mobile >> > browser, the connection seems like it's being reset, and the browser >> shows: >> [..] >> > In varnishadm panic.show tells me that no panic happened recently. >> > My guess is that my condition is so long that Varnish can't handle it, >> >> Hi. >> >> I pieced together your VCL as best I could (your email client/the list >> software line wrapped some regexes), but I'm not able to reproduce your >> problem with 4.0.2 on debian. >> >> A bit of a long shot, but are there any mentions of Varnish 4 segfaults >> in dmesg? >> >> My guess would be that you either are running out of workspace_client or >> or are hitting pcre_match_limit. I suggest you increase them and see if >> the problem persists. >> >> >> -- >> Lasse Karstensen >> Varnish Software AS >> > >
_______________________________________________ varnish-misc mailing list [email protected] https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
