On the contrary, thanks for bringing that up. It was good to be made aware of such problems. Just because it works now on Precise and Trusty doesn't mean I should skip that check, as I might well run into an obsolete or removed feature (not that upstart seems to have those). =)

On Oct 14, 2015 9:56 AM, "Robie Basak" <[email protected]> wrote:
> On Wed, Oct 14, 2015 at 12:34:33PM -0000, Tiago Stürmer Daitx wrote:
> > According to http://upstart.ubuntu.com/cookbook/#stanzas-by-category
> > both "kill timeout" (all upstart versions) and "kill signal" (since
> > upstart 1.3) should be supported by upstart 1.5 in Precise so this SRU
> > should be fine.
>
> You're right. Sorry for the noise.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1073478
>
> Title:
> [SRU] Update squid3 upstart script to kill it with SIGINT and wait
> longer
>
> Status in squid3 package in Ubuntu:
> Triaged
>
> Bug description:
> [Impact]
>
> Squid 3.1.19-1ubuntu3.12.04.3 on Precise and 3.3.8-1ubuntu6.3 on
> Trusty do not clean up their PID file, causing logrotate to spit out an
> error each night when cron runs it.
>
> The underlying issue here is that the upstart script does not wait
> long enough for squid3 to finish - one might also argue that it is
> using the wrong signal if it expects squid to finish quickly.
>
> By default, when squid3 receives a SIGTERM it will close the socket
> for incoming connections and will wait for existing connections to
> complete. The shutdown_lifetime directive configures how long it waits
> before forcefully closing those open connections and it is set by
> default to 30 seconds.
>
> Current setting, SIGKILL happens after 5 seconds (which is the upstart
> default):
> =====
> # strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
> Process 20865 attached - interrupt to quit
> 0.000000 --- SIGTERM (Terminated) @ 0 (0) ---
> 0.000440 rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
> 0.000418 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
> 5.006483 +++ killed by SIGKILL +++
>
> Killing squid with SIGKILL might lead to cache index corruption and a
> very slow startup the next time it is started:
> http://lists.squid-cache.org/pipermail/squid-users/2015-July/004752.html
>
> The first way to avoid this is by increasing upstart's "kill timeout"
> to a few seconds more than the "shutdown_lifetime" directive.
>
> With "kill timeout 40" in upstart script, notice the pid file being
> unlinked (the ~30 seconds wait is due to the default value of
> shutdown_lifetime directive):
> =====
> # strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
> Process 20805 attached - interrupt to quit
> 0.000000 --- SIGTERM (Terminated) @ 0 (0) ---
> 0.000284 rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
> 0.000402 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
> 31.611726 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
> 0.001792 --- SIGCHLD (Child exited) @ 0 (0) ---
> 0.000247 rt_sigreturn(0xffffffffffffffff) = 63
> 0.006027 unlink("/var/run/squid3.pid") = 0
> 0.000539 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 0
> 0.000387 open("/etc/group", O_RDONLY|O_CLOEXEC) = 0
> Process 20805 detached
>
> Still, this will unfortunately lead to a slow shutdown time as
> reported in
> http://askubuntu.com/questions/18127/squid3-starts-and-stops-slowly
>
> To solve this it is better to issue a SIGINT instead of SIGTERM
> (upstart default) by adding "kill signal SIGINT" to squid's upstart
> script.
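
For anyone who wants to check an installed job file against the two stanzas discussed above, here is a rough shell sketch. It is only an illustration: has_kill_stanzas is a hypothetical helper (not part of upstart or the squid3 package), it only recognises the "SIGINT" spelling used in the bug description, and the job file path in the usage note is an assumption.

```shell
#!/bin/sh
# Sketch only: has_kill_stanzas is a hypothetical helper, not part of
# upstart or the squid3 package.
#
# Succeeds when the given upstart job file contains an uncommented
# "kill signal SIGINT" stanza and a "kill timeout N" stanza with N at
# least min_timeout seconds.
has_kill_stanzas() {
    job_file=$1
    min_timeout=$2

    # Only the spelling used in the bug description is recognised here.
    grep -Eq '^[[:space:]]*kill[[:space:]]+signal[[:space:]]+SIGINT([[:space:]]|$)' "$job_file" || return 1

    # Extract the numeric value of "kill timeout N". If several such lines
    # are present, this sketch simply uses the last one (an assumption).
    timeout=$(sed -n 's/^[[:space:]]*kill[[:space:]][[:space:]]*timeout[[:space:]][[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$job_file" | tail -n 1)

    [ -n "$timeout" ] || return 1
    [ "$timeout" -ge "$min_timeout" ]
}
```

Something like `has_kill_stanzas /etc/init/squid3.conf 31` would then check that the timeout exceeds the default shutdown_lifetime of 30 seconds, assuming the job file lives at that path.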
>
> With "kill signal SIGINT", which does not wait on outstanding
> connections:
> =====
> # strace -r -e trace=signal,file -p $(cat /var/run/squid3.pid)
> Process 20891 attached - interrupt to quit
> 0.000000 --- SIGINT (Interrupt) @ 0 (0) ---
> 0.000312 rt_sigreturn(0x2) = -1 EINTR (Interrupted system call)
> 0.000237 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
> 1.123564 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
> 0.000754 --- SIGCHLD (Child exited) @ 0 (0) ---
> 0.000103 rt_sigreturn(0xffffffffffffffff) = 0
> 0.002031 unlink("/var/run/squid3.pid") = 0
> 0.000154 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 0
> 0.000142 open("/etc/group", O_RDONLY|O_CLOEXEC) = 0
> Process 20891 detached
>
> The whole stop process is very quick. The downside is that squid will
> forcefully close all open connections immediately, but that was
> already happening when SIGKILL was being issued before - so we are
> actually keeping the same behavior as before and there are no bugs
> complaining about it. And the PID file gets removed anyway.
>
> The proposal is to add both "kill signal SIGINT" and "kill timeout 40"
> to squid, so squid should have enough time to close all open cache
> index files.
>
> [Test Case]
>
> # start squid3
> squid3 start/running, process 10392
> # cat /var/run/squid3.pid
> 10392
> # stop squid3
> squid3 stop/waiting
> # cat /var/run/squid3.pid
> 10392
> # logrotate -f /etc/logrotate.d/squid3
> squid: ERROR: Could not send signal 10 to process 10392: (3) No such process
> error: error running shared postrotate script for '/var/log/squid3/*.log '
>
> [Regression Potential]
>
> * The main regression potential is related to the "kill timeout" value.
> If set too high, a user's system might *seem* too slow to shut down. Note
> that this should rarely happen: only when squid actually needs a long time
> to write down and close its cache index files (high system load maybe?).
> Overall, whenever a system behaves this way the user probably has other
> problems to solve.
> * If the "kill timeout" value is set too low we might run into the same
> problem originally reported and end up sending SIGKILL to squid too early,
> probably while it is closing its cache index files, thus corrupting them.
> This was probably happening already anyway, as SIGKILL is always
> sent after 5 seconds in the current upstart script.
>
> Overall, there is very low regression potential for this fix.
>
>
> [Other info]
>
> # lsb_release -d
> Description: Ubuntu 12.04.5 LTS
> # dpkg -l squid3
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> ||/ Name               Version            Description
> +++-==================-==================-====================================================
> ii  squid3             3.1.19-1ubuntu3.12 Full featured Web Proxy cache (HTTP proxy)
>
> # lsb_release -d
> Description: Ubuntu 14.04.3 LTS
> # dpkg -l squid3
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> ||/ Name                                        Version                    Architecture               Description
> +++-===========================================-==========================-==========================-===========================================================================================
> ii  squid3                                      3.3.8-1ubuntu6.3           amd64                      Full featured Web Proxy cache (HTTP proxy)
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/squid3/+bug/1073478/+subscriptions
>

--
You received this bug notification because you are a member of Ubuntu Bugs,
which is subscribed to Ubuntu.

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
