Re: [ccache] ccache interrupt handling bug
On Sat, Aug 29, 2015 at 8:01 PM, Joel Rosdahl j...@rosdahl.net wrote: On 20 August 2015 at 21:04, Joel Rosdahl j...@rosdahl.net wrote: I will not have time to work on improving the signal handler until next week at the earliest. I've now had a look at improving the signal handler. It would be appreciated if somebody wants to review it: https://git.samba.org/?p=ccache.git;a=commitdiff;h=737c96fabeab9d66407758b848468aa49190c946 Hi, Looking at this code, I don't see any obvious holes and it all seems correct. One thing you should perhaps consider is whether this change (using sigaction, signal sets, instead of the antique signal(), etc.) might break compilation on some antique UNIX machines or on MS-Windows (!?) or something. But I doubt any present-day system actually lacks sigaction() and friends, so I think your code is good. -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
Re: [ccache] ccache interrupt handling bug
On Sun, Aug 9, 2015 at 11:23 PM, Joel Rosdahl j...@rosdahl.net wrote: Hi Nadav, (Sorry for the delayed reply.) Hi, I found a bug in ccache, which makes it impossible to correctly interrupt a compilation with a control-C (I tried this on Linux). [...] Thanks for the bug report and analysis. This will be fixed in ccache 3.2.3. Regards, -- Joel Thanks. I see now the fix in 3.2.3, which if I understand correctly, is simply: --- a/ccache.c +++ b/ccache.c @@ -341,6 +341,7 @@ signal_handler(int signo) { (void)signo; clean_up_pending_tmp_files(); + _exit(1); } First of all, thanks. This indeed fixes the bug, and is exactly the first patch I tried to fix it this problem. I want to explain, though, why I ended up sending a different patch for this problem - a 4-line patch instead of this one-liner. The explanation is not very important, and I should probably have done this earlier, but perhaps someone would find it interesting. When the user types ^C, the operating system sends SIGINT to all processes belonging to the terminal's process group - and in particular it sends SIGINT to *both* the child compiler and ccache, separately. Your patch makes ccache exit as soon as it gets the SIGINT - but the child compiler might still be running for a while longer. This will usually be fine, but I thought the user can be surprised if he sees ccache (which he considers to be the compiler) exit, but ps shows the compiler is still running. This would be especially annoying if some hypothetical compiler trapped SIGINT deliberately, and continued to run long after ccache exits. My approach, was to let ccache continue after the SIGINT, and continue to wait (as it always does) for the child compiler to finish. The child, who received a SIGINT too, willl indeed finish quickly. The code already recognizes this case of the child killed by a signal, and returns status == -1. So all my 4-line patch did was to not restart the compiler if it died with a status = -1 (i.e., a signal) and not an ordinary failure. But ccache does not exit immediately on a signal - just after the child compiler exits. Again, I guess this destinction is not very important, but I just wanted to point it out in case anybody thinks it is important. -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
[ccache] ccache interrupt handling bug
Hi, I found a bug in ccache, which makes it impossible to correctly interrupt a compilation with a control-C (I tried this on Linux). Consider the following C++ program from hell that takes 13 seconds to compile on my machine (change 27 to a higher number to make it even slower): template int TreePos, int N struct FibSlow_t { enum { value = FibSlow_tTreePos, N - 1::value + FibSlow_tTreePos + (1 N), N - 2::value, }; }; template int TreePos struct FibSlow_tTreePos, 2 { enum { value = 1 }; }; template int TreePos struct FibSlow_tTreePos, 1 { enum { value = 1 }; }; static int s_value = FibSlow_t0, 27::value; Compile this with: CCACHE_RECACHE=1 ccache g++ -c example.cc Now try to interrupt this with control-C and note something really strange: The first control-C is seemingly ignored and compilation continues! The second control-C does work and stops compilation. When this is run from some build system (e.g., ninja or make), control-C doesn't work at all. The reason why the second interrupt works is simple: you used signal(), whose archaic behavior is to set the signal handler for only one go. So after the first ^C, the second one gets the default signal handling, i.e., exit, which works :-) But why doesn't the first ^C, which calls signal_handler(), work, and cause the compilation to continue? It turns out that signal_handler() doesn't exit after cleaning up. Rather, the log shows: [2015-07-23T15:04:02.088253 21059] Executing /usr/bin/g++ -c -o z.o /home/nyh/.ccache/tmp/z.stdout.rice.21059.nHZLHE.ii here I pressed control-C [2015-07-23T15:04:05.847059 21059] Unlink /home/nyh/.ccache/1/a/873cb37b579cd5dd45ca9c43ae8030-644.o.tmp.stdout.rice.21059.9FTbHl [2015-07-23T15:04:05.847138 21059] Failed opening /home/nyh/.ccache/tmp/tmp.cpp_stderr.rice.21059.AQUGJX: No such file or directory [2015-07-23T15:04:05.847147 21059] Failed; falling back to running the real compiler [2015-07-23T15:04:05.847151 21059] Executing /usr/bin/g++ -c z.cc So, after the signal, we delete the temporary file but continue (in waitpid() in execute.c). Very soon afterwards, waitpid() discovers the child also died (it also got the SIGINT signal, like all the processes connected to the terminal's process group). The execute() call returns -1. But the code ignores that, and considers this a general failure to run the compiler, which then causes it to run the compiler again! So it's not that the compilation isn't interrupted - it actually is, and then restarted! One way to fix this bug is to recognize that the fact execute() returned -1 has a special meaning (a signal), and ccache should exit and not try to run the compiler again. The following patch fixes the bug. I'm not sure it's the best fix, but it works: @@ -839,10 +840,15 @@ args_add(args, i_tmpfile); } cc_log(Running real compiler); status = execute(args-argv, tmp_stdout_fd, tmp_stderr_fd); +if (status == -1) { +/* The compiler was interrupted by a signal */ +exit(1); +} + args_pop(args, 3); if (x_stat(tmp_stdout, st) != 0) { /* The stdout file was removed - cleanup in progress? Better bail out. */ stats_update(STATS_MISSING); -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache