Re: [ccache] ccache interrupt handling bug
On Sat, Aug 29, 2015 at 8:01 PM, Joel Rosdahl wrote: > On 20 August 2015 at 21:04, Joel Rosdahl wrote: > >> I will not have time to work on improving the signal handler until next >> week at the earliest. > > > I've now had a look at improving the signal handler. It would be > appreciated if somebody wants to review it: > https://git.samba.org/?p=ccache.git;a=commitdiff;h=737c96fabeab9d66407758b848468aa49190c946 > Hi, Looking at this code, I don't see any obvious holes and it all seems correct. One thing you should perhaps consider is whether this change (using sigaction, signal sets, instead of the antique signal(), etc.) might break compilation on some antique UNIX machines or on MS-Windows (!?) or something. But I doubt any present-day system actually lacks sigaction() and friends, so I think your code is good. -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
Re: [ccache] ccache interrupt handling bug
On Mon, Aug 17, 2015 at 12:50 AM, Tom Lane wrote: > > * make launches "ccache gcc -o test.o test.c" > * user types control-C > * ccache receives SIGINT and exits > * make does unlink("test.o") to clean up after failed compile step, which > it thinks is done > * gcc creates test.o > Stupid question: Does ccache let gcc write to test.o directly, or does it write to some temporary file and then ccache copies it? > * gcc receives SIGINT and exits > > We're left with a probably-broken test.o in the filesystem, despite > make's attempt to clean up. Now, the window for this is pretty narrow, > but I think it could happen: I don't believe that sending signals to > multiple processes is atomic in most kernels. > > You really don't want ccache to exit until after all externally-observable > things have stopped happening. (I recently fixed a somewhat related bug > in Postgres :-(, which is why this caught my eye.) > > regards, tom lane > -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
Re: [ccache] ccache interrupt handling bug
On Sun, Aug 9, 2015 at 11:23 PM, Joel Rosdahl wrote: > Hi Nadav, > > (Sorry for the delayed reply.) > > Hi, I found a bug in ccache, which makes it impossible to >> correctly interrupt a compilation with a control-C (I tried this on Linux). >> [...] > > > Thanks for the bug report and analysis. This will be fixed in ccache 3.2.3. > > Regards, > -- Joel > > Thanks. I see now the fix in 3.2.3, which if I understand correctly, is simply: --- a/ccache.c +++ b/ccache.c @@ -341,6 +341,7 @@ signal_handler(int signo) { (void)signo; clean_up_pending_tmp_files(); + _exit(1); } First of all, thanks. This indeed fixes the bug, and is exactly the first patch I tried to fix it this problem. I want to explain, though, why I ended up sending a different patch for this problem - a 4-line patch instead of this one-liner. The explanation is not very important, and I should probably have done this earlier, but perhaps someone would find it interesting. When the user types ^C, the operating system sends SIGINT to all processes belonging to the terminal's process group - and in particular it sends SIGINT to *both* the child compiler and ccache, separately. Your patch makes ccache exit as soon as it gets the SIGINT - but the child compiler might still be running for a while longer. This will usually be fine, but I thought the user can be surprised if he sees ccache (which he considers to be the compiler) exit, but "ps" shows the compiler is still running. This would be especially annoying if some hypothetical compiler trapped SIGINT deliberately, and continued to run long after ccache exits. My approach, was to let ccache continue after the SIGINT, and continue to wait (as it always does) for the child compiler to finish. The child, who received a SIGINT too, willl indeed finish quickly. The code already recognizes this case of the child killed by a signal, and returns status == -1. So all my 4-line patch did was to not restart the compiler if it died with a status = -1 (i.e., a signal) and not an ordinary failure. But ccache does not exit immediately on a signal - just after the child compiler exits. Again, I guess this destinction is not very important, but I just wanted to point it out in case anybody thinks it is important. -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
Re: [ccache] ccache interrupt handling bug
On Sun, Aug 9, 2015 at 11:23 PM, Joel Rosdahl wrote: > > Hi, I found a bug in ccache, which makes it impossible to >> correctly interrupt a compilation with a control-C (I tried this on Linux). >> [...] > > > Thanks for the bug report and analysis. This will be fixed in ccache 3.2.3. > Thanks. Funny thing is that after I fixed this bug, I realized that several of my colleagues also encountered it many times before, but assumed was a problem in "make" or "gcc" and treated it as a fact of life - not realizing it was actually caused by ccache and simple to fix. So it will be great to have it fixed. Thanks. -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
[ccache] ccache interrupt handling bug
Hi, I found a bug in ccache, which makes it impossible to correctly interrupt a compilation with a control-C (I tried this on Linux). Consider the following C++ program from hell that takes 13 seconds to compile on my machine (change "27" to a higher number to make it even slower): template struct FibSlow_t { enum { value = FibSlow_t::value + FibSlow_t::value, }; }; template struct FibSlow_t { enum { value = 1 }; }; template struct FibSlow_t { enum { value = 1 }; }; static int s_value = FibSlow_t<0, 27>::value; Compile this with: "CCACHE_RECACHE=1 ccache g++ -c example.cc" Now try to interrupt this with control-C and note something really strange: The first control-C is seemingly ignored and compilation continues! The second control-C does work and stops compilation. When this is run from some build system (e.g., "ninja" or "make"), control-C doesn't work at all. The reason why the second interrupt works is simple: you used signal(), whose archaic behavior is to set the signal handler for only one go. So after the first ^C, the second one gets the default signal handling, i.e., exit, which works :-) But why doesn't the first ^C, which calls signal_handler(), work, and cause the compilation to continue? It turns out that signal_handler() doesn't exit after cleaning up. Rather, the log shows: [2015-07-23T15:04:02.088253 21059] Executing /usr/bin/g++ -c -o z.o /home/nyh/.ccache/tmp/z.stdout.rice.21059.nHZLHE.ii [2015-07-23T15:04:05.847059 21059] Unlink /home/nyh/.ccache/1/a/873cb37b579cd5dd45ca9c43ae8030-644.o.tmp.stdout.rice.21059.9FTbHl [2015-07-23T15:04:05.847138 21059] Failed opening /home/nyh/.ccache/tmp/tmp.cpp_stderr.rice.21059.AQUGJX: No such file or directory [2015-07-23T15:04:05.847147 21059] Failed; falling back to running the real compiler [2015-07-23T15:04:05.847151 21059] Executing /usr/bin/g++ -c z.cc So, after the signal, we delete the temporary file but continue (in waitpid() in execute.c). Very soon afterwards, waitpid() discovers the child also died (it also got the SIGINT signal, like all the processes connected to the terminal's process group). The execute() call returns -1. But the code ignores that, and considers this a general "failure" to run the compiler, which then causes it to run the compiler again! So it's not that the compilation isn't interrupted - it actually is, and then restarted! One way to fix this bug is to recognize that the fact execute() returned -1 has a special meaning (a signal), and ccache should exit and not try to run the compiler again. The following patch fixes the bug. I'm not sure it's the "best" fix, but it works: @@ -839,10 +840,15 @@ args_add(args, i_tmpfile); } cc_log("Running real compiler"); status = execute(args->argv, tmp_stdout_fd, tmp_stderr_fd); +if (status == -1) { +/* The compiler was interrupted by a signal */ +exit(1); +} + args_pop(args, 3); if (x_stat(tmp_stdout, &st) != 0) { /* The stdout file was removed - cleanup in progress? Better bail out. */ stats_update(STATS_MISSING); -- Nadav Har'El n...@cloudius-systems.com ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache