Hi, I found a bug in ccache, which makes it impossible to correctly interrupt a compilation with a control-C (I tried this on Linux).
Consider the following C++ program from hell, which takes 13 seconds to compile on my machine (change "27" to a higher number to make it even slower):

    template <int TreePos, int N>
    struct FibSlow_t {
        enum { value = FibSlow_t<TreePos, N - 1>::value
                     + FibSlow_t<TreePos + (1 << N), N - 2>::value, };
    };
    template <int TreePos> struct FibSlow_t<TreePos, 2> { enum { value = 1 }; };
    template <int TreePos> struct FibSlow_t<TreePos, 1> { enum { value = 1 }; };
    static int s_value = FibSlow_t<0, 27>::value;

Compile this with:

    CCACHE_RECACHE=1 ccache g++ -c example.cc

Now try to interrupt this with control-C and note something really strange: the first control-C is seemingly ignored and compilation continues! The second control-C does work and stops the compilation. When ccache is run from a build system (e.g., "ninja" or "make"), control-C doesn't work at all.

The reason why the second interrupt works is simple: you used signal(), whose archaic behavior is to install the signal handler for one delivery only. So after the first ^C, the second one gets the default signal handling, i.e., exit, which works :-)

But why doesn't the first ^C, which calls signal_handler(), work? Why does the compilation continue? It turns out that signal_handler() doesn't exit after cleaning up. Rather, the log shows:

    [2015-07-23T15:04:02.088253 21059] Executing /usr/bin/g++ -c -o z.o /home/nyh/.ccache/tmp/z.stdout.rice.21059.nHZLHE.ii
    <here I pressed control-C>
    [2015-07-23T15:04:05.847059 21059] Unlink /home/nyh/.ccache/1/a/873cb37b579cd5dd45ca9c43ae8030-644.o.tmp.stdout.rice.21059.9FTbHl
    [2015-07-23T15:04:05.847138 21059] Failed opening /home/nyh/.ccache/tmp/tmp.cpp_stderr.rice.21059.AQUGJX: No such file or directory
    [2015-07-23T15:04:05.847147 21059] Failed; falling back to running the real compiler
    [2015-07-23T15:04:05.847151 21059] Executing /usr/bin/g++ -c z.cc

So, after the signal, we delete the temporary file but keep going (in waitpid() in execute.c).
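To make it concrete what a parent process sees from waitpid() when its child is killed by a signal, here is a minimal standalone sketch. It is not ccache's code: run_child() is a hypothetical helper I made up for illustration, and the convention of returning -1 for "killed by a signal" is an assumption chosen to mirror the behavior described in this report.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not ccache's execute()): fork a child that either
 * kills itself with signal `sig` or, if sig == 0, exits with `exit_code`.
 * Returns -1 if the child died from a signal, -2 on an internal error,
 * and the child's exit code otherwise.
 */
int run_child(int sig, int exit_code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        return -2; /* fork failed */
    }
    if (pid == 0) {
        if (sig != 0) {
            signal(sig, SIG_DFL); /* make sure the default action applies */
            raise(sig);           /* child dies here, like g++ on ^C */
        }
        _exit(exit_code);
    }

    /* If our own signal handler interrupts the wait, just wait again. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            return -2;
        }
    }

    if (WIFSIGNALED(status)) {
        return -1; /* child was killed by a signal, e.g. SIGINT */
    }
    return WEXITSTATUS(status);
}
```

A caller can then tell "the child was interrupted" (run_child(SIGINT, 0) yields -1) apart from "the child ran and failed" (run_child(0, 3) yields 3), and decide not to retry in the first case.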
Very soon afterwards, waitpid() discovers that the child also died (it got the SIGINT signal too, like every process in the terminal's foreground process group). The execute() call returns -1. But the code ignores that and treats it as a general "failure" to run the compiler, which then causes it to run the compiler again! So it's not that the compilation isn't interrupted: it actually is interrupted, and then restarted!

One way to fix this bug is to recognize that execute() returning -1 has a special meaning (a signal), and that ccache should then exit rather than try to run the compiler again.

The following patch fixes the bug. I'm not sure it's the "best" fix, but it works:

@@ -839,10 +840,15 @@
 		args_add(args, i_tmpfile);
 	}
 	cc_log("Running real compiler");
 	status = execute(args->argv, tmp_stdout_fd, tmp_stderr_fd);
+	if (status == -1) {
+		/* The compiler was interrupted by a signal */
+		exit(1);
+	}
+
 	args_pop(args, 3);
 	if (x_stat(tmp_stdout, &st) != 0) {
 		/* The stdout file was removed - cleanup in progress? Better bail out. */
 		stats_update(STATS_MISSING);

-- 
Nadav Har'El
n...@cloudius-systems.com
_______________________________________________
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache