---------- Forwarded message ----------
From: Peter Collingbourne <pe...@pcc.me.uk>
Date: Wed, Sep 7, 2011 at 9:17 PM
Subject: Proposal: restat rules
To: ninja-bu...@googlegroups.com

Hi,

In this email I'll try to explain one of the oddities of make (which some CMake-based build systems rely on), and why we can't currently express this in Ninja. I'll also propose how we could extend Ninja to support this behaviour. (Warning, long essay ahead.)

In the LLVM project we maintain a code generator called tblgen, the purpose of which is to generate a number of header files containing various metadata. From time to time, tblgen and its dependent libraries will be modified, causing a rebuild of tblgen. Strictly speaking, we should now rebuild the header files, and all of their reverse dependencies, even if the generated file did not change. This is suboptimal, because the reverse dependencies constitute every object file built from a source file that transitively includes one of the tblgen generated files (which is >50% of object files in the build).

To avoid this problem, we cause tblgen to write its output to a temporary file, and use a utility to copy the temporary file over the target file only if the temporary file is different from the target. In makefile terms, it looks something like this:

-----
all: outputuser.o

tblgen: tblgen.cpp
	touch tblgen

output.inc.tmp: tblgen
	touch output.inc.tmp

output.inc: output.inc.tmp
	if cmp output.inc.tmp output.inc ; then : ; else cp output.inc.tmp output.inc ; fi

outputuser.o: outputuser.cpp output.inc
	touch outputuser.o
-----

Note what happens during an incremental build where tblgen is the only dirty file, but its output file output.inc.tmp does not change relative to output.inc. make initially schedules a rebuild of tblgen, output.inc.tmp, output.inc and outputuser.o. After output.inc has been rebuilt, its timestamp remains the same as before the build. Before make begins to rebuild outputuser.o, it will re-evaluate the dirty state of outputuser.o based on the timestamps of its inputs (i.e. it will stat them again).
Because outputuser.cpp and output.inc are both older than outputuser.o, make doesn't rebuild it after all, despite it being initially scheduled for a build.

The behaviour is different in Ninja, which operates in two phases: the scheduling of the build and the build itself. Like make, Ninja will schedule a rebuild of tblgen, output.inc.tmp, output.inc and outputuser.o. Unlike make, it will rebuild outputuser.o, because it does not re-stat inputs during the build. The key observation here is that unlike make, Ninja currently provides no mechanism for pruning the scheduled build graph during a build using a build rule.

What I propose for Ninja is that we implement this pruning behaviour in a similar way to make, but only for specific rules with a special variable set on the rule. We can call this variable "restat" (suggestions for better names are welcome).

If this variable is present on a rule, Ninja will, after executing the rule command, re-stat each output file to obtain its modification time. If the modification time is unchanged from when Ninja initially stat'ed the file before starting the build, Ninja will mark that output file as clean, and recursively for each reverse dependency of the output file, recompute its dirty status.

As an improvement over what make does, Ninja then stores the current timestamp in the build log entry associated with the output file. This timestamp will be treated by future invocations of Ninja as the output file's modification time instead of the output file's actual modification time for the purpose of deciding whether it is dirty (but not whether its reverse dependencies are dirty).

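To make the proposed pruning concrete, here is a rough sketch in Python. It is not Ninja's implementation: it simulates timestamps in memory rather than calling stat, and the names Edge, is_dirty, build and the changes_output flag are all hypothetical, introduced only for illustration. It uses the same dependency graph as the makefile above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    output: str
    inputs: tuple
    restat: bool = False
    # Simulation knob (not a Ninja concept): whether running the command
    # actually rewrites the output file.
    changes_output: bool = True


def is_dirty(edge, mtime, log, dirty):
    # A timestamp recorded in the build log, if present, stands in for the
    # output's own mtime. Inputs are always compared by their real mtime.
    own = log.get(edge.output, mtime[edge.output])
    return any(i in dirty or mtime[i] > own for i in edge.inputs)


def build(edges, mtime, log, now):
    """Run one build at time `now`; `edges` must be in dependency order.

    Returns the outputs whose command actually ran.
    """
    # Phase 1: schedule every edge that looks dirty up front.
    dirty = set()
    for e in edges:
        if is_dirty(e, mtime, log, dirty):
            dirty.add(e.output)

    # Phase 2: run the scheduled edges, pruning after restat edges.
    ran = []
    for pos, e in enumerate(edges):
        if e.output not in dirty:
            continue
        before = mtime[e.output]
        if e.changes_output:
            mtime[e.output] = now
        ran.append(e.output)
        if not e.restat:
            continue
        if mtime[e.output] != before:
            log.pop(e.output, None)  # output really changed: drop stale entry
            continue
        # Output untouched: mark it clean, record when we checked it, and
        # recompute the dirty status of everything scheduled after it.
        dirty.discard(e.output)
        log[e.output] = now
        for later in edges[pos + 1:]:
            if later.output in dirty and not is_dirty(later, mtime, log, dirty):
                dirty.discard(later.output)
    return ran


EDGES = [
    Edge("tblgen", ("tblgen.cpp",)),
    Edge("output.inc.tmp", ("tblgen",)),
    Edge("output.inc", ("output.inc.tmp",), restat=True, changes_output=False),
    Edge("outputuser.o", ("outputuser.cpp", "output.inc")),
]

if __name__ == "__main__":
    mtime = {"tblgen.cpp": 2, "outputuser.cpp": 1, "tblgen": 1,
             "output.inc.tmp": 1, "output.inc": 1, "outputuser.o": 1}
    log = {}
    print(build(EDGES, mtime, log, now=3))
    print(build(EDGES, mtime, log, now=4))
```

Under these assumptions the first call runs tblgen, output.inc.tmp and output.inc but skips outputuser.o, and the second call runs nothing, because the logged timestamp keeps output.inc from looking older than output.inc.tmp.
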
To give an example of how this would look, here is the above makefile translated to Ninja:

-----
rule touch
  command = touch $out

rule cpifdiff
  command = if cmp $in $out ; then : ; else cp $in $out ; fi
  restat = true

build tblgen: touch tblgen.cpp
build output.inc.tmp: touch tblgen
build output.inc: cpifdiff output.inc.tmp
build outputuser.o: touch outputuser.cpp output.inc

default outputuser.o
-----

Now consider what happens when Ninja is asked at timestamp 3 to rebuild the default target (outputuser.o) where tblgen.cpp has timestamp 2 and all other files have timestamp 1. Ninja will schedule a rebuild of tblgen, output.inc.tmp, output.inc and outputuser.o. Again, suppose that the contents of output.inc.tmp are equal to output.inc when built. So after output.inc has been rebuilt, it still has a timestamp 1. Ninja will notice that the timestamp is the same as at the start of the build, and will mark output.inc as clean. This is propagated through to outputuser.o, which is also marked clean. So no further rebuilding is needed. Ninja will also associate the timestamp 3 with output.inc in the build log.

Suppose that Ninja is invoked again immediately afterwards. The build planner compares output.inc's timestamp in the build log, 3, against the modification time of output.inc.tmp, also 3, so it is marked as clean. It then compares output.inc's actual modification time, 1, against outputuser.o's modification time, also 1, so it is also marked as clean, and this is a no-op build.

There's a small UI issue here in that the total number of files to be rebuilt is unknown during the course of a rebuild until all restat edges are done. There are a number of options for presenting the build status until this happens. I can think of:

1) Show the maximum number of files that could be rebuilt at the current time, and allow this number to drop.
1a) Same as 1, but prioritise the restat edges in order to show a correct total to the user as quickly as possible.
2) Keep the total constant, and treat any skipped outputs as completed.
3) Display a question mark in place of the total until all restat edges are done.
3a) Same as 3, but prioritise the restat edges.

I'm leaning towards 1 at the moment, unless the prioritisation turns out to be easy, in which case I'd go for 3a.

Thanks for reading... thoughts?

Thanks,
--
Peter

--
+1 919 869 8849