I'm planning on prototyping this just to generate numbers. I don't think I need permission to do that! But, of course, to incorporate any changes into the code base, we need consensus.
I agree that stout optimizations are outside of the scope of this discussion. Any stout optimizations are orthogonal to PCH, and thus they need not be linked together. Note that stout optimizations may be less "pressing" with PCH, but still it's separate. The fact that PCH may help stout just indicates that PCH is a good thing, particularly on platforms like Windows (where we get to include windows.h, a massive file).

Also, I wanted to clarify a message from Benjamin. I did NOT mean to imply that PCH takes 20 seconds to generate. I was simply saying that PCH reads the headers ONCE and generates the PCH. As such, I don't believe that "bloat" is an issue here. In actuality, generating the PCH takes about as long as reading the headers. But you read them once and generate the PCH; you don't read them once for each source file. That's the speed-up for PCH; a ton of header processing is done once.

When I used PCH in the past, it took about 4 seconds to read all my headers. That 4 seconds was then subtracted from all the source compilations. That is, 4 seconds to generate, then all the compiles were 4 seconds faster.

Regarding Andy's points:

1. I agree, we need a benchmarked prototype. Note that I will only benchmark a particular directory, I don't intend to benchmark EVERYTHING. One directory should give us enough of an idea to see how it works.

2. Maintaining ccache compatibility is a good thing. BUT I don't think it's a hard requirement. If PCH on Linux gives us reasonable performance without ccache, then I don't see a lot of value in maintaining ccache compatibility. Now, that said, I will try to do so (why not?). But I'm not sure if these workarounds for ccache will work on Windows; we'll see during the prototyping stage.

3. Maintaining the correct includes is nice, but not at the cost of compiler speed. I'm not sure if Windows has "multiple include optimizations". I will include this in my prototyping. If it does, then I agree it would be very nice to maintain this.
BUT in practice, it will be hard over time. After all, if you include mesos_common.h (either literally or by build system), you may not realize that you're missing an include without that. And I don't think it's "worth it" to build twice to catch this, once with PCH and once without. That's ugly, in my honest opinion.

4. I totally disagree about auto-generating the PCH. We should go through the sources and pick what makes sense. Auto-generating implies that we auto-generate all the time (on every build), and I'd rather not scan the sources during a build (with an associated speed hit) just to try and speed up the build.

Let me get some hard numbers under my belt. From that, we can make intelligent decisions about where to go.

/Jeff

-----Original Message-----
From: Andy Schwartzmeyer [mailto:andsc...@microsoft.com.INVALID]
Sent: Wednesday, February 15, 2017 1:31 PM
To: dev <dev@mesos.apache.org>
Subject: Re: Proposal for Mesos Build Improvements

Hi,

I worked with Jeff on the initial proposal for pre-compiled headers and library refactor. I think this thread should focus on the former, potentially implementing pre-compiled headers, and have a separate conversation on Jeff's original second suggestion of using more libraries inside Mesos.

With that in mind, I think we have some requirements for the pre-compiled header implementation.

* First and foremost, we need a benchmarked prototype that proves pre-compiled headers provide a considerable speed-up. As the most complex headers are those of the header-only Stout library, we should also benchmark improvements from making Stout non-header-only, and then prioritize; but this will likely be a separate discussion.

* We must maintain ccache compatibility, as the majority of Mesos developers already use ccache. It appears the most straightforward way to do this is to _not_ `#include common.h`, but to `-include` it; this fits well with the next requirement.

* We must maintain correct includes; i.e.
Mesos should be compilable without the pre-compiled header. Because of multiple-include optimization, this should not affect the gains from the use of pre-compiled headers. Again, this fits well with the next requirement.

* We should automatically generate the pre-compiled header, as this eliminates manual maintenance. Combined with the above two points, this approach should actually negate the original code-churn problem.

By generating a common header to pre-compile, and using `-include`, we will not have to modify existing source files. This would both give us ccache compatibility and ensure that the correct includes would be maintained (and thus can be refactored independently of this work).

Did I miss any points, or can we move forward with prototyping this?

Thanks,

-- Andy

________________________________________
From: Benjamin Bannier <benjamin.bann...@mesosphere.io>
Sent: Wednesday, February 15, 2017 12:26 PM
To: dev
Subject: Re: Proposal for Mesos Build Improvements

Hi,

> I wonder if we should instead use headers like:
>
> <- mesos_common.h ->
> #include <a>
> #include <b>
> #include <c>
>
> <- xyz.cpp, which needs headers "b" and "d" ->
> #include "mesos_common.h"
>
> #include <b>
> #include <d>
>
> That way, the fact that "xyz.cpp" logically depends on <b> (but not
> <a> or <c>) is not obscured (in other words, Mesos should continue to
> compile if 'mesos_common.h' is replaced with an empty file).

That's an interesting angle for a number of reasons.
It would allow local reasoning about correct includes, and it also appears to help maintain support for ccache'd builds, https://ccache.samba.org/manual.html#_precompiled_headers

For that one could include project headers such as `mesos_common.h` via a command line switch to the compiler invocation, without the need to make any changes to source files (possibly an interesting way to create some benchmarking POC of this proposal). Not changing source files for this would be valuable as it would keep build setup idiosyncrasies out of the source. If we wouldn't change files we'd keep the possibility to make PCH use opt-in.

Right now a ccache build of the Mesos source files and tests with warm ccache takes less than 50s on my 8 core machine (a substantial fraction of this time is spent in serializing (non-parallelizable) linking steps, and I'd bet there is also some ~10s overhead from Make stat'ing files and changing directories in there). Generating precompiled headers would throw in an additional serializing step, and even if it really only would take 20s to generate a header as guesstimated by Jeff, we would already be approaching a point of diminishing returns on platforms with ccache, even if we compiled every source file in no time.

> Does anyone know whether the header guard in <b> _should_ make the
> repeated inclusion of <b> relatively cheap?
Not sure how much information gcc or clang would need to serialize from the PCH, but there is of course some form of multi-include optimization in both gcc and clang, see e.g., https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html

Cheers,

Benjamin
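
The trade-off argued over in this thread (a one-off, serial PCH generation step against header parsing that is otherwise repeated for every source file) can be put into a rough back-of-the-envelope model. The sketch below is only an illustration under stated assumptions: the struct, its field names, and the functions are hypothetical and not part of Mesos, and the model ignores linking, ccache hits, and the cost of loading the PCH in each translation unit.

```cpp
#include <cassert>

// Hypothetical back-of-the-envelope model; none of these names come from
// Mesos. It ignores linking, ccache hits, and the cost of loading the PCH
// in each translation unit.
struct BuildModel {
  int sources;        // number of source files to compile
  int jobs;           // parallel compile jobs (e.g. make -j)
  double headerSecs;  // per-source time spent parsing the common headers
  double bodySecs;    // per-source time spent on everything else
  double pchSecs;     // one-off, serial time to generate the PCH
};

// Estimated wall-clock seconds when every source re-parses the headers.
double withoutPch(const BuildModel& m) {
  assert(m.jobs > 0);
  return m.sources * (m.headerSecs + m.bodySecs) / m.jobs;
}

// Estimated wall-clock seconds with one serial PCH step up front, after
// which each source skips the header parsing.
double withPch(const BuildModel& m) {
  assert(m.jobs > 0);
  return m.pchSecs + m.sources * m.bodySecs / m.jobs;
}

// Number of sources at which both estimates are equal; above it the PCH
// comes out ahead in this model.
double breakEvenSources(const BuildModel& m) {
  assert(m.headerSecs > 0);
  return m.pchSecs * m.jobs / m.headerSecs;
}
```

As an example, `BuildModel{100, 8, 4.0, 6.0, 4.0}` (the 100 sources and the 6 s of non-header work are made-up inputs; the 4 s and 8 jobs echo figures mentioned in the thread) gives 125 s without PCH and 79 s with it, with a break-even at 8 sources. Raising `pchSecs` to the 20 s guess moves the break-even to 40 sources, which is why only a benchmarked prototype can settle the question.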