On 11/25/14 10:50 PM, Andreas Gal wrote:

Would it make sense to check in some of the libraries we build that we very 
rarely change, and that don’t have a lot of configure dependencies people 
twiddle with? (icu, pixman, cairo, vp8, vp9). This could speed up build times 
in our infrastructure and for developers. This doesn’t have to be in 
mozilla-central. mach could pick up a matching binary for the current 
configuration from github or similar. Has anyone looked into this?

Let me rephrase this request: you are asking for a cache of binary artifacts for the build.

Yes, this is critically important for developer productivity.

Yes, it has been looked at extensively.

Yes, it is achievable.

But it is a lot of work, and historically low engineering investment in the build system has prevented this from coming to fruition thus far.

Some background.

There are 2 ways you can build your cache: high-level or low-level.

In the low-level approach, you effectively have a globally distributed ccache. This is what Mike Hommey has built in sccache. It's what release automation uses. It even works on Windows. The low-level approach performs per-object lookup when the build system is ready to produce that object. Since the build system produces, e.g., .o files and then .so files, you must first obtain cached values for the intermediate objects before you move on to the final objects. This is the nature of a low-level cache.
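
To make the per-object lookup concrete, here is a minimal sketch in Python. It is not how ccache or sccache are actually implemented; the names object_cache_key and ObjectCache are hypothetical, and an in-memory dict stands in for the local or remote store. The assumption is that an object file can be keyed by the compiler identity, the command-line arguments, and the preprocessed source.

```python
import hashlib


def object_cache_key(compiler_id, args, preprocessed_source):
    """Hash everything assumed to determine the contents of one object file."""
    h = hashlib.sha256()
    h.update(compiler_id.encode())
    h.update(b"\0")
    for arg in args:
        # Separate arguments so ["-O", "2"] and ["-O2"] hash differently.
        h.update(arg.encode())
        h.update(b"\0")
    h.update(preprocessed_source)
    return h.hexdigest()


class ObjectCache:
    """Hypothetical stand-in for a cache store (local disk, S3, ...)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_build(self, key, build):
        """Return the cached object for key, or call build() and cache it."""
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        obj = build()
        self._store[key] = obj
        return obj
```

The build system would call get_or_build() once per object, at the moment it is ready to produce that object. That is why the intermediates always have to be resolved before the final link.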

In the high-level approach, you recognize what the final output is and jump straight to fetching that. e.g. if all you really need is libxul, you'll fetch libxul.so. None of this intermediary .o files foo.

Different audiences benefit from the different approaches.

Firefox desktop, Fennec, and FxOS developers benefit mostly from a high-level approach, as they don't normally care about changing C++. They can jump straight to the end without paying a penalty of dealing with intermediaries.

Gecko/C++ developers care about the low-level approach, as they'll be changing C++ things that invalidate the final output, so they'll be fetching intermediate objects out of necessity.

Implementing an effective cache either way relies on several factors:

* For a high-level cache, a build system capable of skipping intermediates to fetch the final entity (notably *not* make).
* Consistent build environments across release automation and developer machines (otherwise the binaries are different and you sacrifice cache hit rate or "accuracy").
* People having fast internet connections to the cache (round trips don't take longer than building locally).
* Fixing C++ header dependency hell so when C++ developers change something locally, it doesn't invalidate the world, causing excessive cache misses and local computation.
* Writing to a globally distributed cache that is also read by release automation has some fun security challenges.
* Having a database to correlate source tree state with build artifacts *or* a build system that is able to compute the equivalent DAG to formulate a cache key (something we can't do today).
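
The last point, computing a cache key from the build DAG, can be sketched as follows. This is an illustration under assumptions, not something the build system can do today: target_key, EXAMPLE_DEPS, and EXAMPLE_CONTENT are hypothetical, the graph is assumed to be acyclic, and leaf inputs are represented by their raw bytes.

```python
import hashlib

# Hypothetical toy graph: each target maps to the things it is built from.
EXAMPLE_DEPS = {
    "libxul.so": ["a.o", "b.o"],
    "a.o": ["a.cpp", "common.h"],
    "b.o": ["b.cpp"],
}

# Hypothetical contents of the leaf inputs (source files and headers).
EXAMPLE_CONTENT = {
    "a.cpp": b"// a",
    "b.cpp": b"// b",
    "common.h": b"// shared header",
}


def target_key(target, deps, content, _memo=None):
    """Key a target by its own content plus the keys of all its dependencies."""
    if _memo is None:
        _memo = {}
    if target in _memo:
        return _memo[target]
    h = hashlib.sha256()
    h.update(target.encode())
    h.update(b"\0")
    h.update(content.get(target, b""))
    # Sort so the key does not depend on the order dependencies were listed.
    for dep in sorted(deps.get(target, ())):
        h.update(target_key(dep, deps, content, _memo).encode())
    _memo[target] = h.hexdigest()
    return _memo[target]
```

In this toy graph, editing common.h changes the keys of a.o and libxul.so but leaves b.o alone. In a tree where nearly every object depends on the same headers, one local edit changes nearly every key, which is the "invalidate the world" problem from the list above.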

There is a lot buried in that bullet list. These problems aren't going to solve themselves overnight.

A low-level cache is achievable. We already have one in ccache and sccache. (ccache is local, sccache is distributed in S3). However, we won't be able to get cache hits from sccache until we reproduce the release automation build environment on local machines. Deterministic, bit-identical builds, anyone? Fortunately, Morgan Phillips is working on making release automation's build environment distributable, unblocking this aspect.

The high-level cache requires separate things. Modern build systems have artifact caches built in. They can jump straight to the end result and skip intermediaries. We can't have nice things with the 30+ year old tool that is GNU Make. Well, we could; it would just require us to invent a build mode that short-circuits compilation, linking, etc. and manually fetches the final object from the cache. IMO we should build this for Firefox and Firefox OS developers. We kinda/sorta already have this in xulrunner and parts of Firefox OS builds. We know the approach works. We just need to make it turnkey and better integrated with release automation.
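
A short-circuiting build mode of that kind could look roughly like the sketch below. Everything in it is hypothetical: artifact_key and obtain_final_artifact are made-up names, the remote store is modeled as a plain dict, and the key assumes a source revision plus the build configuration is enough to identify the final artifact (which only holds if build environments are consistent).

```python
import hashlib
import json


def artifact_key(revision, config):
    """Derive a key from the source revision and the build configuration."""
    payload = json.dumps({"rev": revision, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def obtain_final_artifact(name, revision, config, remote, build_locally):
    """Fetch the final artifact if the cache has it, else build it locally.

    remote is any mapping-like store with a get() method; build_locally is
    a zero-argument callable that runs the normal compile and link steps.
    Returns (artifact_bytes, "fetched" or "built").
    """
    key = name + "-" + artifact_key(revision, config)
    blob = remote.get(key)
    if blob is not None:
        # Cache hit: no compilation, no linking, no intermediate objects.
        return blob, "fetched"
    return build_locally(), "built"
```

A front-end developer who hasn't touched C++ would take the "fetched" path for libxul.so and never see a .o file. Anyone with local C++ changes falls through to the normal build, where the low-level cache takes over.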

I'd like to see us invest in both high-level and low-level approaches, as I believe both audiences are large enough to warrant targeted investment. Historically, we've leaned heavily towards low-level.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
