In this post, I'm going to ramble a little about how abuild handles
multiple platforms. I'm sorry that this post is very long and
complicated. It is probably not possible to comprehend all this
material in one pass. It is also described in the abuild manual. As
always, my purpose of explaining this is to try to make sure some of the
issues I ran into are on the radar for gradle developers. Abuild's
platform support is pretty mature since I had plenty of prior experience
dealing with the issues, though it still has some loose ends. The vast
majority of the complexity in abuild's platform support is there to
support cross compilers, embedded platforms, link-time incompatibility
across different compilers or compiler options, and other issues that
may be critical to enterprise builds but probably won't affect most
small projects. It's also very likely that I have overcomplicated this
and that a simpler solution may present itself. So please don't walk
away from this thinking that I am suggesting that gradle's platform
support has to be as complicated as abuild's. I hope it doesn't, and I
know it doesn't have to be at first.
Many of the ideas here are good, and a few are probably not so great.
I'll point out where I am aware of weaknesses. I don't know for sure,
but my impression is that the concept of platforms is not baked into
gradle in the way it was baked into abuild since platform considerations
are not as integral to a java-based system. I think platforms have to
be first class in a build system that is really going to do justice to
C++ and other source-to-object types of environments, though a lot of
these concepts don't have to be there for a system that's good enough to
handle most common cases.
In abuild, platforms are divided into three layers: "target type",
"platform type", and "platform." There are only three target types:
platform-independent (which I'll call "indep"), Java, and "object
code". In retrospect, I'm not sure there was really any value in
separating java and indep. I had originally done this because I thought
I wanted to support different Java versions as different platforms so,
for example, you could have some parts of the system you built with only
Java 5 and some that you built with Java 5 and Java 6 with the
constraint that Java 6 things could use Java 5 artifacts but not the
other way around. In practice, we never actually did this. I won't
talk any more about abuild's Java handling, though, since that's gradle's
native territory. I'm just mentioning it as an explanation for why
abuild separated java and indep. So the target type basically covers
the "build paradigm". I'm not aware of anything that wouldn't fit into
either the object code or the java/indep paradigm. The core
capabilities of abuild are the same for all the target types, except
that abuild's platform logic is only enabled for the object code target
type. I'm not even sure this distinction is
really necessary...it may have been sufficient just to classify each
build item as platform-dependent or platform-independent. Originally I
had some capabilities of abuild's interface language that enabled
variables to have a scope limited to the target type, but in practice,
that feature was never useful. So, on to platform types and
platforms. The rest of this message will focus only on object code builds.
In abuild, the platform type is what the developer specifies for a build
item. The main use of platform types is to distinguish native builds
from cross compiled builds, but they can also be used for other
purposes. Abuild has three built-in platform types, one for each target
type. The built-in types are java, indep, and native, where native is
the platform type for the object code target type. You can use plugins
to add platform types to abuild, but only for the object code target type.
Platform types are hierarchical in that it is possible to say that an
item with platform type "A" is allowed to depend on an item with
platform type "B" but not the other way around. Abuild always lets you
override this with platform-specific dependencies, but its automatic
resolution of dependency platforms obeys the hierarchy. For example,
all platform types are allowed to depend on the built-in platform type
"indep", but items of type "indep" aren't allowed to depend on
platform-specific items without an override. A more interesting use of
platform types shows up when supporting cross compilers. The main
system that used abuild included a number of components targeted at
VxWorks, a real-time operating system that we used for code running on
some embedded hardware. VxWorks is its own operating system, much like
UNIX or Windows, with certain ways of doing things that are going
to be common across all VxWorks systems, at least within a particular major
version. Then, you also have "board support packages" which include
specific drivers or additional calls that may be specific to a
particular board's special capabilities. Typically, a vxworks system may
have the significant majority of its code not dependent on any
particular board support package, but a little bit of code may have to
be written separately for each supported board. The way you would do
this with abuild is to declare a platform type "vxworks" and then child
platform types for each board support package (bsp), like
"vxworks-bsp1", "vxworks-bsp2", etc. (but named sensibly after the
actual board). A board-specific piece would be declared to be in one of
the bsp-specific platform types. That item could depend on anything in
the straight "vxworks" platform type without having to do anything
special. So if something in vxworks-bsp1 and something else in
vxworks-bsp2 were both implementations of the same interface, the
interface package could be declared to be in vxworks and both
implementations could depend on it.
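To make that concrete, here is roughly what the build configuration for
a board-specific implementation and its shared interface might look
like. The item names are made up, and the syntax shown is approximate
rather than exact:

    # build item for the bsp1-specific implementation (illustrative)
    name: driver-impl-bsp1
    platform-types: vxworks-bsp1
    deps: driver-interface

    # build item for the shared interface (illustrative)
    name: driver-interface
    platform-types: vxworks

Because vxworks-bsp1 is a child of vxworks, the dependency from
driver-impl-bsp1 to driver-interface resolves without any special
handling.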
It would also be necessary in many cases to have a facade wrapped around
board-specific packages so that some generic code could depend on
something board-specific and automatically use the most appropriate
implementation. To achieve this, abuild allowed you to create what I
called a "pass-through build item" which didn't declare a platform type
at all but that depended on multiple items of different platform types.
Suppose you have items A1 and A2 of types vxworks-bsp1 and vxworks-bsp2,
B without a declared type (so it's pass-through), and C1 and C2 of types
vxworks-bsp1 and vxworks-bsp2. A1 and A2 can both depend on B, which is
some facade around the C items, and abuild will automatically set up the
appropriate dependencies. In many cases, it would be better not to use
separate platform types for these but instead to just use some kind of
conditional compilation or autoconf-like tests to separate the code
based on board support package, but the problem is that sometimes you
actually have completely different compilers or at least different
invocations of the compiler that are needed for different board support
packages.
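Schematically, B's configuration would just declare dependencies and no
platform type (again, names made up and syntax approximate):

    # build item for B, the pass-through facade (illustrative)
    name: B
    deps: C1 C2

    # resulting platform-aware wiring:
    #   A1 (vxworks-bsp1) -> B -> C1 (vxworks-bsp1)
    #   A2 (vxworks-bsp2) -> B -> C2 (vxworks-bsp2)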
It's also possible (and common) for items to have multiple platform
types as long as they are all object-code types, so some code that works
on both native and vxworks could declare both platform types. A
native-only item and a vxworks-only item could then both depend on
something declared with both platform types. This would be typical for utility
code or anything that's not platform-specific, which should hopefully be
most of the system.
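Such an item would just list both types, along these lines (syntax
approximate):

    name: portable-utils
    platform-types: native vxworks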
Platform types are static in the build. They can be added to abuild
with plugins, which are resolved very early in abuild's life prior to
trying to compute the dependency graph, but the idea behind platform
types is that nothing dynamic about the system can change the shape of
the dependency graph with respect to platform types. In fact, platform
types are added by creating a text file that lists them.
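For the vxworks example above, a plugin might declare its platform types
with a file along these lines (the format shown here is illustrative,
not abuild's exact syntax):

    platform-type vxworks
    platform-type vxworks-bsp1 -parent vxworks
    platform-type vxworks-bsp2 -parent vxworks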
The reason for this restriction is that I didn't want to have a
situation where the dependency graph would resolve properly in some
environments and not in others. Abuild basically won't let you do
anything if there are errors in the dependency graph, and I didn't want
it to be possible for people
who weren't working on the embedded platforms, for example, to introduce
errors that wouldn't affect them but that would break the build entirely
on some other platforms. This was not one of the most loved abuild
features, but I think it was pretty important. If abuild had had a more
robust way of going into some kind of degraded mode in the presence of
errors in corners of the dependency graph, this restriction would not
have been necessary. Of course, if gradle's platform system is less
complex, this issue may also be avoidable. Hopefully gradle can be a
little smarter about this than abuild was. However, in practice, this
didn't cause much of a problem. Resolving platform type dependency
errors was almost always a simple matter of either adding an additional
platform type to the list of types an item was allowed to build on or
creating a new pass-through item, and systems without embedded,
cross-compiled pieces could largely ignore all this.
The last layer of the platform system was the platform itself. I will
say that people got very confused about the distinction between platform
and platform type, so maybe better terminology could have been used.
Anyway, while platform types are statically declared and are intended to
describe the scope of portability, if you will, for a given piece of
code, platforms are tied to the specific toolset used to build
something. So, while the platform type might be something like
"native", the platform might be something like "g++ on Linux" or "Visual
C++ on Windows". (I'm oversimplifying a little, but that's the idea.)
Note that abuild explicitly did not use platform types for Linux vs.
Windows or other OS differences that would be better handled with tools
like autoconf. autoconf support was present in abuild from day one, and
I can't imagine trying to do anything real without it. Autoconf could
not be used to specify which compiler to run, however; that was built
into abuild's platform support. (My next post will be about
compiler support.) Abuild's autoconf support let you use autoconf to
figure out things like which version of standard library calls you might
have available, and a whole host of other kinds of things that people
need to test for to write portable code. Abuild itself used autoconf to
find the location of the boost C++ libraries and to make sure that a
version with the asynchronous I/O library was present, for example. At
runtime, abuild figures out all the platforms that are available for a
given platform type and then figures out which platforms to build for
each build item. The idea of platforms is that a platform defines the
boundaries of link compatibility. For example, C++ code is generally
not link-compatible across compilers because the way compilers do
name-mangling is not specified. (C code is usually linkable across
compilers, but abuild doesn't let you do that very easily, which is a
weakness. You just end up building C code with multiple compilers.)
Abuild is a little too hard-line about platforms in some cases. For
example, on Windows with msvc (Microsoft Visual C++), "release" and
"debug" for a given compiler are different platforms because you almost
always have to compile separately for release and debug and you
generally can't mix and match because of which runtime libraries have to
be shipped with the code, among possibly other reasons. With Linux and
gcc, mixing debug and non-debug code is harmless and usually works fine,
and the -g flag, which adds debugging symbols, only affects the symbol
table and not the code. However, in practice, it's often convenient to
build debugging versions of code with compiler optimizations turned off
because such code is easier to look at in a debugger.
To understand how abuild uses platforms, it is necessary to understand
how it constructs the dependency graph. Abuild maintains two dependency
graphs: one that is platform-agnostic and one that is platform-aware.
The platform-agnostic dependency graph can be built statically and looks
the same regardless of what platform the build runs on. It knows about
platform types and allows any build item to depend on any other build
item as long as the platform types are compatible. The
platform-agnostic dependency graph is resolved very early in the build,
and it must be error-free and acyclic for abuild to do anything.

Once abuild knows which build items it is going to build, it then
figures out which platforms to build them on based on the platform
types. A platform type usually resolves to multiple platforms. At a
minimum, for example, you'll get the debug and release platforms for a
specific compiler. Actually, by default, abuild offers three variants:
"debug", which is debugging flags without optimization; "release", which
is optimization without debugging flags; and "normal", which uses both
debugging and optimization. The normal variant is the most useful
setting for light debugging and, for most Linux distributions, is the
normal way code is compiled, particularly since -g just affects the
symbol table, so stripped binaries look the same with or without -g.

Once abuild resolves platform types to platforms, it constructs a "build
graph" whose nodes, rather than being build items, are build
item/platform pairs. So, for example, let's say you have two compilers:
gcc and xlc, which is IBM's commercial compiler for AIX and also for
Linux on the powerpc platform. If A and B are both platform type
"native" and both gcc and xlc offer three variants for compilation, the
static dependency graph is just A -> B, but the build graph is
A/gcc-debug -> B/gcc-debug, A/gcc-release -> B/gcc-release, A/xlc-debug
-> B/xlc-debug, etc.
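Spelling that out with just the debug and release variants, the single
static edge expands into one build graph edge per platform:

    static graph:  A -> B

    build graph:   A/gcc-debug   -> B/gcc-debug
                   A/gcc-release -> B/gcc-release
                   A/xlc-debug   -> B/xlc-debug
                   A/xlc-release -> B/xlc-release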
By default, abuild builds each build item on only one platform,
depending on what flags you invoke it with: for a release build, it
would only build the release platforms, and for a debug build, it would
only build the debug platforms. You can tell abuild to try to build on
all platforms, in which case it will compile most things multiple ways,
potentially all at the same time. For simple "native" cases, the build
graph is usually isomorphic to the static graph, but there are other
cases. If a build item is available on multiple platforms and a
non-default platform is needed to resolve a dependency, abuild will
build that item on that platform to satisfy the dependency. So certain
build item/platform pairs can be built on demand as needed.

This is actually a pretty unusual case, and the way it can happen is
based on the scope of the platform plugin, so maybe this is just a
non-issue for gradle, at least for a while. It can also happen if you
force a platform by overriding abuild's default behavior. Typically
this would happen in the case of a particular tool that has to be
compiled with a certain compiler. For example, maybe you have a CPU
with special math operations and a special compiler that supports those
operations to give you much higher performance. Let's say the build
defaults to gcc; A, B, and C can all build with gcc or xlc; and A and B
both depend on C. If you did a build, abuild would just build all three
items with gcc and leave it at that. There would be no reason to also
do the xlc build. Suppose instead that you had overridden the platforms
for A and were forcing A to build with xlc because it had some behavior
that would only be useful when built with xlc. Now abuild will still
build B with gcc only, but it will build C with both gcc and xlc, since
the xlc build of C is needed to satisfy the build graph dependency
"A/xlc -> C/xlc".
There may be cases where a node in the build graph can't be resolved.
Unlike errors in the platform-agnostic dependency graph, errors in the
build graph are just local failures. If a desired node in the build
graph is impossible to build, abuild just treats it as if that item
failed, which usually means that the items that depend on it won't
build, but the rest of the build can continue. It is not possible, even
in theory, for the build graph to contain cycles if the main graph
doesn't. Both graphs are directed acyclic graphs, just as in gradle.
There are two more aspects of this worth mentioning. One is that abuild
creates a separate output directory for each build item/platform pair.
This is just a subdirectory of that item's source directory. (For C++
builds, you typically just have sources at the top level and object code
either with the sources or down one level. You don't have something
like source/main/C++...if you did that, people would know you were a
Java build system trying to build C++ code, though maybe there's a case
to be made for that type of layout in a multi-language system.)
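So a build item's directory might end up looking something like this
after building for two platforms (the output directory names here are
illustrative, not abuild's exact naming scheme):

    item/
        Abuild.conf
        sourcefile.cc
        abuild-linux.gcc.debug/
            sourcefile.o
        abuild-linux.gcc.release/
            sourcefile.o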
In any case, one important aspect of the way abuild handles
platform-specific output directories is that it actually does the build
inside that directory (with the current directory set there). In other
words, rather than invoking gcc from the source directory with gcc -c
sourcefile.cc -o output-directory/sourcefile.o, it runs gcc from the
output directory with gcc -c ../sourcefile.cc -o sourcefile.o. This is
really important
because gcc (and most other compilers too) likes to create temporary
files in its current directory. Typically a C++ compiler may first
compile the code into assembly language and then call the assembler.
Abuild can always build item/platform pairs in parallel based on the
dependency graph, and since A on platform 1 will basically never depend
on A on platform 2, those two variants can almost always be compiled in
parallel. Running two compiles of the same source files in the same
directory at the same time will almost always cause trouble when there
are collisions in temporary files. Such problems will be very hard to
discover and will not be easily reproducible. By running the compilers
with their current directories set to the output directory, abuild
avoids this problem entirely. If you want parallel builds to work, I
strongly recommend going with this approach. Note that running multiple
invocations of the compiler in the same directory to compile different
source files is safe and common. Building the same file two different
ways at the same time in the same place is what's not safe. (So you can
run gcc -c a.c and gcc -c b.c together but not gcc -c -g a.c and
gcc -c -O2 a.c.)
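With abuild's approach, those two compiles of a.c become safe to run in
parallel because each one runs with its current directory set to its own
output directory. Schematically (directory names illustrative):

    # two variants of the same source file compiling concurrently,
    # each isolated in its own output directory
    (cd abuild-gcc-debug   && gcc -c -g  ../a.c -o a.o) &
    (cd abuild-gcc-release && gcc -c -O2 ../a.c -o a.o) &
    wait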
Finally, there's the issue of native vs. cross compiler builds. This is
an area where abuild is pretty good but not perfect. The most important
reason to distinguish between native compiler and cross compiler builds
is that, generally speaking, you can't run the product of a cross
compiled build on the build platform. This is important for at least
two major reasons: autoconf tests are restricted to compile tests rather
than run tests (so, for example, using autoconf to determine byte order
at build time doesn't work), and also that you can't generally run unit
tests. There are certain things abuild won't try to do when you are
doing a cross-compiled build, and there are also differences in how it
invokes autoconf, which needs to be told when you're doing a
cross-compiled build.
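With autoconf, that conventionally means passing a --host triple that
differs from --build; the triple here is just illustrative:

    ./configure --build=x86_64-pc-linux-gnu --host=powerpc-wrs-vxworks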
There are a few important cases where this becomes overly restrictive.
One case is where you have an emulator or test fixture that enables you
to treat cross-compiled code as native for purposes of running tests,
and the other is the 32-bit vs. 64-bit case.
It is pretty normal for both 32-bit systems and 64-bit systems to be
able to build both 32-bit and 64-bit code, but while 64-bit systems can
run 32-bit code, 32-bit systems can't run 64-bit code. Abuild uses
different platforms for 32-bit and 64-bit compiles, since 32-bit and
64-bit code are not link-compatible. It does not, however, have the
ability to build both 32-bit and 64-bit code natively in a single
invocation. You have to tell abuild which one is native for a given
build. If I were going to make an abuild 2.0, this would be one of the
things I would fix. In practice, however, if you just worry about
building 64-bit code on a 64-bit system and building 32-bit code on a
32-bit system, then you don't have to think about this issue at all.
One could imagine a general case where you have a mixture of native and
embedded code and where all your build systems have all the native and
cross compilers installed so you could do the build anywhere but where
only some systems have the required toolchains to kick off tests on the
embedded platforms or to run embedded code in an emulator for unit
testing. This is not likely to happen in a small, open source project,
but it will be common in an enterprise environment. To my knowledge,
there are no build systems that handle these cases as easily as abuild,
even though abuild is far from perfect. Maybe someone reading this will
know of something different, but in my experience, people who have these
kinds of complex requirements end up just coding a bunch of ad hoc build
scripts or makefiles that are hard to maintain. Abuild's platform
system enables this kind of stuff to be handled almost transparently,
and it would be great to see gradle able to do this eventually as well.
For example, at my last job, we had a project that was switching its
vxworks build from a powerpc platform based on one version of vxworks to
an Intel platform based on another version of vxworks, and they were
dreading updating over 200 makefiles to make the switch and changing all
their build scripts to run on intel instead of ppc. I spent about 30
hours converting their existing build over to abuild and proving that
the results were the same. Then it took me about an hour to write the
toolchain configuration files for the new board support packages, at
which point their build "just worked" on the new platforms while
continuing to work on the old platforms at the same time. The only
thing I had to do was add new vxworks-bsp* platform types for the new
boards and create pass-through items for the board-specific libraries.
In my next post, I will describe what parameters abuild requires for
specifying a compiler toolchain. That post will be much simpler than this one.
--Jay