Hi Bill,
it took longer than expected, but here we are. I took another look at
the CMake build prototype that my colleague Martin Hollmichel provided
and gave it another whirl. The results didn't change much.
In short:
Of course CMake is able to build OOo, and it is also better than our
current build system. But the benefit of using CMake does not appear to
be big enough to justify the switch. There are also still some
disadvantages in CMake because it is a recursive system, and some open
questions. I will try to explain all of that.
It's possible that my conclusions rest on my own misunderstanding of
the CMake documentation that I found on the net and of what I picked up
while lurking on the CMake mailing list for some weeks. So perhaps you
can add some new points to this evaluation.
Let me start with the last point in your mail as it touches a central
point and perhaps we can get somewhere by shifting the discussion to the
essentials:
> Another aspect of CMake that you do not address in your evaluation is
> the fact that with CMake you will be "outsourcing" the maintenance of
> the build system to the CMake developers.
Sure, that goes without saying. But we also don't plan to take over
the maintenance of GNU Make ;-).
It's correct that we don't use "raw" GNU Make; we will use some
macros that provide a higher abstraction layer to avoid some common
mistakes that we found in our current build system. These macros are
what makes this system attractive for us. They provide some important
high-level functionality that neither GNU Make nor CMake has out of
the box. So we would have to implement the same macros in CMake's macro
language if we wanted to have the same features. It's unclear to me
whether we can do all of them in CMake, but in the end that probably
doesn't matter much: even if we could, I don't see the benefit of
using CMake instead of GNU Make, as we would have to do the same amount
of work. But I do see a potential disadvantage: we would get an
additional layer between our makefile code and the build system that
actually does the work.
The makefiles the developers would have to write for the GNU Make based
build system would be comparable to the CMake makefiles I have seen in
Martin Hollmichel's build, so nothing tips the balance in one direction
or the other.
I still think that CMake probably does not provide all the features of
GNU Make that we use in our macros. But perhaps it doesn't make sense to
discuss single features without knowing the complete story. I assume
that one can do *nearly* everything we planned to do with CMake as well,
or can at least find a replacement that works quite similarly. But if
one tries to use all of these things in combination, the result might
not be the same.
So my idea of mentioning some individual pluses and minuses in my
summary was obviously a mistake, as it could easily be misunderstood and
shifted the discussion from the general level to some minor points,
even if in total they add to the final result. Without the necessary
context of how we are doing things now and how we want to do them in
future, the discussion digressed.
So let's focus on the big issues; that should be enough. These are the
things that made me write "nearly everything" in the sentence above.
The first big issue that I have is about dependencies.
Maybe it's me who didn't find the trick, but a forgotten
add_dependencies statement gave me a build that didn't break before it
started (as it would in our projected build system using the abstracting
macros), but somewhere later. And it didn't break at all if, by
accident, the same dependency in another makefile (totally unrelated to
the "broken" one) made the build work. This is exactly the unstable
behavior we want to get rid of. It will happen less often than in our
current build system, but it is still possible.
If a build system knows all dependencies, it can use a correct build
order. If it doesn't, it needs help from the developer that now takes
over some of the duties that should belong to the build system. In a
recursive build system that means telling the build system: "you need to
build Y before you can build X. I don't tell you how to build Y, instead
of that I tell you to call make for makefile M, this will deliver Y and
you can proceed with building X. Believe me." This makes building Y a
side effect from the POV of the build system, and from past experience
we dislike that. A build system can do better if provided with the full
information. That's where recursive and non-recursive build systems differ.
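A minimal sketch of that difference, written in Python rather than in our makefile code and not taken from either build system; plan_build and the target names are made up for illustration. Given the full graph, a build tool can reject an undeclared prerequisite before running anything and can derive the build order by itself:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan_build(rules):
    """Return a valid build order for `rules` (hypothetical helper).

    `rules` maps each target to the list of targets it needs.
    Every prerequisite must have its own entry; otherwise the plan is
    rejected before a single build step would run.
    """
    unknown = sorted({dep for deps in rules.values() for dep in deps
                      if dep not in rules})
    if unknown:
        raise ValueError("no rule for: " + ", ".join(unknown))
    # With the full graph known, the order follows from the graph alone.
    return list(TopologicalSorter(rules).static_order())


# Made-up targets: X needs Y, and both are declared.
print(plan_build({"X": ["Y"], "Y": []}))

# Y is used, but nobody told the build system how to make it.
try:
    plan_build({"X": ["Y"]})
except ValueError as err:
    print("rejected up front:", err)
```

The point of the sketch is only the timing of the failure: the second call fails while planning, not somewhere in the middle of a build.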
Becoming immune to side effects is important for us, and this
susceptibility in our current system was the main reason to consider a
new build system at all. Without writing the same kind of macros that we
have written for GNU Make we wouldn't get that in CMake, as my test has
shown. So, as I wrote already, in the end there's no implementation or
maintenance benefit from using CMake.
The second issue I have is that CMake still is a recursive system.
Though it is better in resolving dependencies than "classical" recursive
systems, it still has their other inherent disadvantages. Just to
name the two most obvious ones: it does not scale with the number of
processes as well as a non-recursive system does (especially in the case
of partial builds), and it needs to traverse all source directories just
to find out that nothing needs to be built. From our current build system
we know that this is a PITA at least on Windows, where it takes between
7 and 20 minutes on standard hardware (quad core CPU), depending on how
"hot" the disk cache is. From our GNU Make based prototype we estimate
that a non-recursive system for OOo, which does not traverse the file
system but includes makefiles instead, is ~5 times faster on Windows.
That's a lot.
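The scaling argument can be illustrated with a small Python sketch. It models a "classical" recursive build that finishes one module before entering the next; it is not a model of how CMake actually schedules its work, and the module and target names are invented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def waves(graph):
    """Group targets into batches whose members could build in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Invented example: two modules, b.lib links against a.lib.
module_a = {"a.lib": ["a.o1", "a.o2"]}
module_b = {"b.lib": ["b.o1", "b.o2"]}

# Non-recursive: one graph over all modules, cross-module edge included.
whole_tree = {"a.lib": ["a.o1", "a.o2"],
              "b.lib": ["b.o1", "b.o2", "a.lib"]}
flat = waves(whole_tree)

# Recursive: module a is finished before module b is even entered.
recursive = waves(module_a) + waves(module_b)

# Number of batches and widest batch for each variant.
print(len(flat), max(len(batch) for batch in flat))
print(len(recursive), max(len(batch) for batch in recursive))
```

In the single graph the object files of both modules are ready at the same time, so four processes can be kept busy at once; in the module-by-module variant at most two can, and the build needs one more step.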
From lurking on the mailing list and reading the documentation on your
web site I concluded that CMake can be used recursively only. I would be
glad if you could prove me wrong and show us that we can use CMake in a
non-recursive way too without losing something else, and that we won't
run into scalability problems with our huge project. An example of what
such a problem could look like: we ran some scalability tests for GNU
Make before we started using it and discovered that it doesn't scale
well above ~25000 rules or so, but that could be fixed easily by using
pattern rules as much as possible.
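For readers unfamiliar with pattern rules: a rule such as "%.o : %.c" describes a whole family of targets, so the number of rules stays small no matter how many files match. A rough Python sketch of the matching idea follows; it is simplified and is not GNU Make's actual implementation, and expand_pattern and the file names are made up.

```python
def expand_pattern(target_pattern, prereq_pattern, target):
    """Return the prerequisite for `target`, or None if the rule does not apply.

    Both patterns contain exactly one '%', which stands for the stem.
    """
    prefix, suffix = target_pattern.split("%", 1)
    stem_end = len(target) - len(suffix)
    if (not target.startswith(prefix) or not target.endswith(suffix)
            or stem_end <= len(prefix)):
        return None  # no match, or the stem would be empty
    stem = target[len(prefix):stem_end]
    return prereq_pattern.replace("%", stem, 1)


# One rule covers every object file.
for obj in ["writer.o", "calc.o", "impress.o"]:
    print(obj, "<-", expand_pattern("%.o", "%.c", obj))
```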
>> The biggest problem of a switch to CMake besides the ones already
>> mentioned is that it does not offer a migration path for us.
>> Switching our build system to CMake would require a one shot
>> conversion, including parallel maintenance of makefiles for the whole
>> duration of the switch (that is estimated to last for several
>> months).
> There are ways to do this. In particular, a recent addition to CMake
> called external projects might be useful.
Our current build system is a huge Perl program that builds "modules"
(sub projects) by calling dmake processes. It evaluates the dependencies
between the modules and does a lot more. It's comparable to what CMake
does, though CMake does it better.
In our planned non-recursive build system most of the duties of the Perl
program would simply go away, and its replacement would be a quite
simple makefile that just includes all other makefiles. Each of them can
be converted from dmake to GNU Make step by step.
Here's what it would look like (the example covers 8 modules):
GBUILDDIR := $(SOLARENV)/gbuild
include $(GBUILDDIR)/gbuild.mk
include $(foreach module,\
	framework \
	sfx2 \
	svl \
	svtools \
	xmloff \
	sw \
	toolkit \
	tools \
	,$(SRCDIR)/$(module)/prj/target_module_$(module).mk)
all : $(foreach module,$(gb_Module_ALLMODULES),$(call gb_Module_get_target,$(module)))
	$(call gb_Helper_announce,Completed all modules.)
clean : $(foreach module,$(gb_Module_ALLMODULES),$(call gb_Module_get_clean_target,$(module)))
	$(call gb_Helper_announce,all modules cleaned.)
.DEFAULT_GOAL := all
(Don't look too closely at the somewhat long names; it's still an
unpolished prototype.)
The included makefiles can be plugged into this makefile or called from
the Perl program without changing a single line of code; only a
one-line wrapper is needed to call them from the Perl program. And
each of these makefiles can be executed standalone.
Migrating to CMake would require first converting the Perl program to a
CMake process and then adding each module as an external project,
converting them to CMake later on step by step. This way we would have
to reimplement a lot of functionality that in our projected
non-recursive system just wouldn't be needed.
So all in all, a CMake conversion would mean more migration work.
I no longer think, though, that it would have to be a one shot
conversion as I wrote in my first mail. I got that impression because I
had only thought about doing it the other way around (calling CMake from
the Perl program until all modules are converted).
Until now the balance says to me: no difference wrt. maintenance and
user makefiles, more work to do for a conversion to CMake, and some
disadvantages due to its recursive nature. Now to some open questions.
> Another topic that is mentioned is the fact that builds have to be run
> from the top. That is not entirely true. You can have subprojects that
> are built on their own. Each add_subdirectory can point to a complete
> project. If you cd into the sub directory in the build tree, you can
> just run make, and it will only build the targets associated with that
> sub-project. Also, you could run cmake on the sub-project by itself if
> it is written to work as an independent sub-project.
I never wanted to say that with CMake you always have to run from the
top. But it seemed to me that by organizing the build in a way that
allows more flexibility I lose other things.
Sure, we can create a CMake makefile for each module we want to be able
to build individually and make it its own CMake project. But of course we
still want to have the full build over all modules, and we need its
result in the same working directory as the build of the individual modules.
The project structure we are aiming at would look like this:
-ooo
--mod1
--mod2
...
--modn
-sun
--sun-mod1
--sun-mod2
...
--sun-modn
We have these two top level projects because we want to build OOo and
our commercial variant from the same source tree, as they share most of
the code.
We want to go to any sub module and build it independently from the
rest, if possible without any external dependencies on other modules. Of
course this would require that all other modules "below" had been built
before at some time. If you ask why building single modules inside the
whole tree is so important: remember the long time it takes on Windows
to traverse through modules where nothing has to be built.
We want to go into any of the two top modules (here named ooo and sun)
and do a complete build, with all dependencies evaluated correctly as we
would get it for a "normal" project without sub projects.
We want to share the working directory between the two top level
projects as we don't want to build the common parts twice, but each of
them must be buildable separately as outside of our Hamburg lab only one
top level project exists.
How would I have to set up a CMake build to support that build structure?
Regards,
Mathias
--
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.