Thorsten Behrens wrote:

> Björn Michaelsen wrote:
>> > - regarding parallelization, that's surely fixable with much less
>> > effort in build.pl, no?
>> Currently we are starting one dmake process per directory, and that
>> dmake process parallelizes within the directory. Implementing a
>> recursive jobserver that communicates between dmake and build.pl would
>> not only be ugly, it would also be a major effort.
>>
> That's not what I meant. The reason to also hand over a specific
> dmake option for multi-processor builds is that sometimes only one
> or two directories have their actual dependencies satisfied & are
> thus buildable. So it would be ~trivial to only hand "-P16" over to
> build.pl, and let it distribute that across the active dmake
> instances.
build.pl uses module dependencies, not target dependencies, so it has an
inherent susceptibility to bottlenecks. Basically all of our C++ sources
could be built in parallel (as they don't depend on each other), but with
build.pl we always have to wait until header files are "delivered" or
created. And because the granularity of the dependencies is the "module"
(not a real target like e.g. a library), we can't use as much
parallelization as would be possible. This becomes even more painful if you
don't build the whole office, but only some parts of it, e.g. in a split
build or when you rebuild several "modules". A make system with full
dependencies and a single dependency tree that only lists real targets,
not virtual ones like "modules", does not have these defects.

>> > - what kind of dependency tracking is missing in the current system?
>> Those that bite you on compatible builds.
>>
> Ah. That's what I thought. So nothing inherently missing in
> dmake/build.pl, but "just" bugs in the makefiles.

No, not in the makefiles. The bugs are in the notorious "build.lst" files.
As you surely know, our makefiles don't have useful dependencies, as that
would require specifying not only prerequisites but also rules. If you
link a library against another one, there is nothing in our build system
that tells make how to create the latter in case it doesn't exist. Instead
you have to maintain yet another pseudo-makefile, the build.lst, that
tells build.pl to "visit" other modules beforehand, so that the case
"prerequisite does not exist but no rule is defined" never occurs.

This is very error prone, especially if the build.lst is changed later on,
and at least on Windows it is quite time consuming. I was bitten by that
quite often in my recent refactoring activities. The reason is that
mistakes can remain unnoticed for a long time: if you miss a module
dependency in a build.lst, the build might still work because the missing
module was built "by luck" - another module that is built earlier or in
parallel happens to depend on it. In a single-process make you would see
the mistake as soon as you did the first build in which the prerequisite
whose rule you forgot to specify is not present.

To overcome this, and to avoid having information stored in several
different places (we have at least n makefile.mk files, one build.lst, one
d.lst and one makefile.pmk per "module"), makefiles must be able to
include each other. Our current system cannot do that because all
makefiles use the same target names. This is the most remarkable
difference in the makefiles of Björn's prototype.

> Sorry, but
> http://hg.services.openoffice.org/hg/cws/gnu_make/raw-file/2518db232510/sw/prj/target_lib_msword.mk
> then leaves a lot to be desired. There's still loads of redundancy -
> and actually, when compared to a dmake-style makefile without the
> carried-over cargo-cult contents - quite similar.

It seems that you still did not see the major difference that I have
described above. So please tell us about the "lot to be desired" - that's
exactly what this discussion is about, and we hope to get input and facts.
I also fail to see the redundancy. Being a prototype, it doesn't factor
out commonly used defines (like e.g. compiler flags) as much as possible,
but I think you know that. So where else do you see redundancies? Can you
do less in a makefile for a library than specifying the name of the
library, the libraries it links against and the object files it is
supposed to be built from? The long list of include paths is just a
consequence of our current structure (and in a final version it would
surely be generated by a macro that factors out the commonly used
patterns).
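Just to illustrate the point - the following is nothing but a rough sketch
with invented macro names and file lists, not the actual file from the
prototype - a declarative makefile for a library needs to state little more
than exactly those three things:

    # rough sketch only: the gb_* macro names and file lists here are
    # invented for illustration, not the prototype's actual ones
    $(eval $(call gb_Library_Library,msword))

    # the libraries we link against; these are real targets, so GNU make
    # knows how to build any of them that does not exist yet
    $(eval $(call gb_Library_add_linked_libs,msword,\
        sal \
        svt \
        sw \
    ))

    # the object files the library is built from
    $(eval $(call gb_Library_add_exception_objects,msword,\
        sw/source/filter/ww8/ww8par \
        sw/source/filter/ww8/ww8scan \
    ))

Everything else - compiler flags, include paths, the linking rule itself -
can live in shared makefiles that are included, which is exactly what
becomes possible once target names no longer clash between makefiles.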
>> see my other replies here and on the blog. As a sidenote: project files
>> for common IDEs only give you more trouble if they are one-way (and
>> they currently all are): they are just a minor simplification for
>> newcomers for a simple build without changing anything. But I leave it
>> to you to explain to release engineers why it is their job to translate
>> the changes made by a new dev in his IDE project back to the cmake
>> source. They will rightfully refuse that. Thus the newcomer will have
>> to fiddle with the makefiles again manually, winning nothing in the end
>> (other than additional confusion and probably some needless
>> communication on mailing lists).
>>
> Yes. And? It's still significantly lowering the barrier of entry.
> Loads of other projects do it (with the same limitation). And the
> need to add a new file to the project is usually not the first, not
> the second, but the nth thing you do when starting to hack ...

If you think that calling "make" is a barrier, overcoming that is easy:
many developers have configured their IDE to start the OOo build process
from it. But that's not what I would call an IDE integration - that would
require making OOo a "project" of the IDE. As Michael Stahl stated, we
have done some first tests, and the results were so disappointing that we
didn't continue. It seems that neither MS Visual Studio nor NetBeans is
able to handle even the single Writer project (Writer sources plus headers
from the "solver") with acceptable performance. We could do the same test
with Eclipse, but I doubt that any IDE is able to handle larger parts of
OOo, let alone the whole source code. That really would create the
"pathetic" performance and the "ridiculous memory usage" that you are
afraid of.

So an IDE can't be the solution for the build system. Of course it's
something that we should keep in mind (and we do) as an addition. Building
smaller, well-separated "modules" of OOo in an IDE is surely possible, and
especially being able to build extensions that way is one of my oldest
requirements for our SDK. But this is an addition to a reliable build
system for the whole project, nothing more. I don't see contradicting
requirements and solutions here, only complementary ones. If you think
that this is so important, we will be glad to work with you to get it
implemented.

> So stepping back a bit, I should probably mention (again) that I
> find the general idea really great - the make system is truly
> arcane & leaves a lot to desire. But despite your initial request for
> input, plans & thoughts, I cannot but have the impression you're
> already quite determined to follow the outlined route - sadly the
> build system has seen quite a few attempts for change; and survived
> all of them ~unmodified, thus some extra thought & consultation is
> surely advisable ...

No, really, nothing is nailed down yet. If you or anybody else knew a
better way and(!) offered help and cooperation, there would be nothing
holding us back from doing it differently. The more people get involved
and the broader the consensus, the better. So far we have only published
our most important requirements and a prototype for how they could be
fulfilled.

> From my humble point of view, what has usually worked best in OOo
> land is some iterative approach to change; which in this case & my
> opinion would mean cleaning up makefiles one by one, either using a
> declarative DSL directly that could later be mapped to gnu make or
> whatever tool we see fit, or - using a meta-language like automake,
> cmake, or something homegrown, *still* use dmake for the while, and
> then, after some critical mass has been attained, switch the make
> tool wholesale (and adapt the metalang-to-makefile generator).

That iterative approach is exactly what we want to follow. As Björn has
already tested, the GNU make builds fit nicely into the build.pl process,
so we can convert each module step by step. Of course the full benefit
will only materialize once the whole conversion is done.
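Just to sketch the kind of glue this needs (the variable and target names
below are invented; this is not Björn's actual bridge file): a converted
module keeps a tiny dmake makefile.mk in its prj directory that simply
delegates to GNU make, so build.pl can keep scheduling modules exactly as
it does today:

    # hypothetical prj/makefile.mk of a converted module - sketch only.
    # build.pl still calls dmake here, but dmake immediately hands the
    # whole module over to GNU make.
    PRJ = ..
    TARGET = prj

    .INCLUDE : settings.mk

    all:
        cd $(PRJ) && $(GNUMAKE) -j$(MAXPROCESS)

Unconverted modules stay on their dmake makefiles, so both systems can
coexist until the last module has been migrated.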
> Additionally, and since you mentioned the desire to have only one
> make instance - last time someone tried to have gnu make hold all of
> OOo's dependency tree in one process, that guy (Kai Backman) ended
> up with absolutely pathetic performance & ridiculous mem usage.

IIRC Kai didn't use GNU make but jam (which, according to Kay Ramme, is
not able to deliver what we need). And what was "ridiculous mem usage"
seven years ago is nowadays probably just a little more than average. The
scalability tests Björn has done for GNU make made us think that this
won't be our problem. Most of our dependencies are header dependencies;
the additional dependencies on other prerequisites are important (and - as
I wrote - are the PITA in our current system) but not that numerous. So
what we have seen in our prototype (where the header dependencies were
already pretty complete) didn't show "pathetic" performance.

> That's why he went for bjam ...

Can you explain whether bjam is able to fulfill the requirements Björn and
I have mentioned, and what else it can do better than GNU make? Or can you
at least explain why you might prefer it?

Regards,
Mathias

--
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "[email protected]". I use it for the OOo lists and
only rarely read other mails sent to it.
