On Tue, Jun 20, 2006 at 11:46:35AM -0400, Laszlo (Laca) Peter wrote:

<snip>

Thanks a lot for writing this up; I think it will serve as a valuable
guide for future contributors.  A few thoughts inline.

> FWIW, I do believe that this system works great for ON and other
> in-house developed projects where developers can update the Makefiles
> and the pkgmaps when they introduce a new header or library or source
> file.  However, for Open Source packages we have to do this for the
> community developers: lots of possible mistakes and in fairness, they
> know it better than us, so it's better to use their Makefiles.

In general, that's logical, but unfortunately they also make a lot of
assumptions that aren't valid for an integrated offering like
OpenSolaris.  They may use nonstandard (or differently-standardised)
directories, some of which may not be possible to change at build
time; they may deliver private headers that should not be delivered at
all; they may overwrite existing files in the shared proto area,
making package construction difficult.  Using a per-component proto
area is helpful in solving this last problem, but it has two major
drawbacks as well: it makes it effectively impossible to manage
intra-consolidation, inter-component flag days (because you can't
build dependents against their current dependencies), and it requires
additional work to deliver packages consisting of parts or all of
multiple components (such as SUNWsfwhea, unfortunate though that
package may be).  I'd be interested to understand how you would
address these shortcomings: the objective is to end up with a single
internally-consistent shared proto area with known contents that can
be packaged in arbitrary ways, without requiring root privileges
during the build.

> In my day job, I work on JDS and I maintain the JDS build system,
> so allow me to make a comparison.  (I'm obviously biased, but I'll
> try my best to be fair.)  Our builds are also based on community
> tarballs and source patches.  In JDS, we use the Makefile system
> supplied by the originating community, which usually means
> configure; make; make install.  The "make install" part
> is redirected (using the DESTDIR Makefile variable) to a per-package
> proto area[3].  This means that by default, any files that the community

We've often suggested that make install (when available) is the better
route for the Companion, and perhaps even for SFW, but only once its
entire effects are thoroughly understood.  For example, many makefiles
are broken and attempt (and succeed, if building as root) to install
files to /, ignoring DESTDIR or similar variables.  Other more subtle
brokenness is common as well.
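
To make the DESTDIR concern concrete, here is a minimal sketch in Python of one way to flag install targets that land outside the staging root.  It assumes you already have a list of the absolute paths that "make install" wrote (how that list is obtained, for example from an install log, is left open); the function name outside_destdir is hypothetical and not part of SFW, JDS, or any existing tool.

```python
import os


def outside_destdir(installed_paths, destdir):
    """Return the paths that are not located under destdir.

    installed_paths: absolute paths written by an install step
                     (assumed to be collected by some other means).
    destdir:         the staging root the install was redirected to.
    """
    root = os.path.normpath(destdir)
    escaped = []
    for path in installed_paths:
        full = os.path.normpath(path)
        # A path is inside the staging root only if it is the root
        # itself or begins with the root followed by a separator.
        if full != root and not full.startswith(root + os.sep):
            escaped.append(path)
    return escaped
```

A non-empty result would suggest that the component's makefiles ignore DESTDIR for at least some targets and need patching before make install can be trusted.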

> maintainers intended to install are installed, and to the correct
> (relative) locations.  Then we remove any files that we decided not to
> deliver, for example lib*.a, lib*.la.  This is the opposite of SFW's
> philosophy, where we decide which files we wish to deliver and deal with
> them one by one.  We are now ready to create packages.  Pkgmaps are
> created dynamically from glob lists.

And here's where we really come to the crux of the philosophical
differences between JDS and the rest of the Solaris organisation,
especially but not only ON.  The procedure you're describing is one
which asserts that "upstream is assumed to be correct" and if in
doubt, the effects of using the autotools-based build system are
assumed to be desirable.  That's perfect for compiling what the GPL so
eloquently terms "mere aggregations" of software, but it's dead wrong
for building a tightly integrated polished product.

As a concrete point, why should the default be to include all new
files built by a component's makefiles?  Shouldn't we at least know
what we're shipping to customers?  And if we know what we're shipping,
is it so much to ask that we explicitly specify it (in packaging
files, if not in install-sfw or equivalent) so that upon future
updates engineers are reminded that this is *their* product, and they
are expected to know what *they* are shipping to customers (to say
nothing of noticing the changes and perhaps making necessary changes
elsewhere in the system and/or alerting customers to them)?  Why
should change be so cavalier?  It's bad enough that huge portions of
Solaris - a Sun product, with the Sun brand affixed to it - go out the
door without anyone except (possibly) the authors having read the
code, but to dispense with even the rudiments of change control in the
name of expediency is unconscionable.

A perfect example of this is bug 6434055, "python 2.4 SSL support is
missing".  Had JDS been using the SFW approach to delivering software,
the missing ssl.so would have been immediately noticed (your build
would fail), and the defective diffs would have been readily apparent.
If the ON approach were in use, the bug would probably have been impossible
from the beginning, since you would have been maintaining makefiles
that you wrote yourself, forcing you to understand both the makefiles
themselves and at least the rough outline of the product you're
building.  But this bug highlights an even worse problem: how do we
know there aren't other pieces of functionality missing?  The simple
answer is that we don't, and can't.  Even construction of a
comprehensive after-the-fact test suite is made impossible by the
assumption that the software itself is a moving target, changing in
ways we don't take the time to understand.  If a test that passed with
the previous delivery fails with the new, do we write this off to
mistaken assumptions in the test, or do we assume the new code is
broken?  Without understanding the intended changes to the code, it's
simply not possible to know.
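
As a rough illustration of what an explicit file list buys, the following Python sketch compares a proto area against a manifest of expected files and reports both what is missing and what was installed without being listed.  The function name audit_proto_area and the example paths are hypothetical; this is not how install-sfw or the pkgmap tooling is actually implemented, only the general idea.

```python
import os


def audit_proto_area(proto_root, manifest):
    """Compare the files under proto_root with an explicit manifest.

    manifest: iterable of paths relative to proto_root, using "/".
    Returns (missing, unexpected), each a sorted list of relative
    paths.  Only entries that os.walk reports as files are counted;
    directories are ignored.
    """
    delivered = set()
    for dirpath, _dirnames, filenames in os.walk(proto_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, proto_root)
            delivered.add(rel.replace(os.sep, "/"))
    expected = set(manifest)
    missing = sorted(expected - delivered)      # listed but not built
    unexpected = sorted(delivered - expected)   # built but never listed
    return missing, unexpected
```

A build that treats a non-empty "missing" list as fatal would stop as soon as a listed module such as ssl.so failed to appear, and a non-empty "unexpected" list would force someone to decide deliberately whether a new file should ship.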

This all boils down to pride of ownership.  ON is so good in large
part because the people who write it deliver it under their name and
their brand, and are held individually accountable for it by their
peers.  That just doesn't happen with GNOME, because we're simply
accepting a block of code without considering its suitability for our
purposes, without understanding its technical characteristics, and
without taking any pride of ownership in it.  This isn't your fault;
it's the result of unfortunate business decisions, and it has little
or nothing to do with OpenSolaris (which will have the exact same
problems any time third-party code is incorporated without due
consideration for its correctness for *Open*Solaris).  But it's still
true.

> The above steps are basically the same for all packages and all versions,
> so updating to a newer version is often as simple as changing the
> version number in the spec file[4].  Once the packages are built, it's
> the same in both systems: we need to test the new packages.

But how do you test them?  You've already said that you just assume
whatever files the makefiles install are the right ones, less any you
already know we don't want.  How can you possibly test new
functionality, since you don't know what it is?  And if you do know
what it is, why is it so onerous a burden to list the files that
provide it?

> I realise we have long-standing traditions at Sun for doing things
> a certain way and it's difficult to change.  I'm just offering my
> experiences and recommend that the SFW and CCD builds be made more
> automatic and more convenient.  In its present form, SFW/CCD isn't
> encouraging or inviting contributions, because of the tedious work
> involved.[5]

Inviting contributions is not an end in itself but a means to an end.
The goal, ultimately, is to make the software delivered as OpenSolaris
(and, for Sun employees, as Solaris) the best it can possibly be for
the people who use it.  Inviting more people, with more knowledge and
skill and talent, to contribute to the software they use, and by doing
so to improve it, can be - and, I believe, is and will remain for
OpenSolaris - a huge win for quality.  But it is not itself the goal;
if a tradeoff exists between making work easier (so that more people
will want to do it) and ensuring that work is of acceptable quality,
quality must always take precedence.  I'm in no way suggesting that
the SFW strategy is ideal, or even that it's the right one; it's, as
you've noted, time-consuming, obscure, and in some cases even
error-prone.  But the JDS solution is worse - it represents a belief
that saving engineers time and effort is more important than
delivering a well-understood, high-quality product.  That philosophy I
can never accept, and for as long as I can type I will use every means
within my power to contain and extinguish it.

Instead, we need to look for ways to invite contributions to SFW (and
the Companion) within a framework that supports the delivery of
quality, well-understood software, and to invite contributions that
would improve that framework itself.  If the most inviting acceptable
framework we can
collectively devise is still unattractive to would-be contributing
engineers, so be it.  Quality is a constraint; participation is a
goal; participation in improvement of quality-oriented processes is
welcome.

-- 
Keith M Wesolowski              "Sir, we're surrounded!" 
Solaris Kernel Team             "Excellent; we can attack in any direction!" 
