Re: Smaller and Quicker Releases

Roland Weber Fri, 25 Jan 2008 23:44:52 -0800

Hello Thomas,

> About my idea for two week release cycles. Probably most of you think
> this is unrealistic or bad.


Unrealistic at Apache, yes.

> - Fully automated releases

Apache projects are also running integration builds like Gump [1]
and continuous integration engines like Continuum. [2] You can
download the packages generated by such builds, for example at [3]
for JMeter. And of course you can always grab the source of some
project from trunk and build it yourself. Such packages are called
nightlies or snapshots, but they are not and cannot be _releases_
at Apache.

[1] http://vmgump.apache.org/gump/public/
[2] http://vmbuild.apache.org/continuum/
[3] http://people.apache.org/builds/jakarta-jmeter/nightly/

An official Apache release MUST be reviewed and approved by the
responsible Project Management Committee (PMC). This is one of
the major differences between Apache and a simple OSS project
hoster. It is Apache policy, and projects that don't like this
policy must be hosted elsewhere. PMC review means that pre-release
packages are created and made accessible for the review. The PMC
members and other interested volunteers then access these packages
and check them for formal compliance with Apache requirements and
for technical correctness. Although unit tests and tools like RAT
help with some of these tasks, it is inherently a manual thing.
The review must be done by the _persons_ that are PMC members.
They can use tools, but they cannot delegate their responsibility
towards Apache to a piece of software. A vote is held on publishing
the release, and the votes need to be cast manually. Three days is
the typical voting period for release votes, so that is the minimum
delay for publishing a release.

The formal review includes checking that license headers in the
sources are correct (RAT does that), that the LICENSE and NOTICE
files are in the proper locations, and that their contents match
the contents of the archive. See [4] for a recent example where
that was not the case. There are attempts to automatically generate
NOTICE files, but the results are unsatisfactory and there has
been a huge discussion [5,6] about it on legal-discuss@ recently.

The technical review can also lead to unpleasant surprises. For
example, the release of HttpCore beta1 this month was delayed by
a week. When pre-release packages became available [7], the reviewers
tried them on different OSes with different Java versions and
different numbers of CPUs, and the testcases that always passed
on the developer machines started failing in some combinations.
We don't have the hardware, software licenses, or administrator
time that would be needed to run a fully automated test suite
in a variety of environments. I believe the same is true for
every other Apache project.

Most committers and PMC members here at Apache are volunteers who
spend their spare time on contributing, and many are active in
several projects. You can't expect the reviewers to go through
this routine every two weeks for a single (or every?) project.
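
To illustrate which part of the formal review can be scripted,
here is a minimal sketch in Python. It only checks that LICENSE
and NOTICE exist at the top of an archive. The function name and
the assumption that the archive unpacks into a single root
directory are mine, not part of any Apache tooling. Whether the
contents of those files match the archive still needs a person.

```python
import io
import zipfile

# File names taken from the text above; ".txt" variants are also accepted.
REQUIRED_FILES = ("LICENSE", "NOTICE")


def missing_legal_files(archive_bytes):
    """Return the required files that are absent from the archive's top level."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = [n for n in archive.namelist() if not n.endswith("/")]

    # Assumption: a release archive usually unpacks into one root
    # directory such as "demo-1.0/". If so, look inside that directory.
    roots = {n.split("/", 1)[0] for n in names if "/" in n}
    has_root_files = any("/" not in n for n in names)
    prefix = roots.pop() + "/" if len(roots) == 1 and not has_root_files else ""

    top_level = {
        n[len(prefix):]
        for n in names
        if n.startswith(prefix) and "/" not in n[len(prefix):]
    }
    return [
        f for f in REQUIRED_FILES
        if f not in top_level and f + ".txt" not in top_level
    ]


if __name__ == "__main__":
    # Build a small hypothetical archive in memory and check it.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo-1.0/LICENSE", "license text")
        archive.writestr("demo-1.0/src/Main.java", "class Main {}")
    print(missing_legal_files(buffer.getvalue()))
```

A check like this catches a forgotten file, but it cannot tell
whether the NOTICE file says the right things.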

That's for the review part. Preparing a release also takes some
time, as Jukka pointed out. Even if you generate release notes
from JIRA, somebody has to sift through the issues and make sure
they are tagged correctly so the generated release notes match
the package that is being built. Then there should be a paragraph
or two explaining the significant changes since the last release.
The actual publishing includes copying files to dist locations
(can be done by script), updating release numbers and download
links, putting a News item on the web page, and sending out an
announcement mail. That shouldn't take more than an hour, but an
hour of spare time every two weeks adds up to a lot of time that
could be spent otherwise.
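
As a rough illustration of the sifting step, here is a sketch in
Python. The issue records are plain dictionaries with hypothetical
keys ("key", "status", "fix_version"). This is not the JIRA API,
just a stand-in for whatever export you have. It separates the
issues that would appear in the generated notes from those that
need a manual look.

```python
def release_note_problems(issues, release):
    """Split issues into those listed for `release` and those needing review."""
    listed, suspicious = [], []
    for issue in issues:
        if issue["fix_version"] == release:
            if issue["status"] == "Resolved":
                listed.append(issue["key"])
            else:
                # Tagged for this release but not actually resolved.
                suspicious.append(issue["key"])
        elif issue["fix_version"] is None and issue["status"] == "Resolved":
            # Resolved but never tagged, so generated notes would omit it.
            suspicious.append(issue["key"])
    return listed, suspicious


if __name__ == "__main__":
    # Hypothetical sample data, not real issues.
    sample = [
        {"key": "DEMO-1", "status": "Resolved", "fix_version": "1.1"},
        {"key": "DEMO-2", "status": "Open", "fix_version": "1.1"},
        {"key": "DEMO-3", "status": "Resolved", "fix_version": None},
    ]
    print(release_note_problems(sample, "1.1"))
```

Even with such a helper, somebody still has to decide what to do
with each suspicious issue before the release is cut.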

[4] http://mail-archives.apache.org/mod_mbox/incubator-general/200801.mbox/[EMAIL PROTECTED]
[5] http://mail-archives.apache.org/mod_mbox/www-legal-discuss/200712.mbox/[EMAIL PROTECTED]
[6] http://mail-archives.apache.org/mod_mbox/www-legal-discuss/200801.mbox/[EMAIL PROTECTED]
[7] http://mail-archives.apache.org/mod_mbox/hc-dev/200801.mbox/[EMAIL PROTECTED]


> - No branches
> - Only one file without dependencies

What you are suggesting are techniques from an approach that
is called "Agile Programming". I've seen Agile Programming
at work and function well. You have provided some examples,
too. But the cases where I know it has worked have two things
in common, and your eBay example matches these as well:

1) The projects were at the top of the software stack.
2) The projects were staffed with paid developers.

So eBay rolls out a new release of their software every two
weeks. I bet you they don't update their OS, database,
web server or other lower layers of their software stack
with the same frequency. One golden rule of system administration
is to "Never change a running system". It has a mirror rule
which is "Never run a changing system". Keeping the system
environment stable allows them to test their application
in that environment well. This testing offsets the risk
introduced by the frequent code changes.
Jackrabbit and other OSS projects are not at the top of the
software stack. Users that apply the Agile Programming technique
themselves would be able to integrate frequent releases.
But users that have a more traditional release cycle of
their own software, or that are trying to keep their
software stack stable and minimize the risk introduced
by changes will simply not be able to deal with that.
Say that a bug is uncovered in an older release. The bug
is reported, maybe even with a patch. And you want to tell
them: "Duh, that release is 6 months old. We've had 12 new
releases since, upgrade to the latest one first and if the
bug is still in there, provide a new fix for that."?
I don't believe that most users would find this appealing.

You have suggested keeping default behavior stable to allow
for such updates. You would also need to keep all APIs
stable. That in itself is a problem even for standard APIs.
For example, I had to deal with a change in the Servlet API
(I think from 2.2 to 2.3) where the method signature was
the same, but the path in the HttpServletRequest was split
differently for default servlets in the newer API. To make
matters worse, IBM initially overlooked that change in the
specification and implemented it later in a fixlevel. So
with some fix version of the application server, my code
would suddenly start to produce wrong results. In the end,
I had to implement a configuration switch for the servlet
because on the newer API it was impossible to correctly
detect the case through the standard API.
Now imagine that not only the standardized API, but also
the Jackrabbit specific APIs would have to be kept stable,
along with the default behavior. That in itself is a
contradiction of the Agile Programming model. AP is all
about allowing frequent changes and managing the risk
through good test coverage. You can do that easily at
the top of a software stack, but not in the foundation.

Yes, another golden rule for OSS projects says "release early,
release often". That is necessary to attract a developer
community. But stable branches are created and maintained
for the users with a conservative approach to running their
systems, and I don't think an OSS project with the scope of
Jackrabbit can afford to leave those users behind.

Now to the second point, paid developers. AP requires
excellent test coverage. OSS requires volunteers. You will
find it very hard to attract volunteers if you require
them to provide full test coverage for everything they do,
even if they are just trying something out and know they
will have to change that implementation (and all testcases
for it) later on. Even worse if you want to suspend
development of the project until the test coverage is
over 90%.* Paid developers do what they get paid for,
volunteers will just go looking for a more interesting
project to spend their time on.

*) If you can test 100% of the functionality without
   testing 100% of the code, you've got dead code ;-)

The technical points you raised can be countered too.
No dependencies means that you have to package every
dependency into your file(s). That is OK if you are at the
top of the software stack and want just an EAR file to
deploy on an application server. But it sucks if you
are offering components for others to build their stack.
The users may be on different fixlevels of dependencies
that are used in their own code too. In some cases, it is
also impossible for Apache projects to bundle dependencies
because of licensing issues.[8]

The no-branches approach with "experimental" switches
in the code does not necessarily improve code quality.
But the major problem is the missing code stability
for conservative users pointed out above.
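
For readers who have not seen the technique, an "experimental"
switch might look roughly like the sketch below. It is in Python
for brevity, and every name in it (the EXPERIMENTAL_ prefix, the
new_index feature, the store function) is hypothetical and not
taken from Jackrabbit.

```python
import os


def experimental_enabled(name, environ=None):
    """Return True if the named experimental feature is switched on."""
    env = os.environ if environ is None else environ
    return env.get("EXPERIMENTAL_" + name.upper(), "false").lower() == "true"


def store(value, environ=None):
    """Pick the stable or the experimental code path at run time."""
    if experimental_enabled("new_index", environ):
        # Unfinished code path, shipped in the same release as the stable one.
        return "new:" + value
    return "stable:" + value


if __name__ == "__main__":
    print(store("node", {}))
    print(store("node", {"EXPERIMENTAL_NEW_INDEX": "true"}))
```

Note that both code paths ship in every release. A conservative
user who leaves the switch off still receives the experimental
code, and any change to the shared parts around it.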

[8] http://people.apache.org/~rubys/3party.html


At this point, I'd like to reiterate something that
Jukka wrote in this thread: Thanks for bringing this
radically different idea into the discussion!
Suggestions from a totally different point of view
are a good thing. Even if they are not adopted, they
require reconsideration of the status quo. In this
particular case, I believe that your suggestions
are not applicable to the nature and environment of
the project. That does _not_ mean there shouldn't
be more test coverage ;-)

cheers,
  Roland
