The problems ESR is talking about were first noticed
and worked on 5 or so years ago in the context of
the GNU Arch project. Canonical managed to more or
less kill GNU Arch and so our work was interrupted but
we made some good progress before that.
Here, I would like to do three things:
1. Offer a brief caution as regards ESR.
2. Report on some of the technical insights we
developed in the GNU Arch project.
3. Discuss the current state of play in some
more recent work in areas like this that I
have been doing with some help from the FSF
and Dave Crossland.
That makes this a slightly long message so please
be patient.
* Regarding ESR
I think that it is fair and not too controversial
to point out that ESR has a long history of not being
exactly friendly towards, never mind supporting, the
movement for software freedom. While I wouldn't go
so far as to say he's been consistently, overtly hostile
to it, I think it reasonable to say that he has
repeatedly been hostile to it.
This does not mean that we should automatically reject
his interest in the issues he's mentioned. It does
suggest that we should be cautious in (a) engaging with
him and (b) allowing him to drive the "agenda" of the
autonomo.us community: is worrying about project hosting
per se really the best use of time right now, just
because ESR brought it up?
* Distributed, Decentralized Project Hosting and Aggregation
Things I learned, figured out, or realized while working
on GNU Arch. These are not Great Secret Truths of the
Universe Revealed Here for the First Time but perhaps it
will save folks some time to have them written down in one
place:
a) Mailing Lists? Consider netnews instead.
A mailing list requires a centralized list of
subscribers, a single authoritative archive,
and centralized moderation. Consider, instead,
using NNTP, not SMTP, and encouraging third parties
to carry your group.
b) Bug tracker? Consider writing a new one, over version control.
We almost got this done before GNU Arch (all but) died.
Store bug tracker data in version controlled files and
use version control transactions to maintain database
integrity. Of course use a distributed and decentralized
version control system. Choose your data formats so that
peers who elect to do so can automerge their branches of
the bug database without need for manual conflict resolution.
Build indexing and UIs on top of those layers.
This is really part of a larger, more general purpose
design pattern that people don't use enough: use a DDVCS
(distributed, decentralized version control system) with
ancillary indexing and UI whenever you can, in preference
to, say, MySQL.
c) VC? Yes, of course use DDVCS.
Everyone knows that already. Encourage actual peering
rather than de facto centralization on services like GitHub.
d) Keep your eyes on the prize: distributed decentralized
aggregation.
Projects are projects and OS distributions are aggregates
of projects. A major malfunction of the software engineering
practices in the (so called) "open source" world is that
we generally fail to make the FORM of individual projects
suitable for the FUNCTION of low-cost aggregation into a
complete OS. That is why every major distribution has,
for every individual project, a "shadow maintainer" who works
on packaging that project. One role of the packager is to
force fit the project into a consistent build environment.
Another role of the packager is to link the project to
QA procedures, often including "closely held" (unreleased)
systems. We can do much better than that with better
(not harder, just smarter) standards and practices for such
things as build infrastructure and testing/QA. Instead
of a few leading distributions that we can count on our fingers
in decimal, using no particularly clever encoding techniques,
we should ideally have many thousands.
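To make item (b) a little more concrete, here is a minimal sketch
of a bug database whose data format automerges. It is not the design
we had in GNU Arch, just one way the idea could work. The assumptions
are mine: every change to a bug is an immutable event, each event is
named by a hash of its contents (think one small file per event in a
version controlled tree), and a branch is modeled as a plain dict
instead of real files. All function and field names are hypothetical.

```python
import hashlib
import json


def event_id(event):
    """Name an event by a hash of its contents (hypothetical scheme).

    Identical events get identical names on every peer, so two
    branches can never disagree about what a given name contains.
    """
    canonical = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def add_event(branch, bug, author, timestamp, kind, text):
    """Record one immutable event; existing events are never edited."""
    event = {"bug": bug, "author": author, "timestamp": timestamp,
             "kind": kind, "text": text}
    branch[event_id(event)] = event
    return branch


def merge(ours, theirs):
    """Merge two branches of the bug database by taking the union.

    Because events are only ever added and names are content hashes,
    the union cannot conflict and does not depend on merge order.
    """
    merged = dict(ours)
    merged.update(theirs)
    return merged


def bug_view(branch, bug):
    """Indexing/UI layer: derive the current view of a bug by replay."""
    events = sorted(
        (e for e in branch.values() if e["bug"] == bug),
        key=lambda e: (e["timestamp"], event_id(e)),
    )
    status, comments = "open", []
    for e in events:
        if e["kind"] == "status":
            status = e["text"]
        elif e["kind"] == "comment":
            comments.append(e["text"])
    return {"status": status, "comments": comments}
```

The point of the design choice is that the stored data is only the
set of events. Anything that could conflict (current status, comment
order) is computed by the indexing layer, so peers can pull from each
other in any order and end up with the same database.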
* The Bigger Picture
Five years ago I was pretty enthusiastic about what ESR is
talking about - that the way to *start* advancing software
freedom on the web was with project hosting. I went a little
beyond ESR to also think about distribution aggregation.
Of late, I think that the social value of the web lies mainly
in the inter-personal communication of familiars. You know,
that cat video your dad sends you or that latest political
rant your mom sends you or the baby pictures from your cousin?
That stuff is where the greatest social value has yet been
found.
All of those materials - those transcripts of exchanges - those
archives of correspondence .... we can treat that like software
projects and archives. The software architecture patterns
I described for software projects and distribution aggregates
apply here, as well. Those basic messages can be DDVCS'ed.
Systems can control their aggregation along lines of deliberate
peering. Then we've really got something -- software is just a
subset.
Where are "diff" and "patch" for Open Office files? Where
is an NNTP server my mom can run (she's not a programmer)?
Why is Open Office so poorly designed (what bad practices
led to this?) that "diff" and "patch" for its data files are
so hard to imagine? It ain't the problem domain, I'm sure...
Part of what irks me about ESR's rant is that he's
thinking small and selfishly, he's late to the game, and he
isn't relating it to the bigger picture.
-t
On Sat, 2009-10-10 at 01:21 -0500, Wes Felter wrote:
> http://esr.ibiblio.org/?p=1282
>
> "The worst problem with almost all current hosting sites is that
> they’re data jails. You can put data (the source code revision
> history, mailing list address lists, bug reports) into them, but
> getting a complete snapshot of that data back out often ranges from
> painful to impossible."
>
> http://esr.ibiblio.org/?p=1295
>
> "I conclude that the SourceForge/GForge/FusionForge architecture, as
> it is now, is an evolutionary dead end — overspecialized for
> webbiness. To tackle challenges like fixing the data-jail problem,
> scripting, and seamless project migration, one of these systems will
> need to be rebuilt from the inside out."
>
> Wes Felter
> _______________________________________________
> Discuss mailing list
> [email protected]
> http://lists.autonomo.us/mailman/listinfo/discuss