I've recently been having discussions about how to handle the
repository configuration for various bits of CouchDB post-merge. The
work that Benoit has been doing on the rcouch merge branch has also
touched on this topic.

The background for those unfamiliar is that the standard operating
procedure for Erlang is to have a single Erlang application per
repository and then rely on rebar to fetch each dependency.
Traditionally in CouchDB land we've always just included the source to
all applications in a single monolithic repository and periodically
reimport changes from upstream dependencies.
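
For anyone who hasn't seen the rebar convention, here is a minimal sketch
of what a root rebar.config could look like. The URLs and tag names are
placeholders made up for illustration, not real locations; only the shape
of the deps entries matters.

```erlang
%% rebar.config (sketch; URLs and tags below are hypothetical)
{deps, [
    %% Each entry is {AppName, VersionRegex, Source}
    {config, ".*",
        {git, "https://git.example.org/couchdb-config.git", {tag, "1.0.0"}}},
    {mochiweb, ".*",
        {git, "https://git.example.org/couchdb-mochiweb.git", {branch, "master"}}}
]}.
```

Running `rebar get-deps` would then clone each entry into the deps
directory before compiling.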

Recently rcouch changed from the monolithic repository to use external
repositories for some dependencies. Originally BigCouch used an
even more federated scheme that had each Erlang application in an
external repository (and the core couch Erlang application was in the
root repository). When Bob Newson and I did the initial hacking on the
BigCouch merge we pulled those external dependencies into the root
repository, reverting to the large monolithic approach.

After trying to deal with the merge and contemplating how various
Erlang release things might work it's become fairly apparent that the
monolithic approach is a bit constrictive. For instance, part of
rebar's versioning abilities lets you tag repositories to generate
versions rather than manually updating versions in source files.
Another thing I've found on other projects is that having each
application in a separate repository requires developers to think in a
bit more detail about the public internal interfaces used throughout
the system. We've done some work to this end already by
separating source directories, but being forced to commit to multiple
repositories raises a big red flag that maybe there's a high level
of coupling between two bits of code.
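
As a rough sketch of the tag-based versioning mentioned above: rebar can
fill in an application's version from the repository when the .app.src
file uses `{vsn, git}`. The application name, description, and
application list here are only examples.

```erlang
%% src/config.app.src (sketch; description and app list are illustrative)
{application, config, [
    {description, "Configuration application"},
    %% rebar replaces this atom with a version derived from the git
    %% repository (its tags) instead of a hand-edited string
    {vsn, git},
    {registered, []},
    {applications, [kernel, stdlib]}
]}.
```

With this in place, tagging the repository is what bumps the version; no
source edit is needed.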

Another benefit of the multiple repository setup is that it
could lend itself to being integrated with the proposed
plugin system. It'd be fairly trivial to have a script that went and
fetched plugins that aren't developed at Apache (as a ./configure time
switch type of thing). Having a system like this would also allow us
to have groups focused on particular bits of development not have to
concern themselves with the unrelated parts of the system.

Given all that, I'd like to propose that we move to having a
repository for each application/dependency that we use to build
CouchDB. Each repository would be hosted on ASF infra and mirrored to
GitHub as expected. This means that we could have the root repository
be a simple repo that contains packaging/release/build stuff that
would enable lots of the ideas offered on configurable types of
release generation. I've included an initial list of repositories at
the end of this email. It's basically just the apps that have been
split out in either rcouch or bigcouch plus a few other bits from
CouchDB master.

I would also point out that even though our main repo would need to
fetch other dependencies from the internet to build the final output,
we fully intend that our release tarballs would *not* have this
requirement. I.e., when we go to cut a release, part of the process the
RM runs would be to pull all of those dependencies before
creating a tarball that is wholly self-contained. Given an
apache-couchdb-x.y.z.tar.gz release file, there won't be a requirement
to have access to the ASF git repos.

I'm not entirely sure how controversial this is for anyone. For the
most part the reactions I remember hearing were more concerned with
whether the infrastructure team would allow us to use this sort of
configuration. I looked yesterday and asked, and apparently it's
something we can request but as always we'll want to verify again if
we have consensus to move in this direction.

Anyone have comments or flames? Right now I'm just interested in
feeling out what sort of (lack of?) consensus there is on such a
change. If there's general consensus I'd think we'd do a vote in a
couple weeks and if that passes then start on down this road for the
two merge projects and then it would become part of master once those
land (as opposed to doing this to master and then attempting to merge
rcouch/bigcouch onto that somehow).


This is a quick pass at listing what extra repositories I'd have
created. Some of these applications only exist in the bigcouch and/or
rcouch branches so that's where the unfamiliar application names are
from. I'd also point out that the documentation and fauxton things are
just on a whim in that we could decouple that development from the
Erlang development. I can see arguments for and against those. I'm much
less concerned about that aspect than the Erlang parts that are directly
affected by rebar/Erlang conventions.

    chttpd
    config
    couch
    couch_collate
    couch_dbupdates
    couch_httpd
    couch_index
    couch_mrview
    couch_plugins
    couch_replicator
    documentation
    ddoc_cache
    ets_lru
    fabric
    fauxton
    ibrowse
    jiffy
    mem3
    mochiweb
    oauth
    rebar
    rexi
    snappy
    twig
