Can you elaborate on your intermediate form? I don't understand how git submodules prohibit or restrict the evolution of the submodules themselves. The only difference I see is that the submodule approach requires an extra commit in the umbrella to update the recorded submodule versions (followed by a pull and a `git submodule update` in other clones), whereas the Makefile approach only requires a `make update` in the umbrella clones. Is there something else I'm missing?
On Tue, Aug 13, 2013 at 2:22 PM, Tony Garnock-Jones <to...@ccs.neu.edu> wrote:
> Hi all,
>
> Matthias asked me to write a few words about an experience I had splitting
> a large repository of code up into smaller repositories and then building a
> mechanism to tie them together again.
>
> == A short story ==
>
> Once upon a time, RabbitMQ (www.rabbitmq.com) was held in a single,
> monolithic Mercurial repository, including the server, the Java client
> library, the .NET client library, the Erlang client library, the protocol
> codec compiler, the documentation, adapters for other related messaging
> protocols, and so on.
>
> We decided for various reasons to split the monolithic repository into
> separate repositories. The approach we ended up taking was to have a single
> repository, the "umbrella", which included a Makefile and a handful of
> scripts which checked out, updated, compiled etc. a number of other
> repositories from various places. You can still see the umbrella today
> here:
> http://hg.rabbitmq.com/rabbitmq-public-umbrella/file/default
>
> The workflow for someone working on RabbitMQ is now:
>
> 1. Check out the umbrella, and `cd` into it.
> 2. Run `make checkout`.
> 3. Run `make`.
> 4. Edit, compile, debug, commit and push in the subdirectories resulting
>    from step 2.
> 5. Occasionally run `make update` in the umbrella.
>
> (There's also some ugly makefile machinery to do cross-subrepository
> dependency tracking to let `make` in a subrepo recompile just the right
> things. Mostly.)
>
> Personally, I frequently use a script, `foreachrepo` (git variant
> attached) that lets me operate on all repositories found under the umbrella
> at once. For example,
>
>   $ foreachrepo pwd
>
> would tell me where all the checkouts live, and
>
>   $ foreachrepo git status
>
> would show me their status.
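A minimal sketch of what a git variant of `foreachrepo` might look like, assuming the checkouts sit directly under the umbrella directory. This is not the attached script, only a guess at its behaviour; the `FOREACHREPO_ROOT` variable is a hypothetical knob added for illustration.

```shell
# foreachrepo (hypothetical sketch): run a command in every git checkout
# found directly under a root directory.
# FOREACHREPO_ROOT is an assumed setting; it defaults to the current directory.
foreachrepo() {
    root="${FOREACHREPO_ROOT:-.}"
    for gitdir in "$root"/*/.git; do
        # Skip the unexpanded glob pattern when no checkouts are present.
        [ -e "$gitdir" ] || continue
        repo="${gitdir%/.git}"
        echo "== $repo"
        # Use a subshell so the caller's working directory is left untouched.
        ( cd "$repo" && "$@" ) || echo "   (command failed in $repo)" >&2
    done
}
```

Sourced into a shell in the umbrella directory, `foreachrepo pwd` and `foreachrepo git status` would then behave roughly as described above. A failure in one checkout is reported on stderr and the loop carries on with the next.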
>
> When a configuration is found that works nicely and is to be released, a
> tag is made across all the currently-checked-out repositories:
>
>   $ foreachrepo git tag my_release_2.3.4
>   $ foreachrepo git push --tags
>
> The split into completely separate repositories, linked informally by
> action of a script, worked out well for RabbitMQ, and the RabbitMQ project
> seems to be living happily ever after.
>
> == Comment ==
>
> The problem addressed here is *configuration management*. RabbitMQ takes a
> very loose approach to configuration management, where individual modules
> evolve independently and are only connected to each other by happening to
> be in sibling directories within the umbrella. Tags are used to take a
> snapshot of a group of repositories at the same time.
>
> Another approach to configuration management uses an explicitly
> *versioned* manifest, where an umbrella repository names other repositories
> *and specific versions* of their contents to pull into scope. This is taken
> by systems like rebar, and is essentially how git submodules work.
>
> You could frame the contrast between the two by saying that the RabbitMQ
> approach is essentially *optimistic*, freezing configurations only when
> needed, and with occasional frankenconfigurations (when you `git pull` one
> subrepo but not one of its siblings) a risk during development, whereas the
> `git submodule` approach is *pessimistic*, keeping configurations frozen
> until explicitly moved forward into the next frozen configuration.
>
> An intermediate form could be imagined, where the Makefile checks out
> specific versions or branches but otherwise leaves them free to evolve in a
> way `git submodule` prohibits.
>
> Vincent has recently run into issues of configuration management: he
> wishes to assemble a specific collection of packages at specific versions
> to run a particular application (namely, some benchmarks).
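If I read the intermediate form correctly, it could be sketched roughly as below: a plain-text manifest names each checkout and the branch or tag to start from, and a helper clones and switches to that ref without recording anything in the umbrella's history. Everything here is an assumption made for illustration, not RabbitMQ's actual Makefile machinery: the manifest format, the `run` and `checkout_manifest` helpers, and the `DRY_RUN` variable are all hypothetical.

```shell
# Hypothetical manifest format, one checkout per line: <dir> <url> <ref>
# where <ref> is a branch or tag name. Blank lines and '#' comments are ignored.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Clone each listed repository if it is missing, then switch it to its ref.
# Afterwards the checkout is an ordinary clone: it can be pulled, committed
# to, and pushed without any commit in the umbrella.
checkout_manifest() {
    # Read the manifest on fd 3 so git cannot consume the loop's input.
    while read -r dir url ref <&3; do
        case "$dir" in ''|'#'*) continue ;; esac
        [ -d "$dir/.git" ] || run git clone "$url" "$dir"
        run git -C "$dir" checkout "$ref"
    done 3< "$1"
}
```

With a manifest line such as `server https://example.invalid/server.git stable` (a made-up URL), `checkout_manifest manifest` would clone `server` if absent and switch it to `stable`. Setting `DRY_RUN=1` only prints the git commands, which is a cheap way to inspect a configuration before touching any checkout.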
>
> Others on this list do similar things, assembling specific versions of
> libraries into complete applications.
>
> I think it's interesting that both releasing applications and releasing
> the Racket system itself have this problem of describing a collection of
> related packages.
>
> Cheers,
> Tony
>
> _________________________
> Racket Developers list:
> http://lists.racket-lang.org/dev