This doesn't appear to be objectionable to anyone. Does anyone object to having infra (or me) create these repos for the bigcouch/rcouch merge work? This won't affect master or releases until those merges finish.
On Tue, Jan 14, 2014 at 11:02 PM, Paul J Davis <[email protected]> wrote:
>
>
>> On Jan 14, 2014, at 8:37 PM, Benoit Chesneau <[email protected]> wrote:
>>
>> On Wed, Jan 15, 2014 at 12:22 AM, Paul Davis
>> <[email protected]> wrote:
>>
>>> I've recently been having discussions about how to handle the
>>> repository configuration for various bits of CouchDB post-merge. The
>>> work that Benoit has been doing on the rcouch merge branch has also
>>> touched on this topic.
>>>
>>> The background for those unfamiliar is that the standard operating
>>> procedure for Erlang is to have a single Erlang application per
>>> repository and then rely on rebar to fetch each dependency.
>>> Traditionally in CouchDB land we've always just included the source to
>>> all applications in a single monolithic repository and periodically
>>> reimported changes from upstream dependencies.
>>>
>>> Recently rcouch changed from the monolithic repository to use external
>>> repositories for some dependencies. Originally BigCouch used an
>>> even more federated scheme that had each Erlang application in an
>>> external repository (and the core couch Erlang application was in the
>>> root repository). When Bob Newson and I did the initial hacking on the
>>> BigCouch merge we pulled those external dependencies into the root
>>> repository, reverting back to the large monolithic approach.
>>>
>>> After trying to deal with the merge and contemplating how various
>>> Erlang release things might work it's become fairly apparent that the
>>> monolithic approach is a bit constrictive. For instance, part of
>>> rebar's versioning abilities lets you tag repositories to generate
>>> versions rather than manually updating versions in source files.
>>> Another thing I've found on other projects is that having each
>>> application in a separate repository requires developers to think in a
>>> bit more detail about the public internal interfaces used
>>> throughout the system.
>>> We've done some work to this end already with
>>> separating source directories but forcing commits to multiple
>>> repositories shoots up a big red flag that maybe there's a high level
>>> of coupling between two bits of code.
>>>
>>> Another benefit of having the multiple repository setup is that it's
>>> possible that this lends itself to being integrated with the proposed
>>> plugin system. It'd be fairly trivial to have a script that went and
>>> fetched plugins that aren't developed at Apache (as a ./configure time
>>> switch type of thing). Having a system like this would also allow us
>>> to have groups focused on particular bits of development not have to
>>> concern themselves with the unrelated parts of the system.
>>>
>>> Given all that, I'd like to propose that we move to having a
>>> repository for each application/dependency that we use to build
>>> CouchDB. Each repository would be hosted on ASF infra and mirrored to
>>> GitHub as expected. This means that we could have the root repository
>>> be a simple repo that contains packaging/release/build stuff that
>>> would enable lots of the ideas offered on configurable types of
>>> release generation. I've included an initial list of repositories at
>>> the end of this email. It's basically just the apps that have been
>>> split out in either rcouch or bigcouch plus a few other bits from
>>> CouchDB master.
>>>
>>> I would also point out that even though our main repo would need to
>>> fetch other dependencies from the internet to build the final output,
>>> we fully intend that our release tarballs would *not* have this
>>> requirement. I.e., when we go to cut a release, part of the process the
>>> RM would run would be to pull all of those dependencies before
>>> creating a tarball that would be wholly self-contained. Given an
>>> apache-couchdb-x.y.z.tar.gz release file, there won't be a requirement
>>> to have access to the ASF git repos.
>>>
>>> I'm not entirely sure how controversial this is for anyone. For the
>>> most part the reactions I remember hearing were more concerned about
>>> whether the infrastructure team would allow us to use this sort of
>>> configuration. I looked yesterday and asked and apparently it's
>>> something we can request but as always we'll want to verify again if
>>> we have consensus to move in this direction.
>>>
>>> Anyone have comments or flames? Right now I'm just interested in
>>> feeling out what sort of (lack of?) consensus there is on such a
>>> change. If there's general consensus I'd think we'd do a vote in a
>>> couple weeks and if that passes then start on down this road for the
>>> two merge projects and then it would become part of master once those
>>> land (as opposed to doing this to master and then attempting to merge
>>> rcouch/bigcouch onto that somehow).
>>>
>>>
>>> This is a quick pass at listing what extra repositories I'd have
>>> created. Some of these applications only exist in the bigcouch and/or
>>> rcouch branches so that's where the unfamiliar application names are
>>> from. I'd also point out that the documentation and fauxton things are
>>> just on a whim in that we could decouple that development from the
>>> Erlang development. I can see arguments for and against those. I'm much
>>> less concerned about that aspect than the Erlang parts that are directly
>>> affected by rebar/Erlang conventions.
>>>
>>> chttpd
>>> config
>>> couch
>>> couch_collate
>>> couch_dbupdates
>>> couch_httpd
>>> couch_index
>>> couch_mrview
>>> couch_plugins
>>> couch_replicator
>>> documentation
>>> ddoc_cache
>>> ets_lru
>>> fabric
>>> fauxton
>>> ibrowse
>>> jiffy
>>> mem3
>>> mochiweb
>>> oauth
>>> rebar
>>> rexi
>>> snappy
>>> twig
>>
>>
>> I also contemplated this and I am generally +1 on this. And definitely
>> +1 to mirror them on the apache git if possible. I have a couple of
>> comments though.
>>
>> Initially I also had everything separated in its own source repository. 1
>> year ago I merged back as one core repo the couchdb erlang applications and
>> put all the dependencies in the refuge repository or in the refuge CDN for
>> the spidermonkey and ICU sources.
>>
>> I merged back as one core repo the couchdb erlang applications because they
>> were a little too dependent on each other. Especially couch_httpd, couch_index
>> and couch_mrview. These applications are not yet sufficient by themselves.
>>
>> Imo if we split everything in their own apps, then we should make sure
>> that couch_httpd can be used without couch_index and couch_mrview (which
>> means that "all_docs" is available in couch_httpd). Also we should be able
>> to just launch couch without any of the above. And probably without the
>> need of an ini. The couch_query_server module thing is an interesting case.
>> bigcouch is also introducing `ddoc_cache` which I am not sure why it is
>> provided as a standalone app. Does it mean it can be replaced by another
>> application eventually? Why not have it simply in the couch application?
>> Does it need to be updated separately?
>>
>> Also all our base applications should also be namespaced correctly so
>> they will be strictly identified as erlang modules: "config" is too
>> generic, "ddoc_cache" too. Others are probably OK.
>>
>> There are probably other things that we could provide as apps:
>>
>> - couch_daemon,
>> - couch_js
>> - couch_external
>> - couch_stats
>> - couch_compaction_daemon
>> - couch_httpd_proxy
>>
>> Anyway again I'm +1 for this move, I really think it's a good idea.
>>
>> - benoit
>
> I agree on most of this. Roughly I see three general points.
>
> First, deciding on whether some things are external deps is definitely up for
> discussion. Whether couch_mrview is a different app/repo is not necessarily
> clear cut. Personally I think I over-engineered couch_index which blurs the
> lines a bit.
> If I could wave a wand I'd have just couch_mrview and it'd be
> separate. More importantly I think the separate repos make these things more
> apparent. The fact we're discussing this sort of architecture thing is
> suggestive that it's forcing us to think a bit harder.
>
> Second is the aspect of composability. For instance the mrview thing to me is
> obviously a different repo precisely so a user could import couch (_core?)
> directly without requiring the SpiderMonkey dependency. The monolithic repo
> doesn't allow this without some very non-standard tooling.
>
> Thirdly, app naming is always a point of contention. The config name was
> actually a hot code upgrade concern. We couldn't reuse couch_config directly
> at the time. And Adam was also hopeful we could turn it into a useful
> non-specific config app.
>
> Fourthly, and related to secondly, we'll also want to look at splitting other
> apps out as necessary. The ones you listed I think aren't controversial, it's
> just that no one has done it yet. My list was purely what existed so far
> without attempting to carve things up more. I definitely agree we should
> carve more; I just wanted to cover consensus that carving is the right
> direction.
>
> Fifthly, I'm done typing on my phone. I'll fill in more thoughts tomorrow.
>
