On 16 Jan 2014, at 20:42 , Paul Davis <[email protected]> wrote:
> It doesn't appear that this is objectionable to anyone. Does anyone
> have an objection to us having infra/me create these repos to use for
> the bigcouch/rcouch merge work? This won't affect master or releases
> until those merges finish.

no objections.

Jan
--

>
> On Tue, Jan 14, 2014 at 11:02 PM, Paul J Davis
> <[email protected]> wrote:
>>
>>
>>> On Jan 14, 2014, at 8:37 PM, Benoit Chesneau <[email protected]> wrote:
>>>
>>> On Wed, Jan 15, 2014 at 12:22 AM, Paul Davis
>>> <[email protected]> wrote:
>>>
>>>> I've recently been having discussions about how to handle the
>>>> repository configuration for various bits of CouchDB post-merge. The
>>>> work that Benoit has been doing on the rcouch merge branch has also
>>>> touched on this topic.
>>>>
>>>> The background for those unfamiliar is that the standard operating
>>>> procedure for Erlang is to have a single Erlang application per
>>>> repository and then rely on rebar to fetch each dependency.
>>>> Traditionally in CouchDB land we've always just included the source to
>>>> all applications in a single monolithic repository and periodically
>>>> reimported changes from upstream dependencies.
>>>>
>>>> Recently rcouch changed from the monolithic repository to use external
>>>> repositories for some dependencies. Originally BigCouch used an
>>>> even more federated scheme that had each Erlang application in an
>>>> external repository (and the core couch Erlang application was in the
>>>> root repository). When Bob Newson and I did the initial hacking on the
>>>> BigCouch merge we pulled those external dependencies into the root
>>>> repository, reverting back to the large monolithic approach.
>>>>
>>>> After trying to deal with the merge and contemplating how various
>>>> Erlang release things might work it's become fairly apparent that the
>>>> monolithic approach is a bit constrictive. For instance, part of
>>>> rebar's versioning abilities lets you tag repositories to generate
>>>> versions rather than manually updating versions in source files.
>>>> Another thing I've found on other projects is that having each
>>>> application in a separate repository requires developers to think in a
>>>> bit more detail about the public internal interfaces used throughout
>>>> the system. We've done some work to this extent already with
>>>> separating source directories but forcing commits to multiple
>>>> repositories shoots up a big red flag that maybe there's a high level
>>>> of coupling between two bits of code.
>>>>
>>>> Another benefit of having the multiple repository setup is that it's
>>>> possible that this lends itself to being integrated with the proposed
>>>> plugin system. It'd be fairly trivial to have a script that went and
>>>> fetched plugins that aren't developed at Apache (as a ./configure time
>>>> switch type of thing). Having a system like this would also allow us
>>>> to have groups focused on particular bits of development not have to
>>>> concern themselves with the unrelated parts of the system.
>>>>
>>>> Given all that, I'd like to propose that we move to having a
>>>> repository for each application/dependency that we use to build
>>>> CouchDB. Each repository would be hosted on ASF infra and mirrored to
>>>> GitHub as expected. This means that we could have the root repository
>>>> be a simple repo that contains packaging/release/build stuff that
>>>> would enable lots of the ideas offered on configurable types of
>>>> release generation. I've included an initial list of repositories at
>>>> the end of this email. It's basically just the apps that have been
>>>> split out in either rcouch or bigcouch plus a few other bits from
>>>> CouchDB master.
>>>>
>>>> I would also point out that even though our main repo would need to
>>>> fetch other dependencies from the internet to build the final output,
>>>> we fully intend that our release tarballs would *not* have this
>>>> requirement. I.e., when we go to cut a release, part of the process the
>>>> RM would run would be to pull all of those dependencies before
>>>> creating a tarball that would be wholly self-contained. Given an
>>>> apache-couchdb-x.y.z.tar.gz release file, there won't be a requirement
>>>> to have access to the ASF git repos.
>>>>
>>>> I'm not entirely sure how controversial this is for anyone. For the
>>>> most part the reactions I remember hearing were more concerned with
>>>> whether the infrastructure team would allow us to use this sort of
>>>> configuration. I looked yesterday and asked and apparently it's
>>>> something we can request but as always we'll want to verify again if
>>>> we have consensus to move in this direction.
>>>>
>>>> Anyone have comments or flames? Right now I'm just interested in
>>>> feeling out what sort of (lack of?) consensus there is on such a
>>>> change. If there's general consensus I'd think we'd do a vote in a
>>>> couple weeks and if that passes then start on down this road for the
>>>> two merge projects and then it would become part of master once those
>>>> land (as opposed to doing this to master and then attempting to merge
>>>> rcouch/bigcouch onto that somehow).
>>>>
>>>>
>>>> This is a quick pass at listing what extra repositories I'd have
>>>> created. Some of these applications only exist in the bigcouch and/or
>>>> rcouch branches so that's where the unfamiliar application names are
>>>> from. I'd also point out that the documentation and fauxton things are
>>>> just on a whim in that we could decouple that development from the
>>>> erlang development. I can see arguments for and against those. I'm much
>>>> less concerned about that aspect than the Erlang parts that are directly
>>>> affected by rebar/Erlang conventions.
>>>>
>>>> chttpd
>>>> config
>>>> couch
>>>> couch_collate
>>>> couch_dbupdates
>>>> couch_httpd
>>>> couch_index
>>>> couch_mrview
>>>> couch_plugins
>>>> couch_replicator
>>>> documentation
>>>> ddoc_cache
>>>> ets_lru
>>>> fabric
>>>> fauxton
>>>> ibrowse
>>>> jiffy
>>>> mem3
>>>> mochiweb
>>>> oauth
>>>> rebar
>>>> rexi
>>>> snappy
>>>> twig
>>>
>>>
>>> I also contemplated this and I am generally +1 on this. And definitely
>>> +1 to mirror them on the apache git if possible. I have a couple of
>>> comments though.
>>>
>>> Initially I also had everything separated in its own source repository. 1
>>> year ago I merged back as one core repo the couchdb erlang applications and
>>> put all the dependencies in the refuge repository or in the refuge CDN for
>>> the spidermonkey and ICU sources.
>>>
>>> I merged back as one core repo the couchdb erlang applications because they
>>> were a little too dependent on each other. Especially couch_httpd,
>>> couch_index and couch_mrview. These applications are not yet enough by
>>> themselves.
>>>
>>> Imo if we split everything in their own apps, then we should make sure
>>> that couch_httpd can be used without couch_index and couch_mrview (which
>>> means that "all_docs" is available in couch_httpd). Also we should be able
>>> to just launch couch without any of the above. And probably without the
>>> need of an ini. The couch_query_server module thing is an interesting case.
>>> bigcouch is also introducing `ddoc_cache` which I am not sure why it is
>>> provided as a standalone app. Does it mean it can be replaced by another
>>> application eventually? Why not have it simply in the couch application?
>>> Does it need to be updated separately?
>>>
>>> Also all our base applications should also be namespaced correctly so
>>> they will be strictly identified as erlang modules: "config" is too
>>> generic, "ddoc_cache" too. Others are probably OK.
>>>
>>> There are probably other things that we could provide as apps:
>>>
>>> - couch_daemon,
>>> - couch_js
>>> - couch_external
>>> - couch_stats
>>> - couch_compaction_daemon
>>> - couch_httpd_proxy
>>>
>>> Anyway again I'm +1 for this move, I really think it's a good idea.
>>>
>>> - benoit
>>
>> I agree on most of this. Roughly I see three general points.
>>
>> First, deciding on whether some things are external deps is definitely up
>> for discussion. Whether couch_mrview is a different app/repo is not
>> necessarily clear cut. Personally I think I over engineered couch_index
>> which blurs the lines a bit. If I could wave a wand I'd have just
>> couch_mrview and it'd be separate. More importantly I think the separate
>> repos makes these things more apparent. The fact we're discussing this sort
>> of architecture thing is suggestive that it's forcing us to think a bit
>> harder.
>>
>> Second is the aspect of composability. For instance the mrview thing to me
>> is obviously a different repo precisely so a user could import couch
>> (_core?) directly without requiring the spider monkey dependency. The
>> monolithic repo doesn't allow this without some very non-standard tooling.
>>
>> Thirdly, app naming is always a contention. The config name was actually a
>> hot code upgrade concern. We couldn't reuse couch_config directly at the
>> time. And Adam was also hopeful we could turn it into a useful non-specific
>> config app.
>>
>> Fourthly, and related to secondly, we'll also want to look at splitting
>> other apps out as necessary. The ones you listed I think aren't
>> controversial it's just that no one has done it yet. My list was purely what
>> existed so far without attempting to carve things up more. I definitely
>> agree we should carve more, I just wanted to cover consensus that carving is
>> the right direction.
>>
>> Fifthly, I'm done typing on my phone. I'll fill in more thoughts tomorrow.
>>
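
For readers unfamiliar with the "one application per repository, fetched by rebar" layout discussed in the thread, here is a rough sketch of what the root repository's dependency list might look like. It is only an illustration, not anything proposed in the thread: the base URL, the tag values and the helper name render_rebar_deps are all made up, and the output follows the rebar 2 style of `deps` entry ({App, VsnRegex, {git, Url, {tag, Tag}}}).

```python
from typing import Mapping

# Hypothetical URL pattern: the thread does not name the real repository URLs.
BASE_URL = "https://git.example.org/couchdb-{app}.git"


def render_rebar_deps(deps: Mapping[str, str], base_url: str = BASE_URL) -> str:
    """Render a rebar 2-style `deps` term with one git repository per app.

    `deps` maps an application name to the git tag that should be fetched.
    Application names are assumed to be valid unquoted Erlang atoms.
    """
    entries = []
    for app, tag in sorted(deps.items()):
        url = base_url.format(app=app)
        # Doubled braces emit literal Erlang tuple braces from the f-string.
        entries.append(f'    {{{app}, ".*", {{git, "{url}", {{tag, "{tag}"}}}}}}')
    return "{deps, [\n" + ",\n".join(entries) + "\n]}.\n"


if __name__ == "__main__":
    # Illustrative tags only; real versions would come from repository tags.
    print(render_rebar_deps({"fabric": "1.0.0", "mem3": "1.0.0", "rexi": "1.0.0"}))
```

Pinning each application to a tag is what would let a release manager pull exact versions of every dependency before building a self-contained tarball, as described in the proposal.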
