It doesn't appear that this is objectionable to anyone. Does anyone
have an objection to us having infra/me create these repos to use for
the bigcouch/rcouch merge work? This won't affect master or releases
until those merges finish.

On Tue, Jan 14, 2014 at 11:02 PM, Paul J Davis
<[email protected]> wrote:
>
>
>> On Jan 14, 2014, at 8:37 PM, Benoit Chesneau <[email protected]> wrote:
>>
>> On Wed, Jan 15, 2014 at 12:22 AM, Paul Davis 
>> <[email protected]> wrote:
>>
>>> I've recently been having discussions about how to handle the
>>> repository configuration for various bits of CouchDB post-merge. The
>>> work that Benoit has been doing on the rcouch merge branch has also
>>> touched on this topic.
>>>
>>> The background for those unfamiliar is that the standard operating
>>> procedure for Erlang is to have a single Erlang application per
>>> repository and then rely on rebar to fetch each dependency.
>>> Traditionally in CouchDB land we've always just included the source to
>>> all applications in a single monolithic repository and periodically
>>> reimport changes from upstream dependencies.
>>>
>>> Recently rcouch changed from the monolithic repository to use external
>>> repositories for some dependencies. Originally BigCouch used an
>>> even more federated scheme that had each Erlang application in an
>>> external repository (and the core couch Erlang application was in the
>>> root repository). When Bob Newson and I did the initial hacking on the
>>> BigCouch merge we pulled those external dependencies into the root
>>> repository, reverting back to the large monolithic approach.
>>>
>>> After trying to deal with the merge and contemplating how various
>>> Erlang release things might work it's become fairly apparent that the
>>> monolithic approach is a bit constrictive. For instance, part of
>>> rebar's versioning abilities lets you tag repositories to generate
>>> versions rather than manually updating versions in source files.
>>> Another thing I've found on other projects is that having each
>>> application in a separate repository requires developers to think in
>>> a bit more detail about the public internal interfaces used
>>> throughout the system. We've done some work to this extent already with
>>> separating source directories but forcing commits to multiple
>>> repositories shoots up a big red flag that maybe there's a high level
>>> of coupling between two bits of code.
>>>
>>> Another benefit of the multiple repository setup is that it could
>>> lend itself to being integrated with the proposed
>>> plugin system. It'd be fairly trivial to have a script that went and
>>> fetched plugins that aren't developed at Apache (as a ./configure time
>>> switch type of thing). Having a system like this would also allow
>>> groups focused on particular bits of development to not have to
>>> concern themselves with the unrelated parts of the system.
>>>
>>> Given all that, I'd like to propose that we move to having a
>>> repository for each application/dependency that we use to build
>>> CouchDB. Each repository would be hosted on ASF infra and mirrored to
>>> GitHub as expected. This means that we could have the root repository
>>> be a simple repo that contains packaging/release/build stuff that
>>> would enable lots of the ideas offered on configurable types of
>>> release generation. I've included an initial list of repositories at
>>> the end of this email. It's basically just the apps that have been
>>> split out in either rcouch or bigcouch plus a few other bits from
>>> CouchDB master.
>>>
>>> I would also point out that even though our main repo would need to
>>> fetch other dependencies from the internet to build the final output,
>>> we fully intend that our release tarballs would *not* have this
>>> requirement. I.e., when we go to cut a release, part of the process the
>>> RM runs would be to pull all of those dependencies before
>>> creating a tarball that would be wholly self-contained. Given an
>>> apache-couchdb-x.y.z.tar.gz release file, there won't be a requirement
>>> to have access to the ASF git repos.
>>>
>>> I'm not entirely sure how controversial this is for anyone. For the
>>> most part the reactions I remember hearing were more concerned with
>>> whether the infrastructure team would allow us to use this sort of
>>> configuration. I looked yesterday and asked, and apparently it's
>>> something we can request but as always we'll want to verify again if
>>> we have consensus to move in this direction.
>>>
>>> Anyone have comments or flames? Right now I'm just interested in
>>> feeling out what sort of (lack of?) consensus there is on such a
>>> change. If there's general consensus I'd think we'd do a vote in a
>>> couple weeks and if that passes then start on down this road for the
>>> two merge projects and then it would become part of master once those
>>> land (as opposed to doing this to master and then attempting to merge
>>> rcouch/bigcouch onto that somehow).
>>>
>>>
>>> This is a quick pass at listing what extra repositories I'd have
>>> created. Some of these applications only exist in the bigcouch and/or
>>> rcouch branches so that's where the unfamiliar application names are
>>> from. I'd also point out that the documentation and fauxton things are
>>> just on a whim in that we could decouple that development from the
>>> erlang development. I can see arguments for and against those. I'm much
>>> less concerned about that aspect than the Erlang parts that are directly
>>> affected by rebar/Erlang conventions.
>>>
>>>    chttpd
>>>    config
>>>    couch
>>>    couch_collate
>>>    couch_dbupdates
>>>    couch_httpd
>>>    couch_index
>>>    couch_mrview
>>>    couch_plugins
>>>    couch_replicator
>>>    documentation
>>>    ddoc_cache
>>>    ets_lru
>>>    fabric
>>>    fauxton
>>>    ibrowse
>>>    jiffy
>>>    mem3
>>>    mochiweb
>>>    oauth
>>>    rebar
>>>    rexi
>>>    snappy
>>>    twig
>>
>>
>> I also contemplated this and I am generally +1 on this. And definitely
>> +1 to mirror them on the apache git if possible.  I have a couple of
>> comments though.
>>
>> Initially I also had everything separated in its own source repository. A
>> year ago I merged the couchdb erlang applications back into one core repo
>> and put all the dependencies in the refuge repository, or in the refuge CDN
>> for the spidermonkey and ICU sources.
>>
>> I merged the couchdb erlang applications back into one core repo because
>> they were a little too interdependent, especially couch_httpd, couch_index
>> and couch_mrview. These applications are not yet able to stand by themselves.
>>
>> Imo if we split everything into their own apps, then we should make sure
>> that couch_httpd can be used without couch_index and couch_mrview (which
>> means that "all_docs" is available in couch_httpd). Also we should be able
>> to just launch couch without any of the above. And probably without the
>> need of an ini. The couch_query_server module thing is an interesting case.
>> bigcouch is also introducing `ddoc_cache`, and I am not sure why it is
>> provided as a standalone app. Does it mean it can be replaced by another
>> application eventually? Why not simply have it in the couch application?
>> Does it need to be updated separately?
>>
>> Also, all our base applications should be namespaced correctly so
>> they will be strictly identified as erlang modules: "config" is too
>> generic, "ddoc_cache" too. Others are probably OK.
>>
>> There are probably other things that we could provide as apps:
>>
>> - couch_daemon
>> - couch_js
>> - couch_external
>> - couch_stats
>> - couch_compaction_daemon
>> - couch_httpd_proxy
>>
>> Anyway, again I'm +1 for this move, I really think it's a good idea.
>>
>> - benoit
>
> I agree on most of this. Roughly I see three general points.
>
> First, deciding on whether some things are external deps is definitely up for 
> discussion. Whether couch_mrview is a different app/repo is not necessarily 
> clear cut. Personally I think I over engineered couch_index which blurs the 
> lines a bit. If I could wave a wand I'd have just couch_mrview and it'd be 
> separate. More importantly I think the separate repos make these things more
> apparent. The fact we're discussing this sort of architecture thing is
> suggestive that it's forcing us to think a bit harder.
>
> Second is the aspect of composability. For instance the mrview thing to me is 
> obviously a different repo precisely so a user could import couch (_core?) 
> directly without requiring the spider monkey dependency. The monolithic repo 
> doesn't allow this without some very non-standard tooling.
>
> Thirdly, app naming is always a point of contention. The config name was
> actually a hot code upgrade concern. We couldn't reuse couch_config directly
> at the time. And Adam was also hopeful we could turn it into a useful
> non-specific config app.
>
> Fourthly, and related to secondly, we'll also want to look at splitting other
> apps out as necessary. The ones you listed I think aren't controversial; it's
> just that no one has done it yet. My list was purely what existed so far
> without attempting to carve things up more. I definitely agree we should
> carve more; I just wanted to cover consensus that carving is the right
> direction.
>
> Fifthly, I'm done typing on my phone. I'll fill in more thoughts tomorrow.
>
