Hello everybody!

Wow, 56 repos! Hopefully we get an award somewhere for that. I've
listed the repositories below in some crude groups to try and give an
idea of what we're working with. I have to agree that this is getting
a bit on the ridiculous side. Of all of the repos that the ASF
actually develops I'm only seeing four or so Erlang apps (b64url,
config, couch-collate, and khash) that are likely truly re-usable
outside of CouchDB without mucking about.

I have previously (years ago) played around with trying to go back to
a single repository. It generally works fine; the only issue that I
found was that rebar's {vsn, git} tag for *.app.src files doesn't work
in a single repo, and it gets a bit complicated managing that by hand.
However, I think it would be possible to add something like rebar's
deps files and a small custom tool that either a) breaks the build if
versions haven't changed or b) even better, automatically sets
application versions based on a source file (and tweaks them like git
describe when there's been a commit since the version). This lets us
continue to "alias" commits with a human readable version and doesn't
require a single version across all applications. (Alternatively, we
could wipe the version info on every project and set a single version
that is the same for all applications that matches the CouchDB
version, but this might get weird for upstream dependencies).
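
To make option (b) a bit more concrete, here is a minimal sketch of the
version-tweaking step. Everything in it is an assumption for illustration:
the function name is hypothetical, and the commit count and short hash
would have to be obtained separately (for example by asking git about the
application's subdirectory). It only shows the shape of the result, which
mirrors the "<version>-<count>-g<sha>" form that git describe prints.

```python
def describe_version(base_version, commits_since, short_sha):
    """Return a git-describe-style version string (hypothetical helper).

    base_version  -- version recorded in the application's source file
    commits_since -- number of commits since that version was recorded
    short_sha     -- abbreviated hash of the most recent such commit
    """
    if commits_since < 0:
        raise ValueError("commits_since must be non-negative")
    if commits_since == 0:
        # Nothing has changed since the version was recorded: use it as-is.
        return base_version
    # Mirror the "<tag>-<count>-g<sha>" shape used by git describe.
    return "{}-{}-g{}".format(base_version, commits_since, short_sha)


print(describe_version("1.2.0", 0, "abc1234"))  # 1.2.0
print(describe_version("1.2.0", 3, "abc1234"))  # 1.2.0-3-gabc1234
```

Option (a) could reuse the same inputs: fail the build when commits_since
is non-zero and the recorded version has not been bumped.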

That said, I'd agree with Bob that the new dependency format seems to
be solving a problem we shouldn't have. I'd rather just pull
everything into a single repo and use tooling to help maintain any
sharp edges like the versioning issue I mentioned above.

Personally, what I'd like to see is to have all Erlang repos merged
into the main couchdb.git repo and then have all upstream dependencies
managed by git-subtree. I could go either way on having the node and
spidermonkey view engines included or not. For the non-Erlang parts of
our release (fauxton and documentation) I'd keep them as separate
repos so that their tooling doesn't need to be changed and/or adapted
to work out of a repo subdirectory. The administration things also
seem to make good sense to keep separate as they're not part of the
product/release tarball/whatever.

If anyone has a strong objection to a monolithic Erlang repo I'd like
to hear it. Otherwise I may work up a lengthier and more thorough
proposal for dev@ to consider consolidating all of these repositories
for sanity and profit.

Paul

Main repo:

couchdb.git


Erlang repos developed by ASF:

couchdb-b64url.git
couchdb-cassim.git
couchdb-chttpd.git
couchdb-config.git
couchdb-couch-collate.git
couchdb-couch-dbupdates.git
couchdb-couch-epi.git
couchdb-couch-event.git
couchdb-couch-httpd.git
couchdb-couch-index.git
couchdb-couch-log-lager.git
couchdb-couch-log.git
couchdb-couch-mrview.git
couchdb-couch-plugins.git
couchdb-couch-replicator.git
couchdb-couch-stats.git
couchdb-couch.git
couchdb-ddoc-cache.git
couchdb-erlang-tests.git
couchdb-ets-lru.git
couchdb-fabric.git
couchdb-global-changes.git
couchdb-ioq.git
couchdb-khash.git
couchdb-mango.git
couchdb-mem3.git
couchdb-peruser.git
couchdb-rexi.git
couchdb-setup.git
couchdb-snappy.git
couchdb-twig.git


Non-Erlang things we develop as part of a release:

couchdb-fauxton.git
couchdb-documentation.git


Mirrored repos of upstream Erlang deps:

couchdb-bear.git
couchdb-folsom.git
couchdb-goldrush.git
couchdb-ibrowse.git
couchdb-jiffy.git
couchdb-lager.git
couchdb-meck.git
couchdb-mochiweb.git
couchdb-oauth.git
couchdb-rebar.git


Query Servers:

couchdb-query-server-node.git
couchdb-query-server-spidermonkey.git


Unsure but has Erlang in it:

couchdb-examples.git


Project Administrative Things Kinda:

couchdb-admin.git
couchdb-ci.git
couchdb-docker.git
couchdb-www.git


Client Library:

couchdb-nano.git


JS CLI tool:

couchdb-nmo.git


Empty:

couchdb-javascript-tests.git

Legacy:

couchdb-futon.git
couchdb-jquery-couch.git


On Wed, Apr 13, 2016 at 3:41 AM, Garren Smith <gar...@apache.org> wrote:
> I like the idea of going back to a single repo for core db features. I
> would like Fauxton to still be in its own repo.
> As someone who wrote some very basic Erlang code for CouchDB recently, I
> found the multiple repos quite tricky to manage and I couldn't see how it
> made anything easier.
>
> On Wed, Apr 13, 2016 at 8:35 AM, Alexander Shorin <kxe...@gmail.com> wrote:
>
>> Hi Robert,
>>
>> The point about flattening to a single repository is valid: in the end,
>> our app repos are in a broken state all the time because they do not
>> declare their dependencies. So no one can pick fabric@master and run it -
>> they'll spend quite a lot of time figuring out the right versions of the
>> deps. But the idea of solving the problem by reducing the set of
>> repositories we have to test is good.
>>
>>
>> Hi Ilya,
>>
>> I have an alternative solution for you:
>>
>> - Turn off Travis CI everywhere we cannot be sure about testing
>> without dependent PRs (all except third-party modules, fauxton, docs,
>> and a few more independent projects like couch-epi);
>> - Require everyone to submit an additional PR to the apache/couchdb repo
>> with the updated commit hashes;
>> - Run CI testing on this apache/couchdb PR;
>> - If you rebase/update any of your subcomponent PRs, you must update
>> the commit hash on the apache/couchdb one;
>>
>> Pros:
>> - We won't forget to update rebar.config when new changes land;
>> - We will always run complete integration testing with all the right
>> deps states;
>> - We won't have to invent any complicated integration solutions to
>> deal with sub-repos testing;
>> - No new steps/files/work introduced, so there is no need to care
>> about a learning curve;
>>
>> Cons:
>> - The Travis builder needs to be a bit clever to work out which remote
>> (fork) the new rebar.config hashes live on in order to check them out
>> correctly, though that is not rocket science since we have access to git
>> information there.
>>
>> The Jenkins CI role here is to ensure that master and the releases build
>> correctly on the various OSes.
>>
>> Sounds simpler and better to me; how does it sound to you?
>>
>> --
>> ,,,^..^,,,
>>
>>
>> On Wed, Apr 13, 2016 at 12:37 AM, Robert Samuel Newson
>> <rnew...@apache.org> wrote:
>> > I'd like us to instead consider flattening to a single repository. I've
>> > found no value and only pain from the multiple repositories approach (43
>> > in total!).
>> >
>> > The contention is that multiple repositories enforces application
>> > boundaries (low coupling / high cohesion) but I've not felt that in
>> > reality. We don't, and couldn't meaningfully, release any of our
>> > components separately, and, as Ilya makes clear, many enhancements
>> > require changes to multiple repositories, and we break this into
>> > multiple commits, losing the ability to look at an enhancement in toto.
>> >
>> > If what Ilya is proposing is the solution, I think it's the solution to
>> > a problem we should not have.
>> >
>> > B.
>> >
>> >> On 12 Apr 2016, at 16:22, Ilya Khlopotov <iil...@ca.ibm.com> wrote:
>> >>
>> >>
>> >>
>> >> Dear community,
>> >>
>> >>
>> >> There is a problem with the contributor workflow which renders our CI
>> >> system useless. As you might know, the couchdb project consists of
>> >> multiple repositories. Most of the time changes cross repository
>> >> boundaries. When this happens, a push to any of the repositories causes
>> >> CI failures. CI fails since it uses the old version of dependencies
>> >> from the main repository of the project. Here is what we can do about
>> >> it.
>> >>
>> >> # Proposal
>> >>
>> >> Let's use multiple files for dependency management.
>> >>
>> >> - deps.json - serves the same purpose as the dependencies list from the
>> >> current rebar.config.script
>> >> - proposed.deps.json - here we specify the list of PRs we want to commit
>> >> atomically
>> >> - override.deps.json - local file outside of version control which we
>> >> consult in order to include development tools specific to a contributor
>> >> (code reloader, debugger, tracer, profiler, binpp, ...)
>> >>
>> >> Below is an example of the content of these files:
>> >>
>> >> ## deps.json
>> >> {
>> >>    "src/b64url": [
>> >>        "https://github.com/apache/couchdb-b64url",
>> >>        "6895652d80f95cdf04efb14625abed868998f174"
>> >>    ],
>> >>    "src/cassim": [
>> >>        "https://github.com/apache/couchdb-cassim",
>> >>        "9bbfe82125284fa7cb3317079e8bc1dc876a07bf"
>> >>    ],
>> >>    "src/chttpd": [
>> >>        "https://github.com/apache/couchdb-chttpd",
>> >>        "54e8f6147486d9afc5245e0143d15a4dd1185654"
>> >>    ],
>> >>    "src/meck": [
>> >>        "https://github.com/apache/couchdb-meck",
>> >>        "tree/0.8.2"
>> >>    ],
>> >> ....
>> >> }
>> >>
>> >> ## proposed.deps.json
>> >> {
>> >>    "src/couch": "https://github.com/apache/couchdb-couch/pull/124",
>> >>    "src/chttpd": "https://github.com/apache/couchdb-chttpd/pull/108",
>> >>    "src/couch_tests": [
>> >>        "https://github.com/apache/couchdb-erlang-tests",
>> >>        "tree/branch"
>> >>    ]
>> >> }
>> >>
>> >> # Interface
>> >>
>> >> I propose to write a simple CLI tool to work with this structure. Below
>> >> is a list of commands which we need to support (for a minimal version).
>> >>
>> >> ## Adding new dependency
>> >>
>> >> git propose add https://github.com/apache/couchdb-foo a2d5ad2eedc960248b806f61df0a1009462bdb46
>> >> git propose add https://github.com/apache/couchdb-bar tree/branch_name
>> >>
>> >> ## Adding new PR to the change set
>> >>
>> >> git propose add https://github.com/apache/couchdb-config/pull/4
>> >>
>> >> ## Checking out right dependencies
>> >>
>> >> git propose checkout
>> >>
>> >> ## Checking out release
>> >>
>> >> git propose checkout --release # this would ignore proposed.deps.json
>> >> # if it exists
>> >>
>> >> ## Merge the change
>> >>
>> >> This command would do the following:
>> >> - Parse proposed.deps.json
>> >> - Retrieve the merge commit sha for every PR (exit if a dependency is
>> >> not merged yet)
>> >> - Update dependencies in deps.json with the correct merge commit sha
>> >> - Remove proposed.deps.json
>> >>
>> >> # Workflow
>> >>
>> >> export GIT_EXEC_PATH=`pwd`/bin # or use tools like `direnv`
>> >> git checkout -b feature-ZZZ
>> >> cd src/X && hack dependency X
>> >> cd ../..
>> >> cd src/Y && hack dependency Y
>> >> issue PRs for X and Y
>> >> cd ../..
>> >> git propose add https://github.com/apache/couchdb-X/pull/4
>> >> git propose add https://github.com/apache/couchdb-Y/pull/49
>> >> git add proposed.deps.json
>> >> git commit -m "Commit feature {something} which does {a thing} and can
>> >> be tested as {procedure}"
>> >> git push origin feature-ZZZ
>> >> ^ this would trigger our CI
>> >> CI would do
>> >> git propose checkout && ./configure && make check
>> >>
>> >> # Pros and Cons
>> >>
>> >> ## Pros
>> >>
>> >> - Changes are merged atomically
>> >> - CI runs against expected versions of deps
>> >> - Enables git bisect
>> >> - Reduces the tasks that need to be done by an ASF committer (no need
>> >> to update rebar.config.script manually)
>> >> - Simplifies testing of PRs by reviewers
>> >> - Simplifies rebar.config since rebar is not used for managing deps
>> >>
>> >> ## Cons
>> >>
>> >> - some github.com specifics (concept of PRs and access to the github API
>> >> to get info about a PR)
>> >> - we need to have github.com as one of the remotes
>> >> - we trigger CI only on push to main repository
>> >>
>> >> # Implementation
>> >>
>> >> We write a git-propose script in python and place it in ./bin. We add
>> >> ./bin into either GIT_EXEC_PATH or PATH. You can always call the script
>> >> directly (as ./bin/git-propose) if you don't like amending your
>> >> environment.
>> >>
>> >> # Later improvements
>> >>
>> >> - We can issue PRs from the tool itself
>> >> - We can merge from the tool itself
>> >> - We can implement support for multiple remotes (asf, github, private)
>> >> - We can implement support for multiple git transports (for the first
>> >> version all repositories in *.deps.json files would use https://)
>> >>
>> >> Sincerely,
>> >> ILYA KHLOPOTOV
>> >
>>
