[RFD] should all merge bases be equal?

Junio C Hamano Mon, 17 Oct 2016 15:29:09 -0700

People can see how fast the usual merges I make every day are by
looking at the output from


    $ git log --first-parent --format='%ct %s' master..pu

and noticing the seconds since epoch.  On most days, these merges are
recreated directly on top of 'master' from scratch, and they get
timestamps that are very close to each other (or the same), meaning
that I am getting multiple merges per second.
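
A rough way to tally this, as a sketch only: count how many
first-parent commits in a range share each committer timestamp.
The helper name merges_per_second is made up for illustration and
is not an existing git command.

```shell
# merges_per_second <range>
# Hypothetical helper: prints "<count> <seconds-since-epoch>" lines,
# busiest second first, for the first-parent commits in <range>.
merges_per_second () {
    # git log lists commits newest first, so equal timestamps are
    # normally adjacent and uniq -c can count them directly.
    git log --first-parent --format='%ct' "$1" |
    uniq -c |
    sort -rn
}

# Example use (assumes the 'master' and 'pu' branches exist):
#   merges_per_second master..pu | head -n 5
```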

Being accustomed to how fast my merges go, I notice one merge that
hiccups for me once every few days: merging back from 'master'
to 'next'.  This happens after I have merged multiple topics (that by
definition have to be the ones that were already merged to 'next'
some time ago) to 'master', and 'master' may also have its own
commits (e.g. an update to "RelNotes") and merges of side branches that
were not in 'next' (e.g. merges from submaintainers like i18n, etc.).

This merge is slow because it typically has many
merge bases.  For example, today's tip of 'next' is:

    commit 6021889cc14df07d4366829367d2c4a11d1eaa56
    Merge: 4868def05e 659889482a
    Author: Junio C Hamano <[email protected]>
    Date:   Mon Oct 17 14:02:05 2016 -0700

        Sync with master

        * master:
          Tenth batch for 2.11
          l10n: de.po: translate 260 new messages
          l10n: de.po: fix translation of autostash
          l10n: ru.po: update Russian translation

which is a merge that has 12 merge bases:

    $ git merge-base --all 4868def05e 659889482a | git name-rev --stdin
    3cdd5d19178a54d2e51b5098d43b57571241d0ab (ko/master)
    641c900b2c3a8c3d385eb353b3801a5f49ddbb47 (js/reset-usage)
    30cfe72d37ed8c174cae43923769730a94549dae (rs/pretty-format-color-doc-fix)
    654311bf6ee0fbf530c088271c2c76d46f31f82d (da/mergetool-diff-order)
    72710165c932edb2b8552aef7aef2f357dde4746 (sb/submodule-config-doc-drop-path)
    842a516cb02a53cf0291ff67ed6f8517966345c0 (js/regexec-buf)
    62fe0eb4804c297486a1d421a4f893865fcbc911 (jk/quarantine-received-objects)
    a94bb683970a111b467a36590ca36e52754ad504 (rs/cocci)
    e8c42cb9ce6a566aad797cc6c5bc1279d608d819 (jk/ref-symlink-loop)
    22d3b8de1b625813faec6f3d6ffe66124837b78b (jk/clone-copy-alternates-fix)
    7431596ab1f05a13adb93b44108f27cfd6578a31 (nd/commit-p-doc)
    5275c3081c2b2c6166a2fc6b253a3acb20f8ae89 (dt/http-empty-auth)

The tip of each topic that was merged recently to 'master' since
'master' was last merged to 'next' becomes a valid merge-base by
design of the workflow.  We merge a topic to 'master' whose tip
has already been in 'next' for a while, so the tip of 'next' before
merging 'master' back is a descendant of the tips of these topics,
and the tip of 'master' before I make this merge has just become a
descendant of the tips of these topics.  That makes them common
ancestors between 'master' and 'next'.
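
The shape described above can be reproduced in a throwaway
repository.  This is a toy sketch, not the real git.git history:
the branch names topic-a and topic-b, the tag old-master (standing
in for ko/master) and the file names are all made up.

```shell
#!/bin/sh
# Toy history: two topics cook in 'next', 'master' gets its own
# commit and is synced to 'next', then the topics graduate to 'master'.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # do not depend on init.defaultBranch
git config user.name demo
git config user.email demo@example.com

commit () { # commit <file>: record a one-line file named <file>
    echo "$1" >"$1" && git add "$1" && git commit -q -m "$1"
}

commit base
git branch next

# Two topics forked from the initial 'master'.
git checkout -q -b topic-a master && commit a.txt
git checkout -q -b topic-b master && commit b.txt

# The topics are merged to 'next' first.
git checkout -q next
git merge -q --no-ff --no-edit topic-a
git merge -q --no-ff --no-edit topic-b

# 'master' gains a commit of its own and is synced to 'next'.
git checkout -q master && commit RelNotes
git tag old-master
git checkout -q next
git merge -q --no-ff --no-edit master

# Later the same topic tips graduate to 'master'.
git checkout -q master
git merge -q --no-ff --no-edit topic-a
git merge -q --no-ff --no-edit topic-b

# old-master and both topic tips are now common ancestors, and none
# of them is an ancestor of another, so all of them are merge bases.
bases=$(git merge-base --all next master)
git name-rev $bases
```

Every additional topic that graduates before the next sync adds one
more entry to this list.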

But for the purpose of figuring out the 3-way merge result, I
suspect that they are pretty much useless common ancestors to use as
merge bases.  The first one in the list, the old 'master' that
was merged the last time to 'next', would be a lot more useful.

And of course, if I do this:

    $ git checkout next
    $ git merge master ;# this is slow due to the 12-base above
    $ git checkout HEAD^ ;# detach at the previous state
    $ git merge-recursive ko/master -- HEAD master

the merge is instantaneous.  I'd get only what truly happened
uniquely on 'master', e.g. RelNotes update and i18n merge.

I am wondering if it is worth trying to optimize it by taking
advantage of the specific workflow (i.e. give me an option to use
when merging 'master' back to 'next' that allows me to exclude the
tips of these topic branches).  Would I be making the result open to
the criss-cross merge gotchas the "recursive merge" machinery was
designed to prevent in the first place by doing so?  Offhand, I do
not think that would be the case.

Assuming that it is a good idea, there is another question of how to
tell the more meaningful merge bases (ko/master in this case) out of
the less useful ones (all the others).  I think it would be
sufficient to reject commits that are not on the first-parent chain
of either branch being merged.  Among the 12 merge bases, ko/master
is the only one that appears on the first-parent chain from 'master'
being merged (it does not appear on the first-parent chain from
'next').  All others were topic tips that by definition were merged
as second parent to integration branches ('next' and later 'master').
The place to do this would be a new option to 'merge-base'; instead
of "--all", perhaps a "--major" option would give only the major merge
bases (with the definition of 'major' being the above heuristic), and
then "git merge-recursive" would learn a "-Xmajor-base-only" strategy
option, or something along those lines.
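
Until something like "--major" exists, the heuristic can be
approximated with existing commands.  The following is only a sketch
of the filtering rule: major_bases is a hypothetical helper, not a
git command, and walking both first-parent chains in full is wasteful
on a large history.

```shell
# major_bases <A> <B>
# Hypothetical helper: prints the merge bases of <A> and <B> that lie
# on the first-parent chain of at least one of the two commits.
# Plain POSIX sh has no local variables, so 'chain' and 'status'
# are globals here.
major_bases () {
    chain=$(mktemp) || return 1
    {
        git rev-list --first-parent "$1"
        git rev-list --first-parent "$2"
    } >"$chain"
    # Keep only the merge bases whose full object name appears
    # in the collected first-parent chains.
    git merge-base --all "$1" "$2" | grep -F -x -f "$chain"
    status=$?
    rm -f "$chain"
    return $status
}

# Possible use, mirroring the manual experiment earlier in this message:
#   git merge-recursive $(major_bases HEAD master) -- HEAD master
```

The helper returns non-zero when no merge base survives the filter,
in which case a caller would presumably want to fall back to the full
"--all" list.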

Thoughts?
