tl;dr:

I want to handle conflicts automatically on lines like

> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"

where conflicts frequently arise from adding/removing ~ before the
architecture names or adding/removing whole architectures. I don't
know if I should use a custom git merge driver or a custom git merge
strategy.
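
To make the kind of resolution I mean concrete, here is a minimal
sketch of a three-way merge over a single KEYWORDS line. It is not the
patch below and does not use ekeyword; parse_keywords and
merge_keywords are hypothetical names, and the plain alphabetical
sort of the result is an assumption that differs from ekeyword's own
ordering.

```python
from typing import Dict, Optional


def parse_keywords(line: str) -> Dict[str, str]:
    """Map each architecture to its prefix: '' (stable), '~' or '-'."""
    value = line[line.find('"') + 1:line.rfind('"')]
    result: Dict[str, str] = {}
    for kw in value.split():
        prefix = kw[0] if kw[0] in '~-' else ''
        result[kw[len(prefix):]] = prefix
    return result


def merge_keywords(base: str, ours: str, theirs: str) -> Optional[str]:
    """Three-way merge per architecture; None means a real conflict."""
    o, a, b = (parse_keywords(s) for s in (base, ours, theirs))
    merged: Dict[str, str] = {}
    for arch in sorted(set(o) | set(a) | set(b)):
        vo, va, vb = o.get(arch), a.get(arch), b.get(arch)
        if va == vb or vb == vo:
            pick = va          # both sides agree, or only ours changed
        elif va == vo:
            pick = vb          # only theirs changed
        else:
            return None        # both sides changed it differently
        if pick is not None:   # None means the architecture was dropped
            merged[arch] = pick
    # Assumption: plain alphabetical order, not ekeyword's ordering
    return 'KEYWORDS="' + ' '.join(p + arch for arch, p in merged.items()) + '"'
```

With a base of "~alpha ~amd64 ~hppa ~x86", one side dropping ~alpha
and the other stabilizing hppa, this yields "~amd64 hppa ~x86". If
both sides change the same architecture in different ways it returns
None, and the conflict still has to be handled some other way.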


So the program in the patch below works, but it's not ideal, because
it rejects any hunks that don't touch the KEYWORDS=... assignment.
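
One idea for lifting that restriction, sketched under assumptions and
untested against real ebuilds: once a merged KEYWORDS line is known,
write it into all three versions of the file, so the line is
identical everywhere and an ordinary three-way merge only has to deal
with the remaining hunks. replace_keywords is a hypothetical helper,
and it assumes the assignment sits on a single line that starts with
KEYWORDS=.

```python
from typing import List


def replace_keywords(lines: List[str], merged: str) -> List[str]:
    """Return a copy of lines with the first KEYWORDS= line replaced."""
    out = list(lines)
    for i, line in enumerate(out):
        stripped = line.lstrip()
        if stripped.startswith('KEYWORDS='):
            # Keep the original indentation, swap in the merged assignment
            indent = line[:len(line) - len(stripped)]
            out[i] = indent + merged.strip() + '\n'
            break
    return out
```

Applying this to the contents of %O, %A and %B before the regular
merge would leave KEYWORDS out of the conflict entirely.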

As I understand it, a custom git merge driver is intended to be used
to merge whole file formats, like JSON. As a result, you configure it
via gitattributes on a per-extension basis.

I really just want to make the default recursive git merge handle
KEYWORDS=... conflicts automatically, and I don't expect to be able to
make a git merge driver that can handle arbitrary conflicts in
*.ebuild files. The merge driver returns non-zero if it was unable
to resolve the conflicts, but when it does so git evidently doesn't
fall back and insert the typical <<< HEAD ... === ... >>> markers.
Maybe I could make my merge driver insert those like git normally
does? It seems like git's logic is probably a bit better at handling
some conflicts than my tool would be.
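
A possible way to get git's usual markers from inside the driver is
to hand the files to git merge-file when the custom logic gives up.
git merge-file writes its result, including conflict markers, into
the first file argument and exits with the number of conflicts, which
matches what a merge driver is expected to do with %A. The sketch
below is an assumption about how that could be wired up; run_driver,
the resolve callback and the -L labels are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def run_driver(base: str, ours: str, theirs: str,
               resolve: Callable[[str, str, str], bool],
               run: Callable[[Sequence[str]], int] = subprocess.call) -> int:
    """Try the custom resolver first, then fall back to git merge-file.

    resolve(base, ours, theirs) should leave the merged result in
    `ours` and return True on success, False if it cannot merge.
    """
    if resolve(base, ours, theirs):
        return 0
    # Fallback: a normal three-way merge. The result, with conflict
    # markers if any, is written to `ours`; the exit status is the
    # number of conflicts (0 means a clean merge).
    return run(['git', 'merge-file',
                '-L', 'ours', '-L', 'base', '-L', 'theirs',
                ours, base, theirs])
```

The driver would then exit with the value returned by
run_driver(%O, %A, %B, ...), so git sees a non-zero status only when
conflicts remain in %A.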

So... is a git merge strategy the thing I want? I don't know. There
doesn't seem to really be any documentation on writing git merge
strategies. I've only found [1] and [2].

Cc'ing g...@vger.kernel.org, since I expect that's where the experts
are. Hopefully they have suggestions.


[1] https://stackoverflow.com/questions/23140240/git-how-do-i-add-a-custom-merge-strategy
[2] https://stackoverflow.com/questions/54528824/any-documentation-for-writing-a-custom-git-merge-strategy


On Sun, Dec 20, 2020 at 10:44 PM Matt Turner <matts...@gentoo.org> wrote:
>
> Since the KEYWORDS=... assignment is a single line, git struggles to
> handle conflicts. When rebasing a series of commits that modify the
> KEYWORDS=... it's usually easier to throw them away and reapply on the
> new tree than it is to manually handle conflicts during the rebase.
>
> git allows a 'merge driver' program to handle conflicts; this program
> handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
> with these keywords:
>
> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
>
> One developer drops the ~alpha keyword and pushes to gentoo.git, and
> another developer stabilizes hppa. Without this merge driver, git
> requires the second developer to manually resolve the conflict.  With
> the custom merge driver, it automatically resolves the conflict.
>
> gentoo.git/.git/config:
>
>         [core]
>                 ...
>                 attributesfile = ~/.gitattributes
>         [merge "keywords"]
>                 name = KEYWORDS merge driver
>                 driver = merge-driver-ekeyword %O %A %B
>
>  ~/.gitattributes:
>
>         *.ebuild merge=keywords
>
> Signed-off-by: Matt Turner <matts...@gentoo.org>
> ---
> One annoying wart in the program is due to the fact that ekeyword
> won't work on any file not named *.ebuild. I make a symlink (and set up
> an atexit handler to remove it) to work around this. I'm not sure we
> could make ekeyword handle arbitrary filenames given its complex multi-
> argument parameter support. git merge files are named .merge_file_XXXXX
> according to git-unpack-file(1), so we could allow those. Thoughts?
>
>  bin/merge-driver-ekeyword | 125 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 125 insertions(+)
>  create mode 100755 bin/merge-driver-ekeyword
>
> diff --git a/bin/merge-driver-ekeyword b/bin/merge-driver-ekeyword
> new file mode 100755
> index 0000000..6e645a9
> --- /dev/null
> +++ b/bin/merge-driver-ekeyword
> @@ -0,0 +1,125 @@
> +#!/usr/bin/python
> +#
> +# Copyright 2020 Gentoo Authors
> +# Distributed under the terms of the GNU General Public License v2 or later
> +
> +"""
> +Custom git merge driver for handling conflicts in KEYWORDS assignments
> +
> +See https://git-scm.com/docs/gitattributes#_defining_a_custom_merge_driver
> +"""
> +
> +import atexit
> +import difflib
> +import os
> +import shutil
> +import sys
> +
> +from typing import List, Optional, Tuple
> +
> +from gentoolkit.ekeyword import ekeyword
> +
> +
> +def keyword_array(keyword_line: str) -> List[str]:
> +    # Find indices of string inside the double-quotes
> +    i1: int = keyword_line.find('"') + 1
> +    i2: int = keyword_line.rfind('"')
> +
> +    # Split into array of KEYWORDS
> +    return keyword_line[i1:i2].split(' ')
> +
> +
> +def keyword_line_changes(old: str, new: str) -> List[Tuple[Optional[str],
> +                                                           Optional[str]]]:
> +    a: List[str] = keyword_array(old)
> +    b: List[str] = keyword_array(new)
> +
> +    s = difflib.SequenceMatcher(a=a, b=b)
> +
> +    changes = []
> +    for tag, i1, i2, j1, j2 in s.opcodes():
> +        if tag == 'replace':
> +            changes.append((a[i1:i2], b[j1:j2]),)
> +        elif tag == 'delete':
> +            changes.append((a[i1:i2], None),)
> +        elif tag == 'insert':
> +            changes.append((None, b[j1:j2]),)
> +        else:
> +            assert tag == 'equal'
> +    return changes
> +
> +
> +def keyword_changes(ebuild1: str, ebuild2: str) -> List[Tuple[Optional[str],
> +                                                              Optional[str]]]:
> +    with open(ebuild1) as e1, open(ebuild2) as e2:
> +        lines1 = e1.readlines()
> +        lines2 = e2.readlines()
> +
> +        diff = difflib.unified_diff(lines1, lines2, n=0)
> +        assert next(diff) == '--- \n'
> +        assert next(diff) == '+++ \n'
> +
> +        hunk: int = 0
> +        old: str = ''
> +        new: str = ''
> +
> +        for line in diff:
> +            if line.startswith('@@ '):
> +                if hunk > 0: break
> +                hunk += 1
> +            elif line.startswith('-'):
> +                if old or new: break
> +                old = line
> +            elif line.startswith('+'):
> +                if not old or new: break
> +                new = line
> +        else:
> +            if 'KEYWORDS=' in old and 'KEYWORDS=' in new:
> +                return keyword_line_changes(old, new)
> +        return None
> +
> +
> +def apply_keyword_changes(ebuild: str,
> +                          changes: List[Tuple[Optional[str],
> +                                              Optional[str]]]) -> int:
> +    # ekeyword will only modify files named *.ebuild, so make a symlink
> +    ebuild_symlink = ebuild + '.ebuild'
> +    os.symlink(ebuild, ebuild_symlink)
> +    atexit.register(lambda: os.remove(ebuild_symlink))
> +
> +    for removals, additions in changes:
> +        args = []
> +        for rem in removals or []:
> +            # Drop leading '~' and '-' characters and prepend '^'
> +            i = 1 if rem[0] in ('~', '-') else 0
> +            args.append('^' + rem[i:])
> +        if additions:
> +            args.extend(additions)
> +        args.append(ebuild_symlink)
> +
> +        result = ekeyword.main(args)
> +        if result != 0:
> +            return result
> +    return 0
> +
> +
> +def main(argv):
> +    if len(argv) != 4:
> +        sys.exit(-1)
> +
> +    O = argv[1] # %O - filename of original
> +    A = argv[2] # %A - filename of our current version
> +    B = argv[3] # %B - filename of the other branch's version
> +
> +    # Get changes from %O to %B
> +    changes = keyword_changes(O, B)
> +    if not changes:
> +        sys.exit(-1)
> +
> +    # Apply O -> B changes to A
> +    result: int = apply_keyword_changes(A, changes)
> +    sys.exit(result)
> +
> +
> +if __name__ == "__main__":
> +    main(sys.argv)
> --
> 2.26.2
>
