Re: [gentoo-portage-dev] [PATCH gentoolkit] bin: Add merge-driver-ekeyword

2020-12-31 Thread Zac Medico
On 12/31/20 11:47 AM, Matt Turner wrote:
> Since the KEYWORDS=... assignment is a single line, git struggles to
> handle conflicts. When rebasing a series of commits that modify the
> KEYWORDS=... it's usually easier to throw them away and reapply on the
> new tree than it is to manually handle conflicts during the rebase.
> 
> git allows a 'merge driver' program to handle conflicts; this program
> handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
> with these keywords:
> 
> KEYWORDS="~alpha amd64 arm arm64 ~hppa ppc ppc64 x86"
> 
> One developer drops the ~alpha keyword and pushes to gentoo.git, and
> another developer stabilizes hppa. Without this merge driver, git
> requires the second developer to manually resolve the conflict which is
> tedious and prone to mistakes when rebasing a long series of patches.
> With the custom merge driver, it automatically resolves the conflict.
> 
> To use the merge driver, configure your gentoo.git as such:
> 
> gentoo.git/.git/config:
> 
>   [merge "keywords"]
>   name = KEYWORDS merge driver
>   driver = merge-driver-ekeyword %O %A %B %P
> 
> gentoo.git/.git/info/attributes:
> 
>   *.ebuild merge=keywords
> 
> Signed-off-by: Matt Turner 
> ---
> v3: Address Zac's feedback: use tempfile.TemporaryDirectory

Looks great!
-- 
Thanks,
Zac





Re: [gentoo-portage-dev] [PATCH gentoolkit] bin: Add merge-driver-ekeyword

2020-12-31 Thread Matt Turner
On Mon, Dec 28, 2020 at 8:09 PM Zac Medico wrote:
>
> On 12/28/20 3:15 PM, Matt Turner wrote:
> > +def apply_keyword_changes(ebuild: str, pathname: str,
> > +                          changes: List[Tuple[Optional[str],
> > +                                              Optional[str]]]) -> int:
> > +    result: int = 0
> > +
> > +    # ekeyword will only modify files named *.ebuild, so make a symlink
> > +    ebuild_symlink: str = os.path.basename(pathname)
> > +    os.symlink(ebuild, ebuild_symlink)
>
> Are we sure that the current working directory is an entirely safe place
> to create this symlink? A simple fix would be to use
> tempfile.TemporaryDirectory to create a temporary directory to hold the
> symlink. Or, we could change ekeyword to assume that an argument is an
> ebuild if os.path.isfile(arg) succeeds.

Thanks, this is a good question. I've sent a v3 patch using
tempfile.TemporaryDirectory. I think that's better than passing
".merge_file_SEd3R8" to ekeyword since the filename is printed and
it's nice to see what file is being modified during the rebase.
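
For anyone following along, here is a minimal sketch of the temporary-directory approach, separate from the patch itself. The helper name with_named_symlink and its action callback are made up for illustration; only tempfile.TemporaryDirectory and os.symlink are what the v3 patch actually relies on.

```python
import os
import tempfile


def with_named_symlink(target: str, display_name: str, action):
    """Call action(link_path) on a symlink to target named like display_name."""
    with tempfile.TemporaryDirectory() as tmpdir:
        link = os.path.join(tmpdir, os.path.basename(display_name))
        # Use an absolute target so the link resolves from inside tmpdir.
        os.symlink(os.path.abspath(target), link)
        return action(link)
    # The directory and the symlink are removed when the with block exits;
    # the file the link pointed at is left in place.
```

Because the link lives in a private directory, nothing is created in the current working directory, and a tool that prints the path it modifies shows the ebuild's own file name rather than git's temporary one.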



[gentoo-portage-dev] [PATCH gentoolkit] bin: Add merge-driver-ekeyword

2020-12-31 Thread Matt Turner
Since the KEYWORDS=... assignment is a single line, git struggles to
handle conflicts. When rebasing a series of commits that modify the
KEYWORDS=... it's usually easier to throw them away and reapply on the
new tree than it is to manually handle conflicts during the rebase.

git allows a 'merge driver' program to handle conflicts; this program
handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
with these keywords:

KEYWORDS="~alpha amd64 arm arm64 ~hppa ppc ppc64 x86"

One developer drops the ~alpha keyword and pushes to gentoo.git, and
another developer stabilizes hppa. Without this merge driver, git
requires the second developer to manually resolve the conflict which is
tedious and prone to mistakes when rebasing a long series of patches.
With the custom merge driver, it automatically resolves the conflict.
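
As a rough illustration (not part of the patch), the two edits in this example can be expressed as keyword-level differences with difflib.SequenceMatcher, the same standard-library class the driver uses. The names base, ours, theirs and keyword_edits are made up for this sketch, and unlike the driver it reports an empty list rather than None when nothing is added or removed.

```python
import difflib

base = "~alpha amd64 arm arm64 ~hppa ppc ppc64 x86".split()
ours = "amd64 arm arm64 ~hppa ppc ppc64 x86".split()          # ~alpha dropped
theirs = "~alpha amd64 arm arm64 hppa ppc ppc64 x86".split()  # hppa stabilized


def keyword_edits(old, new):
    """Return a (removed, added) pair for every non-equal opcode."""
    matcher = difflib.SequenceMatcher(a=old, b=new)
    return [(old[i1:i2], new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != 'equal']


print(keyword_edits(base, ours))    # [(['~alpha'], [])]
print(keyword_edits(base, theirs))  # [(['~hppa'], ['hppa'])]
```

The two edits touch different keywords, so both can be applied to the same line even though git sees a single conflicting line.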

To use the merge driver, configure your gentoo.git as such:

gentoo.git/.git/config:

[merge "keywords"]
name = KEYWORDS merge driver
driver = merge-driver-ekeyword %O %A %B %P

gentoo.git/.git/info/attributes:

*.ebuild merge=keywords

Signed-off-by: Matt Turner 
---
v3: Address Zac's feedback: use tempfile.TemporaryDirectory

Since ekeyword prints the name of the file modified, I think using a
symlink with the name of the original file is better than having it
print .merge_file_SEd3R8.

 bin/merge-driver-ekeyword | 132 ++
 1 file changed, 132 insertions(+)
 create mode 100755 bin/merge-driver-ekeyword

diff --git a/bin/merge-driver-ekeyword b/bin/merge-driver-ekeyword
new file mode 100755
index 000..2df83fc
--- /dev/null
+++ b/bin/merge-driver-ekeyword
@@ -0,0 +1,132 @@
+#!/usr/bin/python
+#
+# Copyright 2020 Gentoo Authors
+# Distributed under the terms of the GNU General Public License v2 or later
+
+"""
+Custom git merge driver for handling conflicts in KEYWORDS assignments
+
+See https://git-scm.com/docs/gitattributes#_defining_a_custom_merge_driver
+"""
+
+import difflib
+import os
+import sys
+import tempfile
+
+from typing import List, Optional, Tuple
+
+from gentoolkit.ekeyword import ekeyword
+
+
+def keyword_array(keyword_line: str) -> List[str]:
+    # Find indices of string inside the double-quotes
+    i1: int = keyword_line.find('"') + 1
+    i2: int = keyword_line.rfind('"')
+
+    # Split into array of KEYWORDS
+    return keyword_line[i1:i2].split(' ')
+
+
+def keyword_line_changes(old: str, new: str) -> List[Tuple[Optional[str],
+                                                           Optional[str]]]:
+    a: List[str] = keyword_array(old)
+    b: List[str] = keyword_array(new)
+
+    s = difflib.SequenceMatcher(a=a, b=b)
+
+    changes = []
+    for tag, i1, i2, j1, j2 in s.get_opcodes():
+        if tag == 'replace':
+            changes.append((a[i1:i2], b[j1:j2]),)
+        elif tag == 'delete':
+            changes.append((a[i1:i2], None),)
+        elif tag == 'insert':
+            changes.append((None, b[j1:j2]),)
+        else:
+            assert tag == 'equal'
+    return changes
+
+
+def keyword_changes(ebuild1: str, ebuild2: str) -> List[Tuple[Optional[str],
+                                                              Optional[str]]]:
+    with open(ebuild1) as e1, open(ebuild2) as e2:
+        lines1 = e1.readlines()
+        lines2 = e2.readlines()
+
+    diff = difflib.unified_diff(lines1, lines2, n=0)
+    assert next(diff) == '--- \n'
+    assert next(diff) == '+++ \n'
+
+    hunk: int = 0
+    old: str = ''
+    new: str = ''
+
+    for line in diff:
+        if line.startswith('@@ '):
+            if hunk > 0:
+                break
+            hunk += 1
+        elif line.startswith('-'):
+            if old or new:
+                break
+            old = line
+        elif line.startswith('+'):
+            if not old or new:
+                break
+            new = line
+    else:
+        if 'KEYWORDS=' in old and 'KEYWORDS=' in new:
+            return keyword_line_changes(old, new)
+    return None
+
+
+def apply_keyword_changes(ebuild: str, pathname: str,
+                          changes: List[Tuple[Optional[str],
+                                              Optional[str]]]) -> int:
+    result: int = 0
+
+    with tempfile.TemporaryDirectory() as tmpdir:
+        # ekeyword will only modify files named *.ebuild, so make a symlink
+        ebuild_symlink: str = os.path.join(tmpdir, os.path.basename(pathname))
+        os.symlink(os.path.join(os.getcwd(), ebuild), ebuild_symlink)
+
+        for removals, additions in changes:
+            args = []
+            for rem in removals:
+                # Drop leading '~' and '-' characters and prepend '^'
+                i = 1 if rem[0] in ('~', '-') else 0
+                args.append('^' + rem[i:])
+            if additions:
+                args.extend(additions)
+

Re: [gentoo-portage-dev] [PATCH gentoolkit] bin: Add merge-driver-ekeyword

2020-12-28 Thread Zac Medico
On 12/28/20 5:09 PM, Zac Medico wrote:
> On 12/28/20 3:15 PM, Matt Turner wrote:
>> +def apply_keyword_changes(ebuild: str, pathname: str,
>> +                          changes: List[Tuple[Optional[str],
>> +                                              Optional[str]]]) -> int:
>> +    result: int = 0
>> +
>> +    # ekeyword will only modify files named *.ebuild, so make a symlink
>> +    ebuild_symlink: str = os.path.basename(pathname)
>> +    os.symlink(ebuild, ebuild_symlink)
> 
> Are we sure that the current working directory is an entirely safe place
> to create this symlink? A simple fix would be to use
> tempfile.TemporaryDirectory to create a temporary directory to hold the
> symlink. Or, we could change ekeyword to assume that an argument is an
> ebuild if os.path.isfile(arg) succeeds.
> 
>> +    for removals, additions in changes:
>> +        args = []
>> +        for rem in removals:
>> +            # Drop leading '~' and '-' characters and prepend '^'
>> +            i = 1 if rem[0] in ('~', '-') else 0
>> +            args.append('^' + rem[i:])
>> +        if additions:
>> +            args.extend(additions)
>> +        args.append(ebuild_symlink)
>> +
>> +        result = ekeyword.main(args)

Another option is to bypass the ekeyword.main function, like this:

try:
   ekeyword.process_ebuild(pathname, list(map(ekeyword.arg_to_op, args)))
except Exception:
   result = 1
   traceback.print_exc()
else:
   result = 0
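
To make the shape of that concrete without needing gentoolkit installed, here is a small sketch. The process_ebuild and arg_to_op parameters are stand-ins for the ekeyword functions named above, passed in as plain callables; the helper name apply_ops is made up, and the real signatures may take more arguments than shown here.

```python
import traceback
from typing import Callable, Iterable, List


def apply_ops(process_ebuild: Callable[[str, List[object]], object],
              arg_to_op: Callable[[str], object],
              pathname: str, args: Iterable[str]) -> int:
    """Run process_ebuild on pathname; return 0 on success, 1 on any error."""
    try:
        process_ebuild(pathname, [arg_to_op(arg) for arg in args])
    except Exception:
        # Report the failure but hand back a status instead of raising,
        # so the caller's "if result != 0" check keeps working.
        traceback.print_exc()
        return 1
    return 0
```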


>> +        if result != 0:
>> +            break
>> +
>> +    os.remove(ebuild_symlink)
>> +    return result
> 
> 


-- 
Thanks,
Zac





Re: [gentoo-portage-dev] [PATCH gentoolkit] bin: Add merge-driver-ekeyword

2020-12-28 Thread Zac Medico
On 12/28/20 3:15 PM, Matt Turner wrote:
> +def apply_keyword_changes(ebuild: str, pathname: str,
> +                          changes: List[Tuple[Optional[str],
> +                                              Optional[str]]]) -> int:
> +    result: int = 0
> +
> +    # ekeyword will only modify files named *.ebuild, so make a symlink
> +    ebuild_symlink: str = os.path.basename(pathname)
> +    os.symlink(ebuild, ebuild_symlink)

Are we sure that the current working directory is an entirely safe place
to create this symlink? A simple fix would be to use
tempfile.TemporaryDirectory to create a temporary directory to hold the
symlink. Or, we could change ekeyword to assume that an argument is an
ebuild if os.path.isfile(arg) succeeds.

> +    for removals, additions in changes:
> +        args = []
> +        for rem in removals:
> +            # Drop leading '~' and '-' characters and prepend '^'
> +            i = 1 if rem[0] in ('~', '-') else 0
> +            args.append('^' + rem[i:])
> +        if additions:
> +            args.extend(additions)
> +        args.append(ebuild_symlink)
> +
> +        result = ekeyword.main(args)
> +        if result != 0:
> +            break
> +
> +    os.remove(ebuild_symlink)
> +    return result
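
For reference, a standalone sketch of what the quoted loop builds for a few inputs. The helper name ekeyword_args and the Change alias are made up for illustration. One assumption differs from the quoted code: since keyword_line_changes in the patch emits None for the removals of a pure insertion, the sketch treats None as "nothing to remove" instead of iterating over it.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical alias: one (removals, additions) pair, either side may be None.
Change = Tuple[Optional[Sequence[str]], Optional[Sequence[str]]]


def ekeyword_args(changes: Sequence[Change]) -> List[List[str]]:
    """Build one ekeyword-style argument list per (removals, additions) pair."""
    result = []
    for removals, additions in changes:
        args = []
        for rem in removals or ():
            # Drop a leading '~' or '-' and prepend '^'
            i = 1 if rem[0] in ('~', '-') else 0
            args.append('^' + rem[i:])
        args.extend(additions or ())
        result.append(args)
    return result


print(ekeyword_args([(['~alpha'], None)]))      # [['^alpha']]
print(ekeyword_args([(['~hppa'], ['hppa'])]))   # [['^hppa', 'hppa']]
print(ekeyword_args([(None, ['~riscv'])]))      # [['~riscv']]
```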


-- 
Thanks,
Zac





[gentoo-portage-dev] [PATCH gentoolkit] bin: Add merge-driver-ekeyword

2020-12-28 Thread Matt Turner
Since the KEYWORDS=... assignment is a single line, git struggles to
handle conflicts. When rebasing a series of commits that modify the
KEYWORDS=... it's usually easier to throw them away and reapply on the
new tree than it is to manually handle conflicts during the rebase.

git allows a 'merge driver' program to handle conflicts; this program
handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
with these keywords:

KEYWORDS="~alpha amd64 arm arm64 ~hppa ppc ppc64 x86"

One developer drops the ~alpha keyword and pushes to gentoo.git, and
another developer stabilizes hppa. Without this merge driver, git
requires the second developer to manually resolve the conflict which is
tedious and prone to mistakes when rebasing a long series of patches.
With the custom merge driver, it automatically resolves the conflict.

To use the merge driver, configure your gentoo.git as such:

gentoo.git/.git/config:

[merge "keywords"]
name = KEYWORDS merge driver
driver = merge-driver-ekeyword %O %A %B %P

gentoo.git/.git/info/attributes:

*.ebuild merge=keywords

Signed-off-by: Matt Turner 
---
 bin/merge-driver-ekeyword | 131 ++
 1 file changed, 131 insertions(+)
 create mode 100755 bin/merge-driver-ekeyword

diff --git a/bin/merge-driver-ekeyword b/bin/merge-driver-ekeyword
new file mode 100755
index 000..2142dc8
--- /dev/null
+++ b/bin/merge-driver-ekeyword
@@ -0,0 +1,131 @@
+#!/usr/bin/python
+#
+# Copyright 2020 Gentoo Authors
+# Distributed under the terms of the GNU General Public License v2 or later
+
+"""
+Custom git merge driver for handling conflicts in KEYWORDS assignments
+
+See https://git-scm.com/docs/gitattributes#_defining_a_custom_merge_driver
+"""
+
+import difflib
+import os
+import sys
+
+from typing import List, Optional, Tuple
+
+from gentoolkit.ekeyword import ekeyword
+
+
+def keyword_array(keyword_line: str) -> List[str]:
+    # Find indices of string inside the double-quotes
+    i1: int = keyword_line.find('"') + 1
+    i2: int = keyword_line.rfind('"')
+
+    # Split into array of KEYWORDS
+    return keyword_line[i1:i2].split(' ')
+
+
+def keyword_line_changes(old: str, new: str) -> List[Tuple[Optional[str],
+                                                           Optional[str]]]:
+    a: List[str] = keyword_array(old)
+    b: List[str] = keyword_array(new)
+
+    s = difflib.SequenceMatcher(a=a, b=b)
+
+    changes = []
+    for tag, i1, i2, j1, j2 in s.get_opcodes():
+        if tag == 'replace':
+            changes.append((a[i1:i2], b[j1:j2]),)
+        elif tag == 'delete':
+            changes.append((a[i1:i2], None),)
+        elif tag == 'insert':
+            changes.append((None, b[j1:j2]),)
+        else:
+            assert tag == 'equal'
+    return changes
+
+
+def keyword_changes(ebuild1: str, ebuild2: str) -> List[Tuple[Optional[str],
+                                                              Optional[str]]]:
+    with open(ebuild1) as e1, open(ebuild2) as e2:
+        lines1 = e1.readlines()
+        lines2 = e2.readlines()
+
+    diff = difflib.unified_diff(lines1, lines2, n=0)
+    assert next(diff) == '--- \n'
+    assert next(diff) == '+++ \n'
+
+    hunk: int = 0
+    old: str = ''
+    new: str = ''
+
+    for line in diff:
+        if line.startswith('@@ '):
+            if hunk > 0:
+                break
+            hunk += 1
+        elif line.startswith('-'):
+            if old or new:
+                break
+            old = line
+        elif line.startswith('+'):
+            if not old or new:
+                break
+            new = line
+    else:
+        if 'KEYWORDS=' in old and 'KEYWORDS=' in new:
+            return keyword_line_changes(old, new)
+    return None
+
+
+def apply_keyword_changes(ebuild: str, pathname: str,
+                          changes: List[Tuple[Optional[str],
+                                              Optional[str]]]) -> int:
+    result: int = 0
+
+    # ekeyword will only modify files named *.ebuild, so make a symlink
+    ebuild_symlink: str = os.path.basename(pathname)
+    os.symlink(ebuild, ebuild_symlink)
+
+    for removals, additions in changes:
+        args = []
+        for rem in removals:
+            # Drop leading '~' and '-' characters and prepend '^'
+            i = 1 if rem[0] in ('~', '-') else 0
+            args.append('^' + rem[i:])
+        if additions:
+            args.extend(additions)
+        args.append(ebuild_symlink)
+
+        result = ekeyword.main(args)
+        if result != 0:
+            break
+
+    os.remove(ebuild_symlink)
+    return result
+
+
+def main(argv):
+    if len(argv) != 5:
+        sys.exit(-1)
+
+    O = argv[1] # %O - filename of original
+    A = argv[2] # %A - filename of our current version
+    B = argv[3] # %B - filename of the other branch's version
+    P =