Re: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Bruno Haible
Collin Funk wrote:
> With open(), using binary mode with encoding='utf-8' causes a failure:
> 
>  with open('test.txt', 'wb', encoding='utf-8') as file:
>      file.write('abc')
> 
>  Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>  ValueError: binary mode doesn't take an encoding argument

Oops, you're right. My mistake. And likewise for 'rb'.

> # Write files with '\n' as newline character.
> with open('file.txt', 'w', encoding='utf-8', newline='\n') as file:

Yes, you're right. I've committed a fix now. Sorry.

> From the documentation from open, it seems the best way to deal with
> this is for reading files [2]:
> 
>  # Accepts '\n', '\r', '\r\n' as newline.
>  with open('file.txt', 'r', encoding='utf-8') as file:
>   data = file.read()

They can recommend it. But what we want here is to recognize Unix
newlines, not macOS 9 newlines or DOS/Windows newlines. It should
behave like gnulib-tool.sh, and thus newline='\n' is appropriate here.
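
A minimal sketch of the difference when reading, assuming a hypothetical
sample file that mixes all three newline styles. With the default
(newline=None), every style is translated to '\n'. With newline='\n',
only '\n' terminates a line and '\r' characters are passed through
untouched:

```python
import os
import tempfile

# Hypothetical sample: one Unix line, one DOS line, one old Mac line.
raw = b'unix\ndos\r\nmac\rend\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'sample.txt')
    with open(path, 'wb') as file:
        file.write(raw)

    # Default (newline=None): '\r\n' and '\r' are translated to '\n'.
    with open(path, 'r', encoding='utf-8') as file:
        universal_lines = file.readlines()

    # newline='\n': only '\n' ends a line, and nothing is translated.
    with open(path, 'r', encoding='utf-8', newline='\n') as file:
        unix_lines = file.readlines()

print(universal_lines)
print(unix_lines)
```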

Bruno






Re: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Collin Funk
On 3/27/24 6:55 PM, Collin Funk wrote:
> [1] https://docs.python.org/3/library/functions.html#open

This link was supposed to be:

https://docs.python.org/3/library/codecs.html#codecs.unregister

Can't type today I guess...

Collin



Re: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Collin Funk
On 3/27/24 6:24 PM, Bruno Haible wrote:
> Thanks! Applied, with one tweak: Let's continue to use 'rb' and 'wb' as
> file open() modes, not 'r' and 'w'. If gnulib-tool ever gets used on
> Windows, we don't want the trouble caused by Windows CRLF newlines.
> We want all generated files to use Unix LF newlines. (Some of the
> constants.nlconvert nonsense will have to go away as well.)

Oops, I've been using the standard open() function since I'm not
too familiar with the 'codecs' module. I believe they work a bit
differently.

With open(), using binary mode with encoding='utf-8' causes a failure:

 with open('test.txt', 'wb', encoding='utf-8') as file:
     file.write('abc')

 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 ValueError: binary mode doesn't take an encoding argument
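
For comparison, here is a small sketch of what plain open() does accept
in binary mode: the caller encodes and decodes the text explicitly. The
file name and contents are only illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.txt')

    # Binary mode combined with an encoding argument is rejected.
    try:
        open(path, 'wb', encoding='utf-8')
    except ValueError as error:
        message = str(error)

    # Binary mode works when the caller encodes the text itself.
    with open(path, 'wb') as file:
        file.write('abc\n'.encode('utf-8'))

    # Reading back in binary mode returns bytes, decoded here by hand.
    with open(path, 'rb') as file:
        data = file.read().decode('utf-8')

print(message)
print(repr(data))
```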

I pass encoding='utf-8' explicitly because the default value of the
encoding argument is None. I assume in that case the choice is left up
to the operating system. I know previous versions of Windows liked
UTF-16, but maybe it is different now.

The codecs module doesn't seem to have that restriction, but Python
says that the regular open() and 'io' module should be used for text
files [1].
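
As a rough illustration of that difference (not code from the patch),
codecs.open() takes a binary mode together with an encoding, and the
'\n' characters reach the file unchanged:

```python
import codecs
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.txt')

    # codecs.open() accepts a binary mode together with an encoding.
    with codecs.open(path, 'wb', 'utf-8') as file:
        file.write('abc\n')

    # Inspect the bytes on disk: the newline was written as-is.
    with open(path, 'rb') as file:
        raw = file.read()

print(raw)
```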

From the documentation for open(), this seems to be the best way to
read files [2]:

 # Accepts '\n', '\r', '\r\n' as newline.
 with open('file.txt', 'r', encoding='utf-8') as file:
     data = file.read()

And then for writing files:

 # Write files with '\n' as newline character.
 with open('file.txt', 'w', encoding='utf-8', newline='\n') as file:
     file.write(data)
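
Put together, a self-contained sketch of the round trip looks like this.
The file names and the sample contents are made up for the example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    source = os.path.join(tmpdir, 'input.txt')
    target = os.path.join(tmpdir, 'output.txt')

    # Hypothetical input containing one DOS newline and one Unix newline.
    with open(source, 'wb') as file:
        file.write(b'first\r\nsecond\n')

    # Reading in text mode translates '\r\n' to '\n'.
    with open(source, 'r', encoding='utf-8') as file:
        data = file.read()

    # Writing with newline='\n' emits '\n' unchanged on every platform.
    with open(target, 'w', encoding='utf-8', newline='\n') as file:
        file.write(data)

    with open(target, 'rb') as file:
        written = file.read()

print(repr(data))
print(written)
```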

These changes are pretty simple though. I can get a Windows virtual
machine running at some point to test these changes.

[1] https://docs.python.org/3/library/functions.html#open
[2] https://docs.python.org/3/library/functions.html#open

Collin



Re: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Bruno Haible
I updated the gnulib-tool.py.TODO file again.
The other 'sed' invocation can stay, since it occurs only once per
gnulib-tool.py invocation.

Bruno






Re: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Bruno Haible
Hi Collin,

> Here is a fixed version, sorry.

Thanks! Applied, with one tweak: Let's continue to use 'rb' and 'wb' as
file open() modes, not 'r' and 'w'. If gnulib-tool ever gets used on
Windows, we don't want the trouble caused by Windows CRLF newlines.
We want all generated files to use Unix LF newlines. (Some of the
constants.nlconvert nonsense will have to go away as well.)

Bruno






Re: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Collin Funk
On 3/27/24 5:51 PM, Collin Funk wrote:
> Hi Bruno,
> 
> Here is the 'sed' inlining I mentioned earlier.

Oops, I accidentally sent the patch with this:

 # Determine script to apply to library files that go into $testsbase/.
-sed_transform_testsrelated_lib_file = sed_transform_lib_file
+sed_transform_testsrelated_lib_file = None

Here is a fixed version, sorry.

Collin

From 83fe701bf90750f0743e0e6677489431b18d60b2 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Wed, 27 Mar 2024 17:39:58 -0700
Subject: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library
 files.

* pygnulib/GLFileSystem.py (GLFileAssistant.__init__): Update type hints
and docstrings to reflect changes necessary for using re.sub() instead
of 'sed'.
(GLFileAssistant.add_or_update): Use re.sub() instead of invoking 'sed'.
* pygnulib/GLImport.py (GLImport.prepare): Update transformation
variables to reflect changes to GLFileAssistant.
---
 ChangeLog| 10 ++
 gnulib-tool.py.TODO  |  1 -
 pygnulib/GLFileSystem.py | 43 +---
 pygnulib/GLImport.py |  9 -
 4 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bc31b14bd2..aa7df63120 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2024-03-27  Collin Funk  
+
+	gnulib-tool.py: Inline 'sed' invocations used on library files.
+	* pygnulib/GLFileSystem.py (GLFileAssistant.__init__): Update type hints
+	and docstrings to reflect changes necessary for using re.sub() instead
+	of 'sed'.
+	(GLFileAssistant.add_or_update): Use re.sub() instead of invoking 'sed'.
+	* pygnulib/GLImport.py (GLImport.prepare): Update transformation
+	variables to reflect changes to GLFileAssistant.
+
 2024-03-27  Bruno Haible  
 
 	obstack: Work around ICE with Oracle cc 12.6 (regr. 2023-12-01).
diff --git a/gnulib-tool.py.TODO b/gnulib-tool.py.TODO
index d7da337f20..48c9e39fa8 100644
--- a/gnulib-tool.py.TODO
+++ b/gnulib-tool.py.TODO
@@ -9,7 +9,6 @@ Optimize:
   - os.chdir around subprocess creation -> cwd=... argument instead.
   - Inline all 'sed' invocations:
 main.py:1387:args = ['sed', '-e', sed_table, tempname]
-GLFileSystem.py:382:args = ['sed', '-e', transformer]
 
 Various other refactorings, as deemed useful.
 
diff --git a/pygnulib/GLFileSystem.py b/pygnulib/GLFileSystem.py
index 0397af4df3..a155c7f0a0 100644
--- a/pygnulib/GLFileSystem.py
+++ b/pygnulib/GLFileSystem.py
@@ -19,6 +19,7 @@ from __future__ import annotations
 # Define global imports
 #===
 import os
+import re
 import codecs
 import filecmp
 import subprocess as sp
@@ -163,8 +164,13 @@ class GLFileSystem(object):
 class GLFileAssistant(object):
     '''GLFileAssistant is used to help with file processing.'''
 
-    def __init__(self, config: GLConfig, transformers: dict = dict()):
-        '''Create GLFileAssistant instance.'''
+    def __init__(self, config: GLConfig, transformers: dict[str, tuple[re.Pattern, str] | None] = {}) -> None:
+        '''Create GLFileAssistant instance.
+
+        config stores information shared between classes.
+        transformers is a dictionary which uses a file category as the key. The
+          value accessed is a tuple containing arguments for re.sub() or None if
+          no transformations are needed.'''
         if type(config) is not GLConfig:
             raise TypeError('config must be a GLConfig, not %s'
                             % type(config).__name__)
@@ -173,11 +179,11 @@ class GLFileAssistant(object):
                             % type(transformers).__name__)
         for key in ['lib', 'aux', 'main', 'tests']:
             if key not in transformers:
-                transformers[key] = 's,x,x,'
+                transformers[key] = None
             else:  # if key in transformers
                 value = transformers[key]
-                if type(value) is not str:
-                    raise TypeError('transformers[%s] must be a string, not %s'
+                if type(value) is not tuple and value != None:
+                    raise TypeError('transformers[%s] must be a tuple or None, not %s'
                                     % (key, type(value).__name__))
         self.original = None
         self.rewritten = None
@@ -341,10 +347,10 @@ class GLFileAssistant(object):
             xoriginal = substart('tests=lib/', 'lib/', original)
         lookedup, tmpflag = self.filesystem.lookup(xoriginal)
         tmpfile = self.tmpfilename(rewritten)
-        sed_transform_lib_file = self.transformers.get('lib', '')
-        sed_transform_build_aux_file = self.transformers.get('aux', '')
-        sed_transform_main_lib_file = self.transformers.get('main', '')
-        sed_transform_testsrelated_lib_file = self.transformers.get('tests', '')
+        sed_transform_lib_file = self.transf

[PATCH] gnulib-tool.py: Inline 'sed' invocations used on library files.

2024-03-27 Thread Collin Funk
Hi Bruno,

Here is the 'sed' inlining I mentioned earlier.

While adding the type hints I noticed that some docstrings are missing
or are out of date. Some my fault, some not. :)

I'll work on updating them as I make changes. It is easier to do that
than go out of my way to fix them all at once.

To test this change you can use the coreutils import test case. It
uses the config-h module for lib/vfprintf.c and friends.
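
For readers skimming the thread, here is a rough sketch of the idea
behind the patch, not the patch itself. Each file category maps either
to None or to a (compiled pattern, replacement) tuple that is handed to
re.sub(). The pattern, the apply_transformer() helper and the sample
text below are illustrative assumptions.

```python
import re

# Hypothetical transformer table, keyed by file category.  Each value is
# either None (no transformation) or (compiled pattern, replacement).
transformers = {
    'lib': (re.compile(r'^#ifdef[\t ]*HAVE_CONFIG_H[\t ]*$', re.MULTILINE),
            '#if 1'),
    'aux': None,
    'main': None,
    'tests': None,
}


def apply_transformer(category, text):
    '''Return text after applying the transformer for category, if any.'''
    transformer = transformers.get(category)
    if transformer is None:
        return text
    pattern, replacement = transformer
    return pattern.sub(replacement, text)


# Illustrative file contents, not taken from gnulib.
source = '#ifdef HAVE_CONFIG_H\n# include <config.h>\n#endif\n'
print(apply_transformer('lib', source))
print(apply_transformer('aux', source))
```

One design point worth noting: 'sed' applies its script line by line,
while re.sub() sees the whole file contents as one string, so a pattern
anchored with ^ and $ needs re.MULTILINE to behave the same way.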

Collin

From 639d368ed6378e71a5a5478e04c2422239287ad2 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Wed, 27 Mar 2024 17:39:58 -0700
Subject: [PATCH] gnulib-tool.py: Inline 'sed' invocations used on library
 files.

* pygnulib/GLFileSystem.py (GLFileAssistant.__init__): Update type hints
and docstrings to reflect changes necessary for using re.sub() instead
of 'sed'.
(GLFileAssistant.add_or_update): Use re.sub() instead of invoking 'sed'.
* pygnulib/GLImport.py (GLImport.prepare): Update transformation
variables to reflect changes to GLFileAssistant.
---
 ChangeLog| 10 ++
 gnulib-tool.py.TODO  |  1 -
 pygnulib/GLFileSystem.py | 43 +---
 pygnulib/GLImport.py | 11 +-
 4 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bc31b14bd2..aa7df63120 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2024-03-27  Collin Funk  
+
+	gnulib-tool.py: Inline 'sed' invocations used on library files.
+	* pygnulib/GLFileSystem.py (GLFileAssistant.__init__): Update type hints
+	and docstrings to reflect changes necessary for using re.sub() instead
+	of 'sed'.
+	(GLFileAssistant.add_or_update): Use re.sub() instead of invoking 'sed'.
+	* pygnulib/GLImport.py (GLImport.prepare): Update transformation
+	variables to reflect changes to GLFileAssistant.
+
 2024-03-27  Bruno Haible  
 
 	obstack: Work around ICE with Oracle cc 12.6 (regr. 2023-12-01).
diff --git a/gnulib-tool.py.TODO b/gnulib-tool.py.TODO
index d7da337f20..48c9e39fa8 100644
--- a/gnulib-tool.py.TODO
+++ b/gnulib-tool.py.TODO
@@ -9,7 +9,6 @@ Optimize:
   - os.chdir around subprocess creation -> cwd=... argument instead.
   - Inline all 'sed' invocations:
 main.py:1387:args = ['sed', '-e', sed_table, tempname]
-GLFileSystem.py:382:args = ['sed', '-e', transformer]
 
 Various other refactorings, as deemed useful.
 
diff --git a/pygnulib/GLFileSystem.py b/pygnulib/GLFileSystem.py
index 0397af4df3..a155c7f0a0 100644
--- a/pygnulib/GLFileSystem.py
+++ b/pygnulib/GLFileSystem.py
@@ -19,6 +19,7 @@ from __future__ import annotations
 # Define global imports
 #===
 import os
+import re
 import codecs
 import filecmp
 import subprocess as sp
@@ -163,8 +164,13 @@ class GLFileSystem(object):
 class GLFileAssistant(object):
     '''GLFileAssistant is used to help with file processing.'''
 
-    def __init__(self, config: GLConfig, transformers: dict = dict()):
-        '''Create GLFileAssistant instance.'''
+    def __init__(self, config: GLConfig, transformers: dict[str, tuple[re.Pattern, str] | None] = {}) -> None:
+        '''Create GLFileAssistant instance.
+
+        config stores information shared between classes.
+        transformers is a dictionary which uses a file category as the key. The
+          value accessed is a tuple containing arguments for re.sub() or None if
+          no transformations are needed.'''
         if type(config) is not GLConfig:
             raise TypeError('config must be a GLConfig, not %s'
                             % type(config).__name__)
@@ -173,11 +179,11 @@ class GLFileAssistant(object):
                             % type(transformers).__name__)
         for key in ['lib', 'aux', 'main', 'tests']:
             if key not in transformers:
-                transformers[key] = 's,x,x,'
+                transformers[key] = None
             else:  # if key in transformers
                 value = transformers[key]
-                if type(value) is not str:
-                    raise TypeError('transformers[%s] must be a string, not %s'
+                if type(value) is not tuple and value != None:
+                    raise TypeError('transformers[%s] must be a tuple or None, not %s'
                                     % (key, type(value).__name__))
         self.original = None
         self.rewritten = None
@@ -341,10 +347,10 @@ class GLFileAssistant(object):
             xoriginal = substart('tests=lib/', 'lib/', original)
         lookedup, tmpflag = self.filesystem.lookup(xoriginal)
         tmpfile = self.tmpfilename(rewritten)
-        sed_transform_lib_file = self.transformers.get('lib', '')
-        sed_transform_build_aux_file = self.transformers.get('aux', '')
-        sed_transform_main_lib_file = self.transformers.get('main', '')
-        sed_transform_testsrelated_lib_file = self.transformers.get('tests', '')
+        sed_transform_lib_file = self.transforme