Re: gnulib-tool.py: Handle absolute path checks consistently.

2024-06-09 Thread Bruno Haible
Collin Funk wrote:
> Lets just use os.path.isabs() and let it deal with
> platform related stuff.

OK.

I assume that if someone uses gnulib-tool with Cygwin, and has a
native Windows Python installation, they will report problems. Then
we can decide what to do. (Probably there are few people who do this.
Nowadays, WSL is the faster way to run shell scripts on Windows.)

Bruno






Re: gnulib-tool.py: Refactor duplicated regular expressions.

2024-06-03 Thread Bruno Haible
Collin Funk wrote:
> Good point. I've pushed this patch adding a set() around that one call.

Thanks.

> This comment is interesting:
> 
>   # We need NO_EXPENSIVE_WARN_CFLAGS, because with gcc 13, the compilation of
>   # libxml/parser.c requires 1.9 GB of RAM with -fanalyzer, as opposed to
>   # 0.09 GB of RAM with -fno-analyzer.
> 
> I noticed that -fanalyzer is slow on files with many lines (e.g.
> vasnprintf.c, many Emacs sources) but I never looked into memory usage.

I never deliberately looked into memory usage. But since many of my VMs
have 512 MB or 1 GB of RAM, a compiler that needs 1.9 GB of RAM became a
blocking issue :)

Bruno






Re: gnulib-tool.py: Refactor duplicated regular expressions.

2024-06-03 Thread Collin Funk
Bruno Haible  writes:

>> The lists there seem small enough that I doubt it matters too much.
>
> Bad excuse :) You haven't seen this module yet:
> https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gnulib-local/modules/libxml;h=dc6404c59362e5d17885a211a61a4eed99da6642;hb=HEAD#l130

Hahaha, I concede my point then. :)

This comment is interesting:

  # We need NO_EXPENSIVE_WARN_CFLAGS, because with gcc 13, the compilation of
  # libxml/parser.c requires 1.9 GB of RAM with -fanalyzer, as opposed to
  # 0.09 GB of RAM with -fno-analyzer.

I noticed that -fanalyzer is slow on files with many lines (e.g.
vasnprintf.c, many Emacs sources) but I never looked into memory usage.

>> Perhaps a patch like this is a better fix then just wrapping the call in
>> set() there:
>
> Although it would work to make _extract_lib_SOURCES return a set (since
> for both callers, the order of the result is irrelevant), it is more
> maintainable to let _extract_lib_SOURCES return a list and let each caller
> decide whether they need a set(...) conversion or not. This way, if a
> third caller is added in the future, you don't need to undo the optimization
> in _extract_lib_SOURCES.

Good point. I've pushed this patch adding a set() around that one call.

Collin

>From c1cdc5ef8ef8b22d46bf55d8cb992f0b75532b46 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Mon, 3 Jun 2024 04:57:14 -0700
Subject: [PATCH] gnulib-tool.py: Use a set to optimize.

* pygnulib/GLModuleSystem.py
(GLModule.getAutomakeSnippet_Unconditional): Call set() on the result of
_extract_lib_SOURCES() to ensure computing the difference between
another set is O(n).
---
 ChangeLog  | 8 
 pygnulib/GLModuleSystem.py | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 40238433eb..3e1e553dc4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-06-03  Collin Funk  
+
+	gnulib-tool.py: Use a set to optimize.
+	* pygnulib/GLModuleSystem.py
+	(GLModule.getAutomakeSnippet_Unconditional): Call set() on the result of
+	_extract_lib_SOURCES() to ensure computing the difference between
+	another set is O(n).
+
 2024-06-03  Bruno Haible  
 
 	pthread-* tests, regex tests: Prepare for use of 'alarm'.
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 5d67c5b6df..02dacfcf9f 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -587,7 +587,7 @@ def getAutomakeSnippet_Unconditional(self) -> str:
 snippet = self.getAutomakeSnippet_Conditional()
 snippet = combine_lines(snippet)
 # Get all the file names from 'lib_SOURCES += ...'.
-mentioned_files = _extract_lib_SOURCES(snippet)
+mentioned_files = set(_extract_lib_SOURCES(snippet))
 all_files = self.getFiles()
 lib_files = filter_filelist('\n', all_files,
 'lib/', '', 'lib/', '')
-- 
2.45.1



Re: gnulib-tool.py: Refactor duplicated regular expressions.

2024-06-03 Thread Bruno Haible
Hi Collin,

> Oops yes. Good point, I must have overlooked that. It appears that it
> will be O(n²). In this loop [1] and the else branch of
> "PyAnySet_Check(other)" [2].

Thanks for checking.

> The lists there seem small enough that I doubt it matters too much.

Bad excuse :) You haven't seen this module yet:
https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gnulib-local/modules/libxml;h=dc6404c59362e5d17885a211a61a4eed99da6642;hb=HEAD#l130

> Perhaps a patch like this is a better fix then just wrapping the call in
> set() there:

Although it would work to make _extract_lib_SOURCES return a set (since
for both callers, the order of the result is irrelevant), it is more
maintainable to let _extract_lib_SOURCES return a list and let each caller
decide whether they need a set(...) conversion or not. This way, if a
third caller is added in the future, you don't need to undo the optimization
in _extract_lib_SOURCES.

Bruno






Re: gnulib-tool.py: Refactor duplicated regular expressions.

2024-06-03 Thread Collin Funk
Hi Bruno,

Bruno Haible  writes:

> The variable mentioned_files in pygnulib/GLModuleSystem.py line 599
> was a set and is now a list. Is the expression
>   lib_files.difference(mentioned_files)
> therefore being evaluated more slowly (as O(n²) where it was O(n) before)?
> That is, does Python iterate through mentioned_files for each element of
> lib_files, or does Python convert the argument mentioned_files internally
> to a set first?

Oops yes. Good point, I must have overlooked that. It appears that it
will be O(n²). In this loop [1] and the else branch of
"PyAnySet_Check(other)" [2].

The lists there seem small enough that I doubt it matters too much.
Perhaps a patch like this is a better fix then just wrapping the call in
set() there:

diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 5d67c5b6df..b91406f564 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -45,14 +45,14 @@
 _LIB_SOURCES_PATTERN = re.compile(r'^lib_SOURCES[\t ]*\+=[\t ]*(.+?)[\t 
]*(?:#|$)', re.MULTILINE)
 
 
-def _extract_lib_SOURCES(snippet: str) -> list[str]:
+def _extract_lib_SOURCES(snippet: str) -> set[str]:
 '''Return the file names specified in the lib_SOURCES variable of
 a Makefile.am snippet.'''
 lines = [ line.group(1)
   for line in re.finditer(_LIB_SOURCES_PATTERN, snippet) ]
-return [ file_name
+return { file_name
  for line in lines
- for file_name in line.split() ]
+ for file_name in line.split() }


I want to move some other file list stuff to use sets
(GLModuleSystem.filelist particularly) since membership checks are
performed on them and there is many unnecessary sorted() calls that can
be cleaned up.

Sets would work better for GLModule objects too but sorting matters for
them because 'dummy' is always placed last. Sort of strange behavior
that is necessary to match gnulib-tool.sh output in Makefile.am. :)

Collin

[1] 
https://github.com/python/cpython/blob/059be67b51717519609b29c53bf742ca4d91b68f/Objects/setobject.c#L1576
[2] 
https://github.com/python/cpython/blob/059be67b51717519609b29c53bf742ca4d91b68f/Objects/setobject.c#L1430



Re: gnulib-tool.py: Refactor duplicated regular expressions.

2024-06-03 Thread Bruno Haible
Hi Collin,

> Indirectly this also fixes two issues. The first is the same issue as
> yesterdays patch. Variables with a trailing comment, e.g.
> 
> lib_SOURCES += file.c file.h # comment
> 
> are not matched. As far as I can tell none of the module descriptions
> have this comment style but gnulib-tool.sh handles it (line 3453-3466).
> So gnulib-tool.py should too.

Thanks! Yes, gnulib-tool should handle this.

The variable mentioned_files in pygnulib/GLModuleSystem.py line 599
was a set and is now a list. Is the expression
  lib_files.difference(mentioned_files)
therefore being evaluated more slowly (as O(n²) where it was O(n) before)?
That is, does Python iterate through mentioned_files for each element of
lib_files, or does Python convert the argument mentioned_files internally
to a set first?

Bruno






Re: gnulib-tool.py: Fix crash when no ACLOCAL_AMFLAGS is found.

2024-06-02 Thread Bruno Haible
Collin Funk wrote:
> Thanks, I've applied the attached patch fixing that regular expression.

Thanks. Now it appears to work fine.

Bruno






Re: gnulib-tool.py: Fix crash when no ACLOCAL_AMFLAGS is found.

2024-06-02 Thread Collin Funk
Bruno Haible  writes:

>> > Thanks for this recipe. I think this is worth a unit test, since it
>> > is a special case that is otherwise not tested.
>> 
>> Agreed. I can add one later today.
>
> I added it already.

Thanks, I've applied the attached patch fixing that regular expression.

Also, Paul while investigating this I noticed some strange behavior. The
Autoconf documentation mentions the limit to a single line for
ACLOCAL_AMFLAGS. Perhaps it is worth changing this regexp [1] or
mentioning this in the documentation [2].

Here is a test case:

   $ gnulib-tool --create-testdir --dir testdir1 README-release
   $ cd testdir1
   $ sed -i -e 's/ACLOCAL_AMFLAGS = -I glm4/ACLOCAL_AMFLAGS = -I glm4 # 
--help/g' Makefile.am
   $ cat Makefile.am
 [...]
 ACLOCAL_AMFLAGS = -I glm4 # --help
   $ autoreconf
 Usage: aclocal [OPTION]...

 Generate 'aclocal.m4' by scanning 'configure.ac' or 'configure.in'
 [...]

If you remove aclocal.m4 and try to autoreconf the build will fail
because m4 macros cannot be found of course.

I don't know enough about Autotools to tell if anything silly can be
done using this trick. Not that you would run auto* in a project that
you don't trust anyways...

Collin

[1] 
https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.72/autoconf.html#index-AC_005fCONFIG_005fMACRO_005fDIR-1
[2] https://git.savannah.gnu.org/cgit/autoconf.git/tree/bin/autoreconf.in#n532

>From 1372dc1631053fa72bbf5755cbc89f63cbffec33 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Sun, 2 Jun 2024 05:29:33 -0700
Subject: [PATCH] gnulib-tool.py: Fix regular expression (regr. today).

* pygnulib/main.py (main) [import]: Match all characters until '#' or
end of line, whichever comes first.
---
 ChangeLog| 4 
 pygnulib/main.py | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 3a982c4988..f076c168b0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2024-06-02  Collin Funk  
 
+	gnulib-tool.py: Fix regular expression (regr. today).
+	* pygnulib/main.py (main) [import]: Match all characters until '#' or
+	end of line, whichever comes first.
+
 	gnulib-tool.py: Fix crash when no ACLOCAL_AMFLAGS is found.
 	* pygnulib/main.py (main) [import]: Use a regular expression to match
 	the ACLOCAL_AMFLAGS Makefile.am variable. Properly handle the case where
diff --git a/pygnulib/main.py b/pygnulib/main.py
index 6167135858..c1c2cd2c1e 100644
--- a/pygnulib/main.py
+++ b/pygnulib/main.py
@@ -987,7 +987,7 @@ def main(temp_directory: str) -> None:
 if os.path.isfile(filepath):
 with open(filepath, mode='r', newline='\n', encoding='utf-8') as file:
 data = file.read()
-pattern = re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*([^#]+?)$', re.MULTILINE)
+pattern = re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*(.+?)[\t ]*(?:#|$)', re.MULTILINE)
 match = re.search(pattern, data)
 if match:
 aclocal_amflags = match.group(1).split()
-- 
2.45.1



Re: gnulib-tool.py: Fix crash when no ACLOCAL_AMFLAGS is found.

2024-06-02 Thread Bruno Haible
Hi Collin,

> > Thanks for this recipe. I think this is worth a unit test, since it
> > is a special case that is otherwise not tested.
> 
> Agreed. I can add one later today.

I added it already.

Bruno






Re: gnulib-tool.py: Fix crash when no ACLOCAL_AMFLAGS is found.

2024-06-02 Thread Collin Funk
Hi Bruno,

Bruno Haible  writes:

> Thanks for this recipe. I think this is worth a unit test, since it
> is a special case that is otherwise not tested.

Agreed. I can add one later today.

> The regular expression doesn't seem right:
[...]
> The point of using a non-greedy operator is to not include newlines
> in the match, right? So,
>   re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*([^#]*)', re.MULTILINE)
> wouldn't work?

Oops, good catch. I was trying to be clever and failed. :)

Here is something like what I was trying to do:

===
import re
string = """
ACLOCAL_AMFLAGS =
ACLOCAL_AMFLAGS = -I m4 -I glm4 # ignore this please
"""
pattern = re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*(.+?)[\t ]*(?:#|$)', 
re.MULTILINE)
print('"' + re.search(pattern, string).group(1) + '"')
===

Not sure if Makefile.am's are written like that in practice but here is
the output:

$ ./main.py 
"-I m4 -I glm4"

Collin



Re: gnulib-tool.py: Fix crash when no ACLOCAL_AMFLAGS is found.

2024-06-02 Thread Bruno Haible
Hi Collin,

> Not sure how this didn't get found sooner, to be honest. Here is an
> example of the bug:
> 
> $ git clone https://git.savannah.gnu.org/git/rcs.git
> $ cd rcs
> $ git checkout next
> $ sh -x autogen.sh

Thanks for this recipe. I think this is worth a unit test, since it
is a special case that is otherwise not tested.

> I've applied this patch fixing it and confirmed it passes with
> GNULIB_TOOL_IMPL=sh+py.

The regular expression doesn't seem right:

>>> import re
>>> re.search(re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*([^#]+?)$', 
>>> re.MULTILINE), '')
>>> re.search(re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*([^#]+?)$', 
>>> re.MULTILINE), 'ACLOCAL_AMFLAGS = -I m4')

>>> re.search(re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*([^#]+?)$', 
>>> re.MULTILINE), 'ACLOCAL_AMFLAGS = -I m4 # yes')

The point of using a non-greedy operator is to not include newlines
in the match, right? So,
  re.compile(r'^ACLOCAL_AMFLAGS[\t ]*=[\t ]*([^#]*)', re.MULTILINE)
wouldn't work?

Bruno






Re: gnulib-tool.py: Fix behavior of --test when a subprocess fails.

2024-05-08 Thread Bruno Haible
Hi Collin,

> I've pushed the attached patch. In gnulib-tool.sh the following is
> done:
> 
> ../configure || func_exit 1
> $MAKE || func_exit 1
> [...]
> 
> So here we can just use sp.run([command, arg], check=True) and exit if
> an exception occurs.

Thanks. I confirm that --test now behaves as expected.

But --megatest has the same problem. It was apparently overlooked in
commit 42a01df09d7f13f15a54e0c4b09d121c5bf7ac66, in 2006. The default
should be to reflect errors in the exit code. If a user wants to ignore
the build errors, they can 'cd' into the directory and run "make -k" or
"make -k check". Done through the patch below.

> As a side note, gradually I'll start changing sp.call(),
> sp.check_call(), etc. to sp.run(). Under the hood they all call the
> same internal functions, but the latter is newer and "recommended"
> [1].
> 
> I always forget how that module works and just using the recommended
> sp.run() makes scanning the documentation easier.
> 
> [1] 
> https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module

Sounds good.


2024-05-08  Bruno Haible  

gnulib-tool: In --megatestdir mode, stop when there is an error.
* gnulib-tool.sh (megatest): Fail when one of the 'configure' or 'make'
steps fails.
* pygnulib/main.py (main): Likewise.

diff --git a/gnulib-tool.sh b/gnulib-tool.sh
index 521d16e47a..b5aadcaeaa 100755
--- a/gnulib-tool.sh
+++ b/gnulib-tool.sh
@@ -7501,10 +7501,10 @@ s/\([.*$]\)/[\1]/g'
 cd "$destdir"
   mkdir build
   cd build
-../configure
-$MAKE
-$MAKE check
-$MAKE distclean
+../configure || func_exit 1
+$MAKE || func_exit 1
+$MAKE check || func_exit 1
+$MAKE distclean || func_exit 1
 remaining=`find . -type f -print`
 if test -n "$remaining"; then
   echo "Remaining files:" $remaining 1>&2
diff --git a/pygnulib/main.py b/pygnulib/main.py
index 128f25f9c1..8462916653 100644
--- a/pygnulib/main.py
+++ b/pygnulib/main.py
@@ -1130,10 +1130,13 @@ def main(temp_directory: str) -> None:
 os.chdir(destdir)
 os.mkdir('build')
 os.chdir('build')
-sp.call(['../configure'])
-sp.call([UTILS['make']])
-sp.call([UTILS['make'], 'check'])
-sp.call([UTILS['make'], 'distclean'])
+try:  # Try to execute commands
+sp.run(['../configure'], check=True)
+sp.run([UTILS['make']], check=True)
+sp.run([UTILS['make'], 'check'], check=True)
+sp.run([UTILS['make'], 'distclean'], check=True)
+except Exception:
+sys.exit(1)
 args = ['find', '.', '-type', 'f', '-print']
 remaining = sp.check_output(args).decode(ENCS['shell'])
 lines = [ line.strip()






Re: gnulib-tool.py: Fix an undefined function name.

2024-05-07 Thread Bruno Haible
Hi Collin,

> > Hmm. Is that refactoring, that you are imagining, intrusive? ...
> 
> Depends on the specific warning being addressed. :)
> 
> There are more trivial cases like GLMakeFileTable which stores
> information as a dictionary. We have a table of makefile variables:
> 
>  table: list[dict[str, str | bool]]
>  # Each 'dict' has the following 'key': type.
>  { 'dir': str, 'var': str, 'dotfirst': bool }
> 
> Using a value from a key at this will lead to a type error unless you
> "narrow" the union [1]:
> 
>if type(makefiletable['dir']) is str:
>   makefiletable.split('/') # Would not work on 'bool'.
> 
> Me and you can tell the use of makefiletable['dir'] will return a str,
> but the type checker can't. Using a class like this:
> 
> class GLMakeVar:
> dir: str
> var: str
> dotfirst: bool
> 
> Would be type-checkable and would probably be easier for new
> contributors to understand, in my opinion.

I agree. OK for this refactoring.

> Other cases are less trivial and bring up situations which may cause
> errors. For example there are many warnings about "list[GLModule]"
> actually being "list[GLModule | None]".
> 
> This is caused by functions like GLModule.getDependenciesWithoutConditions():
> 
>  def getDependenciesWithoutConditions(self) -> list[GLModule | None]:
>  # Read the "Depends-on" and find each module name.
>  # Run self.modulesystem.find() to get the GLModule objects.
>  # Return the list of dependencies.
> 
> Remember that modulesystem.find() returns None with a warning (or
> error depending on arguments) if the module cannot be found. Therefore,
> this function should probably filter out None in the return value.

Right. It makes no sense for the returned list to contain None values.

Bruno






Re: gnulib-tool.py speeds up continuous integrations

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
I forgot to mention: the pattern to provide re-usable GitLab CI/CD
definitions that I'm inspired by is Debian's pipeline project:

https://salsa.debian.org/salsa-ci-team/pipeline/

It is easy to setup a new project to use their reusable pipeline -- just
add the CI/CD configuration file setting pointing to their job file --
and gives a broad configurable and tweakable pipeline.  Of course, this
is only for building Debian packages, so it is a narrow focus.

I'm thinking we could do the same but for any project using gnulib.
Within some reasonable limit and assumptions, but the majority of
projects mentioned already are similar enough for this to be possible
relatively easily.

I'm thinking it should be sufficient to add gnu-ci.yml@gnulib/pipeline
(or similar) as a CI/CD configuration file setting to achieve this.

/Simon


signature.asc
Description: PGP signature


Re: gnulib-tool.py speeds up continuous integrations

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> gnulib-tool is used is many CI jobs. Just adding 'python3' to the
> prerequisites of such a job makes it run faster. Here are the execution
> times for a single run, before and after adding 'python3', for those
> CIs that I maintain or co-maintain. In minutes and seconds.
>
>   Before   After
>
> https://gitlab.com/gnulib/gnulib-ci/-/pipelines   30:  11:
> https://gitlab.com/gnu-gettext/ci-distcheck/-/pipelines   36:  32:
> https://gitlab.com/gnu-poke/ci-distcheck/-/pipelines  18:4018:24
> https://gitlab.com/gnu-libunistring/ci-distcheck/-/pipelines  11:2509:16
> https://gitlab.com/gnu-diffutils/ci-distcheck/-/pipelines 07:2106:27
> https://gitlab.com/gnu-grep/ci-distcheck/-/pipelines  06:5106:08
> https://gitlab.com/gnu-m4/ci-distcheck/-/pipelines06:4605:44
> https://gitlab.com/gnu-sed/ci-distcheck/-/pipelines   05:2804:39
> https://gitlab.com/gnu-gzip/ci-distcheck/-/pipelines  04:1603:58
> https://gitlab.com/gnu-libffcall/ci-distcheck/-/pipelines 01:5001:42
> https://gitlab.com/gnu-libsigsegv/ci-distcheck/-/pipelines00:4500:42

These are useful pipelines with basic build testing!  I help on a bunch
of others below, to get broader OS/architecture-compatibility testing.

https://gitlab.com/gsasl/inetutils/-/pipelines
https://gitlab.com/gsasl/gsasl/-/pipelines
https://gitlab.com/gsasl/shishi/-/pipelines
https://gitlab.com/gsasl/gss/-/pipelines
https://gitlab.com/libidn/libidn2/-/pipelines
https://gitlab.com/libidn/libidn/-/pipelines
https://gitlab.com/gnutls/libtasn1/-/pipelines

I think the pattern of having the .gitlab-ci.yml outside of the core
Savannah project is a good one for those GNU projects who are not
embracing the GitLab platform.  Then GitLab-related stuff stays on the
GitLab platform and doesn't invade the core project.

Would it make sense to collaborate on re-usable GitLab CI/CD pipeline
definitions in a single GitLab project?  Then we can apply that group
for free CI/CD minutes and get testing on macOS/Windows too.  I have a
shared GitLab runner for native arm64 and ppc64el building, and have
wanted to setup NetBSD/OpenBSD/FreeBSD/etc GitLab runners too.  Adding
runners to a group is easy, adding it to multiple groups require some
manual work and added resources on the runner.

How about using https://gitlab.com/gnulib/ as a playground for these
ideas?  Then we can add sub-projects there for pipeline definitions, and
Savannah mirrors of other projects too.  If you can add 'jas' as
maintainer of the 'gnulib' group on GitLab I could add one project to
start work on writing re-usable pipeline definitions, and one example
project maybe for GNU InetUtils that would use these new re-usable
pipeline components to provide a CI/CD pipeline definition file.  I
could add some arm64/ppc64el builds of gnulib too.

/Simon


signature.asc
Description: PGP signature


Re: gnulib-tool.py: Fix an undefined function name.

2024-05-04 Thread Collin Funk
On 5/4/24 8:17 AM, Bruno Haible wrote:
>> For CI, I am thinking it is best to ignore the typing warnings for the
>> time being. Since gnulib-tool.py needs a bit more refactoring for that
>> to be helpful. It was written before typing was a thing. :)
> 
> Hmm. Is that refactoring, that you are imagining, intrusive? If it's
> intrusive, then maybe we should leave it alone. You already introduced
> types in the function signatures and in the fields of classes; what else
> is missing?

Depends on the specific warning being addressed. :)

There are more trivial cases like GLMakeFileTable which stores
information as a dictionary. We have a table of makefile variables:

 table: list[dict[str, str | bool]]
 # Each 'dict' has the following 'key': type.
 { 'dir': str, 'var': str, 'dotfirst': bool }

Using a value from a key at this will lead to a type error unless you
"narrow" the union [1]:

   if type(makefiletable['dir']) is str:
  makefiletable.split('/') # Would not work on 'bool'.

Me and you can tell the use of makefiletable['dir'] will return a str,
but the type checker can't. Using a class like this:

class GLMakeVar:
dir: str
var: str
dotfirst: bool

Would be type-checkable and would probably be easier for new
contributors to understand, in my opinion.

Other cases are less trivial and bring up situations which may cause
errors. For example there are many warnings about "list[GLModule]"
actually being "list[GLModule | None]".

This is caused by functions like GLModule.getDependenciesWithoutConditions():

 def getDependenciesWithoutConditions(self) -> list[GLModule | None]:
 # Read the "Depends-on" and find each module name.
 # Run self.modulesystem.find() to get the GLModule objects.
 # Return the list of dependencies.

Remember that modulesystem.find() returns None with a warning (or
error depending on arguments) if the module cannot be found. Therefore,
this function should probably filter out None in the return value.

>> I'm not too familiar with the GitLab CI stuff. Would a Makefile in
>> gnulib/pygnulib with some checks work?
> 
> Yes, a Makefile in pygnulib/ would be sufficient. The top-level Makefile then
> may invoke it.

Sounds good. I'll have a look at getting started with something in a bit.

[1] 
https://mypy.readthedocs.io/en/stable/type_narrowing.html#type-narrowing-expressions

Collin



Re: gnulib-tool.py: Fix an undefined function name.

2024-05-04 Thread Bruno Haible
Collin Funk wrote:
> For CI, I am thinking it is best to ignore the typing warnings for the
> time being. Since gnulib-tool.py needs a bit more refactoring for that
> to be helpful. It was written before typing was a thing. :)

Hmm. Is that refactoring, that you are imagining, intrusive? If it's
intrusive, then maybe we should leave it alone. You already introduced
types in the function signatures and in the fields of classes; what else
is missing?

> I'm removing the 'lines_to_multiline' import from GLImport.py. We
> could have a few checks for guaranteed runtime failures like this:
> 
>   $ mypy GLImport.py | grep 'name-defined'
>   GLImport.py:722: error: Name "lines_to_multiline" is not defined  
> [name-defined]
>   GLImport.py:736: error: Name "lines_to_multiline" is not defined  
> [name-defined]
>   GLImport.py:1143: error: Name "lines_to_multiline" is not defined  
> [name-defined]
>   GLImport.py:1326: error: Name "lines_to_multiline" is not defined  
> [name-defined]
>   GLImport.py:1327: error: Name "lines_to_multiline" is not defined  
> [name-defined]
>   GLImport.py:1328: error: Name "lines_to_multiline" is not defined  
> [name-defined]

Sounds like a reasonable way to automate this check...

> I'm not too familiar with the GitLab CI stuff. Would a Makefile in
> gnulib/pygnulib with some checks work?

Yes, a Makefile in pygnulib/ would be sufficient. The top-level Makefile then
may invoke it.

Bruno






Re: gnulib-tool.py: Fix an undefined function name.

2024-05-04 Thread Collin Funk
On 5/4/24 3:34 AM, Bruno Haible wrote:
> Yes. We cannot change the past git log, but this is one of the situations
> where changing a ChangeLog entry is adequate. Other such situations are
> obvious typos and forgotten changes.

I've pushed the attached patch adding the date to the ChangeLog entry.

>>> In this case, maybe some Python style checker would have produced a
>>> warning, right? Would it be possible to add a Makefile rule that checks
>>> for the absense of such warnings, and then use this Makefile rule in the
>>> continuous integration that runs at least once a week?
>>
>> Hmm, I like that idea. I would have to find what tool works best for
>> that.
> 
> Yes, please.

Right now I am leaning towards mypy. It seems to be the best at
detecting errors/potential errors.

For CI, I am thinking it is best to ignore the typing warnings for the
time being. Since gnulib-tool.py needs a bit more refactoring for that
to be helpful. It was written before typing was a thing. :)

mypy has the following options:

 mypy --disable-error-code NAME
 mypy --enable-error-code NAME

but there is a long list of error names and I'm not sure how
compatible they are between versions.

I'm removing the 'lines_to_multiline' import from GLImport.py. We
could have a few checks for guaranteed runtime failures like this:

  $ mypy GLImport.py | grep 'name-defined'
  GLImport.py:722: error: Name "lines_to_multiline" is not defined  
[name-defined]
  GLImport.py:736: error: Name "lines_to_multiline" is not defined  
[name-defined]
  GLImport.py:1143: error: Name "lines_to_multiline" is not defined  
[name-defined]
  GLImport.py:1326: error: Name "lines_to_multiline" is not defined  
[name-defined]
  GLImport.py:1327: error: Name "lines_to_multiline" is not defined  
[name-defined]
  GLImport.py:1328: error: Name "lines_to_multiline" is not defined  
[name-defined]

I'm not too familiar with the GitLab CI stuff. Would a Makefile in
gnulib/pygnulib with some checks work? Or did you have a different idea?

I think mypy should be packed in Debian and Fedora as 'python3-mypy'.
Another method that works for Python software is pipx [1]:

$ pipx install mypy
$ readlink $HOME/.local/bin/mypy
/home/collin/.local/share/pipx/venvs/mypy/bin/mypy

Sometimes it helps for distributions with old packages. It uses a
virtual environment so you don't have to worry about breaking things
too.

[1] https://pipx.pypa.io/stable/

CollinFrom 284c0ad51e1caba277616c76c6cd14774d92e19c Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Sat, 4 May 2024 05:17:45 -0700
Subject: [PATCH] Fix ChangeLog entry for previous commit.

---
 ChangeLog | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 96b29b7fc4..6b56b35d8e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,6 @@
 2024-05-03  Collin Funk  
 
-	gnulib-tool.py: Fix an undefined function name.
+	gnulib-tool.py: Fix an undefined function name (regression 2024-05-02).
 	* pygnulib/main.py (main_with_exception_handling): Use the tempfile
 	module prefix when calling mkdtemp(). Use the 'glpy' prefix for the
 	temporary directory that exists for the entirety of the program.
-- 
2.44.0



Re: gnulib-tool.py: Fix an undefined function name.

2024-05-04 Thread Bruno Haible
Hi Collin,

> > Can you please add, to the first line of the ChangeLog entry, a note
> > that tells us when the regression was introduced? Such as
> >(regression 2024-xx-xy)
> > or
> >(regr. yesterday)
> ...
> I've never modified a ChangeLog entry before. Is the proper way to
> just modify it and commit with a one line summary?

Yes. We cannot change the past git log, but this is one of the situations
where changing a ChangeLog entry is adequate. Other such situations are
obvious typos and forgotten changes.

> > In this case, maybe some Python style checker would have produced a
> > warning, right? Would it be possible to add a Makefile rule that checks
> > for the absense of such warnings, and then use this Makefile rule in the
> > continuous integration that runs at least once a week?
> 
> Hmm, I like that idea. I would have to find what tool works best for
> that.

Yes, please.

Bruno






Re: gnulib-tool.py: Fix an undefined function name.

2024-05-03 Thread Collin Funk
On 5/3/24 6:03 PM, Bruno Haible wrote:
>> In main.py a call to mkdtemp() was the use of the 'tempfile' module
>> prefix.
> 
> Oh, that would have led to a runtime error, right?

Yes, it would only be seen if it entered the section of code where
modules have an incompatible license. So the program would fail with a
backtrace instead of a helpful error message.

> Can you please add, to the first line of the ChangeLog entry, a note
> that tells us when the regression was introduced? Such as
>(regression 2024-xx-xy)
> or
>(regr. yesterday)
> 
> The motivation for such an annotation is to get us thinking about
>   - how the regression could have been avoided,
>   - how the regression could have been noticed earlier,
>   - whether we need extra measures when we see, say, 5 regressions in a
> single week.

Oops, yes I should have done that in the commit message sorry.

I've never modified a ChangeLog entry before. Is the proper way to
just modify it and commit with a one line summary? In other words, do
I need another ChangeLog entry in that commit?

I don't want to mess it up any further...

> In this case, maybe some Python style checker would have produced a
> warning, right? Would it be possible to add a Makefile rule that checks
> for the absense of such warnings, and then use this Makefile rule in the
> continuous integration that runs at least once a week?

Hmm, I like that idea. I would have to find what tool works best for
that.

I think PyCharm didn't load the whole file all the way. It would show
a warning once you scrolled down to the bottom of the file. That also
isn't very well suited for the Makefile use.

I think mypy gives a warning about it. But that is a type checker that
produces a lot of spam at the moment. With that change reverted:

$ mypy main.py | grep 'name-defined'
main.py:1407: error: Name "mktemp" is not defined  [name-defined]

I want to gradually improve the ability of it's type checking. But
that involves making sure the code *improves* with the changes. In
some cases you can silence the warnings with worse code. :)

I'll do some research into seeing if there are tools more suited
towards that.

Collin



Re: gnulib-tool.py: Fix an undefined function name.

2024-05-03 Thread Bruno Haible
Hi Collin,

> In main.py a call to mkdtemp() was the use of the 'tempfile' module
> prefix.

Oh, that would have led to a runtime error, right?

Can you please add, to the first line of the ChangeLog entry, a note
that tells us when the regression was introduced? Such as
   (regression 2024-xx-xy)
or
   (regr. yesterday)

The motivation for such an annotation is to get us thinking about
  - how the regression could have been avoided,
  - how the regression could have been noticed earlier,
  - whether we need extra measures when we see, say, 5 regressions in a
single week.

In this case, maybe some Python style checker would have produced a
warning, right? Would it be possible to add a Makefile rule that checks
for the absense of such warnings, and then use this Makefile rule in the
continuous integration that runs at least once a week?

> I put back the 'glpy' prefix.

Good.

Bruno






Re: gnulib-tool.py: Don't leave temporary directories on exit.

2024-05-02 Thread Bruno Haible
Hi Collin,

> ... It seems like a pain to track
> down every code path that can lead to 'sys.exit()' or 'exit()' with a
> temporary directory still existing.
> ...
> Therefore, we can use tempfile.TemporaryDirectory as a context manager
> and let Python cleanup for us [1]:
> 
>  def main_with_exception_handling() -> None:
>  try:  # Try to execute
> -main()
> +with tempfile.TemporaryDirectory() as temporary_directory:
> +main(temporary_directory)
>  except GLError as error:
>  errmode = 0  # gnulib-style errors
>  errno = error.errno

Yes, that's a nice improvement. I agree, the 'with' statement with
automatic resource cleanup is exactly the right tool for this situation.

Bruno






Re: gnulib-tool.py: Quote file names passed to 'patch'.

2024-05-02 Thread Bruno Haible
Collin Funk wrote:
> I noticed that the file names when running 'patch' on test-driver
> weren't quoted. I guess that would cause problems in practice if you
> used spaces in directories

Indeed. Thanks for fixing that!

> Since we assume POSIX shells we can just use shlex.quote() to deal
> with any theoretical shell injections too [1].

Yes, I agree. We don't need to write the equivalent of module 'sh-quote'
in Python, when it already exists.

Bruno






Re: gnulib-tool.py: Use the GLModule's name variable directly.

2024-05-01 Thread Bruno Haible
Collin Funk wrote:
> Since 'module' is pretty much always a GLModule, I have renamed these
> to 'module_name'.

+1






Re: gnulib-tool.py: Use the GLModule's name variable directly.

2024-05-01 Thread Collin Funk
On 5/1/24 1:51 AM, Collin Funk wrote:
> I pushed this patch removing getName() from GLModule, preferring
> it's name variable directly.

This patch fixes a mistake I made in two functions.

They took a 'module' argument which was a str. I must have not looked
at the type hint...

Since 'module' is pretty much always a GLModule, I have renamed these
to 'module_name'.

CollinFrom ecdab13fca9f61cdfa832005c9324414621b4069 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Wed, 1 May 2024 02:21:43 -0700
Subject: [PATCH 2/2] gnulib-tool.py: Fix mistake in previous commit.

* pygnulib/GLModuleSystem.py (GLModuleSystem.exists)
(GLModuleSystem.find): Rename 'module' argument to 'module_name' so it
is clear they are not a GLModule object. Treat them as such.
---
 ChangeLog  |  7 +++
 pygnulib/GLModuleSystem.py | 34 +-
 2 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 917761a7d2..fee17c01ce 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-05-01  Collin Funk  
+
+	gnulib-tool.py: Fix mistake in previous commit.
+	* pygnulib/GLModuleSystem.py (GLModuleSystem.exists)
+	(GLModuleSystem.find): Rename 'module' argument to 'module_name' so it
+	is clear they are not a GLModule object. Treat them as such.
+
 2024-05-01  Collin Funk  
 
 	gnulib-tool.py: Use the GLModule's name variable directly.
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 2bfd5b252f..2cb5d1d468 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -65,43 +65,43 @@ def __repr__(self) -> str:
 result = '' % hex(id(self))
 return result
 
-def exists(self, module: str) -> bool:
+def exists(self, module_name: str) -> bool:
 '''Check whether the given module exists.
 GLConfig: localpath.'''
-if type(module) is not str:
-raise TypeError('module must be a string, not %s'
-% type(module).__name__)
+if type(module_name) is not str:
+raise TypeError('module_name must be a string, not %s'
+% type(module_name).__name__)
 localpath = self.config['localpath']
 result = False
 badnames = ['ChangeLog', 'COPYING', 'README', 'TEMPLATE',
 'TEMPLATE-EXTENDED', 'TEMPLATE-TESTS']
-if module not in badnames:
-result = os.path.isfile(joinpath(DIRS['modules'], module))
+if module_name not in badnames:
+result = os.path.isfile(joinpath(DIRS['modules'], module_name))
 if not result:
 for localdir in localpath:
 if (os.path.isdir(joinpath(localdir, 'modules'))
-and os.path.isfile(joinpath(localdir, 'modules', module))):
+and os.path.isfile(joinpath(localdir, 'modules', module_name))):
 result = True
 break
 return result
 
-def find(self, module: str) -> GLModule | None:
+def find(self, module_name: str) -> GLModule | None:
 '''Return the GLModule object given the module name,
 or None if the module description file with that name does not exist.
-- module, The name of the module.'''
-if type(module) is not str:
-raise TypeError('module must be a string, not %s'
-% type(module).__name__)
-if self.exists(module):
-path, istemp = self.filesystem.lookup(joinpath('modules', module))
-result = GLModule(self.config, module, path, istemp)
+- module_name, The name of the module.'''
+if type(module_name) is not str:
+raise TypeError('module_name must be a string, not %s'
+% type(module_name).__name__)
+if self.exists(module_name):
+path, istemp = self.filesystem.lookup(joinpath('modules', module_name))
+result = GLModule(self.config, module_name, path, istemp)
 return result
 else:  # if not self.exists(module)
 if self.config['errors']:
-raise GLError(3, module)
+raise GLError(3, module_name)
 else:  # if not self.config['errors']
 sys.stderr.write('gnulib-tool: warning: ')
-sys.stderr.write("module %s doesn't exist\n" % module.name)
+sys.stderr.write("module %s doesn't exist\n" % module_name)
 
 def file_is_module(self, filename: str) -> bool:
 '''Given the name of a file in the modules/ directory, return true
-- 
2.44.0



Re: gnulib-tool.py: Emit libtests in testdirs generated Makefile.am.

2024-04-30 Thread Collin Funk
Hi Bruno,

On 4/30/24 12:44 AM, Bruno Haible wrote:
> Yes, it's a good sign, and most probably a consequence of the testing
> with various packages [1][2] that we did.

And the test suite that you wrote. Many thanks for working on that for
me.

Also, thanks for testing all the projects who use cvs. I never learned
how to use it haha.

Collin



Re: gnulib-tool.py: Emit libtests in testdirs generated Makefile.am.

2024-04-30 Thread Bruno Haible
Collin Funk wrote:
> Hopefully no other bug reports for gnulib-tool.py since the info-gnu
> announcement other than the one I just patched is a good sign? :)

Yes, it's a good sign, and most probably a consequence of the testing
with various packages [1][2] that we did.

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00101.html
[2] https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00171.html






Re: gnulib-tool.py: Add type hints to classes.

2024-04-29 Thread Bruno Haible
Hi Collin,

> This patch adds type hints to the Python classes.

Looks good. Thanks!

> Same as previously done in GLFileTable that I wrote. The only new
> thing introduced is the syntax for class variables, so this line in a
> class definition:
> 
>  section_label_pattern = re.compile(...)
> 
> becomes this:
> 
>  section_label_pattern: ClassVar[re.Pattern] = re.compile(...)

It's pretty self-explaining, therefore OK to use this new ClassVar[...].

Bruno






Re: gnulib-tool.py --create-megatestdir bug

2024-04-27 Thread Collin Funk
Hi Bruno,

On 4/27/24 8:48 AM, Bruno Haible wrote:
> Could you please look at this one? I created a megatestdir for some
> portability testing, with gnulib-tool.py (of course), and it failed
> on macOS:

Thanks for the report! I can reproduce it.

I have only had a brief look and it looks like gnulib-tool.py is
adding libtests where gnulib-tool.sh isn't.

I'll have a look at fixing that right now.

Collin



Re: gnulib-tool.py: Remove some unused instance variables.

2024-04-26 Thread Collin Funk
Hi Bruno,

On 4/26/24 2:20 AM, Bruno Haible wrote:
> However, please hold pushing patches for the next 3 days. I will be
> posting the announcement of the faster gnulib-tool in the next few
> hours; therefore I expect that several GNU package maintainers will
> give it a try for the first time in years, this week-end. This would
> not be the right time to introduce possible regressions.
> 
> You are free to prepare commits, ask for reviews, etc. — just don't push.

Exciting. Thanks for letting me know! That sounds like a good idea.

I was just having a look at simplifying GLMakefileTable. I'll just
work on stuff in a local branch and maybe post them for review. Then
in 3 days *hopefully* there isn't any unexpected bugs in
gnulib-tool.py that need fixing.

> You are also free to make changes to the HACKING file (describing your IDE,
> for example) or to the maint-tools repository; these have no effect on
> what GNU package maintainers see when the run gnulib-tool.

True, I remember saying I would. Then I never got around to it...

Collin



Re: gnulib-tool.py: Remove some unused instance variables.

2024-04-26 Thread Bruno Haible
Hi Collin,

> Most of the instance variables in GLMegaTestDir are unused. This makes
> sense since GLMegaTestDir pretty much just calls GLTestDir functions
> on a list of modules and then does some extra work afterwards.
> 
> I committed this patch removing them.

Looks good. Thanks.

However, please hold pushing patches for the next 3 days. I will be
posting the announcement of the faster gnulib-tool in the next few
hours; therefore I expect that several GNU package maintainers will
give it a try for the first time in years, this week-end. This would
not be the right time to introduce possible regressions.

You are free to prepare commits, ask for reviews, etc. — just don't push.

You are also free to make changes to the HACKING file (describing your IDE,
for example) or to the maint-tools repository; these have no effect on
what GNU package maintainers see when the run gnulib-tool.

Thanks!

Bruno






Re: gnulib-tool.py: Add a new GLFileTable class.

2024-04-25 Thread Bruno Haible
Collin Funk wrote:
> for instance variables I
> rather put them outside __init__. It also serves as a decent place to
> add comments that don't belong in doc strings.

Yes. For instance variables that are used in several methods it would seem
odd to declare them in __init__. To me, __init__ is the place where they
are initialized, not declared.

Bruno






Re: gnulib-tool.py: Add a new GLFileTable class.

2024-04-25 Thread Collin Funk
On 4/25/24 1:11 AM, Bruno Haible wrote:
> Interesting syntax. This makes the class easier to understand. You're welcome
> to do the same thing with the other classes (except for GLError.message, which
> is private to a single method).

I agree. I originally wanted to add it to GLModuleTable when I was
looking at it previously. I imagine that section of code is probably
the most difficult for someone new to understand compared to the rest
of gnulib-tool.py.

Before I made the changes to GLModule.__hash__ the types were
something like this:

self.dependers: dict[str, list[str]] # 'module-name' : ['dep1', 'dep2']
self.conditionals: dict[str, str | bool] # key is 'module1---module2'

Which doesn't really tell us much without looking more into the class.
After my changes to GLModule.__hash__ it is a bit better:

self.dependers: defaultdict[GLModule, set[GLModule]]
self.conditionals: dict[tuple[GLModule, GLModule], str | bool]

This syntax can also be used on assignments in __init__ or for normal
variables like this:

# type hinted 'self.dependers = []'.
self.dependers: defaultdict[GLModule, set[GLModule]] = []

Which helps mypy and other type checkers who cannot infer the type
otherwise. I felt it was a bit cluttered so for instance variables I
rather put them outside __init__. It also serves as a decent place to
add comments that don't belong in doc strings.

I'll probably add them once I get around to making variables private
as we discussed previously.

> We are not going to lower the version dependency:
>   - No one asked for it in the beta testing process since last week.
>   - Python 3.6 (and even 3.7) is already end-of-life [1].

Sounds good. In any case it is always good to double check quickly
before modernizing like this. I run all the tests with Python 3.7
before submitting patches for the same reason.

Collin



Re: gnulib-tool.py: Add a new GLFileTable class.

2024-04-25 Thread Bruno Haible
Collin Funk wrote:
> I've applied the following patch adding a GLFileTable class as
> discussed here:
> 
> https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00357.html

Thanks! Looks good.

> +class GLFileTable:
> +'''The GLFileTable class stores file information for the duration of the
> +operation being executed.'''
> +
> +all_files: list[str]
> +old_files: list[tuple[str, str]]
> +new_files: list[tuple[str, str]]
> +added_files: list[str]
> +removed_files: list[str]

Interesting syntax. This makes the class easier to understand. You're welcome
to do the same thing with the other classes (except for GLError.message, which
is private to a single method).

> This syntax was introduced in Python 3.6 which is okay for our
> dependency on Python 3.7. For older versions they can be removed or be
> placed in comments [1]. I've gone with it since I imagine there will
> be bigger fish to fry by lowering the version dependency.

We are not going to lower the version dependency:
  - No one asked for it in the beta testing process since last week.
  - Python 3.6 (and even 3.7) is already end-of-life [1].

Bruno

[1] https://en.wikipedia.org/wiki/History_of_Python#Table_of_versions






Re: gnulib-tool.py: Use absolute imports consistently.

2024-04-23 Thread Collin Funk
Hi Bruno,

On 4/23/24 8:37 AM, Bruno Haible wrote:
> I also checked the effect on the warnings in Eclipse: No new warnings,
> and 1 existing "unused import" warning is fixed.
>
> The patch is OK to push.

Done. Thanks!

Collin



Re: gnulib-tool.py: Use absolute imports consistently.

2024-04-23 Thread Bruno Haible
Hi Collin,

> > OK, patch welcome. I assume that when Dmitry used this style of assignments,
> > the Python 2.7 / 3.0 imports were not so well developed as they are today.
> 
> Here is a patch that removes all the module-specific variables and
> imports the functions directly.

The patch looks good.

> I ran all the tests quickly with Python 3.7 and 3.12 just to double
> check that I didn't break anything in the process. PyCharm has good
> warnings for missing or unused imports so I used that too.

I also checked the effect on the warnings in Eclipse: No new warnings,
and 1 existing "unused import" warning is fixed.

The patch is OK to push.

Bruno






Re: gnulib-tool.py: Use absolute imports consistently.

2024-04-22 Thread Collin Funk
Hi Bruno,

On 4/22/24 7:33 AM, Bruno Haible wrote:
> OK, patch welcome. I assume that when Dmitry used this style of assignments,
> the Python 2.7 / 3.0 imports were not so well developed as they are today.

Here is a patch that removes all the module-specific variables and
imports the functions directly. I haven't pushed it just incase you
want to make any style changes or something.

I ran all the tests quickly with Python 3.7 and 3.12 just to double
check that I didn't break anything in the process. PyCharm has good
warnings for missing or unused imports so I used that too.

CollinFrom 96b746aca8cd77c121ffd4b4343e1b37aec1f745 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Mon, 22 Apr 2024 17:57:52 -0700
Subject: [PATCH] gnulib-tool.py: Make better use of imports.

* pygnulib/*.py: Import functions from constants.py directly instead of
assigning them to module-specific variable. Keep the module prefix for
standard library functions.
---
 ChangeLog   |   7 ++
 pygnulib/GLConfig.py|  23 ++---
 pygnulib/GLEmiter.py|  32 +++
 pygnulib/GLError.py |   6 --
 pygnulib/GLFileSystem.py|  55 +---
 pygnulib/GLImport.py| 125 +-
 pygnulib/GLInfo.py  |  14 +--
 pygnulib/GLMakefileTable.py |  13 +--
 pygnulib/GLModuleSystem.py  |  36 
 pygnulib/GLTestDir.py   | 144 +++---
 pygnulib/main.py| 170 ++--
 11 files changed, 290 insertions(+), 335 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 462823888d..8fda2cdbba 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-22  Collin Funk  
+
+	gnulib-tool.py: Make better use of imports.
+	* pygnulib/*.py: Import functions from constants.py directly instead of
+	assigning them to module-specific variable. Keep the module prefix for
+	standard library functions.
+
 2024-04-22  Bruno Haible  
 
 	gnulib-tool: Fix trouble caused by Python's bytecode cache.
diff --git a/pygnulib/GLConfig.py b/pygnulib/GLConfig.py
index 05947e8ed3..16fa490fc6 100644
--- a/pygnulib/GLConfig.py
+++ b/pygnulib/GLConfig.py
@@ -22,23 +22,16 @@
 import copy
 import tempfile
 from typing import Any
-from . import constants
+from .constants import (
+MODES,
+TESTS,
+joinpath,
+remove_trailing_slashes,
+)
 from .GLError import GLError
 from pygnulib.enums import CopyAction
 
 
-#===
-# Define global constants
-#===
-MODES = constants.MODES
-TESTS = constants.TESTS
-joinpath = constants.joinpath
-relpath = constants.relativize
-remove_trailing_slashes = constants.remove_trailing_slashes
-isfile = os.path.isfile
-normpath = os.path.normpath
-
-
 #===
 # Define GLConfig class
 #===
@@ -1053,9 +1046,9 @@ def setAutoconfFile(self, configure_ac: str) -> None:
 def resetAutoconfFile(self) -> None:
 '''Reset path of autoconf file relative to destdir.'''
 configure_ac = ''
-if isfile(joinpath(self.table['destdir'], 'configure.ac')):
+if os.path.isfile(joinpath(self.table['destdir'], 'configure.ac')):
 configure_ac = joinpath(self.table['destdir'], 'configure.ac')
-elif isfile(joinpath(self.table['destdir'], 'configure.in')):
+elif os.path.isfile(joinpath(self.table['destdir'], 'configure.in')):
 configure_ac = joinpath(self.table['destdir'], 'configure.in')
 self.table['configure_ac'] = configure_ac
 
diff --git a/pygnulib/GLEmiter.py b/pygnulib/GLEmiter.py
index 91077a0325..ed6eae4997 100644
--- a/pygnulib/GLEmiter.py
+++ b/pygnulib/GLEmiter.py
@@ -22,7 +22,14 @@
 import re
 import subprocess as sp
 from collections.abc import Callable
-from . import constants
+from .constants import (
+UTILS,
+joinpath,
+lines_to_multiline,
+combine_lines_matching,
+substart,
+relinverse,
+)
 from .GLInfo import GLInfo
 from .GLConfig import GLConfig
 from .GLModuleSystem import GLModule
@@ -30,17 +37,6 @@
 from .GLMakefileTable import GLMakefileTable
 
 
-#===
-# Define global constants
-#===
-UTILS = constants.UTILS
-TESTS = constants.TESTS
-joinpath = constants.joinpath
-relinverse = constants.relinverse
-lines_to_multiline = constants.lines_to_multiline
-isfile = os.path.isfile
-normpath = os.path.normpath
-
 # Regular expressions used to convert Automake conditionals to GNU Make syntax.
 # Each tuple is the arguments given to re.sub in the correct order.
 _CONVERT_TO_GNU_MAKE = [
@@ -474,7 +470,7 @@ def po_POTFILES_in(self, files: list[str]) -> str:
 emit += 

Re: gnulib-tool.py: Use absolute imports consistently.

2024-04-22 Thread Bruno Haible
Hi Collin,

> I want to make the imports and use of functions from other modules
> consistent. This patch makes all files use absolute imports like
> main.py. So this:
> 
>from pygnulib.GLEmiter import GLEmiter
> 
> instead of:
> 
>from .GLEmiter import GLEmiter

Consistency is not a good enough reason to do that. If we commit a
patch like this, in a month we may want to commit the opposite patch,
with the rationale that the package name 'pygnulib' should not be
mentioned so often.

Really, there's not much to win here: The imports work and produce
no warnings. So, best leave them alone and spend your time on areas
that do need work.

> I want to make another change but would like opinions before writing
> the patch. Right now we have:
> 
>import os.path
>from pygnulib import constants
>[...]
>DIRS = constants.DIRS
>substart = constants.substart
>isfile = os.path.isfile
> 
> This works fine, but stops editors from tracking function usage
> properly. For example, GLEmiter might use 'substart' 30 times but an
> editor will say there is only 1 usage since we define
> 'constants.substart' to a variable.

OK, patch welcome. I assume that when Dmitry used this style of assignments,
the Python 2.7 / 3.0 imports were not so well developed as they are today.

> While making that change it would also be a good time to fix another
> inconsistency. Right now some files will use a mix of functions
> prefixed and not prefixed by a module. For example, both
> 'os.path.isfile' and 'isfile'. I think the best solution is to import
> our own code directly and use the module prefix for standard library
> (and third-party code if ever needed). So like this:
> 
> # Import like this.
> import os.path
> from constants import substart
> # Use them like this.
> os.path.join(...)
> substart(...)
> 
> This feels like what I am used to seeing.

Sounds OK to me. Thanks for looking around what are the common Python
idioms.

> Slight exception for the subprocess module, since everyone does
> 'import subprocess as sp'. It would feel strange not too. :)

OK.

Bruno






Re: gnulib-tool.py: Update type hints and docstring.

2024-04-21 Thread Bruno Haible
Collin Funk wrote:
> I am still learning about the
> type hinting and type checking stuff. Overall I think it is a good
> decision though. In the past I remember finding large Python code hard
> to follow without them.

Absolutely, yes.

Bruno






Re: gnulib-tool.py: Update type hints and docstring.

2024-04-21 Thread Collin Funk
Hi Bruno,

On 4/21/24 2:43 PM, Bruno Haible wrote:
> Hmm. It's valid code that will work at runtime. I would understand a
> "warning", if the tool cannot prove that the code is safe. But "error"?
> That's too severe. It's again time for a bug report or two, I would say.

Haha, yes maybe "error" is a bit more scary than it should be. It
isn't a bug though, just limitations on type checking a dynamic
language.

I haven't checked but I believe you could use 'dict' which would be
equivalent to 'dict[Any, Any]' or even 'dict[str, Any]'. The latter
*should* turn off type checking for the value at dict['key'] [1].

There are ways to remove types from the union, called "narrowing", but
they wouldn't make much sense when we know a specific dictionary key
will have a certain type [2].

Sorry for the not-so great explanations. I am still learning about the
type hinting and type checking stuff. Overall I think it is a good
decision though. In the past I remember finding large Python code hard
to follow without them.

> Nah. I think the better cure of the problem is to abandon the 'dict' type
> here. That is, create a proper Python class GLFileTable with five attributes.
> In this situation, the five attributes are not constrained to have the same
> type.

I agree. I think that would make things more clear anyways. Maybe I
will find an opportunity there to remove the duplicate rewrite_files
functions.

[1] 
https://mypy.readthedocs.io/en/stable/dynamic_typing.html#operations-on-any-values
[2] 
https://mypy.readthedocs.io/en/stable/type_narrowing.html#type-narrowing-expressions

Collin



Re: gnulib-tool.py: Update type hints and docstring.

2024-04-21 Thread Bruno Haible
Hi Collin,

> > Also, what about the type of filetable? It is described as
> > 
> >   dict[str, list[str]]
> > 
> > but I think it is more
> > 
> >   dict[str, list[str | tuple[str, str]]]
> > 
> > right?
> ...
> Using union types like that is annoying with type checkers though. For
> example, using that type hint and this line of code in
> GLImport.execute():
> 
>if [ file
>  for file in filetable['all']
>  if file.startswith('doc/') ]:
> 
> I see this warning under 'startswith' from Pyright:
> 
>   /home/collin/.local/src/gnulib/pygnulib/GLImport.py:1020:22 - error: Cannot 
> access attribute "startswith" for class "tuple[str, str]"
> Attribute "startswith" is unknown (reportAttributeAccessIssue)
> 
> or with mypy:
> 
>   GLImport.py:1020: error: Item "tuple[str, ...]" of "str | tuple[str, str]" 
> has no attribute "startswith"  [union-attr]

Hmm. It's valid code that will work at runtime. I would understand a
"warning", if the tool cannot prove that the code is safe. But "error"?
That's too severe. It's again time for a bug report or two, I would say.

> It might just be more beneficial to use 'Any' for complex unions like
> that. You could ignore warnings like:
> 
>if [ file
>  for file in filetable['all']
>  if file.startswith('doc/') ]: # type: ignore
> 
> but that feels annoying for variables used frequently like that filetable.

Nah. I think the better cure of the problem is to abandon the 'dict' type
here. That is, create a proper Python class GLFileTable with five attributes.
In this situation, the five attributes are not constrained to have the same
type.

Bruno






Re: gnulib-tool.py: Update type hints and docstring.

2024-04-21 Thread Collin Funk
On 4/21/24 4:07 AM, Bruno Haible wrote:
> Also, what about the type of filetable? It is described as
> 
>   dict[str, list[str]]
> 
> but I think it is more
> 
>   dict[str, list[str | tuple[str, str]]]
> 
> right?

Good catch. I think the one you have written is correct because of the
filetable['old'] and filetable['new'] which both have
[(rewritten, original), ...].

Using union types like that is annoying with type checkers though. For
example, using that type hint and this line of code in
GLImport.execute():

   if [ file
 for file in filetable['all']
 if file.startswith('doc/') ]:

I see this warning under 'startswith' from Pyright:

  /home/collin/.local/src/gnulib/pygnulib/GLImport.py:1020:22 - error: Cannot 
access attribute "startswith" for class "tuple[str, str]"
Attribute "startswith" is unknown (reportAttributeAccessIssue)

or with mypy:

  GLImport.py:1020: error: Item "tuple[str, ...]" of "str | tuple[str, str]" 
has no attribute "startswith"  [union-attr]

It might just be more beneficial to use 'Any' for complex unions like
that. You could ignore warnings like:

   if [ file
 for file in filetable['all']
 if file.startswith('doc/') ]: # type: ignore

but that feels annoying for variables used frequently like that filetable.

Collin



Re: gnulib-tool.py: Don't fail when given a bad module name.

2024-04-21 Thread Collin Funk
Hi Bruno,

On 4/21/24 4:37 AM, Bruno Haible wrote:
> The former makes an implicit statement that the function is searching
> for the file. The latter does not; it does not say or hint to as what
> function internally does. Thus it is better to use the latter wording.

Agreed, thanks for the change and advice. My documentation skills are
a bit lacking...

> It says "Stop." twice but does not actually stop :) Fixed as follows.

Nice! Running this passes for me now:

$ env GNULIB_TOOL_IMPL=sh+py gnulib-tool --create-testdir --dir aaa abc

Collin



Re: gnulib-tool.py: Make temporary directories recognizable.

2024-04-21 Thread Bruno Haible
Collin Funk wrote:
> This patch prefixes temporary directories created by gnulib-tool.py
> with "glpy". gnulib-tool.sh uses the 'gl' prefix in func_tmpdir, and I
> suppose it is beneficial to differentiate the two.

Thanks, applied.

Bruno






Re: gnulib-tool.py: Don't fail when given a bad module name.

2024-04-21 Thread Bruno Haible
Hi Collin,

> When using --create-testdir with a bad module name:
> 
> $ gnulib-tool.py --create-testdir --dir aaa bad-module-name
> gnulib-tool: warning: module bad-module-name doesn't exist
> Traceback (most recent call last):
>   File "/home/collin/.local/src/gnulib/.gnulib-tool.py", line 30, in 
> main.main_with_exception_handling()
>   File "/home/collin/.local/src/gnulib/pygnulib/main.py", line 1382, in 
> main_with_exception_handling
> main()
>   File "/home/collin/.local/src/gnulib/pygnulib/main.py", line 1075, in main
> testdir.execute()
>   File "/home/collin/.local/src/gnulib/pygnulib/GLTestDir.py", line 209, in 
> execute
> requested_licence = requested_module.getLicense()
> ^^^
> AttributeError: 'NoneType' object has no attribute 'getLicense'
> 
> This patch removes None from the list of GLModule objects before using
> them.

Good point. Thanks, patch applied with one tweak:

> I also documented the meaning of a None return in
> GLModuleSystem.find() since this problem occurred in GLImport
> previously.

When documenting a function or program, it is sometimes possible to state
the same condition with and without referring to the internals of the
function / program. It is better to document it without reference to the
internals.

In this case, one can say
if the module description file could not be found
or
if the module description file does not exist

The former makes an implicit statement that the function is searching
for the file. The latter does not; it does not say or hint to as what
function internally does. Thus it is better to use the latter wording.

I also tried the behaviour on gnulib-tool.sh, and got this confusing
output:
$ ./gnulib-tool.sh --create-testdir --dir=../testdir2 --single-configure foobar 
signbit
./gnulib-tool.sh: *** file /GNULIB/gnulib-git/./modules/foobar not found
./gnulib-tool.sh: *** Stop.
gnulib-tool: warning: module foobar lacks a License
./gnulib-tool.sh: *** file /GNULIB/gnulib-git/./modules/foobar not found
./gnulib-tool.sh: *** Stop.
gnulib-tool: warning: module foobar doesn't exist
gnulib-tool: warning: module foobar doesn't exist
Module list with included dependencies (indented):
absolute-header
c99
extern-inline
...

It says "Stop." twice but does not actually stop :) Fixed as follows.


2024-04-21  Bruno Haible  

gnulib-tool.sh: In --create-testdir, just warn about a bad module name.
* gnulib-tool.sh (func_create_testdir): Validate the modules list.

diff --git a/gnulib-tool.sh b/gnulib-tool.sh
index 5f0c6f530e..b486e99b1e 100755
--- a/gnulib-tool.sh
+++ b/gnulib-tool.sh
@@ -6504,6 +6504,9 @@ func_create_testdir ()
 # Except lib-ignore, which leads to link errors when Sun C++ is used. 
FIXME.
 modules=`func_all_modules`
 modules=`for m in $modules; do case $m in config-h | 
non-recursive-gnulib-prefix-hack | timevar | mountlist | lib-ignore) ;; *) echo 
$m;; esac; done`
+  else
+# Validate the list of specified modules.
+modules=`for module in $modules; do func_verify_module; if test -n 
"$module"; then echo "$module"; fi; done`
   fi
   specified_modules="$modules"
 






Re: gnulib-tool.py: Update type hints and docstring.

2024-04-21 Thread Bruno Haible
Collin Funk wrote:
> Two type hints are incorrectly marked 'dict[str, str]' instead of
> 'dict[str, tuple[re.Pattern, str] | None]'. I assume I forgot to
> change these when inlining sed expressions to use re.sub().

Thanks, applied.

> Now that I think about it, I should have used a list of re.sub()
> arguments. But I think these variables are only used by the 'config-h'
> module to replace "#if HAVE_CONFIG_H" with "#if 1". Until we need
> something more complex than that, how it is currently written should
> work fine.

Yes, please. Add complexity only when it is needed.

Also, what about the type of filetable? It is described as

  dict[str, list[str]]

but I think it is more

  dict[str, list[str | tuple[str, str]]]

right?

Bruno






Re: gnulib-tool.py speedup

2024-04-20 Thread Collin Funk
On 4/20/24 4:50 PM, Bruno Haible wrote:
> What I measure (with "GNULIB_TOOL_IMPL=sh time ./test-create-testdir-1.sh") 
> is:
>   dash  22 sec
>   bash  20 sec

dash  31 sec
bash  28 sec

Similar here. Looks like my desktop is just older than I remembered
originally.

> I think that 'dash' is generally somewhat faster than 'bash'. However,
> gnulib-tool uses special bash syntax for appending to a list and for the
> module caching; this probably makes it faster with 'bash' than with 'dash',

Ah, I see. I remember seeing the associative arrays for module
caching. I haven't spent much time using shells outside of bash so I
don't know how common support for those extensions are.

> What matters most, in the comparison shell vs. Python, IMO, is the string
> processing [1].

Yes, I remember reading that email. Thank you for the detailed
explanation. I guess that would also explain the Cygwin speed too?
Since as far as I know Windows doesn't have fork() and deals with
creating processes differently. I assume the Cygwin implementation is
limited by that.

Collin



Re: gnulib-tool.py speedup

2024-04-20 Thread Bruno Haible
On Sonntag, 21. April 2024 01:01:01 CEST Collin Funk wrote:
> Hi Bruno,
> 
> On 4/20/24 3:50 PM, Bruno Haible wrote:
> > On Linux:   On Cygwin 2.9.0:
> > 
> > in create-tests: time ./test-all.sh in create-tests: time 
> > ./test-all.sh
> >   sh: 1225 secsh: 27406 sec
> >   py: 155 sec py: 2400 sec
> >   => about 8 times faster   => more than 11 times faster
> 
> What shell did you use for this test?

On Linux: dash.   On Cygwin: bash

> Would other shells even make a difference?

You just have to replace the first line of gnulib-tool.sh:
  #!/bin/sh -> #!/bin/bash

What I measure (with "GNULIB_TOOL_IMPL=sh time ./test-create-testdir-1.sh") is:
  dash  22 sec
  bash  20 sec

I think that 'dash' is generally somewhat faster than 'bash'. However,
gnulib-tool uses special bash syntax for appending to a list and for the
module caching; this probably makes it faster with 'bash' than with 'dash',

What matters most, in the comparison shell vs. Python, IMO, is the string
processing [1].

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2024-03/msg00160.html






Re: gnulib-tool.py speedup

2024-04-20 Thread Collin Funk
Hi Bruno,

On 4/20/24 3:50 PM, Bruno Haible wrote:
> On Linux:   On Cygwin 2.9.0:
> 
> in create-tests: time ./test-all.sh in create-tests: time 
> ./test-all.sh
>   sh: 1225 secsh: 27406 sec
>   py: 155 sec py: 2400 sec
>   => about 8 times faster   => more than 11 times faster

What shell did you use for this test? Running ./test-all.sh in the
create-tests took me ~1800 (30 minutes) last time I checked. That was
on GNU/Linux with Bash. Would other shells even make a difference?

Also those Cygwin times don't look very fun. The Python one is much
more bearable at least LOL.

Collin



Re: gnulib-tool.py speedup

2024-04-20 Thread Bruno Haible
Simon Josefsson wrote:
> jas@kaka:~/src/oath-toolkit$ time make -f cfg.mk 
> ...
> real  0m48,169s
> user  0m49,900s
> sys   0m9,658s
> jas@kaka:~/src/oath-toolkit$ export GNULIB_TOOL_IMPL=py
> jas@kaka:~/src/oath-toolkit$ time make -f cfg.mk 
> ...
> real  0m0,704s
> user  0m0,527s
> sys   0m0,179s
> jas@kaka:~/src/oath-toolkit$

I see similar speedups, using the test suite:

On Linux:   On Cygwin 2.9.0:

in create-tests: time ./test-all.sh in create-tests: time ./test-all.sh
  sh: 1225 secsh: 27406 sec
  py: 155 sec py: 2400 sec
  => about 8 times faster   => more than 11 times faster

in import-tests: time ./test-wget2-1.sh in import-tests: time 
./test-wget2-1.sh
  sh: 31.9 secsh: 831 sec
  py: 0.35 secpy: 43 sec
  => more than 91 times faster  => more than 19 times faster

Bruno






Re: gnulib-tool.py: Update authors list

2024-04-20 Thread Bruno Haible
Hi Collin,

> What do you think of the attached patch?

Looks good, and passes the test suite. => Applied.

> from pygnulib import __author__, __copyright__
> 
> to get them from __init__.py.

That seems to work, yes.

Thanks!

Bruno






Re: gnulib-tool.py: Update authors list

2024-04-20 Thread Collin Funk
On 4/19/24 10:07 AM, Bruno Haible wrote:
> Yeah, it seems redundant. You will surely find out whether doing so
> triggers some warnings from pylint, PyCharm, or other tools...

What do you think of the attached patch?

I tried to find out more about these variables but they are only
mentioned briefly in PEP 8 [1]. It looks like they are old variables
predating all of the fancy Python packaging stuff [2].

I've just removed the copies from constants.py and unused definitions
in the rest of the *.py files. In GLInfo we use __author__ and
__copyright__ for '--version' and copyright headers on generated
Makefiles, m4 files, etc. For that we can use:

from pygnulib import __author__, __copyright__

to get them from __init__.py.

I would just leave the rest of the definitions in __init__.py too. If
in the distant future there is ever a need to package gnulib-tool.py
so you can do something like:

$ pip install pygnulib
$ python3 -m pygnulib --create-testdir

that information is pretty much copy-pasted into the configuration
files. I don't see much use for them beyond that though.

[1] https://peps.python.org/pep-0008/#module-level-dunder-names
[2] https://stackoverflow.com/a/9531279/23476799

CollinFrom 5468442826688f02b778b8fddbc4850255838c6e Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Fri, 19 Apr 2024 23:42:24 -0700
Subject: [PATCH] gnulib-tool.py: Remove duplicate per-module definitions.

* pygnulib/constants.py: Remove duplicate __authors__, __license__, and
__copyright__ definitions.
* pygnulib/GLInfo.py: Use the value of __authors__ and __copyright__
from __init__.py for output.
* pygnulib/*.py: Remove unused references to the constant.py
definitions.
---
 ChangeLog   | 10 ++
 pygnulib/GLConfig.py|  8 
 pygnulib/GLEmiter.py|  8 
 pygnulib/GLError.py |  8 
 pygnulib/GLFileSystem.py|  8 
 pygnulib/GLImport.py|  8 
 pygnulib/GLInfo.py  | 11 ++-
 pygnulib/GLMakefileTable.py |  8 
 pygnulib/GLModuleSystem.py  |  8 
 pygnulib/GLTestDir.py   |  8 
 pygnulib/constants.py   | 10 --
 11 files changed, 12 insertions(+), 83 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1af6861244..b565a68faf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2024-04-19  Collin Funk  
+
+	gnulib-tool.py: Remove duplicate per-module definitions.
+	* pygnulib/constants.py: Remove duplicate __authors__, __license__, and
+	__copyright__ definitions.
+	* pygnulib/GLInfo.py: Use the value of __authors__ and __copyright__
+	from __init__.py for output.
+	* pygnulib/*.py: Remove unused references to the constant.py
+	definitions.
+
 2024-04-19  Bruno Haible  
 
 	wcsstr: Update doc.
diff --git a/pygnulib/GLConfig.py b/pygnulib/GLConfig.py
index 9a4f4bfe68..19611b823c 100644
--- a/pygnulib/GLConfig.py
+++ b/pygnulib/GLConfig.py
@@ -27,14 +27,6 @@
 from pygnulib.enums import CopyAction
 
 
-#===
-# Define module information
-#===
-__author__ = constants.__author__
-__license__ = constants.__license__
-__copyright__ = constants.__copyright__
-
-
 #===
 # Define global constants
 #===
diff --git a/pygnulib/GLEmiter.py b/pygnulib/GLEmiter.py
index ff4e3027d6..91077a0325 100644
--- a/pygnulib/GLEmiter.py
+++ b/pygnulib/GLEmiter.py
@@ -30,14 +30,6 @@
 from .GLMakefileTable import GLMakefileTable
 
 
-#===
-# Define module information
-#===
-__author__ = constants.__author__
-__license__ = constants.__license__
-__copyright__ = constants.__copyright__
-
-
 #===
 # Define global constants
 #===
diff --git a/pygnulib/GLError.py b/pygnulib/GLError.py
index a990c382b1..47d1e97fdc 100644
--- a/pygnulib/GLError.py
+++ b/pygnulib/GLError.py
@@ -22,14 +22,6 @@
 from . import constants
 
 
-#===
-# Define module information
-#===
-__author__ = constants.__author__
-__license__ = constants.__license__
-__copyright__ = constants.__copyright__
-
-
 #===
 # Define global constants
 #===
diff --git a/pygnulib/GLFileSystem.py b/pygnulib/GLFileSystem.py
index 4bd8b1f386..1cad7f5bad 

Re: gnulib-tool.py: Simplify data structures for dependencies.

2024-04-19 Thread Collin Funk
On 4/19/24 2:37 PM, Bruno Haible wrote:
> You're right. Maybe Dmitry wanted to have the full data structure
> available somewhere; but since the 'conditionals' and 'unconditionals'
> tables contain the same info in a different way, it's redundant.
> 
> I would leave it in, for possible future uses, since it does not consume
> much memory. Just add a comment that it's currently unused, and is kept
> for possible future use.

Sure, that makes sense. It seems useful which is why I thought I was
missing something. I'll add a comment next time I make changes around
there.

>4. inst.getName(): This is similar to inst.get_name() but follows the
>   convention of using CamelCase for method names, which is common in
>   some coding styles. However, in Python, the convention is typically
>   to use lowercase letters and underscores for method names (get_name()).

Ah, the prior knowledge summarization engine is an advocate of PEP 8.
I've been meaning to get around to the "module.get_name() vs.
module.getName()" naming convention issue because we have a mix of
both at the moment. It just hasn't bothered me enough yet so I've
focused on other things. :)

Collin



Re: gnulib-tool.py: Simplify data structures for dependencies.

2024-04-19 Thread Bruno Haible
Hi Collin,

> Patch 0001 does the same but uses a set instead of checking for
> membership in a list.

Thanks; applied.

> Also, am I missing something or are the sets/lists at
> self.dependers[str(module)] unused?

You're right. Maybe Dmitry wanted to have the full data structure
available somewhere; but since the 'conditionals' and 'unconditionals'
tables contain the same info in a different way, it's redundant.

I would leave it in, for possible future uses, since it does not consume
much memory. Just add a comment that it's currently unused, and is kept
for possible future use.

> Patch 0002 changes the str(module) inserted into lists in that section
> of code to directly use the GLModule object. For accessing the
> conditionals dictionaries we can use a tuple:
> 
> (parent, module)
> 
> instead of a string like gnulib-tool.sh:
> 
> '%s---%s' % (str(parent), str(module))

Good idea, to use the Python tuple type, rather than an ad-hoc hash table
key.

> Also the str(module) sort of makes things harder to read. It is mostly
> my fault that we have three ways to get a GLModule objects name:
> 
> str(module)
> module.name
> module.getName()
> 
> which are all used in various places...

Indeed. I would not make use of
  str(module)
since it may be useful to have str(module) look like this:
#

In pure OO, one would not make the 'name' public, since it's meant to
be a read-only property. It eliminates possible side effects on the
'name' property from the beginning.

But what is common in Python? Let's ask the engine:
--
Q: In Python, when I have a class whose instances are uniquely
   identified by a name, and the name of an instance cannot be
   changed after the instance has been constructed, which is the
   preferred way of accessing the name? Is it `str(inst)`, `inst.name`,
   `inst.get_name()`, or `inst.getName()` ?
A: The preferred way of accessing the name of an instance in Python
   depends on the design of your class and your specific requirements.
   However, considering your criteria where the name cannot be changed
   after construction, you might want to use inst.name if the name is
   a property of the instance and is directly accessible.

   Here's a brief overview of each option:

   1. str(inst): This will call the __str__ method of your class. If
  you've implemented __str__ to return the name of the instance,
  this could work. However, using str(inst) to access the name might
  not be as clear in terms of code readability.
   2. inst.name: If the name is stored directly as an attribute of the
  instance and is meant to be accessed directly, this is a clear
  and concise way to access it.
   3. inst.get_name(): This method would be appropriate if you need to
  perform some processing or validation before returning the name.
  It adds a layer of abstraction, which can be useful in some cases,
  but it might be unnecessary overhead if you're simply returning
  the name.
   4. inst.getName(): This is similar to inst.get_name() but follows the
  convention of using CamelCase for method names, which is common in
  some coding styles. However, in Python, the convention is typically
  to use lowercase letters and underscores for method names (get_name()).

   Ultimately, the choice depends on factors such as readability,
   consistency with the rest of your codebase, and whether you need any
   additional logic when accessing the name.
--

I will let you evaluate this. I have nothing to add :-)

Bruno






Re: gnulib-tool.py: Update authors list

2024-04-19 Thread Bruno Haible
Hi Collin,

> If I am remembering how __init__.py works correctly,
> we should be able to remove the __copyright__, __author__, and
> __license__ definitions from constants.py and the rest of the files.

Yeah, it seems redundant. You will surely find out whether doing so
triggers some warnings from pylint, PyCharm, or other tools...

Bruno






Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-19 Thread Bruno Haible
Collin Funk wrote:
> I submitted a feature request here since I think a configuration
> option would probably be the best fix:
> 
> https://github.com/pylint-dev/pylint/issues/9555

Thank you!

> When I say obvious in GLModule we have (comments are added):
> 
>  # Cache for the getter functions.
>  self.cache = dict()
>  # Raw content of description file.
>  self.content = ''
>  self.filesystem = GLFileSystem(self.config)
>  self.modulesystem = GLModuleSystem(self.config)
> 
> It is very clear that the cache and content are meant for internal
> use. The filesystem and modulesystem are less clear to me. I think
> they *could* be used from the outside, but should they be? Those will
> take more consideration.

Yes. There are actually two questions to ask for each property:
  - Is it currently used from outside the class?
  - Would it make sense to use it from outside the class, given the
existing methods?

Bruno






Re: gnulib-tool.py: Simplify running some commands in a given directory

2024-04-19 Thread Collin Funk
On 4/19/24 9:24 AM, Bruno Haible wrote:
> This patch optimizes some subprocess invocations.
> 
> I did not change the os.chdir calls around larger blocks of code.

Looks good. I had a look at doing this previously but in GLTestDir for
example, we use 'constants.execute' to deal with the:

 Executing autoheader...
 Executing touch config.h.in...

messages. I sort of focused on other changes instead of worrying about
breaking those...

Collin



Re: gnulib-tool.py: Update authors list

2024-04-19 Thread Collin Funk
Hi Bruno,

On 4/19/24 8:49 AM, Bruno Haible wrote:
> This patch changes the './gnulib-tool.py --version' output to end like this:
> 
>   Written by Bruno Haible, Paul Eggert, Simon Josefsson, Dmitry Selyutin, and 
> Collin Funk.

Thanks. I think I can consider myself famous now. :)

> Also, in pygnulib/__init__.py, the authors list and copyright years need to
> include the authors of the gnulib-tool shell script, since under copyright 
> law,
> a derived work inherits the history of the original work.

Yes, makes sense. If I am remembering how __init__.py works correctly,
we should be able to remove the __copyright__, __author__, and
__license__ definitions from constants.py and the rest of the files. I
believe the ones in __init__.py should be kept across all files in
that directory since they are a part of the 'pygnulib' module [1].

Currently we use a mix of 'constants.function_name' and using the
function directly 'function_name'. Maybe I'll do that and take a look
at removing the duplicate definitions in constants.py.

[1] https://docs.python.org/3/tutorial/modules.html#packages

Collin



Re: gnulib-tool.py: Add a comment about coding style.

2024-04-19 Thread Bruno Haible
Thanks, Collin. Both patches applied.

Bruno






Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-18 Thread Collin Funk
Hi Bruno,

On 4/18/24 6:05 AM, Bruno Haible wrote:
> The same holds for GCC, glibc, qemu, and many other project: They have
> many open issues because the developers prioritize them. They nevertheless
> depend on their users for reporting issues. Paul Eggert and I regularly
> report issues in GCC, for instance. Please help making pylint better,
> since we use it. (Call it a "wish", rather than a "bug", if you prefer.)

True, that is a better way to think about it. If no one submitted
bugs/requests than software would never improve. :)

I submitted a feature request here since I think a configuration
option would probably be the best fix:

https://github.com/pylint-dev/pylint/issues/9555

> Yes, please. I think it will be a benefit to follow this convention,
> even though it departs from our previous coding style.

Nice. I'll add a small note in main.py.

> Please submit patches in this direction one-class-at-a-time, not all
> together, since that would be unreviewable.

Yes, that makes sense. I think it will be best for me to go through
the obvious ones first. I'll submit seperate patches for those.

When I say obvious in GLModule we have (comments are added):

 # Cache for the getter functions.
 self.cache = dict()
 # Raw content of description file.
 self.content = ''
 self.filesystem = GLFileSystem(self.config)
 self.modulesystem = GLModuleSystem(self.config)

It is very clear that the cache and content are meant for internal
use. The filesystem and modulesystem are less clear to me. I think
they *could* be used from the outside, but should they be? Those will
take more consideration.

Collin



Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-18 Thread Bruno Haible
Hi Collin,

> A single underscore doesn't silence it. I can submit a feature request
> but it isn't really a bug since it is still an attribute outside of
> __init__. IIRC pylint is pretty opinionated and they have many open
> issues so who knows if it will even be addressed.

The same holds for GCC, glibc, qemu, and many other project: They have
many open issues because the developers prioritize them. They nevertheless
depend on their users for reporting issues. Paul Eggert and I regularly
report issues in GCC, for instance. Please help making pylint better,
since we use it. (Call it a "wish", rather than a "bug", if you prefer.)

> How about we just silence this warning?

Thanks, applied.

> I think the '_' prefix for private instance
> variables/functions is a standard convention, but maybe we should add
> a note in main.py?

Yes, please. I think it will be a benefit to follow this convention,
even though it departs from our previous coding style.

Please submit patches in this direction one-class-at-a-time, not all
together, since that would be unreviewable.

Bruno

> [1] 
> https://pylint.readthedocs.io/en/stable/user_guide/messages/message_control.html#block-disables







Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-17 Thread Collin Funk
Hi Bruno,

On 4/17/24 3:18 PM, Bruno Haible wrote:
> In this example, the property 'attribute' is used outside of the class,
> that is, it is public. (In the OO world, the concept of "class" and of
> "private"/"public" are tightly related, because one of the main purposes
> of a class is to hide implementation details.)

True, but Python is different than the rest of the OO world because it
does not directly support "private"/"public". :(

There are ways to sort of emulate it but a single underscore is
simpler and gets the point across, as the "prior knowledge
summarization engine" suggests.

> Therefore, can you please see if after renaming message to _message the
> warning in GLError persists? If yes, please report a bug to the tool
> that produced this warning. The rationale why the warning is bogus is:
>   - The '_message' attribute is clearly private.
>   - Only one method uses it.
>   - There is no other convention for state that is used by one method only.

A single underscore doesn't silence it. I can submit a feature request
but it isn't really a bug since it is still an attribute outside of
__init__. IIRC pylint is pretty opinionated and they have many open
issues so who knows if it will even be addressed.

I'm not opposed to defining instance variables outside of __init__,
but I think that it should be more clear they are private. I'm happy
with the solution of a single underscore for private variables. The
functions I added in GLEmiter '_eliminate_NMD', etc. follow a similar
idea.

As far as silencing the warnings go, you can use a comment like so [1]:

   self.message = '[Errno %d] %s' % (errno, message)  # pylint: 
disable=attribute-defined-outside-init

but I can see that quickly becoming annoying if we make more use of
private variables. Since it would have to be placed in every block or
line one is defined.

How about we just silence this warning? I've attached a patch that
should work. I think the '_' prefix for private instance
variables/functions is a standard convention, but maybe we should add
a note in main.py?

Collin

[1] 
https://pylint.readthedocs.io/en/stable/user_guide/messages/message_control.html#block-disablesFrom b41ca5909ffe95d49f00a654d0fbb6dea2092340 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Wed, 17 Apr 2024 16:20:32 -0700
Subject: [PATCH] gnulib-tool.py: Ignore 'attribute-defined-outside-init'
 warnings.

* pygnulib/.pylintrc: Add W0201 to the disabled warnings.
See discussion here:

---
 ChangeLog  | 7 +++
 pygnulib/.pylintrc | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 646653ffe5..d40063de73 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-17  Collin Funk  
+
+	gnulib-tool.py: Ignore 'attribute-defined-outside-init' warnings.
+	* pygnulib/.pylintrc: Add W0201 to the disabled warnings.
+	See discussion here:
+	
+
 2024-04-17  Bruno Haible  
 
 	gnulib-tool.py: Use same warning style as gnulib-tool.sh.
diff --git a/pygnulib/.pylintrc b/pygnulib/.pylintrc
index 102accb37d..59fc986f13 100644
--- a/pygnulib/.pylintrc
+++ b/pygnulib/.pylintrc
@@ -1,7 +1,7 @@
 # .pylintrc
 
 [MESSAGES CONTROL]
-disable=C0103,C0114,C0121,C0123,C0209,C0301,C0302,R0902,R0912,R0913,R0914,R0915,R1705,R1702,R1720,R1735
+disable=C0103,C0114,C0121,C0123,C0209,C0301,C0302,R0902,R0912,R0913,R0914,R0915,R1705,R1702,R1720,R1735,W0201
 
 # Local Variables:
 # mode: conf
-- 
2.44.0



Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-17 Thread Bruno Haible
Hi Collin,

Both patches applied.

> The reason I dislike defining instance variables outside of __init__
> is that you can run into situations like this, which I find hard to
> follow:
> 
> var = GLImport(...)   # self.assistant does not exist.
> var.assistant.do_something()  # This is invalid.
> var.execute(...)  # self.assistant is now defined.
> var.assistant.do_something()  # Now this is valid.

In this example, the property 'attribute' is used outside of the class,
that is, it is public. (In the OO world, the concept of "class" and of
"private"/"public" are tightly related, because one of the main purposes
of a class is to hide implementation details.)

Whereas GLError.message is meant to be a private property of the
__repr__ method. Since it holds state, it cannot be converted to a
local variable (like the GLImport.assistant).

Let's see what the prior knowledge summarization engine can tell us
about the distinction:
Q: In Python, what conventions exist, to distinguish a private property
   (attribute) of a class from a public property (attribute) of a class?
A: - In Python, public attributes and methods are named without leading
 underscores, indicating they're accessible from outside the class.
   - Private attributes and methods conventionally have a leading
 underscore, signaling they're for internal use and discouraging
 direct access from outside the class.
   - Additionally, Python supports name mangling with double underscores (__),
 making attributes harder to access from outside the class by modifying
 their names in the bytecode.
   - These conventions aid readability and communicate intent, but Python
 doesn't strictly enforce encapsulation.

Therefore, can you please see if after renaming message to _message the
warning in GLError persists? If yes, please report a bug to the tool
that produced this warning. The rationale why the warning is bogus is:
  - The '_message' attribute is clearly private.
  - Only one method uses it.
  - There is no other convention for state that is used by one method only.

Bruno






Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-17 Thread Collin Funk
Hi Bruno,

On 4/17/24 11:02 AM, Bruno Haible wrote:
>> $ pylint *.py | grep W0201
>> GLError.py:119:12: W0201: Attribute 'message' defined outside __init__ 
>> (attribute-defined-outside-init)
> 
> This warning is bogus. In a dynamic language like Python, it is
> perfectly fine to have a property used only by one method. Only if
> it is used by more than one method, would I find it suitable to
> declare/initialize it in the constructor.

I respectfully disagree about the warning message being bogus, but I
think I generally agree beyond that. Though, when I see an instance
variable used only in one function it makes me think that it should be
local to the function it is used in.

The reason I dislike defining instance variables outside of __init__
is that you can run into situations like this, which I find hard to
follow:

var = GLImport(...)   # self.assistant does not exist.
var.assistant.do_something()  # This is invalid.
var.execute(...)  # self.assistant is now defined.
var.assistant.do_something()  # Now this is valid.

I think it is best to avoid this, but there are probably situations
where it may be intentional/necessary.

> Your patch is not good, because it increases the code that is
> necessary to read, in order to understand the property: before,
> it was one method, now it would be the class.
> 
> Maybe you can silence the warning by prepending a '_' to the
> property name? That would be acceptable.

The issue with the 'self.messages' in GLError is that it appears that
__repr__ expects 'self.messages' to be initialized to None in
__init__. I think the idea was to have it so that under
'if __name__ == "__main__"':

try:
main()
except GLError as exc:
sys.stderr.write(repr(exc))

instead of rewriting the messages there. I didn't write it though so maybe
I am misreading. :)

Let's leave that until I can properly fix that class.

>> GLImport.py:1043:8: W0201: Attribute 'assistant' defined outside __init__ 
>> (attribute-defined-outside-init)
> 
> Similarly. In the current state, this property could be turned into a
> local variable. If you need this attribute in other methods (see the
> other thread), please find a good place to initialize it. But the
> proposed patch here is not good, as it initializes the same property
> twice, and who knows if both initializations are really equivalent?

Good point. I was under the impression that GLConfig does not get
modified after being passed from main(). Though I should check and
document it before making changes with that assumption.

Patch 0002 makes this GLFileAssistant local to GLImport.execute().
Since it is only used there I agree that it makes more sense.

>> We already unnecessarily create a
>> GLFileSystem that we don't need to [2] [3]. :)
> 
> You're welcome to remove the unneeded property.

In patch 0001 attached.

CollinFrom e1c70606ae64faee9ced3ebc4c46e9c86a689a58 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Wed, 17 Apr 2024 11:34:07 -0700
Subject: [PATCH 1/2] gnulib-tool.py: Remove an unused instance attribute.

* pygnulib/GLImport.py (GLImport.__init__): Remove the unused
GLFileSystem object.
---
 ChangeLog| 6 ++
 pygnulib/GLImport.py | 1 -
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index adf89411d2..dbe6edd52d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-04-17  Collin Funk  
+
+	gnulib-tool.py: Remove an unused instance attribute.
+	* pygnulib/GLImport.py (GLImport.__init__): Remove the unused
+	GLFileSystem object.
+
 2024-04-17  Collin Funk  
 
 	gnulib-tool.py: Fix a pylint 'attribute-defined-outside-init' warning.
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index c6a4693c90..0933943d4c 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -302,7 +302,6 @@ def __init__(self, config: GLConfig, mode: int) -> None:
 
 # Define GLImport attributes.
 self.emitter = GLEmiter(self.config)
-self.filesystem = GLFileSystem(self.config)
 self.modulesystem = GLModuleSystem(self.config)
 self.moduletable = GLModuleTable(self.config,
  self.config.checkInclTestCategory(TESTS['all-tests']),
-- 
2.44.0

From d3306868fdd69724442c05b1b1526bc9c279bd65 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Wed, 17 Apr 2024 11:47:16 -0700
Subject: [PATCH 2/2] gnulib-tool.py: Make an instance variable local to a
 function.

* pygnulib/GLImport.py (GLImport.execute): Define the GLFileAssistant as
local to this function because it is unused elsewhere.
---
 ChangeLog|  6 ++
 pygnulib/GLImport.py | 48 ++--
 2 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index dbe6edd52d..348c887be0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-04-17  Collin Funk  
+
+	gnulib-tool.py: Make an instance variable local to a function.
+	* 

Re: gnulib-tool.py: Fix pylint 'attribute-defined-outside-init' warnings.

2024-04-17 Thread Bruno Haible
Hi Collin,

> $ pylint *.py | grep W0201
> GLError.py:119:12: W0201: Attribute 'message' defined outside __init__ 
> (attribute-defined-outside-init)

This warning is bogus. In a dynamic language like Python, it is
perfectly fine to have a property used only by one method. Only if
it is used by more than one method, would I find it suitable to
declare/initialize it in the constructor.

Your patch is not good, because it increases the code that is
necessary to read, in order to understand the property: before,
it was one method, now it would be the class.

Maybe you can silence the warning by prepending a '_' to the
property name? That would be acceptable.

> GLImport.py:1043:8: W0201: Attribute 'assistant' defined outside __init__ 
> (attribute-defined-outside-init)

Similarly. In the current state, this property could be turned into a
local variable. If you need this attribute in other methods (see the
other thread), please find a good place to initialize it. But the
proposed patch here is not good, as it initializes the same property
twice, and who knows if both initializations are really equivalent?

> GLModuleSystem.py:916:8: W0201: Attribute 'modules' defined outside __init__ 
> (attribute-defined-outside-init)

This part is good; applied.

> GLError seems like it was never completed, since we have the error
> messages repeated under "if __name__ == '__main__':".

I understand the reasoning to have library exception handling in a
different place than program exception handling. But if you can
reasonably improve this part, I'd like to see it.

> We already unnecessarily create a
> GLFileSystem that we don't need to [2] [3]. :)

You're welcome to remove the unneeded property.

Bruno






Re: gnulib-tool.py: Make GLModule's __eq__ and __hash__ method agree.

2024-04-16 Thread Bruno Haible
Hi Collin,

> From the documentation of object.__hash__(self) [1]:
> 
>  The only required property is that objects which compare equal
>  have the same hash value; it is advised to mix together the hash
>  values of the components of the object that also play a part in
>  comparison of objects by packing them into a tuple and hashing
>  the tuple.

Yup, the same is true in all programming languages (C, Java, etc.).

> Right now we use the following to compute the hash in GLModule:
> 
> result = hash(self.name) ^ hash(self.patched)
> 
> The way that the documentation recommends writing this:
> 
> result = hash((self.name, self.patched))

Both of these hash code are good. It is good if the hash function
is not symmetric in the properties: we do *not* want that
(a,b) and (b,a) produce the same has code, because it would slow
down hash table lookups. But since self.patched is boolean, the
first formula is not symmetric.

> But I believe the 'patched' boolean should not be involved in
> computing the hash.

I agree: The 'patched' property is inferred from the files on disk
when a GLModule object is created. Once a GLModule with a given name
has been created, no second one with the same name will be created.

Applied. Thanks!

Bruno






Re: gnulib-tool.py: Make data structures more clear.

2024-04-16 Thread Bruno Haible
Collin Funk wrote:
> Patch 0001 changes GLModuleTable's unconditionals member from a
> dictionary to a set.
> ...
> PEP 8 makes this recommendation for 'is not' and 'not ... is' for
> similar reasons [1].

Thanks. Both patches applied.

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-16 Thread Collin Funk
Hi Bruno,

On 4/16/24 8:09 AM, Bruno Haible wrote:
> I'm talking about this piece of code:
> 
> filetable = []
> for src in filelist:
> dest = self.rewrite_files([src])[-1]
> filetable.append(tuple([dest, src]))
> 
> which can be written as
> 
> filetable = [ tuple([self.rewrite_filename(src), src])
>   for src in filelist ]

Ah, okay now I see what you mean. I'll have a look at doing something
like that. Thanks!

Collin



Re: gnulib-tool.py: Make data structures more clear.

2024-04-16 Thread Collin Funk
On 4/16/24 9:06 AM, Collin Funk wrote:
> Patch 0001 changes GLModuleTable's unconditionals member from a
> dictionary to a set.

Forgot attachments, oops.

CollinFrom be8c8ce23d090269f52124630b1b0b4d6034052e Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Tue, 16 Apr 2024 08:21:27 -0700
Subject: [PATCH 1/2] gnulib-tool.py: Make data structures more clear.

* pygnulib/GLModuleSystem.py (GLModuleTable.__init__): Use a set to
represent the unconditional modules instead of a dictionary. Remove
redundant comments.
(GLModuleTable.addUnconditional): Add the module to a set instead of
using it as a key to the dictionary.
---
 ChangeLog  |  9 +
 pygnulib/GLModuleSystem.py | 16 
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8418f71093..bb91326403 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2024-04-16  Collin Funk  
+
+	gnulib-tool.py: Make data structures more clear.
+	* pygnulib/GLModuleSystem.py (GLModuleTable.__init__): Use a set to
+	represent the unconditional modules instead of a dictionary. Remove
+	redundant comments.
+	(GLModuleTable.addUnconditional): Add the module to a set instead of
+	using it as a key to the dictionary.
+
 2024-04-15  Collin Funk  
 
 	gnulib-tool.py: Optimize directory creation.
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 82b3c2f6e5..11f8934026 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -727,13 +727,13 @@ def __init__(self, config: GLConfig, inc_all_direct_tests: bool, inc_all_indirec
   returns the condition when B should be enabled as a dependency of A,
   once the m4 code for A has been executed.
 '''
-self.dependers = dict()  # Dependencies
-self.conditionals = dict()  # Conditional modules
-self.unconditionals = dict()  # Unconditional modules
-self.base_modules = []  # Base modules
-self.main_modules = []  # Main modules
-self.tests_modules = []  # Tests modules
-self.final_modules = []  # Final modules
+self.dependers = dict()
+self.conditionals = dict()
+self.unconditionals = set()
+self.base_modules = []
+self.main_modules = []
+self.tests_modules = []
+self.final_modules = []
 if type(config) is not GLConfig:
 raise TypeError('config must be a GLConfig, not %s'
 % type(config).__name__)
@@ -781,7 +781,7 @@ def addUnconditional(self, module: GLModule) -> None:
 if type(module) is not GLModule:
 raise TypeError('module must be a GLModule, not %s'
 % type(module).__name__)
-self.unconditionals[str(module)] = True
+self.unconditionals.add(str(module))
 if str(module) in self.dependers:
 self.dependers.pop(str(module))
 
-- 
2.44.0

From 352d793b06942978eb0ce8c713348c55768d3a3b Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Tue, 16 Apr 2024 08:36:55 -0700
Subject: [PATCH 2/2] gnulib-tool.py: Prefer 'not in' over 'not ... in'.

* pygnulib/GLEmiter.py (GLEmiter.autoconfSnippet): Change conditional.
* pygnulib/GLModuleSystem.py (GLModuleTable.addConditional): Likewise.
---
 ChangeLog  | 6 ++
 pygnulib/GLEmiter.py   | 2 +-
 pygnulib/GLModuleSystem.py | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bb91326403..186c8a6b39 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-04-16  Collin Funk  
+
+	gnulib-tool.py: Prefer 'not in' over 'not ... in'.
+	* pygnulib/GLEmiter.py (GLEmiter.autoconfSnippet): Change conditional.
+	* pygnulib/GLModuleSystem.py (GLModuleTable.addConditional): Likewise.
+
 2024-04-16  Collin Funk  
 
 	gnulib-tool.py: Make data structures more clear.
diff --git a/pygnulib/GLEmiter.py b/pygnulib/GLEmiter.py
index 558f130bbb..ff4e3027d6 100644
--- a/pygnulib/GLEmiter.py
+++ b/pygnulib/GLEmiter.py
@@ -218,7 +218,7 @@ def autoconfSnippet(self, module: GLModule, toplevel: bool, disable_libtool: boo
 if str(module) in ['gnumakefile', 'maintainer-makefile']:
 # These modules are meant to be used only in the top-level directory.
 flag = toplevel
-else:  # if not str(module) in ['gnumakefile', 'maintainer-makefile']
+else:  # if str(module) not in ['gnumakefile', 'maintainer-makefile']
 flag = True
 if flag:
 snippet = module.getAutoconfSnippet()
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 11f8934026..3148f921e1 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -767,7 +767,7 @@ def addConditional(self, parent: GLModule, module: GLModule, condition: str | bo
 if not (type(condition) is str or condition == True):
 raise TypeError('condition must be a string or True, not %s'
 % 

Re: gnulib-tool.py: Remove a redundant function.

2024-04-16 Thread Bruno Haible
Hi Collin,

> But I think the idea of the patch is still
> correct. Since it doesn't make sense to accept a list and then only
> use it with one element lists.

Sure. This code structure comes from the fact that in the shell
implementation, the rewriting of file names is done through a 'sed' invocation,
and that is equally suited to a single file name or a list of file names.

> > Also, the last hunk makes use of yet another Python built-in function 'zip',
> > where list comprehension [ ... for ... in ... ] is more readable.
> 
> Maybe I am missing something, but I don't think there is a good way to
> use a list comprehension here without 'zip'. Since 'zip' is used to
> combine these two lists like so:

I'm talking about this piece of code:

filetable = []
for src in filelist:
dest = self.rewrite_files([src])[-1]
filetable.append(tuple([dest, src]))

which can be written as

filetable = [ tuple([self.rewrite_filename(src), src])
  for src in filelist ]

Bruno






Re: gnulib-tool.py: Optimize directory creation.

2024-04-15 Thread Bruno Haible
Hi Collin,

> This patch handles the directory creation upfront instead of checking
> for every single file, as you mentioned earlier [1].

Thanks! This looks good. Applied.

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Collin Funk
Hi Bruno,

On 4/15/24 7:58 AM, Bruno Haible wrote:
> Patch 0002 is not applicable because it relies on 0001, which was not good.

Yes, I need to rewrite it. But I think the idea of the patch is still
correct. Since it doesn't make sense to accept a list and then only
use it with one element lists.

> Also, the last hunk makes use of yet another Python built-in function 'zip',
> where list comprehension [ ... for ... in ... ] is more readable.

Maybe I am missing something, but I don't think there is a good way to
use a list comprehension here without 'zip'. Since 'zip' is used to
combine these two lists like so:

list1 = [ 1, 2, 3, 4 ]
list2 = [ 5, 6, 7, 8 ]
result = list(zip(list1, list2))
print(result)
[(1, 5), (2, 6), (3, 7), (4, 8)]

This still uses 'zip' but maybe you find it easier to read?

 result = [ (a, b) for
a, b in zip(list1, list2) ]

> In gnulib-tool.sh the directories are created ahead of the loop that
> copies the files. Why? Because when we have to create 500 files in the
> lib/ directory, it is faster to do 'if not isdir(dirname)' once than
> 500 times. This is also true in Python.

Ah, yes that makes sense. I'll go fix that now.

Collin



Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Bruno Haible
Hi Collin,

> Patch 0002 does this. GLTestDir also has this rewrite_files() function
> so I did the same there. Maybe it is worth making that a helper
> function or using a base class in the future.
> 
> Also, the set() and list() calls around zip(...) are important since
> zip() returns an iterator [1]. I've used whichever was most similar to
> the previous code.

Patch 0002 is not applicable because it relies on 0001, which was not good.

Also, the last hunk makes use of yet another Python built-in function 'zip',
where list comprehension [ ... for ... in ... ] is more readable.

> Patch 0003 removes a directories list that was unused. These are
> created in the loop below it as files are written.

In gnulib-tool.sh the directories are created ahead of the loop that
copies the files. Why? Because when we have to create 500 files in the
lib/ directory, it is faster to do 'if not isdir(dirname)' once than
500 times. This is also true in Python.

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Collin Funk
Hi Bruno,

On 4/15/24 4:47 AM, Bruno Haible wrote:
> No. I'm adding 3 unit tests that prove that the patch is wrong,
> one for each of docbase, sourcebase, testsbase. (For auxdir and m4base
> gnulib-tool.{sh,py} does not support changing the value while preserving
> the rest: For auxdir the old files are not removed, for m4base the list
> of modules gets reset to empty.)

Ah, I see. I should have diff'd the two functions like you. The
'self.config' and 'self.cache' was too easy for my eyes to overlook...
Thanks for adding the tests. My local branch fails but passes using
master, so I can see this change is incorrect.

I sent 2 other patches whenever you have time to check them. I can
rewrite patch 0002 as long as it is otherwise correct. It should use
rewrite_{new,old}_files and not the improper renamed function from
patch 0001.

Patch 0003 should work fine. Just an unused variable that is handled
elsewhere.

Collin



Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Bruno Haible
Collin Funk wrote:
> The GLImport class has two functions that are the same,
> GLImport.rewrite_old_files() and GLImport.rewrite_new_files().

No. When I copy these functions into separate text files and use 'diff'
on them:

$ diff -u 1 2
--- 1   2024-04-15 06:34:45.441369330 +0200
+++ 2   2024-04-15 06:35:01.941511954 +0200
@@ -1,7 +1,7 @@
 
-def rewrite_old_files(self, files: list[str]) -> list[str]:
-'''Replace auxdir, docbase, sourcebase, m4base and testsbase from 
default
-to their version from cached config.'''
+def rewrite_new_files(self, files: list[str]) -> list[str]:
+'''Replace auxdir, docbase, sourcebase, m4base and testsbase from
+default to their version from config.'''
 if type(files) is not list:
 raise TypeError('files argument must has list type, not %s'
 % type(files).__name__)
@@ -9,13 +9,11 @@
 if type(file) is not str:
 raise TypeError('each file must be a string instance')
 files = sorted(set(files))
-files = [ '%s%s' % (file, os.path.sep)
-  for file in files ]
-auxdir = self.cache['auxdir']
-docbase = self.cache['docbase']
-sourcebase = self.cache['sourcebase']
-m4base = self.cache['m4base']
-testsbase = self.cache['testsbase']
+auxdir = self.config['auxdir']
+docbase = self.config['docbase']
+sourcebase = self.config['sourcebase']
+m4base = self.config['m4base']
+testsbase = self.config['testsbase']
 result = []
 for file in files:
 if file.startswith('build-aux/'):

> Therefore, we can remove GLImport.rewrite_old_files() and rename
> GLImport.rewrite_new_files() to GLImport.rewrite_files().

No. I'm adding 3 unit tests that prove that the patch is wrong,
one for each of docbase, sourcebase, testsbase. (For auxdir and m4base
gnulib-tool.{sh,py} does not support changing the value while preserving
the rest: For auxdir the old files are not removed, for m4base the list
of modules gets reset to empty.)

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-14 Thread Collin Funk
On 4/14/24 6:23 PM, Collin Funk wrote:
> Also, I noticed we have:
> 
>  for src in old_files:
>  dest = self.rewrite_files([src])[-1]
>  old_table.append(tuple([dest, src]))
> 
> This is looping over a list, creating a new list with one item,
> calling GLImport.rewrite_files(), which then calls sorted(set(...))
> twice, and then appending the result to a list.
> 
> We should be able to create a new list from that function and zip()
> the two together. I'll submit another patch for that since it requires
> some sorting changes.

Patch 0002 does this. GLTestDir also has this rewrite_files() function
so I did the same there. Maybe it is worth making that a helper
function or using a base class in the future.

Also, the set() and list() calls around zip(...) are important since
zip() returns an iterator [1]. I've used whichever was most similar to
the previous code.

Patch 0003 removes a directories list that was unused. These are
created in the loop below it as files are written.

[1] https://docs.python.org/3/library/functions.html#zip

CollinFrom 3ca340c6a56d25e3e87786d12c04fcab2f8d0973 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Sun, 14 Apr 2024 18:55:36 -0700
Subject: [PATCH 2/3] gnulib-tool.py: Refactor file name transformations.

* pygnulib/GLImport.py (GLImport.rewrite_files): Don't sort and don't
remove duplicates.
(GLImport.prepare): Pass the the file list to rewrite_files and zip
it together the result.
* pygnulib/GLTestDir.py (GLTestDir.rewrite_files): Don't sort and don't
remove duplicates.
(GLTestDir.execute): Pass the the file list to rewrite_files and zip
it together the result.
---
 ChangeLog | 12 
 pygnulib/GLImport.py  | 15 +++
 pygnulib/GLTestDir.py |  8 ++--
 3 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 82c4a5f838..cf52f56b19 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-04-14  Collin Funk  
+
+	gnulib-tool.py: Refactor file name transformations.
+	* pygnulib/GLImport.py (GLImport.rewrite_files): Don't sort and don't
+	remove duplicates.
+	(GLImport.prepare): Pass the the file list to rewrite_files and zip
+	it together the result.
+	* pygnulib/GLTestDir.py (GLTestDir.rewrite_files): Don't sort and don't
+	remove duplicates.
+	(GLTestDir.execute): Pass the the file list to rewrite_files and zip
+	it together the result.
+
 2024-04-14  Collin Funk  
 
 	gnulib-tool.py: Remove a redundant function.
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index 430691efbd..ceacfbb232 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -323,7 +323,6 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 for file in files:
 if type(file) is not str:
 raise TypeError('each file must be a string instance')
-files = sorted(set(files))
 auxdir = self.config['auxdir']
 docbase = self.config['docbase']
 sourcebase = self.config['sourcebase']
@@ -348,7 +347,7 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 else:  # file is not a special file
 path = file
 result.append(os.path.normpath(path))
-return sorted(set(result))
+return result
 
 def actioncmd(self) -> str:
 '''Return command-line invocation comment.'''
@@ -918,16 +917,8 @@ def prepare(self) -> tuple[dict[str, list[str]], dict[str, str]]:
 transformers['aux'] = sed_transform_build_aux_file
 transformers['main'] = sed_transform_main_lib_file
 transformers['tests'] = sed_transform_testsrelated_lib_file
-old_table = []
-new_table = []
-for src in old_files:
-dest = self.rewrite_files([src])[-1]
-old_table.append(tuple([dest, src]))
-for src in new_files:
-dest = self.rewrite_files([src])[-1]
-new_table.append(tuple([dest, src]))
-old_table = sorted(set(old_table))
-new_table = sorted(set(new_table))
+old_table = sorted(set(zip(self.rewrite_files(old_files), old_files)))
+new_table = sorted(set(zip(self.rewrite_files(new_files), new_files)))
 
 # Prepare the filetable.
 filetable = dict()
diff --git a/pygnulib/GLTestDir.py b/pygnulib/GLTestDir.py
index a7709a1259..dee8d629dc 100644
--- a/pygnulib/GLTestDir.py
+++ b/pygnulib/GLTestDir.py
@@ -138,7 +138,6 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 for file in files:
 if type(file) is not str:
 raise TypeError('each file must be a string instance')
-files = sorted(set(files))
 auxdir = self.config['auxdir']
 docbase = self.config['docbase']
 sourcebase = self.config['sourcebase']
@@ -163,7 +162,7 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 else:  # file is not a special file
 path = file
 

Re: gnulib-tool.py: Stop using codecs.open

2024-04-14 Thread Collin Funk
Hi Bruno,

On 4/14/24 3:16 PM, Bruno Haible wrote:
> I verified that on Cygwin, the test suite passes; this is because
>   - Cygwin programs produce LF as line terminator,
>   - Python's platform.system() returns "CYGWIN_NT-10.0".
> 
> Regarding native Windows, I don't think there's a realistic use-case,
> as users would have to have a working autoconf and automake first,
> which is unlikely for that platform. So, I don't even spend time
> testing gnulib-tool on native Windows.

Thanks for testing that. As far as native Windows goes, that makes
sense. I know Perl and Python work there, but I have never tried
Autoconf and Automake. I think in the past I used MYS2 to test builds
[1].

> And regarding z/OS — its newline conventions are that much of a mess [1]
> that it's better not to even think about it.

Interesting read. I've never used z/OS, or much of the other
proprietary operating systems for that matter (e.g. HP-UX, AIX).

[1] https://www.msys2.org

Collin



Re: gnulib-tool.py: Fix incorrect type hint.

2024-04-14 Thread Bruno Haible
Collin Funk wrote:
> The 'filelist' passed to
> 'filter_filelist' should be marked 'list[str]' and not 'str'.

Thanks, applied.






Re: gnulib-tool.py: Stop using codecs.open

2024-04-14 Thread Bruno Haible
Hi Collin,

> You mentioned removing the constants.nlconvert() stuff in
> an earlier email [1]. How about these two patches?

I verified that on Cygwin, the test suite passes; this is because
  - Cygwin programs produce LF as line terminator,
  - Python's platform.system() returns "CYGWIN_NT-10.0".

Regarding native Windows, I don't think there's a realistic use-case,
as users would have to have a working autoconf and automake first,
which is unlikely for that platform. So, I don't even spend time
testing gnulib-tool on native Windows.

And regarding z/OS — its newline conventions are that much of a mess [1]
that it's better not to even think about it.

> Patch 0001 removes nlconvert and all of its calls. That should make
> sure gnulib-tool.py and gnulib-tool.sh deal with newlines the same
> way.
> 
> Patch 0002 removes the 'NL' constant.

Thanks! Both patches applied.

> I assume this was to match gnulib-tool.sh's use of "$nl"?

Yes. In a shell script, use of literal newlines produces hard-to-read code.

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnu-libiconv/2023-04/msg00010.html






Re: gnulib-tool.py: Remove some unused variables.

2024-04-14 Thread Bruno Haible
Hi Collin,

> There are a few cases of statements like this:
> 
>   var = None
>   try:
>   # do something
>   var = True
>except:
>   var = False
> or:
> 
>  var = None
>  if other_condition:
>  var = True
>  else:
>  var = False
> 
> Since the 'var' will be set to a bool unconditionally, the None is
> never used.

Yup. It seems that I did not know about [1] when I wrote this code.

Thanks for the patch. Applied with corrected wording in ChangeLog:
When you remove
   var = None
here, you are not removing a variable, but an assignment.

Bruno

[1] 
https://stackoverflow.com/questions/2829528/whats-the-scope-of-a-variable-initialized-in-an-if-statement






Re: gnulib-tool.py: Stop using codecs.open

2024-04-14 Thread Collin Funk
Hi Bruno,

On 4/13/24 4:08 AM, Bruno Haible wrote:
> It seems that codecs.open is frowned upon, nowadays [1],
> and that the Python 3 way of opening a file is a built-in function 'open' [2].

Thanks for this patch. When I started working on gnulib-tool.py I
didn't even know the codecs module existed, since I had never used
Python 2.

> Let's use this consistently. With newline='\n' in order to match what
> gnulib-tool.sh does.

Sounds good. You mentioned removing the constants.nlconvert() stuff in
an earlier email [1]. How about these two patches?

Patch 0001 removes nlconvert and all of its calls. That should make
sure gnulib-tool.py and gnulib-tool.sh deal with newlines the same
way.

Patch 0002 removes the 'NL' constant. I assume this was to match
gnulib-tool.sh's use of "$nl"? But there it is used to expand to a
newline character without breaking lines right? For example, this:

 func_append license_incompatibilities "$module $license$nl"

vs. this:

 func_append license_incompatibilities "$module $license
"

Since we use a mix of '\n' and constants.NL, I would just like to use
'\n' everwhere consistently.

> Specifying encoding='utf-8' is what makes the most sense today. If a package
> still has a configure.ac or Makefile.am in ISO-8859-1 encoding, this will
> probably fail. Let's see if people report a problem with this; it should be
> very rare.

Sounds good.

[1] https://lists.gnu.org/archive/html/bug-gnulib/2024-03/msg00370.html

CollinFrom fbdd4e7783b8a4b6e637bac2b392024b0e880622 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Sun, 14 Apr 2024 10:18:03 -0700
Subject: [PATCH 1/2] gnulib-tool.py: Don't perform newline conversions.

* pygnulib/constants.py (nlconvert): Remove function. Remove unused
platform import.
* pygnulib/GLImport.py (GLImport.gnulib_cache): Remove calls to
nlconvert().
* pygnulib/GLModuleSystem.py
(GLModule.getAutomakeSnippet_Unconditional): Likewise.
* pygnulib/GLTestDir.py (GLTestDir.execute, GLMegaTestDir.execute):
Likewise.
---
 ChangeLog  | 12 
 pygnulib/GLImport.py   |  2 +-
 pygnulib/GLModuleSystem.py |  1 -
 pygnulib/GLTestDir.py  |  6 --
 pygnulib/constants.py  | 10 --
 5 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 9b40f3d9d8..b5c16eae4e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-04-14  Collin Funk  
+
+	gnulib-tool.py: Don't perform newline conversions.
+	* pygnulib/constants.py (nlconvert): Remove function. Remove unused
+	platform import.
+	* pygnulib/GLImport.py (GLImport.gnulib_cache): Remove calls to
+	nlconvert().
+	* pygnulib/GLModuleSystem.py
+	(GLModule.getAutomakeSnippet_Unconditional): Likewise.
+	* pygnulib/GLTestDir.py (GLTestDir.execute, GLMegaTestDir.execute):
+	Likewise.
+
 2024-04-14  Collin Funk  
 
 	gnulib-tool.py: Remove some unused variables.
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index eadf45828d..eaff520fa4 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -598,7 +598,7 @@ def gnulib_cache(self) -> str:
 if vc_files != None:
 # Convert Python bools to shell (True -> true).
 emit += 'gl_VC_FILES([%s])\n' % str(vc_files).lower()
-return constants.nlconvert(emit)
+return emit
 
 def gnulib_comp(self, filetable: dict[str, list[str]], gentests: bool) -> str:
 '''Emit the contents of generated $m4base/gnulib-comp.m4 file.
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 2551e316e1..11cd6ef78a 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -645,7 +645,6 @@ def getAutomakeSnippet_Unconditional(self) -> str:
for filename in buildaux_files.split(constants.NL) ]
 result += 'EXTRA_DIST += %s' % ' '.join(buildaux_files)
 result += '\n\n'
-result = constants.nlconvert(result)
 self.cache['makefile-unconditional'] = result
 return self.cache['makefile-unconditional']
 
diff --git a/pygnulib/GLTestDir.py b/pygnulib/GLTestDir.py
index dc70e8d304..a7709a1259 100644
--- a/pygnulib/GLTestDir.py
+++ b/pygnulib/GLTestDir.py
@@ -404,7 +404,6 @@ def execute(self) -> None:
 if file.startswith('m4/'):
 file = constants.substart('m4/', '', file)
 emit += 'EXTRA_DIST += %s\n' % file
-emit = constants.nlconvert(emit)
 with open(destfile, mode='w', newline='\n', encoding='utf-8') as file:
 file.write(emit)
 
@@ -522,7 +521,6 @@ def execute(self) -> None:
 emit += 'AH_TOP([#include \"../config.h\"])\n\n'
 emit += 'AC_CONFIG_FILES([Makefile])\n'
 emit += 'AC_OUTPUT\n'
-emit = constants.nlconvert(emit)
 path = joinpath(self.testdir, testsbase, 'configure.ac')
 with open(path, mode='w', newline='\n', encoding='utf-8') as file:
 

Re: gnulib-tool.py: Don't use mutable default arguments.

2024-04-14 Thread Bruno Haible
Hi Collin,

> Here is a test program to demonstrate:
> 
> 
> #!/usr/bin/env python3
> 
> def function(arg1, arg2 = dict()):
> arg2[arg1] = 0
> print(arg2)
> 
> function('one')
> function('two')
> function('three')
> 
> 
> When executing the following is printed:
> 
>  {'one': 0}
>  {'one': 0, 'two': 0}
>  {'one': 0, 'two': 0, 'three': 0}

Oh, this is surprising. I would have expected that the default value
expression gets evaluated each time, not once only.

Thanks for the observation! Patch applied.

Bruno






Re: gnulib-tool.py: Fix extra arguments to function call.

2024-04-13 Thread Bruno Haible
Collin Funk wrote:
> I see a warning for this section of code because isfile() is called
> with two arguments. It looks like the correct way to write this is to
> joinpath() the two arguments

Yes, I agree. Thanks! Applied.

Bruno






Re: gnulib-tool.py: Fix extra arguments to function call.

2024-04-13 Thread Collin Funk
On 4/13/24 10:12 AM, Collin Funk wrote:
> diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
> index d44ceedcec..098bbc59ac 100644
> --- a/pygnulib/GLImport.py
> +++ b/pygnulib/GLImport.py
> @@ -1230,7 +1230,7 @@ AC_DEFUN([%s_FILE_LIST], [\n''' % macro_prefix

Also, I found out that adding '*.py diff=python' to .gitattributes
will give a better diff hunk header [1]:

diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index d44ceedcec..098bbc59ac 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -1230,7 +1230,7 @@ def execute(self, filetable: dict[str, list[str]], 
transformers: dict[str, str])

[1] 
https://github.com/git/git/blob/8f7582d995682f785e80e344197cc715e6bc7d8e/userdiff.c#L270

Collin



Re: gnulib-tool.py: Refactor directory tree removals

2024-04-13 Thread Bruno Haible
Hi Collin,

> > +try:
> > +shutil.rmtree(dest)
> > +except FileNotFoundError:
> > +pass
> 
> You should be able to use 'shutil.rmtree(dest, ignore_errors=True)'
> here [1]. Unless you have a reason for not doing so that I missed.

It can be useful to get PermissionError explicitly signalled.
I know that 'rm -rf' will just print an error message to stderr in this case.
If we ever find that we need better error handling than what 'rm -rf'
provides, we are ready to use shutil.rmtree.

Bruno






Re: gnulib-tool.py: Refactor directory tree removals

2024-04-13 Thread Collin Funk
Hi Bruno,

On 4/13/24 3:17 AM, Bruno Haible wrote:
> +def rmtree(dest: str) -> None:
> +'''Removes the file or directory tree at dest, if it exists.'''
> +# These two implementations are nearly equivalent.
> +# Speed: 'rm -rf' can be a little faster.
> +# Exceptions: shutil.rmtree raises Python exceptions, e.g. 
> PermissionError.
> +if True:
> +sp.run(['rm', '-rf', dest], shell=False)
> +else:
> +try:
> +shutil.rmtree(dest)
> +except FileNotFoundError:
> +pass

You should be able to use 'shutil.rmtree(dest, ignore_errors=True)'
here [1]. Unless you have a reason for not doing so that I missed.

>>> shutil.rmtree('bad-directory-name')
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib64/python3.12/shutil.py", line 775, in rmtree
onexc(os.lstat, path, err)
  File "/usr/lib64/python3.12/shutil.py", line 773, in rmtree
orig_st = os.lstat(path, dir_fd=dir_fd)
  ^
FileNotFoundError: [Errno 2] No such file or directory: 'bad-directory-name'
>>> shutil.rmtree('bad-directory-name', ignore_errors=True)
>>> shutil.rmtree('bad-directory-name', ignore_errors=True)

[1] https://docs.python.org/3/library/shutil.html#shutil.rmtree

Collin



Re: gnulib-tool.py in gettext

2024-04-12 Thread Bruno Haible
Hi Collin,

> This patch fixes the issue for me.

Thanks! I confirm, it's fixed for me too. Applied.

Bruno






Re: gnulib-tool.py in gettext

2024-04-12 Thread Collin Funk
Hi Bruno,

On 4/10/24 4:26 AM, Bruno Haible wrote:
> The Python implementation has created a tests/ directory instead of a
> gettext-tools/tests/ directory.

This patch fixes the issue for me. It looks like the GLConfig had it's
destdir set *after* the files were already copied.

CollinFrom be1f5c620cbf207038594c4718731a7927f1abfe Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Fri, 12 Apr 2024 18:34:49 -0700
Subject: [PATCH] gnulib-tool.py: Fix --copy-file directory creation.

Reported by Bruno Haible in


* pygnulib/main.py (main): Make sure that destdir is set in the GLConfig
object before copying files.
---
 ChangeLog   | 8 
 gnulib-tool.py.TODO | 2 --
 pygnulib/main.py| 2 +-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8f5904281c..80679d0743 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-04-12  Collin Funk  
+
+	gnulib-tool.py: Fix --copy-file directory creation.
+	Reported by Bruno Haible in
+	
+	* pygnulib/main.py (main): Make sure that destdir is set in the GLConfig
+	object before copying files.
+
 2024-04-12  Bruno Haible  
 
 	gnulib-tool.py: Implement --add-import --with-*-tests correctly.
diff --git a/gnulib-tool.py.TODO b/gnulib-tool.py.TODO
index 88e92150f4..2db551d4e0 100644
--- a/gnulib-tool.py.TODO
+++ b/gnulib-tool.py.TODO
@@ -3,8 +3,6 @@
 
 Bugs:
   - test-cache-2-*.sh test suite failures
-  - --copy-file creates wrong directory
-https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00140.html
   - error message missing when gl_DOC_BASE missing
 https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00160.html
 
diff --git a/pygnulib/main.py b/pygnulib/main.py
index f654d5b109..4b64305049 100644
--- a/pygnulib/main.py
+++ b/pygnulib/main.py
@@ -1339,12 +1339,12 @@ def main() -> None:
 except FileExistsError:
 pass
 # Copy the file.
+config.setDestDir(destdir)
 assistant = GLFileAssistant(config)
 tmpfile = assistant.tmpfilename(destpath)
 copyfile(lookedup, tmpfile)
 ensure_writable(tmpfile)
 assistant.setOriginal(srcpath)
-assistant.config.setDestDir(destdir)
 assistant.setRewritten(destpath)
 if isfile(joinpath(destdir, destpath)):
 # The file already exists.
-- 
2.44.0



Re: gnulib-tool.py in jugtail

2024-04-12 Thread Collin Funk
Hi Bruno,

On 4/12/24 3:28 PM, Bruno Haible wrote:
> I am applying these two refactorings. Hope it makes this piece of code
> easier to understand.

Thanks!

It looks like you got all the cache tests passing. Nice work.

Collin



Re: gnulib-tool.py in jugtail

2024-04-12 Thread Bruno Haible
Collin Funk wrote:
> Then only set it to the cache or
> default in GLImport.__init__(). But that section of code is already
> sort of difficult to follow...

I am applying these two refactorings. Hope it makes this piece of code
easier to understand.

>From 27c5b31a42ccd5353ccff09d46dbaa0d105f0f0b Mon Sep 17 00:00:00 2001
From: Bruno Haible 
Date: Sat, 13 Apr 2024 00:10:44 +0200
Subject: [PATCH 1/2] gnulib-tool.py: Optimize.

* pygnulib/GLConfig.py (GLConfig.update, GLConfig.update_key): Avoid
useless cloning of dictionaries.
---
 ChangeLog|  6 ++
 pygnulib/GLConfig.py | 24 +++-
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b25819a4aa..212a78b305 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-04-12  Bruno Haible  
+
+	gnulib-tool.py: Optimize.
+	* pygnulib/GLConfig.py (GLConfig.update, GLConfig.update_key): Avoid
+	useless cloning of dictionaries.
+
 2024-04-12  Bruno Haible  
 
 	gnulib-tool.py: Implement --no-conditional-dependencies correctly.
diff --git a/pygnulib/GLConfig.py b/pygnulib/GLConfig.py
index e938b30a5e..515047d3f0 100644
--- a/pygnulib/GLConfig.py
+++ b/pygnulib/GLConfig.py
@@ -275,21 +275,27 @@ class GLConfig:
 if type(dictionary) is not GLConfig:
 raise TypeError('dictionary must be a GLConfig, not %s'
 % type(dictionary).__name__)
-dictionary = dict(dictionary.table)
+dictionary = dictionary.table
 result = dict()
 for key in dictionary:
 src = self.table[key]
 dest = dictionary[key]
-result[key] = src
-if src != dest:
+# Merge src and dest into a single value.
+if src == dest:
+value = src
+else:
 if self.isdefault(key, src):
-result[key] = dest
+value = dest
 else:  # if not self.isdefault(key, src)
-if not self.isdefault(key, dest):
+if self.isdefault(key, dest):
+value = src
+else:  # if not self.isdefault(key, dest)
 if key in ['modules', 'avoids', 'tests']:
-dest = sorted(set(src + dest))
-result[key] = dest
-self.table = dict(result)
+value = sorted(set(src + dest))
+else:
+value = dest
+result[key] = value
+self.table = result
 
 def update_key(self, dictionary: GLConfig, key: str) -> None:
 '''Update the given key using value from the given dictionary.'''
@@ -297,7 +303,7 @@ class GLConfig:
 if type(dictionary) is not GLConfig:
 raise TypeError('dictionary must be a GLConfig, not %s'
 % type(dictionary).__name__)
-dictionary = dict(dictionary.table)
+dictionary = dictionary.table
 self.table[key] = dictionary[key]
 else:  # if key not in self.table
 raise KeyError('GLConfig does not contain key: %s' % repr(key))
-- 
2.34.1

>From 48f615b71b7299e7416ca46db076b7e61820a2f2 Mon Sep 17 00:00:00 2001
From: Bruno Haible 
Date: Sat, 13 Apr 2024 00:24:52 +0200
Subject: [PATCH 2/2] gnulib-tool.py: Refactor.

* pygnulib/GLConfig.py (GLConfig.update, GLConfig.update_key): Improve
variable names and comments.
* pygnulib/GLImport.py (GLImport.__init__): Improve comments.
---
 ChangeLog|  7 +++
 pygnulib/GLConfig.py | 24 
 pygnulib/GLImport.py | 30 +++---
 3 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 212a78b305..4d5725f843 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-12  Bruno Haible  
+
+	gnulib-tool.py: Refactor.
+	* pygnulib/GLConfig.py (GLConfig.update, GLConfig.update_key): Improve
+	variable names and comments.
+	* pygnulib/GLImport.py (GLImport.__init__): Improve comments.
+
 2024-04-12  Bruno Haible  
 
 	gnulib-tool.py: Optimize.
diff --git a/pygnulib/GLConfig.py b/pygnulib/GLConfig.py
index 515047d3f0..ab1cb218be 100644
--- a/pygnulib/GLConfig.py
+++ b/pygnulib/GLConfig.py
@@ -270,12 +270,12 @@ class GLConfig:
 table = copy.deepcopy(self)
 return table
 
-def update(self, dictionary: GLConfig) -> None:
-'''Specify the dictionary whose keys will be used to update config.'''
-if type(dictionary) is not GLConfig:
-raise TypeError('dictionary must be a GLConfig, not %s'
-% type(dictionary).__name__)
-dictionary = dictionary.table
+def update(self, other_config: GLConfig) -> None:
+'''Merges all non-default values from other_config into this config.'''
+if type(other_config) is not GLConfig:
+raise 

Re: gnulib-tool.py in jugtail

2024-04-12 Thread Collin Funk
On 4/12/24 5:08 AM, Bruno Haible wrote:
> OK, if you want to rework this part, we need unit tests for it first.
> I have now added unit tests for the gnulib-cache.m4 handling; essentially,
> one test for each possible option.
> 
> Before dealing with the 'jugtail' package, it would be good to make
> these tests work. Currently, a number of them fails:

Sounds good. Thanks for adding them.

Collin



Re: gnulib-tool.py in jugtail

2024-04-12 Thread Bruno Haible
Hi Collin,

> It looks like the reading of gnulib-cache.m4 will need to be reworked.

OK, if you want to rework this part, we need unit tests for it first.
I have now added unit tests for the gnulib-cache.m4 handling; essentially,
one test for each possible option.

Before dealing with the 'jugtail' package, it would be good to make
these tests work. Currently, a number of them fails:

$ ./test-all.sh PASS: test-hello-c-gnulib-1.sh
PASS: test-hello-c-gnulib-automakesubdir-1.sh
PASS: test-hello-c-gnulib-automakesubdir-withtests-1.sh
PASS: test-hello-c-gnulib-conddeps-1.sh
PASS: test-hello-c-gnulib-nonrecursive-1.sh
PASS: test-cache-1-1.sh
PASS: test-cache-1-2.sh
PASS: test-cache-1-3.sh
PASS: test-cache-1-4.sh
PASS: test-cache-1-5.sh
PASS: test-cache-1-6.sh
Files ./test-cache-1-7.result/m4/gnulib-cache.m4 and 
tmp3074241-result/m4/gnulib-cache.m4 differ
FAIL: gnulib-tool's result has unexpected differences.
FAIL: test-cache-1-7.sh
Files ./test-cache-1-8.result/lib/Makefile.am and 
tmp3074263-result/lib/Makefile.am differ
Files ./test-cache-1-8.result/m4/gnulib-cache.m4 and 
tmp3074263-result/m4/gnulib-cache.m4 differ
FAIL: gnulib-tool's result has unexpected differences.
FAIL: test-cache-1-8.sh
PASS: test-cache-1-9.sh
PASS: test-cache-1-10.sh
PASS: test-cache-1-11.sh
PASS: test-cache-1-12.sh
PASS: test-cache-1-13.sh
PASS: test-cache-1-14.sh
PASS: test-cache-1-15.sh
PASS: test-cache-1-16.sh
PASS: test-cache-1-17.sh
PASS: test-cache-1-18.sh
PASS: test-cache-1-19.sh
PASS: test-cache-1-20.sh
PASS: test-cache-1-21.sh
PASS: test-cache-1-22.sh
PASS: test-cache-1-23.sh
PASS: test-cache-1-24.sh
PASS: test-cache-1-25.sh
PASS: test-cache-1-26.sh
PASS: test-cache-1-27.sh
PASS: test-cache-1-28.sh
PASS: test-cache-1-29.sh
PASS: test-cache-1-30.sh
./test-cache-1-31.err tmp3074843-err differ: byte 1, line 1
--- ./test-cache-1-31.err   2024-04-12 00:02:39.013521019 +0200
+++ tmp3074843-err  2024-04-12 14:03:20.355689866 +0200
@@ -1 +1 @@
-gnulib-tool: warning: --po-domain has no effect without a --po-base option
+.../gnulib-tool.py: warning: --po-domain has no effect without a --po-base 
option
FAIL: gnulib-tool's error output has unexpected differences.
FAIL: test-cache-1-31.sh
PASS: test-cache-1-32.sh
PASS: test-cache-1-33.sh
PASS: test-cache-1-34.sh
Only in tmp3074942-result/lib: Makefile.am
Only in ./test-cache-2-1.result/lib: Makefile.gnulib
Files ./test-cache-2-1.result/m4/gnulib-cache.m4 and 
tmp3074942-result/m4/gnulib-cache.m4 differ
Only in tmp3074942-result: support
Only in tmp3074942-result/tests: Makefile.am
Only in ./test-cache-2-1.result/tests: Makefile.gnulib
FAIL: gnulib-tool's result has unexpected differences.
FAIL: test-cache-2-1.sh
Only in tmp3074970-result/lib: Makefile.am
Only in ./test-cache-2-2.result/lib: Makefile.gnulib
Files ./test-cache-2-2.result/m4/gnulib-cache.m4 and 
tmp3074970-result/m4/gnulib-cache.m4 differ
Only in tmp3074970-result: support
Only in tmp3074970-result/tests: Makefile.am
Only in ./test-cache-2-2.result/tests: Makefile.gnulib
FAIL: gnulib-tool's result has unexpected differences.
FAIL: test-cache-2-2.sh
Only in tmp3074998-result/lib: Makefile.am
Only in ./test-cache-2-3.result/lib: Makefile.gnulib
Files ./test-cache-2-3.result/m4/gnulib-cache.m4 and 
tmp3074998-result/m4/gnulib-cache.m4 differ
Only in tmp3074998-result: support
Only in tmp3074998-result/tests: Makefile.am
Only in ./test-cache-2-3.result/tests: Makefile.gnulib
FAIL: gnulib-tool's result has unexpected differences.
FAIL: test-cache-2-3.sh
Only in tmp3075026-result/lib: Makefile.am
Only in ./test-cache-2-4.result/lib: Makefile.gnulib
Files ./test-cache-2-4.result/m4/gnulib-cache.m4 and 
tmp3075026-result/m4/gnulib-cache.m4 differ
Only in tmp3075026-result: support
Only in tmp3075026-result/tests: Makefile.am
Only in ./test-cache-2-4.result/tests: Makefile.gnulib
FAIL: gnulib-tool's result has unexpected differences.
FAIL: test-cache-2-4.sh
Only in tmp3075054-result/lib: Makefile.am
Only in ./test-cache-2-5.result/lib: Makefile.gnulib
Only in ./test-cache-2-5.result/m4: ansi-c++.m4
Only in ./test-cache-2-5.result/m4: assert_h.m4
Only in ./test-cache-2-5.result/m4: c-bool.m4
Only in ./test-cache-2-5.result/m4: codeset.m4
Files ./test-cache-2-5.result/m4/gnulib-cache.m4 and 
tmp3075054-result/m4/gnulib-cache.m4 differ
Files ./test-cache-2-5.result/m4/gnulib-comp.m4 and 
tmp3075054-result/m4/gnulib-comp.m4 differ
Only in ./test-cache-2-5.result/m4: inttypes.m4
Only in ./test-cache-2-5.result/m4: limits-h.m4
Only in ./test-cache-2-5.result/m4: locale-fr.m4
Only in ./test-cache-2-5.result/m4: multiarch.m4
Only in ./test-cache-2-5.result/m4: std-gnu11.m4
Only in ./test-cache-2-5.result/m4: stdalign.m4
Only in ./test-cache-2-5.result/m4: stdint.m4
Only in ./test-cache-2-5.result/m4: stdlib_h.m4
Only in ./test-cache-2-5.result/m4: wchar_h.m4
Only in ./test-cache-2-5.result/m4: wint_t.m4
Only in tmp3075054-result: support
Only in ./test-cache-2-5.result: tests
FAIL: gnulib-tool's result has unexpected 

Re: gnulib-tool.py: Simplify regular expressions.

2024-04-11 Thread Bruno Haible
Hi Collin,

> > Oh, this means that r'[x\$]' contains dollar and backslash, whereas the
> > programmer might have thought that it contains only the dollar? Indeed,
> > it's worth to listen to these warnings!
> 
> I don't think it changes the meaning:
> 
> import re
> re.match(r'[x$]*', 'x\\$').group()
> 'x'
> re.match(r'[x\$]*', 'x\\$').group()
> 'x'
> re.match(r'[x\\$]*', 'x\\$').group()
> 'x\\$'

Oh, so it's not as bad as I thought. I had taken your statement

  "And in the set the special characters have their special meaning dropped,
   so there is no need to backslash them."

at face value, and got confused. Actually the correct statement is

  "And in the set the special characters, other than backslash, have their
   special meaning dropped,
   so there is no need to backslash them."

Bruno






Re: gnulib-tool.py: Simplify regular expressions.

2024-04-11 Thread Collin Funk
On 4/11/24 12:25 PM, Bruno Haible wrote:
> Oh, this means that r'[x\$]' contains dollar and backslash, whereas the
> programmer might have thought that it contains only the dollar? Indeed,
> it's worth to listen to these warnings!

I don't think it changes the meaning:

import re
re.match(r'[x$]*', 'x\\$').group()
'x'
re.match(r'[x\$]*', 'x\\$').group()
'x'
re.match(r'[x\\$]*', 'x\\$').group()
'x\\$'

Since in Python you have to backslash a backslash in string literals.
Therefore to backslash a backslash you would have to do ''. Since
we are using raw strings a backslashed backslash is r'\\'. The Python
Regular expression page has a far better explanation of that [1]. :)

# Regular string.
re.match('', '\\').group()
'\\'
print(re.match('', '\\').group())
\
# Raw string.
re.match(r'\\', '\\').group()
'\\'
print(re.match(r'\\', '\\').group())
\

[1] https://docs.python.org/3/library/re.html#module-re

Collin



Re: gnulib-tool.py: Optimize module set lookups

2024-04-11 Thread Collin Funk
Hi Bruno,

On 4/11/24 12:51 PM, Bruno Haible wrote:
> The func_transitive_closure function in the shell implementation can take
> a lot of time. So I wondered whether in the Python implementation, there
> is room for speedup at this place as well. And indeed, there is.

Nice work. I noticed this too, which was my reasoning for wanting to
change GLModuleSystem to use sets to store the modules instead of
lists.

Doing so would break the sorting required for the test cases though.
All modules are sorted except for the dummy module which is placed at
the end.

I ended up just changing the avoided modules since it was easy and
every module + it's dependencies are checked:

while inmodules:
inmodules_this_round = inmodules
inmodules = []   # Accumulator, queue for next round
for module in inmodules_this_round:
if module not in self.avoids:  # Important line here.
outmodules.append(module)

Collin



Re: gnulib-tool.py: Simplify regular expressions.

2024-04-11 Thread Bruno Haible
Hi Collin,

> I decided to try PyCharm again since I remember liking it when I used
> it ~1 year ago. It seems that it has pretty good warnings for regular
> expressions.

Both patches applied; thanks.

> Patch 0002 removes some redundant backslashing. I am pretty sure most
> of these were introduced by me. We don't need to backslash ']' when it
> is the first character in the '[...]' set. And in the set the special
> characters have their special meaning dropped, so there is no need to
> backslash them.

Oh, this means that r'[x\$]' contains dollar and backslash, whereas the
programmer might have thought that it contains only the dollar? Indeed,
it's worth to listen to these warnings!

Bruno






Re: gnulib-tool.py current status

2024-04-11 Thread Bruno Haible
Collin Funk wrote:
> * Packages successfully tested with gnulib-tool.py
> 
>   bison
>   coreutils
>   cppi
>   cpio
>   diffutils
>   findutils
>   freedink
>   Update AC_PREREQ to 2.64 required.
>   gnutls
>   grep
>   groff
>   gzip
>   inetutils
>   libiconv
>   mailutils
>   patch
>   pspp
>   sed
>   tar
>   wget
>   wget2

And I have tested these with GNULIB_TOOL_IMPL=sh+py. (A few of these tests
may have been ineffective, for packages that insist to run gnulib-tool
from a particular gnulib revision. But I don't think this matters much.)

  acct
  barcode
  bison
  coreutils
  cpio
  CSSC
  cvs
  datamash
  dico
  diffutils
  findutils
  freedink
  gcal
  gettext REPORTED: creates an extra 'tests' directory
  gengetopt
  gmediaserver
  gnuit
  groff
  grub
  gsasl
  gsequencer
  gss
  grep
  guile
  gzip
  hello
  idutils
  inetutils
  jugtail REPORTED: missing --docbase not detected
  jwhois
  libffcall
  libiconv
  libidn
  libtasn1
  libtool
  libunistring
  licenseutils
  m4
  mailutils
  mini-httpd
  myserver
  nano
  octave
  parted
  patch
  pdf
  poke
  pspp
  radius
  recutils
  reindeer
  rcs
  sed
  shishi
  tar
  texinfo
  time
  vc-dwim
  wdiff
  wget
  wget2
  zile
  gawk

Bruno






Re: gnulib-tool.py in wget

2024-04-11 Thread Collin Funk
On 4/11/24 3:51 AM, Bruno Haible wrote:
> => Applied. Thanks.

Nice! Thanks for writing the ChangeLog entry.

Collin



Re: gnulib-tool.py in jugtail

2024-04-11 Thread Collin Funk
On 4/10/24 7:23 PM, Bruno Haible wrote:
> $ cat ../glpyFfCgil-sh-err
> .../gnulib-tool.sh: *** missing --doc-base option. --doc-base has been 
> introduced on 2006-07-11; if your last invocation of 'gnulib-tool 
> --import' is before that date, you need to run 'gnulib-tool --import' 
> once, with a --doc-base option.
> .../gnulib-tool.sh: *** Stop.
> $ cat ../glpyFfCgil-py-err
> gnulib-tool: warning: module malloc doesn't exist

I just started looking at this. It looks like the reading of
gnulib-cache.m4 will need to be reworked.

In gnulib-tool.sh line 5203:

# The docbase defaults to the cached one.
if test -z "$docbase"; then
  docbase="$cached_docbase"
  if test -z "$docbase"; then
 # Long error message here.
  fi
fi

The corresponding section would be GLImport.__init__(), but we set the
default docbase in main.py line 943, before creating the GLImport:

if not docbase:
docbase = 'doc'

So we can't check for the empty string. I guess we docbase an empty
string if --docbase isn't used. Then only set it to the cache or
default in GLImport.__init__(). But that section of code is already
sort of difficult to follow...

Collin



Re: gnulib-tool.py in wget

2024-04-11 Thread Bruno Haible
Hi Collin,

> Can you test the attached patch for me? It is just an addition of a
> sorted() call, which should hopefully fix it.

The patch works fine. It fixes gnulib-tool.py in the packages
  wget
  wdiff
  freedink

=> Applied. Thanks.

Bruno






  1   2   3   >