Re: gnulib-tool.py: Remove a redundant function.

2024-04-16 Thread Collin Funk
Hi Bruno,

On 4/16/24 8:09 AM, Bruno Haible wrote:
> I'm talking about this piece of code:
> 
> filetable = []
> for src in filelist:
> dest = self.rewrite_files([src])[-1]
> filetable.append(tuple([dest, src]))
> 
> which can be written as
> 
> filetable = [ tuple([self.rewrite_filename(src), src])
>   for src in filelist ]

Ah, okay now I see what you mean. I'll have a look at doing something
like that. Thanks!

Collin



Re: gnulib-tool.py: Remove a redundant function.

2024-04-16 Thread Bruno Haible
Hi Collin,

> But I think the idea of the patch is still
> correct. Since it doesn't make sense to accept a list and then only
> use it with one element lists.

Sure. This code structure comes from the fact that in the shell
implementation, the rewriting of file names is done through a 'sed' invocation,
and that is equally suited to a single file name or a list of file names.

> > Also, the last hunk makes use of yet another Python built-in function 'zip',
> > where list comprehension [ ... for ... in ... ] is more readable.
> 
> Maybe I am missing something, but I don't think there is a good way to
> use a list comprehension here without 'zip'. Since 'zip' is used to
> combine these two lists like so:

I'm talking about this piece of code:

filetable = []
for src in filelist:
dest = self.rewrite_files([src])[-1]
filetable.append(tuple([dest, src]))

which can be written as

filetable = [ tuple([self.rewrite_filename(src), src])
  for src in filelist ]

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Collin Funk
Hi Bruno,

On 4/15/24 7:58 AM, Bruno Haible wrote:
> Patch 0002 is not applicable because it relies on 0001, which was not good.

Yes, I need to rewrite it. But I think the idea of the patch is still
correct. Since it doesn't make sense to accept a list and then only
use it with one element lists.

> Also, the last hunk makes use of yet another Python built-in function 'zip',
> where list comprehension [ ... for ... in ... ] is more readable.

Maybe I am missing something, but I don't think there is a good way to
use a list comprehension here without 'zip'. Since 'zip' is used to
combine these two lists like so:

list1 = [ 1, 2, 3, 4 ]
list2 = [ 5, 6, 7, 8 ]
result = list(zip(list1, list2))
print(result)
[(1, 5), (2, 6), (3, 7), (4, 8)]

This still uses 'zip' but maybe you find it easier to read?

 result = [ (a, b) for
a, b in zip(list1, list2) ]

> In gnulib-tool.sh the directories are created ahead of the loop that
> copies the files. Why? Because when we have to create 500 files in the
> lib/ directory, it is faster to do 'if not isdir(dirname)' once than
> 500 times. This is also true in Python.

Ah, yes that makes sense. I'll go fix that now.

Collin



Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Bruno Haible
Hi Collin,

> Patch 0002 does this. GLTestDir also has this rewrite_files() function
> so I did the same there. Maybe it is worth making that a helper
> function or using a base class in the future.
> 
> Also, the set() and list() calls around zip(...) are important since
> zip() returns an iterator [1]. I've used whichever was most similar to
> the previous code.

Patch 0002 is not applicable because it relies on 0001, which was not good.

Also, the last hunk makes use of yet another Python built-in function 'zip',
where list comprehension [ ... for ... in ... ] is more readable.

> Patch 0003 removes a directories list that was unused. These are
> created in the loop below it as files are written.

In gnulib-tool.sh the directories are created ahead of the loop that
copies the files. Why? Because when we have to create 500 files in the
lib/ directory, it is faster to do 'if not isdir(dirname)' once than
500 times. This is also true in Python.

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Collin Funk
Hi Bruno,

On 4/15/24 4:47 AM, Bruno Haible wrote:
> No. I'm adding 3 unit tests that prove that the patch is wrong,
> one for each of docbase, sourcebase, testsbase. (For auxdir and m4base
> gnulib-tool.{sh,py} does not support changing the value while preserving
> the rest: For auxdir the old files are not removed, for m4base the list
> of modules gets reset to empty.)

Ah, I see. I should have diff'd the two functions like you. The
'self.config' and 'self.cache' was too easy for my eyes to overlook...
Thanks for adding the tests. My local branch fails but passes using
master, so I can see this change is incorrect.

I sent 2 other patches whenever you have time to check them. I can
rewrite patch 0002 as long as it is otherwise correct. It should use
rewrite_{new,old}_files and not the improper renamed function from
patch 0001.

Patch 0003 should work fine. Just an unused variable that is handled
elsewhere.

Collin



Re: gnulib-tool.py: Remove a redundant function.

2024-04-15 Thread Bruno Haible
Collin Funk wrote:
> The GLImport class has two functions that are the same,
> GLImport.rewrite_old_files() and GLImport.rewrite_new_files().

No. When I copy these functions into separate text files and use 'diff'
on them:

$ diff -u 1 2
--- 1   2024-04-15 06:34:45.441369330 +0200
+++ 2   2024-04-15 06:35:01.941511954 +0200
@@ -1,7 +1,7 @@
 
-def rewrite_old_files(self, files: list[str]) -> list[str]:
-'''Replace auxdir, docbase, sourcebase, m4base and testsbase from 
default
-to their version from cached config.'''
+def rewrite_new_files(self, files: list[str]) -> list[str]:
+'''Replace auxdir, docbase, sourcebase, m4base and testsbase from
+default to their version from config.'''
 if type(files) is not list:
 raise TypeError('files argument must has list type, not %s'
 % type(files).__name__)
@@ -9,13 +9,11 @@
 if type(file) is not str:
 raise TypeError('each file must be a string instance')
 files = sorted(set(files))
-files = [ '%s%s' % (file, os.path.sep)
-  for file in files ]
-auxdir = self.cache['auxdir']
-docbase = self.cache['docbase']
-sourcebase = self.cache['sourcebase']
-m4base = self.cache['m4base']
-testsbase = self.cache['testsbase']
+auxdir = self.config['auxdir']
+docbase = self.config['docbase']
+sourcebase = self.config['sourcebase']
+m4base = self.config['m4base']
+testsbase = self.config['testsbase']
 result = []
 for file in files:
 if file.startswith('build-aux/'):

> Therefore, we can remove GLImport.rewrite_old_files() and rename
> GLImport.rewrite_new_files() to GLImport.rewrite_files().

No. I'm adding 3 unit tests that prove that the patch is wrong,
one for each of docbase, sourcebase, testsbase. (For auxdir and m4base
gnulib-tool.{sh,py} does not support changing the value while preserving
the rest: For auxdir the old files are not removed, for m4base the list
of modules gets reset to empty.)

Bruno






Re: gnulib-tool.py: Remove a redundant function.

2024-04-14 Thread Collin Funk
On 4/14/24 6:23 PM, Collin Funk wrote:
> Also, I noticed we have:
> 
>  for src in old_files:
>  dest = self.rewrite_files([src])[-1]
>  old_table.append(tuple([dest, src]))
> 
> This is looping over a list, creating a new list with one item,
> calling GLImport.rewrite_files(), which then calls sorted(set(...))
> twice, and then appending the result to a list.
> 
> We should be able to create a new list from that function and zip()
> the two together. I'll submit another patch for that since it requires
> some sorting changes.

Patch 0002 does this. GLTestDir also has this rewrite_files() function
so I did the same there. Maybe it is worth making that a helper
function or using a base class in the future.

Also, the set() and list() calls around zip(...) are important since
zip() returns an iterator [1]. I've used whichever was most similar to
the previous code.

Patch 0003 removes a directories list that was unused. These are
created in the loop below it as files are written.

[1] https://docs.python.org/3/library/functions.html#zip

CollinFrom 3ca340c6a56d25e3e87786d12c04fcab2f8d0973 Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Sun, 14 Apr 2024 18:55:36 -0700
Subject: [PATCH 2/3] gnulib-tool.py: Refactor file name transformations.

* pygnulib/GLImport.py (GLImport.rewrite_files): Don't sort and don't
remove duplicates.
(GLImport.prepare): Pass the the file list to rewrite_files and zip
it together the result.
* pygnulib/GLTestDir.py (GLTestDir.rewrite_files): Don't sort and don't
remove duplicates.
(GLTestDir.execute): Pass the the file list to rewrite_files and zip
it together the result.
---
 ChangeLog | 12 
 pygnulib/GLImport.py  | 15 +++
 pygnulib/GLTestDir.py |  8 ++--
 3 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 82c4a5f838..cf52f56b19 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-04-14  Collin Funk  
+
+	gnulib-tool.py: Refactor file name transformations.
+	* pygnulib/GLImport.py (GLImport.rewrite_files): Don't sort and don't
+	remove duplicates.
+	(GLImport.prepare): Pass the the file list to rewrite_files and zip
+	it together the result.
+	* pygnulib/GLTestDir.py (GLTestDir.rewrite_files): Don't sort and don't
+	remove duplicates.
+	(GLTestDir.execute): Pass the the file list to rewrite_files and zip
+	it together the result.
+
 2024-04-14  Collin Funk  
 
 	gnulib-tool.py: Remove a redundant function.
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index 430691efbd..ceacfbb232 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -323,7 +323,6 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 for file in files:
 if type(file) is not str:
 raise TypeError('each file must be a string instance')
-files = sorted(set(files))
 auxdir = self.config['auxdir']
 docbase = self.config['docbase']
 sourcebase = self.config['sourcebase']
@@ -348,7 +347,7 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 else:  # file is not a special file
 path = file
 result.append(os.path.normpath(path))
-return sorted(set(result))
+return result
 
 def actioncmd(self) -> str:
 '''Return command-line invocation comment.'''
@@ -918,16 +917,8 @@ def prepare(self) -> tuple[dict[str, list[str]], dict[str, str]]:
 transformers['aux'] = sed_transform_build_aux_file
 transformers['main'] = sed_transform_main_lib_file
 transformers['tests'] = sed_transform_testsrelated_lib_file
-old_table = []
-new_table = []
-for src in old_files:
-dest = self.rewrite_files([src])[-1]
-old_table.append(tuple([dest, src]))
-for src in new_files:
-dest = self.rewrite_files([src])[-1]
-new_table.append(tuple([dest, src]))
-old_table = sorted(set(old_table))
-new_table = sorted(set(new_table))
+old_table = sorted(set(zip(self.rewrite_files(old_files), old_files)))
+new_table = sorted(set(zip(self.rewrite_files(new_files), new_files)))
 
 # Prepare the filetable.
 filetable = dict()
diff --git a/pygnulib/GLTestDir.py b/pygnulib/GLTestDir.py
index a7709a1259..dee8d629dc 100644
--- a/pygnulib/GLTestDir.py
+++ b/pygnulib/GLTestDir.py
@@ -138,7 +138,6 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 for file in files:
 if type(file) is not str:
 raise TypeError('each file must be a string instance')
-files = sorted(set(files))
 auxdir = self.config['auxdir']
 docbase = self.config['docbase']
 sourcebase = self.config['sourcebase']
@@ -163,7 +162,7 @@ def rewrite_files(self, files: list[str]) -> list[str]:
 else:  # file is not a special file
 path = file