Re: WIP patch to reduce rewrite_filename duplication

2024-04-25 Thread Collin Funk
On 4/25/24 5:14 PM, Bruno Haible wrote:
> Yes, the other heuristic is that when many functions operate on the
> same object, like methods in class do, the pointer to that object
> is often passed first. And, as you noticed, the two heuristics collide...
> 
> Here, I would not think of 'rewrite_file_name' as a method pertaining
> to the class GLConfig. Rather, GLConfig is just one of the parameters.

Yeah, that makes sense. I'll keep that in mind.

> Perfect. OK to push.

Done.

> Correct. Feel free to add tests for --copy-file, if you like.
> I didn't spend time on it, since --copy-file is simple enough.

Sure, I can have a look at adding some. I see it is part of the Gnulib
Savannah group so I can commit changes there. I assume same rules go
there too, always send patches for review?

> Ah, that's because exitfail.h was modified in Gnulib a few days ago.
> Will adapt the override in gettext. Thanks for the notice.

Thanks! I see now that the #ifdef __cplusplus change messed up the
patch.

Collin



Re: WIP patch to reduce rewrite_filename duplication

2024-04-25 Thread Bruno Haible
Hi Collin,

> > Yep, that's the right way to do it. Maybe file_name should come first
> > (by the usual heuristic that the argument that shows most variation
> > comes first)?
> 
> Sure, that is fine with me. I think the other way is based on how I
> like to write C. Usually I like to place structures (or pointers to
> them) first. The best example I can think of off the top of my head
> is:
> 
> int fprintf (FILE *stream, const char *format, ...);
> 
> vs.
> 
>int fputs (const char *str, FILE *stream);
> 
> I always feel awkward using 'fputs' because of that.

Yes, the other heuristic is that when many functions operate on the
same object, like methods in class do, the pointer to that object
is often passed first. And, as you noticed, the two heuristics collide...

Here, I would not think of 'rewrite_file_name' as a method pertaining
to the class GLConfig. Rather, GLConfig is just one of the parameters.

> How does this patch look?

Perfect. OK to push.

> I ran all tests with Python 3.7 and they pass. I don't think
> gnulib-tool --copy-file is tested there though.

Correct. Feel free to add tests for --copy-file, if you like.
I didn't spend time on it, since --copy-file is simple enough.

> I tried to use gettext
> since I remember autogen.sh does it a few times but I see the
> following failure:
> 
> $ cd gettext
> $ env GNULIB_TOOL_IMPL=sh+py sh -x ./autogen.sh
> [...]
> Copying file gnulib-lib/exitfail.c
> 1 out of 1 hunk FAILED -- saving rejects to file /tmp/glDMpS4d/exitfail.h.rej
> /home/collin/.local/src/gnulib/gnulib-tool.sh: *** patch file 
> gnulib-local/lib/exitfail.h.diff didn't apply cleanly
> /home/collin/.local/src/gnulib/gnulib-tool.sh: *** Stop.
> + exit 1

Ah, that's because exitfail.h was modified in Gnulib a few days ago.
Will adapt the override in gettext. Thanks for the notice.

Bruno






Re: WIP patch to reduce rewrite_filename duplication

2024-04-25 Thread Collin Funk
Hi Bruno,

On 4/25/24 1:18 PM, Bruno Haible wrote:
> Yep, that's the right way to do it. Maybe file_name should come first
> (by the usual heuristic that the argument that shows most variation
> comes first)?

Sure, that is fine with me. I think the other way is based on how I
like to write C. Usually I like to place structures (or pointers to
them) first. The best example I can think of off the top of my head
is:

int fprintf (FILE *stream, const char *format, ...);

vs.

   int fputs (const char *str, FILE *stream);

I always feel awkward using 'fputs' because of that. But that is a
different language so I can adapt. :)

How does this patch look? I'll wait for you to review it before
pushing any changes.

I ran all tests with Python 3.7 and they pass. I don't think
gnulib-tool --copy-file is tested there though. I tried to use gettext
since I remember autogen.sh does it a few times but I see the
following failure:

$ cd gettext
$ env GNULIB_TOOL_IMPL=sh+py sh -x ./autogen.sh
[...]
Copying file gnulib-lib/exitfail.c
1 out of 1 hunk FAILED -- saving rejects to file /tmp/glDMpS4d/exitfail.h.rej
/home/collin/.local/src/gnulib/gnulib-tool.sh: *** patch file 
gnulib-local/lib/exitfail.h.diff didn't apply cleanly
/home/collin/.local/src/gnulib/gnulib-tool.sh: *** Stop.
+ exit 1

I assume that is unrelated to these changes and will be an easy fix
for you. When I remove the diff everything passes with
GNULIB_TOOL_IMPL=sh+py.

CollinFrom 95f83aabfcd2cd0624ae85357f7f40f34bb1a01b Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Thu, 25 Apr 2024 15:30:29 -0700
Subject: [PATCH] gnulib-tool.py: Reduce code duplication in file name
 transformations.

* pygnulib/functions.py: New file for shared functions between modules.
Add a function based on functions removed from GLImport and GLTestDir.
Accepts a single file name instead of a list.
* pygnulib/GLImport.py (GLImport.prepare): Use the new function.
(GLImport.rewrite_new_files, GLImport.rewrite_old_files): Remove
functions.
* pygnulib/GLTestDir.py (GLTestDir.execute): Use the new function.
(GLTestDir.rewrite_files): Remove functions.
* pygnulib/main.py (main): Remove unused function import. Use the new
function.
---
 ChangeLog | 14 +++
 pygnulib/GLImport.py  | 97 +--
 pygnulib/GLTestDir.py | 55 +---
 pygnulib/functions.py | 52 +++
 pygnulib/main.py  | 17 +---
 5 files changed, 88 insertions(+), 147 deletions(-)
 create mode 100644 pygnulib/functions.py

diff --git a/ChangeLog b/ChangeLog
index abefd3bae2..276258b885 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2024-04-25  Collin Funk  
+
+	gnulib-tool.py: Reduce code duplication in file name transformations.
+	* pygnulib/functions.py: New file for shared functions between modules.
+	Add a function based on functions removed from GLImport and GLTestDir.
+	Accepts a single file name instead of a list.
+	* pygnulib/GLImport.py (GLImport.prepare): Use the new function.
+	(GLImport.rewrite_new_files, GLImport.rewrite_old_files): Remove
+	functions.
+	* pygnulib/GLTestDir.py (GLTestDir.execute): Use the new function.
+	(GLTestDir.rewrite_files): Remove functions.
+	* pygnulib/main.py (main): Remove unused function import. Use the new
+	function.
+
 2024-04-25  Bruno Haible  
 
 	doc: Remove documentation of IRIX as supported platform.
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index cf1ebd10ec..47c0e83555 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -38,6 +38,7 @@
 relativize,
 rmtree,
 )
+from .functions import rewrite_file_name
 from .GLError import GLError
 from .GLConfig import GLConfig
 from .GLModuleSystem import GLModuleTable
@@ -303,80 +304,6 @@ def __repr__(self) -> str:
 result = '' % hex(id(self))
 return result
 
-def rewrite_old_files(self, files: list[str]) -> list[str]:
-'''Replace auxdir, docbase, sourcebase, m4base and testsbase from default
-to their version from cached config.'''
-if type(files) is not list:
-raise TypeError('files argument must has list type, not %s'
-% type(files).__name__)
-for file in files:
-if type(file) is not str:
-raise TypeError('each file must be a string instance')
-files = sorted(set(files))
-files = [ '%s%s' % (file, os.path.sep)
-  for file in files ]
-auxdir = self.cache['auxdir']
-docbase = self.cache['docbase']
-sourcebase = self.cache['sourcebase']
-m4base = self.cache['m4base']
-testsbase = self.cache['testsbase']
-result = []
-for file in files:
-if file.startswith('build-aux/'):
-path = substart('build-aux/', '%s/' % auxdir, file)
-elif file.startswith('doc/'):
-path = substart('doc/', '%s/' % docbase, file)
-elif file.startswith('lib/'):

Re: WIP patch to reduce rewrite_filename duplication

2024-04-25 Thread Bruno Haible
Hi Collin,

> > IMO, the creation of a rewrite_file_name method (singular! you mentioned
> > a couple of days ago that the ability to rewrite a single file name is
> > one of the motivations for this refactoring) in a central place goes in
> > the correct direction. However, moving knowledge about 'cache' and 'config'
> > into GLFileTable does not.
> 
> Sure, I agree. Though I'm not sure the best place to put it. All of
> our non-private functions are in constants.py, but this function seems
> out of place there. Since all of the functions there are string
> helpers and such. In other words, no GL* classes are imported there.

I agree, it's out-of-place in constants.py.

> Maybe it would be best to create a functions.py file?

Sounds reasonable, yes.

> For the actual function it would be something like this:
> 
> def rewrite_file_name(config: GLConfig,
>   file_name: str,
>   also_tests_lib: bool = False) -> str:
> ...
> if also_tests_lib and file_name.startswith('tests=lib/'):
># OK

Yep, that's the right way to do it. Maybe file_name should come first
(by the usual heuristic that the argument that shows most variation
comes first)?

Bruno






Re: RFC: Remove documentation of IRIX as supported platform

2024-04-25 Thread Bruno Haible
Paul Eggert wrote:
> This sounds good to me.

Done through the patch below.


2024-04-25  Bruno Haible  

doc: Remove documentation of IRIX as supported platform.
* doc/posix-headers/netdb.texi: Don't mention IRIX specific workarounds.
* doc/posix-headers/pthread.texi: Likewise.
* doc/posix-headers/sys_socket.texi: Likewise.
* doc/posix-headers/wctype.texi: Likewise.
* doc/posix-functions/btowc.texi: Likewise.
* doc/posix-functions/cbrtf.texi: Likewise.
* doc/posix-functions/cbrtl.texi: Likewise.
* doc/posix-functions/copysignf.texi: Likewise.
* doc/posix-functions/exp2.texi: Likewise.
* doc/posix-functions/exp2f.texi: Likewise.
* doc/posix-functions/exp2l.texi: Likewise.
* doc/posix-functions/expm1f.texi: Likewise.
* doc/posix-functions/expm1l.texi: Likewise.
* doc/posix-functions/fabsl.texi: Likewise.
* doc/posix-functions/isnan.texi: Likewise.
* doc/posix-functions/iswblank.texi: Likewise.
* doc/posix-functions/link.texi: Likewise.
* doc/posix-functions/log10l.texi: Likewise.
* doc/posix-functions/log1pf.texi: Likewise.
* doc/posix-functions/log2.texi: Likewise.
* doc/posix-functions/log2f.texi: Likewise.
* doc/posix-functions/log2l.texi: Likewise.
* doc/posix-functions/lseek.texi: Likewise.
* doc/posix-functions/nl_langinfo.texi: Likewise.
* doc/posix-functions/pthread_sigmask.texi: Likewise.
* doc/posix-functions/remainderf.texi: Likewise.
* doc/posix-functions/remainderl.texi: Likewise.
* doc/posix-functions/rintf.texi: Likewise.
* doc/posix-functions/sigaltstack.texi: Likewise.
* doc/posix-functions/strtod.texi: Likewise.
* doc/posix-functions/strtold.texi: Likewise.
* doc/posix-functions/vscanf.texi: Likewise.
* doc/posix-functions/wctob.texi: Likewise.
* doc/**/*.texi: Update.

https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=7941742a51e468b0f4ba07c395d0a46528352a81






Re: WIP patch to reduce rewrite_filename duplication

2024-04-25 Thread Collin Funk
Hi Bruno,

Thank you for the detailed explanations as always.

On 4/25/24 1:48 AM, Bruno Haible wrote:
> Object-oriented programming is not easy, and it comes with constant
> hesitations and deliberations.

:)

My least favorite part of my degree was object-oriented design and
database normalization, for similar reasons.

> Another, bigger warning sign is that you can no longer easily explain
> what the GLFileTable is.
> Before, it was a data structure that holds file lists during an operation.
> After, it is a helper class that holds file lists and also meddles with
> file name rewriting as required by the caller. (That's the best description
> I can come up with!)

I see what you mean. I guess the strange setters would be a symptom of
that warning sign that GLFileTable is doing to much.

The way you described the GLFileTable reminds me of the reasoning for
adding the 'dataclasses' module added in Python 3.7 [1]. Since we are
both very big fans of decorators of course. :)

> IMO, the creation of a rewrite_file_name method (singular! you mentioned
> a couple of days ago that the ability to rewrite a single file name is
> one of the motivations for this refactoring) in a central place goes in
> the correct direction. However, moving knowledge about 'cache' and 'config'
> into GLFileTable does not.

Sure, I agree. Though I'm not sure the best place to put it. All of
our non-private functions are in constants.py, but this function seems
out of place there. Since all of the functions there are string
helpers and such. In other words, no GL* classes are imported there.

Maybe it would be best to create a functions.py file?

> Neither of the two proprosals smells well. Why not add a boolean
> 'also_tests_lib' parameter to the rewrite_file_name method, that tells
> whether to replace 'tests=lib/' or not? This way, the refactoring does
> not introduce a functional difference.

Sorry, I think I worded that poorly. Since 'tests=lib/' is internally
used I was thinking that maybe we should warn/error if it is used. I
guess that isn't a problem worth losing sleep over.

For the actual function it would be something like this:

def rewrite_file_name(config: GLConfig,
  file_name: str,
  also_tests_lib: bool = False) -> str:
...
if also_tests_lib and file_name.startswith('tests=lib/'):
   # OK

Since 'tests=lib/' is used internally I felt it makes more sense to
default it to False. That way the call in main.py can ignore it and
the calls in GLImport.py and GLTestDir.py can use it internally.

> In a case like this, it's better if you post the patch for review, rather
> than push it too quickly. Trust me: OO takes a long time to learn.

Yes, I agree.

[1] https://docs.python.org/3/library/dataclasses.html#module-dataclasses

Collin



Re: GNULIB_REVISION

2024-04-25 Thread Paul Eggert

On 4/25/24 10:43, Simon Josefsson wrote:


https://gitlab.com/gsasl/inetutils/-/jobs/6706396721


That's an malloc failure on the server side. The savannah admins should 
be told about the issue soon after it happens. Routine clones of GNU 
projects shouldn't fail like that. (It's never happened to me, but then 
I don't clone from Savannah all that often - maybe Savannah is giving 
less server memory to clients that seem to hog?)




Btw, using --depth 1 is incompatible with gnulib's git-version-gen


We could fix that by not requiring git-version-gen for the Gnulib 
submodule. git-version-gen is pretty useless for that anyway, as for 
Gnulib it currently outputs a string like "0.1.7513-648ae" which is not 
much more useful than the commit ID 
"648aeb575db7af9ddd51cf17d1b56ee91e9ef3c5".



Isn't the real problem that we don't put (for example) gzip's own
commit ID into the coreutils tarball? If we did that, Gnulib's commit
ID would come for free, since it can be derived from gzip's commit ID.


I suppose you meant s/coreutils/gzip/


Yes, sorry.


SHA1 git
commits aren't long-term stable either, since SHA1 is broken


As you say, they're good enough for now. And anyway I would think SHA1 
is good enough longer term unless an adversary infects your Git 
repository (and in that case you probably have bigger problems...).




[PATCH] HACKING: Fix a typo.

2024-04-25 Thread Collin Funk
* HACKING: Add missing 't' to platforms.
---
 ChangeLog | 5 +
 HACKING   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 8d1e0e981b..e635e316b8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2024-04-25  Collin Funk  
+
+   HACKING: Fix a typo.
+   * HACKING: Add missing 't' to platforms.
+
 2024-04-25  Bruno Haible  
 
relocatable-lib-lgpl: Allow unconditional use of set_relocation_prefix.
diff --git a/HACKING b/HACKING
index 8ffe5d5f5e..964b880d20 100644
--- a/HACKING
+++ b/HACKING
@@ -82,7 +82,7 @@ Maintaining high quality
 
 It is a good idea to occasionally create a testdir of all of Gnulib:
   $ rm -rf ../testdir-all; ./gnulib-tool --create-testdir --dir=../testdir-all 
--with-c++-tests --without-privileged-tests --single-configure `./all-modules`
-and test this directory on various plaforms:
+and test this directory on various platforms:
   - Linux/glibc systems,
   - Linux/musl systems,
   - macOS,
-- 
2.44.0




Re: GNULIB_REVISION

2024-04-25 Thread Collin Funk
On 4/25/24 10:00 AM, Paul Eggert wrote:
> Is there some way to cajole 'git clone' into using less memory, with a
> '--depth 1' or similar options? Cloning shallowly would clone Gnulib a
> lot faster, if you're cloning from a remote repository.

Maybe '--single-branch' would help too. If space is really desperate
the snapshots on gitweb are pretty small.


https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=648aeb575db7af9ddd51cf17d1b56ee91e9ef3c5

7.6 MB for me vs. my full clone

$ du -s --human-readable .git
114M.git

Collin



Re: GNULIB_REVISION

2024-04-25 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> You raise several good points. A couple of quick reaction:
>
> On 2024-04-25 09:26, Simon Josefsson via Gnulib discussion list wrote:
>
>> - the gnulib git submodule is huge.  Not rarely I get out of memory
>>errors during 'git clone' in CI/CD jobs.
>
> First I've heard of this problem (other than with Git LFS, which
> Gnulib doesn't use). What part of the clone operation fails? Is this 
> server-side RAM, or client-side RAM, or something else? How much
> memory does 'git clone' require now for Gnulib?
>
> Is there some way to cajole 'git clone' into using less memory, with a
> '--depth 1' or similar options? Cloning shallowly would clone Gnulib a 
> lot faster, if you're cloning from a remote repository.

It only happens once in a while, for example:

https://gitlab.com/gsasl/inetutils/-/jobs/6706396721

This is gitlab's own git submodule checkout system working, and it is
using --depth 150 which shouldn't be too heavy, so it not even getting
to ./bootstrap's git clone.

Btw, using --depth 1 is incompatible with gnulib's git-version-gen: it
will not find a suitable version number for the build.  Even gitlab's
default depth of 50 (or if it was 100, can't remember) is often not
enough if you have >50 commits since the last release.  This cause
problems when building from 'git archive' tarballs.

>> - we don't offer any way for people receiving tarballs to learn which
>>gnulib git commit was used
>
> Isn't the real problem that we don't put (for example) gzip's own
> commit ID into the coreutils tarball? If we did that, Gnulib's commit
> ID would come for free, since it can be derived from gzip's commit ID.

I suppose you meant s/coreutils/gzip/, otherwise I don't follow?

Yes that is a good idea!  The git commit of the project should be part
of the announce-gen output.  The git tag name could be mentioned too.
Tags are not long-term stable since they can be moved later on, so I
think the full git commit id should be mentioned too.  Current SHA1 git
commits aren't long-term stable either, since SHA1 is broken, but at
least this approach is the best we can do right now and when we move to
SHA256 git things will be better.

/Simon


signature.asc
Description: PGP signature


Re: GNULIB_REVISION

2024-04-25 Thread Paul Eggert

You raise several good points. A couple of quick reaction:

On 2024-04-25 09:26, Simon Josefsson via Gnulib discussion list wrote:


- the gnulib git submodule is huge.  Not rarely I get out of memory
   errors during 'git clone' in CI/CD jobs.


First I've heard of this problem (other than with Git LFS, which Gnulib 
doesn't use). What part of the clone operation fails? Is this 
server-side RAM, or client-side RAM, or something else? How much memory 
does 'git clone' require now for Gnulib?


Is there some way to cajole 'git clone' into using less memory, with a 
'--depth 1' or similar options? Cloning shallowly would clone Gnulib a 
lot faster, if you're cloning from a remote repository.




- we don't offer any way for people receiving tarballs to learn which
   gnulib git commit was used


Isn't the real problem that we don't put (for example) gzip's own commit 
ID into the coreutils tarball? If we did that, Gnulib's commit ID would 
come for free, since it can be derived from gzip's commit ID.






Re: RFC: Remove documentation of IRIX as supported platform

2024-04-25 Thread Paul Eggert

This sounds good to me.

As far as IRIX goes, AC_SYS_LARGEFILE still works on 64-bit IRIX. Even 
32-bit IRIX should still work if you configure with CC='cc -n32', and 
this should be adequate for hobbyists who want to keep IRIX 5.3 going 
(some, by reverse engineering it! 
).




Re: GNULIB_REVISION

2024-04-25 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi Simon,
>
>> you can ... via
>> GNULIB_REVISION pick out exactly the gnulib git revision that libpaper
>> needs. ...
>> [1] 
>> https://blog.josefsson.org/2024/04/13/reproducible-and-minimal-source-only-tarballs/
>> [2] https://salsa.debian.org/auth-team/libntlm/-/tree/master/debian
>
> I see GNULIB_REVISION as an obsolete alternative to git submodules, and
> would therefore discourage rather than propagate its use.

I think it will be challenging for gnulib to insists on always being
used as a git submodule, and I would prefer if we continue support
multiple ways of working.  Personally I have been migrating towards
gnulib git submodules because most other projects use gnulib like that,
but I've never really felt comfortable with them.  Some of the concerns
I have:

- git submodules leads to -- in my subjective opinion -- complexity
  which leads to a worse user experience for developers.  I have learned
  to work with git submodules over the years, but it was a hurdle that I
  don't want to force on everyone.

- the gnulib git submodule is huge.  Not rarely I get out of memory
  errors during 'git clone' in CI/CD jobs.  I can restart the jobs
  manually, but this indicate that there is a resource drain here.  For
  a tiny project like libntlm the imbalance if the small project code
  and large gnulib is troubling.

- often CI/CD platforms have different ways of working with git
  submodules which adds complexity which leads to bugs.  Allowing
  maintainers to decide if they want to work with git submodules or not
  seems like a good thing.

- we don't offer any way for people receiving tarballs to learn which
  gnulib git commit was used (you noticed this too below) but with a
  GNULIB_REVISION approach this is part of the tarball, just like any
  other versioned dependency on autoconf, automake etc

- I think gnulib could be regarded as any other external dependency,
  just like autoconf, automake, libtool etc that also generate files in
  my build tree during bootstrapping.  I don't put autoconf as a git
  submodule, why should I put gnulib as one?

Granted, these concerns are a bit vague and subjective.

> Currently libntlm has this in its bootstrap.conf:
>
>   GNULIB_REVISION=dfb71172a46ef41f8cf8ab7ca529c1dd3097a41d
>
> and GNU make has this:
>
>   GNULIB_REVISION=stable-202307

Interesting.  This suggests the GNULIB_REVISION approach isn't the
entire solution either.

I think it is useful to record the gnulib git commit used to prepare a
tarball, and have that git commit id be part of the shipped tarball, and
stored inside the git repository.  The first use above achieve this, but
the second one doesn't (branches/tags are moving targets).

If I download the gzip tarball I can't find anywhere what gnulib commit
was used for bootstrapping.  It is quite cumbersome to verify that the
tarball didn't contain any modified gnulib code.  This is even harder
when projects INTENTIONALLY modify gnulib code compared to what's in
gnulib git, which coreutils and several others projects does through
gnulib *.diff/*.patch files.

Ultimately, I think there is an important use-case to build projects
directly from source code without having tarballs with pre-generated
files that are not reproduced by the user.

> The differences between both approaches are:
>
>   - GNULIB_REVISION works only with the 'bootstrap' program. The submodules
> approach works also without 'bootstrap'.

What use case are you thinking of?  The gnulib git commit information
consumers that I can think of are gnulib-aware.

>   - For GNULIB_REVISION, the user is on their own regarding tooling, aside
> from 'bootstrap'. In the submodules approach, the 'git' suite provides
> the tooling, and many developers are familiar with it.

Yes, but developers also like flexibility, and in some situations I
think the git approach is not the best way of working.

>   - .tar.gz files created by the gitweb "snapshot" link, by the cgit "refs >
> Download" section, or the GitHub "Download ZIP" button contain an empty
> directory in place of the submodule, and no information about the 
> revision.
> Whereas they contain the file with the GNULIB_REVISION assignment.

Indeed, this was the main challenge for me.  That is critical
information for anyone who wants to avoid touching tarballs with
pre-generated content.

>> I should write a post to debian-devel describing this pattern on
>> how to use gnulib in Debian packages
>
> It feels wrong to me if, in order to get meta-information about required
> dependencies of a package, Debian tools grep a particular file for a specific
> string. This approach is simply too limited.

Meta-information about dependencies are normally always hand-curated in
Debian (the Build-Depends: header).  The simplest solution is for the
Debian package maintainer to figure out which gnulib git commit version
was used for a release and pin that manually in the debian/rules
makefile.  If the 

Re: GNULIB_REVISION

2024-04-25 Thread Bruno Haible
Hi Simon,

> you can ... via
> GNULIB_REVISION pick out exactly the gnulib git revision that libpaper
> needs. ...
> [1] 
> https://blog.josefsson.org/2024/04/13/reproducible-and-minimal-source-only-tarballs/
> [2] https://salsa.debian.org/auth-team/libntlm/-/tree/master/debian

I see GNULIB_REVISION as an obsolete alternative to git submodules, and
would therefore discourage rather than propagate its use.

Currently libntlm has this in its bootstrap.conf:

  GNULIB_REVISION=dfb71172a46ef41f8cf8ab7ca529c1dd3097a41d

and GNU make has this:

  GNULIB_REVISION=stable-202307

Both can be done with git submodules. Git submodules do support branches [1],
and the command 'git submodule update --remote gnulib' updates the
submodule while staying on the indicated branch.

When using submodules, the target revision is stored in a versionable way,
that both gitweb and cgit can show appropriately [2][3]. If a branch is
used, it is stored in the .gitmodules file.

The differences between both approaches are:

  - GNULIB_REVISION works only with the 'bootstrap' program. The submodules
approach works also without 'bootstrap'.

  - For GNULIB_REVISION, the user is on their own regarding tooling, aside
from 'bootstrap'. In the submodules approach, the 'git' suite provides
the tooling, and many developers are familiar with it.

  - .tar.gz files created by the gitweb "snapshot" link, by the cgit "refs >
Download" section, or the GitHub "Download ZIP" button contain an empty
directory in place of the submodule, and no information about the revision.
Whereas they contain the file with the GNULIB_REVISION assignment.

> I should write a post to debian-devel describing this pattern on
> how to use gnulib in Debian packages

It feels wrong to me if, in order to get meta-information about required
dependencies of a package, Debian tools grep a particular file for a specific
string. This approach is simply too limited.

The correct way, IMO, would be that 'git' provides this meta-information,
either embedded in the .tar.gz generated by the web tooling, or in a
separate .tar.gz. AFAICT, 'git' currently does not have this ability.
Therefore we need to approach the 'git' team, in order to find a solution
that scales across the whole set of software package — not specific to
gnulib and not specific to 'bootstrap'.

Bruno

[1] 
https://stackoverflow.com/questions/1777854/how-can-i-specify-a-branch-tag-when-adding-a-git-submodule
[2] https://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=tree
[3] https://git.savannah.gnu.org/cgit/coreutils.git/tree/






Re: 'relocatable' project built without --enable-relocatable

2024-04-25 Thread Reuben Thomas
On Thu, 25 Apr 2024 at 14:07, Bruno Haible  wrote:

> OK, I'm committing as shown below.
>

Great, thanks!

> Is there a reason not to make a similar macro for compute_curr_prefix?
>
> Yes:
>   - For compute_curr_prefix, the need has not been demonstrated.
>   - Even if ENABLE_RELOCATABLE is 1, there are cases when
> compute_curr_prefix
> is not defined.
>

Good reasons.

-- 
https://rrt.sc3d.org


Re: 'relocatable' project built without --enable-relocatable

2024-04-25 Thread Bruno Haible
Reuben Thomas wrote:
> > Can you please try the attached patch?
> 
> Works beautifully, thanks.

OK, I'm committing as shown below.

> Is there a reason not to make a similar macro for compute_curr_prefix?

Yes:
  - For compute_curr_prefix, the need has not been demonstrated.
  - Even if ENABLE_RELOCATABLE is 1, there are cases when compute_curr_prefix
is not defined.


2024-04-25  Bruno Haible  

relocatable-lib-lgpl: Allow unconditional use of set_relocation_prefix.
Reported by Reuben Thomas  in
.
* lib/relocatable.h (set_relocation_prefix): Define in a dummy way if
ENABLE_RELOCATABLE is not defined.

diff --git a/lib/relocatable.h b/lib/relocatable.h
index 162f9d82a4..0c10ebe2a1 100644
--- a/lib/relocatable.h
+++ b/lib/relocatable.h
@@ -109,6 +109,8 @@ extern char * compute_curr_prefix (const char 
*orig_installprefix,
 #else
 
 /* By default, we use the hardwired pathnames.  */
+#define set_relocation_prefix(orig_prefix, curr_prefix) \
+  ((void) (orig_prefix), (void) (curr_prefix))
 #define relocate(pathname) (pathname)
 #define relocate2(pathname,allocatedp) (*(allocatedp) = NULL, (pathname))
 






Re: gnulib-tool.py: Add a new GLFileTable class.

2024-04-25 Thread Bruno Haible
Collin Funk wrote:
> for instance variables I
> rather put them outside __init__. It also serves as a decent place to
> add comments that don't belong in doc strings.

Yes. For instance variables that are used in several methods it would seem
odd to declare them in __init__. To me, __init__ is the place where they
are initialized, not declared.

Bruno






Re: RFC: Remove documentation of IRIX as supported platform

2024-04-25 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Since
>   - IRIX 6.5 is end-of-life for more than 10 years [1],
>   - I don't have access to an IRIX machine any more,
>   - the AC_SYS_LARGEFILE macro no longer supports IRIX,
> I would suggest to remove mentions of IRIX support from the
>   documentation.

+1

"On September 6, 2006, an SGI press release announced the end of the
MIPS and IRIX product lines.[7] Production ended on December 29, 2006,
with final deliveries in March 2007, except by special
arrangement. Support for these products ended in December 2013 and they
will receive no further updates."
-- https://en.wikipedia.org/wiki/IRIX

GCC 4.8.x from 2011 was the last release with official support for IRIX.

/Simon

> Like we did for OSF/1 in 2019:
>
> doc: Remove documentation of OSF/1 as supported platform.
> * doc/gnulib-intro.texi (Target Platforms): Mention that OSF/1 is
> unsupported.
> ...
> * doc/**/*.texi: Update.
>
> Any objections?
>
>Bruno
>
> [1] 
> https://git.savannah.gnu.org/gitweb/?p=gnulib/maint-tools.git;a=blob;f=end-of-life.txt
>
>
>
>
>


signature.asc
Description: PGP signature


Re: gnulib-tool.py: Add a new GLFileTable class.

2024-04-25 Thread Collin Funk
On 4/25/24 1:11 AM, Bruno Haible wrote:
> Interesting syntax. This makes the class easier to understand. You're welcome
> to do the same thing with the other classes (except for GLError.message, which
> is private to a single method).

I agree. I originally wanted to add it to GLModuleTable when I was
looking at it previously. I imagine that section of code is probably
the most difficult for someone new to understand compared to the rest
of gnulib-tool.py.

Before I made the changes to GLModule.__hash__ the types were
something like this:

self.dependers: dict[str, list[str]] # 'module-name' : ['dep1', 'dep2']
self.conditionals: dict[str, str | bool] # key is 'module1---module2'

Which doesn't really tell us much without looking more into the class.
After my changes to GLModule.__hash__ it is a bit better:

self.dependers: defaultdict[GLModule, set[GLModule]]
self.conditionals: dict[tuple[GLModule, GLModule], str | bool]

This syntax can also be used on assignments in __init__ or for normal
variables like this:

# type hinted 'self.dependers = []'.
self.dependers: defaultdict[GLModule, set[GLModule]] = []

Which helps mypy and other type checkers who cannot infer the type
otherwise. I felt it was a bit cluttered so for instance variables I
rather put them outside __init__. It also serves as a decent place to
add comments that don't belong in doc strings.

I'll probably add them once I get around to making variables private
as we discussed previously.

> We are not going to lower the version dependency:
>   - No one asked for it in the beta testing process since last week.
>   - Python 3.6 (and even 3.7) is already end-of-life [1].

Sounds good. In any case it is always good to double check quickly
before modernizing like this. I run all the tests with Python 3.7
before submitting patches for the same reason.

Collin



Re: WIP patch to reduce rewrite_filename duplication

2024-04-25 Thread Bruno Haible
Hi Collin,

Object-oriented programming is not easy, and it comes with constant
hesitations and deliberations.

> Less repeated code to maintain.

Yes, this is one of the goals to keep in mind.

The other most relevant questions are:

  - Can the purpose of a class be explained in simple terms?
  - Is a class properly decoupled from the code that uses the class?

  - Are we using the common design patterns (such as getters/setters,
class-private caching, etc.)?

Here you already have noticed a warning sign:
> The setters also take two arguments which feels a bit unorthodox to me.

Another, bigger warning sign is that you can no longer easily explain
what the GLFileTable is.
Before, it was a data structure that holds file lists during an operation.
After, it is a helper class that holds file lists and also meddles with
file name rewriting as required by the caller. (That's the best description
I can come up with!)

The two warning signs together indicate that, in this case, you are
trying to move too much logic from the GLImport and GLTestDir classes into
GLFileTable.

Don't get me wrong: Moving application code to helper classes can be OK
in other cases. But when done so, it should decouple the class and the
caller. Which is not being done here: The fact that the caller has a
'cache' and a 'config' variable had no impact on the GLFileTable class,
and after the patch it would.

> I haven't pushed this patch yet, but it goes in the correct direction,
> in my opinion.

IMO, the creation of a rewrite_file_name method (singular! you mentioned
a couple of days ago that the ability to rewrite a single file name is
one of the motivations for this refactoring) in a central place goes in
the correct direction. However, moving knowledge about 'cache' and 'config'
into GLFileTable does not.

> > 'tests=lib/' should not occur in the scope of mode == 'copy-file'.
> > But this string is an indicator inside gnulib-tool; users of gnulib-tool
> > should not be allowed to pass it.
> 
> Does it make more sense to check for a filename starting with
> 'tests=lib/' and error out or is it safe to assume the input is clean?

Neither of the two proprosals smells well. Why not add a boolean
'also_tests_lib' parameter to the rewrite_file_name method, that tells
whether to replace 'tests=lib/' or not? This way, the refactoring does
not introduce a functional difference.

> Ideally I would like to use a module-level function or @staticmethod
> there.

Module-level functions are OK. @staticmethod raises questions; in particular,
whether the method belongs there, where you are attempting to put it.

> Any opinions on these ideas? If not, I will probably sleep on it and
> push an improved patch.

In a case like this, it's better if you post the patch for review, rather
than push it too quickly. Trust me: OO takes a long time to learn.

   Bruno






Re: gnulib-tool.py: Add a new GLFileTable class.

2024-04-25 Thread Bruno Haible
Collin Funk wrote:
> I've applied the following patch adding a GLFileTable class as
> discussed here:
> 
> https://lists.gnu.org/archive/html/bug-gnulib/2024-04/msg00357.html

Thanks! Looks good.

> +class GLFileTable:
> +'''The GLFileTable class stores file information for the duration of the
> +operation being executed.'''
> +
> +all_files: list[str]
> +old_files: list[tuple[str, str]]
> +new_files: list[tuple[str, str]]
> +added_files: list[str]
> +removed_files: list[str]

Interesting syntax. This makes the class easier to understand. You're welcome
to do the same thing with the other classes (except for GLError.message, which
is private to a single method).

> This syntax was introduced in Python 3.6 which is okay for our
> dependency on Python 3.7. For older versions they can be removed or be
> placed in comments [1]. I've gone with it since I imagine there will
> be bigger fish to fry by lowering the version dependency.

We are not going to lower the version dependency:
  - No one asked for it in the beta testing process since last week.
  - Python 3.6 (and even 3.7) is already end-of-life [1].

Bruno

[1] https://en.wikipedia.org/wiki/History_of_Python#Table_of_versions






RFC: Remove documentation of IRIX as supported platform

2024-04-25 Thread Bruno Haible
Since
  - IRIX 6.5 is end-of-life for more than 10 years [1],
  - I don't have access to an IRIX machine any more,
  - the AC_SYS_LARGEFILE macro no longer supports IRIX,
I would suggest to remove mentions of IRIX support from the documentation.
Like we did for OSF/1 in 2019:

doc: Remove documentation of OSF/1 as supported platform.
* doc/gnulib-intro.texi (Target Platforms): Mention that OSF/1 is
unsupported.
...
* doc/**/*.texi: Update.

Any objections?

   Bruno

[1] 
https://git.savannah.gnu.org/gitweb/?p=gnulib/maint-tools.git;a=blob;f=end-of-life.txt






doc: Update target platforms list

2024-04-25 Thread Bruno Haible
This patch updates the list of target platforms. What has changed since last
year:
  - I cannot claim to be testing RHEL any more, since I don't have access
to a copy of it and since the previous clones (CentOS, AlmaLinux etc.)
can no longer sync with RHEL, due to [1].
  - FreeBSD's current major version is 14.
  - One of the AIX machines in the GCC compilefarm has been upgraded; I no
longer have access to AIX 7.2.
  - I no longer have access to an IRIX 6.5 machine.

[1] https://www.phoronix.com/news/Red-Hat-CentOS-Stream-Sources


2024-04-25  Bruno Haible  

doc: Update target platforms list.
* doc/gnulib-intro.texi (Supported Platforms): Mention FreeBSD 14
instead of 13. Mention AIX 7.3 instead of 7.2. Mention Cygwin 3.4.
Don't mention IRIX.
(Formerly Supported Platforms): Add IRIX.

diff --git a/doc/gnulib-intro.texi b/doc/gnulib-intro.texi
index c46ef8991b..7000fe199b 100644
--- a/doc/gnulib-intro.texi
+++ b/doc/gnulib-intro.texi
@@ -113,15 +113,15 @@
 @node Supported Platforms
 @subsection Supported Platforms
 
-As of 2023, the list of supported platforms is the following:
+As of 2024, the list of supported platforms is the following:
 
 @itemize
 @item
 glibc systems.  With glibc 2.19 or newer, they are frequently tested.
 @c [Not very relevant in the long term.]
-@c The distributions Ubuntu, Fedora, RHEL, Arch Linux are frequently tested.
-@c CentOS is occasionally tested.
-@c Debian, gNewSense, Trisquel, OpenSUSE are rarely tested.
+@c The distributions Ubuntu, Debian, Fedora, CentOS are frequently tested.
+@c Arch Linux is occasionally tested.
+@c gNewSense, Trisquel, OpenSUSE are rarely tested.
 About the kernels:
 @itemize
 @item
@@ -129,27 +129,27 @@
 @item
 glibc on kFreeBSD is rarely tested.
 @item
-@c There is Alpine Linux 3.14, and also musl-gcc on Ubuntu.
+@c There is Alpine Linux 3.19, and also musl-gcc on Ubuntu.
 musl libc on Linux is occasionally tested.
 @end itemize
 @item
 macOS@.  In versions 12.5, it's occasionally tested.  In version
 10.5, it's rarely tested.
 @item
-FreeBSD 13.0 or newer is occasionally tested.
+FreeBSD 14.0 or newer is occasionally tested.
 @item
 OpenBSD 7.0 or newer is occasionally tested.
 @item
 NetBSD 10.0 or newer is occasionally tested.
 @item
-AIX 7.1 and 7.2 are occasionally tested.
+AIX 7.1 and 7.3 are occasionally tested.
 @item
 Solaris 10 and 11.4 are occasionally tested.  Solaris 9 is rarely
 tested and low priority.
 @item
 Android is occasionally tested, through the Termux app on Android 11.
 @item
-Cygwin 2.9 is occasionally tested.  Cygwin 1.7.x is rarely tested.
+Cygwin 2.9 and 3.4 are occasionally tested.  Cygwin 1.7.x no longer tested.
 @item
 Native Windows:
 @itemize
@@ -170,9 +170,6 @@
 @item
 GNU Hurd 0.9 is rarely tested.
 @item
-@c IRIX 6.5 cc has no option for C99 support. You would need to use gcc 
instead.
-IRIX 6.5 is very rarely tested.
-@item
 Minix 3.3.0 is no longer tested.
 @item
 Haiku is no longer tested.
@@ -197,7 +194,8 @@
 @item
 HP-UX 11.31.
 @item
-IRIX 6.4 and older.
+@c IRIX 6.5 cc has no option for C99 support. You would need to use gcc 
instead.
+IRIX 6.5 and older.
 @item
 OSF/1 5.1.
 @item






Re: [PATCH] largefile: port to C++

2024-04-25 Thread Bruno Haible
Paul Eggert wrote:
> -[AC_CACHE_CHECK([for $CC option to enable large file support],
> +[AC_CACHE_CHECK([for $CPPFLAGS option for large files],

Sorry, but I find the older wording ("option to enable large file support")
more understandable than ("option for large files"). The new wording
potentially suggests that the package may _create_ huge files, dissuading
users from using this option.

> (_AC_SYS_LARGEFILE_OPTIONS): Omit -n32.

So, this means that this macro does nothing any more for IRIX. Here's a patch
to update the documentation.


2024-04-25  Bruno Haible  

largefile: Update documentation.
* doc/largefile.texi: Remove mention of IRIX.

diff --git a/doc/largefile.texi b/doc/largefile.texi
index 574753f10a..cbffee2181 100644
--- a/doc/largefile.texi
+++ b/doc/largefile.texi
@@ -6,7 +6,7 @@
 To this effect, it attempts to ensure that types like @code{off_t} and
 @code{ino_t} are 64-bit,
 at least on the following platforms:
-glibc, Mac OS X, FreeBSD, NetBSD, OpenBSD, AIX, HP-UX, IRIX, Solaris,
+glibc, Mac OS X, FreeBSD, NetBSD, OpenBSD, AIX, HP-UX, Solaris,
 Cygwin, mingw, MSVC.
 
 If the types cannot be made 64-bit, @command{configure} issues a






Re: [PATCH] c32srtombs,mbsrtoc32s,mbsrtowcs,wcsrtombs: pacify GCC 14

2024-04-25 Thread Bruno Haible
Paul Eggert wrote:
> Add an extern decl for a “private” extern symbol, to pacify GCC
> 14’s -Wmissing-variable-declarations option.

Thanks. These *_state variables are not worth having an extra .h file
with just a one-line declaration.

Bruno