[PATCH] announce-gen: Mention git commit in release announcement.

2024-05-12 Thread Simon Josefsson via Gnulib discussion list
All,

Our release announcements do not mention the git commit hash that was
used to prepare the release.  While SHA1 is broken, I still think
including the commit hash provides some additional information that may
be useful further down the line, and hopefully including it doesn't incur
too much cognitive load on the reader (that isn't already present..).

I haven't pushed the attached patch since I'm not a native speaker.
Could someone suggest better wording, if needed?  Or better placement in
the announcement?

To read the result of the patch in context, take some earlier
announcement:

https://lists.gnu.org/archive/html/info-gnu/2024-03/msg6.html

and then consider that the patch would take the following text (for a
hypothetical upcoming GNU inetutils release):

This release was bootstrapped with the following tools:
  Gnulib aacceb6eff
  Autoconf 2.71
  Automake 1.16.5
  Bison 3.8.2
  M4 1.4.18
  Makeinfo 6.8
  Help2man 1.49.1
  Make 4.3
  Gzip 1.10
  Tar 1.34

and turn that into this:

This release was built bootstrapped with the following tools
using inetutils git commit 524d4b6934db12b9f43be410d2f201fdb40cfc97:

  Gnulib aacceb6eff
  Autoconf 2.71
  Automake 1.16.5
  Bison 3.8.2
  M4 1.4.18
  Makeinfo 6.8
  Help2man 1.49.1
  Make 4.3
  Gzip 1.10
  Tar 1.34

Does this make sense?  Is the location in the announcement e-mail a good
one?  This hides it a bit further down which I think makes sense.  Few
readers care about git commit and bootstrapping versions, and the
information is related.  The new version adds an empty line which I
think is more consistent with the other paragraphs.

Thoughts?

/Simon
From 8be372e8ddfaa5b7202e2b58b22e55c00d9016c5 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Sun, 12 May 2024 17:31:51 +0200
Subject: [PATCH] announce-gen: Mention git commit in release announcement.

* build-aux/announce-gen (this_commit_hash): New variable.
(main): Print commit hash.
---
 ChangeLog              | 6 ++++++
 build-aux/announce-gen | 6 ++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b6aa21d7f7..20dbe3c2a3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-05-12  Simon Josefsson  
+
+	announce-gen: Mention git commit in release announcement.
+	* build-aux/announce-gen (this_commit_hash): New variable.
+	(main): Print commit hash.
+
 2024-05-12  Simon Josefsson  
 
 	maintainer-makefile: Silence announce-gen error with GNULIB_REVISION.
diff --git a/build-aux/announce-gen b/build-aux/announce-gen
index f9e20129dd..3d47ceb9a7 100755
--- a/build-aux/announce-gen
+++ b/build-aux/announce-gen
@@ -35,7 +35,7 @@
 eval 'exec perl -wSx "$0" "$@"'
  if 0;
 
-my $VERSION = '2023-12-29 18:26'; # UTC
+my $VERSION = '2024-05-12 15:30'; # UTC
 # The definition above must lie within the first 8 lines in order
 # for the Emacs time-stamp write hook (at end) to update it.
 # If you change this file with Emacs, please let the write hook
@@ -551,6 +551,8 @@ EOF
   chomp (my $n_ci = `git rev-list "v$v0..v$v1" | wc -l`);
   chomp (my $n_p = `git shortlog "v$v0..v$v1" | grep -c '^[^ ]'`);
 
+  my $this_commit_hash = `git log --pretty=%H -1 "v$v1"`;
+  chop $this_commit_hash;
   my $prev_release_date = `git log --pretty=%ct -1 "v$v0"`;
   my $this_release_date = `git log --pretty=%ct -1 "v$v1"`;
   my $n_seconds = $this_release_date - $prev_release_date;
@@ -672,7 +674,7 @@ EOF
 
   my @tool_versions = get_tool_versions (\@tool_list, $gnulib_version);
   @tool_versions
-and print "\nThis release was bootstrapped with the following tools:",
+and print "\nThis release was built bootstrapped with the following tools\nusing $package_name git commit $this_commit_hash:\n",
   join ('', map {"\n  $_"} @tool_versions), "\n";
 
   print_news_deltas ($_, $prev_version, $curr_version)
-- 
2.41.0
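
To check by hand which commit a release tag points at, without running
announce-gen, something like the following shell sketch should do.  It
is not part of the patch: the function name release_commit_hash is made
up for illustration, and it uses git rev-parse instead of the git log
call in the patch.  It assumes tags are named "v" followed by the
version, as in the "v$v1" expression above.

```shell
#!/bin/sh
# Print the full commit hash that release tag "v$1" points at.
# release_commit_hash is a hypothetical helper, not part of announce-gen.
release_commit_hash () {
  # ^{commit} peels an annotated tag down to the commit it refers to;
  # --verify --quiet makes git fail silently if the tag does not exist.
  git rev-parse --verify --quiet "v$1^{commit}"
}
```

Inside an inetutils checkout, "release_commit_hash 2.5" would print the
40-character hash of the commit tagged v2.5, which is the value the
announcement would then carry.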



signature.asc
Description: PGP signature


[PATCH] maintainer-makefile: Silence announce-gen error with GNULIB_REVISION.

2024-05-12 Thread Simon Josefsson via Gnulib discussion list
On running 'make release' I got this error message:

  GEN  release-prep
fatal: No names found, cannot describe anything.
make[1]: Entering directory '/home/jas/src/inetutils'

The error message is harmless since the code already handles this
situation, but it should be silenced since it looks pretty alarming and
the alternative code path using git rev-parse works correctly as
intended.

/Simon
From 0c52a761fbe563f2aa6731fbb18b0572005bc548 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Sun, 12 May 2024 17:07:30 +0200
Subject: [PATCH] maintainer-makefile: Silence announce-gen error with
 GNULIB_REVISION.

* top/maint.mk (gnulib-version): Silence git describe on failure.
---
 ChangeLog    | 5 +++++
 top/maint.mk | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 2e2311e7b2..b6aa21d7f7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2024-05-12  Simon Josefsson  
+
+	maintainer-makefile: Silence announce-gen error with GNULIB_REVISION.
+	* top/maint.mk (gnulib-version): Silence git describe on failure.
+
 2024-05-12  Bruno Haible  
 
 	execinfo: Document known bugs.
diff --git a/top/maint.mk b/top/maint.mk
index 32228f4366..ecd8971900 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1502,7 +1502,7 @@ vc-diff-check:
 rel-files = $(DIST_ARCHIVES)
 
 gnulib-version = $$(cd $(gnulib_dir)\
-&& { git describe || git rev-parse --short=10 HEAD; } )
+&& { git describe 2> /dev/null || git rev-parse --short=10 HEAD; } )
 bootstrap-tools ?= autoconf,automake,gnulib
 
 gpgv = $$(gpgv2 --version >/dev/null && echo gpgv2 || echo gpgv)
-- 
2.41.0
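
The fallback can be tried outside of maint.mk with a small shell sketch
like the one below.  The function name gnulib_version and its directory
argument are only illustrative (maint.mk uses $(gnulib_dir) instead);
the command inside the braces is the one from the patch.

```shell
#!/bin/sh
# Print a version string for the git checkout in directory $1.
# gnulib_version is a hypothetical wrapper around the maint.mk command.
gnulib_version () {
  # In a repository without tags, "git describe" fails with
  # "fatal: No names found, cannot describe anything." on stderr.
  # Discarding stderr hides that message; the || branch then prints
  # an abbreviated commit hash of at least 10 characters instead.
  ( cd "$1" && { git describe 2> /dev/null || git rev-parse --short=10 HEAD; } )
}
```

Run against a checkout that has no tags, this should print only the
abbreviated hash and nothing on stderr.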





Re: De-vendoring gnulib in Debian packages

2024-05-11 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Simon Josefsson wrote:
>> Finally, while this is somewhat gnulib specific, I think the practice
>> goes beyond gnulib
>
> Yes, gnulib-tool for modules written in C is similar to
>
>   * 'npm install' for JavaScript source code packages [1],
>   * 'cargo fetch' for Rust source code packages [2],
>
> except that gnulib-tool is simpler: it fetches from a single source location
> only.
>
> How does Debian handle these kinds of source-code dependencies?

I don't know the details but I believe those commands are turned into
local requests for source code, either vendored or previously packaged
in Debian.  No network access during builds.  Same for Go packages,
which I have some experience with, although Go packages lose the strict
versioning: if Go package X declares a dependency on package Y version
Z, then on Debian it may build against version Z+1 or Z+2, which may in
theory break and was not upstream's intended or supported configuration.
We have a circular dependency situation for some core Go libraries in
Debian right now due to this.

I think fundamentally the shift that causes challenges for distributions
may be going from package dependencies that are version >= X to package
dependencies that are version = X.  If there is a desire to support
that, some new workflow patterns are needed.  Some package maintainers
reject this approach and refuse to co-operate with those upstreams, but
I'm not sure if this is a long-term winning strategy: it often just
leads to useful projects not being available through distributions, and
users suffer as a result.
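
To make the ">= X" versus "= X" distinction concrete, here is a rough
shell sketch.  The helper names version_ge and satisfies and the version
numbers are invented for illustration, and "sort -V" is only a stand-in
for real version comparison (Debian's own ordering rules live in dpkg
and differ in the details).

```shell
#!/bin/sh
# True when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort), a GNU coreutils extension.
version_ge () {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# satisfies AVAILABLE OP WANTED, where OP is ">=" or "=".
satisfies () {
  case $2 in
    '>=') version_ge "$1" "$3" ;;
    '=')  [ "$1" = "$3" ] ;;
    *)    echo "unknown operator: $2" >&2; return 2 ;;
  esac
}

# A distribution that ships only Y 1.3 satisfies a range "Y >= 1.2" ...
satisfies 1.3 '>=' 1.2 && echo "range dependency: ok"
# ... but not an upstream that pinned exactly "Y = 1.2".
satisfies 1.3 '=' 1.2 || echo "pinned dependency: not satisfied"
```

With range dependencies one packaged version of Y can serve many
packages; with pinned dependencies each pin may need its own copy, which
is where the packaging work flow has to change.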

/Simon




Re: continuous integrations — own runners

2024-05-11 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Simon Josefsson wrote:
>> I've found it to only be cost
>> effective to setup my own runners for platforms that gitlab doesn't
>> support natively, such as arm64 or ppc64el.
>
> For GitHub runners, hosting your own runners comes with security risks [1].
>
> Do GitLab runners have the same security risks? (I.e. If, on GitLab, I fork
> one of your projects, add code, then trigger the CI, does my code then run
> on your machine?)

I think you have to push your code as a merge request back into the
original project to trigger the CI to run on hosted runners on gitlab.
Runners are project-specific, so if you fork project
gitlab.com/gnu/whatever into gitlab.com/user/whatever the runner
associated with the gnu project will not run for gitlab.com/user pushes.
However it will run if you push the commit back into a merge request
that ends up on gitlab.com/gnu/whatever.  What is actually executed
depends on .gitlab-ci.yml content, but that is modifiable by the user
too so any user can in theory run code on any runner associated with any
public project that enables merge requests.

I think the gitlab recommended way to deal with this is to have
protected branches, and to set up protected workflows.  Then you can
provide additional access for protected workflows only.  Merge requests
will not trigger a workflow on a protected branch.  You can probably
set up a runner to only run commits on protected branches, which somewhat
mitigates this problem.

However I think the details are unimportant: when you set up a
gitlab/github runner you are fundamentally authorizing
gitlab.com/github.com to run code that you cannot audit on that machine.
Thus the only reasonable way to approach this is to assume that you will
run malicious code.  Whether the code comes from a user on gitlab/github
or someone who gained access to gitlab/github internal systems doesn't
really matter.  My conclusion is that the only responsible way to host a
runner is to do it on throw-away machines and be prepared to re-install
on a new machine, and to produce an audit trail for everything going
through runners.  For me, the advantages of using gitlab/github runners
(e.g., the entire CI/CD test harness) outweigh the risks here, but there
are several unresolved risks to consider.

/Simon




De-vendoring gnulib in Debian packages

2024-05-11 Thread Simon Josefsson via Gnulib discussion list
e and sign it yourself.  Could be done something like this:

   git clone https://git.savannah.gnu.org/git/inetutils.git
   cd inetutils/
   git archive --prefix=inetutils-v2.5/ -o inetutils-2.5-src.tar.gz v2.5
   # additional filtering of tarball may go here
   gpg -b inetutils-2.5-src.tar.gz

   This is your new upstream tarball.  To build this particular one, use
   ./bootstrap --no-git --gnulib-srcdir=/usr/share/gnulib.

4) Use upstream's git-archive tarball and PGP sign it.

   Download it using the GitHub or GitLab download link on the git tag
   like the cool kids.  If you did this on a sunny day, the downloaded
   tarball should be identical to the git-archive tarball and you can
   sign it if you are comfortable with this.

5) Use upstream's git-archive tarball.

   For those who want to join the really cool kids club.

6) Use upstream's tarball without PGP signature.

   This is quite common today.  It happens when upstream doesn't publish
   PGP signatures or the Debian maintainer doesn't care about them.

Regardless of mechanism, you should end up with a tarball that we call
the "upstream tarball".  Which approach is chosen is subjective and up
to the Debian package maintainer; people have different opinions.
While I can't hide my own preferences I think we have to acknowledge
that there is no single uniform answer here.

To reach our goals in the beginning of this post, this upstream tarball
has to be filtered to remove all pre-generated artifacts and vendored
code.  Use some mechanism, like the debian/copyright Files-Excluded
mechanism to remove them.  If you used a git-archive upstream tarball,
chances are higher that you won't have to do a lot of work especially
for pre-generated scripts.

This filtered tarball will be the *.orig.tar.gz used to build the Debian
package.

Ideally you would like for the *.orig.tar.gz tarball to be as close as
possible to upstream's git repository for the tag release, minus any
pre-generated scripts or vendored gnulib files that upstream put into
git.  For collaborative upstreams, you could try to convince them to not
put pre-generated scripts and vendored gnulib files into git.

Auditing the upstream tarball against the *.orig.tar.gz should be
simple: use sha256sum or diffoscope to compare content.  In some ideal
world this could be bit-by-bit identical.  I'm hoping this can be the
new best recommended approach going forward.  This is only possible when
upstream agrees with these concerns, and makes an effort to publish such
minimized source-only tarballs.  This may be a pipe dream, just like
Debian's current best recommended approach of using upstream PGP signed
tarballs is sometimes ignored.
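
As a rough illustration of the sha256sum variant of such an audit, the
following shell sketch compares two tarballs file by file.  The function
names tarball_manifest and compare_tarballs are made up for this
example, and it assumes GNU tar (for --strip-components) and that each
tarball has a single top-level directory.  diffoscope would give a much
more detailed report.

```shell
#!/bin/sh
# Print "sha256  path" for each regular file in tarball $1, sorted by path.
# --strip-components=1 (GNU tar) drops the top-level directory so that
# differently named prefixes do not show up as differences.
tarball_manifest () {
  dir=$(mktemp -d) || return 1
  tar -xf "$1" --strip-components=1 -C "$dir" || { rm -rf "$dir"; return 1; }
  ( cd "$dir" && find . -type f -exec sha256sum {} + | sort -k 2 )
  rm -rf "$dir"
}

# Show files that were added, removed or changed between two tarballs.
# The exit status is that of diff: 0 when the contents are identical.
compare_tarballs () {
  a=$(mktemp) && b=$(mktemp) || return 2
  tarball_manifest "$1" > "$a"
  tarball_manifest "$2" > "$b"
  diff -u "$a" "$b"
  status=$?
  rm -f "$a" "$b"
  return $status
}
```

Calling compare_tarballs with the upstream tarball and the filtered
*.orig.tar.gz would then list exactly the files that the filtering step
removed, which is the part a reviewer wants to see.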

You will now be faced with the challenge of building this tarball.  Your
existing debian/rules makefile will not work any more since it assumed
the existence of the pre-generated scripts and vendored gnulib files.
So you have to add the required tools as Build-Depends: and update the
debian/rules to build everything from source code.

For libntlm the essential diff between version 1.7-1, which used the
upstream tarball with pre-generated content and gnulib code, and the
latest version 1.8-3, which builds from a minimal source-only tarball,
is small:

--- a/debian/control
+++ b/debian/control
@@ -6,6 +6,8 @@ Uploaders:
  Simon Josefsson ,
 Build-Depends:
  debhelper-compat (= 13),
+ git,
+ gnulib (>= 20240412~dfb7117+stable202401.20240408~aa0aa87-3~),
 Standards-Version: 4.6.2
 Section: libs
 Homepage: https://www.nongnu.org/libntlm/
--- a/debian/rules
+++ b/debian/rules
@@ -1,6 +1,16 @@
 #! /usr/bin/make -f
 
+include /usr/share/gnulib/debian/gnulib-dpkg.mk
+
 export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 
 %:
-   dh $@ --builddirectory=build -X.la
+   dh $@ --without autoreconf --builddirectory=build
+
+pull:
+   ./bootstrap --gnulib-srcdir=$(GNULIB_DEB_DEBIAN_GNULIB) --pull
+
+gen:
+   ./bootstrap --gnulib-srcdir=$(GNULIB_DEB_DEBIAN_GNULIB) --gen
+
+execute_before_dh_auto_configure: dh_gnulib_clone pull dh_gnulib_patch gen

As you can see the essential part is to add a Build-Depends on the
gnulib Debian package to get the necessary gnulib code for building.  We
also disable dh_autoreconf since its approach is no longer necessary
(and hides problems): everything is built from source coming from Debian
or upstream.

There is one design aspect of gnulib that is important to understand:
gnulib is a source-only library, is not versioned, and has no release
tarballs.
Its release artifact is the git repository containing all the commits.
Packages like coreutils, gzip, tar etc pin to one particular commit of
gnulib.  There is little coordination among packages which gnulib git
commit to use, and historically they typically use the latest gnulib git
commit that was published when the release manager prepared a release.
Usually the pinning happens through a git submodule or through the
GNULIB_REVISION bootstrap.conf mechanisms, but there is no requirement
from gnulib on this.  This method will vary between packages that use
gnulib,

Re: unistr/u8-strstr tests: Avoid test failure with ASAN

2024-05-09 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> +  int alarm_value = 50;
>    signal (SIGALRM, SIG_DFL);
> -  alarm (10);
> +  alarm (alarm_value);

Nice trick, but doesn't the compiler optimize away this?  Maybe a
'volatile' is needed.

/Simon




Re: continuous integrations pipeline frameworks

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Simon Josefsson wrote:
>> I forgot to mention: the pattern to provide re-usable GitLab CI/CD
>> definitions that I'm inspired by is Debian's pipeline project:
>> 
>> https://salsa.debian.org/salsa-ci-team/pipeline/
>> 
>> It is easy to setup a new project to use their reusable pipeline -- just
>> add the CI/CD configuration file setting pointing to their job file --
>> and gives a broad configurable and tweakable pipeline.
>
> Sorry if this sounds negative, but
>
>   - So far, I've loved to adapt my CIs as needed. For example, one package
>     has a number of --with options, so my CI first builds without these
>     --with options, then installs the extra Debian packages and builds a
>     second time with these --with options. I don't think that any
>     pipeline framework can give me this possibility without causing
>     massive hurdles.
>
>   - With such frameworks, documentation is key.

Yes, any reusable system will need to support additional system packages
and ./configure flags and so on.

>> I'm thinking we could do the same but for any project using gnulib.
>> Within some reasonable limit and assumptions, but the majority of
>> projects mentioned already are similar enough for this to be possible
>> relatively easily.
>> 
>> I'm thinking it should be sufficient to add gnu-ci.yml@gnulib/pipeline
>> (or similar) as a CI/CD configuration file setting to achieve this.
>
> It's quite possible that with this approach, you can bring more GNU packages
> into the "we have CI" camp.
>
> I wouldn't like to switch to such a framework, though, because I'm already
> too much of an expert in GitLab CI.

Right -- the key to this working well is that no switch should be
necessary.  Written properly, you add one 'include' to your existing job
definition file and that enables opt-in functionality.

I'm also quite married to the job definition files I have so it will
take time to surrender them, but I also realize that large chunks of the
files I have repeat a lot of the same code patterns.

It's an experiment, and I'm not sure how well it will work out, having
started on this a couple of times before and failed...

/Simon




Re: continuous integrations

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

>> I think the pattern of having the .gitlab-ci.yml outside of the core
>> Savannah project is a good one for those GNU projects who are not
>> embracing the GitLab platform.  Then GitLab-related stuff stays on the
>> GitLab platform and doesn't invade the core project.
>
> Yes, that's one reason I put the CI outside the main repository. The
> other reasons are:
>   - CIs will come and go over time. Whereas the source code is meant to
>     be stable for > 20 years.
>   - Maintaining CIs is a different business than developing. It can be
>     handled by different persons, with different skills.
>   - I had problems creating a git repository's mirror from Savannah at
>     GitLab. If we can't have a GNU package's mirror at GitLab, and of
>     course don't want to move the main repository away from Savannah,
>     that was the only option.
>   - There is also the possibility of having CIs on other clouds, such as
>     GitHub, Travis, etc. This is simpler if there is no mirroring
>     in-between.

Right.  I also had trouble with Savannah git mirrors in the past, but
for the past year or so it has worked well.  So I like this pattern.

One of the few disadvantages with this approach that I've discovered is
that you don't get tight coupling of ci/cd script and the rest of the
repository.  This means that if you for some reason want to redo the
pipeline on commit X in say 5 years, you may have to find whatever old
commit of the CI pipeline job definition was used at the time and then
set that up to be able to run the pipeline.  If the pipeline definition
can be written to work with both current master git and 5 years old git,
then it will work fine, but it means more work to keep it tested.  I've
found this pattern useful once in a while, but it is not a strong
reason.

>> Then we can apply that group for free CI/CD minutes
>
> What do you mean by that? I've found GitLab's limit of 400 minutes per
> month and top-level group limiting, and see that GitHub does not have such
> a limit.

I have applied for this program for a couple of projects and while it is
a manual process and takes some time, it will give you 50,000 compute
minutes per month:

https://about.gitlab.com/solutions/open-source/join/

By using a single project it would also be possible to purchase compute
minutes in bulk and have them apply to all sub-projects.  I've found
this to be fairly cheap compared to the alternative cost of setting up
and maintaining runners on my own hardware.  I've found it to only be
cost effective to set up my own runners for platforms that gitlab
doesn't support natively, such as arm64 or ppc64el.

>> How about using https://gitlab.com/gnulib/ as a playground for these
>> ideas?  Then we can add sub-projects there for pipeline definitions, and
>> Savannah mirrors of other projects too.
>
> On GitLab, the 400 minutes limit is per top-level group. Therefore, it's
> better if, for each GNU package, we have a separate top-level group.

Now I understand why you went through that effort to create new projects!

>> If you can add 'jas' as
>> maintainer of the 'gnulib' group on GitLab
>
> Done.

Thank you.

>> I could add one project to
>> start work on writing re-usable pipeline definitions, and one example
>> project maybe for GNU InetUtils that would use these new re-usable
>> pipeline components to provide a CI/CD pipeline definition file.  I
>> could add some arm64/ppc64el builds of gnulib too.
>
> The usefulness of this step depends on how much it would reduce the
> frequency of the x86_64 runs (which currently are at 1/week). Most
> parts of Gnulib are not arch-specific; therefore I think the minutes
> are better invested in testing Alpine Linux, FreeBSD, OpenBSD, than
> arm64/ppc64el.

Yes having more OSes is a good first step, but then having more
architectures than amd64 becomes relevant.

/Simon




Re: gnulib-tool.py speeds up continuous integrations

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
I forgot to mention: the pattern to provide re-usable GitLab CI/CD
definitions that I'm inspired by is Debian's pipeline project:

https://salsa.debian.org/salsa-ci-team/pipeline/

It is easy to set up a new project to use their reusable pipeline -- just
add the CI/CD configuration file setting pointing to their job file --
and it gives a broad, configurable and tweakable pipeline.  Of course,
this is only for building Debian packages, so it is a narrow focus.

I'm thinking we could do the same but for any project using gnulib.
Within some reasonable limit and assumptions, but the majority of
projects mentioned already are similar enough for this to be possible
relatively easily.

I'm thinking it should be sufficient to add gnu-ci.yml@gnulib/pipeline
(or similar) as a CI/CD configuration file setting to achieve this.

/Simon




Re: gnulib-tool.py speeds up continuous integrations

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> gnulib-tool is used in many CI jobs. Just adding 'python3' to the
> prerequisites of such a job makes it run faster. Here are the execution
> times for a single run, before and after adding 'python3', for those
> CIs that I maintain or co-maintain. In minutes and seconds.
>
>                                                               Before   After
>
> https://gitlab.com/gnulib/gnulib-ci/-/pipelines               30:      11:
> https://gitlab.com/gnu-gettext/ci-distcheck/-/pipelines       36:      32:
> https://gitlab.com/gnu-poke/ci-distcheck/-/pipelines          18:40    18:24
> https://gitlab.com/gnu-libunistring/ci-distcheck/-/pipelines  11:25    09:16
> https://gitlab.com/gnu-diffutils/ci-distcheck/-/pipelines     07:21    06:27
> https://gitlab.com/gnu-grep/ci-distcheck/-/pipelines          06:51    06:08
> https://gitlab.com/gnu-m4/ci-distcheck/-/pipelines            06:46    05:44
> https://gitlab.com/gnu-sed/ci-distcheck/-/pipelines           05:28    04:39
> https://gitlab.com/gnu-gzip/ci-distcheck/-/pipelines          04:16    03:58
> https://gitlab.com/gnu-libffcall/ci-distcheck/-/pipelines     01:50    01:42
> https://gitlab.com/gnu-libsigsegv/ci-distcheck/-/pipelines    00:45    00:42

These are useful pipelines with basic build testing!  I help on a bunch
of others below, to get broader OS/architecture-compatibility testing.

https://gitlab.com/gsasl/inetutils/-/pipelines
https://gitlab.com/gsasl/gsasl/-/pipelines
https://gitlab.com/gsasl/shishi/-/pipelines
https://gitlab.com/gsasl/gss/-/pipelines
https://gitlab.com/libidn/libidn2/-/pipelines
https://gitlab.com/libidn/libidn/-/pipelines
https://gitlab.com/gnutls/libtasn1/-/pipelines

I think the pattern of having the .gitlab-ci.yml outside of the core
Savannah project is a good one for those GNU projects who are not
embracing the GitLab platform.  Then GitLab-related stuff stays on the
GitLab platform and doesn't invade the core project.

Would it make sense to collaborate on re-usable GitLab CI/CD pipeline
definitions in a single GitLab project?  Then we can apply for free
CI/CD minutes for that group and get testing on macOS/Windows too.  I
have a shared GitLab runner for native arm64 and ppc64el building, and
have wanted to set up NetBSD/OpenBSD/FreeBSD/etc GitLab runners too.
Adding runners to a group is easy; adding them to multiple groups
requires some manual work and added resources on the runner.

How about using https://gitlab.com/gnulib/ as a playground for these
ideas?  Then we can add sub-projects there for pipeline definitions, and
Savannah mirrors of other projects too.  If you can add 'jas' as
maintainer of the 'gnulib' group on GitLab I could add one project to
start work on writing re-usable pipeline definitions, and one example
project maybe for GNU InetUtils that would use these new re-usable
pipeline components to provide a CI/CD pipeline definition file.  I
could add some arm64/ppc64el builds of gnulib too.

/Simon




Re: syntax-check reject u_char u_short u_int u_long

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
Thanks for +1 Bruno, I have pushed the commits below.  More history or
insight on how to think about use of these types would be great.  My
recollection was that these types were preferred for compatibility with
ancient C tools that didn't parse 'unsigned char' etc.
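
For anyone who wants to look for these types in their own tree before
updating maint.mk, a plain-grep approximation of the check might look
like this.  It is only a sketch: the function name check_no_u_types is
invented here, and the real sc_unsigned_* rules go through the usual
maint.mk syntax-check machinery rather than calling grep like this.

```shell
#!/bin/sh
# Report uses of the BSD-style u_char/u_short/u_int/u_long typedefs in
# the files given as arguments.  Returns 0 only when none are found.
# check_no_u_types is a hypothetical helper, not the maint.mk rule.
check_no_u_types () {
  # -w matches whole words only, so u_int32_t or my_u_char are not
  # flagged; /dev/null makes grep print file names even for one file
  # and keeps it from reading stdin when no files are given.
  if grep -nwE 'u_(char|short|int|long)' /dev/null "$@"; then
    echo "use 'unsigned char' etc. instead of u_char etc." >&2
    return 1
  fi
  return 0
}
```

For example, "check_no_u_types lib/*.c" run in a checkout from before
the first patch below would be expected to flag the casts in
lib/inet_pton.c.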

/Simon
From 2adbe3be9e278cfc66289bbd9c8c433db84d5ce4 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Mon, 6 May 2024 14:56:08 +0200
Subject: [PATCH 1/2] inet-ntop, inet-pton: Avoid obsolete u_char type.

* lib/inet_pton.c (inet_pton6): Use unsigned char instead of u_char.
* lib/inet_ntop.c: Doc fix.
---
 ChangeLog       | 6 ++++++
 lib/inet_ntop.c | 2 +-
 lib/inet_pton.c | 8 ++++----
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 02ecbd341d..6b969dddbe 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-05-06  Simon Josefsson  
+
+	inet-ntop, inet-pton: Avoid obsolete u_char type.
+	* lib/inet_pton.c (inet_pton6): Use unsigned char instead of u_char.
+	* lib/inet_ntop.c: Doc fix.
+
 2024-05-05  Bruno Haible  
 
 	gnulib-tool.py: Regenerate aclocal.m4 before using 'autoconf -t ...'.
diff --git a/lib/inet_ntop.c b/lib/inet_ntop.c
index 0a4ba20e0d..26089959da 100644
--- a/lib/inet_ntop.c
+++ b/lib/inet_ntop.c
@@ -117,7 +117,7 @@ inet_ntop (int af, const void *restrict src,
  *  'dst' (as a const)
  * notes:
  *  (1) uses no statics
- *  (2) takes a u_char* not an in_addr as input
+ *  (2) takes a 'unsigned char *' not an in_addr as input
  * author:
  *  Paul Vixie, 1996.
  */
diff --git a/lib/inet_pton.c b/lib/inet_pton.c
index 2d29608d47..3d35f37adf 100644
--- a/lib/inet_pton.c
+++ b/lib/inet_pton.c
@@ -217,8 +217,8 @@ inet_pton6 (const char *restrict src, unsigned char *restrict dst)
 }
   if (tp + NS_INT16SZ > endp)
 return (0);
-  *tp++ = (u_char) (val >> 8) & 0xff;
-  *tp++ = (u_char) val & 0xff;
+  *tp++ = (unsigned char) (val >> 8) & 0xff;
+  *tp++ = (unsigned char) val & 0xff;
   saw_xdigit = 0;
   val = 0;
   continue;
@@ -236,8 +236,8 @@ inet_pton6 (const char *restrict src, unsigned char *restrict dst)
 {
   if (tp + NS_INT16SZ > endp)
 return (0);
-  *tp++ = (u_char) (val >> 8) & 0xff;
-  *tp++ = (u_char) val & 0xff;
+  *tp++ = (unsigned char) (val >> 8) & 0xff;
+  *tp++ = (unsigned char) val & 0xff;
 }
   if (colonp != NULL)
 {
-- 
2.34.1

From aacceb6eff58eba91290d930ea9b8275699057cf Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Mon, 6 May 2024 15:01:10 +0200
Subject: [PATCH 2/2] maintainer-makefile: Prohibit BSD4.3/SysV u_char etc
 types.

* top/maint.mk (sc_unsigned_char, sc_unsigned_short)
(sc_unsigned_int, sc_unsigned_long): Add.
---
 ChangeLog    |  6 ++++++
 top/maint.mk | 18 ++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 6b969dddbe..54ac701a98 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2024-05-06  Simon Josefsson  
+
+	maintainer-makefile: Prohibit BSD4.3/SysV u_char etc types.
+	* top/maint.mk (sc_unsigned_char, sc_unsigned_short)
+	(sc_unsigned_int, sc_unsigned_long): Add.
+
 2024-05-06  Simon Josefsson  
 
 	inet-ntop, inet-pton: Avoid obsolete u_char type.
diff --git a/top/maint.mk b/top/maint.mk
index af865717c4..32228f4366 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -854,6 +854,24 @@ sc_obsolete_symbols:
 	halt='do not use HAVE''_FCNTL_H or O'_NDELAY			\
 	  $(_sc_search_regexp)
 
+# Prohibit BSD4.3/SysV u_char, u_short, u_int and u_long usage.
+sc_unsigned_char:
+	@prohibit=u''_char \
+	halt='don'\''t use u''_char; instead use unsigned char'	\
+	  $(_sc_search_regexp)
+sc_unsigned_short:
+	@prohibit=u''_short \
+	halt='don'\''t use u''_short; instead use unsigned short' \
+	  $(_sc_search_regexp)
+sc_unsigned_int:
+	@prohibit=u''_int \
+	halt='don'\''t use u''_int; instead use unsigned int' \
+	  $(_sc_search_regexp)
+sc_unsigned_long:
+	@prohibit=u''_long \
+	halt='don'\''t use u''_long; instead use unsigned long'	\
+	  $(_sc_search_regexp)
+
 # FIXME: warn about definitions of EXIT_FAILURE, EXIT_SUCCESS, STREQ
 
 # Each nonempty ChangeLog line must start with a year number, or a TAB.
-- 
2.34.1





syntax-check reject u_char u_short u_int u_long

2024-05-06 Thread Simon Josefsson via Gnulib discussion list
How about adding inetutils u_* syntax-checks to gnulib's maint.mk?

sc_unsigned_char:
@prohibit=u''_char \
halt='don'\''t use u''_char; instead use unsigned char' \
  $(_sc_search_regexp)

sc_unsigned_long:
@prohibit=u''_long \
halt='don'\''t use u''_long; instead use unsigned long' \
  $(_sc_search_regexp)

sc_unsigned_short:
@prohibit=u''_short \
halt='don'\''t use u''_short; instead use unsigned short' \
  $(_sc_search_regexp)

sc_unsigned_int:
@prohibit=u''_int \
halt='don'\''t use u''_int; instead use unsigned int' \
  $(_sc_search_regexp)


The u_char/u_long/u_short/u_int idiom used to be common but today I
don't think any reasonable code should use it.  Does anyone have more
background or opinions on this?

Glibc definitions:

/usr/include/features.h:
   __USE_MISC   Define things from 4.3BSD or System V Unix.
/usr/include/x86_64-linux-gnu/sys/types.h:
#ifdef  __USE_MISC
# ifndef __u_char_defined
typedef __u_char u_char;
typedef __u_short u_short;
typedef __u_int u_int;
typedef __u_long u_long;
typedef __quad_t quad_t;
typedef __u_quad_t u_quad_t;
typedef __fsid_t fsid_t;
#  define __u_char_defined
# endif
typedef __loff_t loff_t;
#endif
/usr/include/x86_64-linux-gnu/bits/types.h:
/* Convenience types.  */
typedef unsigned char __u_char;
typedef unsigned short int __u_short;
typedef unsigned int __u_int;
typedef unsigned long int __u_long;

The only usage in gnulib is lib/inet_ntop.c and lib/inet_pton.c.  It
seems u_char was removed in most places of the code except a few
remaining type casts/comments:

lib/inet_ntop.c: *  (2) takes a u_char* not an in_addr as input
lib/inet_pton.c:  *tp++ = (u_char) (val >> 8) & 0xff;
lib/inet_pton.c:  *tp++ = (u_char) val & 0xff;
lib/inet_pton.c:  *tp++ = (u_char) (val >> 8) & 0xff;
lib/inet_pton.c:  *tp++ = (u_char) val & 0xff;
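
As a rough illustration of what such a check looks for, here is a
minimal Python sketch.  It is not how maint.mk implements the rule: the
helper name find_obsolete_types and the word-boundary matching are
assumptions made for this example (the rules as posted use the plain
patterns u_char, u_short, u_int and u_long).

```python
import re

# Hypothetical helper, not part of gnulib: maps each BSD4.3/SysV alias
# to the standard C spelling suggested in the halt messages.
OBSOLETE_TYPES = {
    "u_char": "unsigned char",
    "u_short": "unsigned short",
    "u_int": "unsigned int",
    "u_long": "unsigned long",
}

# Assumption: word boundaries are added here so that identifiers such as
# __u_char or u_int32_t are not reported.
PATTERN = re.compile(r"\b(" + "|".join(OBSOLETE_TYPES) + r")\b")


def find_obsolete_types(text):
    """Return (line_number, alias, replacement) for every hit in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            alias = match.group(1)
            hits.append((lineno, alias, OBSOLETE_TYPES[alias]))
    return hits


sample = "*tp++ = (u_char) (val >> 8) & 0xff;\nunsigned char *tp;\n"
for lineno, alias, replacement in find_obsolete_types(sample):
    print(f"line {lineno}: don't use {alias}; instead use {replacement}")
```

Only the first line of the sample is reported; the second already uses
the standard spelling.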

/Simon




Re: Indentation mistake

2024-05-03 Thread Simon Josefsson via Gnulib discussion list
Collin Funk  writes:

> Hi Simon,
>
> On 5/2/24 11:25 AM, Simon Josefsson via Bug reports for the GNU Internet 
> utilities wrote:
>>> Sadly, I cannot do this, at least not easily.  After installing GNU
>>> indent, "make syntax-check" complains about many files:
>>>
>>> $ indent --version
>>> GNU indent 2.2.12
>> You need 2.2.13 :-)
>
> I see that you added the 'syntax-check' for indent in Gnulib. One
> minor problem though, it breaks if the user has an ~/.indent.pro. :)
>
> I don't use indent much, so I forgot my repository where I store
> dotfiles installs this:
>
> $ cat ~/.indent.pro 
> --gnu-style
> --no-tabs
>
> Here lets check if the code is indented:
>
> $ make sc_indent | wc -l
> maint.mk: code format error, try "make indent"
> make: *** [maint.mk:1760: sc_indent] Error 1
> 52751
>
> I was confused for a bit until I saw that file.
>
>$ rm ~/.indent.pro
>$ make sc_indent | wc -l
>1
>
> Indent has -npro that you can use to ignore the file which might be
> good.

Nice catch.  It doesn't make sense for maint.mk's indentation to be
influenced by ~/.indent.pro -- the style has to be a per-project
setting.  I pushed the patch below.

/Simon
From 6213c5bd72d15ca5e1ea9c34122899e02fed448c Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Fri, 3 May 2024 08:44:03 +0200
Subject: [PATCH] maint.mk: Don't fail on ~/.indent.pro, reported by Collin
 Funk.

* top/maint.mk (indent_args): Use --ignore-profile.
---
 ChangeLog    | 5 +++++
 top/maint.mk | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index d967c8cfac..2781a70800 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2024-05-03  Simon Josefsson  
+
+	maint.mk: Don't fail on ~/.indent.pro, reported by Collin Funk.
+	* top/maint.mk (indent_args): Use --ignore-profile.
+
 2024-05-02  Collin Funk  
 
 	gnulib-tool.sh: Fix program name in error message.
diff --git a/top/maint.mk b/top/maint.mk
index c30e71ba6e..af865717c4 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1746,7 +1746,7 @@ refresh-po:
 
 # Indentation
 
-indent_args ?= -ppi 1
+indent_args ?= --ignore-profile --preprocessor-indentation 1
 C_SOURCES ?= $$($(VC_LIST_EXCEPT) | grep '\.[ch]\(.in\)\?$$')
 INDENT_SOURCES ?= $(C_SOURCES)
 exclude_file_name_regexp--indent ?= $(exclude_file_name_regexp--sc_indent)
-- 
2.34.1





Re: GNULIB_REVISION

2024-04-25 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> You raise several good points. A couple of quick reactions:
>
> On 2024-04-25 09:26, Simon Josefsson via Gnulib discussion list wrote:
>
>> - the gnulib git submodule is huge.  Not rarely I get out of memory
>>errors during 'git clone' in CI/CD jobs.
>
> First I've heard of this problem (other than with Git LFS, which
> Gnulib doesn't use). What part of the clone operation fails? Is this 
> server-side RAM, or client-side RAM, or something else? How much
> memory does 'git clone' require now for Gnulib?
>
> Is there some way to cajole 'git clone' into using less memory, with a
> '--depth 1' or similar options? Cloning shallowly would clone Gnulib a 
> lot faster, if you're cloning from a remote repository.

It only happens once in a while, for example:

https://gitlab.com/gsasl/inetutils/-/jobs/6706396721

This is gitlab's own git submodule checkout system at work, and it is
using --depth 150, which shouldn't be too heavy, so it is not even getting
to ./bootstrap's git clone.

Btw, using --depth 1 is incompatible with gnulib's git-version-gen: it
will not find a suitable version number for the build.  Even gitlab's
default depth of 50 (or if it was 100, can't remember) is often not
enough if you have >50 commits since the last release.  This causes
problems when building from 'git archive' tarballs.
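
To make the depth problem concrete, here is a toy Python model.  It
assumes a strictly linear history and is not how git-version-gen itself
works; the function name tag_reachable is made up for this sketch.

```python
def tag_reachable(commits_since_tag, clone_depth):
    """Toy model of a linear history.

    A shallow clone of depth N holds the tip commit plus its N-1 nearest
    ancestors, so a tag that is `commits_since_tag` commits behind the
    tip is present only if commits_since_tag < N.
    """
    if clone_depth < 1:
        raise ValueError("clone depth must be at least 1")
    return commits_since_tag < clone_depth


# Example: 60 commits since the last release tag.
for depth in (1, 50, 150):
    print(depth, tag_reachable(60, depth))
```

In this model a depth of 1 only works when the tip itself is tagged,
which matches the observation above that --depth 1 leaves no tag from
which to derive a version number.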

>> - we don't offer any way for people receiving tarballs to learn which
>>gnulib git commit was used
>
> Isn't the real problem that we don't put (for example) gzip's own
> commit ID into the coreutils tarball? If we did that, Gnulib's commit
> ID would come for free, since it can be derived from gzip's commit ID.

I suppose you meant s/coreutils/gzip/, otherwise I don't follow?

Yes that is a good idea!  The git commit of the project should be part
of the announce-gen output.  The git tag name could be mentioned too.
Tags are not long-term stable since they can be moved later on, so I
think the full git commit id should be mentioned too.  Current SHA1 git
commits aren't long-term stable either, since SHA1 is broken, but at
least this approach is the best we can do right now and when we move to
SHA256 git things will be better.

/Simon




Re: GNULIB_REVISION

2024-04-25 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi Simon,
>
>> you can ... via
>> GNULIB_REVISION pick out exactly the gnulib git revision that libpaper
>> needs. ...
>> [1] 
>> https://blog.josefsson.org/2024/04/13/reproducible-and-minimal-source-only-tarballs/
>> [2] https://salsa.debian.org/auth-team/libntlm/-/tree/master/debian
>
> I see GNULIB_REVISION as an obsolete alternative to git submodules, and
> would therefore discourage rather than propagate its use.

I think it will be challenging for gnulib to insist on always being
used as a git submodule, and I would prefer that we continue to support
multiple ways of working.  Personally I have been migrating towards
gnulib git submodules because most other projects use gnulib like that,
but I've never really felt comfortable with them.  Some of the concerns
I have:

- git submodules lead to -- in my subjective opinion -- complexity
  which leads to a worse user experience for developers.  I have learned
  to work with git submodules over the years, but it was a hurdle that I
  don't want to force on everyone.

- the gnulib git submodule is huge.  Not rarely I get out of memory
  errors during 'git clone' in CI/CD jobs.  I can restart the jobs
  manually, but this indicates that there is a resource drain here.  For
  a tiny project like libntlm the imbalance between the small project code
  and the large gnulib is troubling.

- often CI/CD platforms have different ways of working with git
  submodules which adds complexity which leads to bugs.  Allowing
  maintainers to decide if they want to work with git submodules or not
  seems like a good thing.

- we don't offer any way for people receiving tarballs to learn which
  gnulib git commit was used (you noticed this too below) but with a
  GNULIB_REVISION approach this is part of the tarball, just like any
  other versioned dependency on autoconf, automake etc

- I think gnulib could be regarded as any other external dependency,
  just like autoconf, automake, libtool etc that also generate files in
  my build tree during bootstrapping.  I don't put autoconf as a git
  submodule, why should I put gnulib as one?

Granted, these concerns are a bit vague and subjective.

> Currently libntlm has this in its bootstrap.conf:
>
>   GNULIB_REVISION=dfb71172a46ef41f8cf8ab7ca529c1dd3097a41d
>
> and GNU make has this:
>
>   GNULIB_REVISION=stable-202307

Interesting.  This suggests the GNULIB_REVISION approach isn't the
entire solution either.

I think it is useful to record the gnulib git commit used to prepare a
tarball, and have that git commit id be part of the shipped tarball, and
stored inside the git repository.  The first use above achieves this, but
the second one doesn't (branches/tags are moving targets).
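
The distinction can be checked mechanically.  The following Python
sketch is only an illustration: the helper names gnulib_revision and
is_pinned are invented here, and it assumes that a "pinned" value is a
full 40-character SHA-1 commit id.

```python
import re

# Assumption for this sketch: a pinned revision is a full SHA-1 commit id.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def gnulib_revision(bootstrap_conf_text):
    """Return the value of the last GNULIB_REVISION=... line, or None."""
    value = None
    for line in bootstrap_conf_text.splitlines():
        match = re.match(r"\s*GNULIB_REVISION=(\S+)", line)
        if match:
            value = match.group(1).strip("'\"")
    return value


def is_pinned(revision):
    """True for a full commit id, False for branch or tag names."""
    return revision is not None and FULL_SHA1.fullmatch(revision) is not None


for conf in ("GNULIB_REVISION=dfb71172a46ef41f8cf8ab7ca529c1dd3097a41d\n",
             "GNULIB_REVISION=stable-202307\n"):
    revision = gnulib_revision(conf)
    print(revision, "pinned" if is_pinned(revision) else "moving target")
```

With the two values quoted above, the libntlm setting is reported as
pinned and the GNU make setting as a moving target.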

If I download the gzip tarball I can't find anywhere what gnulib commit
was used for bootstrapping.  It is quite cumbersome to verify that the
tarball didn't contain any modified gnulib code.  This is even harder
when projects INTENTIONALLY modify gnulib code compared to what's in
gnulib git, which coreutils and several other projects do through
gnulib *.diff/*.patch files.

Ultimately, I think there is an important use-case to build projects
directly from source code without having tarballs with pre-generated
files that are not reproduced by the user.

> The differences between both approaches are:
>
>   - GNULIB_REVISION works only with the 'bootstrap' program. The submodules
> approach works also without 'bootstrap'.

What use case are you thinking of?  The gnulib git commit information
consumers that I can think of are gnulib-aware.

>   - For GNULIB_REVISION, the user is on their own regarding tooling, aside
> from 'bootstrap'. In the submodules approach, the 'git' suite provides
> the tooling, and many developers are familiar with it.

Yes, but developers also like flexibility, and in some situations I
think the git approach is not the best way of working.

>   - .tar.gz files created by the gitweb "snapshot" link, by the cgit "refs >
> Download" section, or the GitHub "Download ZIP" button contain an empty
> directory in place of the submodule, and no information about the 
> revision.
> Whereas they contain the file with the GNULIB_REVISION assignment.

Indeed, this was the main challenge for me.  That is critical
information for anyone who wants to avoid touching tarballs with
pre-generated content.

>> I should write a post to debian-devel describing this pattern on
>> how to use gnulib in Debian packages
>
> It feels wrong to me if, in order to get meta-information about required
> dependencies of a package, Debian tools grep a particular file for a specific
> string. This approach is simply too limited.

Meta-information about dependencies is normally hand-curated in
Debian (the Build-Depends: header).  The simplest solution is for the
Debian package maintainer to figure out which gnulib git commit version
was used for a release and pin that manually in the debian/rules
makefile.  If the 

Re: RFC: Remove documentation of IRIX as supported platform

2024-04-25 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Since
>   - IRIX 6.5 is end-of-life for more than 10 years [1],
>   - I don't have access to an IRIX machine any more,
>   - the AC_SYS_LARGEFILE macro no longer supports IRIX,
> I would suggest to remove mentions of IRIX support from the
>   documentation.

+1

"On September 6, 2006, an SGI press release announced the end of the
MIPS and IRIX product lines.[7] Production ended on December 29, 2006,
with final deliveries in March 2007, except by special
arrangement. Support for these products ended in December 2013 and they
will receive no further updates."
-- https://en.wikipedia.org/wiki/IRIX

GCC 4.8.x from 2011 was the last release with official support for IRIX.

/Simon

> Like we did for OSF/1 in 2019:
>
> doc: Remove documentation of OSF/1 as supported platform.
> * doc/gnulib-intro.texi (Target Platforms): Mention that OSF/1 is
> unsupported.
> ...
> * doc/**/*.texi: Update.
>
> Any objections?
>
>Bruno
>
> [1] 
> https://git.savannah.gnu.org/gitweb/?p=gnulib/maint-tools.git;a=blob;f=end-of-life.txt
>
>
>
>
>




Re: Gnulib in Debian

2024-04-24 Thread Simon Josefsson via Gnulib discussion list
Reuben Thomas  writes:

> TLDR: FTP Master rejected my libpaper package because it contains gnulib
> source files. I pointed out that other Debian packages for which I am
> upstream do exactly this and have been accepted, and that it is the
> standard way to use gnulib. A few senior Debian Developers said they did
> not consider this use of gnulib to be against Debian policy. But FTP
> Master's stance appears to be that they will not let any new packages into
> the archive that contain gnulib sources (or in general, vendored
> sources—they don't have anything against gnulib in particular!). I also
> argued that building against Debian's version of gnulib would risk
> introducing bugs (I have found that updating gnulib in my projects can make
> previously-working code fail).

The last aspect should be solved: the latest gnulib in Debian contains a
git bundle of gnulib, so you can Build-Depends on gnulib and via
GNULIB_REVISION pick out exactly the gnulib git revision that libpaper
needs.  This avoids including gnulib files in the tarball that is
uploaded to Debian, and there is no risk that you will get gnulib code
from a different git commit.  It requires an added 'Build-Depends: git'
in libpaper, though, which is unfortunate but I don't see how to avoid
it.  I should write a post to debian-devel describing this pattern on
how to use gnulib in Debian packages, but you can infer everything from
the links given in my blog post [1] and the latest upload of libntlm
into Debian [2].

/Simon

[1] 
https://blog.josefsson.org/2024/04/13/reproducible-and-minimal-source-only-tarballs/
[2] https://salsa.debian.org/auth-team/libntlm/-/tree/master/debian




Re: memset_explicit: Fix compilation error on some OpenSolaris derivatives

2024-04-24 Thread Simon Josefsson via Gnulib discussion list
Collin Funk  writes:

> Hi Paul,
>
> On 4/23/24 11:22 PM, Paul Eggert wrote:
>> Why is telnetd.h including config.h? Only a top-level C file should
> include config.h, and it should do so at the start.
>
> I don't disagree. Most of those lines are 20 years old, so I assume it
> wasn't a problem then. Though I do wonder how common those warnings
> would be in other projects.

I think this was fairly common before.  If there had been a 'make
syntax-check' rule for this, we would have caught it!  I have removed
use of HAVE_CONFIG_H and fixed telnetd.h in Inetutils now, thanks.

https://git.savannah.gnu.org/cgit/inetutils.git/commit/?id=32336c79b6aede7beef1d6929b631a53d141cee6
https://git.savannah.gnu.org/cgit/inetutils.git/commit/?id=c7f6910d6832d90be59033911a39de2d4b59de30

/Simon




Re: full-source bootstrap and Python

2024-04-22 Thread Simon Josefsson via Gnulib discussion list
Janneke Nieuwenhuizen  writes:

>> Also, from the diagrams in [1][2][3] it looks like the full-source bootstrap
>> uses tarballs frozen in time (make-3.80, gcc-2.95.3, gcc 4.7.3, etc.). So,
>> even if newer versions of 'make' or 'gcc' will use a Python-based 
>> gnulib-tool,
>> there won't be a problem, because the bootstrap of these old tarballs will
>> be unaffected.
>
> indeed.  For the current situation (that's less than great and we are
> working on to resolve), making essential GNU packages less
> bootstrappable is of no consequence.  Cleaning-up the full-source
> bootstrap and making it more or less future-proof, might be challenged
> by such a new dependency.

Rather than finding out what dependencies are problematic through
tedious manual work, is there a recommendation we can articulate that
would help the bootstrappable effort?

For example, in Libtasn1 (which I guess is fairly low in the
bootstrapping graph) I made the CI/CD pipeline [1] build the tarball on
Debian 4 etch (2010, first amd64 release), and using 'pcc' and 'tcc' as
alternative C compilers.  I'm hoping this has some value, but I have no
good way to tell.  What actual testable environments would it make sense
to test a project in, to help the bootstrappable effort?  Right now
these targets build fine, but if at some point 'pcc' stops building, I
may be inclined to simply drop this target rather than to fix the bugs
since I have no idea if supporting building with 'pcc' helps anyone.

I'm thinking suggestions like 'Build and test project on i386 Debian 3',
or 'Cross-build project from amd64 to mipsel on Debian 4'.  I can't seem
to find docker images for CentOS 3-6, maybe old CentOS is a good
long-term target too.  If there were concrete fact-based suggestions
like that, I would make an effort to CI/CD build libidn, libidn2,
inetutils, and some other projects to make sure they continue to work on
old platforms.

/Simon

[1] https://gitlab.com/gnutls/libtasn1/-/pipelines/




Re: full-source bootstrap and Python

2024-04-22 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Janneke Nieuwenhuizen wrote:
>> Are we creating a problem for
>> bootstrapping (or even a dependency cycle) when introducing this new
>> dependency into a certain package.
>
> I think you answered this question with "no", when writing in [1]:
>
>   "Even more recently (2018), the GNU C Library glibc-2.28 adds Python
>as a build requirement"
>
> So, how do you avoid Python when building glibc? Do you use musl libc as
> a first stage, and only build glibc once a python built with musl exists?
>
> Also, from the diagrams in [1][2][3] it looks like the full-source bootstrap
> uses tarballs frozen in time (make-3.80, gcc-2.95.3, gcc 4.7.3, etc.). So,
> even if newer versions of 'make' or 'gcc' will use a Python-based gnulib-tool,
> there won't be a problem, because the bootstrap of these old tarballs will
> be unaffected.

While I agree, I think there is one nuance that could be added here: it
is true that full-source bootstraps usually need to use earlier
releases of software tarballs to build more recent projects because of
cyclic dependencies.  However, this causes extra work for the people involved.
It also means we will have to keep maintaining and patching all the
software that is involved in the full-source bootstrap to keep it
working in the future.  So there is a cost involved here.

The takeaway from that situation should NOT be "don't use python" or
"don't use modern tools", that would be absurd.

The takeaway should be that one should carefully evaluate the
implications of using Python and modern tools, and look at the costs.

If we want to minimize the work for full-source bootstrap people, we
increase the cost for people maintaining modern software, and vice versa.
I don't see how we can get away from this conflict.  If everyone wrote
everything in machine code, there would be nothing for the bootstrap
people to do.

However what we can do is to reduce the total amount of work involved by
not introducing too many bootstrap dependency cycles.  Consider the
extreme situation where gnulib-tool version A would require coreutils
version B, and coreutils version B+1 would require gnulib-tool version
A+1, and gnulib-tool version A+2 would require coreutils version B+1 and
so on for really short release version increments.  Then a full-source
bootstrap will need to package and keep maintaining all those coreutils and
gnulib-tool versions -- or start to patch things to avoid the
dependencies.  (I'm ignoring the fact that normally gnulib-tool is not
involved at all when building projects.)
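
The cost of such a chain can be sketched in a few lines of Python.
This is a hypothetical model of the extreme situation described above,
not a description of any real bootstrap: versions A and B are both
written as 0, the requirement of gnulib-tool A+1 on coreutils B is an
assumption filled in for the example, and coreutils B is treated as the
seed.

```python
def build_chain(target, requires):
    """Return the builds needed for `target`, earliest first.

    `requires` maps a (package, version) pair to the single
    (package, version) pair it needs; a pair without an entry is
    treated as the bootstrap seed.
    """
    chain = []
    seen = set()
    node = target
    while node is not None:
        if node in seen:
            raise ValueError(f"dependency cycle at {node}")
        seen.add(node)
        chain.append(node)
        node = requires.get(node)
    return list(reversed(chain))


requires = {
    ("gnulib-tool", 2): ("coreutils", 1),  # A+2 requires B+1
    ("coreutils", 1): ("gnulib-tool", 1),  # B+1 requires A+1
    ("gnulib-tool", 1): ("coreutils", 0),  # assumption: A+1 requires B
    ("gnulib-tool", 0): ("coreutils", 0),  # A requires B
}

for package, version in build_chain(("gnulib-tool", 2), requires):
    print(package, version)
```

Every entry in the resulting list is a release that the bootstrap has
to package and keep working, so each additional interleaved requirement
lengthens the list by one.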

I think gnulib is already quite careful in dependency tracking, more so
than most projects I'm familiar with, but that doesn't mean this cannot be
improved.

Janneke: is there any recommendation from you as a bootstrapping person
on which dependencies we should be careful with, and which dependencies
are fine?  I suppose/hope that if gnulib-tool required python
version 2, you would not have a serious problem?  I'm certain that
python version 2 is possible to build using really old toolchains.  At
what version of python would it require adding another bootstrapping
step to the graph?  Python 3.0, 3.7, 3.12?  Maybe it is not easy to
answer this without generating the graph.  But I also think you would
have a better feeling of what the answer would be than most of us.

/Simon


> Bruno
>
> [1] 
> https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/
> [2] 
> https://guix.gnu.org/manual/en/html_node/Reduced-Binary-Seed-Bootstrap.html
> [3] 
> https://guix.gnu.org/manual/devel/en/html_node/Full_002dSource-Bootstrap.html
>
>
>
>
>




Re: beta-tester call draft

2024-04-20 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi,
>
> It's now time to call for beta-testers of the Python gnulib-tool.
> I plan to post the same text to info-gnu and to planet.gnu.org.

Confirmed success with oath-toolkit; identical generated files.

Old execution time was ~48 seconds, now it is at 0.7 seconds.

The slow gnulib-tool runtime was the primary reason for putting gnulib
generated files into git for oath-toolkit.  Waiting close to a minute
for a fresh rebuild became unbearable during development cycles, and
this is not on the slowest of machines (i7-1260P, 64GB RAM, Samsung SSD
990 PRO).  Now I can experiment with removing gnulib files from version
control again.

jas@kaka:~/src/oath-toolkit$ time make -f cfg.mk 
...
real    0m48,169s
user    0m49,900s
sys     0m9,658s
jas@kaka:~/src/oath-toolkit$ export GNULIB_TOOL_IMPL=py
jas@kaka:~/src/oath-toolkit$ time make -f cfg.mk 
...
real    0m0,704s
user    0m0,527s
sys     0m0,179s
jas@kaka:~/src/oath-toolkit$

In case you suspect this was due to a caching speedup, here is another
invocation right after the previous python run:

jas@kaka:~/src/oath-toolkit$ export GNULIB_TOOL_IMPL=sh
jas@kaka:~/src/oath-toolkit$ time make -f cfg.mk 
...
real    0m49,414s
user    0m50,742s
sys     0m10,332s
jas@kaka:~/src/oath-toolkit$
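
For the record, the ratio between the first two "real" times can be
worked out with a small Python sketch.  The helper parse_time is
written for this message only and assumes the "0m48,169s" format with a
decimal comma, as printed by the shell in this locale.

```python
import re


def parse_time(value):
    """Convert a duration such as '0m48,169s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


old = parse_time("0m48,169s")  # shell implementation
new = parse_time("0m0,704s")   # Python implementation
speedup = old / new
print(f"{old:.3f}s -> {new:.3f}s, about {speedup:.0f} times faster")
```

That is a speedup of roughly 68 times for this project on this machine.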

Thank you so much,
/Simon




Re: [PATCH] gitlog-to-changelog: Make output reproducible.

2024-04-15 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> I don't agree with this patch. It misrepresents the dates on which people
> have checked in their commits.

Paul Eggert  writes:

> Emacs has had the tradition of using UTC for ChangeLog dates, so
> please support that as well, as an option. This Emacs tradition dates
> back to the RCS days, as RCS supports only UTC timestamps for
> commits. Because UTC commit dates in ChangeLogs are sorted
> numerically, this lessens confusion for newbie ChangeLog readers.
>
> Although using UTC can be offputting to a somewhat more expert reader
> who prefers dates to use the committer's UTC offset, ChangeLog format
> is not obvious anyway when the line contains the *committer's* date
> but the *author's* name. Projects like Emacs can reasonably prefer the
> confusion of using UTC, to the confusion of dates that seem to be out
> of order.

My head keeps spinning trying to figure out what the proper behaviour
should be, so I have reverted my patch and added documentation for the
current behaviour.  Documentation on the Makefile.am snippet was lacking
and seems useful on its own, I suspect people copied it from some other
project that used gitlog-to-changelog and the snippet was never
documented anywhere (or did I miss that?).  Now the documentation mentions
how to disable locale-dependent behaviour, for those who desire that.  I
still think it is bad to have output of gitlog-to-changelog depend on
the locale, but I'm not sure what the intended behaviour really
should be (is time zone handling even unambiguous in the GNU
standards document?) or what different project maintainers would prefer.
As you suggest with Emacs, maybe there are different expectations in
different projects.
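
To show why the choice matters, here is a small Python sketch of the
effect.  It is not the Perl code from gitlog-to-changelog: the function
changelog_date is invented for this example, and a fixed UTC offset
stands in for whatever time zone the machine running the script uses.

```python
from datetime import datetime, timedelta, timezone


def changelog_date(epoch_seconds, utc_offset_hours=0):
    """Format a commit timestamp as a ChangeLog date in the given zone."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, zone).strftime("%Y-%m-%d")


# A commit made at 23:30 UTC on 2024-04-12 (an arbitrary example time).
commit_time = int(datetime(2024, 4, 12, 23, 30, tzinfo=timezone.utc).timestamp())

print(changelog_date(commit_time))     # 2024-04-12 (UTC, like gmtime)
print(changelog_date(commit_time, 2))  # 2024-04-13 (UTC+2, like localtime there)
```

The same commit therefore lands under two different ChangeLog dates
depending on where the script is run, which is the reproducibility
concern behind the reverted patch.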

/Simon
From 02d2ae07d9f72d73de615498218a7995de98a201 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Mon, 15 Apr 2024 17:47:52 +0200
Subject: [PATCH] gitlog-to-changelog: Revert 2024-04-12 fix and add
 documentation.

* build-aux/gitlog-to-changelog: Use localtime.
* doc/gitlog-to-changelog.texi: Add.
* doc/gnulib.texi (Build Infrastructure Modules): Add.
---
 ChangeLog |  7 +++
 build-aux/gitlog-to-changelog |  4 +-
 doc/gitlog-to-changelog.texi  | 81 +++
 doc/gnulib.texi   |  3 ++
 4 files changed, 93 insertions(+), 2 deletions(-)
 create mode 100644 doc/gitlog-to-changelog.texi

diff --git a/ChangeLog b/ChangeLog
index 5b7b3a36fc..2ab6e41ae2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-15  Simon Josefsson  
+
+	gitlog-to-changelog: Revert 2024-04-12 fix and add documentation.
+	* build-aux/gitlog-to-changelog: Use localtime.
+	* doc/gitlog-to-changelog.texi: Add.
+	* doc/gnulib.texi (Build Infrastructure Modules): Add.
+
 2024-04-14  Collin Funk  
 
 	gnulib-tool.py: Fix incorrect type hint.
diff --git a/build-aux/gitlog-to-changelog b/build-aux/gitlog-to-changelog
index e06106490c..16a9405a7c 100755
--- a/build-aux/gitlog-to-changelog
+++ b/build-aux/gitlog-to-changelog
@@ -35,7 +35,7 @@
 eval 'exec perl -wSx "$0" "$@"'
  if 0;
 
-my $VERSION = '2024-04-12 15:23'; # UTC
+my $VERSION = '2023-06-24 21:59'; # UTC
 # The definition above must lie within the first 8 lines in order
 # for the Emacs time-stamp write hook (at end) to update it.
 # If you change this file with Emacs, please let the write hook
@@ -360,7 +360,7 @@ sub git_dir_option($)
   ? '  (tiny change)' : '');
 
   my $date_line = sprintf "%s  %s$tiny\n",
-strftime ("%Y-%m-%d", gmtime ($1)), $2;
+strftime ("%Y-%m-%d", localtime ($1)), $2;
 
   my @coauthors = grep /^Co-authored-by:.*$/, @line;
   # Omit meta-data lines we've already interpreted.
diff --git a/doc/gitlog-to-changelog.texi b/doc/gitlog-to-changelog.texi
new file mode 100644
index 00..137b15fcda
--- /dev/null
+++ b/doc/gitlog-to-changelog.texi
@@ -0,0 +1,81 @@
+@node gitlog-to-changelog
+@section gitlog-to-changelog
+
+@c Copyright (C) 2024 Free Software Foundation, Inc.
+
+@c Permission is granted to copy, distribute and/or modify this document
+@c under the terms of the GNU Free Documentation License, Version 1.3 or
+@c any later version published by the Free Software Foundation; with no
+@c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
+@c copy of the license is at <https://www.gnu.org/licenses/fdl-1.3.en.html>.
+
+@cindex gitlog
+@cindex changelog
+
+Gnulib have a module @code{gitlog-to-changelog} to parse @code{git log}
+output and generate @code{ChangeLog} files, see
+@ifinfo
+@ref{Change Logs,,,standards}.
+@end ifinfo
+@ifnotinfo
+@url{https://www.gnu.org/prep/standards/html_node/Change-Logs.html}.
+@end ifnotinfo
+
+You would typically use it by extending the @code{dist-hook} in the
+top-level @code{Makefile.am} like this:
+
+@example
+dist-hook: gen-ChangeLog
+...
+.PHONY: gen-ChangeLog
+gen-ChangeLog:
+$(A

Re: git repositories vs. tarballs

2024-04-15 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi Simon,
>
> In the other thread [1][2][2a], but see also [3] and [4], you are asking

Hi Bruno -- thanks for attempting to bring some order to this complicated
matter!  I am in agreement with most of what you say, although I have some
comments below.

>> Has this changed, so we should recommend maintainers
>> to 'EXTRA_DIST = bootstrap bootstrap-funclib.sh bootstrap.conf' so this
>> is even possible?
>
> 1) I think changing the contents of a tarball ad-hoc, like this, will not
>lead to satisfying results, because too many packages will do things
>differently.

Right, and people tend to jump to the incorrect conclusion that running
autoreconf -fvi or running ./bootstrap from a tarball is a good idea.

Rather than trying to fix that solution, I think we should guide these
people towards using 'git-archive' style tarballs instead.  Then they
will need to do all the work that is actually required to bootstrap a
project, including getting all the dependencies in place.

Some will succeed in that.

Some will give up and realize they wanted the traditional curated
tarball after all, and go back to it, and this time hopefully not do the
harmful 'autoreconf -fi' dance.

In both situations, I think we are better off than with the current
situation.  Now people take the 'make dist' tarballs and try to reverse
engineer all the required dependencies to regenerate all artifacts, and
do a half-baked job at that, with an end result that is even harder to
audit than what we started with.

>(Y) Some distros want to be able to verify the tarballs.[9] (I don't agree
>with this. If you can't trust the release manager who produced the
>tarballs (C), you cannot trust (A) either. If there is a mechanism
>for verifying (C) from (A), criminals will commit their malware
>entirely into (A).)

I have another perspective here.  I don't think people necessarily want
to blindly trust either the git repository source code (A) or tarball
with generated code and source code (C).  So people will want the
ability to audit and verify everything.  Once people start to work on
auditing, they realize that there is no way around auditing (A).  You
need to audit XZUtils source code to gain trust in XZUtils.  So people
work on doing that.  Then someone realizes that people aren't actually
using git source code (A) to build the XZUtils binaries -- they are
using (A) plus generated content, that is the full tarball (C).  However
auditing (C) is just a waste of human time if there was a way to avoid
using (C) completely, and have people use (A) directly.  This isn't all
that complicated, I just did it for Libntlm and will try to do the same
for other packages.

I think you are right that if we succeed with this, criminals will put
their malware directly into git source code repositories.  However that
is addressed by the people working on reviewing the core code of
projects.  There is no longer any need for people to spend time auditing
tarballs with a lot of other stuff in them.  This time can be redirected
towards auditing the code.  Which over the years saves a lot of human
cycles.

Most code audits I've seen focus on what's in git, not what's in the
tarball nor in the binary packages that people use.  Which is how it
should be -- the build environment is better audited on its own rather
than as part of the upstream code audit.

> 6) How could (X) be implemented?
>
>The main differences between (A) and (C) are [10]:
>  - Tarballs contain source code from other packages.
>  - Tarballs contain generated files.
>  - Tarballs contain localizations.
>
>I could imagine an intermediate step between (A) and (C):
>
>  (B) is for users with many packages installed and for distros, to apply
>  modifications (even to the set of gnulib modules) and then build
>  binaries of the package for one or more architectures, without
>  needing to fetch anything (other than build prerequisites) from the
>  network.
>
>This is a different stage than (A), because most developers don't want
>to commit source code from other packages into (A) — due to size — nor
>to commit generated files into (A) — due to hassles with branches.
>
>Going from (A) to (B) means pulling additional sources from the network.
>It could be implemented
>  - by "git submodule update --init", or
>  - by 'npm' for JavaScript packages, or
>  - by 'cargo' for Rust packages [11]
>and, for the localizations:
>  - essentially by a 'wget' command that fetches the *.po files.
>
>The proposed name of a script that does this is 'autopull.sh'.
>But I am equally open to a declarative YAML file instead of a shell script.

Another point of view is to give up on forcing the autopull part on
users -- instead we can mention the required dependencies in README and
let the user/packager worry about having them available.  At least as an
option.

The reason 

Re: [PATCH] gitlog-to-changelog: Make output reproducible.

2024-04-12 Thread Simon Josefsson via Gnulib discussion list
On Fri 2024-04-12 at 18:47 +0200, Bruno Haible wrote:
> The ChangeLogs are not random data. They are text files meant to be read
> and interpreted by humans. Shoving a "let's use GMT for everyone" attitude
> here is not the right way to handle the diversity of time zones.
> 
> There was a problem already before, in gitlog-to-changelog. Namely, what
> if a committer sits in California and the release manager, who creates the
> tarball, sits in England or Germany? The same effect as above would occur.
> But your change now made it even worse: Even the release manager cannot
> override the time zone.
> 
> The real fix, IMO, is to use 'git log --format=fuller', and convert
> the CommitDate (*) by removing the time zone. In the example above:
> 
>   CommitDate: Sat Mar 16 22:29:02 2024 -0700
>   -> 2024-03-16
> 
> This way, each committer's days will be correctly represented.

I agree with this: ChangeLog dates should correspond to the date when
the change was committed by the committer, and since ChangeLog entries
don't carry time zone data by nature, that date is in the committer's
local time.  That approach is also fully reproducible, which was my main
concern, and doesn't depend on the release manager's time zone.  Indeed,
neither the old behaviour nor the current behaviour follows what I
believe you and I agree on.  At least the current behaviour is
reproducible regardless of the release manager's time zone.  I'll try to
come up with a fix.
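
The conversion Bruno describes can be sketched in shell.  This is only an
illustration, assuming input lines in the 'git log --format=fuller' layout
quoted above; the function name commit_date_to_changelog_date is
hypothetical and is not part of gitlog-to-changelog, which is written in
Perl.

```shell
# Hypothetical helper (not part of gitlog-to-changelog): turn one
# "CommitDate:" line from 'git log --format=fuller' into a ChangeLog date.
# The time of day and the zone offset are dropped, not converted, so the
# result is the calendar day as the committer saw it.
commit_date_to_changelog_date () {
  printf '%s\n' "$1" | awk '
    BEGIN {
      n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
      for (i = 1; i <= n; i++) month[name[i]] = i
    }
    # Fields: CommitDate: Sat Mar 16 22:29:02 2024 -0700
    { printf "%04d-%02d-%02d\n", $6, month[$3], $4 }'
}

# Prints 2024-03-16, independent of the TZ of whoever runs it.
commit_date_to_changelog_date 'CommitDate: Sat Mar 16 22:29:02 2024 -0700'
```

Because nothing is converted between zones, the output does not depend on
where the release manager runs the command.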

/Simon





[PATCH] gitlog-to-changelog: Make output reproducible.

2024-04-12 Thread Simon Josefsson via Gnulib discussion list
Hi

I ran into a reproducibility problem in gitlog-to-changelog, and noticed
that Guix people had also run into this and worked around it outside of
gitlog-to-changelog:

https://issues.guix.gnu.org/70169/#21

However I don't think it makes sense for ChangeLog dates to depend on
the timezone under any circumstance.

One nit: I'm not certain whether the 'git log' command itself is time
zone dependent.  However, I could reproduce the time zone problem before
applying this patch and not after applying it, so I believe 'git log'
was already not time zone dependent.  Test like this:

jas@kaka:~/src/gnulib$ build-aux/gitlog-to-changelog > foo
jas@kaka:~/src/gnulib$ TZ=UTC0 build-aux/gitlog-to-changelog > bar
jas@kaka:~/src/gnulib$ diff -ur foo bar

I have committed this.

/Simon
From dfb71172a46ef41f8cf8ab7ca529c1dd3097a41d Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Fri, 12 Apr 2024 17:25:16 +0200
Subject: [PATCH] gitlog-to-changelog: Make output reproducible.

* build-aux/gitlog-to-changelog: Use gmtime instead of localtime.
---
 ChangeLog                     | 5 +++++
 build-aux/gitlog-to-changelog | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index cdbb53ff8a..b20f69b06b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2024-04-12  Simon Josefsson  
+
+	gitlog-to-changelog: Make output reproducible.
+	* build-aux/gitlog-to-changelog: Use gmtime instead of localtime.
+
 2024-04-12  Bruno Haible  
 
 	gnulib-tool.py: Fix parsing of gl_LGPL in gnulib-cache.m4.
diff --git a/build-aux/gitlog-to-changelog b/build-aux/gitlog-to-changelog
index 16a9405a7c..e06106490c 100755
--- a/build-aux/gitlog-to-changelog
+++ b/build-aux/gitlog-to-changelog
@@ -35,7 +35,7 @@
 eval 'exec perl -wSx "$0" "$@"'
  if 0;
 
-my $VERSION = '2023-06-24 21:59'; # UTC
+my $VERSION = '2024-04-12 15:23'; # UTC
 # The definition above must lie within the first 8 lines in order
 # for the Emacs time-stamp write hook (at end) to update it.
 # If you change this file with Emacs, please let the write hook
@@ -360,7 +360,7 @@ sub git_dir_option($)
   ? '  (tiny change)' : '');
 
   my $date_line = sprintf "%s  %s$tiny\n",
-strftime ("%Y-%m-%d", localtime ($1)), $2;
+strftime ("%Y-%m-%d", gmtime ($1)), $2;
 
   my @coauthors = grep /^Co-authored-by:.*$/, @line;
   # Omit meta-data lines we've already interpreted.
-- 
2.39.2





Re: ./bootstrap --gnulib-srcdir and GNULIB_REVISION

2024-04-11 Thread Simon Josefsson via Gnulib discussion list
Simon Josefsson via Gnulib discussion list  writes:

> My reaction was initially exactly the same as yours, until I found this
> piece of --help documentation, which actually is the first (and
> presumably highest priority) rule:
>
>  * If the environment variable GNULIB_SRCDIR is set (either as an
>environment variable or via the --gnulib-srcdir option), then sources
>are fetched from that local directory.  If it is a git repository and
>the configuration variable GNULIB_REVISION is set in bootstrap.conf,
>then that revision is checked out.
>
> So I think this combination is intended to be supported, it is just not
> working when a .gitmodules file is present in $CWD -- something that is
> not mentioned as a requirement.

Sorry, I found that this piece of --help contradicts the last sentence I wrote:

  If you maintain a package and want to pin a particular revision of the
  Gnulib sources that has been tested with your package, then there are
  two possible approaches: either configure a 'gnulib' submodule with the
  appropriate revision, or set GNULIB_REVISION (and if necessary
  GNULIB_URL) in bootstrap.conf.

So it seems having a git submodule for gnulib and using GNULIB_REVISION
at the same time is not supported.

FWIW, I'm going to experiment in libntlm and use GNULIB_REVISION in
bootstrap.conf instead of a git submodule, to allow a downloaded tarball
of git HEAD to be a supported way of building the package from source.
This hasn't worked historically for reasons discussed in this thread,
but given the xz backdoor I think there is value in supporting this
approach.
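
For illustration, pinning Gnulib this way could look like the following
bootstrap.conf fragment.  The revision shown is only a placeholder (it is
the Gnulib commit mentioned in another message in this archive), and the
commented-out GNULIB_URL line is an assumption about when that variable
would be needed.

```shell
# bootstrap.conf fragment (sketch; the hash is a placeholder, use the
# Gnulib commit your package was actually tested with).
GNULIB_REVISION=3088ee223bb986ad51d7c71ca64aaf4b600bc06c
# Optional, per the --help text quoted above: where to fetch Gnulib from.
#GNULIB_URL=https://git.savannah.gnu.org/git/gnulib.git
```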

/Simon




Re: autoreconf --force seemingly does not forcibly update everything

2024-04-11 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> On 4/10/24 13:36, Simon Josefsson via Gnulib discussion list wrote:
>> Is bootstrap intended to be reliable from within a tarball?  I thought
>> the bootstrap script was not included in tarballs because it wasn't
>> designed to be run that way, and the way it is designed may not give
>> expected results.
>
> It's pretty routinely distributed, I expect under the theory that
> we're being transparent about what sources we use to generate the
> tarball.
>
> Whether it works from a tarball depends on one's definition of
> "works". Certainly more expertise and tools are needed to bootstrap
> than merely to configure + make.

The definition for "works" seems fairly permissive: running ./bootstrap
from, e.g., the coreutils 9.5 tarball dies instantly due to this:

  if test -n "$checkout_only_file" && test ! -r "$checkout_only_file"; then
    die "Running this script from a non-checked-out distribution is risky."
  fi

I see that some projects (including coreutils) add bootstrap to
EXTRA_DIST, but I can't find any recommendation in the gnulib manual to
do that so I had assumed it is not something we recommend generally.  I
haven't added it to inetutils, libidn2, gsasl, etc.

/Simon




Re: ./bootstrap --gnulib-srcdir and GNULIB_REVISION

2024-04-11 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi Simon,
>
>> Bug #2: ./bootstrap writes to the path indicated by --gnulib-srcdir with
>> the 'git checkout' command, and leaves the --gnulib-srcdir path at that
>> commit after ./bootstrap is finished.  This happens to work in my
>> example since I pointed it to a writable work tree, but I think altering
>> that path is unexpected and not documented.  Imagine pointing this to a
>> system-wide gnulib .git store like --gnulib-srcdir=/usr/share/src/gnulib
>> or similar read-only place.  Or imagine multiple ./bootstrap running at
>> the same time for different projects, both pointing to the same gnulib
>> .git work tree.  I think the path indicated by --gnulib-srcdir should be
>> read-only.
>> 
>> Should the 'git checkout' code be replaced with something like
>> 
>>   git clone --reference "$GNULIB_SRCDIR" "$gnulib_path" \
>>   && git -C "$gnulib_path" checkout "$GNULIB_REVISION"
>>   GNULIB_SRCDIR="$gnulib_path"
>> 
>> Discussion before suggesting patches would be useful, to establish some
>> agreement on how we want this to behave.
>
> You're right, --gnulib-srcdir and the $GNULIB_SRCDIR variable denote
>   "the local directory where gnulib
>sources reside.  Use this if you already
>have gnulib sources on your machine, and
>you want to use these sources."
> (I introduced the distinction between GNULIB_SRCDIR and GNULIB_REFDIR
> in commit 2122284380cc0d1b3b6f11d92c04652616da79c7.)
>
> Thus the behaviour you observed is a bug. Even worse, 'bootstrap' does
> it even when the option --no-git is given!
>
> How to reproduce:
>   $ git clone git://git.savannah.gnu.org/make.git
>   $ cd make
>   $ ./bootstrap --no-git --gnulib-srcdir=$GNULIB_SRCDIR
>
> I think the use of --gnulib-srcdir when GNULIB_REVISION is specified
> in bootstrap.conf is a classical example of conflicting requests.
> Which one should take precedence?
>   - IMO --gnulib-srcdir is documented in such a way that it takes
> precedence.
>   - But one may also argue that it should produce an error, to make
> the user aware of the conflict. Something like
> "The option --gnulib-srcdir cannot be honored together because the 
> package specifies a GNULIB_REVISION."
> The user should be able to resolve the conflict either way,
> by choosing different command-line options.

My reaction was initially exactly the same as yours, until I found this
piece of --help documentation, which actually is the first (and
presumably highest priority) rule:

 * If the environment variable GNULIB_SRCDIR is set (either as an
   environment variable or via the --gnulib-srcdir option), then sources
   are fetched from that local directory.  If it is a git repository and
   the configuration variable GNULIB_REVISION is set in bootstrap.conf,
   then that revision is checked out.

So I think this combination is intended to be supported, it is just not
working when a .gitmodules file is present in $CWD -- something that is
not mentioned as a requirement.

I would certainly agree that trying to understand the interaction
between:

   1) --gnulib-srcdir
   2) --gnulib-refdir
   3) GNULIB_REVISION
   4) --no-git
   5) building from a tarball with .gitmodules file
   6) building from a tarball without .gitmodules file
   7) building from a git clone with a .git sub-directory
   8) building from a git clone with an indirect .git file
   9) building with GNULIB_REVISION provided as an environment variable
      outside of bootstrap.conf

and maybe other factors is really complicated, and I have had to read
both --help and source code to feel close to understanding things.
Alas, GNULIB_REVISION is not documented in doc/gnulib.texi.

My impression is that the ./bootstrap script has gained a lot of
complexity that has evolved organically.  For some projects this
complexity is unwanted -- e.g., guile-gnutls is used early in the
bootstrapping of Guix and we eventually resolved to put all needed
gnulib files in .git and use a naive ./bootstrap script:

https://gitlab.com/gnutls/guile/-/blob/master/bootstrap?ref_type=heads

/Simon




./bootstrap --gnulib-srcdir and GNULIB_REVISION

2024-04-10 Thread Simon Josefsson via Gnulib discussion list
Hi

I'm trying to run ./bootstrap in a minimal source-only archive, generated
via 'git archive', that has GNULIB_REVISION set in bootstrap.conf, and I
expected this to work:

./bootstrap --gnulib-srcdir=/home/jas/src/gnulib

Bug #1: it seems GNULIB_REVISION in bootstrap.conf has no effect, and
this code is the reason (quoting bootstrap-funclib.sh):

  # XXX Should this be done if $use_git is false?
  if test -d "$GNULIB_SRCDIR"/.git && test -n "$GNULIB_REVISION" \
     && ! git_modules_config submodule.gnulib.url >/dev/null; then
    (cd "$GNULIB_SRCDIR" && git checkout "$GNULIB_REVISION") || cleanup_gnulib
  fi

The reason is that the tarball has .gitmodules looking like this:

[submodule "gnulib"]
	path = gnulib
	url = https://git.savannah.gnu.org/git/gnulib.git

This makes the '! git_modules_config submodule.gnulib.url' test fail, so
the checkout is skipped.

The result is that GNULIB_REVISION is not respected, and I get whatever
gnulib code happens to be checked out in --gnulib-srcdir.

What's the reason for that check?  The logic here isn't that clear.  How
about simply using:

  if test -d "$GNULIB_SRCDIR"/.git && test -n "$GNULIB_REVISION"; then
    (cd "$GNULIB_SRCDIR" && git checkout "$GNULIB_REVISION") || cleanup_gnulib
  fi

At least it seems like a bug that GNULIB_REVISION is not respected: the
--help output suggests this should work, and it doesn't say anything
about git submodules affecting behaviour:

 * If the environment variable GNULIB_SRCDIR is set (either as an
   environment variable or via the --gnulib-srcdir option), then sources
   are fetched from that local directory.  If it is a git repository and
   the configuration variable GNULIB_REVISION is set in bootstrap.conf,
   then that revision is checked out.

I can work around bug#1 with the following:

rm .gitmodules
./bootstrap --gnulib-srcdir=/home/jas/src/gnulib

That results in the correct gnulib commit being used, and all is fine.

Bug #2: ./bootstrap writes to the path indicated by --gnulib-srcdir with
the 'git checkout' command, and leaves the --gnulib-srcdir path at that
commit after ./bootstrap is finished.  This happens to work in my
example since I pointed it to a writable work tree, but I think altering
that path is unexpected and not documented.  Imagine pointing this to a
system-wide gnulib .git store like --gnulib-srcdir=/usr/share/src/gnulib
or similar read-only place.  Or imagine multiple ./bootstrap running at
the same time for different projects, both pointing to the same gnulib
.git work tree.  I think the path indicated by --gnulib-srcdir should be
read-only.

Should the 'git checkout' code be replaced with something like

  git clone --reference "$GNULIB_SRCDIR" "$gnulib_path" \
  && git -C "$gnulib_path" checkout "$GNULIB_REVISION"
  GNULIB_SRCDIR="$gnulib_path"

Discussion before suggesting patches would be useful, to establish some
agreement on how we want this to behave.
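
The idea behind the snippet above, leaving the --gnulib-srcdir tree
untouched and working on a private copy, can be sketched as follows.  This
is an assumption-laden illustration, not a proposed patch: the function
name and its arguments are hypothetical, and it uses a plain local
'git clone --no-checkout' rather than --reference to keep the example
small.

```shell
# Hypothetical helper, not part of bootstrap: check out REVISION from the
# repository at SRCDIR into a fresh directory DEST.  SRCDIR is only read;
# its work tree and HEAD are left as they were.
checkout_gnulib_copy () {
  srcdir=$1 revision=$2 dest=$3
  # --no-checkout skips populating the default branch, since a specific
  # revision is checked out right afterwards.
  git clone --quiet --no-checkout "$srcdir" "$dest" &&
  git -C "$dest" checkout --quiet "$revision"
}

# Usage sketch (paths are examples only):
#   checkout_gnulib_copy /usr/share/src/gnulib "$GNULIB_REVISION" ./gnulib
#   GNULIB_SRCDIR=./gnulib
```

With this shape, several ./bootstrap runs could share one source
repository, since each would get its own work tree.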

/Simon




Re: autoreconf --force seemingly does not forcibly update everything

2024-04-10 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Bernhard Voelker wrote:
>>  > Last month, I spent 2 days on prerelease testing of coreutils. If, after
>>  > downloading the carefully prepared tarball from ftp.gnu.org, the first
>>  > thing a distro does is to throw away the *.m4 files and regenerate the
>>  > configure script with their own one,
>>  >* It shows [...]
>> 
>> FWIW: especially the downstream builds of the 'coreutils' package have been
>> using 'autoreconf -fi' for a long time, because the upstream tools do not
>> have full I18N support, and the large I18N patch is in use e.g. at Fedora,
>> openSUSE and Debian probably since >15 years.
>
> Sure, if downstream applies a patch that modifies bootstrap.conf, they need
> to rerun 'bootstrap'.

Is bootstrap intended to be reliable from within a tarball?  I thought
the bootstrap script was not included in tarballs because it wasn't
designed to be run that way, and the way it is designed may not give
expected results.  Has this changed, so we should recommend maintainers
to 'EXTRA_DIST = bootstrap bootstrap-funclib.sh bootstrap.conf' so this
is even possible?  I recall some project already added something to that
effect, but I'm not sure if that is something gnulib supports?

/Simon




Re: autoreconf --force seemingly does not forcibly update everything

2024-04-01 Thread Simon Josefsson via Gnulib discussion list
Guillem Jover  writes:

> But if as a downstream distribution I explicitly request everything to
> be considered obsolete via --force, then I really do want to get whatever
> is in the system instead of in the upstream package. Because then I
> can fix things centrally in a distribution dependency package, instead
> of having to wait for upstreams to do so, for example, or because then I
> have a higher chance of building from known system files instead of
> local stuff.

I think that is the perception that leads to problems: 'autoreconf -f
-i' will not give you what you are looking for even if we make '-f' pull
in *.m4 files regardless of serial number.  There will be other files
that are not updated from the central system package.  It was never
intended as a re-bootstrapping tool.

Nick Bowler  writes:

> On 2024-04-01 16:43, Guillem Jover wrote:
>> But if as a downstream distribution I explicitly request everything
>> to be considered obsolete via --force, then I really do want to get
>> whatever is in the system instead of in the upstream package.
>
> If I distribute a release package, what I have tested is exactly what is
> in that package.  If you start replacing different versions of m4 macros,
> or use some distribution-patched autoconf/automake/libtool or whatever,
> then you have invalidated any and all release testing.
>
> This is fine, modifying a package and distributing modified versions
> are freedoms 1 and 3, but if it breaks you keep both pieces.
>
> The aclocal --install feature should be seen as a feature to help update
> dependencies as part of the process of preparing a modified version, not
> something that should ever be routinely performed by system integrators.
>
> GNU/Linux distributions have a long history of buggy backports to the
> autotools.  For a recent example, Gentoo shipped a broken libtool 2.4.6
> which included a patch to make Gentoo installs go faster but if you
> prepared a package with this broken libtool version, the resulting
> package would not build on HP-UX, oops.

Indeed, I think distributions are using autoconf tools in unintended
ways, and when repeatedly being told so, the response has usually been
"we know what we are doing, and want it our way for portability".
Usually because distribution packagers find it easier to run 'autoreconf
-fi' than to patch config.guess etc to support new targets.

That said, co-operation between GNU and Debian has been historically
poor, and we should try to collaborate and keep the lowest common
denominator between the projects as healthy as possible.  So if there is
something that can be improved, let's do that, but let's base it on good
design rather than "I want it my way".  I think the fundamental feature
request -- to re-bootstrap all generated files -- from Guillem is a fair
request, and arguably the GNU project has not been helpful in providing
this historically.  Instead the approach has been "we want various files
to be included in the tarball for portability".  This approach is still
important for porting to new targets, but has a cost in that it makes
auditing harder.  I believe we can support both the old way (*.tar.gz
with pre-generated content, impossible to fully re-bootstrap reliably
without risks) and a new way (*-src.tar.gz with just source code) at the
same time.  This appears to me more reliable than fixing all kinds of
re-bootstrapping problems with 'autoreconf -f -i'.

/Simon




Re: autoreconf --force seemingly does not forcibly update everything

2024-04-01 Thread Simon Josefsson via Gnulib discussion list
Eric Blake  writes:

> Widening the audience to include bug-gnulib, which is the upstream
> source of "# build-to-host.m4 serial 3" which was bypassed by the
> malicious "# build-to-host.m4 serial 30".
>
> On Sun, Mar 31, 2024 at 11:51:36PM +0200, Guillem Jover wrote:
>> Hi!
>> 
>> While analyzing the recent xz backdoor hook into the build system [A],
>> I noticed that one of the aspects why the hook worked was because it
>> seems like «autoreconf -f -i» (that is run in Debian as part of
>> dh-autoreconf via dh) still seems to take the serial into account,
>> which was bumped in the tampered .m4 file. If either the gettext.m4
>> had gotten downgraded (to the version currently in Debian, which would
>> not have pulled the tampered build-to-host.m4), or once Debian upgrades
>> gettext, the build-to-host.m4 would get downgraded to the upstream
>> clean version, then the hook would have been disabled and the backdoor
>> would be inert. (Of course at that point the malicious actor would
>> have found another way to hook into the build system, but the less
>> avenues there are the better.)
>> 
>> I've tried to search the list and checked for old bug reports on the
>> debbugs.gnu.org site, but didn't notice anything. To me this looks like
>> a very unexpected behavior, but it's not clear whether this is intentional
>> or a bug. In any case regardless of either position, it would be good to
>> improve this (either by fixing --force to force things even if
>> downgrading, or otherwise perhaps to add a new option to really force
>> everything).
>> 
>> [A] 
>> Longish mail, search for "try to go in detail" for the analysis.
>
> My understanding is that the use of serial numbers in .m4 snippets was
> intentional in gnulib (more or less where the practice originated),
> but only because gnulib prefers a linear history (everything is
> monotonically increasing, no forks for the serial number to diverge
> on).  In light of this weekend's mess, Bruno may have more ideas about
> how to prevent his files from being turned into backdoor delivery
> mechanisms in the future.
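
To make the serial-number point concrete, here is a simplified model of a
"higher serial wins" comparison.  It is an illustration only, not the
actual code of aclocal or autoreconf; the function names are hypothetical.

```shell
# Extract the serial number from an m4 header line such as
# "# build-to-host.m4 serial 30".  Prints 0 when no serial is found.
m4_serial () {
  s=$(printf '%s\n' "$1" | sed -n 's/^#.* serial \([0-9][0-9]*\).*$/\1/p')
  printf '%s\n' "${s:-0}"
}

# Succeed if the candidate header ($2) carries a strictly higher serial
# than the header of the file already shipped in the package ($1).
would_replace () {
  test "$(m4_serial "$2")" -gt "$(m4_serial "$1")"
}

# The tampered copy in the tarball (serial 30) versus the upstream file
# (serial 3): under this rule the upstream file does not win.
if would_replace '# build-to-host.m4 serial 30' '# build-to-host.m4 serial 3'
then echo "replace"
else echo "keep packaged file"
fi
```

Under this model a bumped serial is enough for a modified file to survive,
which is the behaviour Guillem describes above.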

I think the root cause here is assuming 'autoreconf -fi' achieves
anything related to re-bootstrapping.  I think the entire concept of
re-bootstrapping from a source tarball with generated contents in it is
fundamentally flawed.  I have proposed that we should start to release
*-src.tar.gz tarballs that don't have any pre-generated content in them
and that can be completely bootstrapped using external tools.  See writeup
here:

https://blog.josefsson.org/2024/04/01/towards-reproducible-minimal-source-code-tarballs-please-welcome-src-tar-gz/

To me, moving things towards this approach allows incremental work that
eventually will be more reliable than anything that attempts to
re-bootstrap from a tarball with some pre-generated artifacts in it
(because there will always be uncertainty if the artifact used was
actually built or came from the tarball).

I suggest that we extend 'make dist' to produce these *-src.tar.gz
tarballs, possibly only when some new automake AM_INIT_AUTOMAKE flag is
used.  There could be some functions to modify how the tarball is
generated, much like the dist-hooks we have today that are often used to
generate ChangeLog for the tarballs.  Thoughts?

/Simon




Re: [PATCH] maint: Allow gnulib's readutmp module to use systemd.

2024-03-24 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi Simon and Collin,
>
>> > Could putting the following into bootstrap.conf be a method that
>> > we could recommend?  Then developers can override it with
>> > GNULIB_TOOL_IMPL=sh ./bootstrap if they want.
>> > 
>> > GNULIB_TOOL_IMPL=${GNULIB_TOOL_IMPL:-py}
>> 
>> I'd like to hear what Bruno thinks about this idea. I think this might
>> be a good starting point for real-world testing. Maybe we can disable
>> it by default in bootstrap.conf, but leave a comment saying it is a
>> work-in-progress and experimental?
>
> It's simpler than that: The GNULIB_TOOL_IMPL environment variable was
> designed in such a way that no autogen.sh and no bootstrap.conf needs
> modifications.
>
> As of today, any developer can set this environment variable to 'py'
> or 'sh+py' and see whether they get regressions.
>
> In a short while (when the test suite passes and Collin has tried it
> with a few more GNU packages), the likelihood of such regressions will
> be small, and it will be possible to *recommend* it.
>
> In a longer while, we will make GNULIB_TOOL_IMPL=py the default, and
> there will be nothing else to recommend, because everyone will get
> the benefit of the speedup.
>
> So, Simon, as a package maintainer:
>   - You can try it yourself,
>   - You can spread the work to your co-maintainers,
>   - But it's pointless to modify your autogen.sh or bootstrap.conf files.

While I agree with everything you said, I think that misses some common
use cases for ./bootstrap, including: 1) non-regular developers, for whom
we want git clone + ./bootstrap + ./configure to work as quickly as
possible with no additional documentation, so they can contribute and
write a patch easily, and 2) continuous integration builds of projects,
which is where I think we would catch any regressions in gnulib-tool.py
the quickest, at least for my projects.

I think it would be nice to find a method that we can recommend from the
gnulib project to other projects that want to opt in to gnulib-tool.py
today, so that everyone building these opt-in projects gets exposed to
gnulib-tool.py and we (indirectly) get bug reports about any problems
early.

/Simon




Re: [PATCH] maint: Allow gnulib's readutmp module to use systemd.

2024-03-23 Thread Simon Josefsson via Gnulib discussion list
(moving to bug-gnulib)

Collin Funk  writes:

> On 3/22/24 2:18 PM, Simon Josefsson wrote:
>> Upgrading inetutils to use gnulib-tool.py would be nice.  As a start, I
>> bumped the gnulib submodule.
>
> Bruno and I are still working on it with a test suite. We want the
> file output and stdout output to be the same before we recommend using
> it. Then we won't get many bug reports for the same issue and we can
> test that new changes don't break expected behavior.
>
> With that in mind, I was curious to see how it worked and figured I
> should share.

Thank you!

Is there a way to opt-in inetutils to prefer python gnulib-tool, before
gnulib as a whole changes its default behaviour?  I think doing that
will allow better testing of gnulib-tool.py in the wild until gnulib as
a whole can change.  This way, we can migrate a bunch of projects to
gnulib-tool.py and get real-world testing of how it works over time for
many months.  I would be happy to do this for a bunch of projects
(libidn, libidn2, oath-toolkit, inetutils, libtasn1, gsasl, libntlm,
etc).  Maybe this was already discussed and I forgot.

Hmm.  Could putting the following into bootstrap.conf be a method that
we could recommend?  Then developers can override it with
GNULIB_TOOL_IMPL=sh ./bootstrap if they want.

GNULIB_TOOL_IMPL=${GNULIB_TOOL_IMPL:-py}
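
A small sketch of how that line would behave, for anyone unsure about the
':-' expansion.  The wrapper function is hypothetical and exists only to
make the effect visible; in bootstrap.conf the assignment would stand on
its own.

```shell
# Sketch of the bootstrap.conf line above, wrapped in a hypothetical
# function so the effect is easy to see.
choose_gnulib_tool_impl () {
  # Keep a value inherited from the environment; fall back to 'py' when
  # the variable is unset or empty.
  GNULIB_TOOL_IMPL=${GNULIB_TOOL_IMPL:-py}
  printf '%s\n' "$GNULIB_TOOL_IMPL"
}

(unset GNULIB_TOOL_IMPL; choose_gnulib_tool_impl)   # prints: py
(GNULIB_TOOL_IMPL=sh; choose_gnulib_tool_impl)      # prints: sh
```

Note that an empty GNULIB_TOOL_IMPL also selects 'py' here; '${VAR-py}'
would treat empty as a deliberate value instead.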

/Simon




Re: gnulib-tool: Obey environment variable GNULIB_TOOL_IMPL

2024-03-15 Thread Simon Josefsson via Gnulib discussion list
Collin Funk  writes:

>> But in the current state, it fails for nearly every command. There's
>> no hope that you can expect identical results from the two implementations
>> as long as there are still items in the TODO list.
>
> Yes. I am working on it. I've added the following lines to my
> ~/.profile:
>
> GNULIB_TOOL_IMPL="sh+py"
> export GNULIB_TOOL_IMPL

Wow it is very much faster!  \o/

jas@kaka:~/src/oath-toolkit$ time env GNULIB_TOOL_IMPL=sh make -f cfg.mk glimport
...
real    0m44,908s
user    0m47,285s
sys     0m8,306s
jas@kaka:~/src/oath-toolkit$ git clean -d -x -f; git restore --worktree --staged .
jas@kaka:~/src/oath-toolkit$ time env GNULIB_TOOL_IMPL=py make -f cfg.mk glimport
...
real    0m0,746s
user    0m0,593s
sys     0m0,154s
jas@kaka:~/src/oath-toolkit$ 

OATH Toolkit fails with sh+py though...  isn't the --local-dir part
working?  It doesn't notice the patches in liboath/gl/override.  Also it
seems to "forget" gl_LGPL([2]) in gnulib-cache.m4.

From a clean checkout, you want to run this command to run all
gnulib-tool invocations for OATH Toolkit:

make -f cfg.mk glimport

/Simon




Re: planning for beta-testing gnulib-tool.py

2024-03-11 Thread Simon Josefsson via Gnulib discussion list
I like the plan to replace gnulib-tool with a faster implementation, and
a two-year migration phase sounds reasonable to see if it will work in
practice.

Trying gnulib-tool.py on OATH Toolkit (which uses a somewhat unorthodox
gnulib usage style by adding code into git) results in the error below.  I
thought it was a missing mkdir at some point, but I couldn't find a
solution... ideas?

git clone https://gitlab.com/oath-toolkit/oath-toolkit.git
cd oath-toolkit
echo $GNULIB_REFDIR
/home/jas/src/gnulib  # 3088ee223bb986ad51d7c71ca64aaf4b600bc06c
$GNULIB_REFDIR/gnulib-tool.py --add-import
Module list with included dependencies (indented):
File list:
  lib/dummy.c
  m4/00gnulib.m4
  m4/gnulib-common.m4
  m4/zzgnulib.m4
/home/jas/src/gnulib/gnulib-tool.py: *** could not create file /home/jas/src/gnulib/lib/dummy.c
/home/jas/src/gnulib/gnulib-tool.py: *** Stop.

/Simon




Re: planning for beta-testing gnulib-tool.py

2024-03-10 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> I guess we are thinking about slightly different things:
>
>   * (A) I am thinking about
>     - for P in { coreutils, gettext, ... }, taking a frozen(!) checkout of P,
>       removing irrelevant source files (esp. all *.h, *.c, documentation,
>       etc.),
>     - and a frozen(!) set of gnulib modules at a specific time point,
>     - and merely invoke gnulib-tool and compare the generated files and
>       stdout.
>
>   * (B) You seem to be thinking about
>     - for P in { coreutils, gettext, ... }, taking the current git of P
>       (or latest release of P),
>     - taking the current set of gnulib modules,
>     - and invoke not only gnulib-tool, but also './configure' and make.
>
> I think that
>   - With either approach, the confidence to any change in gnulib-tool will be
>     the same.
>   - With approach (A), when we make a change to gnulib-tool, we need to commit
>     new expected test results, which is quite easy. No effort otherwise.
>   - With approach (B), we will get failures for other reasons as well: when
>     a gnulib module has changed in an incompatible way; when the git
>     repository of P has moved; when package P itself is broken. Sounds like
>     a continuous effort to hunt down (mostly) false positives.

Right, I agree!

I wonder when/if we could get rid of gnulib-tool.sh?  Maintaining both
would be time-consuming.  Maybe we have to declare some features no
longer supported if they cannot be implemented easily in
gnulib-tool.py...

/Simon




Re: planning for beta-testing gnulib-tool.py

2024-03-10 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> 5) Possibly it makes also sense to allow GNULIB_TOOL_IMPL to be set to
>    'sh+py'. In this case the script will make a full copy of the destination
>    dir, run the shell implementation and the Python implementation on the
>    two destination dirs, separately, and compare the results (again, both
>    in terms of effects on the file system, as well as standard output).
>    And err out if they are different.

Generally I'm happy to hear about speedups of gnulib-tool!  The plan
sounds fine.  I think this step 5) is an important part to get
maintainers to try the new implementation, and report failures that need
to be looked into.  If there was a small recipe I can follow to get a
diff that can be reported back, I would run it for a bunch of projects
that I contribute to.

While a self-test suite for gnulib-tool would be nice, some real
regression testing by attempting to build a bunch of real-world projects
that rely on gnulib-tool may be simpler to realize.  If there is a CI/CD
that builds ~30 different real-world projects (perhaps at known-good
commits) and compares the output against an earlier known-good build,
for each modification to gnulib-tool in gnulib, that would give good
confidence in any change to gnulib-tool.

/Simon




Re: gnulib-tool caching

2024-02-19 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Simon Josefsson wrote:
>> is it possible to design a reliable
>> caching mechanism?  Something similar to CONFIG_SITE for autoconf?
>
> CONFIG_SITE is not reliable; that's the problem with it...
>
>> I find that ./gnulib-tool takes a long time and 95% of the time I use
>> it, it ended up doing exactly the same thing as it did last time I ran
>> it: copying a set of possibly patched files out of the gnulib directory.
>
> I use gnulib-tool with option --symlink in such cases. As long as the
> module descriptions don't change, you don't need to re-run gnulib-tool
> then.

My usage pattern is to frequently re-bootstrap from a clean git checkout
to confirm that my changes still work properly for fresh rebuilds.  The
reason is that I don't trust 'make clean', 'make distclean', etc.  That
may not be a common work flow, but I'm pretty committed to it.  I now
remember that something like this was discussed before:

https://git.savannah.gnu.org/cgit/libidn.git/commit/?id=9ae53e866a6fafa56db26d184ccae9c39dae7446
https://lists.gnu.org/archive/html/bug-gnulib/2021-05/msg00077.html

I don't recall if I actually had any real problems with that approach...
however I got a faster laptop and stopped using it to minimize any
deviation from standard gnulib workflows.

Maybe a hook below would allow further experiments?  I am not proposing
to commit this now since it is relatively easy for me to experiment with
improvements by adding this to the bootstrap.conf script on projects
that I work on.  The GNULIB_BOOTSTRAP_SITE also allows me to change my
setup without altering bootstrap.conf for every project I work on.

/Simon

diff --git a/top/bootstrap-funclib.sh b/top/bootstrap-funclib.sh
index 9e40f4a3e4..9b8972f56c 100644
--- a/top/bootstrap-funclib.sh
+++ b/top/bootstrap-funclib.sh
@@ -194,6 +194,7 @@ bootstrap_sync=false
 # Make sure that bootstrap.conf is sourced from the current directory
 # if we were invoked as "sh bootstrap".
 conffile=`dirname "$me"`/bootstrap.conf
+test -r "$GNULIB_BOOTSTRAP_SITE" && . "$GNULIB_BOOTSTRAP_SITE"
 test -r "$conffile" && . "$conffile"
 
 # - Build-time prerequisites -




Re: [PATCH] gnulib-tool.py: Fix function call on incorrect object.

2024-02-19 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

>> What is the status of the Python gnulib tool? I'm not sure how far
>> behind it is compared to the shell script but it seems like it would
>> be much faster. I would say more maintainable but I might just be bad
>> at writing shell scripts. :)
>
> Yes, it's the hope that it will be faster that is the main motivation
> behind the Python rewrite.

Orthogonal to a rewrite in python: is it possible to design a reliable
caching mechanism?  Something similar to CONFIG_SITE for autoconf?

I find that ./gnulib-tool takes a long time and 95% of the time I use
it, it ended up doing exactly the same thing as it did last time I ran
it: copying a set of possibly patched files out of the gnulib directory.

How about logic like this:

. "$GNULIB_SITE"
if test -d "$gnulib_cache_dir"; then
  # Cache hit: copy the cached files into the current directory.
  rsync -av "$gnulib_cache_dir"/ .
elif test -n "$gnulib_cache_dir"; then
  # Cache miss: snapshot the tree so that changes can be detected later.
  mkdir "$savedir"
  rsync -av . "$savedir"

  # do whatever gnulib normally is doing

  # compare . with $savedir, saving a copy of each modified
  # file into $gnulib_cache_dir
fi

then I could put something like this into a $GNULIB_SITE script:

if test -z "$gnulib_cache_dir"; then
hash=`echo $PWD|md5sum|cut -d' ' -f1`
my_cache_dir=$HOME/.cache/gnulib.site
gnulib_cache_dir=$my_cache_dir/cache.`basename $PWD`.$hash
test -d $gnulib_cache_dir || mkdir -p $gnulib_cache_dir
fi

/Simon




Re: syntax-check rule to silence -Winclude-next-absolute-path warning

2024-02-19 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> --- a/top/maint.mk
> +++ b/top/maint.mk
> @@ -503,6 +503,7 @@ sc_prohibit_have_config_h:
>  # Nearly all .c files must include <config.h>.  However, we also permit this
>  # via inclusion of a package-specific header, if cfg.mk specified one.
>  # config_h_header must be suitable for grep -E.
> +# Rationale: The Gnulib documentation, node 'Include <config.h>'.

Having a way to learn the rationale for a syntax-check is a really good
idea!  There are some checks that I struggle to understand the point of,
and one that I simply disagree with (sc_prohibit_strcmp).  Having a link
to discussion helps to determine how to deal with errors.

What do you think about:

   1) using a URL to the gnulib online manual instead?  For most users,
  that allows easier lookup, and for people who are really offline,
  the URL contains sufficient detail to find the relevant section in the
  manual, and

   2) print the rationale link as part of the error message instead of a
  comment in the code

?

I think 'make syntax-check' is one of the powerful and under-appreciated
aspects of gnulib, so improving its usability can help make it more
used.

/Simon




Re: Let's remove Gnulib's ctime module

2024-02-10 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Paul Eggert wrote:
>> >> So CTIME_BUFSIZE should be 35?
>> > With 50 years of computer science experience, we should have learned
>> > the lesson to allocate more room than sounds necessary *now*. If
>> > now you think 35 will be sufficient for all times, then we should
>> > better choose twice that value: 70.
>> 
>> We needn't be that pessimistic. We can use something like this:
>> 
>>    #define CTIME_BUFSIZE \
>>      (sizeof "Wed Jun 30 21:49:08 \n" \
>>       + INT_STRLEN_BOUND (time_t) - 7)
>
> This formula reflects *today*'s expectations. My point is that we
> should be prepared for the unexpected events. Such as the U.S.
> adopting the French Republican calendar with its longer month
> names (germinal, brumaire, etc.). Or that the abbreviation and
> padding habits change.

I think support for those unexpected events belongs in your more properly
designed str_to_time API family, rather than in a smallest safe
replacement for the existing US/English/Gregorian/ISO8601-centric poorly
designed ctime API, that for simplicity should continue to be
US/English/Gregorian/ISO8601-centric.

That means we ought to encourage people to consider the str_to_time API
when they fix existing occurrences of ctime.  It seems some code will
need the predictable behaviour from the safer_ctime/strctime API I'm
thinking of, and some code will prefer a more human-friendly approach
that str_to_time can support.

/Simon




Re: Let's remove Gnulib's ctime module

2024-02-10 Thread Simon Josefsson via Gnulib discussion list
Another attempt below, with known open issues:

1) it seems something has to be said about tz variables, either the
function "always sets" them, "never sets" them, or (in new text below)
"may set" them depending on what other functions are called.  Not
optimal, but better than not documenting it.  We can inspect the
implementation later and change this too.

2) can we implement this in a way that it never fails?  I still allow
return==NULL to indicate errors below, until we can confirm that it is
possible to implement this in a way that cannot fail.  Returning "magic"
values like "1970-01-01" seems worse than NULL to me, since then callers
will need to do string comparisons to catch error situations.

3) below it says that nothing can be assumed about thread safety (beyond
that it depends on an environment variable), which seems a bit
sub-optimal, but let's see how this ends up being implemented and if we
can say something better.  Saying that nothing can be assumed about
thread safety is better than not saying anything, IMO.

4) Bruno suggested not documenting anything about week/month names, what
calendar is used, the year < 1 handling or expected output string
lengths -- I tend to disagree: at least my goal is for this function to
be a drop-in well-defined superset for ctime.  For ctime those
properties are either specified and documented (and then we want to
document that this function is compatible) or left
undefined/undocumented (and then we want to provide well-defined
portable documented semantics).  The latter problem with ctime seems to
be the reason we need to introduce a safer variant in the first place.

5) Naming.  I'm okay with 'safer_ctime' but still think it is ugly.
Bruno's suggestion of 'c_strnl_from_time' sounds better to me, even
though it is a mouthful.  How about 'strtime.h' and 'strctime'?  If
there is a need to offer a drop-in for asctime, then 'strasctime' is
relevant.  This seems more in line with existing C stdlib functions.

/Simon

/* strtime.h -- safe versions of time-related string functions.
   Copyright (C) 2024 FSF
   Authors: Paul Eggert, Bruno Haible, Simon Josefsson
   License: LGPL-2+
 */

#include <time.h>
#include <intprops.h>

/* This evaluates to 35 on typical machines today, and will grow
   automatically if time_t gets wider - it could even exceed 70 if
   needed.  7 = floor(log10(60*60*24*365)). */
#define STRCTIME_BUFSIZE \
(sizeof "Wed Jun 30 21:49:08 \n" \
 + INT_STRLEN_BOUND (time_t) - 7)

/* Convert WHEN representing the number of seconds relative to the epoch,
   1970-01-01 00:00:00 +0000 (UTC), to a fixed locale-independent
   NUL-terminated string such as "Wed Jun 30 21:49:08 1993\n\0",
   relative to the user's specified timezone (TZ environment variable),
   using abbreviations for the days of the week as "Sun", "Mon", "Tue",
   "Wed", "Thu", "Fri", and "Sat" and abbreviations for the months as
   "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct",
   "Nov", and "Dec".  The function may set the external variables
   tzname, timezone, and daylight (see tzset(3)) with information about
   the current timezone.  The output is copied into STR which should
   have room for at least STRCTIME_BUFSIZE bytes.  If STR is NULL, a
   pointer to a global statically pre-allocated buffer of size
   STRCTIME_BUFSIZE is used instead.  For years 1000 to 9999 inclusive
   the output string length is 26 characters including the final NUL
   byte.  The string length may be shorter for some years before 1000,
   and larger for years after 9999 or before -999.  The years are not
   padded with whitespace or zeros, so valid outputs include strings
   "Wed Jun 30 21:49:08 623\n" and "Wed Jun 30 21:49:08 11147\n", and
   for negative years strings such as "Wed Jun 30 21:49:08 -42\n".  The
   proleptic Gregorian calendar is used for all years, to cover years
   before the Gregorian calendar was adopted; and for years before 1 the
   ISO 8601 approach to have years 2, 1, 0, -1, and so on is used
   instead of having 2 BC, 1 BC, AD 1, AD 2.  On systems with a 64-bit
   time_t type, the year value may be large as in strings looking like
   "Sun Sep 16 01:03:52 -292471206706\n\0", and future systems with
   larger time_t types may lead to even longer strings.  If WHEN cannot
   be converted into a string, NULL is returned and errno is set to an
   error, otherwise on success STR (or a pointer to the global buffer)
   is returned.  The result depends on the environment variable TZ; no
   further thread safety attributes can be reliably assumed about this
   function. */

char *strctime (time_t when, char *str);
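
To make the year-numbering convention in the comment above concrete, here is a minimal sketch that maps an ISO 8601 year number to the traditional BC/AD label.  The helper name sketch_era_label is hypothetical and is not part of the proposed header; it only illustrates that year 0 corresponds to 1 BC, year -1 to 2 BC, and so on.

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only: render an ISO 8601 year
   number in the traditional BC/AD style.  Year 0 is 1 BC and year -1
   is 2 BC.  The subtraction is done in unsigned arithmetic so that it
   stays well-defined even for the most negative long long value.
   Returns what snprintf returns.  */
int
sketch_era_label (long long iso_year, char *buf, size_t len)
{
  if (iso_year > 0)
    return snprintf (buf, len, "AD %lld", iso_year);
  return snprintf (buf, len, "%llu BC",
                   1ULL - (unsigned long long) iso_year);
}
```

With this mapping the year in "Wed Jun 30 21:49:08 -42\n" would be called 43 BC in the traditional style.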




Re: Let's remove Gnulib's ctime module

2024-02-09 Thread Simon Josefsson via Gnulib discussion list
How about this (or gl-ctime?):

/* safer-ctime.h -- safer version of ctime().
   Copyright (C) 2024 FSF
   Authors: Paul Eggert, Bruno Haible, Simon Josefsson
   License: LGPL-2+
 */

#define SAFER_CTIME_BUFSIZE 35

/* Convert WHEN representing the number of seconds before/after epoch,
   1970-01-01 00:00:00 +0000 (UTC) to a fixed locale-independent
   NUL-terminated string such as "Wed Jun 30 21:49:08 1993\n\0",
   relative to the user's specified timezone, using abbreviations for
   the days of the week as "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", and
   "Sat" and abbreviations for the months as "Jan", "Feb", "Mar", "Apr",
   "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", and "Dec".  The
   function does not set the external variables tzname, timezone or
   daylight, see tzset(3).  The output is copied into STR which should
   have room for at least SAFER_CTIME_BUFSIZE bytes.  For years 1000 to
   9999 inclusive the output string length will be 26 characters
   including the final NUL byte.  The string length may be shorter for
   years before 1000 and larger for years after 9999.  The years are not
   padded with whitespace or zeros, so valid outputs include strings
   "Wed Jun 30 21:49:08 623\n" and "Wed Jun 30 21:49:08 11147\n", and
   for negative years strings such as "Wed Jun 30 21:49:08 -42\n".  The
   proleptic Gregorian calendar is used for all years, to cover years
   before the Gregorian calendar was adopted; and for years before 1 the
   ISO 8601 approach to have years 2, 1, 0, -1, and so on is used
   instead of having 2 BC, 1 BC, AD 1, AD 2.  On systems with a 64-bit
   time_t type, the year value may be large as in strings looking like
   "Sun Sep 16 01:03:52 -292471206706\n\0", and future systems with
   larger time_t types may lead to even longer strings.  If WHEN cannot
   be converted into a string, NULL is returned and errno is set to an
   error, otherwise on success STR is returned.  The function's Thread
   safety attribute value is MT-Safe env locale. */

char *safer_ctime (time_t when, char *str);
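
As a cross-check of the fixed value 35, the sketch below recomputes the buffer size from the width of the integer type.  The SKETCH_* macros are local stand-ins written for this illustration; they are modeled on my understanding of the INT_STRLEN_BOUND idea from gnulib's intprops.h and are an assumption, not a copy of that header.

```c
#include <stdint.h>

/* 1 if the integer type T is signed, 0 otherwise.  */
#define SKETCH_TYPE_SIGNED(t) (! ((t) 0 < (t) -1))

/* Upper bound on the decimal digits needed for BITS value bits;
   146/485 is slightly larger than log10 (2).  */
#define SKETCH_BITS_STRLEN_BOUND(bits) (((bits) * 146 + 484) / 485)

/* Upper bound on the printed length of any value of integer type T,
   including a minus sign for signed types.  Assumes 8-bit bytes.  */
#define SKETCH_INT_STRLEN_BOUND(t) \
  (SKETCH_BITS_STRLEN_BOUND (sizeof (t) * 8 - SKETCH_TYPE_SIGNED (t)) \
   + SKETCH_TYPE_SIGNED (t))

/* Buffer size for a ctime-style string whose year is derived from a
   count of seconds stored in type T; 7 = floor (log10 (60*60*24*365)).  */
#define SKETCH_CTIME_BUFSIZE_FOR(t) \
  (sizeof "Wed Jun 30 21:49:08 \n" + SKETCH_INT_STRLEN_BOUND (t) - 7)

/* A 64-bit signed count of seconds gives 35, a 32-bit one gives the
   classic 26.  */
_Static_assert (SKETCH_CTIME_BUFSIZE_FOR (int64_t) == 35, "64-bit case");
_Static_assert (SKETCH_CTIME_BUFSIZE_FOR (int32_t) == 26, "32-bit case");
```

Under these assumptions the constant 35 matches a 64-bit time_t, and a wider time_t would need a larger value.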

/Simon




Re: Let's remove Gnulib's ctime module

2024-02-08 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> and we can package this up into a function like this:
>
>   char c[CTIME_BUFSIZE];
>   safer_ctime (c, *tp);
>
> if people prefer simplicity.

Yes please.  Using complex APIs to implement safer_ctime is fine, but I
would prefer to not make existing ctime code more complicated when
fixing the undefined problem with (as)ctime(_r).

I would prefer having a global static variant since several uses of
ctime are in non-threaded global application contexts where even the above
is unnecessarily complicated and wasteful of stack space.

The term "safe" is subjective (safer than what?  does it preclude even
safer variants?), how about using the 'c_' prefix?

So unless I'm missing something I envision these new functions:

   #define CTIME_BUFSIZE

   char *c_asctime(const struct tm *tm);
   char *c_asctime_r(const struct tm *tm, char *buf);

   char *c_ctime(const time_t *timep);
   char *c_ctime_r(const time_t *timep, char *buf);

that would never trigger undefined behaviour due to the year with 64-bit
time_t, and maybe fix some other underspecified aspect of the existing
functions.

CTIME_BUFSIZE should allow room for all possible years for 64-bit time_t
when time_t is 64-bit, and all possible years for 128-bit time_t when
time_t is 128-bit...  support for sizeof (time_t) > 8 may be unsupported
by gnulib though -- I'm not sure any reasonable system will ever use
128-bit time_t, although it is conceivable that some system may not have
efficient integers less than 128-bit and then sizeof (time_t) == 16
would make sense, I guess.  If that is even permitted.

A signed 64-bit time_t is sufficient for years -292471206707
.. 292471210647 or something like that.  So I suppose the longest
possible strings we could see with 64-bit time_t would be this:

"Sun Sep 16 01:03:52 -292471206707\n\0"

Even a 64-bit unsigned year would fit in the same string:

1970+2^64/60/60/24/365 =  584942419325
1970+2^63/60/60/24/365 =  292471210647
1970-2^64/60/60/24/365 = -584942415385
1970-2^63/60/60/24/365 = -292471206707

So CTIME_BUFSIZE should be 35?
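
The year bounds above can be reproduced with integer arithmetic.  This is a minimal sketch using the same simplification as the figures above (every year counted as 365 days); the function names are made up for this illustration.

```c
#include <stdint.h>

enum { SKETCH_SECONDS_PER_YEAR = 60 * 60 * 24 * 365 };

/* Return floor (2^BITS / SKETCH_SECONDS_PER_YEAR) for 1 <= BITS <= 64.  */
uint64_t
sketch_years_spanned (unsigned int bits)
{
  uint64_t half, q, r;

  if (bits < 64)
    return (UINT64_C (1) << bits) / SKETCH_SECONDS_PER_YEAR;

  /* 2^64 does not fit in uint64_t, so derive the quotient from 2^63.  */
  half = UINT64_C (1) << 63;
  q = half / SKETCH_SECONDS_PER_YEAR;
  r = half % SKETCH_SECONDS_PER_YEAR;
  return 2 * q + 2 * r / SKETCH_SECONDS_PER_YEAR;
}

/* Approximate last and first year reachable from 1970 with a range of
   2^BITS seconds in either direction.  */
int64_t
sketch_last_year (unsigned int bits)
{
  return 1970 + (int64_t) sketch_years_spanned (bits);
}

int64_t
sketch_first_year (unsigned int bits)
{
  return 1970 - (int64_t) sketch_years_spanned (bits);
}

/* The longest string shown above needs 35 bytes including the NUL.  */
_Static_assert (sizeof "Sun Sep 16 01:03:52 -292471206707\n" == 35,
                "longest 64-bit ctime-style string");
```

For example, sketch_first_year (63) yields -292471206707 and sketch_last_year (63) yields 292471210647, matching the figures above.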

It would be a nice property that all possible time_t values can be
converted into some human readable string that can be converted back to
same time_t value, for the same time zone.

I don't think anyone will care strongly what the string format is as
long as it follows the "Sun Sep 16 01:03:52 1973\n\0" format for all
valid 32-bit time_t values.  Cut-off years for comparison:

1970-2^32/60/60/24/365 = 1834
1970-2^31/60/60/24/365 = 1902
1970+2^31/60/60/24/365 = 2038
1970+2^32/60/60/24/365 = 2106

Thanks Bruno for fixing the primitives!  They are important to have as
tooling for any higher level functions.

Btw, how does 'difftime' handle 64-bit time_t?  I suppose it depends on
size of double.  It seems difftime cannot return errors, and doesn't
document undefined behaviour for input values:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/difftime.html
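
To illustrate why the size of double matters here: assuming an IEEE 754 binary64 double, not every 64-bit integer survives a conversion to double, so very large second counts cannot all be represented exactly.  The helper below is a made-up name for this sketch and says nothing about how any particular difftime implementation behaves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if V converts to double and back without change.
   Assumes double is IEEE 754 binary64 (53-bit significand).  */
bool
sketch_exact_in_double (int64_t v)
{
  double d = (double) v;

  /* Values near INT64_MAX round up to 2^63, which would overflow when
     converted back, so treat them as inexact.  */
  if (d >= 9223372036854775808.0)
    return false;
  return (int64_t) d == v;
}
```

Under that assumption, 2^53 converts exactly while 2^53 + 1 does not.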

/Simon




Re: Let's remove Gnulib's ctime module

2024-02-06 Thread Simon Josefsson via Gnulib discussion list
You convinced me inetutils (and many other programs) have real bugs
related to ctime today that should be fixed.  Now I want to figure out
what the best fix to existing code is.

Paul Eggert  writes:

>> Another idea is to have gnulib's ctime augment the C standard to have
>> ctime not be undefined but to return shorter and longer strings, which I
>> believe is still consistent with the C standard?
>
> I would look askance at any Gnulib implementation of ctime that does
> this sort of thing. The ctime API is so poorly designed that callers 
> should use some other API. This is partly why C23 is deprecating
> ctime. Gnulib shouldn't encourage ctime's continued use.

If gnulib provides a simple to use replacement with clear documented
semantics and interface, and a clear upgrade path from current ctime, it
seems okay to give up on trying to use the ctime API.

Perhaps more than one upgrade path is needed, to accommodate different
situations: the inetutils examples illustrate some different needs.

If we do a good job here, it may serve as a template solution for this
problem elsewhere.  I see some systems migrate 32-bit time_t to 64-bit
time_t, and if not done carefully, that may introduce reliance on
undefined behaviour for years <1000 and >9999 when calling ctime that
wasn't there before.

> Perhaps we could add a new module c_nstrftime, which acts like nstrftime
> but operates in the C locale. That should suffice to replace all uses
> of ctime relatively easily.

Yes, although I would prefer a wrapper to hide the complex strftime
format string needed.

How about the API below?

I'm not confident about the timezone handling: maybe it should set the
tzset variables?  And maybe a c_nctime_r would be useful to provide the
timezone TZ to use?  I'm also not certain about year 0 handling.

/* Convert TIMEP representing the number of seconds elapsed since epoch,
1970-01-01 00:00:00 +0000 (UTC), to a fixed locale-independent string
such as "Wed Jun 30 21:49:08 1993\n" using abbreviations for the days of
the week as "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", and "Sat" and
abbreviations for the months as "Jan", "Feb", "Mar", "Apr", "May",
"Jun", "Jul", "Aug", "Sep", "Oct", "Nov", and "Dec".  The function does
not set the external variables tzname, timezone or daylight, see
tzset(3).  The output is copied into STR which must have room for at
least LEN bytes.  For years 1000 to 9999 inclusive the needed length
will be 26 characters including the final NUL byte, but the required
length may be shorter for years < 1000 and larger for years > 9999.  The
years are not padded with whitespace or zeros, so valid outputs include
strings such as "Wed Jun 30 21:49:08 623\n" for years <1000 and for
years >9999 strings such as "Wed Jun 30 21:49:08 11147\n" and for
negative years strings such as "Wed Jun 30 21:49:08 -42\n".  The
proleptic Gregorian calendar is used for all years, to cover years
before the Gregorian calendar was adopted; and for years before 1 the
ISO 8601 approach to have years 2, 1, 0, -1, and so on is used instead
of having 2 BC, 1 BC, AD 1, AD 2.  If TIMEP cannot be converted into a
string of size LEN, NULL is returned and errno is set to an error,
otherwise on success STR is returned. */

char *c_ctime_r (time_t timep, char *str, size_t len);
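
A rough sketch of the formatting step such a function could use is given below.  It is not an implementation of c_ctime_r: to sidestep the open timezone question it starts from an already broken-down struct tm, the name sketch_format_tm is hypothetical, and it does not check that the weekday is consistent with the date.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical sketch: format TM as a locale-independent ctime-style
   string into STR, which has room for LEN bytes.  The year is printed
   without padding.  Returns STR on success; on failure returns NULL
   and sets errno.  */
char *
sketch_format_tm (const struct tm *tm, char *str, size_t len)
{
  static const char wday[7][4] =
    { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
  static const char mon[12][4] =
    { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
      "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
  int n;

  if (tm->tm_wday < 0 || tm->tm_wday > 6
      || tm->tm_mon < 0 || tm->tm_mon > 11)
    {
      errno = EINVAL;
      return NULL;
    }

  n = snprintf (str, len, "%s %s %2d %02d:%02d:%02d %lld\n",
                wday[tm->tm_wday], mon[tm->tm_mon], tm->tm_mday,
                tm->tm_hour, tm->tm_min, tm->tm_sec,
                1900LL + tm->tm_year);
  if (n < 0 || (size_t) n >= len)
    {
      /* The string (including the final NUL) did not fit in LEN bytes.  */
      errno = ERANGE;
      return NULL;
    }
  return str;
}
```

With tm_year set to 623 - 1900 this produces "Wed Jun 30 21:49:08 623\n" for the sample date, and a LEN that is too small yields NULL rather than a truncated string.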

/Simon




Re: Let's remove Gnulib's ctime module

2024-02-05 Thread Simon Josefsson via Gnulib discussion list
Here are some examples of ctime usage in GNU InetUtils, starting with
inetd (a single-threaded application):

https://git.savannah.gnu.org/cgit/inetutils.git/tree/src/inetd.c?id=aba8d6528e2577eee7fafab3c418ee5bd94c096b#n1710

This prints the system's time of day.  While we could rewrite that to
use strftime, that would complicate the code to support years < 1000 and
years > 9999 as far as I understand.

Another one is in logger, also a single-threaded application:

https://git.savannah.gnu.org/cgit/inetutils.git/tree/src/logger.c?id=aba8d6528e2577eee7fafab3c418ee5bd94c096b#n301

The code discards day of week and year, but is otherwise okay for years
1000...

Another one is in syslogd:

https://git.savannah.gnu.org/cgit/inetutils.git/tree/src/syslogd.c?id=aba8d6528e2577eee7fafab3c418ee5bd94c096b#n1211
https://git.savannah.gnu.org/cgit/inetutils.git/tree/src/syslogd.c?id=aba8d6528e2577eee7fafab3c418ee5bd94c096b#n1343

There is another use inside libls, which is a bit more interesting:

https://git.savannah.gnu.org/cgit/inetutils.git/tree/libls/print.c?id=aba8d6528e2577eee7fafab3c418ee5bd94c096b#n268

Overall, I'm not sure rewriting these to use sprintf/strftime is a clear
improvement, as the code would be uglier.  Maybe some wrapper will help.
I don't like having code that relies on or triggers undefined behaviour
though.

Could gnulib's ctime replacement call abort() when year<1000 or
year>9999?
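
A sketch of what such an aborting wrapper might look like follows.  It is written against asctime and a broken-down struct tm rather than ctime so that it does not depend on the timezone; both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Return true if the year in TM lies in the range 1000 to 9999, for
   which the C standard defines the output of asctime.  */
bool
sketch_year_fits_ctime (const struct tm *tm)
{
  long long year = 1900LL + tm->tm_year;
  return 1000 <= year && year <= 9999;
}

/* Hypothetical wrapper: abort instead of invoking undefined behaviour
   when the year is out of range.  */
char *
sketch_checked_asctime (const struct tm *tm)
{
  if (!sketch_year_fits_ctime (tm))
    abort ();
  return asctime (tm);
}
```

Whether aborting is acceptable would depend on the caller; a long-running daemon may prefer an error return instead.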

Another idea is to have gnulib's ctime augment the C standard to have
ctime not be undefined but to return shorter and longer strings, which I
believe is still consistent with the C standard?

For example it would be permitted to return strings like "Wed Jun 30
21:49:08 623\n" or "Wed Jun 30 21:49:08 11147\n".

Callers will need to make sure they handle string lengths != 26 though,
but that could be documented for the gnulib replacement.

I think this solution makes sense for these examples, and is somewhat
more in line with what the original wish may have been: "give me an
English-centric human string representation of this time_t value in a
known fixed format".
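
For callers, handling lengths other than 26 mostly means not indexing at fixed offsets from the end of the string.  Here is a minimal sketch with hypothetical helper names; the second helper only mimics the kind of use described for logger above (dropping day of week and year) and is not taken from the inetutils sources.

```c
#include <stddef.h>
#include <string.h>

/* Remove the trailing newline from a ctime-style string in place,
   without assuming that it sits at index 24.  */
char *
sketch_chomp (char *s)
{
  s[strcspn (s, "\n")] = '\0';
  return s;
}

/* Copy the "Mon DD HH:MM:SS" part of a ctime-style string into OUT.
   This part starts at offset 4 and is 15 bytes long whatever the
   length of the year.  Returns OUT, or NULL if STAMP is too short.  */
char *
sketch_month_to_seconds (const char *stamp, char out[16])
{
  if (strlen (stamp) < 19)
    return NULL;
  memcpy (out, stamp + 4, 15);
  out[15] = '\0';
  return out;
}
```

Both helpers give the same kind of result for "Wed Jun 30 21:49:08 623\n" and "Wed Jun 30 21:49:08 11147\n", since they never assume a four-digit year.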

/Simon




Re: Let's remove Gnulib's ctime module

2024-02-05 Thread Simon Josefsson via Gnulib discussion list
Mon 2024-02-05 at 00:59 -0800, Paul Eggert wrote:
> On 2024-02-05 00:16, Simon Josefsson wrote:
> > didn't see anything in your patch that would warn about usage of
> > ctime?  Would it make sense for a gnulib ctime module to NOT replace
> > ctime but warn that this function should really not be used?
> 
> The time-h module does that, so there's no need for the ctime module
> for that.
> 
> As I recall, Gnulib's ctime module fixes some time zone issues on
> MS-Windows, something that's quite low priority for a function that
> user code shouldn't be calling anyway due to ctime's undefined
> behavior when the time_t arg is out of range.

Okay -- I noticed several ctime() uses in GNU InetUtils (and in
somewhat hibernating GNU Shishi..) and will see if we can fix those. 
People seem to be porting GNU InetUtils to Windows so there may be
interest in having this working.

/Simon





Re: Let's remove Gnulib's ctime module

2024-02-05 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> This recent bug relating to ctime suggests that the ctime module is
> more trouble than it's worth now. I propose that we remove
> it. Proposed patch attached (but not installed).

Interesting approach -- I don't mind changing any ctime calls to strftime
in code I come across, however I worry about not noticing these.  I
didn't see anything in your patch that would warn about usage of ctime?
Would it make sense for a gnulib ctime module to NOT replace ctime but
warn that this function should really not be used?  Via header macros,
maybe a stub ctime that calls abort, and maybe a 'make syntax-check'
test.  What do you think?

/Simon




Re: Copyright year update

2024-01-01 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> On 2024-01-01 16:08, Bernhard Voelker wrote:
>> That commit broke the 'update-copyright' tests, because the test script
>> got messed up.
>
> Thanks for reporting that. Turing would have been amused by
> update-copyright modifying its own test, and then failing the modified 
> test. I installed the attached to immunize the test against the
> program it tests.

Indeed =)  Thanks for the report, Bernhard, and for the patch, Paul!

/Simon




Re: Behaviour of strverscmp(3)

2024-01-01 Thread Simon Josefsson via Gnulib discussion list
Thanks for the report, Dmitry.  I am slowly coming back to this.  I have
noticed that Cygwin (via MSYS2) has the same strverscmp as musl:

https://cygwin.com/cgit/newlib-cygwin/tree/newlib/libc/string/strverscmp.c

Compare against musl strverscmp:

https://git.musl-libc.org/cgit/musl/tree/src/string/strverscmp.c

Since gsasl (and many other projects) gets strverscmp() from gnulib, I'm
cc'ing the bug-gnulib list.  I think gnulib should detect and work
around this buggy strverscmp.  The documentation says the function is
missing on all non-glibc platforms, but this is not the case, see:

https://www.gnu.org/software/gnulib/manual/html_node/strverscmp.html

I don't have time to work on a patch for gnulib now, but this e-mail
will serve as a reminder... but happy if someone else has ideas on how
to resolve it in gnulib.  See reproducer below; I recall seeing other
problems too such as strverscmp("1.7", "1.7") behaving differently.

Compare to the glibc/gnulib implementation:

https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/strverscmp.c

/Simon

Dmitry Bogatov  writes:

> Hello.
>
> While trying to build gsasl statically with musl library as part of
> Nixpkgs distribution, I noticed that test built from tests/version.c
> fails when built with musl library. After a bit of troubleshooting, I
> can pinpoint the reason -- different behaviour of "strverscmp" from
> glibc and musl.
>
> Example code:
>
> #include <stdio.h>
> #include <string.h>
>
> int main()
> {
>   int value = strverscmp("UNKNOWN", "2.2.0");
>   printf("%d\n", value);
>   return 0;
> }
>
> Under glibc value "35" is printed (positive), under musl value "-1" is
> printed (negative). Not sure what is the correct solution for the
> issue, so I cross-post into two lists.
>
> For now I plan to patch-out this particular test. Thank you.
>
>




Copyright year update

2024-01-01 Thread Simon Josefsson via Gnulib discussion list
Happy new year!

I was greeted with the seasonal

copyright_check
./gnulib/lib/version-etc.c
maint.mk: out of date copyright in ./gnulib/lib/version-etc.c; update it

in several projects, and did a copyright year bump.

/Simon




[PATCH] announce-gen: Improve links.

2023-12-29 Thread Simon Josefsson via Gnulib discussion list
Hi,

I noticed http:// links were used here...  patch below is pushed.

Happy new year,
/Simon

* build-aux/announce-gen: Use https:// URLs.
---
 ChangeLog  | 5 +
 build-aux/announce-gen | 6 +++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4c5a678323..a3a10e0258 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2023-12-29  Simon Josefsson  
+
+   announce-gen: Improve links.
+   * build-aux/announce-gen: Use https:// URLs.
+
 2023-12-29  Bruno Haible  
 
error: More clang -Winclude-next-absolute-path silencing.
diff --git a/build-aux/announce-gen b/build-aux/announce-gen
index 4056d443b0..c73871f022 100755
--- a/build-aux/announce-gen
+++ b/build-aux/announce-gen
@@ -35,7 +35,7 @@
 eval 'exec perl -wSx "$0" "$@"'
  if 0;
 
-my $VERSION = '2023-07-17 20:05'; # UTC
+my $VERSION = '2023-12-29 18:26'; # UTC
 # The definition above must lie within the first 8 lines in order
 # for the Emacs time-stamp write hook (at end) to update it.
 # If you change this file with Emacs, please let the write hook
@@ -570,10 +570,10 @@ $first_name [on behalf of the $package_name maintainers]
 ==
 
 Here is the GNU $package_name home page:
-http://gnu.org/s/$package_name/
+https://gnu.org/s/$package_name/
 
 For a summary of changes and contributors, see:
-  http://git.sv.gnu.org/gitweb/?p=$package_name.git;a=shortlog;h=v$v1
+  https://git.sv.gnu.org/gitweb/?p=$package_name.git;a=shortlog;h=v$v1
 or run this command from a git-cloned $package_name directory:
   git shortlog v$v0..v$v1
 
-- 
2.34.1





Re: test-argp and clang's ASAN

2023-12-08 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Simon Josefsson wrote:
>> Looking at many traditional GNU tools, it seems --help strings use
>> lower-case, so can we settle on that?
>
> From what I've seen, there are widely used programs in either camp.
> Picking some programs at random:
>
> Lowercase:
> as, cp, bash, bison, ldd, iconv, diff, xgettext, sudo, emacs, tar
>
> Capitalized:
> objdump, gcc, gcal, vim, localedef, modprobe

Indeed.  Maybe harmonizing this is not feasible, or even a useful use of
our time.  Some of the tools above aren't even using upper/lower case
consistently.  It would be nice to recommend one approach going forward
though, and put that in the GNU Coding Standards.  While it seems there
are more tools using lower-case, I would personally prefer to recommend
upper case and sentences with a final '.'.  Few tools seem to use that,
though, only gcc and 'bash -c "help set"' that I could find, so
established practice argues against that...

/Simon




Re: test-argp and clang's ASAN

2023-12-08 Thread Simon Josefsson via Gnulib discussion list
Jeffrey Walton  writes:

>> What should we do?
>>   (A) Ensure that glibc and gnulib argp behave the same:
>>   - Push Sergey's lowercase commit into glibc?
>>   - Revert Sergey's lowercase commit in gnulib?
>> or
>>   (B) Ensure that gnulib overrides glibc:
>>   - Use '#define argp_parse rpl_argp_parse' so that clang doesn't
>> insert its interceptor?
>
> I don't think capital/lower-case matters. The docs for argp defer to
> GNU Coding Standards [1], and I don't see a treatment in the GNU
> Coding Standards. [2,3]
>
> Maybe a third option is, perform a case insensitive compare. Using
> sentence-case does not materially change the result. I.e., the message
> is conveyed and nothing is broken. So why produce a failure?

I think doing a case-insensitive compare is a good idea short term,
however I would prefer converging glibc and gnulib argp more.

Looking at many traditional GNU tools, it seems --help strings use
lower-case, so can we settle on that?

/Simon




Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-27 Thread Simon Josefsson via Gnulib discussion list
Pádraig Brady  writes:

> However if there are good use-cases for bad inputs
> we may need to adjust this patch,
> rather than failing unconditionally.
>
> For example we could just flag non canonical input in the context,
> and leave it up to the caller how to deal with that.

That adds complexity -- I'd prefer to just default to fail and see if we
get complaints.

> It would be good to know an example of good use-cases
> for bad inputs though, as I can't think of any.

The simplest example of a good use-case is being able to decode existing
incorrectly formatted inputs.  However, I think this could be
deferred to other tools built for that purpose, since this is generally not a
trivial feature and it is a slippery slope to support all needs.

This may become a problem if the user-visible failure happens at a very high
level and doing the low-level base64-decoding separately is not feasible in an
application, but let's see...

/Simon




Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-27 Thread Simon Josefsson via Gnulib discussion list
Pádraig Brady  writes:

> To give a little more context, this will avoid
> round trip issues like the following, by failing early:
>
>   $ echo "HelloWorld==" | base64 -d | base64
>   HelloWorlQ==

Thanks for the background and patches!  There are use-cases for bad inputs
(both for good and malicious purposes), but I believe these should be
considered corner-cases and agree that the default should be to reject
them.
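
To make the quoted example concrete: in "HelloWorld==" the final group is
"ld==", which encodes one byte.  Only the 6 bits of 'l' and the top 2 bits
of 'd' are used; the low 4 bits of 'd' (value 29, binary 011101) are
padding bits and are non-zero, so re-encoding yields 'Q' (value 16, binary
010000) instead.  A minimal sketch of such a check follows.  The function
names are hypothetical and this is not the code from the patch under
discussion:

```c
#include <stdbool.h>
#include <string.h>

/* Map a character of the standard base64 alphabet to its 6-bit value,
   or return -1 if it is not in the alphabet.  */
static int
b64_value (char c)
{
  static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const char *p = c ? strchr (alphabet, c) : NULL;
  return p ? (int) (p - alphabet) : -1;
}

/* Return true if the final 4-character group QUAD has all of its
   unused padding bits set to zero (hypothetical helper).  */
static bool
final_quad_is_canonical (const char quad[4])
{
  int v1 = b64_value (quad[1]);
  int v2 = b64_value (quad[2]);

  if (quad[2] == '=' && quad[3] == '=')
    /* One byte encoded: the low 4 bits of the second character are unused.  */
    return v1 >= 0 && (v1 & 0x0f) == 0;
  if (quad[3] == '=')
    /* Two bytes encoded: the low 2 bits of the third character are unused.  */
    return v2 >= 0 && (v2 & 0x03) == 0;
  return true;  /* No padding, hence no unused bits.  */
}
```

With this sketch "ld==" is flagged as non-canonical while "lQ==" is
accepted, which matches the round trip shown above.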

/Simon




Code indentation?

2023-09-10 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> * Commit b93de66735cd6f935ee0970f8cb26908d113e09d introduced mcel.h, but
>   it has tabs. Can we untabify
> mcel.h
> mountlist.c
> verify.h
>   (as we do with all source files that are not shared with glibc)?

We may have discussed this before, but what do you think about
automating running 'indent' on gnulib source code?  Or clang-format,
some projects have switched because it behaves better for them, and GNU
indent releases are far between (which may be a good thing because then
we don't have to re-indent code with every new release).

One way forward could be to:

- Somehow set up a way to identify non-indented code - 'make
  indent-check'?

- Run it continuously, maybe through some CI/CD step with output to a web
  page.

- Have an exclusion list to opt out of the report.

- Manually go through each identified indentation mismatch and propose
  a fix and check if the module maintainer agrees with it, then either
  commit the indentation fix or add the file to the exclusion list.

This will take a long time, and I'm not sure it is a good idea, but at
least we would then have a process for code indentation style
conformance in gnulib.

Maybe there are other approaches to this too.

/Simon




Re: relocatable-lib-lgpl: Don't export symbols from static MSVC .obj files

2023-09-07 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> -#define LIBFOO_DLL_EXPORTED __attribute__((__visibility__("default")))
> -#elif (defined _WIN32 && !defined __CYGWIN__) && BUILDING_SHARED && BUILDING_LIBFOO
> -#define LIBFOO_DLL_EXPORTED __declspec(dllexport)
> -#elif (defined _WIN32 && !defined __CYGWIN__) && BUILDING_SHARED
> -#define LIBFOO_DLL_EXPORTED __declspec(dllimport)
> +# define LIBFOO_DLL_EXPORTED __attribute__((__visibility__("default")))
> +#elif (defined _WIN32 && !defined __CYGWIN__) && @@BUILDING_SHARED@@ && BUILDING_LIBFOO
> +# if defined DLL_EXPORT
> +#  define LIBFOO_DLL_EXPORTED __declspec(dllexport)
> +# else
> +#  define LIBFOO_DLL_EXPORTED
> +# endif
> +#elif (defined _WIN32 && !defined __CYGWIN__) && @@BUILDING_SHARED@@
> +# define LIBFOO_DLL_EXPORTED __declspec(dllimport)

Hi Bruno.  The idea is that this code snippet would go into the public
header file that is installed in /usr/include on people's systems, so
using @@BUILDING_SHARED@@ in it does not seem to work.  I think some
other technique or improved documentation is needed here.  I must admit
I'm not sure I understand the entire background here, and am just looking
at the end result.

If there is no way in CPP to know if we're building code that will use
libfoo as a shared library (symbols DLL_EXPORT or PIC?) I think it would
be acceptable for the public header file to default to setting things up
for using a shared library, but allow the user to specify a
-DLIBFOO_STATIC_BUILD=1 or similar if she wants to build with a static
libfoo.
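
A rough sketch of what such a public header could look like, under the
assumption that LIBFOO_STATIC_BUILD is the opt-out macro suggested above
and that a GCC-compatible compiler is detected via __GNUC__ (the condition
used in the real documentation may differ).  libfoo_version_number is a
made-up function used only to show the macro in a declaration:

```c
/* libfoo.h (sketch): default to shared-library usage, but let the
   user opt out with -DLIBFOO_STATIC_BUILD=1.  All names starting with
   LIBFOO_ or libfoo_ are illustrative.  */
#if defined LIBFOO_STATIC_BUILD
# define LIBFOO_DLL_EXPORTED
#elif defined _WIN32 && !defined __CYGWIN__
# if defined BUILDING_LIBFOO
#  define LIBFOO_DLL_EXPORTED __declspec(dllexport)
# else
#  define LIBFOO_DLL_EXPORTED __declspec(dllimport)
# endif
#elif defined __GNUC__ && __GNUC__ >= 4
# define LIBFOO_DLL_EXPORTED __attribute__((__visibility__("default")))
#else
# define LIBFOO_DLL_EXPORTED
#endif

extern LIBFOO_DLL_EXPORTED int libfoo_version_number (void);

/* Stand-in definition so that the sketch compiles on its own; in a
   real project this would live in the library, not in the header.  */
int
libfoo_version_number (void)
{
  return 1;
}
```

The point of the design is that nothing in the installed header depends
on a configure-time substitution; the only knob is a macro the user of
the library sets.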

/Simon




[PATCH] announce-gen: Allow using local git user.name.

2023-07-17 Thread Simon Josefsson via Gnulib discussion list
Hi.  I think announce-gen should use the username provided by the
per-repo .git/config rather than the global ~/.gitconfig.

I hit this corner case on build farms where I don't want to modify files
outside of the repository.

/Simon
From 6928b3b86169bf5d265f745ff5f93eb21181a59e Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Mon, 17 Jul 2023 22:07:57 +0200
Subject: [PATCH] announce-gen: Allow using local git user.name.

* build-aux/announce-gen (readable_interval): Remove --global
parameter to 'git config' call.
---
 ChangeLog  | 6 ++
 build-aux/announce-gen | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 88de38a2ae..a19162d1bf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-07-17  Simon Josefsson  
+
+	announce-gen: Allow using local git user.name.
+	* build-aux/announce-gen (readable_interval): Remove --global
+	parameter to 'git config' call.
+
 2023-07-17  Bruno Haible  
 
 	mbuiter: Optimize.
diff --git a/build-aux/announce-gen b/build-aux/announce-gen
index 850619a121..4056d443b0 100755
--- a/build-aux/announce-gen
+++ b/build-aux/announce-gen
@@ -35,7 +35,7 @@
 eval 'exec perl -wSx "$0" "$@"'
  if 0;
 
-my $VERSION = '2023-02-26 17:15'; # UTC
+my $VERSION = '2023-07-17 20:05'; # UTC
 # The definition above must lie within the first 8 lines in order
 # for the Emacs time-stamp write hook (at end) to update it.
 # If you change this file with Emacs, please let the write hook
@@ -545,7 +545,7 @@ EOF
   my $v0 = $prev_version;
   my $v1 = $curr_version;
 
-  (my $first_name = `git config --global user.name|cut -d' ' -f1`)
+  (my $first_name = `git config user.name|cut -d' ' -f1`)
 =~ m{\S} or die "no name? set user.name in ~/.gitconfig\n";
 
   chomp (my $n_ci = `git rev-list "v$v0..v$v1" | wc -l`);
-- 
2.34.1





Re: new modules string-desc, xstring-desc, string-desc-quotearg

2023-03-31 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Thanks for the input and feedback. Here are the new modules
>   string-desc,
>   xstring-desc,
>   string-desc-quotearg
> that I'm adding.

Thank you for making it library-friendly!  I will try to find some
package and migrate to these tools, rather than to use custom similar
variants.

/Simon




Re: RFC: add a string-desc module

2023-03-27 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

>   struct
>   {
>     size_t nbytes;
>     char * data;
>   }
>
> I propose to add a module that adds such a type, together with elementary
> functions that work on them.

I think this is a useful contribution, however I see two deal-breakers
for having it in gnulib -- both related to use in libraries.  I think
string helper types/functions like this are useful not only in
applications but also in libraries.  Thus:

 1) License - there really isn't much novelty here, how about making
 this public domain or LGPLv2+?

 2) Applicability to use in a library - using x*alloc and abort is
 frowned upon in libraries.  Libraries should return error codes on
 expected errors (and I argue memory allocation failure is an expected
 error), and not cause application exits.

What do you think?

One way to resolve 2) is to have two variants of this functionality: one
low-level variant that doesn't abort the application on errors, and one
high-level variant that behaves like your implementation.  The
high-level variant could depend on the low-level variant, but that's not
essential.
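
A minimal sketch of the two-variant idea, assuming made-up names (struct
sdesc, sdesc_dup, xsdesc_dup) that are not the API of the proposed module:
the low-level function reports allocation failure to its caller, and the
high-level one wraps it and aborts.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string descriptor, mirroring the struct quoted above.  */
struct sdesc
{
  size_t nbytes;
  char *data;
};

/* Low-level variant, suitable for libraries: copy N bytes from SRC
   into a fresh buffer.  Return 0 on success or ENOMEM on failure.  */
static int
sdesc_dup (struct sdesc *dst, const char *src, size_t n)
{
  char *p = malloc (n ? n : 1);
  if (p == NULL)
    return ENOMEM;
  if (n)
    memcpy (p, src, n);
  dst->nbytes = n;
  dst->data = p;
  return 0;
}

/* High-level variant, suitable for applications: same operation, but
   terminate the program if memory cannot be allocated.  */
static struct sdesc
xsdesc_dup (const char *src, size_t n)
{
  struct sdesc d;
  if (sdesc_dup (&d, src, n) != 0)
    abort ();
  return d;
}
```

Here the high-level variant depends on the low-level one, but as noted
above that layering is a convenience rather than a requirement.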

/Simon




Re: Release management: how do you update the libtool version information?

2023-03-07 Thread Simon Josefsson via Gnulib discussion list
On Tue, 2023-03-07 at 14:20 +0100, Bruno Haible wrote:
> Simon Josefsson wrote:
> > Consider adjusting your habit to update the libtool version directly
> > AFTER a release instead.  I put the following in cfg.mk to make sure I
> > don't forget this:
> > 
> > sc_libtool_version_bump:
> > 	@git diff v$(PREV_VERSION).. | grep '^+AC_SUBST(LT' > /dev/null
> > 
> > Of course, you still have to bump it if you make any API/ABI changes,
> 
> Is a maintainer not _more_ likely to forget about the libtool
> versioning, when he has a rule like that, that makes him think "I'm
> already on the safe side, since I have done it already"?

Yes, maybe -- although both approaches are fragile and depend on
maintainer attention, so they are both sub-optimal.

I have used libabigail's abidiff to find API/ABI differences with good
results -- however, I don't know of a good way to check libtool shared
library versioning information against it.  Some brainstorming on how
it would work:

1) On 'make check' (or distcheck, or similar), do an abidiff of the
newly built library against a previously stored *.abi file and exit
with failure on differences (or if no file exists).

2) Add README notes to instruct maintainers to add known good new *.abi
files named like libfoo-x86_64-1-2-3.abi where 1-2-3 is the libtool-
version when any API/ABI-changes are made, or when libtool version is
bumped.  Maybe the '-3' part shouldn't be part of the filename.

3) for bonus points: Add some consistency check that the diff follows
libtool-semantics for ADDED vs MODIFIED ABI differences.

Not all API/ABI changes result in abidiff changes, though, although
these days I think it is generally considered a bad idea to bump the
libtool shared library version for anything but pure symbol changes.
For semantic changes, introduce new APIs and deprecate the old ones
instead.

/Simon





Re: Release management: how do you update the libtool version information?

2023-03-07 Thread Simon Josefsson via Gnulib discussion list
Vivien Kraus  writes:

> Dear gnulib people,
>
> How do you manage the libtool version information for a library using
> gnulib? For now, I have it written down explicitly in configure.ac.
> Unfortunately, this requires a new commit to bump the numbers before
> each release.
>
> Gnulib provides a script to help update the libtool version
> information. Is there a way to involve that script in the "make
> release-commit" invocation? It is a little awkward to create a commit
> just to bump the libtool version information, or to squash it with the
> commit created by "make release-commit".
>
> My current solution involves a bit of cheating: fix
> do-release-commit-and-tag not to complain about a dirty tree, and have
> the libtool update already staged when running make release-commit.
>
> Am I missing something here? How do you update the libtool version
> information?

Consider adjusting your habit to update the libtool version directly
AFTER a release instead.  I put the following in cfg.mk to make sure I
don't forget this:

sc_libtool_version_bump:
	@git diff v$(PREV_VERSION).. | grep '^+AC_SUBST(LT' > /dev/null

Of course, you still have to bump it if you make any API/ABI changes,
but with this approach, you don't need to bump it just before each
release.  And you can have the latest released version installed and
co-exist nicely with the development branch too.

/Simon




Re: Gnulib and nullptr

2023-02-06 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Therefore, I would be in favour of EITHER
> * doing this when the community as a whole has adopted 'nullptr' in C, i.e.
>   this keyword is no longer something that is new to an average newcomer,
>   (even if that's only 10 years from now),
> OR
> * doing the change only in those places where it actually matters, that is,
>   in varargs argument lists.

I agree with this conclusion -- and pending 1) above, I believe 2) is
sufficient and I would argue that we should all generally continue to
use NULL in all other cases than varargs because it is a well-known
idiom.  This may cause 1) above to never occur, which seems acceptable.
This assumes there aren't other important use-cases for nullptr than
varargs that aren't clear.  Personally I don't believe consistency with
C++ is important (usually this makes C code uglier and less idiomatic in
my experience) but opinions may vary.
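
For context on why varargs are the case that matters: NULL may be defined
as a plain 0, and arguments in the variable part of a call are not
converted to a pointer type, so on platforms where int and pointers differ
in size the callee can read a bad terminator.  A small sketch, using a
made-up function count_strings and the traditional (char *) NULL cast
(nullptr would serve the same purpose in C23):

```c
#include <stdarg.h>
#include <stddef.h>

/* Count the string arguments up to, but not including, the
   terminating null pointer.  Hypothetical example function.  */
static size_t
count_strings (const char *first, ...)
{
  size_t n = 0;
  va_list ap;

  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, char *))
    n++;
  va_end (ap);
  return n;
}
```

A call would look like count_strings ("a", "b", (char *) NULL); writing a
bare NULL or 0 as the terminator is the pattern that is not portable.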

/Simon




Re: maint.mk: announcement should not be emailed to the TP when there are no changes

2023-02-04 Thread Simon Josefsson via Gnulib discussion list
Reuben Thomas  writes:

> Using the standard gnulib release procedure for GNU projects,
> coordina...@translationproject.org is automatically emailed for each
> release, but apparently this is not desired. I received this from Benno
> Schulenberg :
>
> My scripts find zero changes in the msgids.
>>
>> When there are no changes in the POT file, there is no need
>> to announce the (pre)release to me -- it just creates extra,
>> unneeded work for me.
>>
>
> Of course, I can try to catch this manually, but I presume other projects
> are also sending unwanted email like this (unless I'm doing something
> wrong?); can something be done in gnulib?

I manually remove that Cc when not appropriate, or add the following to
cfg.mk for projects without translations:

translation_project_ =

however I realize this is manual work, so I agree with you it could be
improved.

To automatically understand if a cc to the translation project is
necessary or not, I think we'd need access to the previously sent pot
file, no?  I guess we could wget that from somewhere and compare it, but
that seems a bit fragile.  I generally don't like fetching files from
the Internet during builds, so another approach is to store them in git
and make a comparison there.

Sorry, no real answer, but just some additional thoughts.

/Simon




Re: fts: Document this module

2023-01-19 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> The 'fts' module was not documented in the documentation, so far.

The gnulib fts differs from the glibc API, and the two can return
different results when called the same way.  See the end of an earlier
thread here:

https://lists.gnu.org/archive/html/bug-gnulib/2021-07/msg00070.html

I'm not sure what could be done.  Perhaps we could add a sentence to the
gnulib documentation stating that the gnulib fts is not a drop-in
replacement for missing fts functionality but a separate implementation
with different behaviour.

/Simon




Re: RFC: git-commit based mtime-reproducible tarballs

2023-01-16 Thread Simon Josefsson via Gnulib discussion list
Vivien Kraus  writes:

> However, there are situations in which you only have access to a
> shallow clone of the git repository (for instance, Gitlab CI). I am not
> sure how this solution would work in that case.

Indeed, good point.  I think 'make dist' should continue to work in
shallow clones, with its obvious consequences (incomplete ChangeLog,
non-deterministic mtime of version controlled files, anything else?).

/Simon




Re: RFC: git-commit based mtime-reproducible tarballs

2023-01-16 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Paul Eggert wrote:
>> some users want to "trust but verify" and a reproducible 
>> tarball is easier to audit than a non-reproducible one, so for these 
>> users it can be a win to omit the irrelevant data from the tarball.
>
> Reproducibility can be implemented in different ways:
>   - by omitting irrelevant data from the tarball,
>   - by having a customized comparison program 'diff', such that
> "diff --ignore-irrelevant-metadata contents1 contents2"
> would ignore the irrelevant parts.

The problem with a --ignore-irrelevant-metadata approach is that it will
be a judgement call what is irrelevant, and two projects may have
different philosophies that are mutually incompatible.

A devil's advocate case: consider a build system that embeds the
source-code timestamp information in the binary, and the binary sends off
a hash of its executable to a remote server for verification
purposes.  In some projects this may be what you want to achieve.  Then
ignoring this particular metadata will be a critical failure for that
project.

I think it is a worthy goal to reach a tarball that is deterministically
and one-way reproducible from git source code [for the same set of tool
versions].

>> when I do an 'ls 
>> -l' of a source directory that I got from a distribution tarball, it's 
>> useful to see the last time the contents of each source file was changed 
>> upstream.
>
> OK, now we're discussing different ways to make a tarball reproducible.
> That's nice, because Simon's proposal was to make all timestamps equal,
> and that puts me off.
> In binutils-2.40.tar.bz2 all files are from 2023-01-14.
> In android-studio-2021.3.1.17-linux.tar.gz all files are from 2010-01-01.
> It gives me as a user no idea whether this tarball is 13 years old,
> 2 years old, or from yesterday.
>
> I much prefer Paul's approach, since it still conveys meaningful
> timestamps:

I agree!

I even wonder whether the binutils tarball builds properly on, say, HP-UX then?

>> For TZDB, where users have long wanted reproducibility, I use something 
>> like this in a Makefile recipe for each source file $$file:
>> 
>>     time=`git log -1 --format='tformat:%ct' $$file` &&
>>     touch -cmd @$$time $$file
>
> That's good for the files that are under version control.
>
>> 2. What about platform-independent files that are automatically created 
>> from source files from the repository, and that are shipped in the 
>> release tarball?
>
> For these, you could unpack the tarball, see in which order the timestamps
> are, and then assign artificial timestamps, in the same order but exactly
> 2 seconds apart. For example, if the tarball contains
> under version control:
>   hello.c       2023-01-14 13:28:14
>   configure.ac  2023-01-01 14:03:07
> and not under version control:
>   configure     2023-01-15 04:09:10
>   config.h.in   2023-01-15 04:05:19
> then you would determine the
>   max_timestamp_under_vc = max { 2023-01-14 13:28:14, 2023-01-01 14:03:07 }
>                          = 2023-01-14 13:28:14
> and then, since config.h.in is older than configure:
>   touch -m (max_timestamp_under_vc + 2 seconds) config.h.in
>   touch -m (max_timestamp_under_vc + 4 seconds) configure
>
> You can do this without knowing the Makefile rules or scripts which created
> config.h.in and configure.
>
> The increment of 2 seconds is, of course, for VFAT file systems, which have
> only 2 seconds of resolution for file modification times.

Clever!

To implement this we would need a dist-hook to do the 'touch -m ...'
dance on all files.
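
The ordering step quoted above can be sketched as follows.  This is only
an illustration under stated assumptions: struct gen_file and
assign_mtimes are made-up names, timestamps are plain time_t values, and
ties between equal original timestamps are broken by file name (my
assumption, added so that the result is deterministic):

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* A generated (not version controlled) file in the tarball.  */
struct gen_file
{
  const char *name;
  time_t mtime;      /* original modification time */
  time_t new_mtime;  /* artificial, reproducible modification time */
};

/* Order by original mtime, oldest first; break ties by name.  */
static int
by_mtime (const void *a, const void *b)
{
  const struct gen_file *fa = a;
  const struct gen_file *fb = b;

  if (fa->mtime != fb->mtime)
    return (fa->mtime > fb->mtime) - (fa->mtime < fb->mtime);
  return strcmp (fa->name, fb->name);
}

/* Sort FILES and give them timestamps 2 seconds apart, starting
   2 seconds after MAX_VC, the newest timestamp under version control.  */
static void
assign_mtimes (struct gen_file *files, size_t n, time_t max_vc)
{
  qsort (files, n, sizeof *files, by_mtime);
  for (size_t i = 0; i < n; i++)
    files[i].new_mtime = max_vc + 2 * (time_t) (i + 1);
}
```

With the example above, config.h.in would get max_timestamp_under_vc + 2
seconds and configure + 4 seconds; a dist-hook would then apply the
new_mtime values with touch.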

I somewhat fear that the solution here will be more of a problem than
the original problem due to the complexity.

Does anyone see a problem with this approach?  Do you think it is a good
idea?  I like it and don't see any further problems, except for the
complexity but I don't see a way to reduce it.

/Simon




Re: RFC: git-commit based mtime-reproducible tarballs

2023-01-16 Thread Simon Josefsson via Gnulib discussion list
Hi Bruno,

> Hi Simon,
>
>> >   This attempts to make
>> >   reproducible tarballs by sorting the files and passing the
>> >   "--mtime=<date>" option to tar. ...
>> Having the same mtime on all files in a tarball
>
> First question: What is the point of doing that?

Good question, I don't know the motivation for the binutils people.  For
me, the motivation would be to get rid of arbitrary/random differences
in non-source artifacts.  Those make auditing non-source artifacts for
reproducibility more difficult and error prone, and are even a source of
side-channels.  I think the exact motivations are still not fully
understood and that this is an evolving space -- articulating the goals
is useful to measure if we will actually meet them.

To me this is similar to including a build timestamp in a binary.  In
theory it would not cause any problems for anyone, but in practice it
will be one more source of differences that may hide or complicate
finding other more important differences.  Thus from a helicopter
perspective it does make sense to fix that particular non-reproducible
behaviour even though it is difficult to argue that the timestamp by
itself is a serious bug that is important to fix.

> Reproducibility is about verifying that an artifact A was generated
> from a source S.

Right, and I think the proponents of reproducibility suggest that an
even stronger verification should be possible: that there is a
one-to-one correspondence between source S and artifact A [for a
particular environment where A is relevant].

If different artifacts A can be generated from the same source S this
will be a source of irreproducibility and non-deterministic behaviour,
which ultimately can be a security/safety/reliability problem.

> When I, as a GNU maintainer or uploader, create a tarball and upload it
> to ftp.gnu.org, that tarball is the source S. Because that's what I sign
> with my GPG key. The commits in the git repo aren't the source, and even
> the git checkout on my disk aren't the source — because I am free to
> unpack and repack the tarball as I like, before I upload it to ftp.gnu.org.

Yeah, and I think this is what is being challenged recently -- some
people don't consider tarballs the only relevant source code any more.

To me this makes some sense: we have all tried to fix a small bug in a
package by making changes to some source code, and then seen the build
fail catastrophically and sometimes in ways that can't even be resolved
because the necessary tools or source code were left out of the
released tarball.

I think it is good practice to verify that our tarballs can be
regenerated reproducibly from version controlled sources and free tools.

>> 1) Having the same mtime on all files in a tarball may cause problems
>
> Definitely. HP-UX 'make' attempts to rebuild a file Y that depends on
> a file X, if Y and X have the same timestamp (mtime). It is long known
> that you have to have actually different timestamps for some files.

Interesting -- I wonder if supporting HP-UX [without GNU make] is worth
more than the benefits from reproducible tarballs.

/Simon




RFC: git-commit based mtime-reproducible tarballs

2023-01-15 Thread Simon Josefsson via Gnulib discussion list
Hi.  Quoting the recent binutils announcement:

>   As an experiment these tarballs were made with the new "-r <date>"
>   option supported by the src-release.sh script.  This attempts to make
>   reproducible tarballs by sorting the files and passing the
>   "--mtime=<date>" option to tar.  The date used for these tarballs was
>   obtained by running:
>   
>     git log -1 --format=%cd --date=format:%F bfd/version.m4

This got me thinking about git-version-gen and GNUmakefile, and I came
up with the patch below to use the most recent commit as the timestamp
for all files in the tarball.  What do you think?

There are some concerns about this:

1) Having the same mtime on all files in a tarball may cause problems
for some projects that have fragile dependency-systems.  While I think
all dependency checks really should be using >= timestamp tests, I
wouldn't rule out that some use > timestamp tests, which would cause
(sometimes unwanted) rebuilding of some files.  Are there
dependency-constructs where the same mtime for all files in a tarball is
just a bad idea, with no better approach available?
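
To spell out the two kinds of test mentioned in 1), here is a minimal
sketch.  The function names are made up and neither is taken from any
particular make implementation; they only show how equal timestamps are
treated differently:

```c
#include <stdbool.h>
#include <time.h>

/* Target counts as up to date when target_mtime >= source_mtime:
   equal timestamps do NOT trigger a rebuild.  */
static bool
needs_rebuild_lenient (time_t target_mtime, time_t source_mtime)
{
  return target_mtime < source_mtime;
}

/* Target counts as up to date only when target_mtime > source_mtime:
   equal timestamps DO trigger a rebuild.  */
static bool
needs_rebuild_strict (time_t target_mtime, time_t source_mtime)
{
  return target_mtime <= source_mtime;
}
```

With one mtime for every file in the tarball, the lenient test rebuilds
nothing while the strict test rebuilds every target that has a
prerequisite.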

2) The use of TAR_OPTIONS in GNUmakefile is complex and somewhat hard to
debug.  I can't find any cleaner way to provide options to tar for 'make
dist' though.  Automake defines $(AMTAR), but it looks like an internal
symbol which also isn't used (bug?); instead $(am__tar) is used and
defined as am__tar = $${TAR-tar} chof - "$$tardir".  So we can override
TAR in Makefile.am, but it looks like a user variable that we shouldn't
override.  So pending support for an AMTAR (or AM_TAR?) variable in
Makefile.am that actually works, I guess we are stuck with the
TAR_OPTIONS approach.  We could do 'TAR = env TAR_OPTIONS_=... tar' in
Makefile.am but it looks like the wrong approach.

3) The Makefile.am snippet in git-version-gen is difficult to maintain,
can't we put such snippets in a gnulib-owned file and suggest use of
'include gl/top-gl-Makefile.am-include.mk' instead?  The same applies to
gen-ChangeLog rule.  The logic would have to be a bit more complex to
support per-project modifications to these rules though.

Two small bugs that are possible to fix but not important before we know
if mtime-reproducible tarballs are useful or not:

4) If there is no .version file when you type 'make dist' my patch below
would fail to provide --mtime=... to tar.  So it fails if you didn't do
'make' before 'make dist' after ./bootstrap + ./configure in a clean
checkout.

5) It is also a bit fragile that it assumes 'git log -1' works without
checking for errors before invoking touch.

/Simon

diff --git a/build-aux/git-version-gen b/build-aux/git-version-gen
index a72057bf2c..0a98cb12dd 100755
--- a/build-aux/git-version-gen
+++ b/build-aux/git-version-gen
@@ -66,6 +66,7 @@ scriptversion=2022-07-09.08; # UTC
 # BUILT_SOURCES = $(top_srcdir)/.version
 # $(top_srcdir)/.version:
 #  echo '$(VERSION)' > $@-t
+#  touch -m -d @$(shell git log -1 --format=%cd --date=unix) $@-t
 #  mv $@-t $@
 # dist-hook:
 #  echo '$(VERSION)' > $(distdir)/.tarball-version
diff --git a/top/GNUmakefile b/top/GNUmakefile
index 07b331fe53..f0dd41b5b4 100644
--- a/top/GNUmakefile
+++ b/top/GNUmakefile
@@ -25,8 +25,14 @@
 _gl-Makefile := $(wildcard [M]akefile)
 ifneq ($(_gl-Makefile),)
 
+_gl-.version := $(wildcard .version)
+ifneq ($(_gl-.version),)
+_tar_mtime := --mtime=.version
+endif
+
 # Make tar archive easier to reproduce.
-export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name
+export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name \
+   $(_tar_mtime)
 
 # Allow the user to add to this in the Makefile.
 ALL_RECURSIVE_TARGETS =




test-stat-time fails on AFS?

2022-12-12 Thread Simon Josefsson via Gnulib discussion list
Hi.  We got a bug report about a gnulib self-test failure:

https://gitlab.com/oath-toolkit/oath-toolkit/-/issues/30

Summarizing, this is the failure:

test-stat-time.c:186: assertion 'statinfo[2].st_mtime < statinfo[1].st_ctime
|| (statinfo[2].st_mtime == statinfo[1].st_ctime && (get_stat_mtime_ns
(&statinfo[2]) < get_stat_ctime_ns (&statinfo[1])))' failed

The user is building it on an AFS file system, which most likely
explains the failure (although that has to be confirmed).  While the
explanation may be that AFS is not providing POSIX-semantics for the
file system, I think failing a self-test for this reason is too strong:
it could print an error message, or SKIP the test.  Thoughts?
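
A sketch of the SKIP variant, not a patch against test-stat-time.c: the
helper names are made up, and the only convention relied on is that an
Automake-style test reports SKIP by exiting with status 77.

```c
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

enum { TEST_PASS = 0, TEST_SKIP = 77 };

/* Return true if A is strictly earlier than B.  */
static bool
timespec_before (struct timespec a, struct timespec b)
{
  return a.tv_sec < b.tv_sec
         || (a.tv_sec == b.tv_sec && a.tv_nsec < b.tv_nsec);
}

/* Instead of asserting, report SKIP when the file system does not
   show the expected mtime-before-ctime ordering.  */
static int
check_mtime_before_ctime (struct timespec mtime2, struct timespec ctime1)
{
  if (timespec_before (mtime2, ctime1))
    return TEST_PASS;
  fputs ("Skipping test: unexpected timestamp ordering "
         "(non-POSIX file system?)\n", stderr);
  return TEST_SKIP;
}
```

The drawback is that a genuine regression on a POSIX file system would
then also show up as SKIP rather than FAIL, so this trades strictness for
fewer false alarms.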

/Simon




Re: explicit_bzero and -std=c99

2022-11-28 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

>> 2) It seems explicit_bzero.c in gnulib falls back to using 'asm' for
>> GCC, which isn't working in non-GNU modes of gcc.  Further wondering:
>
> I hope I fixed this particular problem by installing the attached.

Thank you!

> Perhaps Gnulib's other uses of asm should also be changed?

Yes, I think we should use '__asm__' instead of 'asm' for the reason
explained by the gcc manual that Bruno linked to.
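
For illustration, here is a sketch of the idiom with the alternate
keyword.  The function wipe_memory is a made-up name and this is not the
gnulib source; it relies on the fact that plain 'asm' is not a keyword
under -std=c99, while '__asm__' remains available in GCC and clang:

```c
#include <stddef.h>
#include <string.h>

/* Clear LEN bytes at S and try to keep the compiler from removing
   the memset as a dead store.  Hypothetical example function.  */
static void
wipe_memory (void *s, size_t len)
{
  memset (s, 0, len);
#if defined __GNUC__ || defined __clang__
  /* Compiler barrier: tell the compiler that memory reachable from S
     may be read here.  Spelled with __asm__ so that it also compiles
     with -std=c99.  */
  __asm__ __volatile__ ("" : : "r" (s) : "memory");
#endif
}
```

On other compilers this sketch has no barrier at all, so it is not a
substitute for a real explicit_bzero implementation.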

>> 3) Is the idiom of using separate functions bzero() vs explicit_bzero()
>> to avoid security-problematic compiler optimization still a good one?
>
> Yes

If so, I would prefer a read_sensitive_file() API instead of read_file()
with a flag to enable the security-sensitive functionality.  I'll leave
it for the future, as the immediate problem is resolved.

Bruno Haible  writes:

>> 1) Does gnulib support building with gcc -std=c99?  I think we should,
>> but it could have documented missing functionality or breakage.
>
> No, Gnulib does not support this. We freely use GCC extensions,
> such as ({...}) or asm, usually conditionalized with
>   defined __GNUC__ || defined __clang__
> Only in math.in.h and xalloc-oversized.h we also test __STRICT_ANSI__.
>
> We could test __STRICT_ANSI__ also in more places, but what really is the
> point? So that people then complain "the asyncsafe-spin and simple-atomic
> tests fail for me"?
>
> The point of '-std=c99' is to verify that the source code is pure ISO C
> without any extensions. Gnulib is not in that category.

Your answer is a bit different from Paul's, and both seem like
reasonable approaches to me.  This may be a situation where sometimes we
make a small effort to be compatible with -std=c99 and sometimes
decide against it.  I think what could help is a bit more documentation
about this problem.  Building gnulib with -std=c99 and fixing some of
the minor issues will likely help future compatibility of code, so I
think we should make small efforts to comply.  I agree that there are
likely some parts of gnulib that simply don't work in C99 mode --
documenting what they are would be useful.

In libtasn1, we want to support C89 environments since it is such a
low-level and bootstrap-relevant library.  At least that holds for the
library; the command-line tool doesn't have to be C89-compatible IMHO.

>>1) The reason for having explicit_bzero is read_file, which needs it
>>for reading sensitive files, a feature we don't use.  Uncoupling this
>>unnecessary dependency would have been nice.
>
> No, we have explicit_bzero because it's a glibc function that we think
> should be available to programs on all OSes.
> 

Sorry I was unclear: the reason for LIBTASN1 to have explicit_bzero is
read_file.  But libtasn1 never uses the sensitive flag, and thus never
really exercises the explicit_bzero code path.

>>3) Is there a way to detect if the compiler supports 'asm'?  The
>>current test 'defined __GNUC__ && !defined __clang__' is what is
>>really failing here.
>
> Probably something like
>   (defined __GNUC__ || defined __clang__) && !defined __STRICT_ANSI__

Using __asm__ instead seems more elegant, and is aligned with the gcc
manual.

>> 3) Is the idiom of using separate functions bzero() vs explicit_bzero()
>>to avoid security-problematic compiler optimization still a good one?
>> 
>>1) If yes, I think we should have read_sensitive_file() rather than
>>extending read_file() with a flag for this purpose.
>> 
>>2) If no, what is the better idiom to use here instead of
>>explicit_bzero?
>
> When the code for average contexts and the code for secure contexts differ
> only by a few lines of code, we would like to avoid code duplication. As
> code duplication means twice the maintenance effort in the future.

Sure -- although it would be possible to implement the essence of
read_file in a way where support for the sensitive flag is a
compile-time option.

/Simon




Re: [PROPOSED 0/4] memset_explicit patches

2022-11-28 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> Here's a proposed set of patches to add support for C23's
> memset_explicit function, along with the corresponding fallout in
> Gnulib.  The idea is to prefer memset_explicit, but continue to
> support explicit_bzero (which is not marked as obsolescent, as it's
> too soon for that).  Comments welcome.

Thanks -- I did a brief code review and it looks fine, and thanks for
adding a test-case for this -- it will be interesting to see in what
environments it will fail, indicating problematic compiler optimizations
(or bugs).

A general observation is that I have mixed feelings about offering
replacements of security-relevant APIs which do not offer the same
guarantees as a secure implementation.  In these situations, it may
actually be preferable to crash or to refuse to build the application,
at least by
default.  Compare with gnulib's getrandom().  On platforms we care
about, things should be secure, but it is just a small bug away from
gnulib deciding to replace a system/compiler-provided secure
memset_explicit with our less secure memset_explicit.

OTOH, this would create a lot of problems: libtasn1's use of read_file()
never uses the sensitive flag, and thus will never call explicit_bzero.
Refusing to build would be excessive.

/Simon




explicit_bzero and -std=c99

2022-11-27 Thread Simon Josefsson via Gnulib discussion list
Hi.  We got a bug report about a build failure (see below), and I'm
wondering:

1) Does gnulib support building with gcc -std=c99?  I think we should,
but it could have documented missing functionality or breakage.

2) It seems explicit_bzero.c in gnulib falls back to using 'asm' for
GCC, which isn't working in non-GNU modes of gcc.  Further wondering:

   1) The reason for having explicit_bzero is read_file, which needs it
   for reading sensitive files, a feature we don't use.  Uncoupling this
   unnecessary dependency would have been nice.

   2) Is there no other way to implement explicit_bzero without 'asm'?
   There is another fallback code path using volatile pointers, but I'm
   not sure it really has the same semantics.

   3) Is there a way to detect if the compiler supports 'asm'?  The
   current test 'defined __GNUC__ && !defined __clang__' is what is
   really failing here.

3) Is the idiom of using separate functions bzero() vs explicit_bzero()
   to avoid security-problematic compiler optimization still a good one?

   1) If yes, I think we should have read_sensitive_file() rather than
   extending read_file() with a flag for this purpose.

   2) If no, what is the better idiom to use here instead of
   explicit_bzero?

Other thoughts?

/Simon

Simon Josefsson via Discussion list for GNU Libtasn1
 writes:

> Vincent Fortier  writes:
>
>> While preparing a gnutls update I ended-up updating libtasn1 from
>> 4.16.  Going to 4.17 works but anything after that fails with:
>
> Thanks for the report!  I can reproduce this using:
>
> ./configure ac_cv_func_explicit_bzero=no CPPFLAGS="-std=c99"
>
> In other words, the problem is due to a combination of a platform
> without explicit_bzero and forcing GCC into C99 mode where it doesn't
> support 'asm', so it is not possible to implement 'explicit_bzero'.  Why
> do you hard code -std=c99?  Try just removing that.  Or use '-std=gnu99'
> to allow GCC to use 'asm'.
>
> Analysing further, the 'explicit_bzero' function is only used by
> 'read_file' in a mode that we never use (for reading sensitive files),
> so it is never needed by libtasn1.  This is not ideal, but I'm not sure
> what a maintainable solution is.
>
> /Simon
>




[PATCH] vc-list-files-tests: Avoid OpenPGP private key operations.

2022-11-13 Thread Simon Josefsson via Gnulib discussion list
Hi.  I had a background job doing 'make check' in a project that
triggered a GnuPG private key operation PIN prompt... this was
surprising to me, and the attached fix should avoid that happening.  If
my PIN had been cached, this would have signed a commit behind my back
(although this would have been a harmless one).  I think this behaviour
should generally be considered a bug.  I wonder if there are more
examples of this hidden deep inside scripts.

/Simon
From 0ab73798b5bc703233195c1d37f96d977fc26ad8 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Sun, 13 Nov 2022 11:50:51 +0100
Subject: [PATCH] vc-list-files-tests: Avoid OpenPGP private key operations.

* tests/test-vc-list-files-git.sh (GIT_CONFIG_GLOBAL): Set it to /dev/null.
---
 ChangeLog   | 6 ++
 tests/test-vc-list-files-git.sh | 7 +++
 2 files changed, 13 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 70ece5200..d51a62a02 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-11-13  Simon Josefsson  
+
+	vc-list-files-tests: Avoid OpenPGP private key operations.
+	* tests/test-vc-list-files-git.sh (GIT_CONFIG_GLOBAL): Set it to
+	/dev/null.
+
 2022-11-03  Bruno Haible  
 
 	dynarray: Rename to glibc-internal/dynarray.
diff --git a/tests/test-vc-list-files-git.sh b/tests/test-vc-list-files-git.sh
index 28292322a..d4e574370 100755
--- a/tests/test-vc-list-files-git.sh
+++ b/tests/test-vc-list-files-git.sh
@@ -22,6 +22,13 @@
 tmpdir=vc-git-$$
 GIT_DIR= GIT_WORK_TREE=; unset GIT_DIR GIT_WORK_TREE
 
+# Ignore local git configurations that may interact badly with
+# commands below.  For example, if the user has set
+# commit.gpgsign=true in ~/.gitconfig the 'git commit' below will
+# require a OpenPGP private key operation which trigger PIN prompts
+# and unwanted hardware access on the developer's machine.
+GIT_CONFIG_GLOBAL=/dev/null; export GIT_CONFIG_GLOBAL
+
 fail=1
 mkdir $tmpdir && cd $tmpdir &&
   # without git, skip the test
-- 
2.37.1 (Apple Git-137.1)





Re: [PATCH] maintainer-makefile: Fix Apple Xcode 'make syntax-check'.

2022-11-01 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi Simon,
>
>> +@if ! indent --version 2> /dev/null | grep -q 'GNU indent'; then\
>
> As mentioned in the Autoconf manual [1], the grep option '-q' is not portable.
> E.g. on Solaris 10:
>
> $ echo | grep -q x
> grep: illegal option -- q
> Usage: grep -hblcnsviw pattern file . . .
>
> The portable alternative is
>   grep 'GNU indent' > /dev/null

Thank you!  I installed these two fixes, which caught the unportable
usages in scripts for some packages.
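
For reference, the portable idiom can be wrapped in a small helper.  This
is only a sketch: quiet_grep is a hypothetical name, and the two banner
strings stand in for real 'indent --version' output:

```shell
# Portable stand-in for 'grep -q': discard the output and rely on
# grep's exit status only.  'quiet_grep' is a hypothetical helper name.
quiet_grep () {
  grep "$@" > /dev/null
}

# Sample version banners standing in for 'indent --version' output.
if echo 'GNU indent 2.2.12' | quiet_grep 'GNU indent'; then
  gnu_result=found
else
  gnu_result=missing
fi

if echo 'some other indent' | quiet_grep 'GNU indent'; then
  other_result=found
else
  other_result=missing
fi

echo "GNU banner: $gnu_result; other banner: $other_result"
```

One difference from 'grep -q' is that this form reads all of its input
instead of stopping at the first match, which should not matter for
short inputs like a version banner.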

/Simon
From 4555e788613f1f5d1e8519427591bef3274d3124 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Tue, 1 Nov 2022 09:06:56 +0100
Subject: [PATCH 1/2] maintainer-makefile: Add syntax-check rule for unportable
 'grep -q'.

* top/maint.mk (sc_unportable_grep_q): Add.
---
 top/maint.mk | 4 
 1 file changed, 4 insertions(+)

diff --git a/top/maint.mk b/top/maint.mk
index 045609c285..85b15fb2d2 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1373,6 +1373,10 @@ sc_vulnerable_makefile_CVE-2012-3386:
 	  '  see https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2012-3386 for details') \
 	  $(_sc_search_regexp)
 
+sc_unportable_grep_q:
+	@prohibit='grep -q' halt="unportable 'grep -q', use >/dev/null instead" \
+	  $(_sc_search_regexp)
+
 vc-diff-check:
 	$(AM_V_GEN)(unset CDPATH; cd $(srcdir) && $(VC) diff) > vc-diffs || :
 	$(AM_V_at)if test -s vc-diffs; then			\
-- 
2.30.2

From 757160ccb0fb5d86d09e415eb52ffa9a3a85be5b Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Tue, 1 Nov 2022 09:09:02 +0100
Subject: [PATCH 2/2] maintainer-makefile: Fix last sc_indent commit.

* top/maint.mk (sc_indent): Don't use grep -q.
Suggested by Bruno Haible.
---
 top/maint.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/top/maint.mk b/top/maint.mk
index 85b15fb2d2..0b42438b2c 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1663,7 +1663,7 @@ indent: # Running indent once is not idempotent, but running it twice is.
 	indent $(indent_args) $(INDENT_SOURCES)
 
 sc_indent:
-	@if ! indent --version 2> /dev/null | grep -q 'GNU indent'; then\
+	@if ! indent --version 2> /dev/null | grep 'GNU indent' > /dev/null; then \
 	echo 1>&2 '$(ME): sc_indent: GNU indent is missing';	\
 	else\
 	  fail=0; files="$(INDENT_SOURCES)";\
-- 
2.30.2





[PATCH] maintainer-makefile: Fix Apple Xcode 'make syntax-check'.

2022-10-31 Thread Simon Josefsson via Gnulib discussion list
Hi.  I installed this to let 'make syntax-check' on Mac OS succeed; I
don't think it is useful to use non-GNU indent.

/Simon
From 4ad0eedf4ff8d294a10c20b8945d0e59aa8141db Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Mon, 31 Oct 2022 09:42:42 +0100
Subject: [PATCH] maintainer-makefile: Fix Apple Xcode 'make syntax-check'.

* top/maint.mk (sc_indent): Don't use non-GNU indent.
---
 ChangeLog| 5 +
 top/maint.mk | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 634aa8797f..acbe62a996 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2022-10-31  Simon Josefsson  
+
+	maintainer-makefile: Fix Apple Xcode 'make syntax-check'.
+	* top/maint.mk (sc_indent): Don't use non-GNU indent.
+
 2022-10-30  Paul Eggert  
 
 	thread: pacify gcc -Wbad-function-cast
diff --git a/top/maint.mk b/top/maint.mk
index 495a0a2bf6..045609c285 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1659,8 +1659,8 @@ indent: # Running indent once is not idempotent, but running it twice is.
 	indent $(indent_args) $(INDENT_SOURCES)
 
 sc_indent:
-	@if ! command -v indent > /dev/null; then			\
-	echo 1>&2 '$(ME): sc_indent: indent is missing';		\
+	@if ! indent --version 2> /dev/null | grep -q 'GNU indent'; then\
+	echo 1>&2 '$(ME): sc_indent: GNU indent is missing';	\
 	else\
 	  fail=0; files="$(INDENT_SOURCES)";\
 	  for f in $$files; do		\
-- 
2.30.2





[PATCH] gendocs: Output timestamp in English.

2022-10-25 Thread Simon Josefsson via Gnulib discussion list
Hi.

I noticed that the generated InetUtils manual had a locale problem in
the timestamp:

https://www.gnu.org/software/inetutils/manual/

The script gendocs.sh has:

: "${SETLANG="env LANG= LC_MESSAGES= LC_ALL= LANGUAGE="}"
...
curdate=`$SETLANG date '+%B %d, %Y'`

The reason seems to be LC_TIME, which PureOS 10 for some reason sets.

jas@latte:~/src/gnulib$ locale
LANG=sv_SE.UTF-8
LANGUAGE=
LC_CTYPE="sv_SE.UTF-8"
LC_NUMERIC=sv_SE.UTF-8
LC_TIME=sv_SE.UTF-8
LC_COLLATE="sv_SE.UTF-8"
LC_MONETARY=sv_SE.UTF-8
LC_MESSAGES="sv_SE.UTF-8"
LC_PAPER=sv_SE.UTF-8
LC_NAME=sv_SE.UTF-8
LC_ADDRESS=sv_SE.UTF-8
LC_TELEPHONE=sv_SE.UTF-8
LC_MEASUREMENT=sv_SE.UTF-8
LC_IDENTIFICATION=sv_SE.UTF-8
LC_ALL=
jas@latte:~/src/gnulib$ env LANG= LC_MESSAGES= LC_ALL= LANGUAGE= date '+%B %d, %Y'
oktober 25, 2022
jas@latte:~/src/gnulib$ env LANG= LC_TIME= LC_MESSAGES= LC_ALL= LANGUAGE= date '+%B %d, %Y'
October 25, 2022
jas@latte:~/src/gnulib$ 

The attached patch fixes this.
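
The precedence involved is LC_ALL first, then the category variable
(LC_TIME for month names), then LANG.  A minimal sketch of the fixed
invocation follows; the LC_ALL=C variant at the end is an alternative
shown for comparison and is not what the patch does:

```shell
# Clear every variable that can influence date(1)'s month names, so the
# POSIX locale (English) is used.  LC_TIME must be cleared too, because
# it takes precedence over LANG when LC_ALL is empty.
SETLANG="env LANG= LC_TIME= LC_MESSAGES= LC_ALL= LANGUAGE="
curdate=$($SETLANG date '+%B %d, %Y')
echo "$curdate"

# Alternative (an assumption, not part of the patch): LC_ALL=C overrides
# all other locale variables in one go.
curdate_c=$(LC_ALL=C date '+%B %d, %Y')
echo "$curdate_c"
```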

/Simon
From 1575cb2bb925bd0b4bd160e06e05d39303c5cca5 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Tue, 25 Oct 2022 23:39:15 +0200
Subject: [PATCH] gendocs: Output timestamp in English.

* build-aux/gendocs.sh (SETLANG): Add LC_TIME= for "date".
---
 ChangeLog| 5 +
 build-aux/gendocs.sh | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6f4bea5c1c..f410dbe048 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2022-10-25  Simon Josefsson  
+
+	gendocs: Output timestamp in English.
+	* build-aux/gendocs.sh (SETLANG): Add LC_TIME= for "date".
+
 2022-10-23  Bruno Haible  
 
 	assert-h: Make static_assert work on Solaris 11.4.
diff --git a/build-aux/gendocs.sh b/build-aux/gendocs.sh
index f6811eea46..ff00029283 100755
--- a/build-aux/gendocs.sh
+++ b/build-aux/gendocs.sh
@@ -2,7 +2,7 @@
 # gendocs.sh -- generate a GNU manual in many formats.  This script is
 #   mentioned in maintain.texi.  See the help message below for usage details.
 
-scriptversion=2022-01-01.00
+scriptversion=2022-10-25.23
 
 # Copyright 2003-2022 Free Software Foundation, Inc.
 #
@@ -40,7 +40,7 @@ srcdir=`pwd`
 scripturl="https://git.savannah.gnu.org/cgit/gnulib.git/plain/build-aux/gendocs.sh"
 templateurl="https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/gendocs_template"
 
-: "${SETLANG="env LANG= LC_MESSAGES= LC_ALL= LANGUAGE="}"
+: "${SETLANG="env LANG= LC_TIME= LC_MESSAGES= LC_ALL= LANGUAGE="}"
 : "${MAKEINFO="makeinfo"}"
 : "${TEXI2DVI="texi2dvi"}"
 : "${DOCBOOK2HTML="docbook2html"}"
-- 
2.30.2





Re: getdelim: Work around buggy implementation on macOS 10.13

2022-10-23 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> While testing a GNU sed snapshot on macOS 10.13, I see this test failure:
...
> ==85029== Invalid read of size 16
...
> An out-of-bounds read. Oh oh. When I reconfigure and recompile with the
> environment variable
>   gl_cv_func_working_getdelim=no
> this test succeeds. Assuming that the GNU sed code is correct (it's not a
> particularly complex code), this proves that the out-of-bounds read comes
> from the macOS getdelim() function.
...
> +  [case "$host_os" in
> + darwin*)
> +   dnl On macOS 10.13, valgrind detected an out-of-bounds read during
> +   dnl the GNU sed test suite:
> +   dnl   Invalid read of size 16
> +   dnl  at 0x100EE6A05: _platform_memchr$VARIANT$Base (in 
> /usr/lib/system/libsystem_platform.dylib)
> +   dnl  by 0x100B7B0BD: getdelim (in 
> /usr/lib/system/libsystem_c.dylib)
> +   dnl  by 0x1B0BE: ck_getdelim (utils.c:254)
> +   gl_cv_func_working_getdelim=no ;;

I don't care strongly about this issue, but this brings up a design
perspective for gnulib wrt valgrind: we have several valgrind
suppression files (see lib/*.valgrind) to silence valgrind complaints
already.  Your solution here chooses a different path.

It can be difficult to assess whether a valgrind complaint is a false
positive or not, especially for system functions (and we've had valgrind
complaints for libc issues that looked problematic in theory but not in
practice).  Unless we can trigger a real bug in the code and test for
that, we have a choice how to handle valgrind complaints: 1) add a
valgrind suppressions file for the complaint, or 2) unconditionally
bring in a gnulib replacement code without testing for that behaviour.

I prefer 1) since 2) will over time lead to us bringing in the entire
gnulib replacement code on all systems, which is really bloated and
leads to other problems.

What do you think about adding a valgrind suppressions file for the
output you found instead?  And possibly improve documentation about how
gnulib intends these suppression files to be used by people who run the
test-suite under valgrind, if anything is missing in that area today.

/Simon




Re: stdbool module unconditionally #define true

2022-10-16 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Paul Eggert wrote:
>> > Shouldn't the following cause a compilation error?
>> > 
>> > root@18544d251872:/# cat>foo.c
>> > int main (void) {
>> > int true = 42;
>> > return true;
>> > }
>> 
>> Yes if you have a C23 compiler, which GCC 12 isn't. To get a proper 
>> compilation error you will have to wait for a future GCC release ...
>
> clang 15 already supports this part of C23. I installed the clang-15.0.2
> binaries from https://github.com/llvm/llvm-project/releases/tag/llvmorg-15.0.2
> on a CentOS 8-stream machine, and it produces the error that you expect:

Wonderful, I managed to reproduce this now -- and it triggered a bunch
of other C23 issues with InetUtils and I'm sure in other projects too
when I get to it...  thank you!

/Simon




Re: stdbool module unconditionally #define true

2022-10-14 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Simon Josefsson wrote:
>> rlogind.c: In function 'rlogind_mainloop':
>> rlogind.c:1112:7: error: expected identifier or '(' before numeric constant
>>  1112 |   int true;
>>   |   ^~~~
>> 
>> The file does not include stdbool.h.  ...
>> 
>> Does C23 disallow this?
>
> Yes. C23 § 6.4.1 states that true and false are now keywords. This precludes
> the use as variable names.

Thanks for the analysis and the pointer!  How can I trigger that without
gnulib's config.h?  Shouldn't the following cause a compilation error?

$ podman run -it gcc:latest
root@18544d251872:/# cat>foo.c
int main (void) {
int true = 42;
return true;
}
^D
root@18544d251872:/# gcc --version
gcc (GCC) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

root@18544d251872:/# gcc -std=gnu2x -o foo foo.c -Wall -Wpedantic -Wextra
root@18544d251872:/# 

> So, that source code will need to change to conform to C23.
>
> With Gnulib, you can opt to avoid the 'stdbool' module and use 'stdbool-c99'.
> This will avoid this compilation error in rlogind.c, but many Gnulib modules
> will not compile in this setting.

I changed the variable names here instead.

/Simon




stdbool module unconditionally #define true

2022-10-14 Thread Simon Josefsson via Gnulib discussion list
Hi.

Upgrading gnulib in inetutils causes a build failure:

rlogind.c: In function 'rlogind_mainloop':
rlogind.c:1112:7: error: expected identifier or '(' before numeric constant
 1112 |   int true;
  |   ^~~~

The file does not include stdbool.h.  Apparently this has worked for
many years and on many platforms.  One may question the wisdom of naming
a variable 'true', but it seems to have worked well, and the
code may originate back before stdbool was introduced into C.

Does C23 disallow this?

I see config.h contains the following (I'm using gcc 10):

/* Define to 1 if bool, true and false work as per C2023. */
/* #undef HAVE_C_BOOL */
...
#ifndef HAVE_C_BOOL
# if !defined __cplusplus && !defined __bool_true_false_are_defined
#  if HAVE_STDBOOL_H
#   include <stdbool.h>
#  else
#   if defined __SUNPRO_C
#    error "<stdbool.h> is not usable with this configuration. To make it usable, add -D_STDC_C99= to $CC."
#   else
#    error "<stdbool.h> does not exist on this platform. Use gnulib module 'stdbool-c99' instead of gnulib module 'stdbool'."
#   endif
#  endif
# endif
# if !true
#  define true (!false)
# endif
#endif

This seems surprising to me -- the snippet comes from m4/c-bool.m4 and
always introduces a #define for the CPP symbol 'true', even when the code
did not request any stdbool-related declarations by doing '#include
<stdbool.h>'.

Shouldn't gnulib provide a stdbool.h file that #define true, instead of
putting it into config.h forcing it on all code?

/Simon




Re: "git-version-gen" prints "Print a version string." three times

2022-10-10 Thread Simon Josefsson via Gnulib discussion list
Bjarni Ingi Gislason  writes:

> "sh -x build-aux/get-version-gen" outputs:

I don't think that's a bug -- running the script under 'sh -x' is not a
supported way of invoking it, but a way to invoke shell debugging of the
script.  I don't see anything unexpected in the debug output: it doesn't
print the help string three times, it is merely shown in debug output.
Redirect debug output with 2>/dev/null and you will only see what the
script prints.  Running 'build-aux/git-version-gen' works fine as
expected.
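
A minimal sketch of the effect, using a made-up two-line script in place
of git-version-gen (the temporary file and the variable names are
hypothetical):

```shell
# A tiny stand-in script; the file name and contents are made up.
script=$(mktemp)
cat > "$script" <<'EOF'
usage="Print a version string."
echo "$usage"
EOF

# Trace output (the '+ ...' lines) goes to stderr; the script's own
# output goes to stdout.
traced=$(sh -x "$script" 2>&1)
clean=$(sh -x "$script" 2>/dev/null)
rm -f "$script"

echo "with trace:";    echo "$traced"
echo "without trace:"; echo "$clean"
```

With the trace included, the string shows up once per traced command
that mentions it plus once as real output; with stderr discarded, only
the real output remains.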

/Simon

> + scriptversion=2022-07-09.08
> + me=build-aux/git-version-gen
> + expr 2022-07-09.08 : \([^-]*\)
> + year=2022
> + version=git-version-gen 2022-07-09.08
>
> Copyright (C) 2022 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later 
> .
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> + usage=Usage: build-aux/git-version-gen [OPTION]... $srcdir/.tarball-version 
> [TAG-NORMALIZATION-SED-SCRIPT]
> Print a version string.
>
> Options:
>
>--prefix PREFIXprefix of git tags (default 'v')
>--fallback VERSION
>   fallback version to use if "git --version" fails
>
>--help display this help and exit
>--version  output version information and exit
>
> Send patches and bug reports to .
> + prefix=v
> + fallback=
> + test 0 -gt 0
> + test x = x
> + echo Usage: build-aux/git-version-gen [OPTION]... $srcdir/.tarball-version 
> [TAG-NORMALIZATION-SED-SCRIPT]
> Print a version string.
>
> Options:
>
>--prefix PREFIXprefix of git tags (default 'v')
>--fallback VERSION
>   fallback version to use if "git --version" fails
>
>--help display this help and exit
>--version  output version information and exit
>
> Send patches and bug reports to .
> Usage: build-aux/git-version-gen [OPTION]... $srcdir/.tarball-version 
> [TAG-NORMALIZATION-SED-SCRIPT]
> Print a version string.
>
> Options:
>
>--prefix PREFIXprefix of git tags (default 'v')
>--fallback VERSION
>   fallback version to use if "git --version" fails
>
>--help display this help and exit
>--version  output version information and exit
>
> Send patches and bug reports to .
> + exit 1
>
>




Re: lib/malloca.c: warning about [-Wsign-compare]

2022-09-23 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> On 9/22/22 11:20, Bjarni Ingi Gislason wrote:
>
>> CC='clang -Wsign-compare' ./gnulib-tool --test malloca 2>
>
> Oh, please don't use -Wsign-compare. Clang generates too many false
> alarms with -Wsign-compare, we don't recommend that warning, and 
> Gnulib-using programs generally don't enable that warning when
> compiling Gnulib code.
>
> If you happen to find a real bug with that warning we'd like to know
> it. But please don't bother us with the false alarms; they're not
> worth your time or ours.

I added a similar comment to the manual: it is handy to have a reference
for people like me who cannot remember all the different warning flags and
whether they are generally useful or not.

/Simon
From 54c09c98a67219ba2cf70c4bb23f80990db37066 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Fri, 23 Sep 2022 09:06:22 +0200
Subject: [PATCH] warnings, manywarnings: Doc fixes.

* doc/manywarnings.texi (manywarnings): Improve usage instruction.
Start list of comments on particular warning flags, based on
comment from Paul Eggert .
* doc/warnings.texi (warnings): Mention that it is often used with manywarnings.
---
 ChangeLog |  8 
 doc/manywarnings.texi | 14 +-
 doc/warnings.texi |  4 +++-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index a6399f1048..5b5804df68 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2022-09-23  Simon Josefsson  
+
+	warnings, manywarnings: Doc fixes.
+	* doc/manywarnings.texi (manywarnings): Improve usage instruction.
+	Start list of comments on particular warning flags, based on
+	comment from Paul Eggert .
+	* doc/warnings.texi (warnings): Mention that it is often used with manywarnings.
+
 2022-09-21  Paul Eggert  
 
 	assert-h: suppress clang false alarms
diff --git a/doc/manywarnings.texi b/doc/manywarnings.texi
index 1b3e5907be..7ab3f09cee 100644
--- a/doc/manywarnings.texi
+++ b/doc/manywarnings.texi
@@ -32,7 +32,7 @@ go through the list of warnings. You will likely deactivate warnings that
 occur often and don't point to mistakes in the code, by adding them to the
 @samp{nw} variable, then reconfiguring and recompiling. When warnings point
 to real mistakes and bugs in the code, you will of course not disable
-them.
+them but fix your code to silence the warning instead.
 
 There are also many GCC warning options which usually don't point to mistakes
 in the code; these warnings enforce a certain programming style. It is a
@@ -44,3 +44,15 @@ When a new version of GCC is released, you can add the new warning options
 that it introduces into the @code{gl_MANYWARN_ALL_GCC} macro (and submit your
 modification to the Gnulib maintainers :-)), and enjoy the benefits of the
 new warnings, while adding the undesired ones to the @samp{nw} variable.
+
+Comments on particular warning flags:
+
+@table @samp
+
+@item -Wsign-compare
+Clang generates too many false alarms with -Wsign-compare, and we don't
+recommend that warning.  Programs using Gnulib generally don't enable
+that warning when compiling Gnulib code.  If you happen to find a real
+bug with that warning we'd like to know it.
+
+@end table
diff --git a/doc/warnings.texi b/doc/warnings.texi
index 1836c04325..47ce633250 100644
--- a/doc/warnings.texi
+++ b/doc/warnings.texi
@@ -2,7 +2,9 @@
 @section warnings
 
 The @code{warnings} module allows to regularly build a package with more
-GCC warnings than the default warnings emitted by GCC.
+GCC warnings than the default warnings emitted by GCC.  It is often used
+indirectly through the @code{manywarnings} module
+(@pxref{manywarnings}).
 
 It provides the following functionality:
 
-- 
2.30.2





Re: maint.mk: public-submodule-commit rule is broken

2022-09-12 Thread Simon Josefsson via Gnulib discussion list
Btw, see my concerns with this code earlier here:

https://lists.gnu.org/archive/html/bug-gnulib/2022-08/msg00040.html
https://lists.gnu.org/archive/html/bug-gnulib/2022-08/msg00044.html

Instead of the patch, I have merely disabled it when I ran into issues,
since it doesn't add value for me and causes problems.

Couldn't this be converted into a syntax-check rule instead?  Or just
removed.

/Simon




Re: TAR_OPTIONS after one decade

2022-09-06 Thread Simon Josefsson via Gnulib discussion list
Hi.  I have committed this.

/Simon
From 4b17a1ae49e69df1ac5dc35a4f60b20ab958baf2 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Tue, 6 Sep 2022 14:32:05 +0200
Subject: [PATCH] gnumakefile: Improve tarball reproducibility.

* top/GNUmakefile (TAR_OPTIONS): Add --sort=name.  Suggested by
Tzvetelin Katchov .
* DEPENDENCIES: Mention tar 1.28 dependency.
---
 ChangeLog   | 7 +++
 DEPENDENCIES| 9 +
 top/GNUmakefile | 2 +-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 26231407bb..d5a5af7d74 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-09-06  Simon Josefsson  
+
+	gnumakefile: Improve tarball reproducibility.
+	* top/GNUmakefile (TAR_OPTIONS): Add --sort=name.  Suggested by
+	Tzvetelin Katchov .
+	* DEPENDENCIES: Mention tar 1.28 dependency.
+
 2022-09-05  Bruno Haible  
 
 	pthread-h: Fix compilation error on mingw with --enable-threads=windows.
diff --git a/DEPENDENCIES b/DEPENDENCIES
index 23fa1f5a8b..3b24f45ad6 100644
--- a/DEPENDENCIES
+++ b/DEPENDENCIES
@@ -174,3 +174,12 @@ at any time.
 https://www.gnu.org/software/libtool/
   + Download:
 https://ftp.gnu.org/gnu/libtool/
+
+* GNU tar 1.28 or newer.
+  + Recommended.
+Needed if you use the 'gnumakefile' module, which sets TAR_OPTIONS
+to --sort=names (added in version 1.28) in GNUmakefile for 'make dist'.
+  + Homepage:
+https://www.gnu.org/software/tar/
+  + Download:
+https://ftp.gnu.org/gnu/tar/
diff --git a/top/GNUmakefile b/top/GNUmakefile
index 7a08c9d55b..a778610d28 100644
--- a/top/GNUmakefile
+++ b/top/GNUmakefile
@@ -26,7 +26,7 @@ _gl-Makefile := $(wildcard [M]akefile)
 ifneq ($(_gl-Makefile),)
 
 # Make tar archive easier to reproduce.
-export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner
+export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name
 
 # Allow the user to add to this in the Makefile.
 ALL_RECURSIVE_TARGETS =
-- 
2.30.2





Re: unictype/category-none tests: Fix a link error on MSVC

2022-09-06 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> On MSVC, with libunistring installed as a shared library, I get this link
> error:
>
> /home/bruno/msvc/compile cl -nologo  -MD  -L/usr/local/msvc64/lib -o 
> test-categ_none.exe unictype/test-categ_none.obj libtests.a ../gllib/libgnu.a 
> libtests.a ../gllib/libgnu.a libtests.a  -lunistring 
> test-categ_none.obj : error LNK2019: unresolved external symbol 
> _UC_CATEGORY_NONE referenced in function main
> test-categ_none.exe : fatal error LNK1120: 1 unresolved externals
> make[4]: *** [Makefile:16335: test-categ_none.exe] Error 2
>
> The reason is that _UC_CATEGORY_NONE is not a public API of the shared library
> and therefore not exported. The simplest fix is to disable the test.

Thanks -- although isn't it also a bug in libunistring that the symbol
is visible for non-MSVC?  Shouldn't it be hidden?

/Simon

>
> 2022-09-04  Bruno Haible  
>
>   unictype/category-none tests: Fix a link error on MSVC.
>   * tests/unictype/test-categ_none.c (main): Disable the test on MSVC.
>
> diff --git a/tests/unictype/test-categ_none.c 
> b/tests/unictype/test-categ_none.c
> index 4615fb162b..913011a5e4 100644
> --- a/tests/unictype/test-categ_none.c
> +++ b/tests/unictype/test-categ_none.c
> @@ -25,11 +25,18 @@
>  int
>  main ()
>  {
> +  /* This test cannot be compiled on platforms on which _UC_CATEGORY_NONE
> +     is not exported from the libunistring shared library.  For now,
> +     MSVC is the only platform where this is a problem.  */
> +#if !defined _MSC_VER
> +
>    uc_general_category_t ct = _UC_CATEGORY_NONE;
>    unsigned int c;
>  
>    for (c = 0; c < 0x110000; c++)
>      ASSERT (!uc_is_general_category (c, ct));
>  
> +#endif
> +
>    return 0;
>  }
>
>
>
>
>




Re: ISO C 23 ahead

2022-08-31 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

>> +if (ckd_mul (&plensize, plen, sizeof (CHAR))   \
>
> Indeed it should. Thanks for reporting that. I installed the attached.

Thank you!

>> I wonder why no self-check has caught this?
>
> Apparently I tested it only on platforms with working fnmatch, so the
> bad code was never compiled.

I'm experimenting with a Debian 6 gnulib CI/CD job, not sure it will be
worthwhile: https://gitlab.com/jas/gnulib-ci/-/pipelines

/Simon




Re: ISO C 23 ahead

2022-08-30 Thread Simon Josefsson via Gnulib discussion list
I think some of these patches introduced a build failure of GNU
InetUtils on Debian 6, see build error below.  The following looks
strange:

-if (INT_MULTIPLY_WRAPV (plen, sizeof (CHAR), &plensize)   \
-|| INT_ADD_WRAPV (new_used, plensize, &new_used)) \
+if (ckd_mul (&plensize, plen, sizeof (CHAR), &plensize)   \
+|| ckd_add (&new_used, new_used, plensize))   \

Shouldn't it be like this:

+if (ckd_mul (&plensize, plen, sizeof (CHAR))  \

I wonder why no self-check has caught this?

/Simon

https://gitlab.com/jas/inetutils/-/jobs/2956458571

gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -Wno-cast-qual -Wno-conversion 
-Wno-float-equal -Wno-sign-compare -Wno-undef -Wno-unused-function 
-Wno-unused-parameter -Wno-sign-conversion -Wno-type-limits -g -O2 -MT 
libgnu_a-xasprintf.o -MD -MP -MF .deps/libgnu_a-xasprintf.Tpo -c -o 
libgnu_a-xasprintf.o `test -f 'xasprintf.c' || echo './'`xasprintf.c
mv -f .deps/libgnu_a-xasprintf.Tpo .deps/libgnu_a-xasprintf.Po
depbase=`echo asnprintf.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -g -O2 -MT asnprintf.o -MD 
-MP -MF $depbase.Tpo -c -o asnprintf.o asnprintf.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo fnmatch.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -g -O2 -MT fnmatch.o -MD 
-MP -MF $depbase.Tpo -c -o fnmatch.o fnmatch.c &&\
mv -f $depbase.Tpo $depbase.Po
In file included from fnmatch.c:143:
fnmatch_loop.c:1067:13: error: macro "ckd_mul" passed 4 arguments, but takes 
just 3
In file included from fnmatch.c:143:
fnmatch_loop.c: In function 'ext_match':
fnmatch_loop.c:1067: error: 'ckd_mul' undeclared (first use in this function)
fnmatch_loop.c:1067: error: (Each undeclared identifier is reported only once
fnmatch_loop.c:1067: error: for each function it appears in.)
fnmatch_loop.c:1074:13: error: macro "ckd_mul" passed 4 arguments, but takes 
just 3
In file included from fnmatch.c:232:
fnmatch_loop.c:1067:1: error: macro "ckd_mul" passed 4 arguments, but takes 
just 3
In file included from fnmatch.c:232:
fnmatch_loop.c: In function 'ext_wmatch':
fnmatch_loop.c:1067: error: 'ckd_mul' undeclared (first use in this function)
fnmatch_loop.c:1074:1: error: macro "ckd_mul" passed 4 arguments, but takes 
just 3
make[4]: *** [fnmatch.o] Error 1
make[4]: Leaving directory `/builds/jas/inetutils/inetutils-2.3.4-6821/lib'

/Simon




Re: Creating a formula for Homebrew

2022-08-26 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi,
>
> Wesley Viana wrote:
>> So I was wondering how to contribute by "packing" gnulib into a brew
>> formula.
>
> Packaging gnulib through a packaging system (such as Debian, pkg, BSD ports,
> or brew) is, in the current state of things, not desirable.
>
> Gnulib is a source code library [1], and, although the documentation states
> that the user has the choice between using the git repo and stable releases 
> [2],
> there have not been such stable releases for 4 years. That is, everyone uses
> the git repo. And we are taking QA steps to ensure a high quality of the
> code in the git repo.

While I agree with the above, I do believe there is a way to package
gnulib in packaging systems that makes sense: by packaging the gnulib git
repository including its entire git history, and keeping that "package" up
to date regularly.  After all, that is the way developers use gnulib: it
is a source archive, and every git commit is a release that somebody may
want to use.

The source of the problem I see described in this thread is that
distributions (for good reasons) want to rebuild everything from
source, and building from tarballs is quite fragile: it is difficult to
know if you rebuilt everything from its pure source code or accidentally
just re-used a pre-built artifact.  The alternative that people have
adopted in recent years is to build from version-controlled sources and
go through the bootstrap steps themselves (usually quite poorly, by
just calling autoreconf -fvi, which was never intended for that purpose),
and if done right I think this results in something that is better for
distributions.

Building from git introduces two new problems: 1) not using official
release tarball artifacts, introducing unreliability in what a release
is, including what cryptographic signatures to trust, and 2) any git
submodules (such as gnulib) and external resources (like translation
files) need to be available offline locally in a package.

Translation files are tricky, and I'm not sure how to resolve that in a
good way.  There is no strong connection between a git repo and the
translation file intended to be used with it, introducing unreliability
and supply-chain issues.  Perhaps the solution is similar to the gnulib
packaging: just package all translationproject.org *.po files in a
separate package in the distribution, and keep that up to date, and use
that as the source during source-code rebuilds.

/Simon




Re: tr portability

2022-08-25 Thread Simon Josefsson via Gnulib discussion list
Thu 2022-08-25 at 19:21 +0200, Bruno Haible wrote:
> Simon Josefsson wrote:
> > > In GNU gettext, many tests use "tr -d '\r'" since 2007 already,
> > > and no one
> > > ever has reported a problem with it.
> > 
> > But does it fail fatally when it doesn't work?  tests/parser.sh in
> > libtasn1 did
> 
> It leads to a test failure, yes, because it removed all 'r's from the
> test result. But only on Solaris 10 and only when /usr/ucb comes too
> early in $PATH.

Okay -- I've reverted this usage in libtasn1 (committed just last year
-- but I cannot find the report), let's see if we get new reports.

/Simon





Re: tr portability

2022-08-25 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Since we are already stating in the generic INSTALL file, since 2008:
>
>  On Solaris, don't put '/usr/ucb' early in your 'PATH'.  This
>   directory contains several dysfunctional programs; working variants of
>   these programs are available in '/usr/bin'.  So, if you need '/usr/ucb'
>   in your 'PATH', put it _after_ '/usr/bin'.
>
> your precautions above are coping with a situation that is unsupported
> anyway.

Ah, thanks for the pointer.

> In GNU gettext, many tests use "tr -d '\r'" since 2007 already, and no one
> ever has reported a problem with it.

But does it fail fatally when it doesn't work?  tests/parser.sh in
libtasn1 did, and we got a report about this just a year or so ago, so I
guess the problem still exists somewhere.

/Simon




Re: Bison submit patches

2022-08-24 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> In this case, you'll better modify the unit test to pipe the result
> through "tr -d '\r'".

This is unrelated, but alas I've not found a more portable way to trim
CR than this, since some tr implementations do not support \r:

if echo solaris | tr -d '\r' | grep solais > /dev/null; then
  cr='\015'
else
  cr='\r'
fi
# normalize output
LC_ALL=C tr -d "$cr" < $TMPFILE > x$TMPFILE
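
A possibly simpler variant, sketched here under the assumption that
every tr of interest accepts octal escapes (POSIX specifies them for
tr, and the probe above already falls back to '\015'), is to use the
octal form unconditionally.  The helper name strip_cr is made up for
illustration:

```shell
# strip_cr: hypothetical helper; copies stdin to stdout with all CR bytes removed.
# It always uses the octal escape \015 rather than probing for \r support.
strip_cr () {
  LC_ALL=C tr -d '\015'
}

# Usage: normalize CRLF output before comparing it with expected results.
printf 'line one\r\nline two\r\n' | strip_cr
```

This drops the probe entirely, at the cost of being slightly less
readable than '\r'.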

/Simon




[PATCH] maintainer-makefile: Check for incorrect DISTCHECK_CONFIGURE_FLAGS usage.

2022-08-16 Thread Simon Josefsson via Gnulib discussion list
I discovered use of DISTCHECK_CONFIGURE_FLAGS in Makefile.am for several
projects, but that's a user variable, so AM_DISTCHECK_CONFIGURE_FLAGS
should be used instead.
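
For projects that do not use maint.mk, roughly the same check can be
sketched as a stand-alone shell function.  This is only an
approximation of the syntax-check rule in the patch: the function name
check_distcheck_flags is hypothetical, and the files to scan are
passed explicitly instead of being selected from version control.

```shell
# check_distcheck_flags FILE...: hypothetical stand-alone variant of the check.
# Fails if any given file starts a line with DISTCHECK_CONFIGURE_FLAGS.
check_distcheck_flags () {
  # /dev/null is added so grep prints file names even for a single argument.
  if grep -nE '^DISTCHECK_CONFIGURE_FLAGS' "$@" /dev/null; then
    echo 'use AM_DISTCHECK_CONFIGURE_FLAGS' >&2
    return 1
  fi
  return 0
}
```

A line starting with AM_DISTCHECK_CONFIGURE_FLAGS does not match,
because the pattern is anchored at the start of the line.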

/Simon
From dec7194206fc1ec7db0a94472d8ece58025040c6 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Tue, 16 Aug 2022 17:26:56 +0200
Subject: [PATCH] maintainer-makefile: Check for incorrect
 DISTCHECK_CONFIGURE_FLAGS usage.

* top/maint.mk (sc_makefile_DISTCHECK_CONFIGURE_FLAGS): Add.
---
 ChangeLog    | 6 ++++++
 top/maint.mk | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index c38798745d..b639d1709d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-08-16  Simon Josefsson  
+
+	maintainer-makefile: Check for incorrect DISTCHECK_CONFIGURE_FLAGS
+	usage.
+	* top/maint.mk (sc_makefile_DISTCHECK_CONFIGURE_FLAGS): Add.
+
 2022-08-16  Bruno Haible  
 
 	tempname: Add tests.
diff --git a/top/maint.mk b/top/maint.mk
index c1fdf9ca2c..5745d5831d 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1256,6 +1256,12 @@ sc_makefile_path_separator_check:
 	halt=$(msg)			\
 	  $(_sc_search_regexp)
 
+sc_makefile_DISTCHECK_CONFIGURE_FLAGS:
+	@prohibit='^DISTCHECK_CONFIGURE_FLAGS'\
+	in_vc_files='akefile|\.mk$$'	\
+	halt="use AM_DISTCHECK_CONFIGURE_FLAGS"\
+	  $(_sc_search_regexp)
+
 # Check that 'make alpha' will not fail at the end of the process,
 # i.e., when pkg-M.N.tar.xz already exists (either in "." or in ../release)
 # and is read-only.
-- 
2.30.2





Re: support shallow gnulib submodule checkouts

2022-08-16 Thread Simon Josefsson via Gnulib discussion list
Updated patch below.

Meanwhile, I'll disable the public-submodule-commit test in the projects
I work on -- I don't think it is appropriate to run it on every 'make
check' invocation.  Disable it by putting this in cfg.mk:

submodule-checks =
gl_public_submodule_commit =

/Simon

diff --git a/top/maint.mk b/top/maint.mk
index c1fdf9ca2c..1982833e91 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1493,10 +1493,13 @@ submodule-checks ?= no-submodule-changes 
public-submodule-commit
 # Ensure that each sub-module commit we're using is public.
 # Without this, it is too easy to tag and release code that
 # cannot be built from a fresh clone.
+gl_seed ?= d146c864e8d8cc82e96d722337253dd5a3a803b8
 .PHONY: public-submodule-commit
 public-submodule-commit:
$(AM_V_GEN)if test -d $(srcdir)/.git\
-   && git --version >/dev/null 2>&1; then  \
+   && git --version >/dev/null 2>&1\
+   && { cd $(gnulib_dir) &&\
+   git cat-file -e $(gl_seed); }; then \
  cd $(srcdir) &&   \
  git submodule --quiet foreach \
  'test "$$(git rev-parse "$$sha1")"\




support shallow gnulib submodule checkouts

2022-08-16 Thread Simon Josefsson via Gnulib discussion list
Hi

Fetching the gnulib submodule takes quite some time, so I'd like to use
a shallow checkout instead.  However I get an error:

jas@latte:~/src/gsasl-small$ make check
  GEN  public-submodule-commit
fatal: run_command returned non-zero status for gnulib
.
maint.mk: found non-public submodule commit
make: *** [maint.mk:1498: public-submodule-commit] Fel 1
jas@latte:~/src/gsasl-small$ 

See code snippet from maint.mk below.  The test fails, but the situation
that the documentation says it is intended to protect against is not
occurring.

How about this patch?  It will only run the test if the first gnulib
commit is present.  I'm not sure it is possible to catch the problem the
test is trying to detect in a shallow checkout?

diff --git a/top/maint.mk b/top/maint.mk
index c1fdf9ca2c..0b3208a158 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -1496,7 +1496,8 @@ submodule-checks ?= no-submodule-changes 
public-submodule-commit
 .PHONY: public-submodule-commit
 public-submodule-commit:
$(AM_V_GEN)if test -d $(srcdir)/.git\
-   && git --version >/dev/null 2>&1; then  \
+   && git --version >/dev/null 2>&1\
+   && git cat-file -e d146c864e8d8cc82e96d722337253dd5a3a803b8; 
then   \
  cd $(srcdir) &&   \
  git submodule --quiet foreach \
  'test "$$(git rev-parse "$$sha1")"\
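
The guard added by the patch can be tried outside of make with a small
sketch like the following.  The function name commit_present is
hypothetical; the only git feature relied on is 'git cat-file -e',
which exits with zero status when the named object exists.

```shell
# commit_present DIR SHA: hypothetical helper.
# Succeeds if the object SHA exists in the repository at DIR,
# which is not the case for old commits in a shallow checkout.
commit_present () {
  git -C "$1" cat-file -e "$2" 2>/dev/null
}

# Example use, with the seed commit from the patch:
#   commit_present gnulib d146c864e8d8cc82e96d722337253dd5a3a803b8 \
#     && echo "full history, the check can run"
```

In a clone made with --depth 1 the seed commit is absent, so the
helper fails and the public-submodule-commit test would be skipped.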

/Simon

# Ensure that each sub-module commit we're using is public.
# Without this, it is too easy to tag and release code that
# cannot be built from a fresh clone.
.PHONY: public-submodule-commit
public-submodule-commit:
$(AM_V_GEN)if test -d $(srcdir)/.git\
&& git --version >/dev/null 2>&1; then  \
  cd $(srcdir) &&   \
  git submodule --quiet foreach \
  'test "$$(git rev-parse "$$sha1")"\
  = "$$(git merge-base origin "$$sha1")"'   \
|| { echo '$(ME): found non-public submodule commit' >&2;   \
 exit 1; }; \
else\
  : ;   \
fi
# This rule has a high enough utility/cost ratio that it should be a
# dependent of "check" by default.  However, some of us do occasionally
# commit a temporary change that deliberately points to a non-public
# submodule commit, and want to be able to use rules like "make check".
# In that case, run e.g., "make check gl_public_submodule_commit="
# to disable this test.
gl_public_submodule_commit ?= public-submodule-commit
check: $(gl_public_submodule_commit)




[PATCH] pmccabe2html: Doc fix.

2022-08-15 Thread Simon Josefsson via Gnulib discussion list
This improves the suggested Makefile.am snippet.

/Simon
From 416872ced15e471f2af2f960ca911da05748a870 Mon Sep 17 00:00:00 2001
From: Simon Josefsson 
Date: Tue, 16 Aug 2022 00:28:22 +0200
Subject: [PATCH] pmccabe2html: Doc fix.

* build-aux/pmccabe2html: Don't use reserved _SOURCES namespace.
Use AM_V_GEN.  Use LC_ALL=C.
---
 ChangeLog              |  6 ++++++
 build-aux/pmccabe2html | 10 +++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8736a82d1e..c0f2259599 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-08-16  Simon Josefsson  
+
+	pmccabe2html: Doc fix.
+	* build-aux/pmccabe2html: Don't use reserved _SOURCES namespace.
+	Use AM_V_GEN.  Use LC_ALL=C.
+
 2022-08-15  Bruno Haible  
 
 	stdbool: Drop old BeOS support that gets in the way of ISO C 23 support.
diff --git a/build-aux/pmccabe2html b/build-aux/pmccabe2html
index 20bdf52bed..f29eae3f00 100644
--- a/build-aux/pmccabe2html
+++ b/build-aux/pmccabe2html
@@ -21,12 +21,12 @@
 
 # Typical Invocation is from a Makefile.am:
 #
-# CYCLO_SOURCES = ${top_srcdir}/src/*.[ch]
+# CYCLO_SRCS = ${top_srcdir}/src/*.[ch]
 #
-# cyclo-$(PACKAGE).html: $(CYCLO_SOURCES)
-# 	$(PMCCABE) $(CYCLO_SOURCES) \
-# 		| sort -nr \
-# 		| $(AWK) -f ${top_srcdir}/build-aux/pmccabe2html \
+# cyclo-$(PACKAGE).html: $(CYCLO_SRCS)
+# 	$(AM_V_GEN)$(PMCCABE) $(CYCLO_SRCS) \
+# 		| LC_ALL=C sort -nr \
+# 		| LC_ALL=C $(AWK) -f ${top_srcdir}/build-aux/pmccabe2html \
 # 			-v lang=html -v name="$(PACKAGE_NAME)" \
 # 			-v vcurl="https://git.savannah.gnu.org/gitweb/?p=$(PACKAGE).git;a=blob;f=%FILENAME%;hb=HEAD" \
 # 			-v url="https://www.gnu.org/software/$(PACKAGE)/" \
-- 
2.30.2





Re: split bootstrap in two phases

2022-08-14 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible  writes:

> Hi all 'bootstrap' users,
>
> Over the last few years, it has become more and more clear that bootstrap
> does two things:
>   (1) Fetch auxiliary files that are not in the git checkout.
>   This is the part that requires network access and that has
>   supply-chain concerns.
>   (2) Generate files such as configure, config.h, Makefile.in etc.
>   This includes running gnulib-tool.
>
> Recent discussion in gnu-prog-discuss has shown that making the separation
> into two phases (1) and (2) explicit will have several benefits:
>
>   * For reproducible builds, it is necessary to be able to execute the second
> phase without the first one.
>
>   * For people who have local modifications in dependency packages
> (e.g. in gnulib), or for distros who want to apply patches to .m4 files,
> it is important to have these phases separated.
>
>   * The second phase is a way for people to regenerate files in a
> supported way. So far, too many users have been running 'autoreconf -fvi',
> which many GNU packages don't support.

To replace the use of 'autoreconf -fvi', doesn't (at least) the
autogen.sh script need to be EXTRA_DIST'ed?  How should this be
achieved?  Probably some end-user documentation snippet about this
should be available.

It would be nice if Debian would then use ./autogen.sh instead of
autoreconf -fvi to regenerate all generated files from the tarball.
But the first step is to ship it.

/Simon



