Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Paul Eggert

On 2025-04-18 10:32, Zack Weinberg wrote:

this hypothetical 'shawk' would be
a *harder* task than converting all the existing unconditional uses of
awk to use sed instead.


In 2007 we started using awk in config.status because sed didn't always 
handle backslash-newline correctly, and because our sed approach ran 
into POSIX length limits that our awk approach doesn't have. I also 
vaguely recall awk ran faster. So I'd be cautious about switching back 
to sed.


It's been mentioned a couple of times that 'configure' should diagnose 
and fail if awk is missing rather than bulldozing ahead with nonsense, 
so I installed the first attached patch to try to do that. The check is 
actually in config.status, which 'configure' calls but which can be 
called separately.
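
A minimal sketch of this kind of fail-fast check, assuming nothing about the attached patch: the helper name require_awk and the message text are hypothetical, not the code installed in lib/autoconf/status.m4. It runs the candidate awk on a trivial program, so that a missing or broken awk is reported before any substitution is attempted.

```shell
# Hypothetical helper, for illustration only (not the patch's code).
# Succeeds only if the given awk command starts and runs a trivial program.
require_awk() {
  ac_awk=${1:-awk}
  if "$ac_awk" 'BEGIN { exit 0 }' </dev/null >/dev/null 2>&1; then
    return 0
  fi
  echo "config.status: error: '$ac_awk' is missing or not working" >&2
  return 1
}
```

A caller would then write something like: require_awk "${AWK:-awk}" || exit 1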


Also, I saw that a minimal 'configure' uses grep only for "grep -c ^", 
so I installed the second attached patch to use "sed -n '$='" instead, 
since sed is used lots of other places in the script. I think most 
'configure' scripts use grep for other stuff anyway, so this second 
minor tweak is not that much of a change in practice.

From b73f28c5196de7f79d62f793cfc247d2d6910393 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 18 Apr 2025 13:32:03 -0700
Subject: [PATCH 1/2] config.status now checks for missing awk

* lib/autoconf/status.m4 (_AC_OUTPUT_CONFIG_STATUS):
Diagnose missing awk and fail, rather than blundering on.
---
 NEWS   | 3 +++
 lib/autoconf/status.m4 | 8 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 0b57386c..0899de5f 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,9 @@ GNU Autoconf NEWS - User visible changes.
 *** Autoconf no longer generates ${1+"$@"} in scripts, working around
   a bug in AIX 7.2 ksh93.
 
+*** config.status now fails with a diagnostic if awk is missing,
+  rather than misbehaving.
+
 * Noteworthy changes in release 2.72 (2023-12-22) [release]
 
 ** Backward incompatibilities
diff --git a/lib/autoconf/status.m4 b/lib/autoconf/status.m4
index 4fd8f39b..8bbf4b38 100644
--- a/lib/autoconf/status.m4
+++ b/lib/autoconf/status.m4
@@ -1470,7 +1470,13 @@ AC_PROVIDE_IFELSE([AC_PROG_MKDIR_P],
 AC_PROVIDE_IFELSE([AC_PROG_AWK],
 [AWK='$AWK'
 ])dnl
-test -n "\$AWK" || AWK=awk
+test -n "\$AWK" || {
+  awk '' >"$CONFIG_STATUS" <<\_ACEOF || ac_write_fail=1
-- 
2.48.1

From 3d79cd9ff455d85e788398bfe4fdc4544fabceb5 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 18 Apr 2025 14:01:38 -0700
Subject: [PATCH 2/2] Avoid grep in minimal configure
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/autoconf/status.m4 (_AC_OUTPUT_FILES_PREPARE):
Use ‘sed -n '$='’ instead of ‘grep -c’, since these
are the only uses of grep in a minimal ‘configure’
and we are already using sed elsewhere.
---
 doc/install.texi   | 12 ++--
 lib/autoconf/status.m4 |  5 ++---
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/doc/install.texi b/doc/install.texi
index f9e5e7fa..98eebe3d 100644
--- a/doc/install.texi
+++ b/doc/install.texi
@@ -138,15 +138,15 @@ you can type @samp{make uninstall} to remove the installed files.
 Installation requires a POSIX-like environment
 with a shell and at least the following standard utilities:
 
-@example
-awk cat cp diff echo expr false
-grep ls mkdir mv printf pwd
-rm rmdir sed sort test tr
-@end example
+@quotation
+@t{awk cat cp diff echo expr false
+ls mkdir mv printf pwd
+rm rmdir sed sort test tr}
+@end quotation
 
 @noindent
 This package's installation may need other standard utilities such as
-@command{cmp}, @command{make}, @command{sleep} and @command{touch},
+@command{grep}, @command{make}, @command{sleep} and @command{touch},
 along with compilers like @command{gcc}.
 
 @node Compilers and Options
diff --git a/lib/autoconf/status.m4 b/lib/autoconf/status.m4
index 8bbf4b38..0ac451dc 100644
--- a/lib/autoconf/status.m4
+++ b/lib/autoconf/status.m4
@@ -403,14 +403,13 @@ rm -f conf$$files.sh
   echo "_ACEOF"
 } >conf$$subs.sh ||
   AC_MSG_ERROR([could not make $CONFIG_STATUS])
-ac_delim_num=`echo "$ac_subst_vars" | grep -c '^'`
+ac_delim_num=`echo "$ac_subst_vars" | sed -n '$='`
 ac_delim='%!_!# '
 for ac_last_try in false false false false false :; do
   . ./conf$$subs.sh ||
 AC_MSG_ERROR([could not make $CONFIG_STATUS])
 
-dnl Do not use grep on conf$$subs.awk, since AIX grep has a line length limit.
-  ac_delim_n=`sed -n "s/.*$ac_delim\$/X/p" conf$$subs.awk | grep -c X`
+  ac_delim_n=`sed -n "s/.*$ac_delim\$/X/p" conf$$subs.awk | sed -n '$='`
   if test $ac_delim_n = $ac_delim_num; then
 break
   elif $ac_last_try; then
-- 
2.48.1
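
The replacement in the second patch relies on the sed '=' command, which prints the current line number; with the '$' address it prints only the number of the last line, that is, the line count. A small sketch follows; the function name count_lines and the variable list are illustrative assumptions.

```shell
# Count input lines with sed only: '$=' prints the line number of the last line.
count_lines() {
  sed -n '$='
}

# Hypothetical list of substitution variables, one per line.
ac_subst_vars='CC
CFLAGS
prefix'

echo "$ac_subst_vars" | grep -c '^'   # the old way
echo "$ac_subst_vars" | count_lines   # the sed-only way, same count
```

One difference to keep in mind: on completely empty input, grep -c prints 0 while sed -n '$=' prints nothing. Because echo always emits at least one line, that case does not arise in the pipelines shown in the patch.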



Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Bruno Haible via Gnulib discussion list
Simon Josefsson wrote:
> I'm concerned about rewriting efforts, they tend to never get finished.

Right. And the target of the rewrite should be something that is not
changing rapidly. Because users in the year 2040 should be able to take
a tarball packaged in 2030 and configure and build it. This means, it
should be something that is standardized, not something new that is
rapidly changing (like 'zig' for example).

With these requirements (standardized and easy to bootstrap, unlike Perl
or Python), there's not much left besides the Bourne shell.

> That's fine, and with Paul's patch it is now clear that Autoconf depends
> on sed during runtime of generated ./configure.

Regarding documentation, we may do more:
  - Either by adding a DEPENDENCIES file to each package, that - like the
    gnulib/DEPENDENCIES - lists
      - A shell
      - Core POSIX utilities
      - The comparison utilities 'cmp' and 'diff'
      - Grep
      - Awk
      - sed
  - Or by a per-platform documentation:
      - On RHEL, clones, and Fedora, do
          yum -y install make gcc diffutils
      - On Alpine Linux, do
          apk add coreutils make gcc
      - On Cygwin, install gcc-core make

> I think the main aspect here is to see if we can find unnecessary
> dependencies on some tools, and fix them.  Sometimes code that relies on
> 'cmp' or 'diff' can be rewritten in some other style.

That's a pointless effort. When I did that for the 'join' program
(missing on Alpine Linux), that was already a waste of time. It is a
better use of our time to just document "on Alpine Linux, install coreutils
first".

Bruno






Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Zack Weinberg
On Fri, Apr 18, 2025, at 12:48 PM, Simon Josefsson via Bug reports for autoconf 
wrote:
> "Zack Weinberg"  writes: I think the main aspect
> here is to see if we can find unnecessary dependencies on some tools,
> and fix them.  Sometimes code that relies on 'cmp' or 'diff' can be
> rewritten in some other style.

I want to reiterate here that if Fedora (or any other corporate entity)
sees value in this work getting done, then they need to either do it
themselves or pay someone to do it.  I am not going to work on this
(except *maybe* the grep part) unless I get paid to do it and I doubt
anyone else is either.

> I think we have come to the point where it is only reasonable to
> assume /bin/sh and that it supports some reasonable subset of POSIX
> shell semantics.  Anything beyond that needs to be documented as a
> buildtime or runtime dependency.  I have experimental Guix containers
> where /bin/sh is Gash and common Unix tools are from Gash-Utils.  It
> doesn't run ./configure well yet, but I suppose it is only a matter
> of time.

That's fine, but...

> One way around this is to implement a limited awk in POSIX shell.  For
> the features of awk that Autoconf needs.  Has anyone looked into that?
> 'shawk'?

I think you are underestimating how limited the string manipulation
capabilities are, of the POSIX shell subset that Autoconf is restricted
to, in the absence of awk.  I think this hypothetical 'shawk' would be
a *harder* task than converting all the existing unconditional uses of
awk to use sed instead.
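
To give a feel for the limitation, here is a sketch of replacing every occurrence of a fixed string using only the shell, with no awk or sed. The function name replace_all is hypothetical, and the sketch assumes POSIX parameter expansion (${var%%pattern} and ${var#pattern}) is available, which may itself be outside the shell subset Autoconf allows.

```shell
# replace_all STRING PATTERN REPLACEMENT
# Pure-shell literal substitution; illustrative only.
# Uses global variables because POSIX sh has no 'local'.
replace_all() {
  rest=$1 pat=$2 rep=$3 out=
  while :; do
    case $rest in
      *"$pat"*)
        # Keep the text before the first match, then append the replacement.
        out=$out${rest%%"$pat"*}$rep
        # Drop everything up to and including the first match.
        rest=${rest#*"$pat"}
        ;;
      *) break ;;
    esac
  done
  printf '%s\n' "$out$rest"
}
```

Even this single operation needs an explicit loop, and it handles one string at a time rather than a stream of lines with many variables, which is what config.status has to do.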


zw



Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Simon Josefsson via Gnulib discussion list
"Zack Weinberg"  writes:

> For additional clarity, the purpose of AC_PROG_AWK (and AC_PROG_SED and
> AC_PROG_*GREP) is to find the _best available implementation_ of these
> utilities, not to determine whether they exist at all.  Autoconf core
> code assumes that all three exist in some form.
>
> There are only a few uses of grep and its relatives in autoconf core
> code, they're all pretty straightforward, and it might genuinely be
> worth getting rid of them just because of the number of portability
> headaches associated with grep.

This is good insight!  Reducing the dependency on grep for autoconf may
be a worthy goal then.  I recall that when I tried to make use of 'grep'
in some gnulib script, Jim Meyering patched it away and for trivial uses
of grep I'm trying to get into a habit of avoiding it.

> Sed, however, is used ubiquitously, throughout both Autoconf proper and
> the M4sh support layer.  It's needed for super basic things like parsing
> command line options to ./configure.  I don't see any way to remove this
> dependency short of rewriting the entirety of Autoconf in a completely
> different programming language.

That's fine, and with Paul's patch it is now clear that Autoconf depends
on sed during runtime of generated ./configure.

I think the main aspect here is to see if we can find unnecessary
dependencies on some tools, and fix them.  Sometimes code that relies on
'cmp' or 'diff' can be rewritten in some other style.

> Awk is not used nearly as much, but it is required by the code that
> generates config.status, and then by the actual execution of
> config.status. It _might_ be possible to use sed for this instead, but
> the config.status generation and execution process is some of the
> hairiest code in all of Autoconf.  I'm gonna ballpark it at *at least*
> a full-time person-month of effort.  Bluntly: if Fedora wants this to
> happen, Fedora needs to pay someone to do it.
...
> I would like to see someone do some research on what an ergonomic,
> extensible shell-type scripting language, suitable for rewriting
> Autoconf in, with a core minimal enough  that it *would* be possible to
> make it available in the earliest stages of an architecture bootstrap,
> would be like.  *But it's a research project*.  None of the Bourne shell
> replacement languages I am aware of (e.g. zsh, fish, rc, oils) are even
> *trying* to fix the most serious problems with Bourne shell.  I'm not
> convinced anyone even really understands what those *are*.

I'm concerned about rewriting efforts, they tend to never get finished.

I think we have come to the point where it is only reasonable to assume
/bin/sh and that it supports some reasonable subset of POSIX shell
semantics.  Anything beyond that needs to be documented as a buildtime
or runtime dependency.  I have experimental Guix containers where
/bin/sh is Gash and common Unix tools are from Gash-Utils.  It doesn't
run ./configure well yet, but I suppose it is only a matter of time.

One way around this is to implement a limited awk in POSIX shell.  For
the features of awk that Autoconf needs.  Has anyone looked into that?
'shawk'?  That implementation could be embedded into the ./configure
script and used if the script cannot find any system awk.

I'm not convinced this is a good use of time, though.  A 'dnf install awk'
is a simple solution.  But that has bootstrapping issues.  And these
low-level assumptions are fun to discuss.

/Simon




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Zack Weinberg
On Thu, Apr 17, 2025, at 7:31 AM, Simon Josefsson via Bug reports for autoconf 
wrote:
> Paul Eggert  writes:
>> On 2025-04-16 23:31, Simon Josefsson via Bug reports for
>> autoconf wrote:
>>
>>> I tried reading through the autoconf manual to see if 'awk' is a run-
>>> time dependency for running generated ./configure scripts
>>
>> It is a dependency, and the documentation should mention this
>
> Thank you for patch and background context!

For additional clarity, the purpose of AC_PROG_AWK (and AC_PROG_SED and
AC_PROG_*GREP) is to find the _best available implementation_ of these
utilities, not to determine whether they exist at all.  Autoconf core
code assumes that all three exist in some form.

There are only a few uses of grep and its relatives in autoconf core
code, they're all pretty straightforward, and it might genuinely be
worth getting rid of them just because of the number of portability
headaches associated with grep.

Sed, however, is used ubiquitously, throughout both Autoconf proper and
the M4sh support layer.  It's needed for super basic things like parsing
command line options to ./configure.  I don't see any way to remove this
dependency short of rewriting the entirety of Autoconf in a completely
different programming language.

Awk is not used nearly as much, but it is required by the code that
generates config.status, and then by the actual execution of
config.status. It _might_ be possible to use sed for this instead, but
the config.status generation and execution process is some of the
hairiest code in all of Autoconf.  I'm gonna ballpark it at *at least*
a full-time person-month of effort.  Bluntly: if Fedora wants this to
happen, Fedora needs to pay someone to do it.

>>> 0) Modify autoconf to continue to work in this situation without
>>> awk, replacing it with more POSIX shell or something else?
>>
>> It might be possible to replace awk with Python (say), but it'd be
>> nontrivial. I doubt whether the shell itself would suffice. Probably
>> not worth the effort at this point.
>
> Python is not available in the default Fedora 42 image either.
...
> I would prefer if we can make things work with /bin/sh and as few low-
> complex utilities as possible, for bootstrapping reasons.  Making
> Python, Rust/Go, Guile etc work on new architectures is serious work,
> and sometimes people just give up (e.g., m68k, sh4, sparc64 are not in
> a good state here).  So I don't think that is always the right
> solution.

Backing this up, in roughly 2019 I was told (by Fedora maintainers) that
it is not feasible to make Python available early enough in architecture
bootstrap for it to be used in the build process of glibc or the C
toolchain.  More recently I was told the same thing about Perl.  I do not
know any specifics of why, but that pretty much rules out any use of
either in code unavoidably emitted into generated configure scripts,
such as the config.status logic.

(We get away with requiring Perl for the `autoconf` program _only_
because we ship the generated configure script in the tarball of
everything needed that early in architecture bootstrap.)

I would like to see someone do some research on what an ergonomic,
extensible shell-type scripting language, suitable for rewriting
Autoconf in, with a core minimal enough  that it *would* be possible to
make it available in the earliest stages of an architecture bootstrap,
would be like.  *But it's a research project*.  None of the Bourne shell
replacement languages I am aware of (e.g. zsh, fish, rc, oils) are even
*trying* to fix the most serious problems with Bourne shell.  I'm not
convinced anyone even really understands what those *are*.

zw



Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Eric Blake
On Thu, Apr 17, 2025 at 08:31:08AM +0200, Simon Josefsson via Gnulib discussion 
list wrote:
> Hi
> 
> I got a CI/CD build failure [1] for libidn on the new release fedora 42.

It's not just Fedora 42; OpenSUSE did the same back in last August:
https://gitlab.com/libvirt/libvirt-ci/-/merge_requests/425
https://bugzilla.suse.com/show_bug.cgi?id=1214365#c6

I wonder if distros that provide shortcuts for setting up a devel
environment (such as installing gcc, make, etc) should make sure awk
is part of that common devel environment.  But yeah, we are now at the
point where it is no longer reasonable to assume that awk is available
on every bare-bones distro, and so we either need to tweak autoconf to
avoid awk (hard) or to at least alert the user that their environment
is too bare-bones, and they need to adjust their CI setup to install
just a bit more.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-18 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible via Bug reports for autoconf  writes:

> Simon Josefsson wrote:
>> It seems awk is going away as a standard tool, and it is possible to
>> either fight that or just accept it.
>
> I don't think that the set of "standard tools" is shrinking. Rather,
> it's expanding. On OpenBSD for example, perl is installed by default.
> Similarly, Python is considered a standard tool nowadays.

I think it depends on what kind of environment we are talking about.
"Debian" installed via debian-installer does not result in the same
environment as running the docker container images (which are typically
used for CI) or the cloud images (which are typically used in Kubernetes
environments).  I'm not sure there is a trend that container images for
OS's are growing, I think most distributions are actively trying to make
them as small as possible and are dropping packages.  This also seems to
change over time, what used to be included in a (for example) "Debian
11" container image could at some point be removed.

/Simon




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Paul Eggert

On 2025-04-17 09:46, Simon Josefsson wrote:

I don't think it's worth making a distinction here between Autoconf and
"Autoconf + Automake", since most packages that use Autoconf also use
Automake.

I would appreciate distinguishing this, it helps to make it more clear
where dependencies are coming from.  OpenSSH is using autoconf but not
automake, IIRC, and I think there are more examples.


I looked into it a bit more from the Autoconf viewpoint, and installed 
the attached patch, which I think is closer to the correct list of 
minimal prerequisites for 'configure' scripts generated by Autoconf alone.


I haven't had time to look into exactly what Automake requires; any such 
list could be put into the Automake manual. However, the attached patch 
does mention 'sleep' in its list of example commands that installers may 
need, along with 'cmp', 'gcc', 'make', and 'touch'.

From 95b849dd837210dc3109adfbd52a0929fbec141e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 17 Apr 2025 12:14:41 -0700
Subject: [PATCH] Improve list of 'configure' prereqs

* doc/install.texi (Installation Prerequisites):
Use a more-accurate list.
---
 doc/install.texi | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/doc/install.texi b/doc/install.texi
index cf9a9c67..f9e5e7fa 100644
--- a/doc/install.texi
+++ b/doc/install.texi
@@ -136,17 +136,18 @@ you can type @samp{make uninstall} to remove the installed files.
 @section Installation Prerequisites
 
 Installation requires a POSIX-like environment
-with a shell and the following standard utilities:
+with a shell and at least the following standard utilities:
 
 @example
-[ awk cat cmp cp diff echo expr false
-grep ls mkdir mv printf pwd rm rmdir
-sed sort test touch tr true
+awk cat cp diff echo expr false
+grep ls mkdir mv printf pwd
+rm rmdir sed sort test tr
 @end example
 
 @noindent
-Depending on the package, other programs may be needed,
-such as a compiler for the language the package is written in.
+This package's installation may need other standard utilities such as
+@command{cmp}, @command{make}, @command{sleep} and @command{touch},
+along with compilers like @command{gcc}.
 
 @node Compilers and Options
 @section Compilers and Options
-- 
2.45.2



Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Paul Eggert

On 2025-04-17 04:31, Simon Josefsson wrote:

Paul Eggert  writes:

[ awk cat cmp cp diff echo expr false
grep ls mkdir mv printf pwd rm rmdir
sed sort test touch tr true


Of those, I'm guessing awk, grep, sed and ls are the most complex.  The
'diff' tool also stands out, and most of my packages build fine (except
for some spurious error messages during ./configure) without 'diff', and
it is often not available in containers.


In a minimal 'configure', if diff is absent then caches are ignored. 
This will slow things down but it won't break things.


However, many 'configure's will use diff in more-complicated ways. For 
example, when 'configure' checks whether 'grep' can handle long lines, 
if 'diff' is missing and grep lacks a --version flag, 'configure' will 
error out. This probably worked for you because you were using GNU grep 
which has --version, but it won't work in a minimal POSIX environment. 
And I imagine there are more-complicated dependencies on 'diff' in some 
'configure' files, where if 'diff' is missing 'configure' will configure 
your program incorrectly, perhaps merely hurting performance so 'make 
check' doesn't catch it.


In short, I wouldn't recommend running 'configure' without 'diff'.



Having a limited/partial/inefficient awk implementation inside Autoconf,
implemented in /bin/sh, for the functionality that Autoconf needs
itself, would be nice.  It could be used if the system lacks 'awk'.  Or
all code that relies on awk could be rewritten in shell syntax.


I don't offhand see how that would work, without significantly hurting 
performance in the usual case. But if someone could get it to work then 
yes it's better if 'configure' depends on fewer programs.


From a minimalist point of view, Fedora's dropping of 'awk' in minimal 
containers should not be that big of a deal. If you want to build 
programs, you need build tools like gcc that really ought to be missing 
in minimal environments. Just add awk to the build tool list.




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Eli Schwartz
On 4/17/25 10:14 AM, Eric Blake wrote:
> On Thu, Apr 17, 2025 at 02:28:36PM +0200, Bruno Haible via Bug reports for 
> autoconf wrote:
>> Jeffrey Walton wrote:
>>> Awk is a standard Posix utility:
>>> .
>>
>> The newest POSIX is POSIX:2024. Update your URLs:
>> .
> 
> Just because it is a POSIX utility doesn't matter - in the modern
> world of container images, the goal is to start as bare-bones as
> possible (intentionally NOT a full POSIX environment) and then
> explicitly document what additional tools are needed for whatever task
> your container will be doing.  If your container will be running a
> configure script, it appears that the distros now want it to be your
> responsibility, as a prerequisite of preparing that container, to pull
> in additional POSIX utilities that the configure script will rely on,
> rather than assuming that the distro has provided them out-of-the-box.


I very much understand and sympathize with the desire to have a
barebones environment.

But a real barebones environment is an empty root containing exactly two
packages:

- the one providing /bin/sh
- the one providing Alpine's `apk`, so that the person entering the
  container can explicitly apk add the things they need

...

In general, the autoconf design of depending on shell invites itself to
needing "POSIX utilities" to do that work, and it is very reasonable
that it do so using utilities that are:

- simple or already solved to bootstrap
- available everywhere on a desktop install
- available everywhere as an optional package

awk fills this need admirably. The next packages to be trimmed down as
"unneeded in containers" will surely be grep and sed.

Maybe distros could or should provide virtual / meta packages for
installing a POSIX environment, or at least a "gnuconfigure-runtime", so
that people can easily install what they need to build packages in CI/CD.

As above:


> Maybe Autoconf should be changed so that 'configure' fails quickly if 
> there's no awk. That shouldn't be much work.


Probably it is worth having 'configure' fail quickly if any of the basic
prerequisites Paul added to the manual are missing. awk isn't special here.
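
As a rough sketch of such a check, assuming 'command -v' is acceptable (it is POSIX, but may be outside the shell subset Autoconf restricts itself to): the function name missing_tools and the message format are hypothetical.

```shell
# Hypothetical sketch: report every listed utility that cannot be found.
# Prints the missing names on stderr and returns nonzero if any are missing.
missing_tools() {
  ac_missing=
  for ac_tool in "$@"; do
    command -v "$ac_tool" >/dev/null 2>&1 || ac_missing="$ac_missing $ac_tool"
  done
  if test -n "$ac_missing"; then
    echo "error: missing utilities:$ac_missing" >&2
    return 1
  fi
}
```

It could be called early with the list from the manual, for example: missing_tools awk cat cp diff expr ls mkdir mv rm rmdir sed sort tr || exit 1. Note that 'command -v' also accepts shell builtins, so names such as echo, test and printf pass even when no separate binary exists.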


-- 
Eli Schwartz




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Bruno Haible via Gnulib discussion list
Simon Josefsson wrote:
> the docker container images (which are typically
> used for CI) or the cloud images (which are typically used in Kubernetes
> environments).  I'm not sure there is a trend that container images for
> OS's are growing, I think most distributions are actively trying to make
> them as small as possible

Yes, definitely. When you look at cloud / docker images, there is no
concept of "standard tools" any more.

The first occurrence of this trend was the missing 'join' program on Alpine
Linux [1]. Now 'cmp' on RHEL 9 and 'awk' on Fedora 42... In hindsight, for
the 'join' problem, it would have been better if we had merely documented it
as a prerequisite; this would have saved us the time implementing
workarounds.

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2021-04/msg00041.html






Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Simon Josefsson via Gnulib discussion list
Bruno Haible via Bug reports for autoconf  writes:

> Simon Josefsson wrote:
>> 0) Modify autoconf to continue to work in this situation without awk,
>> replacing it with more POSIX shell or something else?
>
> Many years ago, Autoconf produced configure scripts that used 'sed'
> for the job of replacing the various @FOO@ occurrences in *.in files.
> When they switched to 'awk', it was for speed reasons, IIRC.

I understand the desire to drop all the old code, but it would be nice
to have something like it as a fall-back in case awk isn't available.

> I don't think it's worth making a distinction here between Autoconf and
> "Autoconf + Automake", since most packages that use Autoconf also use
> Automake.

I would appreciate distinguishing this, it helps to make it more clear
where dependencies are coming from.  OpenSSH is using autoconf but not
automake, IIRC, and I think there are more examples.

/Simon




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Eric Blake
On Thu, Apr 17, 2025 at 02:28:36PM +0200, Bruno Haible via Bug reports for 
autoconf wrote:
> Jeffrey Walton wrote:
> > Awk is a standard Posix utility:
> > .
> 
> The newest POSIX is POSIX:2024. Update your URLs:
> .

Just because it is a POSIX utility doesn't matter - in the modern
world of container images, the goal is to start as bare-bones as
possible (intentionally NOT a full POSIX environment) and then
explicitly document what additional tools are needed for whatever task
your container will be doing.  If your container will be running a
configure script, it appears that the distros now want it to be your
responsibility, as a prerequisite of preparing that container, to pull
in additional POSIX utilities that the configure script will rely on,
rather than assuming that the distro has provided them out-of-the-box.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Bruno Haible via Gnulib discussion list
Simon Josefsson wrote:
> It seems awk is going away as a standard tool, and it is possible to
> either fight that or just accept it.

I don't think that the set of "standard tools" is shrinking. Rather,
it's expanding. On OpenBSD for example, perl is installed by default.
Similarly, Python is considered a standard tool nowadays.

Probably some people at Red Hat just thought "how can we reduce the
weight of our default install" and opted to remove 'awk' here and
'diffutils' there, knowing that the user can easily install them from
the package repositories.

Bruno






Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Bruno Haible via Gnulib discussion list
Jeffrey Walton wrote:
> Awk is a standard Posix utility:
> .

The newest POSIX is POSIX:2024. Update your URLs:
.

Bruno






Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Bruno Haible via Gnulib discussion list
Simon Josefsson wrote:
> 0) Modify autoconf to continue to work in this situation without awk,
> replacing it with more POSIX shell or something else?

Many years ago, Autoconf produced configure scripts that used 'sed'
for the job of replacing the various @FOO@ occurrences in *.in files.
When they switched to 'awk', it was for speed reasons, IIRC.
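
For readers unfamiliar with the substitution step, here is a toy illustration of replacing an @FOO@ placeholder with either tool. It is not the logic config.status actually generates; the template text, the placeholder @prefix@ and the value /usr/local are made up for the example.

```shell
# Hypothetical two-line template, standing in for a *.in file.
template='prefix = @prefix@
bindir = @prefix@/bin'

# The same substitution written once with sed and once with awk.
subst_sed() { sed 's|@prefix@|/usr/local|g'; }
subst_awk() { awk '{ gsub(/@prefix@/, "/usr/local"); print }'; }

printf '%s\n' "$template" | subst_sed
printf '%s\n' "$template" | subst_awk
```

Both pipelines print the same two lines. The real code has to cope with many variables and with values containing backslashes, newlines and other special characters, which is where the two approaches diverge.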

Paul Eggert wrote:
> This repeats the 
> list of programs mentioned in the GNU Coding Standards 
> , 
> except it adds [ (which Autoconf-generated scripts use) and omits 
> install-info, ln, sleep, and tar (which I don't think they do).

'sleep' is used by the Automake macros: In any configure script generated
with Autoconf + Automake, you find:

  sleep $am_try_res
  sleep $am_try_res
sleep $am_try_res
  sleep "$am_cv_filesystem_timestamp_resolution"

I don't think it's worth making a distinction here between Autoconf and
"Autoconf + Automake", since most packages that use Autoconf also use
Automake.

Bruno






Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Jeffrey Walton
On Thu, Apr 17, 2025 at 2:33 AM Simon Josefsson via Gnulib discussion
list  wrote:
>
> I got a CI/CD build failure [1] for libidn on the new release fedora 42.
>
> checking that generated files are newer than configure... done
> configure: creating ./config.status
> config.status: creating csharpcomp.sh
> ./config.status: line 2711: awk: command not found
> config.status: error: could not create csharpcomp.sh
>
> This is building from a "make dist" tarball and not from git, so the
> installed dependencies are minimal:
>
> dnf -y install make gcc diffutils valgrind
>
> This used to work fine with Fedora 41.  But Fedora 42 doesn't have awk
> by default any more:
>
> jas@kaka:~$ podman run -it --rm fedora:41
> [root@0a7731017978 /]# awk --version|head -1
> GNU Awk 5.3.0, API 4.0, PMA Avon 8-g1, (GNU MPFR 4.2.1, GNU MP 6.3.0)
> [root@0a7731017978 /]#
> exit
> jas@kaka:~$ podman run -it --rm fedora:42
> [root@8e2093893358 /]# awk --version
> bash: awk: command not found
> [root@8e2093893358 /]#

Awk is a standard Posix utility:
.

I would call it a Fedora 42 bug.

Jeff



Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Bruno Haible via Gnulib discussion list
Simon Josefsson wrote:
> I got a CI/CD build failure [1] for libidn on the new release fedora 42.
> 
> checking that generated files are newer than configure... done
> configure: creating ./config.status
> config.status: creating csharpcomp.sh
> ./config.status: line 2711: awk: command not found

And similarly, AlmaLinux 9 does not have 'cmp' (from the diffutils) installed
by default, since 2 days ago. (Or, possibly, the dependencies of 'make' and
'gcc' have changed to not include 'diffutils' any more.)

Here are two CI runs of coreutils [1][2].
In [1], all tests passed. In [2] many tests fail with
  ../tests/init.sh: line 711: cmp: command not found

The fix is easy for the CI: Just do a "yum -y install diffutils".

Bruno

[1] https://github.com/coreutils/ci-check/actions/runs/14453120538
[2] https://github.com/coreutils/ci-check/actions/runs/14465809045






Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Simon Josefsson via Gnulib discussion list
Paul Eggert  writes:

> On 2025-04-16 23:31, Simon Josefsson via Bug reports for autoconf wrote:
>
>> I tried reading through the autoconf manual to see if 'awk' is a
>> run-time dependency for running generated ./configure scripts
>
> It is a dependency, and the documentation should mention this

Thank you for the patch and background context!

> [ awk cat cmp cp diff echo expr false
> grep ls mkdir mv printf pwd rm rmdir
> sed sort test touch tr true

Of those, I'm guessing awk, grep, sed and ls are the most complex.  The
'diff' tool also stands out, and most of my packages build fine (except
for some spurious error messages during ./configure) without 'diff', and
it is often not available in containers.

>> 0) Modify autoconf to continue to work in this situation without awk,
>> replacing it with more POSIX shell or something else?
>
> It might be possible to replace awk with Python (say), but it'd be
> nontrivial. I doubt whether the shell itself would suffice. Probably
> not worth the effort at this point.

Python is not available in the default Fedora 42 image either.

I think the trend is to either assume /bin/sh with the smallest set of
utilities possible, or to switch everything to some higher-level
language like Python, Rust/Go or Guile.

It seems awk is going away as a standard tool, and it is possible to
either fight that or just accept it.

I would prefer if we can make things work with /bin/sh and as few
low-complexity utilities as possible, for bootstrapping reasons.  Making
Python, Rust/Go, Guile etc work on new architectures is serious work,
and sometimes people just give up (e.g., m68k, sh4, sparc64 are not in a
good state here).  So I don't think that is always the right solution.

Having a limited/partial/inefficient awk implementation inside Autoconf,
implemented in /bin/sh, for the functionality that Autoconf needs
itself, would be nice.  It could be used if the system lacks 'awk'.  Or
all code that relies on awk could be rewritten in shell syntax.
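
To give a feel for what a shell-only rewrite might involve, here is a minimal sketch of replacing @NAME@ placeholders using only shell builtins. It is not how config.status works; the function name subst_var is hypothetical, and it handles one variable per pass, reads line by line, and would likely be much slower than awk on large inputs.

```shell
#!/bin/sh
# Hypothetical sketch: replace every @NAME@ on stdin with VALUE,
# using only POSIX shell builtins (no awk, no sed).
subst_var() {
  name=$1
  value=$2
  # Also process a final line that lacks a trailing newline.
  while IFS= read -r line || test -n "$line"; do
    out=
    while :; do
      case $line in
        *"@$name@"*)
          # Keep the text before the first placeholder, append the value,
          # then continue with the text after that placeholder.
          out=$out${line%%"@$name@"*}$value
          line=${line#*"@$name@"}
          ;;
        *) break ;;
      esac
    done
    printf '%s\n' "$out$line"
  done
}

printf '%s\n' 'prefix=@prefix@' 'bindir=@prefix@/bin' |
  subst_var prefix /usr/local
```

Because the value is spliced in by parameter expansion rather than passed through a sed replacement, characters such as backslash and '&' need no escaping. Multi-line values and many simultaneous substitutions are not covered by this sketch.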

/Simon




Re: fedora 42 doesn't have awk: how to deal with autoconf subst?

2025-04-17 Thread Paul Eggert

On 2025-04-16 23:31, Simon Josefsson via Bug reports for autoconf wrote:


> I tried reading through the autoconf manual to see if 'awk' is a
> run-time dependency for running generated ./configure scripts


It is a dependency, and the documentation should mention this. I
installed the attached patch into Autoconf on Savannah. This repeats the
list of programs mentioned in the GNU Coding Standards, except it adds
[ (which Autoconf-generated scripts use) and omits install-info, ln,
sleep, and tar (which I don't think they do).
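
For illustration only, a builder could check for that list up front with something like the sketch below. The function name check_prereqs is hypothetical and this is not code from Autoconf; the utility names are the ones listed in the attached patch.

```shell
#!/bin/sh
# Hypothetical pre-flight check, run before ./configure.
# Reports every missing utility at once instead of stopping at the first.
check_prereqs() {
  missing=
  for prog in "$@"; do
    command -v "$prog" >/dev/null 2>&1 || missing="$missing $prog"
  done
  if test -n "$missing"; then
    echo "error: missing required utilities:$missing" >&2
    return 1
  fi
}

# The list from the new "Installation Prerequisites" section.
if check_prereqs [ awk cat cmp cp diff echo expr false \
     grep ls mkdir mv printf pwd rm rmdir \
     sed sort test touch tr true
then
  echo "all prerequisites found"
else
  echo "install the tools listed above before running ./configure" >&2
fi
```

Some of these (such as [, echo, and test) are usually shell builtins, so 'command -v' finds them even when no separate executable is installed.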




> 0) Modify autoconf to continue to work in this situation without awk,
> replacing it with more POSIX shell or something else?


It might be possible to replace awk with Python (say), but it'd be
nontrivial. I doubt whether the shell itself would suffice. Probably not
worth the effort at this point.




> 2) Autoconf requires awk only for certain kinds of usage (substitutions?),
> and then this should be documented to clarify which autoconf
> functionality requires awk, and I need to document it for libidn.


I didn't take the time to do this. The main thing is substitutions, but 
there are others.




> 4) If 2) is true and gnulib uses the necessary features from autoconf
> that brings in awk, could gnulib be modified to not use those
> features so that building on systems without awk works?


This is mostly an Autoconf thing. Two Gnulib modules (getopt, 
libunistring-base) also use Awk.


Typically the Autoconf and Gnulib code that uses Awk can't easily be
rewritten to use other POSIX utilities, as Awk was generally the last
resort.


I expect that builders on Fedora will quickly learn to install an awk, 
preferably GNU awk.


Maybe Autoconf should be changed so that 'configure' fails quickly if 
there's no awk. That shouldn't be much work.

From 56860215d1af490eddb821d73154a190137aa8ec Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 17 Apr 2025 01:08:52 -0700
Subject: [PATCH] Document that 'configure' needs awk etc

Problem reported by Simon Josefsson in:
https://lists.gnu.org/r/bug-gnulib/2025-04/msg00127.html
* doc/install.texi (Installation Prerequisites): New section.
---
 doc/autoconf.texi |  2 ++
 doc/install.texi  | 16 
 2 files changed, 18 insertions(+)

diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 8dffd02c..fd16d5fb 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -598,6 +598,7 @@ Transforming Program Names When Installing
 Running @command{configure} Scripts
 
 * Basic Installation::  Instructions for typical cases
+* Installation Prerequisites::  What you need to install
 * Compilers and Options::   Selecting compilers and optimization
 * Multiple Architectures::  Compiling for multiple architectures at once
 * Installation Names::  Installing in different directories
@@ -22893,6 +22894,7 @@ may use comes with Autoconf.
 
 @menu
 * Basic Installation::  Instructions for typical cases
+* Installation Prerequisites::  What you need to install
 * Compilers and Options::   Selecting compilers and optimization
 * Multiple Architectures::  Compiling for multiple architectures at once
 * Installation Names::  Installing in different directories
diff --git a/doc/install.texi b/doc/install.texi
index 414c8939..cf9a9c67 100644
--- a/doc/install.texi
+++ b/doc/install.texi
@@ -132,6 +132,22 @@ If the package follows the GNU Coding Standards,
 you can type @samp{make uninstall} to remove the installed files.
 @end enumerate
 
+@node Installation Prerequisites
+@section Installation Prerequisites
+
+Installation requires a POSIX-like environment
+with a shell and the following standard utilities:
+
+@example
+[ awk cat cmp cp diff echo expr false
+grep ls mkdir mv printf pwd rm rmdir
+sed sort test touch tr true
+@end example
+
+@noindent
+Depending on the package, other programs may be needed,
+such as a compiler for the language the package is written in.
+
 @node Compilers and Options
 @section Compilers and Options
 
-- 
2.45.2