Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-02 Thread Steffen Nurpmeso
Stephane Chazelas wrote in
 <20230902084912.vdfedsgbnat2w...@chazelas.org>:
 |2023-09-01 23:28:50 +0200, Steffen Nurpmeso via austin-group-l at The \
 |Open Group:
 ...
 |>|FWIW, a "printf %b" github shell code search returns ~ 29k
 |>|entries
 |>|(https://github.com/search?q=printf+%25b+language%3AShell=code=Sh\
 |>|ell)
 ...
 |> Actually this returns a huge amount of false positives where
 |> printf(1) and %b are not on the same line, let alone the same
 ...
 |Apparently, we can also search with regexps and searching for
 |printf.*%b
 |(https://github.com/search?q=%2Fprintf.*%25b%2F+language%3AShell=code)
 |It's probably a lot more accurate. It returns ~ 19k.
 ...
 |> Furthermore it shows a huge amount of false use cases like
 ...
 |Yes, I also see a lot of echo -e stuff that should have been
 |echo -E stuff (or echo alone in those (many) implementations
 |that don't expand by default or use the more reliable printf
 |with %s (not %b)).
 |
 |> It seems people think you need this to get colours mostly, which
 ...
 |Incidentally, ANSI terminal colour escape sequences are somewhat
 |connecting those two %b's as they are RGB (well BGR) in binary
 |(white is 7 = 0b111, red 0b001, green 0b010, blue 0b100), with:
 |
 |R=0 G=1 B=1
 |printf '%bcyan%b\n' "\033[3$(( 2#$B$G$R ))m" '\033[m'
 |
 |(with Korn-like shells, also $(( 0b$B$G$R )) in zsh though zsh
 |has builtin colour output support including RGB-based).

..and, off-topic, but in my opinion that is also false usage, one
should use tput(1) instead, and then simply printf(1) (or echo(1)
(or cat(1))) the output, something like, fwiw :),

  color_init() {
    [ -n "${NO_COLOUR}" ] && return
    # We do not want color for "make test > .LOG"!
    if [ -t 1 ] && command -v tput >/dev/null 2>&1; then
      { sgr0=$(tput sgr0); } 2>/dev/null
      [ $? -eq 0 ] || return
      { saf1=$(tput setaf 1); } 2>/dev/null
      [ $? -eq 0 ] || return
      { saf2=$(tput setaf 2); } 2>/dev/null
      [ $? -eq 0 ] || return
      { saf3=$(tput setaf 3); } 2>/dev/null
      [ $? -eq 0 ] || return
      { saf5=$(tput setaf 5); } 2>/dev/null
      [ $? -eq 0 ] || return
      { b=$(tput bold); } 2>/dev/null
      [ $? -eq 0 ] || return

      COLOR_ERR_ON=${saf1}${b} COLOR_ERR_OFF=${sgr0}
      COLOR_DBGERR_ON=${saf5} COLOR_DBGERR_OFF=${sgr0}
      COLOR_WARN_ON=${saf3}${b} COLOR_WARN_OFF=${sgr0}
      COLOR_OK_ON=${saf2} COLOR_OK_OFF=${sgr0}
      # Drop all temporaries, including saf5 and sgr0.
      unset saf1 saf2 saf3 saf5 sgr0 b
    fi
  }

  ...

  printf '%s%s%s' "${COLOR_WARN_ON}" "$SOME_MSG" "${COLOR_WARN_OFF}"

Of course this is also only ANSI via sgr0 (:-|

 |Speaking of stackexchange, on the June data dump of
 |unix.stackexchange.com:
 |
 |stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf\
 |.*%b'
 |494
 |
 |(FWIW)
 |
 |Compared with %d (though that will have entries for printf(3) as well):
 |
 |stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf\
 |.*%d'
 |3444

I am totally stunned by the ratio.  I myself have never used %b
(like this, aka for printf).

 --End of <20230902084912.vdfedsgbnat2w...@chazelas.org>

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-02 Thread Léa Gris

On 02/09/2023 at 07:46, Phi Debian wrote:

> On Fri, Sep 1, 2023 at 8:10 PM Stephane Chazelas
> wrote:
>
>> 2023-09-01 07:54:02 -0500, Eric Blake via austin-group-l at The Open Group:
>>
>> FWIW, a "printf %b" github shell code search returns ~ 29k
>> entries
>> (
>> https://github.com/search?q=printf+%25b+language%3AShell=code=Shell
>> )
>
> Ha super, at least some numbers :-), I didn't know we could make this kind
> of request... thanks for that.


18k results on 



Those numbers vary a lot depending on how precise the query is. A
regex is no substitute for a shell language parser: it cannot match
all syntactically valid use cases, even when carefully crafted.
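
A minimal sketch of that limitation, using a made-up sample (the
function name sample and its contents are illustrative, not taken from
the search results). A line-based grep for printf.*%b counts one real
use plus one comment, and misses two of the three real uses:

```shell
# Hypothetical sample script text: three real uses of %b and one comment.
sample() {
cat <<'EOF'
printf '%b\n' "$msg"
fmt='%b\n'; printf "$fmt" "$msg"
printf \
  '%b\n' "$msg"
# printf with %s is safer than %b
EOF
}

# Count the lines that the line-based regex considers a hit.
hits=$(sample | grep -c 'printf.*%b')
printf 'regex hits: %s (real uses: 3)\n' "$hits"
```

The format held in a variable and the continued line are both missed,
while the comment is counted.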


--
Léa Gris



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-02 Thread Stephane Chazelas
2023-09-01 23:28:50 +0200, Steffen Nurpmeso via austin-group-l at The Open 
Group:
[...]
>  |FWIW, a "printf %b" github shell code search returns ~ 29k
>  |entries
>  |(https://github.com/search?q=printf+%25b+language%3AShell=code=Sh\
>  |ell)
>  |
>  |That likely returns only a small subset of the code that uses
>  |printf with %b inside the format and probably a few false
>  |positives, but that gives many examples of how printf %b is used
>  |in practice.
> 
> Actually this returns a huge amount of false positives where
> printf(1) and %b are not on the same line, let alone the same
> command, if you just scroll down a bit it starts like neovim match
[...]

You're right, I only looked at the first few results and saw
that already gave interesting ones.

Apparently, we can also search with regexps and searching for
printf.*%b
(https://github.com/search?q=%2Fprintf.*%25b%2F+language%3AShell=code)
It's probably a lot more accurate. It returns ~ 19k.

(still FWIW, that's still just a sample of random code on the
internet)

[...]
> Furthermore it shows a huge amount of false use cases like
> 
>  printf >&2 "%b\n" "The following warnings and non-fatal errors were 
> encountered during the installation process:"
[...]

Yes, I also see a lot of echo -e stuff that should have been
echo -E stuff (or echo alone in those (many) implementations
that don't expand by default or use the more reliable printf
with %s (not %b)).
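
A small sketch of why %s is the safer choice for text that is meant to
be printed literally (the message value is made up for illustration):

```shell
msg='C:\temp\new'
plain=$(printf '%s\n' "$msg")      # argument is printed as-is
expanded=$(printf '%b\n' "$msg")   # \t and \n inside the argument are expanded
printf '%s\n' "$plain"
printf '%s\n' "$expanded"
```

With %b the backslash sequences in the argument turn into a tab and a
newline, which is rarely what a fixed message wants.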

> It seems people think you need this to get colours mostly, which
> then, it has to be said, is also misguided in practice.  (To the
> best of *my* knowledge that is.)
[...]

Incidentally, ANSI terminal colour escape sequences are somewhat
connecting those two %b's as they are RGB (well BGR) in binary
(white is 7 = 0b111, red 0b001, green 0b010, blue 0b100), with:

R=0 G=1 B=1
printf '%bcyan%b\n' "\033[3$(( 2#$B$G$R ))m" '\033[m'

(with Korn-like shells, also $(( 0b$B$G$R )) in zsh though zsh
has builtin colour output support including RGB-based).
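
For shells without the 2# or 0b syntax, the same BGR bit arithmetic can
be sketched with plain POSIX arithmetic expansion (the variable name
code is only for illustration):

```shell
R=0 G=1 B=1
# Blue is bit 2, green bit 1, red bit 0, so this yields 6 (cyan).
code=$(( 4 * B + 2 * G + R ))
printf '\033[3%dm%s\033[m\n' "$code" cyan
```

Here the escape sequences sit in the format string, so no %b is needed.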

Speaking of stackexchange, on the June data dump of
unix.stackexchange.com:

stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf.*%b'
494

(FWIW)

Compared with %d (though that will have entries for printf(3) as well):

stackexchange/unix.stackexchange.com$ xml2 < Posts.xml | grep -c 'printf.*%d'
3444

-- 
Stephane



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Phi Debian
On Fri, Sep 1, 2023 at 8:10 PM Stephane Chazelas 
wrote:

> 2023-09-01 07:54:02 -0500, Eric Blake via austin-group-l at The Open Group:
>
>
> FWIW, a "printf %b" github shell code search returns ~ 29k
> entries
> (
> https://github.com/search?q=printf+%25b+language%3AShell=code=Shell
> )
>
>
Ha super, at least some numbers :-), I didn't know we could make this kind
of request... thanks for that.


Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Steffen Nurpmeso
Stephane Chazelas via austin-group-l at The Open Group wrote in
 <20230901181024.pwx4plwclz7ij...@chazelas.org>:
 |2023-09-01 07:54:02 -0500, Eric Blake via austin-group-l at The Open Group:
 ...
 |> How many scripts in the wild actually use %b, though?  And if there
 |> are such scripts, anything we can do to make it easy to do a drop-in
 |> replacement that still preserves the old behavior (such as changing %b
 |> to %#s) is going to be easier to audit than the only other
 |> currently-portable alternative of actually analyzing the string to see
 |> if it uses any octal or \c escapes that have to be re-written to
 |> portably function as a printf format argument.
 |[...]
 |
 |FWIW, a "printf %b" github shell code search returns ~ 29k
 |entries
 |(https://github.com/search?q=printf+%25b+language%3AShell=code=Sh\
 |ell)
 |
 |That likely returns only a small subset of the code that uses
 |printf with %b inside the format and probably a few false
 |positives, but that gives many examples of how printf %b is used
 |in practice.

Actually this returns a huge amount of false positives where
printf(1) and %b are not on the same line, let alone the same
command, if you just scroll down a bit it starts like neovim match

 pr_title="${pr_title// /,}" # Replace spaces with commas.
 pr_title="$(printf 'vim-patch:%s' "${pr_title#,}")"

(bash only btw).
Furthermore it shows a huge amount of false use cases like

 printf >&2 "%b\n" "The following warnings and non-fatal errors were 
encountered during the installation process:"

This is only the first result page.
It seems people think you need this to get colours mostly, which
then, it has to be said, is also misguided in practice.  (To the
best of *my* knowledge that is.)

Ah it is a copy world, and for one Stephane at stackoverflow
there are 99 that fool and mislead you, or do not know for sure
themselves, but also copy and paste!

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Stephane Chazelas
2023-09-01 07:54:02 -0500, Eric Blake via austin-group-l at The Open Group:
[...]
> > Well, in any case %b cannot change semantics in bash scripts, since it has
> > been there for so long; even if it departs from python, perl, libc, it is
> > unfortunate but that's the way it is. Nobody wants a semantic change, or to
> > see the whole internet falling apart on the next router update :-)
> 
> How many scripts in the wild actually use %b, though?  And if there
> are such scripts, anything we can do to make it easy to do a drop-in
> replacement that still preserves the old behavior (such as changing %b
> to %#s) is going to be easier to audit than the only other
> currently-portable alternative of actually analyzing the string to see
> if it uses any octal or \c escapes that have to be re-written to
> portably function as a printf format argument.
[...]

FWIW, a "printf %b" github shell code search returns ~ 29k
entries
(https://github.com/search?q=printf+%25b+language%3AShell=code=Shell)

That likely returns only a small subset of the code that uses
printf with %b inside the format and probably a few false
positives, but that gives many examples of how printf %b is used
in practice.

printf %b is also what all serious literature about shell
scripting has been recommending to use in place of the
unportable echo -e (or XSI echo, or print without -r). That
includes the POSIX standard which has been recommending using
printf instead of the non-portable echo for 30 years.

So that change will also invalidate all those. It will take a
while before %#s is supported widely enough that %b can be
safely replaced with %#s

-- 
Stephane



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Stephane Chazelas
2023-09-01 07:15:14 -0500, Eric Blake:
[...]
> > Note that in bash, you need both
> > 
> > shopt -s xpg_echo
> > set -o posix
> > 
> > To get a XSI echo. Without the latter, options are still
> > recognised. You can get a XSI echo without those options with:
> > 
> > xsi_echo() {
> >   local IFS=' ' -
> >   set +o posix
> >   echo -e "$*\n\c"
> > }
> > 
> > The addition of those \n\c (noop) avoids arguments being treated as
> > options if they start with -.
> 
> As an extension, Bash (and Coreutils) happen to honor \c always, and
> not just for %b.  But POSIX only requires \c handling for %b.
> 
> And while Issue 8 has taken steps to allow implementations to support
> 'echo -e', it is still not standardized behavior; so your xsi_echo()
> is bash-specific (which is not necessarily a problem, as long as you
> are aware it is not portable).
[...]

Yes, none of local (from ash I believe), the posix option
(several shells have an option called posix all used to improve
POSIX conformance, bash may have been the first) nor -e (from
Research Unix v8) are standard, that part was about bash
specifically (as the thread is also posted on gnu.bash.bug).

BTW, that xsi_echo is not strictly equivalent to a XSI echo in
the case where the last character of the last argument is an unescaped
backslash or a character whose encoding ends in the same byte as
the encoding of backslash.

-- 
Stephane



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Eric Blake
On Fri, Sep 01, 2023 at 07:19:13AM +0200, Phi Debian wrote:
> Well, after reading yet another thread regarding libc_printf() I have to
> admit that even %B is crossed out (yet already chosen by ksh93).
> 
> The other thread also speaks about libc_printf() documenting %# as
> undefined for things other than a, A, e, E, f, F, g, and G, yet the same
> thread also talks about a and A arriving late (citing C99) to the dance, meaning
> what is undefined today may become defined tomorrow, so %#b is no safer.
>

Caution: The proposal here is for %#s (an alternative string), not %#b
(which C2x wants to be similar to %#x, in that it outputs a '0b'
prefix for all values except bare '0').
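
To make the C2x side concrete, here is a minimal sketch of the output
described above for %#b, written in portable shell arithmetic. The
function name to_c2x_bin is hypothetical, and it only handles
non-negative integers:

```shell
# Print N in binary with a 0b prefix, except that 0 prints as a bare 0.
to_c2x_bin() {
  n=$1
  out=
  if [ "$n" -eq 0 ]; then
    printf '0\n'
    return
  fi
  while [ "$n" -gt 0 ]; do
    out=$(( n % 2 ))$out   # prepend the lowest bit
    n=$(( n / 2 ))
  done
  printf '0b%s\n' "$out"
}

to_c2x_bin 5
to_c2x_bin 0
```

This is unrelated to what printf(1) does with %b today, which is the
source of the conflict.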

Yes, there is a slight risk that C may decide to define %#s.  But as
the Austin Group includes a member of WG14, we are able to advise the
C committee that such an addition is not wise.

> My guess is that printf(1) is now doomed to follow its own route and keep its
> old format exception, and then maybe implement something like c_printf, like
> printf but with a format string that follows libc semantics, or maybe a -C
> option to printf(1)...

Adding an option to printf is also a possibility, if there is
wide-spread implementation practice to standardize.  If someone wants
to implement 'printf -C' right now, that could help feed such a future
standardization.  But it is somewhat orthogonal to the request in this
thread, which is how to allow users to still access the old %b
behavior even if %b gets repurposed in the future; if we can get
multiple implementations to add a %#s alias now, it makes the future
decisions easier (even if it is too late for Issue 8 to add any new
features, or for that matter, to make any normative changes other than
marking %b obsolescent as a way to be able to revisit it in the future
for Issue 9).


> 
> Well, in any case %b cannot change semantics in bash scripts, since it has
> been there for so long; even if it departs from python, perl, libc, it is
> unfortunate but that's the way it is. Nobody wants a semantic change, or to
> see the whole internet falling apart on the next router update :-)

How many scripts in the wild actually use %b, though?  And if there
are such scripts, anything we can do to make it easy to do a drop-in
replacement that still preserves the old behavior (such as changing %b
to %#s) is going to be easier to audit than the only other
currently-portable alternative of actually analyzing the string to see
if it uses any octal or \c escapes that have to be re-written to
portably function as a printf format argument.
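
As a sketch of the kind of rewrite that audit involves: an octal escape
is spelled differently in a %b argument than in the format string, so
the text cannot simply be moved from one place to the other (the
variable names are only for illustration):

```shell
via_b=$(printf '%b' '\0101')   # %b argument: \0 plus up to three octal digits
via_fmt=$(printf '\101')       # format string: \ plus up to three octal digits
printf '%s %s\n' "$via_b" "$via_fmt"
```

Both spellings produce the letter A here, but each is only portable in
its own position.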

POSIX is not mandating %#s at this time, so much as suggesting that if
implementations are willing to implement it now, it will make Issue 9
easier to reason about.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Eric Blake
On Fri, Sep 01, 2023 at 08:59:19AM +0100, Stephane Chazelas wrote:
> 2023-08-31 15:02:22 -0500, Eric Blake via austin-group-l at The Open Group:
> [...]
> > The current POSIX says that %b was added so that on a non-XSI
> > system, you could do:
> > 
> > my_echo() {
> >   printf %b\\n "$*"
> > }
> 
> That is dependent on the current value of $IFS. You'd need:
> 
> xsi_echo() (
>   IFS=' '
>   printf '%b\n' "$*"
> )

Let's read the standard in context (Issue 8 draft 3 page 2793 line 92595):

"
The printf utility can be used portably to emulate any of the traditional 
behaviors of the echo
utility as follows (assuming that IFS has its standard value or is unset):
• The historic System V echo and the requirements on XSI implementations in 
this volume of
  POSIX.1-202x are equivalent to:
printf "%b\n" "$*"
"

So yes, the standard does mention the requirement to have a sane IFS,
and I failed to include that in my one-off implementation of
my_echo().  Thank you for pointing out a more robust version.
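
A short sketch of the difference, running both versions under a
non-default IFS inside a subshell so the caller's IFS is left alone:

```shell
# One-off version: joins arguments with the first character of the current IFS.
my_echo() {
  printf '%b\n' "$*"
}

# More robust version: the subshell body pins IFS to a space first.
xsi_echo() (
  IFS=' '
  printf '%b\n' "$*"
)

( IFS=:; my_echo a b; xsi_echo a b )
```

With IFS set to a colon, my_echo joins its arguments with a colon,
while xsi_echo still joins them with a space.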

> 
> Or the other alternatives listed at
> https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo/65819#65819
> 
> [...]
> > Bash already has shopt -s xpg_echo
> 
> Note that in bash, you need both
> 
> shopt -s xpg_echo
> set -o posix
> 
> To get a XSI echo. Without the latter, options are still
> recognised. You can get a XSI echo without those options with:
> 
> xsi_echo() {
>   local IFS=' ' -
>   set +o posix
>   echo -e "$*\n\c"
> }
> 
> The addition of those \n\c (noop) avoids arguments being treated as
> options if they start with -.

As an extension, Bash (and Coreutils) happen to honor \c always, and
not just for %b.  But POSIX only requires \c handling for %b.

And while Issue 8 has taken steps to allow implementations to support
'echo -e', it is still not standardized behavior; so your xsi_echo()
is bash-specific (which is not necessarily a problem, as long as you
are aware it is not portable).

> [...]
> > The Austin Group also felt that standardizing bash's behavior of %q/%Q
> > for outputting quoted text, while too late for Issue 8, has a good
> > chance of success, even though C says %q is reserved for
> > standardization by C. Our reasoning there is that lots of libc over
> > the years have used %qi as a synonym for %lli, and C would be foolish
> > to burn %q for anything that does not match those semantics at the C
> > language level; which means it will likely never be claimed by C and
> > thus free for use by shell in the way that bash has already done.
> [...]
> 
> Note that %q is from ksh93, not bash, and is not portable across
> implementations. With most of them, including bash's, it gives output
> that is not safe for reinput in arbitrary locales (as it uses
> $'...' in some cases). I'm not sure it's a good idea to add it to
> the standard, or at least it should come with fat warnings about
> the risk in using it.

%q is NOT being added to Issue 8, but $'...' is.  Bug 1771 asked if %q
could be added to Issue 8, but it came in past the deadline for
feature requests, so the best we could do is add a FUTURE DIRECTIONS
blurb that mentions the idea.  But since FUTURE DIRECTIONS is
non-normative, we can always change our mind in Issue 9 and delete
that text if it turns out we can't get consensus to standardize some
form of %q/%Q after all.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-09-01 Thread Stephane Chazelas
2023-08-31 15:02:22 -0500, Eric Blake via austin-group-l at The Open Group:
[...]
> The current POSIX says that %b was added so that on a non-XSI
> system, you could do:
> 
> my_echo() {
>   printf %b\\n "$*"
> }

That is dependent on the current value of $IFS. You'd need:

xsi_echo() (
  IFS=' '
  printf '%b\n' "$*"
)

Or the other alternatives listed at
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo/65819#65819

[...]
> Bash already has shopt -s xpg_echo

Note that in bash, you need both

shopt -s xpg_echo
set -o posix

To get a XSI echo. Without the latter, options are still
recognised. You can get a XSI echo without those options with:

xsi_echo() {
  local IFS=' ' -
  set +o posix
  echo -e "$*\n\c"
}

The addition of those \n\c (noop) avoids arguments being treated as
options if they start with -.


[...]
> The Austin Group also felt that standardizing bash's behavior of %q/%Q
> for outputting quoted text, while too late for Issue 8, has a good
> chance of success, even though C says %q is reserved for
> standardization by C. Our reasoning there is that lots of libc over
> the years have used %qi as a synonym for %lli, and C would be foolish
> to burn %q for anything that does not match those semantics at the C
> language level; which means it will likely never be claimed by C and
> thus free for use by shell in the way that bash has already done.
[...]

Note that %q is from ksh93, not bash, and is not portable across
implementations. With most of them, including bash's, it gives output
that is not safe for reinput in arbitrary locales (as it uses
$'...' in some cases). I'm not sure it's a good idea to add it to
the standard, or at least it should come with fat warnings about
the risk in using it.

See also:

https://unix.stackexchange.com/questions/379181/escape-a-variable-for-use-as-content-of-another-script/600214#600214

-- 
Stephane



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-08-31 Thread Emanuele Torre
On Thu, Aug 31, 2023 at 03:02:22PM -0500, Eric Blake wrote:
> On Thu, Aug 31, 2023 at 03:10:58PM -0400, Chet Ramey wrote:
> > Why not standardize another character, like %B? I suppose I'll have to look
> > at the etherpad for the discussion. I think that came up on the mailing
> > list, but I can't remember the details.
> 
> Yes, https://austingroupbugs.net/view.php?id=1771 has a good
> discussion of the various ideas.
> 
> %B is out for the same reason as %b: although the current C2x draft
> wording says that % followed by an uppercase letter is reserved for
> implementation use, other
> than [AEFGX] which already have a history of use by C (as it was, when
> C99 added %A, that caused problems for some folks), it goes on to
> _highly_ encourage any implementation that adds %b for "0b0" binary
> output also add %B for "0B0" binary output (to match the x/X
> dichotomy).  Burning %B to retain the old behavior while repurposing
> %b to output lower-case binary values is thus a non-starter, while
> burning %#s (which C says is undefined) felt nicer.

Also note that, in ksh93, %B is already used for something else.
It interprets its argument as a variable name, and dereferences it:
`printf %B PWD' is similar to `printf %s "$PWD"' (assuming PWD is a
string variable).

o/
 emanuele6



Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-08-31 Thread Eric Blake
On Thu, Aug 31, 2023 at 03:10:58PM -0400, Chet Ramey wrote:
> On 8/31/23 11:35 AM, Eric Blake wrote:
> > In today's Austin Group call, we discussed the fact that printf(1) has
> > mandated behavior for %b (escape sequence processing similar to XSI
> > echo) that will eventually conflict with C2x's desire to introduce %b
> > to printf(3) (to produce 0b000... binary literals).
> > 
> > For POSIX Issue 8, we plan to mark the current semantics of %b in
> > printf(1) as obsolescent (it would continue to work, because Issue 8
> > targets C17 where there is no conflict with C2x), but with a Future
> > Directions note that for Issue 9, we could remove %b entirely, or
> > (more likely) make %b output binary literals just like C.
> 
> I doubt I'd ever remove %b, even in posix mode -- it's already been there
> for 25 years.

But the longer that printf(3) supports "%b" to output binary values,
the more surprised new shell coders will be that printf(1) %b does not
behave the same.  What's more, other languages have already started
using %b for binary output (python, for example), so it is definitely
gaining in mindshare.

That said, I also agree with your desire to keep the functionality in
place.  The current POSIX says that %b was added so that on a non-XSI
system, you could do:

my_echo() {
  printf %b\\n "$*"
}

and then call my_echo everywhere that a script used to depend on XSI
echo (perhaps by 'alias echo=my_echo' with aliases enabled), for a
much quicker portability hack than a tedious search-and-replace of
every echo call that requires manual inspection of its arguments for
translation of any XSI escape sequences into printf format
specifications.  In particular, code like [var='...\c'; echo "$var"]
cannot be changed to use printf by a mere s/echo/printf %s\\n/.  Thus,
when printf was invented and standardized for the shell, the solution
at the time was to create [printf %b\\n "$var"] as a drop-in
replacement for XSI [echo "$var"], even for platforms without XSI
echo.
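
A minimal sketch of the [var='...\c'] case (the string and the trailing
'|' marker are made up to make the missing newline visible):

```shell
var='no newline here\c'
as_s=$(printf '%s\n' "$var"; printf '|')   # \c stays literal, newline is printed
as_b=$(printf '%b\n' "$var"; printf '|')   # \c stops the output before the newline
printf '%s\n' "$as_s" "$as_b"
```

With %s the marker lands on its own line after a literal \c, while with
%b it follows the text directly, as it would after XSI echo.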

Nowadays, I personally have not seen very many scripts like this in
the wild (for example, autoconf scripts prefer to directly use printf,
rather than trying to shoe-horn behavior into echo).  But assuming
such legacy scripts still exist, it is still much easier to rewrite
just the my_echo wrapper to now use %#s\\n instead of %b\\n, than it
would be to find every callsite of my_echo.
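
One way such a wrapper could look, as a sketch only: the feature test
and the my_echo_fmt variable are my assumptions, and %#s is a proposal
rather than something a given printf is known to support. The wrapper
probes for the proposed behavior and otherwise keeps using %b:

```shell
# Hypothetical probe: does this printf expand escapes for %#s as proposed?
tab=$(printf '\t')
if [ "$(printf '%#s' 'a\tb' 2>/dev/null)" = "a${tab}b" ]; then
  my_echo_fmt='%#s\n'
else
  my_echo_fmt='%b\n'   # fall back to the spelling that works today
fi

my_echo() (
  IFS=' '
  printf "$my_echo_fmt" "$*"
)

my_echo 'a\tb' c
```

Only the wrapper changes; every call site of my_echo stays as it is.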

Bash already has shopt -s xpg_echo; I could easily see this being a
case where you toggle between the old or new behavior of %b (while
keeping %#s always at the old behavior) by either this or some other
shopt in bash, so that newer script writers that want binary output
for %b can do so with one setting, while scripts that must continue to
run under old semantics can likewise do so.

> 
> > But that
> > raises the question of whether the escape-sequence processing
> > semantics of %b should still remain available under the standard,
> > under some other spelling, since relying on XSI echo is still not
> > portable.
> > 
> > One of the observations made in the meeting was that currently, both
> > the POSIX spec for printf(1) as seen at [1], and the POSIX and C
> > standard (including the upcoming C2x standard) for printf(3) as seen
> > at [3] state that both the ' and # flag modifiers are currently
> > undefined when applied to %s.
> 
> Neither one is a very good choice, but `#' is the better one. It at least
> has a passing resemblance to the desired functionality.

Indeed, that's what the Austin Group settled on today after I first
wrote my initial email, and what I wrote up in a patch to GNU
Coreutils (https://debbugs.gnu.org/65659)

> 
> Why not standardize another character, like %B? I suppose I'll have to look
> at the etherpad for the discussion. I think that came up on the mailing
> list, but I can't remember the details.

Yes, https://austingroupbugs.net/view.php?id=1771 has a good
discussion of the various ideas.

%B is out for the same reason as %b: although the current C2x draft
wording says that % followed by an uppercase letter is reserved for
implementation use, other
than [AEFGX] which already have a history of use by C (as it was, when
C99 added %A, that caused problems for some folks), it goes on to
_highly_ encourage any implementation that adds %b for "0b0" binary
output also add %B for "0B0" binary output (to match the x/X
dichotomy).  Burning %B to retain the old behavior while repurposing
%b to output lower-case binary values is thus a non-starter, while
burning %#s (which C says is undefined) felt nicer.

The Austin Group also felt that standardizing bash's behavior of %q/%Q
for outputting quoted text, while too late for Issue 8, has a good
chance of success, even though C says %q is reserved for
standardization by C. Our reasoning there is that lots of libc over
the years have used %qi as a synonym for %lli, and C would be foolish
to burn %q for anything that does not match those semantics at the C
language level; which means it will likely never be claimed by C and
thus 

Re: bug#65659: RFC: changing printf(1) behavior on %b

2023-08-31 Thread Paul Eggert

On 2023-08-31 08:35, Eric Blake wrote:

Typing-wise, %#s as a synonym for %b is
probably going to be easier (less shell escaping needed).  Is there
any interest in a patch to coreutils or bash that would add such a
synonym, to make it easier to leave that functionality in place for
POSIX Issue 9 even when %b is repurposed to align with C2x?


Sounds good to me for coreutils.