The following issue has been SUBMITTED. 
====================================================================== 
http://austingroupbugs.net/view.php?id=1222 
====================================================================== 
Reported By:                stephane
Assigned To:                
====================================================================== 
Project:                    1003.1(2016)/Issue7+TC2
Issue ID:                   1222
Category:                   Shell and Utilities
Type:                       Enhancement Request
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Stephane Chazelas 
Organization:                
User Reference:              
Section:                    echo utility 
Page Number:                echo 
Line Number:                echo 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2018-12-27 23:49 UTC
Last Modified:              2018-12-27 23:49 UTC
====================================================================== 
Summary:                    "echo" specification doesn't reflect current
implementations (missing -e, -E and - handling)
Description: 
(sorry for not adding the page/line numbers, it seems I'm unable to
download C181.pdf).

This is a follow-up on the discussion at
http://article.gmane.org/gmane.comp.standards.posix.austin.general/12097
started by Robert Elz where you'll find more information as well
as at
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo
and https://www.in-ulm.de/~mascheck/various/echo+printf/

I know many will say "echo" is a lost cause, POSIX already
recommends to use "printf" instead.

But in practice, "echo" is still widely used, and POSIX "echo"
specification fails to describe the behaviour of current "echo"
implementations, in particular of Free, Libre and Open Source
ones which are now dominant.
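
For reference, the printf-based replacement that POSIX recommends
can be sketched as below. The function names put and put_n are made
up for this illustration (they are not part of any standard), and the
sketch assumes a POSIX sh with a conforming printf.

```shell
# Hypothetical helpers (names are illustrative only).
# Print all arguments joined by single spaces, followed by a newline,
# without interpreting any option or backslash escape.
put() (
  IFS=' '
  printf '%s\n' "$*"
)

# Same, without the trailing newline (what "echo -n" is used for).
put_n() (
  IFS=' '
  printf '%s' "$*"
)

put -n 'a\tb' -e    # prints the literal text: -n a\tb -e
```

Because the format string is fixed and the data is only ever passed
as an argument to %s, nothing in the text can be taken as an option
or an escape sequence.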

For a history recap:

- up until Research Unix V6, echo didn't accept any option and
  didn't expand any escape sequence so it could be used to
  output arbitrary text followed by a newline character with
  echo "$text"
- in the late 70s, PWB Unix added some \c (causing echo to
  exit which was a way to avoid the trailing newline), \0ooo,
  and \n escape sequences expanded by default.
- that was a very poor design fixed by Dennis Ritchie with the
  addition of -n (in Research Unix V7, 1979) to skip the newline
  and -e (Research Unix V8, 1981) for expanding escape sequences
  (also fixing the non-standard \0ooo to \ooo like in C)
- USG Unices kept on adding more escape sequences to their
  (broken) implementation in the 80s.
- portable echo implementations (starting with GNU echo and
  bash's echo builtin in 1992, soon followed by pdksh, zsh),
  added a -E option to disable escape sequence expansion, and
  let the user choose between Research Unix and USG behaviour as
  default at build time or run time.
- neither of those could output arbitrary data as there was no
  end-of-option marker. That was fixed by the zsh echo
  implementation in 1990 (initially as -- then as -). echo -E -
  "$text" in zsh now works like in our V6 echo "$text" above.
- POSIX.2 in 1992 left the behaviour unspecified if the first
  argument was "-n" or any argument contained backslashes, but
  failed to account for -e, -E and -.
- XPG mandated the behaviour of USG/SysV (the least useful one IMO)
- SUS merged POSIX and XPG as the XSI option, but didn't fix
  those missing parts about -e/-E/-.
- at some point GNU echo added support for --version and --help
  (not when POSIXLY_CORRECT is in the environment, not their
  abbreviations)

Additionally, implementations that expand escape sequences, with
the exception of the "echo" builtin of yash, don't treat their
arguments as text: they process them as bytes rather than
characters.

In those locales where the charset contains characters whose
encoding contains the encoding of backslash (like in BIG5 where
α is 0xA3 0x5C for instance and \ is 0x5C), those characters
will be mangled by echo. For instance, in a locale where BIG5 is
the charset, echo αc outputs a 0xA3 byte instead of αc.
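
The byte-level effect can be reproduced without a BIG5 locale. The
sketch below uses printf's %b conversion as a stand-in for an
escape-expanding echo. That substitution is an assumption made for
the illustration, as is the function name; it relies on %b handling
its argument byte by byte in the C locale.

```shell
# Sketch only: printf %b stands in for an escape-expanding echo.
# The function name is hypothetical.
big5_alpha_c_demo() {
  # 0xA3 0x5C is the BIG5 encoding of α (as stated above); 0x63 is "c".
  text=$(printf '\243\134c')
  # In the C locale the 0x5C byte is seen as a backslash, so the bytes
  # 0x5C 0x63 are read as the \c escape and output stops there.
  LC_ALL=C printf '%b' "$text" | od -An -tx1 | tr -d ' \n'
}

big5_alpha_c_demo   # hex dump of the bytes that were actually written
```

With a byte-oriented %b (as in bash or dash) this should print a3:
only the first byte of α survives, and the c is lost.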

The yash "echo" is the only implementation that I know of that is
fully compliant with POSIX+XSI (when ECHO_STYLE=XSI).

All certified Unices are non-compliant, if only because of the
encoding issue above. While the "echo" builtin of macOS's sh is
mostly compliant (except for that encoding issue), its
standalone utility understands -n. It understands \c only if \c
is last, but no other escape sequence. I suppose the conformance
tests test "echo" in shell scripts, but not the "echo"
standalone utility like in "env echo".

Today, most people (from trends on
stackoverflow.com/usenet/unix.stackexchange.com) expect "-e" to
be needed to expand escape sequences. So much so that -E is
quite rarely used. Nowadays, except in zsh (where the bsdecho
option is not enabled by default), wherever -E is supported, it
is also the default. The only system that I know of that compiles
bash with xpg_echo enabled by default (for -e to be the default)
is macOS for its sh (I suppose it's the case of K-UX as well,
but I've never come across those), but then as that means the
posix option is also enabled, -E is no longer recognised.

echo -e is supported by at least GNU echo, bash, pdksh and
derivatives, zsh, busybox, ksh93, most ash derivatives (dash
being a notable exception) and yash (with some values of
$ECHO_STYLE).

echo -E is supported by at least GNU, bash, pdksh, zsh, busybox
and yash (with some values of $ECHO_STYLE).

On BSDs, -e is generally supported by the "echo" builtin of "sh"
(also -E in OpenBSD/MirOS where sh is based on pdksh), but not
by the standalone "echo" utility.

Desired Action: 
Change the text to:

If the first argument is "-" followed by zero or more e, E or n
characters (or is --version or --help), or if any argument
contains backslash characters (or the byte encoding of the
backslash character), the behaviour is unspecified.
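
To make the proposed condition concrete, here is a minimal sketch of
a check a script could run before calling echo. The function name
echo_args_unspecified is hypothetical. The sketch only looks for
backslash characters as the shell's pattern matching sees them; it
does not try to cover the "byte encoding of the backslash character"
clause for multi-byte charsets.

```shell
# Hypothetical helper: returns 0 if calling echo with these arguments
# would fall under the proposed "unspecified" wording, 1 otherwise.
echo_args_unspecified() {
  case ${1-} in
    --version | --help)
      return 0 ;;
    -*)
      # "-" followed by zero or more e, E or n characters
      case ${1#-} in
        *[!eEn]*) ;;        # some other character: an ordinary operand
        *) return 0 ;;
      esac ;;
  esac
  for arg in "$@"; do
    case $arg in
      *\\*) return 0 ;;     # contains a backslash
    esac
  done
  return 1
}
```

For example, the arguments "-", "-n", "-neE", "--help" and 'a\b' are
all reported as unspecified, while "-x", "--" and "hello" are not.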

I find it a shame to force XSI systems to implement the broken
(expand escape sequences by default, no way to disable it)
behaviour. I would be of the opinion that since it now deviates
from the de-facto standard, it should be removed to allow
implementations to do the right thing and align with the more
common ones (see how silly it is that macOS decided to use bash
for their sh, but made it incompatible with most other bash
deployments in that regard, just to be UNIX certified)

Maybe also add a future direction that would specify -n/-e/-E à
la bash. (I'd expect that adopting zsh's "echo -" as the
end-of-options marker would be contentious, so I suppose echo
cannot be fully "fixed").

I suppose the encoding issue can be ignored for now. Anyway, it
also applies to many other utilities including printf, awk,
sed... We can probably assume that those charsets that have
characters whose encoding contains the encoding of other
characters including some in the portable charset will die away
and they can't be used reliably anyway.

====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2018-12-27 23:49 stephane       New Issue                                    
2018-12-27 23:49 stephane       Name                      => Stephane Chazelas
2018-12-27 23:49 stephane       Section                   => echo utility    
2018-12-27 23:49 stephane       Page Number               => echo            
2018-12-27 23:49 stephane       Line Number               => echo            
======================================================================

