2018-09-30 05:18:03 +0700, Robert Elz:
>     Date:        Fri, 28 Sep 2018 14:44:55 +0200
>     From:        Harald van Dijk <a...@gigawatt.nl>
>     Message-ID:  <d1c042d8-4431-7232-aaf2-9802de0af...@gigawatt.nl>
> 
>   | 1. Are the rules for determining the end of a dollar-quoted string 
>   | intended to be fully specified, especially when taking into account 
>   | strings that are never expanded?
> 
> $'' gives a single quoted string, it is never "expanded" in the sense usually
> used in the shell.   That is unlike $ expressions in "" strings, which are.
> 
> The \ sequences in $'' are more like the \ sequences in "" - they are
> processed as the string is read (while tokenisation is being performed).

As already discussed, in zsh (where $'\uxxxx' comes from in the
first place), while the $'...' strings are of course tokenised
correctly, the \x and \uxxxx sequences are expanded at *runtime*.
It *is* an expansion operator, as you'd expect.

Just like in $(...), of course the content has to be parsed, but
the command inside is only run at runtime.
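
To illustrate the parse-time versus run-time split for $(...), here is
a minimal sketch using only POSIX sh features. The names x and show are
made up for the example; it says nothing about how any given shell
treats $'...'.

```shell
# x and show are hypothetical names used only for this illustration.
x=before

show() {
  # The $(...) below is parsed when the function is defined,
  # but the command inside it runs only when show is called.
  printf '%s\n' "$(printf '%s' "$x")"
}

x=after
show    # prints "after": the substitution ran at call time
```

If the command substitution had been evaluated when the function body
was read, the output would have been "before".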

Same applies for $"..." and ${parameter:=cmd}. The fact that it
starts with $ suggests it *should* be an expansion.

That

printf %s $'\u00E9'

doesn't behave like

printf '\u00E9'

in some shells is surprising. (Yes, "printf '\uxxxx'" is not yet
specified by POSIX, but note that it predates $'\uxxxx'; zsh added
both $'\uxxxx' and printf '\uxxxx' at the same time, inspired by
the GNU printf utility (not the printf builtin of the GNU shell,
which added support for \uxxxx much later).)

Test case:


eacute() {
  printf %s $'\ue9'
}

LC_ALL=en_GB.UTF-8 eacute | od -An -vtx1
LC_ALL=en_GB.iso88591 eacute | od -An -vtx1
LC_ALL=C eacute | od -An -vtx1
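
As a reference point when reading the od output above, the following
sketch prints the byte sequences for U+00E9 in the two named charsets,
using only POSIX octal escapes so that it does not depend on \u support
or on the locales being installed. hexbytes is a hypothetical helper
introduced just for this example.

```shell
# hexbytes is a hypothetical helper: dump stdin as hex, whitespace removed.
hexbytes() {
  od -An -vtx1 | tr -d '[:space:]'
}

# U+00E9 encoded in UTF-8 is the two bytes 0xC3 0xA9.
printf '\303\251' | hexbytes; echo    # c3a9

# U+00E9 encoded in ISO-8859-1 is the single byte 0xE9.
printf '\351' | hexbytes; echo        # e9
```

A shell that expands $'\ue9' at run time, in the locale in effect when
eacute is called, would be expected to give c3 a9 for the UTF-8 line and
e9 for the iso88591 line.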


Compare with:

eacute() {
  printf '\u00e9'
}

-- 
Stephane
