2018-09-30 05:18:03 +0700, Robert Elz:
>     Date:        Fri, 28 Sep 2018 14:44:55 +0200
>     From:        Harald van Dijk <a...@gigawatt.nl>
>     Message-ID:  <d1c042d8-4431-7232-aaf2-9802de0af...@gigawatt.nl>
>
>   | 1. Are the rules for determining the end of a dollar-quoted string
>   | intended to be fully specified, especially when taking into account
>   | strings that are never expanded?
>
> $'' gives a single quoted string, it is never "expanded" in the sense usually
> used in the shell. That is unlike $ expressions in "" strings, which are.
>
> The \ sequences in $'' are more like the \ sequences in "" - they are
> processed as the string is read (while tokenisation is being performed).

As already discussed, in zsh (where $'\uxxxx' comes from in the
first place), while the $'...' strings are of course tokenised
correctly, the \x and \uxxxx sequences are expanded at *runtime*.

It *is* an expansion operator, as you'd expect. Just like in
$(...), of course the content has to be parsed, but the command
inside is only run at runtime. The same applies to $"..." and
${parameter:=cmd}.

The fact that it starts with $ suggests it *should* be an
expansion.

That printf %s $'\u00E9' doesn't behave like printf '\u00E9' in
some shells is surprising.

(Yes, "printf '\uxxxx'" is not yet specified by POSIX, but note
that it predates $'\uxxxx': zsh added both $'\uxxxx' and printf
'\uxxxx' at the same time, inspired by the GNU printf utility
(not the printf builtin of the GNU shell, which added support for
\uxxxx much later).)

Test case:

eacute() {
  printf %s $'\ue9'
}
LC_ALL=en_GB.UTF-8 eacute | od -An -vtx1
LC_ALL=en_GB.iso88591 eacute | od -An -vtx1
LC_ALL=C eacute | od -An -vtx1

Compare with:

eacute() {
  printf '\u00e9'
}

-- 
Stephane