Hey. I'm afraid some more questions came up on my side:

1) POSIX says:
   "The encoded values associated with <period>, <slash>, <newline>, and <carriage-return> shall be invariant across all locales supported by the implementation."

   If, for example, <period> is encoded as the byte 0x2E, the consequence would be that it has to be 0x2E in all locales and their encodings, right?
   Doesn't that also mean that POSIX effectively forbids UTF-16 or UTF-32, and actually any fixed-width encoding with more than one byte per character? Because there it would have to be "padded" with 0x00.

2) When I have a shell script in some encoding, and it contains e.g.:
     printf '.'
   would POSIX demand that this:
   a) always causes the byte 0x2E to be printed,
   b) prints the character '.' according to the currently set locale, e.g. if that was using UTF-16, it would print the bytes 0x2E 0x00,
   c) prints the character '.' according to the locale in which the shell parses the script (but there again, if it was UTF-16, the bytes 0x2E 0x00), or
   d) in some weird encodings like IBM905, causes the byte 0x4B to be printed?

3) With respect to the command substitution with trailing newlines question: because of (2), would it be in any way safer to e.g.
     printf '\056'
   (octal for . in ASCII etc.) and then strip that off again, rather than using '.'? Especially with respect to a hypothetical UTF-16/32 locale?

4) Doesn't strictly belong here, but maybe someone knows: on my Debian (=> glibc) I was trying this:
     /usr/share/i18n/charmaps$ zgrep "[xX]2[eEfF]" * | grep -Ev '[[:space:]](SOLIDUS|FULL STOP)$'
   i.e. searching for any entries that are encoded as 0x2E or 0x2F (. and /), and filtering out those that really are FULL STOP or SOLIDUS.
That gave quite some matches:

BRF.gz:<U2828> /x2e BRAILLE PATTERN DOTS-46
BRF.gz:<U280C> /x2f BRAILLE PATTERN DOTS-34
EBCDIC-AT-DE-A.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-AT-DE-A.gz:<U0007> /x2f BELL (BEL)
EBCDIC-AT-DE.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-AT-DE.gz:<U0007> /x2f BELL (BEL)
EBCDIC-CA-FR.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-CA-FR.gz:<U0007> /x2f BELL (BEL)
EBCDIC-DK-NO-A.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-DK-NO-A.gz:<U0007> /x2f BELL (BEL)
EBCDIC-DK-NO.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-DK-NO.gz:<U0007> /x2f BELL (BEL)
EBCDIC-ES-A.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-ES-A.gz:<U0007> /x2f BELL (BEL)
EBCDIC-ES.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-ES.gz:<U0007> /x2f BELL (BEL)
EBCDIC-ES-S.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-ES-S.gz:<U0007> /x2f BELL (BEL)
EBCDIC-FI-SE-A.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-FI-SE-A.gz:<U0007> /x2f BELL (BEL)
EBCDIC-FI-SE.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-FI-SE.gz:<U0007> /x2f BELL (BEL)
EBCDIC-FR.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-FR.gz:<U0007> /x2f BELL (BEL)
EBCDIC-IT.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-IT.gz:<U0007> /x2f BELL (BEL)
EBCDIC-PT.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-PT.gz:<U0007> /x2f BELL (BEL)
EBCDIC-UK.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-UK.gz:<U0007> /x2f BELL (BEL)
EBCDIC-US.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
EBCDIC-US.gz:<U0007> /x2f BELL (BEL)
IBM037.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM037.gz:<U0007> /x2f BELL (BEL)
IBM038.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM038.gz:<U0007> /x2f BELL (BEL)
IBM1026.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM1026.gz:<U0007> /x2f BELL (BEL)
IBM1047.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM1047.gz:<U0007> /x2f BELL (BEL)
IBM1132.gz:<U0006> /x2e <control>
IBM1132.gz:<U0007> /x2f <control>
IBM1160.gz:<U0006> /x2e <control>
IBM1160.gz:<U0007> /x2f <control>
IBM1164.gz:<U0006> /x2e <control>
IBM1164.gz:<U0007> /x2f <control>
IBM256.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM256.gz:<U0007> /x2f BELL (BEL)
IBM273.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM273.gz:<U0007> /x2f BELL (BEL)
IBM274.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM274.gz:<U0007> /x2f BELL (BEL)
IBM275.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM275.gz:<U0007> /x2f BELL (BEL)
IBM277.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM277.gz:<U0007> /x2f BELL (BEL)
IBM278.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM278.gz:<U0007> /x2f BELL (BEL)
IBM280.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM280.gz:<U0007> /x2f BELL (BEL)
IBM281.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM281.gz:<U0007> /x2f BELL (BEL)
IBM284.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM284.gz:<U0007> /x2f BELL (BEL)
IBM285.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM285.gz:<U0007> /x2f BELL (BEL)
IBM290.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM290.gz:<U0007> /x2f BELL (BEL)
IBM297.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM297.gz:<U0007> /x2f BELL (BEL)
IBM420.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM420.gz:<U0007> /x2f BELL (BEL)
IBM423.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM423.gz:<U0007> /x2f BELL (BEL)
IBM424.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM424.gz:<U0007> /x2f BELL (BEL)
IBM500.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM500.gz:<U0007> /x2f BELL (BEL)
IBM870.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM870.gz:<U0007> /x2f BELL (BEL)
IBM871.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM871.gz:<U0007> /x2f BELL (BEL)
IBM875.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM875.gz:<U0007> /x2f BELL (BEL)
IBM880.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM880.gz:<U0007> /x2f BELL (BEL)
IBM905.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM905.gz:<U0007> /x2f BELL (BEL)
IBM918.gz:<U0006> /x2e ACKNOWLEDGE (ACK)
IBM918.gz:<U0007> /x2f BELL (BEL)
INIS-CYRILLIC.gz:<U2192> /x2e RIGHTWARDS ARROW
INIS-CYRILLIC.gz:<U222B> /x2f INTEGRAL
ISO_10646.gz:<I;> /x01/x2E LATIN CAPITAL LETTER I WITH OGONEK
ISO_10646.gz:<i;> /x01/x2F LATIN SMALL LETTER I WITH OGONEK
ISO_10646.gz:<JU> /x04/x2E CYRILLIC CAPITAL LETTER YU
ISO_10646.gz:<JA> /x04/x2F CYRILLIC CAPITAL LETTER YA
ISO_10646.gz:<x+> /x06/x2E ARABIC LETTER KHAH
ISO_10646.gz:<d+> /x06/x2F ARABIC LETTER DAL
ISO_10646.gz:<I:'> /x1E/x2E LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
ISO_10646.gz:<i:'> /x1E/x2F LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
ISO_10646.gz:<Io> /x22/x2E CONTOUR INTEGRAL
ISO_10646.gz:<dlR> /x25/x2E BOX DRAWINGS RIGHT HEAVY AND LEFT DOWN LIGHT
ISO_10646.gz:<dH-> /x25/x2F BOX DRAWINGS DOWN LIGHT AND HORIZONTAL HEAVY
ISO_11548-1.gz:<U282E> /x2e BRAILLE PATTERN DOTS-2346
ISO_11548-1.gz:<U282F> /x2f BRAILLE PATTERN DOTS-12346
JIS_C6220-1969-JP.gz:<YO> /x2E <U30E7> KATAKANA LETTER SMALL YO
JIS_C6220-1969-JP.gz:<TU> /x2F <U30C3> KATAKANA LETTER SMALL TU

Since all these (well except perhaps ISO_10646) use 0x2E and 0x2F for other characters than . and / ... doesn't that already mean that they're invalid with respect to POSIX?

Thanks,
Chris.
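
P.S. To make (2) and (3) a bit more concrete, here is a minimal sketch of what I have in mind. It only dumps the bytes that printf writes for '.' and for '\056', and then shows the trailing-newline trick with a '.' appended inside the command substitution. The function name capture_with_newlines and the variable name captured are made up for illustration, and I'm assuming an ASCII-based locale, in which both od lines should show 2e. It doesn't answer what POSIX requires elsewhere.

```shell
#!/bin/sh
# Dump the byte(s) printf writes for a literal '.' and for the octal
# escape \056 (od -An -tx1 prints each byte as two hex digits).
printf '.'    | od -An -tx1
printf '\056' | od -An -tx1

# Hypothetical helper: run a command and keep its trailing newlines.
# A '.' is appended inside the command substitution so the newlines are
# no longer trailing, then the '.' is stripped off again.
# The result is left in the (made-up) variable "captured".
capture_with_newlines() {
    captured=$("$@"; printf '.')
    captured=${captured%.}
}

capture_with_newlines printf 'foo\n\n'

# Show the captured string character by character; both newlines survive.
printf '%s' "$captured" | od -An -c
```

Whether the '.' here could be swapped for '\056' without changing anything is exactly question (3).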