A NOTE has been added to this issue.
======================================================================
https://www.austingroupbugs.net/view.php?id=1924
======================================================================
Reported By:                stephane
Assigned To:
======================================================================
Project:                    1003.1(2024)/Issue8
Issue ID:                   1924
Category:                   Shell and Utilities
Tags:                       tc1-2024
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     Resolved
Name:                       Stephane Chazelas
Organization:
User Reference:
Section:                    Shell word splitting and "read" utility
Page Number:                various
Line Number:                various
Interp Status:              ---
Final Accepted Text:        https://www.austingroupbugs.net/view.php?id=1924#c7183
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2025-05-05 19:02 UTC
Last Modified:              2025-06-05 16:11 UTC
======================================================================
Summary:                    New word splitting requirements inappropriate in
locales with non-self-synchronising character encodings
======================================================================
----------------------------------------------------------------------
 (0007196) geoffclare (manager) - 2025-06-05 16:11
 https://www.austingroupbugs.net/view.php?id=1924#c7196
----------------------------------------------------------------------
In the June 5, 2025 teleconference the issues raised since the original resolution were discussed. The following is a new proposed resolution but the issue is being left open for feedback.

After page 79 line 2388 section 3 Definitions, add:

<b>3.328 Self-synchronizing Character Encoding</b>
<blockquote>A character encoding in which no contiguous subset (other than the encoding of each character) of bytes from the encoding of any one character or two adjacent characters can also represent the encoding of any valid character on its own.</blockquote>
and renumber the later subsections.

On page 120 line 3840 section 6.2, change:
<blockquote>Likewise, the byte values used to encode <period>, <slash>, <newline>, and <carriage-return> shall not occur as part of any other character in any locale.</blockquote>
to:
<blockquote>Likewise, the byte values used to encode <newline>, <carriage-return>, <tab>, <space>, <hyphen-minus>, <period>, <slash>, and <colon> shall not occur as part of any other character in any locale.</blockquote>

On page 2481 line 80454 section 2.5.3 Shell Variables (IFS), after:
<blockquote>If the value of <i>IFS</i> includes any bytes that do not form part of a valid character, the results of field splitting, expansion of '*', and use of the <i>read</i> utility are unspecified.</blockquote>
add a sentence:
<blockquote>If the current locale's character encoding is not self-synchronizing and the value of <i>IFS</i> includes any character for which the byte encoding can overlap with the byte encoding of any other sequence of characters, the results of field splitting, expansion of '*', and use of the <i>read</i> utility are unspecified.</blockquote>
and two small-font notes:
<blockquote><small><b>Note:</b> The UTF-8 encoding is self-synchronizing, meaning that no character's encoding can be confused with any other sequence of characters, and thus all characters are safe to use in <i>IFS</i> when the current locale uses this encoding.</small></blockquote>
<blockquote><small><b>Note:</b> [xref to XBD 6.2 Character Encoding] specifies a set of characters from the portable character set whose byte values are not allowed to occur as part of any other character in any locale. These characters are safe to use in <i>IFS</i> with any locale.</small></blockquote>

Issue History
Date Modified    Username       Field                    Change
======================================================================
2025-05-05 19:02 stephane       New Issue
2025-05-15 15:14 geoffclare     Note Added: 0007183
2025-05-15 15:16 geoffclare     Status                   New => Resolved
2025-05-15 15:16 geoffclare     Resolution               Open => Accepted As Marked
2025-05-15 15:16 geoffclare     Interp Status             => ---
2025-05-15 15:16 geoffclare     Final Accepted Text       => https://www.austingroupbugs.net/view.php?id=1924#c7183
2025-05-15 15:16 geoffclare     Tag Attached: tc1-2024
2025-05-16 06:25 stephane       Note Added: 0007186
2025-05-16 06:28 stephane       Note Added: 0007187
2025-05-16 09:39 hvd            Note Added: 0007188
2025-05-16 14:13 stephane       Note Added: 0007189
2025-05-16 15:31 hvd            Note Added: 0007190
2025-05-16 18:48 chet_ramey     Note Added: 0007191
2025-06-05 16:11 geoffclare     Note Added: 0007196
======================================================================
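
The overlap condition in the sentence proposed for 2.5.3 in note 0007196 can be illustrated with a rough Python sketch. It is not part of the proposed text. The helper name `delimiter_overlaps` is hypothetical, and the sketch assumes a single-character delimiter and a stateless encoding (one where encoding character by character gives the same bytes as encoding the whole string). It reports whether the delimiter's bytes turn up inside the encoded text anywhere other than where the delimiter character itself actually occurs.

```python
def delimiter_overlaps(delim: str, text: str, encoding: str) -> bool:
    """Return True if the bytes of `delim` appear in the encoding of `text`
    at an offset where `delim` does not occur as a character.

    Assumptions (illustrative only): `delim` is a single character and
    `encoding` is stateless, so per-character encoding is valid.
    """
    needle = delim.encode(encoding)
    data = b""
    genuine = set()  # byte offsets where delim really occurs as a character
    for ch in text:
        if ch == delim:
            genuine.add(len(data))
        data += ch.encode(encoding)

    # Scan every byte-level match and reject those that are real delimiters.
    start = data.find(needle)
    while start != -1:
        if start not in genuine:
            return True
        start = data.find(needle, start + 1)
    return False


if __name__ == "__main__":
    # In Shift_JIS, KATAKANA LETTER SO encodes as 0x83 0x5C, and 0x5C is
    # also the single-byte encoding of the backslash.
    print(delimiter_overlaps("\\", "ソ", "shift_jis"))  # True
    # UTF-8 is self-synchronizing, so no such overlap exists.
    print(delimiter_overlaps("\\", "ソ", "utf-8"))      # False
    # <colon> is in the proposed XBD 6.2 list of protected byte values.
    print(delimiter_overlaps(":", "ソ:ソ", "shift_jis"))  # False
```

A byte-oriented splitter that put backslash in <i>IFS</i> under a Shift_JIS locale could split in the middle of the first example's character, which is the kind of case the proposed wording leaves unspecified.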