A NOTE has been added to this issue. 
====================================================================== 
https://www.austingroupbugs.net/view.php?id=1924 
====================================================================== 
Reported By:                stephane
Assigned To:                
====================================================================== 
Project:                    1003.1(2024)/Issue8
Issue ID:                   1924
Category:                   Shell and Utilities
Tags:                       tc1-2024
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     Resolved
Name:                       Stephane Chazelas 
Organization:                
User Reference:              
Section:                    Shell word splitting and "read" utility 
Page Number:                various 
Line Number:                various 
Interp Status:              --- 
Final Accepted Text:       
https://www.austingroupbugs.net/view.php?id=1924#c7183 
Resolution:                 Accepted As Marked
Fixed in Version:           
====================================================================== 
Date Submitted:             2025-05-05 19:02 UTC
Last Modified:              2025-05-16 09:39 UTC
====================================================================== 
Summary:                    New word splitting requirements inappropriate in
locales with non-self-synchronising character encodings
====================================================================== 

---------------------------------------------------------------------- 
 (0007188) hvd (reporter) - 2025-05-16 09:39
 https://www.austingroupbugs.net/view.php?id=1924#c7188 
---------------------------------------------------------------------- 
I'm not sure what the standardese would be, but I think it's possible to make it
less unspecified so that it still allows handling file names containing
arbitrary bytes, but restore the handling of all locales to what Issue 7
required. The rule that, as far as I know, all shells that support multibyte
characters try to implement, is simple:

When a shell interprets a byte string as a character string, this is done as if
by repeated calls to mbrtowc(), except that if it would encounter EILSEQ, an
unspecified character (other than a null character) is produced and conversion
resumes from the initial conversion state.
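
A minimal C sketch of the rule above, for illustration only and not taken from
any particular shell. The function name decode_lenient, the REPLACEMENT
character, skipping exactly one byte after EILSEQ, and treating a truncated
sequence at the end of input as a single replacement character are all
assumptions; the rule itself only says that an unspecified non-null character
is produced and that conversion resumes from the initial state.

```c
#include <errno.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for the "unspecified character (other than null)". */
#define REPLACEMENT L'?'

/*
 * Decode the n bytes at s into wide characters using the current locale.
 * Writes at most outcap characters to out and returns how many were written.
 */
size_t decode_lenient(const char *s, size_t n, wchar_t *out, size_t outcap)
{
    mbstate_t st;
    size_t count = 0;

    memset(&st, 0, sizeof st);              /* initial conversion state */

    while (n > 0 && count < outcap) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s, n, &st);

        if (r == (size_t)-1) {
            /* EILSEQ: emit a replacement and restart from the initial state.
             * Skipping one byte is an assumption of this sketch. */
            wc = REPLACEMENT;
            memset(&st, 0, sizeof st);
            r = 1;
        } else if (r == (size_t)-2) {
            /* Incomplete sequence at end of input. The rule above does not
             * cover this case; this sketch treats it like an invalid one
             * and consumes the remaining bytes. */
            wc = REPLACEMENT;
            memset(&st, 0, sizeof st);
            r = n;
        } else if (r == 0) {
            /* A null character was decoded. Assume it occupied one byte,
             * which holds for stateless encodings. */
            r = 1;
        }

        out[count++] = wc;
        s += r;
        n -= r;
    }
    return count;
}
```

A caller would first select the locale with setlocale(LC_CTYPE, "") and could
then split the resulting wide characters on IFS characters, so that splitting
never lands inside a valid multibyte character while arbitrary bytes in file
names are still accepted.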

Are there any shells that do not try to follow this general principle? If
not, and if there is a way to phrase that in a manner appropriate for
standardization, the changes to require splitting on byte sequences can be
reverted; the intended aim of those changes would then be handled transparently.


Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2025-05-05 19:02 stephane       New Issue                                    
2025-05-15 15:14 geoffclare     Note Added: 0007183                          
2025-05-15 15:16 geoffclare     Status                   New => Resolved     
2025-05-15 15:16 geoffclare     Resolution               Open => Accepted As
Marked
2025-05-15 15:16 geoffclare     Interp Status             => ---             
2025-05-15 15:16 geoffclare     Final Accepted Text       =>
https://www.austingroupbugs.net/view.php?id=1924#c7183    
2025-05-15 15:16 geoffclare     Tag Attached: tc1-2024                       
2025-05-16 06:25 stephane       Note Added: 0007186                          
2025-05-16 06:28 stephane       Note Added: 0007187                          
2025-05-16 09:39 hvd            Note Added: 0007188                          
======================================================================

