A NOTE has been added to this issue. 
Reported By:                Florian Weimer
Assigned To:                
Project:                    1003.1(2013)/Issue7+TC1
Issue ID:                   1078
Category:                   System Interfaces
Type:                       Clarification Requested
Severity:                   Editorial
Priority:                   normal
Status:                     New
Name:                       Florian Weimer 
Organization:               Red Hat 
User Reference:              
Section:                    isdigit, isxdigit 
Page Number:                unknown 
Line Number:                unknown 
Interp Status:              --- 
Final Accepted Text:         
Date Submitted:             2016-09-16 17:54 UTC
Last Modified:              2016-09-16 18:50 UTC
Summary:                    isdigit, isxdigit locale dependence

 (0003381) eblake (manager) - 2016-09-16 18:50
As I recall from past analysis, it is not possible to have a system with
simultaneous EBCDIC and ASCII-derived locales.  (You can have a system with
support for both encodings only if there is some non-standard way to switch
which mode you are in; within a given mode, standard-conforming apps see
either that all available locales are EBCDIC-based, or that all available
locales are ASCII-based.)  In part, this is because the standard requires
that all locales use the same single-byte representation for the various
digits: isdigit() really is locale-independent, and returns non-zero for
the same set of 10 bytes regardless of what else is going on in the locale.
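
A minimal C sketch of how one might check the claim above on a given
system.  The helper name digits_are_fixed() is hypothetical, and the sketch
assumes only the standard <ctype.h> and <locale.h> interfaces; it is not
taken from the standard text.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: switch to the named locale and check that isdigit()
 * accepts exactly the byte values '0'..'9' and nothing else.
 * Returns 1 if so, 0 if some byte disagrees, -1 if the locale is unavailable.
 * Note that this changes the process-wide locale as a side effect.
 */
int digits_are_fixed(const char *locale_name)
{
    assert(locale_name != NULL);

    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;

    for (int c = 0; c <= UCHAR_MAX; c++) {
        /* C guarantees that '0'..'9' are contiguous and ascending. */
        int expected = (c >= '0' && c <= '9');
        if ((isdigit(c) != 0) != expected)
            return 0;
    }
    return 1;
}
```

Calling digits_are_fixed("C") exercises the C locale; passing other
installed locale names would probe those locales in the same way.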

The standard forbids a locale with UTF-16 or UTF-32 codesets (the all-zero
NUL byte is required to represent the NUL character across ALL locales, and
is not permitted to appear within a multibyte character encoding, whereas
UTF-16 and UTF-32 use zero bytes when encoding ordinary ASCII characters).
That does not mean that the standard forbids processing of UTF-16 or UTF-32
data (that would be a job for iconv), but merely that your locale is never
encoded as UTF-16 or UTF-32.
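
As a rough illustration of handing such data to iconv, here is a short C
sketch.  The helper name utf16le_to_utf8() is hypothetical, and the codeset
names "UTF-16LE" and "UTF-8" are an assumption: they are common (for
example with glibc), but the names accepted by iconv_open() are
implementation-defined.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert in_len bytes of UTF-16LE data to UTF-8.
 * Returns the number of bytes written to out, or (size_t)-1 on failure
 * (unknown codeset name, invalid input, or output buffer too small).
 */
size_t utf16le_to_utf8(const char *in, size_t in_len,
                       char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);

    /* Codeset names are an assumption; they vary by implementation. */
    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;   /* iconv() takes char **, not const char ** */
    char *outp = out;
    size_t in_left = in_len;
    size_t out_left = out_cap;

    size_t rc = iconv(cd, &inp, &in_left, &outp, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_cap - out_left;
}
```

Once the data has been converted to a codeset that a locale may actually
use, the resulting bytes can be examined with isdigit() and friends.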

Furthermore, while the standard does permit multibyte encodings in which
one byte of a multibyte character happens to have the same value as a
single-byte letter character, it is fairly explicit that at least the
portable filename character set (which includes every character that
isxdigit() accepts) must consist of single-byte characters.

So about the only role the locale could even play is in determining whether
some multibyte character encodings happen to reuse a byte that would be one
of the characters in question if it appeared on its own; it cannot change
the single-byte encoding of the characters in question.
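
To make the byte-reuse case concrete, here is a small C sketch.  It uses
Shift_JIS as an assumed example encoding with a simplified lead-byte rule
(0x81..0x9F and 0xE0..0xFC start a two-byte character); the helper names
are hypothetical.  In Shift_JIS the two-byte sequence 0x83 0x41 is a single
katakana character whose second byte has the same value as 'A'.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Simplified Shift_JIS rule (assumption): these bytes start a 2-byte char. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Count hex digits one byte at a time, ignoring character boundaries. */
size_t count_xdigits_bytewise(const unsigned char *s, size_t n)
{
    assert(s != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (isxdigit(s[i]))
            count++;
    }
    return count;
}

/* Count hex digits per character: both bytes of a 2-byte char are skipped. */
size_t count_xdigits_sjis(const unsigned char *s, size_t n)
{
    assert(s != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (sjis_is_lead(s[i]) && i + 1 < n) {
            i++;            /* skip the trail byte as well */
            continue;
        }
        if (isxdigit(s[i]))
            count++;
    }
    return count;
}
```

For the three bytes 0x83 0x41 'F', the byte-wise count treats the trail
byte 0x41 as 'A', while the character-aware count sees only the standalone
'F'.  In both cases 'F' itself has the same single-byte value.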

Issue History 
Date Modified    Username       Field                    Change               
2016-09-16 17:54 Florian Weimer New Issue                                    
2016-09-16 17:54 Florian Weimer Name                      => Florian Weimer  
2016-09-16 17:54 Florian Weimer Organization              => Red Hat         
2016-09-16 17:54 Florian Weimer Section                   => isdigt, isxdigit
2016-09-16 17:54 Florian Weimer Page Number               => unknown         
2016-09-16 17:54 Florian Weimer Line Number               => unknown         
2016-09-16 18:35 Don Cragun     Note Added: 0003380                          
2016-09-16 18:50 eblake         Note Added: 0003381                          
