On 12.12.24 19:14, Jeff Davis wrote:
On Mon, 2024-12-02 at 11:57 +0100, Peter Eisentraut wrote:
t_isdigit() and t_isspace() are just used to parse various
configuration
and data files, and surely we don't need support for encoding-
dependent
multibyte support for parsing ASCII digits and ASCII spaces.
... So these can
be
replaced by the normal isdigit() and isspace().
That would still call libc, and still depend on LC_CTYPE. Should we use
pure ASCII variants?
isdigit() and isspace() in particular are widely used throughout the
backend code without such concerns. I think the assumption is that this
is not a problem in practice: For multibyte encodings, these functions
would only be able to process the ASCII subset, and the character
classification of that should be consistent across all locales. For
single-byte encodings, among the encodings that PostgreSQL supports, I
don't think any of them actually provide non-ASCII digits or space
characters.