On 28/01/2022 01:48, Christoph Anton Mitterer wrote:
On Thu, 2022-01-27 at 15:18 +0000, Harald van Dijk via austin-group-l
at The Open Group wrote:
The benefit of this is that when the
shell's locale changes, variables still hold their original text (as
opposed to their original bytes).

But doesn't that by itself already violate POSIX?

There is "2.5.3 Shell Variables", which AFAIU says that setting
LANG/LC_* must take effect during the shell runtime.

The way it works in yash, it does take effect at runtime, just not in the same way it does in other shells.

LC_CTYPE says:
"Determine the interpretation of sequences of bytes of text data as
characters (for example, single-byte as opposed to multi-byte
characters), which characters are defined as letters (character class
alpha) and <blank> characters (character class blank), and the behavior
of character classes within pattern matching. Changing the value of
LC_CTYPE after the shell has started shall not affect the lexical
processing of shell commands in the current shell execution environment
or its subshells. Invoking a shell script or performing exec sh
subjects the new shell to the changes in LC_CTYPE."

=> lexical scanning of the current script stays
=> everything else, changes
    including e.g. printf, or things like ${#var}, ${var##}, etc.
Right?

${#var}, ${var##}, etc. are supposed to work at the character level. In other shells, which internally hold values as byte strings, yes, LC_CTYPE needs to be considered here. What those shells effectively do is convert to a wide string (not necessarily holding the full wide string in memory). yash holds values as wide strings and normally converts them to multibyte strings as needed. Here, that would mean converting the wide string to a multibyte string and immediately back to a wide string again, which can be optimised by just acting on the wide string directly. That said...

So if the shell would keep holding its original text/characters, this
wouldn't work (or the shell would need to convert every time)?

...for anything that cannot be done by keeping the strings as wide strings, including calling any non-builtin command, yash does convert, according to the current locale, every time the values are used.
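
To make the difference between the two storage models concrete, here is a minimal sketch in Python. It is only a model, not yash's code or any other shell's: Python's str stands in for a wide string, bytes for a multibyte string, and a codec name stands in for the charset selected by LC_CTYPE. The function and variable names are made up for illustration.

```python
# Illustrative model only, not yash's implementation.
# str plays the role of a wide string, bytes the role of a multibyte string,
# and the codec name plays the role of the current LC_CTYPE charset.

def char_length_from_bytes(stored: bytes, charset: str) -> int:
    """Byte-string shell: decode under the current charset, then count characters."""
    return len(stored.decode(charset))

def char_length_from_text(stored: str) -> int:
    """Wide-string shell: the value is already characters, so no round trip is needed."""
    return len(stored)

def for_external_command(stored: str, charset: str) -> bytes:
    """Wide-string shell: encode under the current charset whenever bytes are required."""
    return stored.encode(charset)

# The variable is assigned U+00E9 (e with acute accent) while the charset is UTF-8.
as_bytes = "\u00e9".encode("utf-8")  # what a byte-string shell keeps
as_text = "\u00e9"                   # what a wide-string shell keeps

# Character counts, in the style of ${#var}, before and after switching to Latin-1.
len_bytes_utf8 = char_length_from_bytes(as_bytes, "utf-8")
len_bytes_latin1 = char_length_from_bytes(as_bytes, "latin-1")
len_text = char_length_from_text(as_text)

# Bytes handed to a non-builtin command under each charset.
arg_utf8 = for_external_command(as_text, "utf-8")
arg_latin1 = for_external_command(as_text, "latin-1")
```

In this model the byte-string value counts as one character under UTF-8 and as two once the charset is Latin-1, because the same two bytes are reinterpreted. The text value stays one character throughout, and what changes instead is the byte sequence produced when it is handed to an external command.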

Cheers,
Harald van Dijk
