Re: Should the readline *-meta flags reset when $LANG changes?

2022-11-14 Thread Koichi Murase
2022-11-15 (Tue) 5:31, Chet Ramey wrote:
> On 11/14/22 11:40 AM, Koichi Murase wrote:
> > 2022-11-15 (Tue) 0:22, Chet Ramey wrote:
> >> On 8/11/22 5:56 PM, Koichi Murase wrote:
> >>> Can we also change the behavior of TERM in a way similar to option
> >>> 4?  Currently, a temporary change of TERM clears keybindings of some
> >>> keys (home, end, right, left, etc.) even when the temporary change does
> >>> not survive across multiple calls of readline:
> >>
> >> I finally got back to look at this, and I couldn't reproduce it. That was
> >> expected, since the arrow key binding functions are pretty careful not to
> >> overwrite an existing binding. Then I figured out what was going on.
> >
> > Thank you for checking this.
> >
> >>>
> >>> $ bash-dev --norc
> >>> $ echo "$TERM"
> >>> screen.xterm-256color
> >>> $ bind '"\e[1~": bell'
> >>
> >> This unbinds the key sequence, since `bell' is not a valid bindable command
> >> name.
> >
> > Ah, OK. The above ``reduced case'' was not correct, but unbinding is
> > actually what I wanted to do in the original problem. In the original
> > code, I intentionally unbind the keybinding for "\e[1~" and instead
> > try to bind a single byte `\e'.
>
> What do you try to bind \e to?

For my particular case, I would like to decode the escape sequences by
myself within the shell code using « bind -x '"\e": shell-func' » to
provide another configuration interface for the terminal's specific
key sequences and keybindings. I must admit that my use case is
unusual, but I am not sure if we could say for sure that there would
never be any use cases of binding to `\e' other than that.

> >> I think the "TERM=$TERM" idiom to reset the readline terminal settings
> >> without overwriting existing key bindings is useful enough to retain the
> >> current behavior.
> >
> > I think it can be useful, but should that also apply to the tempenv of
> > the form "TERM=$TERM infocmp"?
>
> You can't really have one without the other, given the way the special
> variable handling works (and has worked).
>
> > In the sense that the side effects of
> > the temporary environment variables (tempenvs) are intended not to
> > persist after the execution of the command (unless it is for
> > special builtins and functions in POSIX mode), I would like to
> > request that the idiom TERM=$TERM to reset the terminal settings
> > not be invoked for the tempenvs.
>
> The variables in the temporary environment are restored to their previous
> values after the command executes. It's that restoration that triggers the
> call to rl_reset_terminal(), not the environment assignment, undoing any
> side effects of the environment assignment. Bash treats these uniformly,
> whether the simple command is a builtin, function, or command from the
> file system.
>
> So from readline's perspective, there is no difference between TERM=xxx
> and 'TERM=xxx command'. rl_reset_terminal() gets called the same way from
> the same function in both cases.

I would not think the fix would be an easy "single-line" fix either,
but is it impossible to get the context of setting the variable inside
sv_terminal (variables.c:6046) e.g. by checking the variable contexts
where TERM{,CAP,INFO} are defined or by adding the second parameter to
sh_sv_func_t? Maybe it could be more complicated than I initially
thought: For example, we need to care about the case « TERM= read
-e » where we need to finally set up the terminal settings for
readline. We also need to handle att_propagate (variables.c:134) for
the cases like « TERM= export TERM » where the value of the
tempenv would be propagated to the original scope.

--
Koichi



Re: Should the readline *-meta flags reset when $LANG changes?

2022-11-14 Thread Chet Ramey

On 11/14/22 11:40 AM, Koichi Murase wrote:
> 2022-11-15 (Tue) 0:22, Chet Ramey wrote:
>> On 8/11/22 5:56 PM, Koichi Murase wrote:
>>> Can we also change the behavior of TERM in a way similar to option
>>> 4?  Currently, a temporary change of TERM clears keybindings of some
>>> keys (home, end, right, left, etc.) even when the temporary change does
>>> not survive across multiple calls of readline:
>>
>> I finally got back to look at this, and I couldn't reproduce it. That was
>> expected, since the arrow key binding functions are pretty careful not to
>> overwrite an existing binding. Then I figured out what was going on.
>
> Thank you for checking this.
>
>>> $ bash-dev --norc
>>> $ echo "$TERM"
>>> screen.xterm-256color
>>> $ bind '"\e[1~": bell'
>>
>> This unbinds the key sequence, since `bell' is not a valid bindable command
>> name.
>
> Ah, OK. The above ``reduced case'' was not correct, but unbinding is
> actually what I wanted to do in the original problem. In the original
> code, I intentionally unbind the keybinding for "\e[1~" and instead
> try to bind a single byte `\e'.

What do you try to bind \e to?

> However, after running "TERM=xxx
> infocmp" in the command line, the keybinding does not work anymore.
> This is what I experienced.

Well, yes, if you unbind it, restoring TERM will restore the original
binding.

>> I think the "TERM=$TERM" idiom to reset the readline terminal settings
>> without overwriting existing key bindings is useful enough to retain the
>> current behavior.
>
> I think it can be useful, but should that also apply to the tempenv of
> the form "TERM=$TERM infocmp"?

You can't really have one without the other, given the way the special
variable handling works (and has worked).

> In the sense that the side effects of
> the temporary environment variables (tempenvs) are intended not to
> persist after the execution of the command (unless it is for
> special builtins and functions in POSIX mode), I would like to
> request that the idiom TERM=$TERM to reset the terminal settings
> not be invoked for the tempenvs.

The variables in the temporary environment are restored to their previous
values after the command executes. It's that restoration that triggers the
call to rl_reset_terminal(), not the environment assignment, undoing any
side effects of the environment assignment. Bash treats these uniformly,
whether the simple command is a builtin, function, or command from the
file system.

So from readline's perspective, there is no difference between TERM=xxx
and 'TERM=xxx command'. rl_reset_terminal() gets called the same way from
the same function in both cases.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Should the readline *-meta flags reset when $LANG changes?

2022-11-14 Thread Chet Ramey

On 8/11/22 5:56 PM, Koichi Murase wrote:

> Can we also change the behavior of TERM in a way similar to option
> 4?  Currently, a temporary change of TERM clears keybindings of some
> keys (home, end, right, left, etc.) even when the temporary change does
> not survive across multiple calls of readline:

I finally got back to look at this, and I couldn't reproduce it. That was
expected, since the arrow key binding functions are pretty careful not to
overwrite an existing binding. Then I figured out what was going on.

> $ bash-dev --norc
> $ echo "$TERM"
> screen.xterm-256color
> $ bind '"\e[1~": bell'

This unbinds the key sequence, since `bell' is not a valid bindable command
name. I happened to be using `previous-history' and testing with "\eOH",
which rebinds it instead.

> $ bind -q beginning-of-line
> beginning-of-line can be invoked via "\C-a", "\eOH", "\e[H".
> $ TERM=dumb infocmp >dumb.ti

Bash does call rl_reset_terminal here when restoring the original value of
TERM, and it attempts to bind the arrow keys and the other specials (Home,
etc.). It finds that "\e[1~" is not bound, and binds it.

> $ bind -q beginning-of-line
> beginning-of-line can be invoked via "\C-a", "\eOH", "\e[1~", "\e[H".

I think the "TERM=$TERM" idiom to reset the readline terminal settings
without overwriting existing key bindings is useful enough to retain the
current behavior.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Should the readline *-meta flags reset when $LANG changes?

2022-11-14 Thread Koichi Murase
2022-11-15 (Tue) 0:22, Chet Ramey wrote:
> On 8/11/22 5:56 PM, Koichi Murase wrote:
> > Can we also change the behavior of TERM in a way similar to option
> > 4?  Currently, a temporary change of TERM clears keybindings of some
> > keys (home, end, right, left, etc.) even when the temporary change does
> > not survive across multiple calls of readline:
>
> I finally got back to look at this, and I couldn't reproduce it. That was
> expected, since the arrow key binding functions are pretty careful not to
> overwrite an existing binding. Then I figured out what was going on.

Thank you for checking this.

> >
> > $ bash-dev --norc
> > $ echo "$TERM"
> > screen.xterm-256color
> > $ bind '"\e[1~": bell'
>
> This unbinds the key sequence, since `bell' is not a valid bindable command
> name.

Ah, OK. The above ``reduced case'' was not correct, but unbinding is
actually what I wanted to do in the original problem. In the original
code, I intentionally unbind the keybinding for "\e[1~" and instead
try to bind a single byte `\e'. However, after running "TERM=xxx
infocmp" in the command line, the keybinding does not work anymore.
This is what I experienced. Currently, as a workaround, I run the
unbinding and rebinding code [1] every time the user command is
executed, but I would like to skip the workaround if possible in newer
versions of Bash.

[1] The related code is found at
https://github.com/akinomyoga/ble.sh/blob/0c6291f0c1/src/edit.sh#L6410-L6438

> I think the "TERM=$TERM" idiom to reset the readline terminal settings
> without overwriting existing key bindings is useful enough to retain the
> current behavior.

I think it can be useful, but should that also apply to the tempenv of
the form "TERM=$TERM infocmp"? In the sense that the side effects of
the temporary environment variables (tempenvs) are intended not to
persist after the execution of the command (unless it is for
special builtins and functions in POSIX mode), I would like to
request that the idiom TERM=$TERM to reset the terminal settings
not be invoked for the tempenvs.

--
Koichi



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-12 Thread Chet Ramey

On 8/11/22 5:56 PM, Koichi Murase wrote:

> I agree with option 4. Thank you for all your explanations.
>
> --
>
> Can we also change the behavior of TERM in a way similar to option
> 4?

I'll look at that for the next version.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-11 Thread Koichi Murase
2022-08-12 (Fri) 4:22, Chet Ramey wrote:
> >> Often enough to make a difference?
> >
> > My `bind -x' functions use `LC_ALL=' and `LC_CTYPE=C' for every
> > keystroke, for example, in combination with `builtin read'.  They also
> > use `LC_ALL=' for other purposes for mostly every keystroke.  Some vi
> > binding also uses `LC_CTYPE=C'.  My completion functions also change
> > `LC_ALL` and `LC_CTYPE`.  For example, `LC_CTYPE=C' is used in
> > calculating a PJW hash code of a given string.  I haven't carefully
> > checked, but there are probably other cases of changing `LC_CTYPE'.
> > Also, `LC_ALL=' is used everywhere.
>
> So you're using `read -e'?

No.

> Otherwise, these suggest that solution 4 is most appropriate.
>
> >> Across multiple calls to readline?
> >
> > I think I am missing the point.  What does ``multiple calls to
> > readline'' mean?  Is the situation different from a single call to
> > readline?
>
> It informs the solution. If I choose option 4, for instance, none of these
> matter.

Ah! Now I understand it. I was misunderstanding option 4. Option 4 works for me.

> Where I think we're converging is to use option 4, and -- as long as
> LC_ALL/LC_CTYPE/LANG don't change -- not modifying these variables when
> readline() is called. I can document that these variables are dependent on
> the current locale, and if the locale changes, those variables will need
> to be adjusted. If the locale doesn't change between calls to readline(),
> you don't need to do anything.

I agree with option 4. Thank you for all your explanations.

--

Can we also change the behavior of TERM in a way similar to option
4?  Currently, a temporary change of TERM clears keybindings of some
keys (home, end, right, left, etc.) even when the temporary change does
not survive across multiple calls of readline:

$ bash-dev --norc
$ echo "$TERM"
screen.xterm-256color
$ bind '"\e[1~": bell'
$ bind -q beginning-of-line
beginning-of-line can be invoked via "\C-a", "\eOH", "\e[H".
$ TERM=dumb infocmp >dumb.ti
$ bind -q beginning-of-line
beginning-of-line can be invoked via "\C-a", "\eOH", "\e[1~", "\e[H".

There are only a few places where TERM can be changed in my
configuration (unlike LANG/LC_* which are changed in many places), so
I can work around them by saving and restoring the keybindings, yet I
think it is more reasonable that automatic rebinding on TERM changes
only happens when the change survives to the next call of readline (as
option 4 for the locale variables).
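
For commands that a configuration issues itself, one possible way to avoid
the rebinding is to never assign TERM in the current shell at all and let
env(1) set it only in the child's environment. The following is a minimal
sketch under that assumption, not something proposed in this thread;
`run_with_term' is a hypothetical helper name.

```shell
# Hypothetical helper: run a command with a different TERM without ever
# assigning TERM in the current shell, so the shell has no TERM value to
# restore afterwards.
run_with_term() {
  local term=$1
  shift
  env TERM="$term" "$@"
}

# Example (assumes infocmp is installed):
#   run_with_term dumb infocmp >dumb.ti
```

This does not help when a user types `TERM=xxx command' directly on the
command line, which is the case the request above is about.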

--
Koichi



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-11 Thread Chet Ramey

On 8/10/22 10:59 PM, Koichi Murase wrote:
> 2022-08-10 (Wed) 23:21, Chet Ramey wrote:
>>> Does it mean custom values of these readline variables will be lost
>>> every time LANG or LC_{CTYPE,ALL} is changed even if a user or program
>>> intentionally sets them up?
>>
>> It means those settings will now mirror the locale.
>>
>>> We often temporarily change LANG or LC_* to perform some binary
>>> operations [such as counting the number of bytes of data and safely
>>> removing trailing x from the result of $(command;printf x)].
>>
>> Do you often do this in interactive shells?
>
> Yes, but I don't mean I directly type the above kinds of commands in
> the command line and run them, but I use them in the functions called
> through `bind -x'.  Also, the above cases (counting bytes and removing
> trailing x) are just examples; I set locale variables for various
> purposes in the actual code.  For example, I often type and run
> commands of the form
>
>    LANG=C some-commands-or-functions
>
> to get the default error messages that are not locale-specific (though
> I could use LC_MESSAGES=C instead, yet LANG=C is easier to type for
> me).  I normally use the locale LANG=ja_JP.UTF-8 by default, so the
> commands output error messages in Japanese by default.  This is not
> useful when I would like to search for the solution on the internet
> because there is almost no information on the Japanese error message.

So let's talk through these, since it doesn't seem like these things will
be affected by the realistic available solutions.

>> Often enough to make a difference?
>
> My `bind -x' functions use `LC_ALL=' and `LC_CTYPE=C' for every
> keystroke, for example, in combination with `builtin read'.  They also
> use `LC_ALL=' for other purposes for mostly every keystroke.  Some vi
> binding also uses `LC_CTYPE=C'.  My completion functions also change
> `LC_ALL` and `LC_CTYPE`.  For example, `LC_CTYPE=C' is used in
> calculating a PJW hash code of a given string.  I haven't carefully
> checked, but there are probably other cases of changing `LC_CTYPE'.
> Also, `LC_ALL=' is used everywhere.

So you're using `read -e'? Otherwise, these suggest that solution 4 is
most appropriate.

>> Across multiple calls to readline?
>
> I think I am missing the point.  What does ``multiple calls to
> readline'' mean?  Is the situation different from a single call to
> readline?

It informs the solution. If I choose option 4, for instance, none of these
matter. They will all happen as part of a single call to readline, and the
normal shell execution will ensure that the modified locale variables are
temporary.

> Hmm, I think I first need to make it clear that the behavior of my
> code, which is supposed to be sourced in an interactive session by
> users, is unaffected by these readline settings.

OK.

> I just do not want
> to break or change the existing user settings inside the functions
> that I provide.  The behavior of my functions is unaffected (except
> for « bind -x '"\M-x":'  » which is affected by `convert-meta',
> for which I already implemented a workaround) because it doesn't try
> to communicate with readline inside a single call of `bind -x'.  The
> problem is that, with the new automatic adjustment of these readline
> variables, the settings by users can be lost after using `LC_ALL=' or
> `LC_CTYPE=C' inside my functions.

Only if those functions recursively call readline() (which is a bad idea
anyway) somehow, or leave the modified settings in the user's environment
for the next call to readline(). This is the point of my question.

> I believe this is a general problem for writers of Bash
> configurations. `bash_completion' also uses `LC_CTYPE=C' and
> `LC_ALL=C'.  The behavior of such configurations itself will be
> unaffected by the change of readline settings, but they need to
> implement special treatment to preserve the user settings if the user
> settings will be lost by changing locales.

This scenario is not relevant with option 4, unless bash-completion leaves
its modified LC_CTYPE and LC_ALL settings in the user's environment after
the call to readline() completes. If it did, I imagine people would have
complained by now.

>> And, if the change is intended to be temporary, why would you not
>> want the relevant readline variables to reflect the locale when you
>> were finished?
>
> Because I would not like to break the users' settings.  In general, a
> third-party Bash configuration should not overwrite the users'
> settings as long as the configuration does not need the setting.

So that argues against option 3, and in favor of option 4.

>>> Also, if these readline variables would be cleared every time, it
>>> seems to me that these readline variables would be effectively
>>> unconfigurable and would lose the point of their existence, or we
>>> could not touch LANG or LC_* at all after the initial setup.

The one caveat we would have to add is to tell users they have to
restore custom values of these readline variables if they change LC_ALL,
LC_CTYPE, or LANG from one call to readline to the next. They're already
auto-set when readline

Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-10 Thread Koichi Murase
2022-08-10 (Wed) 23:21, Chet Ramey wrote:
> > Does it mean custom values of these readline variables will be lost
> > every time LANG or LC_{CTYPE,ALL} is changed even if a user or program
> > intentionally sets them up?
>
> It means those settings will now mirror the locale.
>
> > We often temporarily change LANG or LC_* to perform some binary
> > operations [such as counting the number of bytes of data and safely
> > removing trailing x from the result of $(command;printf x)].
>
> Do you often do this in interactive shells?

Yes, but I don't mean I directly type the above kinds of commands in
the command line and run them, but I use them in the functions called
through `bind -x'.  Also, the above cases (counting bytes and removing
trailing x) are just examples; I set locale variables for various
purposes in the actual code.  For example, I often type and run
commands of the form

  LANG=C some-commands-or-functions

to get the default error messages that are not locale-specific (though
I could use LC_MESSAGES=C instead, yet LANG=C is easier to type for
me).  I normally use the locale LANG=ja_JP.UTF-8 by default, so the
commands output error messages in Japanese by default.  This is not
useful when I would like to search for the solution on the internet
because there is almost no information on the Japanese error message.
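
As a small illustration of the command form above, a wrapper along these
lines runs a single command with untranslated messages and leaves the
caller's locale variables alone. This is a sketch, not code from the
thread: `c_locale' is a hypothetical name, and it uses LC_ALL rather than
LANG on the assumption that the caller may have LC_ALL or LC_MESSAGES set,
either of which would take precedence over LANG.

```shell
# Hypothetical helper: run one command in the C locale.
# The assignment is a temporary environment (tempenv) for that command
# only, so LC_ALL in the calling shell is restored when the command ends.
c_locale() {
  LC_ALL=C "$@"
}

# Example: c_locale ls /nonexistent
```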

> Often enough to make a difference?

My `bind -x' functions use `LC_ALL=' and `LC_CTYPE=C' for every
keystroke, for example, in combination with `builtin read'.  They also
use `LC_ALL=' for other purposes for mostly every keystroke.  Some vi
binding also uses `LC_CTYPE=C'.  My completion functions also change
`LC_ALL` and `LC_CTYPE`.  For example, `LC_CTYPE=C' is used in
calculating a PJW hash code of a given string.  I haven't carefully
checked, but there are probably other cases of changing `LC_CTYPE'.
Also, `LC_ALL=' is used everywhere.

> Across multiple calls to readline?

I think I am missing the point.  What does ``multiple calls to
readline'' mean?  Is the situation different from a single call to
readline?

Hmm, I think I first need to make it clear that the behavior of my
code, which is supposed to be sourced in an interactive session by
users, is unaffected by these readline settings.  I just do not want
to break or change the existing user settings inside the functions
that I provide.  The behavior of my functions is unaffected (except
for « bind -x '"\M-x":'  » which is affected by `convert-meta',
for which I already implemented a workaround) because it doesn't try
to communicate with readline inside a single call of `bind -x'.  The
problem is that, with the new automatic adjustment of these readline
variables, the settings by users can be lost after using `LC_ALL=' or
`LC_CTYPE=C' inside my functions.

I believe this is a general problem for writers of Bash
configurations. `bash_completion' also uses `LC_CTYPE=C' and
`LC_ALL=C'.  The behavior of such configurations itself will be
unaffected by the change of readline settings, but they need to
implement special treatment to preserve the user settings if the user
settings will be lost by changing locales.

> And, if the change is intended to be temporary, why would you not
> want the relevant readline variables to reflect the locale when you
> were finished?

Because I would not like to break the users' settings.  In general, a
third-party Bash configuration should not overwrite the users'
settings as long as the configuration does not need the setting.

> > Also, if these readline variables would be cleared every time, it
> > seems to me that these readline variables would be effectively
> > unconfigurable and would lose the point of their existence, or we
> > could not touch LANG or LC_* at all after the initial setup.
>
> It seems to me that the scenario Alan describes is much more common.

I agree with this point because I have also faced this problem
for « bind -x '"\M-x":...' » vs « convert-meta » before.  For this
problem, I have added a partial workaround on my side [1] where I
decided to save and restore `convert-meta' before and after running
`bind -x'.  Actually, the patch [2] I posted to this list before was
part of the workaround for this problem.

[1] 
https://github.com/akinomyoga/ble.sh/commit/f32808070796d3978787f4491f812d06a629ab3f
[2] https://lists.gnu.org/archive/html/bug-bash/2019-02/msg00036.html
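
A save-and-restore of the kind described above could look roughly like the
following. This is a minimal sketch and not the code from [1]; all three
function names and the `saved_meta_settings' variable are hypothetical. It
relies only on `bind -v' printing lines of the form `set NAME VALUE' and
on `bind' accepting such a line back as an argument.

```shell
# Keep only the three meta-related settings from `bind -v'-style input
# read on stdin.
filter_meta_settings() {
  grep -E '^set (convert|input|output)-meta (on|off)$'
}

# Save the current values. Note that the command substitution forks a
# subshell, which is a cost when done on every keystroke.
save_meta_settings() {
  saved_meta_settings=$(bind -v 2>/dev/null | filter_meta_settings)
}

# Re-apply the saved values one line at a time.
restore_meta_settings() {
  local line
  while IFS= read -r line; do
    if [[ -n $line ]]; then
      bind "$line"
    fi
  done <<< "$saved_meta_settings"
}
```

A caller would run save_meta_settings before touching the locale variables
and restore_meta_settings afterwards.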

I agree that we should somehow change the current behavior that the
default values of *-meta settings are determined by the locale on the
startup of Bash, but the proposed change will break the opposite
scenario while it solves Alan's scenario.

The combination (UTF-8 & 7bit-mode) doesn't make much sense, so we
might force (UTF-8 & 8bit-mode) for UTF-8 or similar for multibyte
character encodings with 8-bit bytes.  [ Note: Here, 7bit/8bit-mode
means « convert-meta on/off » and « {input,output}-meta off/on »,
respectively. ] However, on the opposite side of the single-byte
character encoding (e.g. for C), I 

Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-10 Thread Chet Ramey
On 8/10/22 10:21 AM, Chet Ramey wrote:

> I'm not going to make this much of a change at this point in the release
> process. I was willing to make the change I did because the changed
> behavior is a superset of the previous behavior.
> 
> So, assuming we say that the scenario Alan outlined is reasonable (it is),
> it looks like there are four alternatives:
> 
> 1. Do nothing; maintain the bash-5.1 behavior and force the change to the
>    user.
> 
> 2. Leave the new function in place; automatically adjust to locale
>    changes.
> 
> 3. Push it off to the application: introduce a new readline API that
>    applications can call when locale variables change. This is very cheap.
> 
> 4. Push it onto readline: instead of checking the locale and making the
>    eight-bit variables mirror it on each call, make readline check for
>    locale changes (well, LC_CTYPE) and reset the eight-bit variables only
>    if the current value doesn't match the value from the last call.
> 
> The last option is about as much of a change as I'm willing to make at
> this point.

There is a fifth option:

5. Make the locale-aware behavior dependent on a new readline option, which
   would be enabled by default.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-10 Thread Chet Ramey
On 8/9/22 4:50 PM, Koichi Murase wrote:
> 2022-08-10 (Wed) 2:07, Alan Coopersmith wrote:
>>>> Thanks for the report. The eight-bit settings are auto-set once, when
>>>> readline is first called, but I'll see if it makes sense to change them
>>>> on every call.
>>>
>>> It's fairly easy. I'll make the change for the next devel branch push and
>>> bash-5.2-rc3.
>>
>> Thanks for the quick investigation!
> 
> Does it mean custom values of these readline variables will be lost
> every time LANG or LC_{CTYPE,ALL} is changed even if a user or program
> intentionally sets them up?

It means those settings will now mirror the locale.

> We often temporarily change LANG or LC_* to perform some binary
> operations [such as counting the number of bytes of data and safely
> removing trailing x from the result of $(command;printf x)]. 

Do you often do this in interactive shells? Often enough to make a
difference? Across multiple calls to readline? And, if the change is
intended to be temporary, why would you not want the relevant
readline variables to reflect the locale when you were finished?


> Also, if these readline variables would be cleared every time, it
> seems to me that these readline variables would be effectively
> unconfigurable and would lose the point of their existence, or we
> could not touch LANG or LC_* at all after the initial setup.

It seems to me that the scenario Alan describes is much more common.


> Is it possible to make three states of the readline variables,
> `on/off/auto', and make `auto' the default, which determines the
> behavior depending on the current locale? In this case, the actual
> behavior on/off can be cached in another variable and can be updated
> on the change of LANG/LC_* when the readline variable has the value
> `auto'.

I'm not going to make this much of a change at this point in the release
process. I was willing to make the change I did because the changed
behavior is a superset of the previous behavior.

So, assuming we say that the scenario Alan outlined is reasonable (it is),
it looks like there are four alternatives:

1. Do nothing; maintain the bash-5.1 behavior and force the change to the
   user.

2. Leave the new function in place; automatically adjust to locale
   changes.

3. Push it off to the application: introduce a new readline API that
   applications can call when locale variables change. This is very cheap.

4. Push it onto readline: instead of checking the locale and making the
   eight-bit variables mirror it on each call, make readline check for
   locale changes (well, LC_CTYPE) and reset the eight-bit variables only
   if the current value doesn't match the value from the last call.

The last option is about as much of a change as I'm willing to make at
this point.
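
The logic of option 4 can be modelled at the shell level as below. This is
only an illustrative sketch, not readline's C implementation: the function
and variable names are hypothetical, the locale is approximated by the
environment variables instead of the result of setlocale(), and a counter
stands in for the actual reset of the eight-bit variables.

```shell
# Effective LC_CTYPE by the usual precedence: LC_ALL, then LC_CTYPE,
# then LANG, falling back to C.
effective_ctype() {
  printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

last_ctype=     # locale seen at the previous simulated readline() call
reset_count=0   # how many times the eight-bit settings were re-derived

# Simulates the start of one readline() call under option 4: reset only
# when the locale differs from the one seen on the previous call.
on_readline_call() {
  local now
  now=$(effective_ctype)
  if [[ $now != "$last_ctype" ]]; then
    last_ctype=$now
    reset_count=$((reset_count + 1))   # readline would reset *-meta here
  fi
}
```

In this model, a temporary `LC_ALL=C command' that begins and ends between
two calls never triggers a reset, while a lasting change to LANG does.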

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-09 Thread Koichi Murase
2022-08-10 (Wed) 2:07, Alan Coopersmith wrote:
> >> Thanks for the report. The eight-bit settings are auto-set once, when
> >> readline is first called, but I'll see if it makes sense to change them
> >> on every call.
> >
> > It's fairly easy. I'll make the change for the next devel branch push and
> > bash-5.2-rc3.
>
> Thanks for the quick investigation!

Does it mean custom values of these readline variables will be lost
every time LANG or LC_{CTYPE,ALL} is changed even if a user or program
intentionally sets them up?

We often temporarily change LANG or LC_* to perform some binary
operations [such as counting the number of bytes of data and safely
removing trailing x from the result of $(command;printf x)]. If that
starts to affect the user settings of readline variables, do we need
to save and restore these readline variables every time we touch LANG
or LC_*? This would become a serious overhead because it would
typically involve a subshell: save=$(bind -v).
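
For readers unfamiliar with the two binary operations mentioned above, they
might be written roughly as follows. This is a sketch with hypothetical
function names, not code from any particular project; both functions switch
to the C locale only for their own duration by declaring LC_ALL local.

```shell
# Print the number of bytes (not characters) in the first argument.
byte_length() {
  local LC_ALL=C          # function-local; restored automatically on return
  printf '%s\n' "${#1}"
}

# Run a command and store its exact output in REPLY, trailing newlines
# included.
capture_exact() {
  local out
  out=$("$@"; printf x)   # the sentinel x protects trailing newlines
  local LC_ALL=C          # strip the sentinel bytewise, whatever the encoding
  REPLY=${out%x}
}
```

For example, `byte_length' reports 2 for a two-byte UTF-8 character even
in a UTF-8 locale, where `${#var}' alone would report 1.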

Also, if these readline variables would be cleared every time, it
seems to me that these readline variables would be effectively
unconfigurable and would lose the point of their existence, or we
could not touch LANG or LC_* at all after the initial setup.

Is it possible to make three states of the readline variables,
`on/off/auto', and make `auto' the default, which determines the
behavior depending on the current locale? In this case, the actual
behavior on/off can be cached in another variable and can be updated
on the change of LANG/LC_* when the readline variable has the value
`auto'.

--
Koichi



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-09 Thread Alan Coopersmith

On 8/9/22 08:15, Chet Ramey wrote:
> On 8/9/22 10:45 AM, Chet Ramey wrote:
>> On 8/8/22 5:48 PM, Alan Coopersmith wrote:
>>> One of our users complained that bash-5.1 on Solaris 11.4, when started
>>> with LANG=C does not allow Unicode input after changing LANG to a UTF-8
>>> locale until bash is restarted.
>>
>> Thanks for the report. The eight-bit settings are auto-set once, when
>> readline is first called, but I'll see if it makes sense to change them
>> on every call.
>
> It's fairly easy. I'll make the change for the next devel branch push and
> bash-5.2-rc3.

Thanks for the quick investigation!

--
-Alan Coopersmith- alan.coopersm...@oracle.com
 Oracle Solaris Engineering - https://blogs.oracle.com/solaris



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-09 Thread Chet Ramey
On 8/9/22 10:45 AM, Chet Ramey wrote:
> On 8/8/22 5:48 PM, Alan Coopersmith wrote:
>> One of our users complained that bash-5.1 on Solaris 11.4, when started
>> with LANG=C does not allow Unicode input after changing LANG to a UTF-8
>> locale until bash is restarted.
> 
> Thanks for the report. The eight-bit settings are auto-set once, when
> readline is first called, but I'll see if it makes sense to change them
> on every call.

It's fairly easy. I'll make the change for the next devel branch push and
bash-5.2-rc3.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Should the readline *-meta flags reset when $LANG changes?

2022-08-09 Thread Chet Ramey
On 8/8/22 5:48 PM, Alan Coopersmith wrote:
> One of our users complained that bash-5.1 on Solaris 11.4, when started
> with LANG=C does not allow Unicode input after changing LANG to a UTF-8
> locale until bash is restarted.

Thanks for the report. The eight-bit settings are auto-set once, when
readline is first called, but I'll see if it makes sense to change them
on every call.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Should the readline *-meta flags reset when $LANG changes?

2022-08-08 Thread Alan Coopersmith

One of our users complained that bash-5.1 on Solaris 11.4, when started
with LANG=C does not allow Unicode input after changing LANG to a UTF-8
locale until bash is restarted.

I've confirmed this is the default behavior, but can be overridden by
manually changing the readline output-meta flag from off to on:

% env LANG=C bash
bash-5.1$ echo \360\237\220\233

bash-5.1$ setenv LANG en_US.UTF-8
bash: setenv: command not found
bash-5.1$ export LANG=en_US.UTF-8
bash-5.1$ echo \360\237\220\233

bash-5.1$ bash
bash-5.1$ echo 🐛

bash-5.1$ exit
exit
bash-5.1$ echo \360\237\220\233

bash-5.1$ bind 'set output-meta on'
bash-5.1$ echo 🐛


(In all cases, the bug character was pasted the same way in a GNOME terminal,
 bash just displayed it differently in the input command line.  Our user was
 actually trying it with Chinese text, not emoji, but the results were the
 same.)

The documentation specifies that for output-meta "The default is ‘off’, but
Readline will set it to ‘on’ if the locale contains eight-bit characters."
The convert-meta & input-meta options are similarly documented as locale
dependent.

But none of them say what is expected to happen when the locale changes
after initialization - is the behavior we're seeing expected or are these
variables supposed to be automatically updated when the locale changes?
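
Until that is settled, the manual override shown above could be wrapped in
a helper along these lines. This is only a sketch: both function names are
hypothetical, the UTF-8 check is a simple match on the locale name, and
setting input-meta and convert-meta alongside output-meta is an assumption
based on their being documented as locale dependent in the same way; only
output-meta was confirmed in the session above.

```shell
# Hypothetical helper: does a locale name look like a UTF-8 locale?
is_utf8_locale_name() {
  case $1 in
    *.[Uu][Tt][Ff]-8* | *.[Uu][Tt][Ff]8*) return 0 ;;
    *) return 1 ;;
  esac
}

# Switch LANG and, for UTF-8 locales, turn on eight-bit handling by hand.
switch_lang() {
  export LANG=$1
  if is_utf8_locale_name "$LANG"; then
    bind 'set input-meta on'
    bind 'set output-meta on'
    bind 'set convert-meta off'
  fi
}

# Example: switch_lang en_US.UTF-8
```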

--
-Alan Coopersmith- alan.coopersm...@oracle.com
 Oracle Solaris Engineering - https://blogs.oracle.com/solaris