Hello Ingo,

On 01/17/16 01:10, Ingo Schwarze wrote:
> Hi,
>
> the last few days i basically dug in to focus on understanding the vi
> editing mode in ksh(1).  Given the quick partial success with the emacs
> mode, i hoped that the usual 20/80 rule might apply.  However, the code
> implementing vi mode seems substantially more contorted to me than the
> code implementing emacs mode, so patches turn out to be slightly less
> pretty.
>
> Here is what might be 5% of the work giving 50% of the final usefulness.
>
> This patch (-79 +145) includes three parts:
>
>   1. Adding comments such that people reading the patch can understand
>      what is going on (that accounts for 29 of the added lines).
>
>   2. Cleanup by deleting the unused "lastref" global variable.
>
>   3. Implement insertion of UTF-8 characters and basic UTF-8 support
>      for the following commands: a h i I l x X ^H
>      The following commands are out of scope for this patch:
>      b B e E r R w W ^W
>
> If you consider that useful, i can of course split the patch and
> propose the three parts separately.  But i think this is an exception
> where it's easier to review as a whole because the comments are
> added exactly at the places where you need them to understand the
> changes.

I find this patch very useful, since I run ksh in vi mode with UTF-8
support enabled, but I see a quirk.  When this patch is applied and ksh
runs in an environment without a UTF-8 LC_ALL or LC_CTYPE, isu8cont
causes the reverse of the problem that the current ksh has with UTF-8
filenames in a UTF-8 locale.  The screen is redrawn badly and it becomes
impossible to remove the UTF-8 characters properly, whereas in the old
situation they could be removed byte-wise in POSIX mode.
Example (caret is cursor):
$ cd ./Muziek/Motörhead/
                         ^
<move to the far left with h>
$ cd ./Muziek/Motörhead/
   ^
<move back to the far right with l>
$ ccd ./Muziek/Motörhead
                         ^
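
One way to read the example above: if cursor motion steps over whole
UTF-8 characters by skipping continuation bytes, while a terminal that
is not in UTF-8 mode shows every byte in its own column, the two counts
drift apart.  The following is a minimal sketch of that mismatch, not
code from the patch.  The helper names are hypothetical, and the
continuation-byte test is only assumed to match what isu8cont does.

```c
#include <stddef.h>
#include <string.h>

/* Assumed test: true for UTF-8 continuation bytes (bit pattern 10xxxxxx). */
int
is_u8_cont(unsigned char c)
{
	return (c & 0xC0) == 0x80;
}

/* Hypothetical helper: count characters by skipping continuation bytes. */
size_t
count_chars(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (!is_u8_cont((unsigned char)*s))
			n++;
	return n;
}
```

For "Motörhead" encoded as UTF-8, count_chars() yields 9 while strlen()
yields 10, so an editor counting 9 positions and a byte-wise display
showing 10 columns disagree by one for each two-byte character.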

Since (afaik) UTF-8 isn't supported on the console, this could become annoying in some situations.

I reckon it would be cleaner to do it like ls, where the character is first pulled through mbtowc and, based on its return value, either printed or presented as a ? per byte, or at the very least treated as a single byte.
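
A minimal sketch of that idea, assuming a simple string-to-string
helper: each position is passed through mbtowc(3), and any byte that
does not start a valid character in the current locale is replaced by
a single ?.  The function name and buffer interface are hypothetical.
This is neither the ls code nor a proposal for the ksh line editor
itself, and it leaves out the printability and width checks a real
implementation would also need.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy src to dst, replacing every byte that
 * mbtowc(3) rejects in the current locale with '?'.
 * dst must have room for at least strlen(src) + 1 bytes.
 */
void
sanitize_for_locale(const char *src, char *dst)
{
	size_t left = strlen(src);
	wchar_t wc;
	int len;

	(void)mbtowc(NULL, NULL, 0);	/* reset the conversion state */
	while (left > 0) {
		len = mbtowc(&wc, src, left);
		if (len <= 0) {
			/* Not valid here: emit one '?' for this byte. */
			(void)mbtowc(NULL, NULL, 0);
			*dst++ = '?';
			src++;
			left--;
			continue;
		}
		/* Valid character: copy all of its bytes unchanged. */
		memcpy(dst, src, (size_t)len);
		dst += len;
		src += len;
		left -= (size_t)len;
	}
	*dst = '\0';
}
```

Because a rejected byte is replaced one for one, the output always has
the same byte length as the input, so byte-wise deletion keeps working.
Which bytes are rejected depends on the locale and on the C library.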

Sincerely,

Martijn van Duren
