Hi Christian,

Christian Weisgerber wrote on Mon, Mar 07, 2016 at 03:51:41PM +0000:
> On 2016-03-07, Ingo Schwarze <schwa...@usta.de> wrote:

>> Consequently, in the interest of safe and sane defaults, i propose
>> switching our xterm(1) to enable UTF-8 mode by default.

> Seconded.
 
>> The best place to switch is in the setup function VTInitialize_locale()
>> that decides whether to enable UTF-8 mode and which supporting flags
>> to set, by pretending to it that CODESET is always UTF-8, but without
>> interfering with the actual value of the CODESET and without changing
>> the utility function xtermEnvUTF8().

> Hmm, maybe you are overthinking this.
> Other defaults that we set differently from upstream are simply
> resource changes to XTerm.ad (/usr/X11R6/share/X11/app-defaults/XTerm).

Heh.  I considered simply changing the resource defaults, but came
to the wrong conclusion that there wouldn't be a way to achieve the
desired effect.  Thanks for bringing it up again; that made me
re-check, and it turns out there *is* a way that is quite
straightforward, minimally intrusive, very robust, and doesn't get
in the way of explicit user configuration: See the patch below.
If this gets OKs, let's forget my previous, more intrusive patch.

With that change, users can obviously still set *locale to other
values (for example, "true" or "false" come to mind), and the command
line options changing *locale (-lc +lc -en) still work.  Looking
at the code, explicitly setting *utf8 to false (or equivalently,
+u8 on the command line) also overrides this.
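
For illustration, a user who wants the upstream behaviour back could
override the new default in a personal resource file.  This is only a
sketch: the file name ~/.Xresources and loading it with xrdb(1) are
assumptions about the user's setup and are not part of the patch.

```
! Hypothetical user override, e.g. in ~/.Xresources:
! restore the upstream default instead of the proposed "UTF-8".
XTerm*locale: medium
```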

Spending a day reading xterm source code wasn't wasted, though -
by reading the documentation only, i wouldn't have understood
that this way works as intended and is safe.

OK?
  Ingo


> ----
> PS:
 
>>   printf "\303\237\n"   # thanks to sobrado@ for the striking example
>> Now your local terminal hangs until you force a reset using the
>> menus of the xterm program.

> \237 is 0x9F, equivalent to ESC _, which is APC (Application Program
> Command).  That appears in a table, but is not explained in the
> VT220 manual.  The VT420 manual says: "The VT420 ignores all following
> characters until it receives a SUB, ST, or any other C1 control
> character."

Yes.  I spent so much time reading terminal control code documentation
lately that i probably assumed this to be widely known.  ;-)
You are right, explaining it is helpful.
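
For the record, here is a quick way to check both points from the
shell.  It is merely a sketch, and the variable names are made up for
the illustration.  The two bytes are the UTF-8 encoding of U+00DF (the
sharp s), so in UTF-8 mode they are one printable character; without
UTF-8 mode, the second byte arrives as the C1 control APC.

```shell
# Dump the two bytes sent by the printf example in hex.
bytes=$(printf '\303\237' | od -An -tx1 | tr -d ' \n')
echo "$bytes"

# 0x9F is the 8-bit C1 form of APC.  The 7-bit form is ESC followed by
# the byte that is 0x40 lower, that is 0x5F, the underscore.
final=$(printf '%x' $((0x9f - 0x40)))
echo "$final"
```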


Index: XTerm.ad
===================================================================
RCS file: /cvs/xenocara/app/xterm/XTerm.ad,v
retrieving revision 1.15
diff -u -p -r1.15 XTerm.ad
--- XTerm.ad    26 Aug 2013 20:06:10 -0000      1.15
+++ XTerm.ad    7 Mar 2016 22:54:44 -0000
@@ -259,6 +259,11 @@
 
 ! OpenBSD local modifications
 
+! Enable UTF-8 mode since OpenBSD does not support any other multibyte
+! locales.  Even for people using the C/POSIX locale for everything,
+! that's safer and more usable than the upstream default of "medium".
+*locale: UTF-8
+
 ! ScrollBar by default
 *scrollBar: true
 
