Suggestion: Redefine \Uxxxxx in double-quoted strings

Tony Mechelynck Mon, 06 Apr 2009 11:33:56 -0700

Vim is now capable of displaying any Unicode codepoint for which the 
installed 'guifont' has a glyph, even outside the BMP (i.e., even above 
U+FFFF), but there's no easy way to represent those "high" codepoints by 
Unicode value in strings: both "\uxxxx" and "\Uxxxx" still accept no 
more than four hex digits.


I propose to keep "\uxxxx" at its present meaning, but extend 
"\Uxxxxxxxx" to allow additional hex digits (either up to a total of 8 
hex digits, in line with ^VUxxxxxxxx as opposed to ^Vuxxxx in Insert 
mode, or at least up to the value \U10FFFF, above which the Unicode 
Consortium has decided that "there never shall be a valid Unicode 
codepoint at any future time").
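
To make the proposed rule concrete, here is a rough sketch in Python (not 
Vim's actual string parser). The function name expand_escapes and the 
choice to leave an out-of-range escape untouched are assumptions made for 
illustration only; the proposal itself does not say what should happen 
above U+10FFFF.

```python
import re

MAX_CODEPOINT = 0x10FFFF  # upper limit of the Unicode codespace

# \u takes 1 to 4 hex digits (unchanged), \U takes 1 to 8 (the proposal).
ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{1,4})|U([0-9A-Fa-f]{1,8}))")


def expand_escapes(text):
    """Hypothetical helper: expand \\u and \\U escapes found in text."""
    def replace(match):
        digits = match.group(1) or match.group(2)
        value = int(digits, 16)
        if value > MAX_CODEPOINT:
            # Assumption: leave escapes beyond U+10FFFF unexpanded.
            return match.group(0)
        return chr(value)

    return ESCAPE.sub(replace, text)


# With the 8-digit rule, \U20000 is the single character U+20000.
print(len(expand_escapes(r"\U20000")))   # 1
# With the 4-digit rule, \u20000 is U+2000 followed by a literal "0".
print(len(expand_escapes(r"\u20000")))   # 2
```

The second call also shows where the incompatibility comes from: a string 
such as "\U20000", which today is read as U+2000 followed by "0", would 
become the single character U+20000 under the proposal.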

I'm aware that this is an "incompatible" change, but I believe the risk 
is low compared with the advantages (as a side note, many rare CJK 
characters lie in plane 2, in the "CJK Unified Ideographs Extension B" 
range U+20000-U+2A6DF).

The notation "\<Char-0x20000>" or "\<Char-131072>" doesn't work: here 
(in my GTK2/Gnome2 gvim with 'encoding' set to UTF-8), ":echo"ing such a 
string displays <f0><a0><80><fe>X<80><fe>X instead of just the one CJK 
character 𠀀 (and, yes, I've set my mailer to send this post as UTF-8 so 
if yours is "well-behaved" it should display that character properly).
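
For comparison, the bytes one would expect for that character can be 
computed independently of Vim. The short Python sketch below (the helper 
name utf8_bytes_display is made up for this example) prints the UTF-8 
encoding of U+20000 in the same <xx> notation.

```python
def utf8_bytes_display(codepoint):
    """Hypothetical helper: show a codepoint's UTF-8 bytes as <xx> groups."""
    encoded = chr(codepoint).encode("utf-8")
    return "".join(f"<{byte:02x}>" for byte in encoded)


# 0x20000 and 131072 are the same codepoint, written in hex and decimal.
print(utf8_bytes_display(0x20000))   # <f0><a0><80><80>
print(utf8_bytes_display(131072))    # <f0><a0><80><80>
```

The first two bytes match the ":echo" output quoted above, and each 
expected <80> shows up there as <80><fe>X. That pattern would be 
consistent with the 0x80 bytes being escaped internally and never 
converted back, though that reading is a guess and has not been checked 
against the Vim source.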


Best regards,
Tony.
-- 
Although the moon is smaller than the earth, it is farther away.

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
