Bram Moolenaar wrote:
>
> Tony Mechelynck wrote:
>
>> Vim is now capable of displaying any Unicode codepoint for which the
>> installed 'guifont' has a glyph, even outside the BMP (i.e., even above
>> U+FFFF), but there's no easy way to represent those "high" codepoints by
>> Unicode value in strings: I mean, "\uxxxx" and "\Uxxxx" still accept no
>> more than four hex digits.
>>
>> I propose to keep "\uxxxx" at its present meaning, but extend
>> "\Uxxxxxxxx" to allow additional hex digits (either up to a total of 8
>> hex digits, in line with ^VUxxxxxxxx as opposed to ^Vuxxxx in Insert
>> mode, or at least up to the value \U10FFFF, above which the Unicode
>> Consortium has decided that "there never shall be a valid Unicode
>> codepoint at any future time").
>
> It does cause problems for something like "\U12345" which would now be
> the character 0x1234 followed by the character 5.  After the change it
> would become one character 0x12345.
>
> I don't see a convenient alternative though.  Anyone?

Well, I don't know about *convenient*, but one option would be to
continue allowing \u to use 1-to-4 hex digits, and require that \U use
exactly 8 (or exactly 6, if we only support up to \U10FFFF) hex
digits.  On the one hand, it will break just about every existing
place where someone used \U instead of \u.  On the other hand, the fix
is trivial, and it gives an actual reason for supporting both \u and
\U.  I think it's better than the alternative proposed above, since
changing the definition from "1-to-4 hex digits" to "1-to-8 hex
digits" will cause things to fail in non-obvious ways, whereas changing
the definition to "exactly 8 hex digits" should usually cause a more
obvious failure that we could assign a helpful error number to.
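
To make the difference concrete, here is a rough Python sketch of the
three rules applied to the text that follows "\U".  This is not Vim's
actual parser; the function name and its parameters are made up purely
for illustration, and it assumes greedy matching of hex digits.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def decode_after_escape(text, min_digits, max_digits):
    """Hypothetical helper: decode the text following a "\\U" escape.

    Greedily consumes up to max_digits hex digits as one code point and
    treats whatever remains as literal characters.  Raises ValueError if
    fewer than min_digits hex digits are present.
    """
    n = 0
    while n < len(text) and n < max_digits and text[n] in HEX_DIGITS:
        n += 1
    if n < min_digits:
        raise ValueError(
            f"expected at least {min_digits} hex digits, found {n}")
    return [int(text[:n], 16)] + [ord(c) for c in text[n:]]

# Current rule, 1-to-4 digits: 0x1234 followed by a literal "5".
print([hex(cp) for cp in decode_after_escape("12345", 1, 4)])

# Proposed 1-to-8 digits: the same string silently changes meaning.
print([hex(cp) for cp in decode_after_escape("12345", 1, 8)])

# Exactly 8 digits: the old string fails loudly instead.
try:
    decode_after_escape("12345", 8, 8)
except ValueError as err:
    print("error:", err)

# The trivial fix is to pad to 8 digits.
print([hex(cp) for cp in decode_after_escape("00012345", 8, 8)])
```

The second call is the non-obvious failure: no error, just a different
string.  The third is the obvious one, and the last line shows the fix.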

~Matt

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
