Matt Wozniski wrote:

> Bram Moolenaar wrote:
> >
> > Tony Mechelynck wrote:
> >
> >> Vim is now capable of displaying any Unicode codepoint for which the
> >> installed 'guifont' has a glyph, even outside the BMP (i.e., even above
> >> U+FFFF), but there's no easy way to represent those "high" codepoints by
> >> Unicode value in strings: I mean, "\uxxxx" and "\Uxxxx" still accept no
> >> more than four hex digits.
> >>
> >> I propose to keep "\uxxxx" at its present meaning, but extend
> >> "\Uxxxxxxxx" to allow additional hex digits (either up to a total of 8
> >> hex digits, in line with ^VUxxxxxxxx as opposed to ^Vuxxxx in Insert
> >> mode, or at least up to the value \U10FFFF, above which the Unicode
> >> Consortium has decided that "there never shall be a valid Unicode
> >> codepoint at any future time").
> >
> > It does cause problems for something like "\U12345" which would now be
> > the character 0x1234 followed by the character 5.  After the change it
> > would become one character 0x12345.
> >
> > I don't see a convenient alternative though.  Anyone?
> 
> Well, I don't know about *convenient*, but one option would be to
> continue allowing \u to use 1-to-4 hex digits, and require that \U use
> exactly 8 (or exactly 6, if we only support up to \U10FFFF) hex
> digits.  On the one hand, it will break just about every existing
> place where someone used \U instead of \u.  On the other hand, the fix
> is trivial, and it gives an actual reason for supporting both \u and
> \U.  I think it's better than the alternative you propose, since
> changing the definition from "1-to-4 hex digits" to "1-to-8 hex
> digits" will cause things to fail in non-obvious ways, and changing
> the definition to "exactly 8 hex digits" should usually cause a more
> obvious failure that we could assign a helpful error number to.

Requiring exactly 8 hex digits helps with the incompatibility.  However,
Unicode characters need at most 6 hex digits, so one has to type at
least two extra leading zeros.  And it's easy to type the wrong number
of digits with such a long sequence.
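
To make the trade-off concrete, here is a minimal sketch in Python.  It
is not Vim's string parser; expand_U and its digit-count parameters are
hypothetical names used only to compare the three rules discussed in
this thread on the input "\U12345".

```python
import re

# Hypothetical helper: expand \U escapes that take between min_digits
# and max_digits hex digits.  Extra hex digits stay as literal text.
def expand_U(text, min_digits, max_digits):
    def replace(match):
        run = match.group(1)
        digits, rest = run[:max_digits], run[max_digits:]
        if len(digits) < min_digits:
            # The "obvious failure" case: too few digits is an error.
            raise ValueError("bad \\U escape: expected %d to %d hex digits"
                             % (min_digits, max_digits))
        return chr(int(digits, 16)) + rest
    return re.sub(r"\\U([0-9A-Fa-f]*)", replace, text)

sample = r"\U12345"

# Current rule, 1 to 4 digits: U+1234 followed by a literal "5".
current = expand_U(sample, 1, 4)

# Extended rule, 1 to 8 digits: silently becomes the single U+12345.
extended = expand_U(sample, 1, 8)

# Strict rule, exactly 8 digits: the old spelling fails loudly.
try:
    expand_U(sample, 8, 8)
    strict_failed = False
except ValueError:
    strict_failed = True
```

Under these assumptions the first two rules both accept the string but
disagree on the result, while the strict rule rejects it.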

The other suggestion about Perl gives me this idea: "\x(123456)".
This has two advantages:
1. It's backwards compatible.
2. Avoids accidentally typing the wrong number of hex digits.
3. Allows typing a hex digit next as a separate character.

Eh, _three_ advantages.

I think Perl uses "\x{123456}", but () is easier to type than {},
especially on some keyboards.  I don't see a reason to use {}.
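
A rough sketch of how the delimited form could be read, again in Python
rather than Vim's own code.  expand_delimited is a hypothetical name,
and the sketch assumes any number of hex digits is allowed inside the
parentheses.

```python
import re

# Hypothetical helper: expand only the proposed \x(HEX) form.  The
# closing parenthesis ends the number, so no digit count is needed and
# a hex digit may follow directly as a separate character.
def expand_delimited(text):
    def replace(match):
        return chr(int(match.group(1), 16))
    return re.sub(r"\\x\(([0-9A-Fa-f]+)\)", replace, text)

# U+1234 followed by a literal "5": two characters.
two_chars = expand_delimited(r"\x(1234)5")

# The single character U+12345.
one_char = expand_delimited(r"\x(12345)")

# Leading zeros are optional, so short values stay short.
letter_a = expand_delimited(r"\x(41)")
```

Both readings of the ambiguous "\U12345" case can be spelled out
explicitly this way, which is the point of advantage 3.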

-- 
Not too long ago, unzipping in public was illegal...

 /// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
