2016-01-03 18:38 GMT+03:00 Tony Mechelynck <[email protected]>:
> On Sun, Jan 3, 2016 at 3:53 PM, Nikolay Aleksandrovich Pavlov
> <[email protected]> wrote:
>
> Unlike in some other languages, in VimL a range has always meant a range
> of code points in the current encoding. So in UTF-8 [a-zA-Z] is literally
> “from U+0061 to U+007A (inclusive) or from U+0041 to U+005A (inclusive)”,
> and that does not include characters like ä: Vim regexes never had proper
> Unicode support, and locale support is rather limited and only considers
> the locale encoding (actually, &encoding and not the locale encoding, but
> unless you specify otherwise in documentation one is derived from the
> other) AFAIK. I do not think this is ever going to be fixed, because
> making character ranges locale-dependent changes their semantics
> significantly: where previously a plugin author could expect [a-zA-Z] to
> match all Latin ASCII letters, with such a change this would no longer be
> the case. E.g. in Perl the correct representation of [a-zA-Z] in UTF-8
> regex mode is something like `(?:(?=\pL)\p{Block=Basic_Latin})`: not
> something one wants to write constantly.
>
>
> Indeed. In Vim, when 'encoding' is UTF-8, or indeed most or all of the
> ISO-8859 encodings (and many others), [a-zA-Z] is just a short way of
> writing [abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ] and
> nothing else.
>
> The one possible exception is when running on a machine with EBCDIC
> encoding (such as an IBM mainframe running not Linux but z/OS as its
> operating system), and keeping that encoding as 'encoding' within Vim,

You can’t set &encoding to anything that is not ASCII-compatible unless you compiled Vim with -DEBCDIC. -DEBCDIC is AFAIK z/OS-specific, and I do not remember any complaints from z/OS users recently: if you [search][1] for z/OS in vim-dev, the last relevant [topic][2] (i.e. one where the z/OS string is not just part of the “context” in a patch) is patch 7.3.555 from 13.06.2012, more than three years old. It is very unlikely that nothing has broken the build since then.

[1]: https://groups.google.com/forum/#!searchin/vim_dev/z$2FOS%7Csort:date
[2]: https://groups.google.com/forum/#!searchin/vim_dev/z$2FOS|sort:date/vim_dev/cYvmgP0v9ag/TT_XSWL7yHUJ

> which IIUC is definitely not recommended. In that case, IIUC, to keep
> the 52-letter alphabet shown above one would have to abbreviate it no
> more than [a-ij-rs-zA-IJ-RS-Z] because the EBCDIC alphabet is
> discontinuous: there are punctuation marks between i and j, and
> between r and s, and the same in uppercase. But maybe I don't UC and
> Vim interprets [a-zA-Z] the usual way there in order to avoid
> surprising "tourists" from the papertape universe, even when working
> in the punched-card universe. ;-)
>
> Best regards,
> Tony.
>
> --
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups
> "vim_dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
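
To make the point about the discontinuous EBCDIC alphabet concrete, here is a rough sketch in Python rather than VimL. It assumes Python's `cp037` codec as a stand-in for "EBCDIC" (one code page among several), and the helper names `byte_range` and `non_ascii_letters` are made up for this illustration. It only shows what a naive "from the code of a to the code of z" range would cover; it says nothing about what Vim built with -DEBCDIC actually does.

```python
import string


def byte_range(first, last, encoding):
    """Decode every byte value from `first` to `last` (inclusive).

    Only meaningful for single-byte encodings such as latin-1 or cp037.
    """
    lo = first.encode(encoding)[0]
    hi = last.encode(encoding)[0]
    return bytes(range(lo, hi + 1)).decode(encoding)


def non_ascii_letters(first, last, encoding):
    """Characters inside the byte range that are not one of the 52 ASCII letters."""
    return [c for c in byte_range(first, last, encoding)
            if c not in string.ascii_letters]


if __name__ == "__main__":
    # latin-1 is ASCII-compatible; cp037 is an assumed example of EBCDIC.
    for enc in ("latin-1", "cp037"):
        extra = non_ascii_letters("a", "z", enc) + non_ascii_letters("A", "Z", enc)
        print(enc, "- characters in a-z/A-Z that are not ASCII letters:", len(extra))

    # The three contiguous runs behind [a-ij-rs-z].
    for first, last in (("a", "i"), ("j", "r"), ("s", "z")):
        print(first + "-" + last, "in cp037:", byte_range(first, last, "cp037"))
```

In an ASCII-compatible encoding the range is exactly the 26 letters, while in this code page the a to z byte range spans more positions than there are letters, which is why the range would have to be split at i/j and r/s.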
