Bug#441785: Umlauts break syntax highlighting
I boiled the problem down to the vimoutliner syntax definition:

  :syntax region OL2 start=+^\t[^:\t]+ end=+^\t[^:\t]+me=e-2 contains=outlTags,BT2,BT3,PT2,PT3,TA2,TA3,UT2,UT3,UB2,UB3,spellErr,SpellErrors,BadWord,OL3 keepend

The problem here is the offset for end: me=e-2. This basically means
that at level 2 (one leading tab), the match region ends on the first
character that's also at level 2 (unless it encounters a match region
not in the set specified by contains), minus 2 (the character and the
leading tab).

Vim seems to count bytes instead of characters here, though: the
syntax highlighting only breaks when a multi-byte UTF-8 character is
the first character of the heading, in which case the me=e-2 offset
somehow gets lost and the OL2 region is extended to the *next* level 2
heading. Both me=e-3 and me=e-1 work, which really does not make sense
to me.

Sven, I think this remains a bug in vim, and I don't see a way to work
around it in vimoutliner. If you want to help fix it, bring up the
issue on the vim mailing list (and CC this bug report).

--
 .''`.   martin f. krafft [EMAIL PROTECTED]
: :'  :  proud Debian developer, author, administrator, and user
`. `'`   http://people.debian.org/~madduck - http://debiansystem.info
  `-  Debian - when you have better things to do than fixing systems
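The byte-versus-character suspicion in the message above can be made concrete with a small sketch. It does not call Vim or reproduce its syntax engine; it only measures the text that the end pattern ^\t[^:\t] would match (a tab plus one character) in both units. The function name and the sample headings are hypothetical and not taken from the bug report.

```python
# Illustration only: this does not call Vim or reproduce its syntax engine.
# The end pattern ^\t[^:\t] matches a tab plus one character, so we take the
# first two characters of a heading line and measure them both ways.

def end_match_size(heading_line: str) -> tuple[int, int]:
    """Return (characters, UTF-8 bytes) in the tab plus first heading character."""
    matched = heading_line[:2]
    return len(matched), len(matched.encode("utf-8"))

# Hypothetical sample headings, not taken from the bug report.
ascii_size = end_match_size("\tHeading")
umlaut_size = end_match_size("\t\u00dcberschrift")  # tab + "Überschrift"

print("ASCII heading :", ascii_size)   # (2, 2)
print("Umlaut heading:", umlaut_size)  # (2, 3)
```

The two counts agree for the ASCII heading and diverge only for the umlaut heading. That is consistent with the observation that only headings starting with such a character break, though it does not by itself show what Vim does internally.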
Bug#441785: Umlauts break syntax highlighting
Package: vim-vimoutliner
Version: 0.3.4-8

Umlauts break syntax highlighting.

$ locale
LANG=en_US
LC_CTYPE=zh_CN.UTF8
LC_NUMERIC=en_US
LC_TIME=en_US
LC_COLLATE=en_US
LC_MONETARY=en_US
LC_MESSAGES=en_US
LC_PAPER=en_US
LC_NAME=en_US
LC_ADDRESS=en_US
LC_TELEPHONE=en_US
LC_MEASUREMENT=en_US
LC_IDENTIFICATION=en_US
LC_ALL=

$ locale -a
C
de_CH
de_CH.iso88591
de_CH.utf8
[EMAIL PROTECTED]
[EMAIL PROTECTED]
en_US
en_US.iso88591
en_US.iso885915
en_US.utf8
POSIX
ru_RU.koi8r
ru_RU.utf8
russian
zh_CN
zh_CN.gb18030
zh_CN.gb2312
zh_CN.gbk
zh_CN.utf8
zh_TW
zh_TW.big5
zh_TW.utf8
Bug#441785: Umlauts break syntax highlighting
reassign 441785 vim
retitle 441785 vim's POSIX regexp classes don't honour LC_CTYPE properly
thanks

also sprach Sven Bischof [EMAIL PROTECTED] [2007.09.11.1010 +0200]:
> Umlauts break syntax highlighting.
>
> $ locale
> LANG=en_US
> LC_CTYPE=zh_CN.UTF8

It appears to me that this is a bug in vim, which does not include a
character such as ä in the class [[:alpha:]]. With a Unicode charset,
however, [[:alpha:]] seems to be defined to include any kind of letter
from any language:

  http://www.regular-expressions.info/posixbrackets.html
  http://www.regular-expressions.info/unicode.html

See this:

  $ export LC_CTYPE=zh_CN.UTF8
  $ echo a > a
  $ echo ä > ä
  $ file a ä
  a: ASCII text
  ä: UTF-8 Unicode text
  $ grep '[[:alpha:]]' a ä
  a:a
  ä:ä
  $ vim -es +'argdo g/[[:alpha:]]' +':q!' a ä
  a

grep matches both files, but vim only prints the line from the ASCII
file. The problem is the same if I use the de_CH.UTF8 locale.

Thanks,

--
martin f. krafft [EMAIL PROTECTED]
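The difference between the grep and vim results above is the difference between a Unicode-aware and an ASCII-only notion of "letter". The following is a minimal sketch of that difference by analogy, not a reproduction of either tool. Python's re module has no [[:alpha:]] class, so the idiom [^\W\d_] stands in for it as an assumption, and the helper name is hypothetical.

```python
import re

# Stand-in for [[:alpha:]]: "a word character that is neither a digit nor
# an underscore". This is an analogy, not Vim's or grep's implementation.
LETTER_UNICODE = re.compile(r"[^\W\d_]")           # Unicode-aware (default for str)
LETTER_ASCII = re.compile(r"[^\W\d_]", re.ASCII)   # ASCII-only character classes

def has_letter(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern finds at least one letter in the text."""
    return pattern.search(text) is not None

a_umlaut = "\u00e4"  # the character ä from the shell session

unicode_results = (has_letter(LETTER_UNICODE, "a"), has_letter(LETTER_UNICODE, a_umlaut))
ascii_results = (has_letter(LETTER_ASCII, "a"), has_letter(LETTER_ASCII, a_umlaut))

print("Unicode-aware:", unicode_results)  # (True, True)
print("ASCII-only   :", ascii_results)    # (True, False)
```

The Unicode-aware pattern behaves like the grep run (both inputs match), and the ASCII-only pattern behaves like the vim run (only the plain "a" matches).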