Hi,
2014/11/6 Thu 5:11:26 UTC+9 Bram Moolenaar wrote:
> Yasuhiro Matsumoto wrote:
>
> > Bram, you seem to have removed this issue from the todo list.
> > But I think merging the patch above is better than keeping the current status.
> >
> > There are two problems.
> >
> > 1. diff.vim contains several encodings. So if DBCS is used in Vim, Vim
> > may have to handle invalid characters.
> > 2. The localized messages of svn are encoded in the system locale encoding,
> > so they do not match Vim's encoding.
> >
> > The first of those problems will be fixed with my patch.
> > To fix the second of the problems, I suggest removing syntax of
> > 'diffOnly' for multi-byte encodings.
>
> If I remember correctly, your patch breaks recognizing diff headers if
> the text does not match the current locale. E.g., when my locale is
> German and I edit a diff file generated by someone in Italy, I still
> expect the headers to be recognized.
>
> When the file's encoding differs from what Vim has detected then all
> bets are off, it will be impossible to compare the text correctly.
> Unless we have a regexp that works around it, it's probably very
> difficult.
>
> What is the error that is reported when using a DBCS encoding?
> A reproducible example is useful.

The error occurs when 'encoding' is set to cp932 on Cygwin, MSYS, or Linux.
For example:

$ vim -u NONE -N -c "set enc=cp932" -c "syntax on" -c "set ft=diff"
Error detected while processing /usr/local/share/vim/vim74/syntax/diff.vim:
line 128:
E401: Pattern delimiter not found: "^\\ ????????????????????????????????????? ??
?? ???????
E475: Invalid argument: diffNoEOL^I"^\\ ????????????????????????????????????? ??
?? ???????

It doesn't occur on Win32. Maybe it occurs only when libiconv is used.
libiconv fails to convert diff.vim from utf-8 to cp932, so Vim reads
diff.vim without converting its encoding.

The root cause of this problem is the handling of invalid characters.
The last two bytes of line 128 are 0x97 0x22 (").
0x97 can be a lead byte in cp932, but 0x22 cannot be a trail byte in cp932.
However, Vim wrongly handles the byte sequence 0x97 0x22 as one character,
so it cannot find the closing double quotation mark (0x22).

Maybe we also need to check the trail byte (not only the lead byte), but that
might be a little slow. BTW, I think enc=cp932 is a legacy setting
(especially on Cygwin/Linux), so I don't want to put effort into fixing this.
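
To illustrate the lead byte / trail byte issue, here is a rough sketch in Python. It is not Vim's actual code: the function names are hypothetical, the scanner ignores backslash escapes, and the byte ranges are the usual cp932 ones (lead 0x81-0x9F and 0xE0-0xFC, trail 0x40-0x7E and 0x80-0xFC). It only shows why a lead-byte-only check swallows the closing quote after 0x97.

```python
# Hypothetical sketch, NOT Vim's implementation: compares a lead-byte-only
# character length check with one that also validates the trail byte.

def is_cp932_lead(b):
    # Lead byte ranges of cp932 double-byte characters.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def is_cp932_trail(b):
    # Trail byte ranges of cp932 double-byte characters.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC

def char_len_lead_only(buf, i):
    # Trusts the lead byte alone, so any following byte is consumed.
    if is_cp932_lead(buf[i]) and i + 1 < len(buf):
        return 2
    return 1

def char_len_checked(buf, i):
    # Also requires a valid trail byte; otherwise treats the lead as 1 byte.
    if is_cp932_lead(buf[i]) and i + 1 < len(buf) and is_cp932_trail(buf[i + 1]):
        return 2
    return 1

def find_closing_quote(buf, start, char_len):
    # Walks character by character and returns the index of the first 0x22,
    # or -1 if none is found (simplified: no backslash handling).
    i = start
    while i < len(buf):
        if buf[i] == 0x22:
            return i
        i += char_len(buf, i)
    return -1

# Tail of line 128 in UTF-8: the last Hebrew letter is 0xD7 0x97, then '"'.
tail = b'\xd7\x97"'
lead_only_result = find_closing_quote(tail, 0, char_len_lead_only)
checked_result = find_closing_quote(tail, 0, char_len_checked)

# Same tail with a dummy ' | "' appended after the real closing quote.
padded = b'\xd7\x97" | "'
padded_result = find_closing_quote(padded, 0, char_len_lead_only)
```

With the lead-only check the quote is never found in `tail`, while the trail-checked variant finds it. With the dummy ` | "` appended, even the lead-only scan finds a later quote, which is the idea behind the first workaround.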

Instead of fixing Vim itself, I have two ideas to work around this problem:

1. Add a dummy closing quotation mark ( | ") at the end of line 128.

--- a/runtime/syntax/diff.vim
+++ b/runtime/syntax/diff.vim
@@ -125,7 +125,7 @@
 syn match diffDiffer "^הזמ הז םינוש `.*'-ו `.*' םיצבקה$"
 syn match diffBDiffer "^הזמ הז םינוש `.*'-ו `.*' םיירניב םיצבק$"
 syn match diffIsA "^.* .*-ל .* .* תוושהל ןתינ אל$"
-syn match diffNoEOL "^\\ ץבוקה ףוסב השדח-הרוש ות רסח"
+syn match diffNoEOL "^\\ ץבוקה ףוסב השדח-הרוש ות רסח" | "
 syn match diffCommon "^.*-ו .* :תוהז תויקית-תת$"
 " hr

2. Add an empty pattern "\%(\)" at the end of the pattern on line 128.

--- a/runtime/syntax/diff.vim
+++ b/runtime/syntax/diff.vim
@@ -125,7 +125,7 @@
 syn match diffDiffer "^הזמ הז םינוש `.*'-ו `.*' םיצבקה$"
 syn match diffBDiffer "^הזמ הז םינוש `.*'-ו `.*' םיירניב םיצבק$"
 syn match diffIsA "^.* .*-ל .* .* תוושהל ןתינ אל$"
-syn match diffNoEOL "^\\ ץבוקה ףוסב השדח-הרוש ות רסח"
+syn match diffNoEOL "^\\ ץבוקה ףוסב השדח-הרוש ות רסח\%(\)"
 syn match diffCommon "^.*-ו .* :תוהז תויקית-תת$"
 " hr

Regards,
Ken Takata
--
--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php