Dominique Pellé <[email protected]>:
> Hi
>
> I see an error with Valgrind with Vim-7.2.416 on Linux:
>
> Steps to reproduce:
>
> 1) Create a file with illegal utf-8 sequence:
>
> $ perl -e 'print chr(0)' > sample.txt
>
> 2) Start Vim with valgrind:
>
> $ valgrind --num-callers=20 --log-file=vg.log vim --noplugin -u NONE
>
> 3) Type Ex commands:
>
> :syntax on
> :au BufReadPre *.dat let &bin=1
> :e ++enc=utf-8 ++bad=keep sample.txt
>
> 4) Observe the following error in vg.log:
>
> ==6333== Invalid read of size 1
> ==6333== at 0x4026088: strlen (mc_replace_strmem.c:282)
> ==6333== by 0x8088E64: set_cmdarg (eval.c:18309)
> ==6333== by 0x80C86A5: apply_autocmds_group (fileio.c:9067)
> ==6333== by 0x80C818B: apply_autocmds_exarg (fileio.c:8706)
> ==6333== by 0x80BD19E: readfile (fileio.c:649)
> ==6333== by 0x8052E96: open_buffer (buffer.c:131)
> ==6333== by 0x80952F6: do_ecmd (ex_cmds.c:3657)
> ==6333== by 0x80AA7AD: do_exedit (ex_docmd.c:7585)
> ==6333== by 0x80AA46A: ex_edit (ex_docmd.c:7481)
> ==6333== by 0x80A312D: do_one_cmd (ex_docmd.c:2629)
> ==6333== by 0x80A0A77: do_cmdline (ex_docmd.c:1098)
> ==6333== by 0x811C867: nv_colon (normal.c:5224)
> ==6333== by 0x8116188: normal_cmd (normal.c:1188)
> ==6333== by 0x80DE26F: main_loop (main.c:1211)
> ==6333== by 0x80DDD66: main (main.c:955)
> ==6333== Address 0x56543bf is 1 bytes before a block of size 100 alloc'd
> ==6333== at 0x4024F70: malloc (vg_replace_malloc.c:236)
> ==6333== by 0x8107152: lalloc (misc2.c:868)
> ==6333== by 0x810706F: alloc (misc2.c:767)
> ==6333== by 0x80B66AC: alloc_cmdbuff (ex_getln.c:2509)
> ==6333== by 0x80B2E6B: getcmdline (ex_getln.c:229)
> ==6333== by 0x80B5E47: getexline (ex_getln.c:2124)
> ==6333== by 0x80A080F: do_cmdline (ex_docmd.c:994)
> ==6333== by 0x811C867: nv_colon (normal.c:5224)
> ==6333== by 0x8116188: normal_cmd (normal.c:1188)
> ==6333== by 0x80DE26F: main_loop (main.c:1211)
> ==6333== by 0x80DDD66: main (main.c:955)
>
> The same kind of error happens at least with:
>
> :e ++enc=utf-8 ++bad=keep sample.txt
> or :e ++enc=utf-8 ++bad=drop sample.txt
> or :e ++enc=utf-8 ++bad=x sample.txt
>
> Error happens in eval.c:18309:
>
> 18308 if (eap->bad_char != 0)
> 18309 len += (unsigned)STRLEN(eap->cmd + eap->bad_char) + 7;
>
> Adding a printf, I see:
> - eap->cmd is "e ++enc=utf-8"
> - eap->bad_char is -1 !!!
>
> eap->bad_char was previously set to -1 (BAD_KEEP) at ex_docmd.c:4776:
>
> 4773 p = eap->cmd + eap->bad_char;
> 4774
> 4775 if (STRICMP(p, "keep") == 0)
> 4776 eap->bad_char = BAD_KEEP;
> 4777 else if (STRICMP(p, "drop") == 0)
> 4778 eap->bad_char = BAD_DROP;
> 4779 else if (MB_BYTE2LEN(*p) == 1 && p[1] == NUL)
> 4780 eap->bad_char = *p;
>
> This code is odd since field 'eap->bad_char' is used for two different things:
>
> - before ex_docmd.c:4773, 'eap->bad_char' contains an index in eap->cmd
> - after line ex_docmd.c:4776, 4777 or 4780, the meaning of eap->bad_char
> changes, it then contains: BAD_KEEP, BAD_DROP or the character to
> replace bad char with).
>
> Using 'eap->bad_char' for 2 different things is dangerous. And it bites us
> at eval.c:18309 where code expects 'eap->bad_char' to still contain an
> index in 'eap->cmd', but it's already been replaced by BAD_KEEP (bug).
>
> I'm not sure how it should be fixed. Maybe field 'eap->bad_char' should
> be replaced by 2 different fields to avoid mixing 2 different things:
> - bad_char_idx ... index in eap->cmd
> - bad_char ....... char to replace bad characters with
>
> Does it look like the right way to fix it?
>
> -- Dominique
Hi
Attached patch fixes the Valgrind errors in Vim-7.2.416 described
in above email.
After re-reading my previous email, I noticed that bug
happens when loading any kind of file with...
:e ++enc=utf-8 ++bad=keep filename
... with or without invalid utf-8 sequences, and that the sample,txt
file I gave in above email was in fact a valid utf-8 file.
This is how I checked that Vim works after patch:
1) create a file with invalid utf-8 char
$ perl -e 'print chr(254); print chr(255)' > invalid-utf-8.txt
2) Try to load the file with:
:e ++enc=utf-8 ++bad=keep invalid-utf-8.txt
-> should display <fe><ff>
:e ++enc=utf-8 ++bad=X invalid-utf-8.txt
-> should display XX
:e ++enc-utf-8 ++bad=drop invalid-utf-8.txt
-> should display nothing
Cheers
-- Dominique
--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
diff -r e9c9f37be306 src/eval.c
--- a/src/eval.c Fri May 07 16:54:37 2010 +0200
+++ b/src/eval.c Wed May 12 19:17:33 2010 +0200
@@ -18305,8 +18305,8 @@
# ifdef FEAT_MBYTE
if (eap->force_enc != 0)
len += (unsigned)STRLEN(eap->cmd + eap->force_enc) + 7;
- if (eap->bad_char != 0)
- len += (unsigned)STRLEN(eap->cmd + eap->bad_char) + 7;
+ if (eap->bad_char_idx != 0)
+ len += (unsigned)STRLEN(eap->cmd + eap->bad_char_idx) + 7;
# endif
newval = alloc(len + 1);
@@ -18330,9 +18330,9 @@
if (eap->force_enc != 0)
sprintf((char *)newval + STRLEN(newval), " ++enc=%s",
eap->cmd + eap->force_enc);
- if (eap->bad_char != 0)
+ if (eap->bad_char_idx != 0)
sprintf((char *)newval + STRLEN(newval), " ++bad=%s",
- eap->cmd + eap->bad_char);
+ eap->cmd + eap->bad_char_idx);
# endif
vimvars[VV_CMDARG].vv_str = newval;
return oldval;
diff -r e9c9f37be306 src/ex_cmds.h
--- a/src/ex_cmds.h Fri May 07 16:54:37 2010 +0200
+++ b/src/ex_cmds.h Wed May 12 19:17:33 2010 +0200
@@ -1152,7 +1152,8 @@
int force_ff; /* ++ff= argument (index in cmd[]) */
#ifdef FEAT_MBYTE
int force_enc; /* ++enc= argument (index in cmd[]) */
- int bad_char; /* ++bad= argument (index in cmd[]) */
+ int bad_char_idx; /* ++bad= argument (index in cmd[]) */
+ int bad_char; /* behavior or char to replace bad char */
#endif
#ifdef FEAT_USR_CMDS
int useridx; /* user command index */
diff -r e9c9f37be306 src/ex_docmd.c
--- a/src/ex_docmd.c Fri May 07 16:54:37 2010 +0200
+++ b/src/ex_docmd.c Wed May 12 19:17:33 2010 +0200
@@ -4739,7 +4739,7 @@
else if (STRNCMP(arg, "bad", 3) == 0)
{
arg += 3;
- pp = &eap->bad_char;
+ pp = &eap->bad_char_idx;
}
#endif
@@ -4770,7 +4770,7 @@
{
/* Check ++bad= argument. Must be a single-byte character, "keep" or
* "drop". */
- p = eap->cmd + eap->bad_char;
+ p = eap->cmd + eap->bad_char_idx;
if (STRICMP(p, "keep") == 0)
eap->bad_char = BAD_KEEP;
else if (STRICMP(p, "drop") == 0)