Dominique Pelle wrote:
> Here is another bug because of invalid utf-8 sequence: > > ==21329== Conditional jump or move depends on uninitialised value(s) > ==21329== at 0x811C4C7: utf_ptr2char (mbyte.c:1390) > ==21329== by 0x811C83F: utf_composinglike (mbyte.c:1498) > ==21329== by 0x811CD96: utfc_ptr2len_len (mbyte.c:1756) > ==21329== by 0x80FEAF1: msg_outtrans_len_attr (message.c:1355) > ==21329== by 0x80FE9D0: msg_outtrans_len (message.c:1292) > ==21329== by 0x80B9D7F: draw_cmdline (ex_getln.c:2641) > ==21329== by 0x80BAA99: redrawcmd (ex_getln.c:3126) > ==21329== by 0x80BAF82: nextwild (ex_getln.c:3317) > ==21329== by 0x80B6D3B: getcmdline (ex_getln.c:803) > ==21329== by 0x80B912E: getexline (ex_getln.c:2101) > ==21329== by 0x80A33FC: do_cmdline (ex_docmd.c:991) > ==21329== by 0x81280DD: nv_colon (normal.c:5181) > ==21329== by 0x8121772: normal_cmd (normal.c:1152) > ==21329== by 0x80E4E8A: main_loop (main.c:1177) > ==21329== by 0x80E49DA: main (main.c:936) > > ==21434== Conditional jump or move depends on uninitialised value(s) > ==21434== at 0x811C4C7: utf_ptr2char (mbyte.c:1390) > ==21434== by 0x811C28C: utf_ptr2cells (mbyte.c:1268) > ==21434== by 0x805D084: ptr2cells (charset.c:746) > ==21434== by 0x80FEC65: msg_outtrans_len_attr (message.c:1394) > ==21434== by 0x80FE9D0: msg_outtrans_len (message.c:1292) > ==21434== by 0x80B9D7F: draw_cmdline (ex_getln.c:2641) > ==21434== by 0x80BAA99: redrawcmd (ex_getln.c:3126) > ==21434== by 0x80BAF82: nextwild (ex_getln.c:3317) > ==21434== by 0x80B6D3B: getcmdline (ex_getln.c:803) > ==21434== by 0x80B912E: getexline (ex_getln.c:2101) > ==21434== by 0x80A33FC: do_cmdline (ex_docmd.c:991) > ==21434== by 0x81280DD: nv_colon (normal.c:5181) > ==21434== by 0x8121772: normal_cmd (normal.c:1152) > ==21434== by 0x80E4E8A: main_loop (main.c:1177) > ==21434== by 0x80E49DA: main (main.c:936) > > It's easy to reproduce: > > 1/ Create a file names with invalid utf-8 name: > > $ mkdir testcase > $ touch testcase/$(perl -e 'print chr(0xfb)') > > 2/ Start Vim in 'no compatible' mode with Valgrind: > > $ valgrind vim -u NONE -c -N 2> vg.log > > 4/ In ex mode, enter: > > :e testcase/<TAB> > > When pressing <TAB> key (file completion), vim should display > file name with invalid utf-8 sequence 'testcase/<fb>' > and observe in vg.log that valgrind reports above errors. > > The 2 error messages are actually 2 distinct bugs. > > - First error: in mbyte.c:1756, calling > UTF_COMPOSINGLIKE(p + prevlen, p + len) is unsafe, since it > can read bytes beyond size. So in theory, there is a small risk > that vim combines characters when it should not. > > Attached patch "fix1-invalid-utf8-seq.patch" fixes this first > error, by calling utf_ptr2len_len(p + len, size - len) before > UTF_COMPOSINGLIKE(...) to ensure that UTF_COMPOSINGLIKE(...) > can't access beyond size. > > - Second error: in message.c:1394, calling ptr2cells(str) is > unsafe since it can read bytes in str beyond len (which are > uninitialized). At line 1394, we know that the character is > only 1 byte long (mb_l is 1) so it should call char2cells(*str). > mb_l is 1 here because of incomplete utf-8 sequence so calling > ptr2cells(str) overflows. > > Attached patch "fix2-invalid-utf8-seq.patch" fixes it. > > After fixing these 2 bugs, valgrind no longer complains with > the above testcase. > > However, there is a 3rd bug which can be triggered with a slightly > different test case, using file <ff> instead of <fb>. The difference > is that <ff> is an invalid utf-8 sequence, whereas <fb> was an > incomplete utf-8 sequence: > > 1/ Create a file names with invalid utf-8 names: > > $ mkdir testcase > $ touch testcase/$(perl -e 'print chr(0xff)') > > 2/ Start Vim in 'no compatible' mode with Valgrind: > > $ valgrind vim -u NONE -c -N 2> vg.log > > 3/ In ex mode, enter: > > :e testcase/<TAB><TAB> > > When pressing <TAB> key (file completion), vim should display > file name with invalid utf-8 sequence 'e: testcase/<fb>' (without > valgrind errors after above patches) then when processing <TAB> > again, vim displays next file 'e: testcase/<ff>' but observe > that valgrind reports following error: > > ==22527== Conditional jump or move depends on uninitialised value(s) > ==22527== at 0x811C4CB: utf_ptr2char (mbyte.c:1390) > ==22527== by 0x811C843: utf_composinglike (mbyte.c:1498) > ==22527== by 0x811CDC8: utfc_ptr2len_len (mbyte.c:1769) > ==22527== by 0x80FEAF1: msg_outtrans_len_attr (message.c:1355) > ==22527== by 0x80FE9D0: msg_outtrans_len (message.c:1292) > ==22527== by 0x80B9D7F: draw_cmdline (ex_getln.c:2641) > ==22527== by 0x80BAA99: redrawcmd (ex_getln.c:3126) > ==22527== by 0x80BAF82: nextwild (ex_getln.c:3317) > ==22527== by 0x80B6CC6: getcmdline (ex_getln.c:790) > ==22527== by 0x80B912E: getexline (ex_getln.c:2101) > ==22527== by 0x80A33FC: do_cmdline (ex_docmd.c:991) > ==22527== by 0x81280F5: nv_colon (normal.c:5181) > ==22527== by 0x812178A: normal_cmd (normal.c:1152) > ==22527== by 0x80E4E8A: main_loop (main.c:1177) > ==22527== by 0x80E49DA: main (main.c:936) > > This bug happens because utf_ptr2char(p) does not check > whether first byte of utf-8 sequence is legal. When p[0] >= 0x80 > and utf8len_tab[p[0]] is 1, then the first byte is illegal, and > utf_ptr2char(p) should return 1 immediately, without checking p[1]. > > Attached patch fix3-invalid-utf8-seq.patch fixes this 3rd bug. > > After those 3 fixes, I no longer observe errors. All three look like good fixes to me. I'll include them. > I hope errors with invalid utf-8 sequences are not too nitpicking. > I'm not aware of any other such errors at the moment. So hopefully > no more patches like this. Depending on the system reading uninitialised memory may cause a crash. And when mixing up latin1 with utf-8 it's easy to get illegal byte sequences. So it definitely makes sense to fix all these little problems. I'm very glad you find them. -- hundred-and-one symptoms of being an internet addict: 117. You are more comfortable typing in html. /// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.Moolenaar.net \\\ /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\ \\\ download, build and distribute -- http://www.A-A-P.org /// \\\ help me help AIDS victims -- http://ICCF-Holland.org /// --~--~---------~--~----~------------~-------~--~----~ You received this message from the "vim_dev" maillist. For more information, visit http://www.vim.org/maillist.php -~----------~----~----~----~------~----~------~--~---
