On 20.11.2012 17:15, Johan Corveleyn wrote: > On Tue, Nov 20, 2012 at 4:09 PM, Stefan Sperling <s...@elego.de> wrote: >> On Tue, Nov 20, 2012 at 03:40:20PM +0100, Branko Čibej wrote: >>> On 20.11.2012 15:05, Stefan Sperling wrote: >>>> I just checked the UNIX patch code shipped with OpenBSD and it seems >>>> you're right that it only looks for the backslash and ignores the comment. >>>> >>>> However, it seems in practice patches usually contain this string in >>>> non-localised form. At least nobody has yet complained about svn patch >>>> misparsing such patches. >>> I'm distinctly remember a report on this very list, with a unidiff with >>> a localized message attached. However, I can't find it in the archives. >>> In any case it would be nice if our diff parser only looked at the \, >>> not the whole message. I'm less worried about our not localizing that >>> particular string. >> Currently, lines such as: >> \\ this is a comment >> anywhere in the patch are silently ignored. >> >> If we follow your suggestion we must treat any such lines as hunk >> terminators and assume the hunk lacks a trailing newline. >> I suppose such a change is correct but it's a behaviour change so >> I wouldn't want to backport it to 1.7.x. >> >> Below is a patch. Diff parser tests and patch tests are still passing. >> >> [[[ >> * subversion/libsvn_diff/parse-diff.c >> (parse_next_hunk): Treat any line that starts with a backslash as a >> hunk terminator, indicating that the hunk does not end with EOL. >> Comments following the backslash might be localised or missing >> in which case parsing the patch would fail. >> >> Suggested by: brane >> ]]] >> >> Index: subversion/libsvn_diff/parse-diff.c >> =================================================================== >> --- subversion/libsvn_diff/parse-diff.c (revision 1411078) >> +++ subversion/libsvn_diff/parse-diff.c (working copy) >> @@ -555,15 +555,11 @@ parse_next_hunk(svn_diff_hunk_t **hunk, >> pos = 0; >> SVN_ERR(svn_io_file_seek(apr_file, APR_CUR, &pos, iterpool)); >> >> - /* Lines starting with a backslash are comments, such as >> + /* Lines starting with a backslash indicate a missing EOL: >> * "\ No newline at end of file". */ >> if (line->data[0] == '\\') >> { >> - if (in_hunk && >> - ((!*is_property && >> - strcmp(line->data, "\\ No newline at end of file") == 0) || >> - (*is_property && >> - strcmp(line->data, "\\ No newline at end of property") == >> 0))) >> + if (in_hunk) >> { >> char eolbuf[2]; >> apr_size_t len; > I thought Branko said "leading backslash followed by a space": > > [[[ > Only the leading backslash and space are important for > signaling the no-trailing-eoln state. > ]]] > > (your argument about "\\ this is a comment" still holds though)
I'm wrong about the "and space" bit. In unidiff (and context-diff) the first column is for instruction markers; +/> for imsertions, -/< for removals, \ for no-newline-at-end, space for surrounding context. So as far as looking only at the first column is concerned, that's correct. I've frankly never heard about \ being a comment marker. However, we do know the size of the hunk -- it's signalled in the hunk header -- and the \ that signifies no-end-of-line would always appear /after/ the last line of the hunk. In that respect, it would appear that our diff parser has a bug, since it doesn't appear to handle these things correctly. N.B., the original coment in that code is wrong, too; "\ No newline at end of file" is *not* a comment, it's a processing instruction. -- Brane -- Branko Čibej Director of Subversion | WANdisco | www.wandisco.com