Re: [PATCH] Better error recovery for merge-conflict markers (v4)
On Wed, 2015-12-09 at 18:44 +0100, Bernd Schmidt wrote: > On 12/09/2015 05:58 PM, David Malcolm wrote: > > On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote: > >> > >> This seems like fairly low impact but also low cost, so I'm fine with it > >> in principle. I wonder whether the length of the marker is the same > >> across all versions of patch (and VC tools)? > > > > It's hardcoded for GNU patch: > [...] > >From what I can tell, Perforce is the outlier here. > > Thanks for checking all that. > > >> Just thinking out loud - I guess it would be too much to hope for to > >> share lexers between frontends so that we need only one copy of this? > > > > Probably :( > > Someone slap sense into me, I just thought of deriving C and C++ parsers > from a common base class... (no this is not a suggestion for this patch). > > > Would a better wording be: > > > > extern short some_var; /* This line would lead to a warning due to the > >duplicate name, but it is skipped when handling > >the conflict marker. */ > > I think so, yes. > > > That said, it's not clear they're always at the beginning of a line; > > this bazaar bug indicates that CVS (and bazaar) can emit them > > mid-line: > >https://bugs.launchpad.net/bzr/+bug/36399 > > Ok. CVS I think we shouldn't worry about, and it looks like this is one > particular bug/corner case where the conflict end marker is the last > thing in the file. I think on the whole it's best to check for beginning > of the line as you've done. > > > Wording-wise, should it be "merge conflict marker", rather > > than "patch conflict marker"? > > > > Clang spells it: > > "error: version control conflict marker in file" > > http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts > > Yeah, if another compiler has a similar/identical diagnostic I think we > should just copy that unless there's a very good reason not to. > > > Rebased on top of r231445 (from yesterday). > > Successfully bootstrapped on x86_64-pc-linux-gnu. > > Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum. > > > > OK for trunk? > > I'm inclined to say yes since it was originally submitted in time and > it's hard to imagine how the change could be risky (I'll point out right > away that there are one or two other patches in the queue that were also > submitted in time which I feel should not be considered for gcc-6 at > this point due to risk). > > Let's wait until the end of the week for objections, commit then. Thanks. I updated it based on the feedback above, including changing the wording to match clang's: "error: version control conflict marker in file" I replaced "patch conflict marker" with "conflict marker" in the code and names of test cases. I took the liberty of adding: gcc_assert (n > 0); to c_parser_peek_nth_token based on Martin's feedback. Having verified bootstrap (on x86_64-pc-linux-gnu), I've committed it to trunk as r231712. I'm attaching what I committed, for reference. Index: gcc/c-family/ChangeLog === --- gcc/c-family/ChangeLog (revision 231711) +++ gcc/c-family/ChangeLog (revision 231712) @@ -1,3 +1,8 @@ +2015-12-16 David Malcolm+ + * c-common.h (conflict_marker_get_final_tok_kind): New prototype. + * c-lex.c (conflict_marker_get_final_tok_kind): New function. + 2015-12-15 Ilya Verbin * c-common.c (c_common_attribute_table): Handle "omp declare target Index: gcc/c-family/c-lex.c === --- gcc/c-family/c-lex.c (revision 231711) +++ gcc/c-family/c-lex.c (revision 231712) @@ -1263,3 +1263,29 @@ return value; } + +/* Helper function for c_parser_peek_conflict_marker + and cp_lexer_peek_conflict_marker. + Given a possible conflict marker token of kind TOK1_KIND + consisting of a pair of characters, get the token kind for the + standalone final character. */ + +enum cpp_ttype +conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind) +{ + switch (tok1_kind) +{ +default: gcc_unreachable (); +case CPP_LSHIFT: + /* "<<" and '<' */ + return CPP_LESS; + +case CPP_EQ_EQ: + /* "==" and '=' */ + return CPP_EQ; + +case CPP_RSHIFT: + /* ">>" and '>' */ + return CPP_GREATER; +} +} Index: gcc/c-family/c-common.h === --- gcc/c-family/c-common.h (revision 231711) +++ gcc/c-family/c-common.h (revision 231712) @@ -1089,6 +1089,10 @@ extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *); extern tree c_build_bind_expr (location_t, tree, tree); +/* In c-lex.c. */ +extern enum cpp_ttype +conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind); + /* In c-pch.c */ extern void pch_init (void); extern void pch_cpp_save_state (void); Index:
[PATCH] Better error recovery for merge-conflict markers (v4)
On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote: > On 10/30/2015 04:16 PM, David Malcolm wrote: > > The idea is to more gracefully handle merger conflict markers > > in the source code being compiled. Specifically, in the C and > > C++ frontends, if we're about to emit an error, see if the > > source code is at a merger conflict marker, and if so, emit > > a more specific message, so that the user gets this: > > > > foo.c:1:1: error: source file contains patch conflict marker > > <<< HEAD > > ^ > > > > It's something of a "fit and finish" cosmetic item, but these > > things add up. > > This seems like fairly low impact but also low cost, so I'm fine with it > in principle. I wonder whether the length of the marker is the same > across all versions of patch (and VC tools)? It's hardcoded for GNU patch: http://git.savannah.gnu.org/cgit/patch.git/tree/src/merge.c which hardcodes e.g.: fputs (outstate->after_newline + "\n<<<\n", fp); I don't know if it's hardcoded for CVS or Subversion, but both have documentation showing that format: ftp://ftp.gnu.org/old-gnu/Manuals/cvs/html_node/cvs_38.html http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.resolve It's the default of git: http://git.kernel.org/cgit/git/git.git/tree/Documentation/merge-config.txt (config option "merge.conflictStyle") This git commit (to git) seems to have generalized it to support a "conflict-marker-size" attribute: https://github.com/git/git/commit/8588567c96490b8d236b1bc13f9bcb0dfa118efe Mercurial uses them; the format appears to be a keyword-argument in: https://selenic.com/hg/file/tip/mercurial/simplemerge.py#l91 but it's hardcoded in this regex in filemerge.py: if re.search("^(<<< .*|===|>>> .*)$", fcd.data(), Bazaar uses them; see e.g.: http://bazaar.launchpad.net/~bzr-pqm/bzr/bzr.dev/view/head:/bzrlib/tests/test_merge3.py (I couldn't easily tell if they're configurable) FWIW, Perforce appears to use a different format; http://www.perforce.com/perforce/doc.current/manuals/p4guide/chapter.resolve.html has an example showing: ORIGINAL file#n (text from the original version) THEIR file#m (text from their file) YOURS file (text from your file) >From what I can tell, Perforce is the outlier here. > > +static bool > > +c_parser_peek_conflict_marker (c_parser *parser, enum cpp_ttype tok1_kind) > > +{ > > + c_token *token2 = c_parser_peek_2nd_token (parser); > > + if (token2->type != tok1_kind) > > +return false; > > + c_token *token3 = c_parser_peek_nth_token (parser, 3); > > + if (token3->type != tok1_kind) > > +return false; > > + c_token *token4 = c_parser_peek_nth_token (parser, 4); > > + if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind)) > > +return false; > > + return true; > > +} > > Just thinking out loud - I guess it would be too much to hope for to > share lexers between frontends so that we need only one copy of this? Probably :( > > +extern short some_var; /* this line would lead to a warning */ > > Would or does? I don't see anything suppressing it? It's skipped in error-handling. c_parser_declaration_or_fndef has: 1794 declarator = c_parser_declarator (parser, 1795specs->typespec_kind != ctsk_none, 1796C_DTR_NORMAL, ); 1797 if (declarator == NULL) 1798{ [...snip...] 1807 c_parser_skip_to_end_of_block_or_statement (parser); The call to c_parser_declarator fails, and when issuing: 3465 c_parser_error (parser, "expected identifier or %<(%>"); we emit the "conflict marker" wording error for the error. Then at line 1807 we skip, discarding everything up to the ";" in that decl. Would a better wording be: extern short some_var; /* This line would lead to a warning due to the duplicate name, but it is skipped when handling the conflict marker. */ > There seems to be no testcase verifying what happens if the marker is > not at the start of the line (IMO it should not be interpreted as a marker). The v3 patch actually reported them as markers regardless of location. The v4 patch now verifies that they are at the start of the line; I've added test coverage for this (patch-conflict-markers-11.c). That said, it's not clear they're always at the beginning of a line; this bazaar bug indicates that CVS (and bazaar) can emit them mid-line: https://bugs.launchpad.net/bzr/+bug/36399 I noticed a visual glitch with the v3 patch now that we have range information for tokens: with caret-printing, we get just the first token within the marker underlined: <<< HEAD ^~ which looks strange (especially with the underlined chars colorized). Hence in the v4 patch I've added a location tweak so that it underline/colorizes *all* of the marker: <<< HEAD ^~~ Wording-wise, should it be
Re: [PATCH] Better error recovery for merge-conflict markers (v4)
On 12/09/2015 05:58 PM, David Malcolm wrote: On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote: This seems like fairly low impact but also low cost, so I'm fine with it in principle. I wonder whether the length of the marker is the same across all versions of patch (and VC tools)? It's hardcoded for GNU patch: [...] >From what I can tell, Perforce is the outlier here. Thanks for checking all that. Just thinking out loud - I guess it would be too much to hope for to share lexers between frontends so that we need only one copy of this? Probably :( Someone slap sense into me, I just thought of deriving C and C++ parsers from a common base class... (no this is not a suggestion for this patch). Would a better wording be: extern short some_var; /* This line would lead to a warning due to the duplicate name, but it is skipped when handling the conflict marker. */ I think so, yes. That said, it's not clear they're always at the beginning of a line; this bazaar bug indicates that CVS (and bazaar) can emit them mid-line: https://bugs.launchpad.net/bzr/+bug/36399 Ok. CVS I think we shouldn't worry about, and it looks like this is one particular bug/corner case where the conflict end marker is the last thing in the file. I think on the whole it's best to check for beginning of the line as you've done. Wording-wise, should it be "merge conflict marker", rather than "patch conflict marker"? Clang spells it: "error: version control conflict marker in file" http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts Yeah, if another compiler has a similar/identical diagnostic I think we should just copy that unless there's a very good reason not to. Rebased on top of r231445 (from yesterday). Successfully bootstrapped on x86_64-pc-linux-gnu. Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum. OK for trunk? I'm inclined to say yes since it was originally submitted in time and it's hard to imagine how the change could be risky (I'll point out right away that there are one or two other patches in the queue that were also submitted in time which I feel should not be considered for gcc-6 at this point due to risk). Let's wait until the end of the week for objections, commit then. Bernd
Re: [PATCH] Better error recovery for merge-conflict markers (v4)
On 12/09/2015 10:44 AM, Bernd Schmidt wrote: Just thinking out loud - I guess it would be too much to hope for to share lexers between frontends so that we need only one copy of this? Probably :( Someone slap sense into me, I just thought of deriving C and C++ parsers from a common base class... (no this is not a suggestion for this patch). It'd be nice. But I don't even think it's on anyone's TODO list. Wording-wise, should it be "merge conflict marker", rather than "patch conflict marker"? Clang spells it: "error: version control conflict marker in file" http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts Yeah, if another compiler has a similar/identical diagnostic I think we should just copy that unless there's a very good reason not to. Agreed. Rebased on top of r231445 (from yesterday). Successfully bootstrapped on x86_64-pc-linux-gnu. Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum. OK for trunk? I'm inclined to say yes since it was originally submitted in time and it's hard to imagine how the change could be risky (I'll point out right away that there are one or two other patches in the queue that were also submitted in time which I feel should not be considered for gcc-6 at this point due to risk). Let's wait until the end of the week for objections, commit then. As the person who has come the closest to objecting, I'll go on the record as "not objecting". jeff