Re: [RFC] Allow %expect and %expect annotations on rules

Akim Demaille Wed, 07 Nov 2018 13:40:59 -0800

I have rebased Paul’s original proposal, and pushed in ‘expect’ branch.


commit f4d7d30ece4a0c7c9d5911d972b93c1544155dd6
Author: Paul Hilfinger <[email protected]>
Date:   Tue Feb 26 16:28:36 2013 -0800

    allow %expect and %expect-rr modifiers on individual rules
    
    This change allows one to document (and check) which rules participate
    in shift/reduce and reduce/reduce conflicts.  This is particularly
    important GLR parsers, where conflicts are a normal occurrence.  For
    example,
    
        %glr-parser
        %expect 1
        %%
    
        ...
    
        argument_list:
          arguments %expect 1
        | arguments ','
        | %empty
        ;
    
        arguments:
          expression
        | argument_list ',' expression
        ;
    
        ...
    
    Looking at the output from -v, one can see that the shift-reduce
    conflict here is due to the fact that the parser does not know whether
    to reduce arguments to argument_list until it sees the token AFTER the
    following ','.  By marking the rule with %expect 1 (because there is a
    conflict in one state), we document the source of the 1 overall shift-
    reduce conflict.
    
    In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
    conflicts.  In this case, we mark each of the conflicting rules.  For
    example,
    
        %glr-parser
        %expect-rr 1
    
        %%
    
        stmt:
          target_list '=' expr ';'
        | expr_list ';'
        ;
    
        target_list:
          target
        | target ',' target_list
        ;
    
        target:
          ID %expect-rr 1
        ;
    
        expr_list:
          expr
        | expr ',' expr_list
        ;
    
        expr:
          ID %expect-rr 1
        | ...
        ;
    
    In a statement such as
    
        x, y = 3, 4;
    
    the parser must reduce x to a target or an expr, but does not know
    which until it sees the '='.  So we notate the two possible reductions
    to indicate that each conflicts in one rule.
    
    * doc/bison.texi (Suppressing Conflict Warnings): Document %expect,
    %expect-rr in grammar rules.
    * src/conflicts.c (count_state_rr_conflicts): Adjust comment.
    (rule_has_state_sr_conflicts): New static function.
    (count_rule_sr_conflicts): New static function.
    (rule_nast_state_rr_conflicts): New static function.
    (count_rule_rr_conflicts): New static function.
    (rule_conflicts_print): New static function.
    (conflicts_print): Also use rule_conflicts_print to report on individual
    rules.
    * src/gram.h (struct rule): Add new fields expected_sr_conflicts,
    expected_rr_conflicts.
    * src/reader.c (grammar_midrule_action): Transfer expected_sr_conflicts,
    expected_rr_conflicts to new rule, and turn off in current_rule.
    (grammar_current_rule_expect_sr): New function.
    (grammar_current_rule_expect_rr): New function.
    (packgram): Transfer expected_sr_conflicts, expected_rr_conflicts
    to new rule.
    * src/reader.h (grammar_current_rule_expect_sr): New function.
    (grammar_current_rule_expect_rr): New function.
    * src/symlist.c (symbol_list_sym_new): Initialize expected_sr_conflicts,
    expected_rr_conflicts.
    * src/symlist.h (struct symbol_list): Add new fields expected_sr_conflicts,
    expected_rr_conflicts.
    * tests/conflicts.at: Add tests "%expect in grammar rule not enough",
    "%expect in grammar rule right.", "%expect in grammar rule too much."

diff --git a/doc/bison.texi b/doc/bison.texi
index ec78af7e..cd0a4f42 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -5254,6 +5254,53 @@ in GLR parsers, using the declaration:
 %expect-rr @var{n}
 @end example
 
+You may wish to be more specific in your
+specification of expected conflicts.  To this end, you can also attach
+@code{%expect} and @code{%expect-rr} modifiers to individual rules.
+The interpretation of these modifiers differs from their use as
+declarations.  When attached to rules, they indicate the number of states
+in which the rule is involved in a conflict.  You will need to consult the
+output resulting from @samp{-v} to determine appropriate numbers to use.
+For example, for the following grammar fragment, the first rule for
+@code{empty_dims} appears in two states in which the @samp{[} token is a
+lookahead.  Having determined that, you can document this fact with an
+@code{%expect} modifier as follows:
+
+@example
+dims:
+  empty_dims
+| '[' expr ']' dims
+;
+
+empty_dims:
+  %empty   %expect 2
+| empty_dims '[' ']'
+;
+@end example
+
+Mid-rule actions generate implicit rules that are also subject to conflicts
+(@pxref{Midrule Conflicts,, Conflicts due to Midrule Actions}). To attach
+an @code{%expect} or @code{%expect-rr} annotation to an implicit
+mid-rule action's rule, put it before the action.  For example,
+
+@example
+%glr-parser
+%expect-rr 1
+
+%%
+
+clause:
+  "condition" %expect-rr 1 @{ value_mode(); @} '(' exprs ')'
+| "condition" %expect-rr 1 @{ class_mode(); @} '(' types ')'
+;
+@end example
+
+@noindent
+Here, the appropriate mid-rule action will not be determined until after
+the @samp{(} token is shifted.  Thus,
+the two actions will clash with each other, and we should expect one
+reduce/reduce conflict for each.
+
 In general, using @code{%expect} involves these steps:
 
 @itemize @bullet
@@ -5269,8 +5316,17 @@ go back to the beginning.
 
 @item
 Add an @code{%expect} declaration, copying the number @var{n} from the
-number which Bison printed.  With GLR parsers, add an
+number that Bison printed.  With GLR parsers, add an
 @code{%expect-rr} declaration as well.
+
+@item
+Optionally, count up the number of states in which one or more
+conflicted reductions for particular rules appear and add these numbers
+to the affected rules as @code{%expect-rr} or @code{%expect} modifiers
+as appropriate.  Rules that are in conflict appear in the output listing
+surrounded by square brackets or, in the case of reduce/reduce conflicts,
+as reductions having the same lookahead symbol as a square-bracketed
+reduction in the same state.
 @end itemize
 
 Now Bison will report an error if you introduce an unexpected conflict,
@@ -5491,7 +5547,14 @@ Start-Symbol}).
 @end deffn
 
 @deffn {Directive} %expect
-Declare the expected number of shift-reduce conflicts
+Declare the expected number of shift-reduce conflicts, either overall or
+for a given rule
+(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
+@end deffn
+
+@deffn {Directive} %expect-rr
+Declare the expected number of reduce-reduce conflicts, either overall or
+for a given rule
 (@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
 @end deffn
 
diff --git a/src/conflicts.c b/src/conflicts.c
index 6a6f88d9..d57b7231 100644
--- a/src/conflicts.c
+++ b/src/conflicts.c
@@ -470,8 +470,8 @@ count_sr_conflicts (void)
 /*----------------------------------------------------------------.
 | Count the number of reduce/reduce conflicts.  If ONE_PER_TOKEN, |
 | count one conflict for each token that has any reduce/reduce    |
-| conflicts.  Otherwise, count one conflict for each pair of      |
-| conflicting reductions.                                         |
+| conflicts.  Otherwise, count one conflict for each reduction    |
+| after the first for a given token.                              |
 `----------------------------------------------------------------*/
 
 static size_t
@@ -504,6 +504,86 @@ count_rr_conflicts (bool one_per_token)
 }
 
 
+/*----------------------------------------------------------------------.
+| For a given rule, count the number of states for which it is involved |
+| in shift/reduce conflicts.                                               |
+`----------------------------------------------------------------------*/
+
+static bool
+rule_has_state_sr_conflicts (rule *r, state *s)
+{
+  int i;
+  int j;
+  transitions *trans = s->transitions;
+  reductions *reds = s->reductions;
+
+  for (i = 0; i < reds->num; i++)
+    if (reds->rules[i] == r)
+      break;
+  if (i >= reds->num)
+    return false;
+
+  FOR_EACH_SHIFT (trans, j)
+    if (bitset_test (reds->lookahead_tokens[i], TRANSITION_SYMBOL (trans, j)))
+      return true;
+
+  return false;
+}
+
+static size_t
+count_rule_sr_conflicts (rule *r)
+{
+  state_number i;
+  size_t res;
+
+  res = 0;
+  for (i = 0; i < nstates; i++)
+    if (conflicts[i] && rule_has_state_sr_conflicts (r, states[i]))
+      res++;
+  return res;
+}
+
+/*-----------------------------------------------------------------.
+| For a given rule, count the number of states in which it is      |
+| involved in reduce/reduce conflicts.                             |
+`-----------------------------------------------------------------*/
+
+static bool
+rule_has_state_rr_conflicts (rule *r, state *s)
+{
+  int i;
+  reductions *reds = s->reductions;
+  size_t res;
+  bitset lookaheads;
+  
+  for (i = 0; i < reds->num; i++)
+    if (reds->rules[i] == r)
+      break;
+  if (i >= reds->num)
+    return 0;
+  lookaheads = reds->lookahead_tokens[i];
+
+  for (i = 0; i < reds->num; i++)
+    if (reds->rules[i] != r &&
+        !bitset_disjoint_p (lookaheads, reds->lookahead_tokens[i]))
+      return true;
+
+  return false;
+}
+
+static size_t
+count_rule_rr_conflicts (rule *r)
+{
+  state_number i;
+  size_t res;
+
+  res = 0;
+  for (i = 0; i < nstates; i++)
+    if (conflicts[i] && rule_has_state_rr_conflicts (r, states[i]))
+       res++;
+  return res;
+}
+
 /*-----------------------------------------------------------.
 | Output the detailed description of states with conflicts.  |
 `-----------------------------------------------------------*/
@@ -548,14 +628,46 @@ conflicts_total_count (void)
   return count_sr_conflicts () + count_rr_conflicts (false);
 }
 
+/*------------------------------.
+| Reporting per-rule conflicts. |
+`------------------------------*/
 
-/*------------------------------------------.
-| Reporting the total number of conflicts.  |
-`------------------------------------------*/
+static void
+rule_conflicts_print (void)
+{
+  rule_number i;
+
+  for (i = 0; i < nrules; i += 1)
+    {
+      rule *r = &rules[i];
+      int expected_sr = r->expected_sr_conflicts;
+      int expected_rr = r->expected_rr_conflicts;
+
+      if (expected_sr != -1 || expected_rr != -1) 
+        {
+          int sr = count_rule_sr_conflicts (r);
+          int rr = count_rule_rr_conflicts (r);
+          if (sr != expected_sr && (sr != 0 || expected_sr != -1))
+            complain (&r->location, complaint, _("\
+shift/reduce conflicts for rule %d: %d found, %d expected"),
+                      r->user_number, sr, expected_sr);
+          if (rr != expected_rr && (rr != 0 || expected_rr != -1))
+            complain (&r->location, complaint, _("\
+reduce/reduce conflicts for rule %d: %d found, %d expected"),
+                      r->user_number, rr, expected_rr);
+        }
+    }
+}
+
+/*---------------------------------.
+| Reporting numbers of conflicts.  |
+`---------------------------------*/
 
 void
 conflicts_print (void)
 {
+  rule_conflicts_print ();
+
   if (! glr_parser && expected_rr_conflicts != -1)
     {
       complain (NULL, Wother, _("%%expect-rr applies only to GLR parsers"));
@@ -609,7 +721,6 @@ conflicts_print (void)
   }
 }
 
-
 void
 conflicts_free (void)
 {
diff --git a/src/gram.h b/src/gram.h
index 4ecf4bf2..b21eb27a 100644
--- a/src/gram.h
+++ b/src/gram.h
@@ -196,6 +196,11 @@ typedef struct
   bool useful;
   bool is_predicate;
 
+  /* Counts of the numbers of expected conflicts for this rule, or -1 if none
+     given. */
+  int expected_sr_conflicts;
+  int expected_rr_conflicts;
+
   const char *action;
   location action_location;
 } rule;
diff --git a/src/parse-gram.y b/src/parse-gram.y
index 1962835e..e98196f3 100644
--- a/src/parse-gram.y
+++ b/src/parse-gram.y
@@ -623,6 +623,10 @@ rhs:
     { grammar_current_rule_dprec_set ($3, @3); }
 | rhs "%merge" TAG
     { grammar_current_rule_merge_set ($3, @3); }
+| rhs "%expect" INT
+    { grammar_current_rule_expect_sr ($3, @3); }
+| rhs "%expect-rr" INT
+    { grammar_current_rule_expect_rr ($3, @3); }
 ;
 
 named_ref.opt:
diff --git a/src/reader.c b/src/reader.c
index d073d4ed..72b5ca45 100644
--- a/src/reader.c
+++ b/src/reader.c
@@ -430,6 +430,11 @@ grammar_midrule_action (void)
                                current_rule->action_props.is_predicate);
   code_props_none_init (&current_rule->action_props);
 
+  midrule->expected_sr_conflicts = current_rule->expected_sr_conflicts;
+  midrule->expected_rr_conflicts = current_rule->expected_rr_conflicts;
+  current_rule->expected_sr_conflicts = -1;
+  current_rule->expected_rr_conflicts = -1;
+
   if (previous_rule_end)
     previous_rule_end->next = midrule;
   else
@@ -573,6 +578,26 @@ grammar_current_rule_predicate_append (const char *pred, 
location loc)
                                /* is_predicate */ true);
 }
 
+/* Set the expected number of shift-reduce (reduce-reduce) conflicts for
+ * the current rule.  If a midrule is encountered later, the count
+ * is transferred to it and reset in the current rule to -1. */
+
+void
+grammar_current_rule_expect_sr (int count, location loc)
+{
+  current_rule->expected_sr_conflicts = count;
+}
+
+void
+grammar_current_rule_expect_rr (int count, location loc)
+{
+  if (! glr_parser)
+    complain (&loc, Wother, _("%s affects only GLR parsers"),
+              "%expect-rr");
+  else
+    current_rule->expected_rr_conflicts = count;
+}
+
 
 /*---------------------------------------------------------------.
 | Convert the rules into the representation using RRHS, RLHS and |
@@ -626,6 +651,8 @@ packgram (void)
       rules[ruleno].action = lhs->action_props.code;
       rules[ruleno].action_location = lhs->action_props.location;
       rules[ruleno].is_predicate = lhs->action_props.is_predicate;
+      rules[ruleno].expected_sr_conflicts = p->expected_sr_conflicts;
+      rules[ruleno].expected_rr_conflicts = p->expected_rr_conflicts;
 
       /* Traverse the rhs.  */
       {
diff --git a/src/reader.h b/src/reader.h
index 70128be0..76e42dc3 100644
--- a/src/reader.h
+++ b/src/reader.h
@@ -52,6 +52,8 @@ void grammar_current_rule_empty_set (location loc);
 void grammar_current_rule_prec_set (symbol *precsym, location loc);
 void grammar_current_rule_dprec_set (int dprec, location loc);
 void grammar_current_rule_merge_set (uniqstr name, location loc);
+void grammar_current_rule_expect_sr (int count, location loc);
+void grammar_current_rule_expect_rr (int count, location loc);
 void grammar_current_rule_symbol_append (symbol *sym, location loc,
                                          named_ref *nref);
 /* Attach an ACTION to the current rule.  */
diff --git a/src/symlist.c b/src/symlist.c
index 201ddabd..f7af3568 100644
--- a/src/symlist.c
+++ b/src/symlist.c
@@ -49,6 +49,8 @@ symbol_list_sym_new (symbol *sym, location loc)
   res->dprec_location = empty_location;
   res->merger = 0;
   res->merger_declaration_location = empty_location;
+  res->expected_sr_conflicts = -1;
+  res->expected_rr_conflicts = -1;
 
   res->next = NULL;
 
diff --git a/src/symlist.h b/src/symlist.h
index 3ade1704..0bd6bd75 100644
--- a/src/symlist.h
+++ b/src/symlist.h
@@ -87,6 +87,11 @@ typedef struct symbol_list
   int merger;
   location merger_declaration_location;
 
+  /* Counts of the number of expected conflicts for this rule, or -1 if none
+     given. */
+  int expected_sr_conflicts;
+  int expected_rr_conflicts;
+
   /* The list.  */
   struct symbol_list *next;
 } symbol_list;
diff --git a/tests/conflicts.at b/tests/conflicts.at
index db57c184..03c9acdc 100644
--- a/tests/conflicts.at
+++ b/tests/conflicts.at
@@ -1244,6 +1244,61 @@ AT_BISON_CHECK([-o input.c input.y], 1, [],
 AT_CLEANUP
 
 
+## ------------------------------------ ##
+## %expect in grammar rule not enough.  ##
+## ------------------------------------ ##
+
+AT_SETUP([%expect in grammar rule not enough])
+
+AT_DATA([input.y],
+[[%token NUM OP
+%expect 1
+%%
+exp: exp OP exp %expect 0 | NUM;
+]])
+
+AT_BISON_CHECK([-o input.c input.y], 1, [],
+[[input.y:4.6-25: error: shift/reduce conflicts for rule 1: 1 found, 0 expected
+]])
+AT_CLEANUP
+
+
+## ------------------------------- ##
+## %expect in grammar rule right.  ##
+## ------------------------------- ##
+
+AT_SETUP([%expect in grammar rule right])
+
+AT_DATA([input.y],
+[[%token NUM OP
+%expect 1
+%%
+exp: exp OP exp %expect 1 | NUM;
+]])
+
+AT_BISON_CHECK([-o input.c input.y])
+AT_CLEANUP
+
+
+## ---------------------------------- ##
+## %expect in grammar rule too much.  ##
+## ---------------------------------- ##
+
+AT_SETUP([%expect in grammar rule too much])
+
+AT_DATA([input.y],
+[[%token NUM OP
+%expect 1
+%%
+exp: exp OP exp | NUM %expect 1;
+]])
+
+AT_BISON_CHECK([-o input.c input.y], 1, [],
+[[input.y:4.19-31: error: shift/reduce conflicts for rule 2: 0 found, 1 
expected
+]])
+AT_CLEANUP
+
+
 ## ------------------------- ##
 ## %prec with user strings.  ##
 ## ------------------------- ##

Re: [RFC] Allow %expect and %expect annotations on rules

Reply via email to