I've been reading the bison source code and came up with some changes
I would like to be considered.



The for_all_rules.patch changes the double loop over all rules deriving
all variables to a single loop over all rules.

Tested with make check on Cygwin with 1 expected failure:
49: Output file name: [EMAIL PROTECTED]&*()-=_+{}[]|\:;<>, .' FAILED 
(output.at:188)
failed because the file system (NTFS) does not allow \/:?*"<>|
see http://lists.gnu.org/archive/html/bug-bison/2008-07/msg00008.html

2008-09-27  Di-an Jan  <[EMAIL PROTECTED]>

        * src/closure.c (set_firsts): Use single loop over all rules.

diff --git a/src/closure.c b/src/closure.c
index ff3109c..4c1d40c 100755
--- a/src/closure.c
+++ b/src/closure.c
@@ -123,17 +123,16 @@ print_fderives (void)
 static void
 set_firsts (void)
 {
-  symbol_number i, j;
+  rule_number r;

   firsts = bitsetv_create (nvars, nvars, BITSET_FIXED);

-  for (i = ntokens; i < nsyms; i++)
-    for (j = 0; derives[i - ntokens][j]; ++j)
-      {
-       item_number sym = derives[i - ntokens][j]->rhs[0];
-       if (ISVAR (sym))
-         bitset_set (FIRSTS (i), sym - ntokens);
-      }
+  for (r = 0; r < nrules; r++)
+    {
+      item_number sym = rules[r].rhs[0];
+      if (ISVAR (sym))
+       bitset_set (FIRSTS (rules[r].lhs->number), sym - ntokens);
+    }

   if (trace_flag & trace_sets)
     bitsetv_matrix_dump (stderr, "RTC: Firsts Input", firsts);



The shift_symbol-bitset.patch changes an array of NSYMS integers that's
sorted after adding all the shift symbols of a state to a bitset of NSYMS
bits.  Obviously there's space saving.  For speed, the trade off is between
an insertion sort verses possibly slower clear, set, count and iterate.
The changes to state.* allows transitions states to be filled in in place
rather than copied over.

I'm not sure whether BITSET_FOR_EACH or loop and bitset_test is more
efficient, or whether the original code is faster.  I couldn't get good
timing on Cygwin.  So I attached patches for both.  In general, someone
should figure out which is more efficient for each bitset and document it.

The bitset_test version is tested with make check on Cygwin with 1 expected
failure.  The bitset_iterator version is tested in combination with other
stuff.

2008-09-27  Di-an Jan  <[EMAIL PROTECTED]>

        * src/LR0.c (shift_symbol): Change to bitset.
        (nshifts, shiftset): Remove.
        (allocate_storage): Initialize shift_symbol, remove shiftset.
        (free_storage): Free shift_symbol, remove shiftset.
        (new_itemsets): Clear and update shift_symbol.
        (append_states): Iterate over shift_symbol already sorted.
        Allocate and fill in state transitions in place.
        Don't use shiftset.
        (set_states): Adjust call to state_transitions_set.
        (generate_states): Move state_transitions_set to append_states.
        * src/state.h (state_transitions_set): Remove states parameter.
        * src/state.c (state_transitions_set): Remove states parameter.
        * src/state.c (transitions_new): Remove states parameter.



Other things I'm thinking of working on:



Rework the line wrapping logic in src/print.c (print_grammar).
A straight forward version is attached as print_grammar.patch,
but it seems to be slower than with string buffers.  Maybe I could
do a version with a printf-like interface rather than END_TEST.



In src/reader.c, instead of using a single symbol_list for all rules,
use a list of rules, each containing a symbol_list.  At the cost of an
extra structure, this would move the per-rule fields out of each node
of symbol_list and make the code clearer.  Should be at least as efficient
since all iterations over the one symbol_list stops after each rule.



Use macros/inline functions for accessing ritem.  Makes the code easy
to understand without having to remember the special ritem structure.



Introduce the typedef var_number for 0..nvars-1.  More self-documenting
and makes some code simpler.



Do something like
        struct transitions { int num; state** states; };
        struct state { ... struct transitions transitions; ... };
This uses just one pointer, as before, but probably more efficient
and doesn't use "state *states[1];" to allocate 0 or more states.
Similarly for reductions and errs.



Consistent style.
i++ vs ++i vs i+=1 statements,
and put them in expressions as in dst[i++] = src[j++].
Remove semicolons after {}.
Use XNMALLOC(n, type) or xnmalloc(n, s) and xmemdup from gnulib.



Di-an Jan
diff --git a/src/closure.c b/src/closure.c
index ff3109c..f32f7f6 100755
--- a/src/closure.c
+++ b/src/closure.c
@@ -1,6 +1,6 @@
 /* Closures for Bison
 
-   Copyright (C) 1984, 1989, 2000, 2001, 2002, 2004, 2005, 2007 Free
+   Copyright (C) 1984, 1989, 2000, 2001, 2002, 2004, 2005, 2007, 2008 Free
    Software Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
@@ -123,17 +123,16 @@ print_fderives (void)
 static void
 set_firsts (void)
 {
-  symbol_number i, j;
+  rule_number r;
 
   firsts = bitsetv_create (nvars, nvars, BITSET_FIXED);
 
-  for (i = ntokens; i < nsyms; i++)
-    for (j = 0; derives[i - ntokens][j]; ++j)
-      {
-       item_number sym = derives[i - ntokens][j]->rhs[0];
-       if (ISVAR (sym))
-         bitset_set (FIRSTS (i), sym - ntokens);
-      }
+  for (r = 0; r < nrules; r++)
+    {
+      item_number sym = rules[r].rhs[0];
+      if (ISVAR (sym))
+       bitset_set (FIRSTS (rules[r].lhs->number), sym - ntokens);
+    }
 
   if (trace_flag & trace_sets)
     bitsetv_matrix_dump (stderr, "RTC: Firsts Input", firsts);
diff --git a/src/LR0.c b/src/LR0.c
index efda69f..667ad3f 100755
--- a/src/LR0.c
+++ b/src/LR0.c
@@ -1,7 +1,7 @@
 /* Generate the nondeterministic finite state machine for Bison.
 
-   Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007
-   Free Software Foundation, Inc.
+   Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007,
+   2008 Free Software Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
 
@@ -77,11 +77,9 @@ state_list_append (symbol_number sym, size_t core_size, 
item_number *core)
   return s;
 }
 
-static int nshifts;
-static symbol_number *shift_symbol;
+static bitset shift_symbol;
 
 static rule **redset;
-static state **shiftset;
 
 static item_number **kernel_base;
 static int *kernel_size;
@@ -136,19 +134,17 @@ allocate_storage (void)
 {
   allocate_itemsets ();
 
-  shiftset = xnmalloc (nsyms, sizeof *shiftset);
   redset = xnmalloc (nrules, sizeof *redset);
   state_hash_new ();
-  shift_symbol = xnmalloc (nsyms, sizeof *shift_symbol);
+  shift_symbol = bitset_create (nsyms, BITSET_FIXED);
 }
 
 
 static void
 free_storage (void)
 {
-  free (shift_symbol);
+  bitset_free (shift_symbol);
   free (redset);
-  free (shiftset);
   free (kernel_base);
   free (kernel_size);
   free (kernel_items);
@@ -183,17 +179,14 @@ new_itemsets (state *s)
 
   memset (kernel_size, 0, nsyms * sizeof *kernel_size);
 
-  nshifts = 0;
+  bitset_zero (shift_symbol);
 
   for (i = 0; i < nitemset; ++i)
     if (item_number_is_symbol_number (ritem[itemset[i]]))
       {
        symbol_number sym = item_number_as_symbol_number (ritem[itemset[i]]);
        if (!kernel_size[sym])
-         {
-           shift_symbol[nshifts] = sym;
-           nshifts++;
-         }
+         bitset_set (shift_symbol, sym);
 
        kernel_base[sym][kernel_size[sym]] = itemset[i] + 1;
        kernel_size[sym]++;
@@ -231,33 +224,25 @@ get_state (symbol_number sym, size_t core_size, 
item_number *core)
 | Use the information computed by new_itemsets to find the state |
 | numbers reached by each shift transition from S.              |
 |                                                                |
-| SHIFTSET is set up as a vector of those states.                |
+| Record in the transitions structure.                           |
 `---------------------------------------------------------------*/
 
 static void
 append_states (state *s)
 {
   int i;
+  symbol_number sym;
+  size_t nshifts = bitset_count (shift_symbol);
 
   if (trace_flag & trace_automaton)
     fprintf (stderr, "Entering append_states, state = %d\n", s->number);
 
-  /* First sort shift_symbol into increasing order.  */
-
-  for (i = 1; i < nshifts; i++)
-    {
-      symbol_number sym = shift_symbol[i];
-      int j;
-      for (j = i; 0 < j && sym < shift_symbol[j - 1]; j--)
-       shift_symbol[j] = shift_symbol[j - 1];
-      shift_symbol[j] = sym;
-    }
-
-  for (i = 0; i < nshifts; i++)
-    {
-      symbol_number sym = shift_symbol[i];
-      shiftset[i] = get_state (sym, kernel_size[sym], kernel_base[sym]);
-    }
+  state_transitions_set (s, nshifts);
+  i = 0;
+  for (sym = 0; sym < nsyms; sym++)
+    if (bitset_test (shift_symbol, sym))
+       s->transitions->states[i++]
+               = get_state (sym, kernel_size[sym], kernel_base[sym]);
 }
 
 
@@ -314,7 +299,7 @@ set_states (void)
         computed later, but set_conflicts.  */
       state *s = this->state;
       if (!s->transitions)
-       state_transitions_set (s, 0, 0);
+       state_transitions_set (s, 0);
       if (!s->reductions)
        state_reductions_set (s, 0, 0);
 
@@ -360,12 +345,9 @@ generate_states (void)
       save_reductions (s);
       /* Find the itemsets of the states that shifts can reach.  */
       new_itemsets (s);
-      /* Find or create the core structures for those states.  */
+      /* Find or create the core structures for those states.
+        Record the shifts allowed out of this state.  */
       append_states (s);
-
-      /* Create the shifts structures for the shifts to those states,
-        now that the state numbers transitioning to are known.  */
-      state_transitions_set (s, nshifts, shiftset);
     }
 
   /* discard various storage */
diff --git a/src/state.c b/src/state.c
index d3460c1..bf4ad3d 100755
--- a/src/state.c
+++ b/src/state.c
@@ -1,6 +1,6 @@
 /* Type definitions for nondeterministic finite state machine for Bison.
 
-   Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+   Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
    Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
@@ -39,12 +39,11 @@
 `-----------------------------------------*/
 
 static transitions *
-transitions_new (int num, state **the_states)
+transitions_new (int num)
 {
-  size_t states_size = num * sizeof *the_states;
+  size_t states_size = num * sizeof (state *);
   transitions *res = xmalloc (offsetof (transitions, states) + states_size);
   res->num = num;
-  memcpy (res->states, the_states, states_size);
   return res;
 }
 
@@ -169,15 +168,15 @@ state_free (state *s)
 }
 
 
-/*---------------------------.
-| Set the transitions of S.  |
-`---------------------------*/
+/*--------------------------------.
+| Allocate the transitions of S.  |
+`--------------------------------*/
 
 void
-state_transitions_set (state *s, int num, state **trans)
+state_transitions_set (state *s, int num)
 {
   aver (!s->transitions);
-  s->transitions = transitions_new (num, trans);
+  s->transitions = transitions_new (num);
 }
 
 
diff --git a/src/state.h b/src/state.h
index 4afc1f0..422555d 100755
--- a/src/state.h
+++ b/src/state.h
@@ -1,6 +1,6 @@
 /* Type definitions for nondeterministic finite state machine for Bison.
 
-   Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007 Free
+   Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007, 2008 Free
    Software Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
@@ -223,8 +223,8 @@ extern state *final_state;
 state *state_new (symbol_number accessing_symbol,
                  size_t core_size, item_number *core);
 
-/* Set the transitions of STATE.  */
-void state_transitions_set (state *s, int num, state **trans);
+/* Allocate the transitions of STATE.  */
+void state_transitions_set (state *s, int num);
 
 /* Set the reductions of STATE.  */
 void state_reductions_set (state *s, int num, rule **reds);
diff --git a/src/LR0.c b/src/LR0.c
index efda69f..7e4c1d8 100755
--- a/src/LR0.c
+++ b/src/LR0.c
@@ -1,7 +1,7 @@
 /* Generate the nondeterministic finite state machine for Bison.
 
-   Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007
-   Free Software Foundation, Inc.
+   Copyright (C) 1984, 1986, 1989, 2000, 2001, 2002, 2004, 2005, 2006, 2007,
+   2008 Free Software Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
 
@@ -77,11 +77,9 @@ state_list_append (symbol_number sym, size_t core_size, 
item_number *core)
   return s;
 }
 
-static int nshifts;
-static symbol_number *shift_symbol;
+static bitset shift_symbol;
 
 static rule **redset;
-static state **shiftset;
 
 static item_number **kernel_base;
 static int *kernel_size;
@@ -136,19 +134,17 @@ allocate_storage (void)
 {
   allocate_itemsets ();
 
-  shiftset = xnmalloc (nsyms, sizeof *shiftset);
   redset = xnmalloc (nrules, sizeof *redset);
   state_hash_new ();
-  shift_symbol = xnmalloc (nsyms, sizeof *shift_symbol);
+  shift_symbol = bitset_create (nsyms, BITSET_FIXED);
 }
 
 
 static void
 free_storage (void)
 {
-  free (shift_symbol);
+  bitset_free (shift_symbol);
   free (redset);
-  free (shiftset);
   free (kernel_base);
   free (kernel_size);
   free (kernel_items);
@@ -183,17 +179,14 @@ new_itemsets (state *s)
 
   memset (kernel_size, 0, nsyms * sizeof *kernel_size);
 
-  nshifts = 0;
+  bitset_zero (shift_symbol);
 
   for (i = 0; i < nitemset; ++i)
     if (item_number_is_symbol_number (ritem[itemset[i]]))
       {
        symbol_number sym = item_number_as_symbol_number (ritem[itemset[i]]);
        if (!kernel_size[sym])
-         {
-           shift_symbol[nshifts] = sym;
-           nshifts++;
-         }
+         bitset_set (shift_symbol, sym);
 
        kernel_base[sym][kernel_size[sym]] = itemset[i] + 1;
        kernel_size[sym]++;
@@ -231,32 +224,26 @@ get_state (symbol_number sym, size_t core_size, 
item_number *core)
 | Use the information computed by new_itemsets to find the state |
 | numbers reached by each shift transition from S.              |
 |                                                                |
-| SHIFTSET is set up as a vector of those states.                |
+| Record in the transitions structure.                           |
 `---------------------------------------------------------------*/
 
 static void
 append_states (state *s)
 {
   int i;
+  bitset_iterator biter;
+  symbol_number sym;
+  size_t nshifts = bitset_count (shift_symbol);
 
   if (trace_flag & trace_automaton)
     fprintf (stderr, "Entering append_states, state = %d\n", s->number);
 
-  /* First sort shift_symbol into increasing order.  */
-
-  for (i = 1; i < nshifts; i++)
+  state_transitions_set (s, nshifts);
+  i = 0;
+  BITSET_FOR_EACH (biter, shift_symbol, sym, 0)
     {
-      symbol_number sym = shift_symbol[i];
-      int j;
-      for (j = i; 0 < j && sym < shift_symbol[j - 1]; j--)
-       shift_symbol[j] = shift_symbol[j - 1];
-      shift_symbol[j] = sym;
-    }
-
-  for (i = 0; i < nshifts; i++)
-    {
-      symbol_number sym = shift_symbol[i];
-      shiftset[i] = get_state (sym, kernel_size[sym], kernel_base[sym]);
+      s->transitions->states[i] = get_state (sym, kernel_size[sym], 
kernel_base[sym]);
+      i++;
     }
 }
 
@@ -314,7 +301,7 @@ set_states (void)
         computed later, but set_conflicts.  */
       state *s = this->state;
       if (!s->transitions)
-       state_transitions_set (s, 0, 0);
+       state_transitions_set (s, 0);
       if (!s->reductions)
        state_reductions_set (s, 0, 0);
 
@@ -360,12 +347,9 @@ generate_states (void)
       save_reductions (s);
       /* Find the itemsets of the states that shifts can reach.  */
       new_itemsets (s);
-      /* Find or create the core structures for those states.  */
+      /* Find or create the core structures for those states.
+        Record the shifts allowed out of this state.  */
       append_states (s);
-
-      /* Create the shifts structures for the shifts to those states,
-        now that the state numbers transitioning to are known.  */
-      state_transitions_set (s, nshifts, shiftset);
     }
 
   /* discard various storage */
diff --git a/src/state.c b/src/state.c
index d3460c1..bf4ad3d 100755
--- a/src/state.c
+++ b/src/state.c
@@ -1,6 +1,6 @@
 /* Type definitions for nondeterministic finite state machine for Bison.
 
-   Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+   Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
    Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
@@ -39,12 +39,11 @@
 `-----------------------------------------*/
 
 static transitions *
-transitions_new (int num, state **the_states)
+transitions_new (int num)
 {
-  size_t states_size = num * sizeof *the_states;
+  size_t states_size = num * sizeof (state *);
   transitions *res = xmalloc (offsetof (transitions, states) + states_size);
   res->num = num;
-  memcpy (res->states, the_states, states_size);
   return res;
 }
 
@@ -169,15 +168,15 @@ state_free (state *s)
 }
 
 
-/*---------------------------.
-| Set the transitions of S.  |
-`---------------------------*/
+/*--------------------------------.
+| Allocate the transitions of S.  |
+`--------------------------------*/
 
 void
-state_transitions_set (state *s, int num, state **trans)
+state_transitions_set (state *s, int num)
 {
   aver (!s->transitions);
-  s->transitions = transitions_new (num, trans);
+  s->transitions = transitions_new (num);
 }
 
 
diff --git a/src/state.h b/src/state.h
index 4afc1f0..422555d 100755
--- a/src/state.h
+++ b/src/state.h
@@ -1,6 +1,6 @@
 /* Type definitions for nondeterministic finite state machine for Bison.
 
-   Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007 Free
+   Copyright (C) 1984, 1989, 2000, 2001, 2002, 2003, 2004, 2007, 2008 Free
    Software Foundation, Inc.
 
    This file is part of Bison, the GNU Compiler Compiler.
@@ -223,8 +223,8 @@ extern state *final_state;
 state *state_new (symbol_number accessing_symbol,
                  size_t core_size, item_number *core);
 
-/* Set the transitions of STATE.  */
-void state_transitions_set (state *s, int num, state **trans);
+/* Allocate the transitions of STATE.  */
+void state_transitions_set (state *s, int num);
 
 /* Set the reductions of STATE.  */
 void state_reductions_set (state *s, int num, rule **reds);
diff --git a/src/print.c b/src/print.c
index ddd76a6..a36c369 100755
--- a/src/print.c
+++ b/src/print.c
@@ -370,14 +370,15 @@ print_state (FILE *out, state *s)
 | Print information on the whole grammar.  |
 `-----------------------------------------*/
 
-#define END_TEST(End)                          \
+#define WRAP_PUT(Str)                          \
 do {                                           \
-  if (column + strlen(buffer) > (End))         \
+  if (column > 65)                             \
     {                                          \
-      fprintf (out, "%s\n   ", buffer);                \
+      fputs ("\n   ", out);                    \
       column = 3;                              \
-      buffer[0] = 0;                           \
     }                                          \
+  fputs ((Str), out);                          \
+  column += strlen(Str);                       \
 } while (0)
 
 
@@ -386,7 +387,7 @@ print_grammar (FILE *out)
 {
   symbol_number i;
   char buffer[90];
-  int column = 0;
+  int column;
 
   grammar_rules_print (out);
 
@@ -399,21 +400,20 @@ print_grammar (FILE *out)
        rule_number r;
        item_number *rhsp;
 
-       buffer[0] = 0;
-       column = strlen (tag);
-       fputs (tag, out);
-       END_TEST (65);
+       column = 0;
+       WRAP_PUT (tag);
        sprintf (buffer, " (%d)", i);
+       WRAP_PUT (buffer);
 
        for (r = 0; r < nrules; r++)
          for (rhsp = rules[r].rhs; *rhsp >= 0; rhsp++)
            if (item_number_as_symbol_number (*rhsp) == token_translations[i])
              {
-               END_TEST (65);
-               sprintf (buffer + strlen (buffer), " %d", r);
+               sprintf (buffer, " %d", r);
+               WRAP_PUT (buffer);
                break;
              }
-       fprintf (out, "%s\n", buffer);
+       fputc ('\n', out);
       }
   fputs ("\n\n", out);
 
@@ -438,23 +438,19 @@ print_grammar (FILE *out)
              }
        }
 
-      buffer[0] = 0;
-      fputs (tag, out);
-      column = strlen (tag);
-      sprintf (buffer, " (%d)", i);
-      END_TEST (0);
+      fprintf (out, "%s (%d)\n   ", tag, i);
+      column = 3;
 
       if (left_count > 0)
        {
-         END_TEST (65);
-         sprintf (buffer + strlen (buffer), _(" on left:"));
+         WRAP_PUT (_(" on left:"));
 
          for (r = 0; r < nrules; r++)
            {
              if (rules[r].lhs->number == i)
                {
-                 END_TEST (65);
-                 sprintf (buffer + strlen (buffer), " %d", r);
+                 sprintf (buffer, " %d", r);
+                 WRAP_PUT (buffer);
                }
            }
        }
@@ -462,22 +458,24 @@ print_grammar (FILE *out)
       if (right_count > 0)
        {
          if (left_count > 0)
-           sprintf (buffer + strlen (buffer), ",");
-         END_TEST (65);
-         sprintf (buffer + strlen (buffer), _(" on right:"));
+           {
+             fputc (',', out);
+             column++;
+           }
+         WRAP_PUT (_(" on right:"));
          for (r = 0; r < nrules; r++)
            {
              item_number *rhsp;
              for (rhsp = rules[r].rhs; *rhsp >= 0; rhsp++)
                if (item_number_as_symbol_number (*rhsp) == i)
                  {
-                   END_TEST (65);
-                   sprintf (buffer + strlen (buffer), " %d", r);
+                   sprintf (buffer, " %d", r);
+                   WRAP_PUT (buffer);
                    break;
                  }
            }
        }
-      fprintf (out, "%s\n", buffer);
+      fputc ('\n', out);
     }
 }
 

Reply via email to