Hi,

Following the discussion starting in
http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00002.html
I have implemented a Bison skeleton for C++17 supporting features
such as move semantics and std::variant, based on the existing C++
skeleton. It is included in this mail.

To use the new skeleton, copy the skeleton files (data/) to
/usr/share/bison or your respective Bison data directory; they use
new file names to avoid conflicts with existing Bison data.

Then use the following setting in your parser:

  %skeleton "lalr1-c++17.cc"

An example calc-c++17, based on Bison's calc++ example, is included.

The new skeleton provides the following features:

- Includes bugfix for syntax_error constructor inlining, see:
  http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00047.html

- Includes extra_header_prefix support, see
  http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00058.html

- Always uses std::move (C++11) instead of copying internally, to
  support move-only types for semantic values. In user actions, it's
  up to you to use std::move where necessary (see the following
  point).

- A new define api.rhs.access that is applied automatically to all
  occurrences of "$n". E.g., the following setting will apply
  std::move to all such occurrences. Note that this can be dangerous
  if the same "$n" is used more than once in one action, or in an
  action and a previous mid-rule action, because every use after
  the first sees a moved-from value. It's up to you to make sure
  this does not occur, or to take measures to avoid problems when
  it does:

  %define api.rhs.access {std::move}

- In rules with no explicit user action, a default action
  "$$ = std::move ($1);" is executed. That's the behaviour according
  to the documentation, and also the behaviour of the C skeleton,
  and implementing it (when using std::variant) was less work than
  changing the documentation.

- When there is a user action, there is no pre-action of setting
  $$ to $1. (The C skeleton has one accidentally, but warns users
  not to rely on it. With move-only types, it would not be easily
  possible to provide such a pre-action without destroying $1.)

  Instead, $$ is set up before the user action as follows:

  Without variants, $$ is always set to { }, i.e.
  default-initialized.

  With variants, for non-empty rules, $$ is also default-initialized
  (to an invalid variant), so unless the user action sets $$, a
  bad_variant_access will happen when it's used by other rules.
  This may catch some errors in user actions.

  With variants, for empty rules, $$ is initialized to the default
  value of the correct type. The user action may set it or leave it.
  This makes it easier to build containers starting from a default
  (empty) one as in the following example:

  %type <std::vector <foo>> foo_list
  %type <foo> foo

  %%

  foo_list:
    %empty { }
  | foo_list foo { ($$ = std::move ($1)).push_back (std::move ($2)); };

- Bison's existing stack implementation works fine for the most
  part. Some cosmetic changes were made, such as using size_t where
  appropriate and explicitly declaring copy constructor/assignment
  as "= delete" instead of declaring and not defining them. The
  stealing push function was turned into a proper moving one (i.e.,
  taking an rvalue reference).

- The stack continues to use std::vector by default, and I think
  that's fine. If, however, you want to use another container
  instead, you can now set the following defines, e.g. for
  std::deque:

  %define api.stack.include deque
  %define api.stack.container {std::deque}

  The included example does this, but also works without it.

  If your container requires some setup, you can overload
  yy::stack_prepare. By default, it does "reserve (200);" for
  std::vector (as before) and nothing for other containers.

- Uses std::variant (C++17) instead of Bison's own variant
  implementation.

  If you don't have C++17 support yet, you can use an alternative
  variant implementation such as https://github.com/mpark/variant .
  Boost.Variant might also work; I have not tried it.

- When using variants, to build tokens manually, instead of
  "yylval->build(...)", you must now use the std::variant interface
  such as "yylval->emplace<...>(...)". However, the provided
  make_FOO functions continue to work, are recommended anyway, and
  accept rvalue references now.
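
To illustrate the emplace interface and the earlier warning about
using the same "$n" twice under std::move, here is a minimal sketch.
It is not taken from the skeleton: semantic_type, build_boxed and
take_boxed are hypothetical names, and the list of alternatives is
an assumption standing in for whatever the grammar declares.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for the parser's semantic value type; the real
// alternatives are generated from the grammar's type declarations.
using semantic_type =
  std::variant<std::monostate, int, std::string, std::unique_ptr<int>>;

// Builds a move-only value in place, as a hand-written lexer might do
// through yylval instead of calling a make_FOO function.
inline void build_boxed (semantic_type *yylval, int n)
{
  yylval->emplace<std::unique_ptr<int>> (std::make_unique<int> (n));
}

// Moves the payload out, roughly what "std::move ($1)" amounts to.
inline std::unique_ptr<int> take_boxed (semantic_type &v)
{
  return std::move (std::get<std::unique_ptr<int>> (v));
}

// Returns true if a second move from the same value yields a null
// pointer, i.e. the hazard of using one "$n" twice under std::move.
inline bool second_move_is_empty ()
{
  semantic_type v;
  build_boxed (&v, 42);
  std::unique_ptr<int> first = take_boxed (v);   // takes ownership
  std::unique_ptr<int> second = take_boxed (v);  // moved-from source
  assert (first && *first == 42);
  return second == nullptr;
}
```

The variant still holds the std::unique_ptr alternative after the
first move, so the second std::get does not throw; it just returns a
null pointer, which is why such mistakes can go unnoticed.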

Using std::variant fixes the following problems:

- $<type> (especially in mid-rule actions) previously didn't work
  at all (see
  http://lists.gnu.org/archive/html/bug-bison/2017-06/msg00000.html).
  It works now; internally, this required a variant_setter<> helper
  function in order to support the "=" syntax.

- $<type> where "type" is a type that does not occur in the variant:

  With lalr1.cc, if the type is not bigger than the largest type in
  the variant, it would happen to work (if not for the previous
  bug); otherwise it would assert if parse.assert is set, and
  result in undefined behaviour if not.

  With the new skeleton, this fails at compile time.

- $<type>, reading a different type than the one that was set:

  With lalr1.cc, it would assert if parse.assert is set, and
  result in undefined behaviour otherwise.

  With the new skeleton, throws std::bad_variant_access.

- Likewise when the lexer returns a value of the wrong type (this
  can't happen with the make_FOO functions, only when building
  tokens manually).

- %destructor

  With lalr1.cc, it does not work reliably (with or without
  variants) whenever the dynamic type does not match the expected
  type of the token.

  The new skeleton removes support for "%destructor" and warns when
  it's declared. Regular C++ destructors should do the job in all
  cases and are applied to the correct dynamic type.

- %printer

  With lalr1.cc, it doesn't work for $<type> overrides (in
  particular with mid-rule actions) with variants.

  The new skeleton removes support for "%printer" when using
  variants, and warns. (It continues to work with non-variant
  semantic values.)

  The idiomatic way of doing this kind of thing with std::variant is
  std::visit which will always use the current dynamic type. The new
  skeleton does this, calling a new function yy::yy_print_value().
  By default this function does nothing, but you can overload it;
  the generated code will always call it with a "const T&" argument,
  so unless you plan to call it yourself, you only need to overload
  const-reference versions.

  The included example contains two ways of defining yy_print_value
  in calc-c++17-parser.yy, one rather simple one and a more generic
  and more complex one. Depending on your needs, you can build on
  either of them or write your own print functions.
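
For readers unfamiliar with std::visit, the dispatch described above
can be sketched roughly as follows. This is an illustration only, not
the skeleton's code: the namespace sketch, print_value, describe and
int_read_throws are hypothetical names, and the alternatives of
value_type are assumed. It also shows a read of the wrong type
throwing std::bad_variant_access.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <variant>

namespace sketch  // hypothetical; the skeleton itself uses namespace yy
{
  // Catch-all: types without a dedicated overload print nothing.
  template <typename T>
  void print_value (std::ostream &, const T &) { }

  // Const-reference overloads for the types that should be printed.
  inline void print_value (std::ostream &o, const int &v) { o << v; }
  inline void print_value (std::ostream &o, const std::string &v)
  {
    o << '"' << v << '"';
  }

  // Assumed set of alternatives, standing in for the grammar's types.
  using value_type = std::variant<std::monostate, int, std::string, double>;

  // Prints whatever the variant currently holds; std::visit selects
  // the overload matching the dynamic type.
  inline std::string describe (const value_type &v)
  {
    assert (!v.valueless_by_exception ());
    std::ostringstream o;
    std::visit ([&o] (const auto &x) { print_value (o, x); }, v);
    return o.str ();
  }

  // Returns true if reading v as an int throws, i.e. v holds another type.
  inline bool int_read_throws (const value_type &v)
  {
    try { (void) std::get<int> (v); return false; }
    catch (const std::bad_variant_access &) { return true; }
  }
}
```

Because the non-template overloads are preferred over the catch-all
template on an exact match, adding a printer for a new type only
requires one more const-reference overload.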

Regards,
Frank

-- 
Dipl.-Math. Frank Heckenbach <f.heckenb...@fh-soft.de>
Systems Programming, Software Development, IT Consulting

Attachment: bison-c++17.tar.gz
Description: application/gzip
