Re: Bison lexer

Hans Åberg Fri, 31 Aug 2018 14:54:02 -0700


> On 31 Aug 2018, at 22:26, Frank Heckenbach <f.heckenb...@fh-soft.de> wrote:
> 
> Hans Åberg wrote:
> 
>>> For a start, I didn't have very good experience communicating with
>>> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
>>> etc. in the generated code, so over the years I'd been adjusting
>>> various warning-suppression gcc options or doing dirty #define
>>> tricks to avoid warnings, or sometimes even post-processing the
>>> generated lexer with sed.
>> 
>> GCC 8.2 uses C17 as default.
> 
> I haven't used gcc-8 yet, but how is this relevant? If anything, I
> expect newer gcc versions to produce more warnings (usually useful)
> which flex might also suffer from.


Maybe the Flex lexer's errors are due to compiling it as C89 or something.

>>> But the final straw was when, after changing to C++ Bison, I wanted
>>> to switch to C++ Flex too and found this beautiful comment:
>>> 
>>>   /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>>>    * following macro. This is required in order to pass the c++-multiple-scanners
>>>    * test in the regression suite. We get reports that it breaks inheritance.
>>>    * We will address this in a future release of flex, or omit the C++ scanner
>>>    * altogether. */
>> 
>> It has been like that since the 1990s, I believe.
> 
> Even better! :(
> 
> Especially since C++ in the 1990s was totally different from modern
> C++, so I have no idea if anything of this comment is still
> relevant, or maybe even more relevant, today compared to then.

Indeed, very old.

> Lesson (as if anyone was listening): Always put a date on such
> messages.

Probably just a hack, never actually developed.

>>> So I wrote a small library that builds that massive RE out of single
>>> rules and maps subexpressions back to rules (even in the case that
>>> rules contain subexpressions of their own), and that works for me.
>> 
>> I did that, too: I wrote some DFA/NFA code, and incidentally found
>> the most efficient method to make action matches via a reverse NFA
>> lookup, cf. [1-3]. Also, I have made UTF-8/32 to octet character
>> class translations.
>> 
>> 1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
>> 2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
>> 3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
> 
> Interesting, thanks. Fortunately, my REs are not so complex, so the
> bug you reported won't affect me and lexing speed is not so
> important for me, so (at least for now) I can just use the library
> as is. But if I ever need something more sophisticated, I'll keep
> this in mind.

If that is what you are using, note that it is recursive, so the function stack 
might overflow. But perhaps they will rewrite it someday.
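
For readers following along, here is a minimal sketch of the combined-RE idea 
described above: wrap each rule in its own capture group, join the groups with 
'|', and record which group number opens each rule so that a match can be 
mapped back to its rule even when rules contain subexpressions of their own. 
It assumes std::regex with the default ECMAScript syntax; Rule, CombinedLexer 
and tokenize are hypothetical names for illustration, not taken from Frank's 
library. Note that ECMAScript alternation picks the first alternative that 
matches, not the longest one as Flex does, so rule order matters here, and 
back-references inside rules would need renumbering.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical rule type: a token name plus an ECMAScript regular expression.
struct Rule {
    std::string name;
    std::string pattern;
};

// Hypothetical helper: builds one alternation out of all rules and remembers
// the number of the capture group that wraps each rule.
class CombinedLexer {
public:
    explicit CombinedLexer(std::vector<Rule> rules) : rules_(std::move(rules)) {
        std::string combined;
        std::size_t next_group = 1;  // group 0 is the whole match
        for (const Rule& r : rules_) {
            if (!combined.empty()) combined += '|';
            combined += "(" + r.pattern + ")";
            first_group_.push_back(next_group);
            // Skip the wrapper group plus the rule's own capture groups.
            next_group += 1 + std::regex(r.pattern).mark_count();
        }
        re_.assign(combined);
    }

    // Tries to match at input[pos]. Returns the rule index and sets `length`,
    // or returns -1 if no rule matches at that position.
    int match(const std::string& input, std::size_t pos, std::size_t& length) const {
        assert(pos <= input.size());
        std::smatch m;
        auto first = input.cbegin() + static_cast<std::ptrdiff_t>(pos);
        if (!std::regex_search(first, input.cend(), m, re_,
                               std::regex_constants::match_continuous))
            return -1;
        for (std::size_t i = 0; i < rules_.size(); ++i) {
            if (m[first_group_[i]].matched) {
                length = static_cast<std::size_t>(m.length(0));
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    const Rule& rule(int index) const {
        return rules_[static_cast<std::size_t>(index)];
    }

private:
    std::vector<Rule> rules_;
    std::vector<std::size_t> first_group_;
    std::regex re_;
};

// Splits `input` into (rule name, lexeme) pairs; throws if nothing matches
// or if a rule matches the empty string (which would loop forever).
std::vector<std::pair<std::string, std::string>>
tokenize(const CombinedLexer& lexer, const std::string& input) {
    std::vector<std::pair<std::string, std::string>> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        std::size_t length = 0;
        int index = lexer.match(input, pos, length);
        if (index < 0 || length == 0)
            throw std::runtime_error("no rule matches at offset " + std::to_string(pos));
        tokens.emplace_back(lexer.rule(index).name, input.substr(pos, length));
        pos += length;
    }
    return tokens;
}
```

With a rule list such as number = [0-9]+(\.[0-9]+)?, ident = 
[A-Za-z_][A-Za-z0-9_]*, space = [ \t]+ and op = [+*/=], the number rule's inner 
group takes group 2, so ident is found at group 3 rather than group 2.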
