Re: RFC: custom error messages

Akim Demaille Tue, 14 Jan 2020 12:25:43 -0800

Hi Christian,

Thanks for your answer!  I definitely need feedback on these matters.


> On 14 Jan 2020, at 14:50, Christian Schoenebeck <[email protected]> wrote:
> 
>> I wouldn't call memory management in yacc.c "easy": lots of efforts
>> are made to allocate on the call stack, and to avoid malloc.
> 
> For such a new API function it would be easier, e.g. because you don't have to
> take care about breaking existing users' code. So you could e.g. add an
> optional allocator argument (where NULL would use Bison's default allocation)
> for this new API function, or a macro that could be redefined by users, etc.
> So there would be a bunch of options to address this concern.

I'm looking for answers that apply in all the backends, so I'm trying
to stay away from macros.  C++ developers also tend to despise them.

The current state of my draft looks like this:

/* Put in YYARG at most YYARGN of the expected tokens given
   the current YYCTX, and return the number of tokens stored
   in YYARG.
   If YYARG is null, return the number of expected tokens.  */
static int
yyexpected_tokens (const yyparse_context_t *yyctx,
                   int yyarg[], int yyargn);

WDYT?  It should satisfy those who are ready to pay for a dynamic
allocation, and satisfy those who want a bounded one, at compile
time.  Anyway, it is always bounded by YYNTOKENS.
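
A minimal sketch of the two usage styles this signature is meant to
allow, assuming the contract stated in the draft comment.  The
yyparse_context_t layout and the body of yyexpected_tokens below are
hypothetical stand-ins so that the example compiles on its own; they are
not Bison's implementation.  The helpers expected_bounded and
expected_dynamic and the constant ARGMAX are likewise invented for
illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the draft context type: here it merely
   lists the expected token numbers.  Not Bison's real structure.  */
typedef struct
{
  const int *tokens;
  int ntokens;
} yyparse_context_t;

/* Mock following the draft's contract: store at most YYARGN tokens in
   YYARG and return how many were stored; if YYARG is null, return the
   number of expected tokens.  */
static int
yyexpected_tokens (const yyparse_context_t *yyctx,
                   int yyarg[], int yyargn)
{
  assert (yyctx && yyargn >= 0);
  if (!yyarg)
    return yyctx->ntokens;
  int n = yyctx->ntokens < yyargn ? yyctx->ntokens : yyargn;
  for (int i = 0; i < n; ++i)
    yyarg[i] = yyctx->tokens[i];
  return n;
}

/* Bounded use: a fixed-size buffer chosen at compile time, typically
   on the caller's stack, and no malloc.  */
enum { ARGMAX = 4 };

static int
expected_bounded (const yyparse_context_t *ctx, int out[ARGMAX])
{
  return yyexpected_tokens (ctx, out, ARGMAX);
}

/* Dynamic use: ask for the count first, then allocate exactly that.
   The caller frees *OUT.  Returns -1 on allocation failure.  */
static int
expected_dynamic (const yyparse_context_t *ctx, int **out)
{
  int n = yyexpected_tokens (ctx, NULL, 0);
  *out = malloc ((n ? n : 1) * sizeof **out);
  if (!*out)
    return -1;
  return yyexpected_tokens (ctx, *out, n);
}
```

With six expected tokens in the context, the bounded helper stores the
first ARGMAX of them, while the dynamic helper returns all six at the
cost of one allocation.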




>> I do not plan to expose an enum for symbol numbers.  What value would
>> it bring to give name to these numbers?
> 
> To make it clear which internal numbers are actually reflecting user's rules/

Please, do not use "rule" when you mean "nonterminal symbol".  It
seems to confuse some people.

> tokens. Once in a while I am also fooled by looking up the wrong symbol names 
> when reading the generated parser code manually.

Currently you can hardly get it wrong: only the external names
are defined :)  Some tokens don't have a clear identifier that
could be used for the internal representation: think about 'x' or
"XXX" when not attached to a token symbol.

Not to mention that nonterminal symbols can have '.' and '-' in
their names.  And maybe others, that don't immediately map to
a C identifier.
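
To make the mapping problem concrete, here is a small sketch that checks
whether a symbol's spelling could be used directly as a C identifier.
The function is_c_identifier is a hypothetical helper written for this
illustration, not something Bison provides, and the symbol names in the
usage note are made up.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* True if NAME is nonempty and spelled like a C identifier: a letter
   or underscore, then letters, digits or underscores.  Keywords are
   not rejected; this is only a spelling check.  */
static bool
is_c_identifier (const char *name)
{
  assert (name);
  if (!(isalpha ((unsigned char) *name) || *name == '_'))
    return false;
  for (const char *p = name + 1; *p; ++p)
    if (!(isalnum ((unsigned char) *p) || *p == '_'))
      return false;
  return true;
}
```

A token named NUM passes, whereas 'x', "XXX", and nonterminals such as
exp.1 or exp-list all fail, so an enum of internal symbol numbers would
need some renaming scheme for them.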

So, unless there's a clear need, I don't plan to ask for trouble.


>> I fully subscribe to this view, but string literals are definitely not
>> the way to go.  So a few months ago I realized that what we really need
>> to do is to merge Joel E. Denny's PhD into Bison
>> (https://tigerprints.clemson.edu/all_dissertations/519/).
>> 
>> _That's_ the real way forward.  That's Bison 4.
> 
> So IMO all other issues discussed here so far would be in the shadow of this 
> major new feature anyway.

Absolutely.  Joel did wonderful work.  And it's a pity that it
went unnoticed for so long.


> Do you plan to "merge" the high level (user visible) aspects of this built-in 
> scanner support feature "as-is", or have you already ideas about adjusting 
> certain high-level aspects (if that's not too early to discuss at all)?

So far, I don't see things I would change.  I have not had the
chance to toy with his version of Bison, and there are legal
matters to settle.  Also, his schedule is very tight.  But Joel
and I both agree the merging should happen.  It is far too soon
for the details though.

That being said, if you have comments in this regard, you're most
welcome to shoot!


> Last question: I noticed you mentioned it was already hard enough to test 
> Bison code right now. Would it make sense to establish some kind of well 
> defined, distributed test case mechanism for upstream projects? I mean in the 
> sense that upstream projects using Bison would write test cases for their own 
> specific use cases of Bison by using some kind of defined interface for you 
> to automatically grab, compile and execute them?

That would be very helpful, indeed.  But in a way, that's the point
of the betas...  It's not automated, but, well, I do count on you
guys to really give a shot at the betas and report your mileage.

In this regard Frank's feedback is extremely valuable.  Even when
his answer is just "runs fine with my stuff": at least I know someone
else tried, and it gives me _some_ confidence in the release.

If you look at the past, it's clear that QA is insufficient for
Bison:

- v2.7.90 v2.7.91 v3.0 v3.0.1 v3.0.2 v3.0.3 v3.0.4 v3.0.5
- v3.1
- v3.1.90 v3.1.91 v3.2 v3.2.1 v3.2.2 v3.2.3 v3.2.4
- v3.2.90 v3.2.91 v3.3 v3.3.1 v3.3.2
- v3.3.90 v3.3.91 v3.4 v3.4.1 v3.4.2
- v3.4.90 v3.4.91 v3.4.92 v3.5

In spite of the two betas of v3.2, v3.3 and v3.4, bug fix releases
were needed.  And I have 3.5.1 ready to be released, although there
were three betas.

I don't know how to improve this.


However, when I was referring to the complexity of testing Bison,
I was thinking about the test suite: I try to always create
test cases for the regressions we discover, and for the features
I add.  But it's difficult, and hard to predict which combination
of features would break.  And it's impossible to try all the possible
combinations.

I'm about to pass a batch of commits about the test suite.  All
these changes were prompted by the test suite for parse.error
custom.  So far, it has already taken more of my time than the feature
itself.

Which is gaining maturity though.  And I think I will soon be able
to submit it here, and not just as a PR (which is still waiting
for comments: https://github.com/akimd/bison/pull/16  ;-).

Cheers!
