On Wednesday, 10 June 2020 07:29:33 CEST Akim Demaille wrote:
> > So my code looks at the individual grammar rules to decide whether the
> > corresponding symbol should be interpreted as terminal or non-terminal
> > accordingly.
> >
> > Then the other thing is that my auto completion code actually auto
> > completes as much as possible, not just the very next token. So if you
> > have a highly redundant language (as is often the case with human
> > readable protocols), then it would auto complete several tokens up to
> > the point where an actual input decision would have to be made. For
> > instance, say you were writing an SQL parser and the user typed
> > ('^' illustrating the cursor position):
> >
> >   CREATE TABLE foo (id bigint UNIQUE U^
> >
> > then it would auto complete it to:
> >
> >   CREATE TABLE foo (id bigint UNIQUE USING INDEX TABLESPACE ^
>
> Ain't the two features the same one? I mean, the completion of U as USING
> could be a repeated concatenation of S I N G.
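To make the "complete as much as possible" behavior above concrete, here is a minimal sketch of the idea. Everything in it is made up for illustration: the real implementation would query the generated LALR tables, whereas the hypothetical `nextTokens()` oracle here is just a hard-coded toy model of the SQL fragment from the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical oracle: for a given sequence of already accepted tokens,
// return all terminals the grammar would allow next. A real implementation
// would derive this from the generated parser tables; this toy table only
// models the SQL fragment from the example above.
static std::vector<std::string>
nextTokens(const std::vector<std::string>& prefix) {
    static const std::map<std::string, std::vector<std::string>> table = {
        { "CREATE TABLE foo ( id bigint UNIQUE",              { "USING" } },
        { "CREATE TABLE foo ( id bigint UNIQUE USING",        { "INDEX" } },
        { "CREATE TABLE foo ( id bigint UNIQUE USING INDEX",  { "TABLESPACE" } },
        // after TABLESPACE the user must decide (tablespace name) -> no entry
    };
    std::string key;
    for (const std::string& t : prefix) {
        if (!key.empty()) key += ' ';
        key += t;
    }
    auto it = table.find(key);
    return it != table.end() ? it->second : std::vector<std::string>{};
}

// Greedy multi-token completion: first resolve the partially typed word,
// then keep appending tokens as long as the grammar allows exactly one
// continuation, i.e. up to the point where an actual input decision
// would have to be made.
std::vector<std::string>
autoComplete(std::vector<std::string> tokens, const std::string& partial) {
    std::vector<std::string> cand;
    for (const std::string& t : nextTokens(tokens))
        if (t.compare(0, partial.size(), partial) == 0) cand.push_back(t);
    if (cand.size() != 1) return tokens;  // ambiguous: nothing to complete
    tokens.push_back(cand[0]);
    for (;;) {
        std::vector<std::string> next = nextTokens(tokens);
        if (next.size() != 1) break;      // decision point reached
        tokens.push_back(next[0]);
    }
    return tokens;
}
```

So the answer to "is it just repeated single-token completion?" is essentially yes for the final completion action itself; the loop above is exactly that repetition, stopping as soon as more than one continuation is possible.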
Depends. If you just look at it as the user clicking a button that fills the command line, then yes, you are right. In practice, though, an application most certainly also wants to know more details, i.e. what is a non-terminal, what is a terminal, and where one particular terminal starts and stops in the raw character sequence. Sticking to the auto completion example:

a) In a modern UI you probably want to show a dropdown list with the next tokens in real time, e.g. while typing, probably also with different visual markup for terminals vs. non-terminals in that list, and the app probably also wants to filter out some of the non-terminals like <newline>, <space>, <eof> or <string> from that list.

b) When showing the next possible options to the user in real time (unlike the actual final auto completion action), you probably don't want to concatenate each option in that list to its *full* possible extent, but instead e.g. limit each to only the one next token (depending on the situation), as the number of next options can be quite large and things can become very slow otherwise (due to potential polynomial complexity).

c) For language keywords the application probably wants to handle obvious auto corrections automatically, e.g.:

  select * from foo;  ->  SELECT * FROM foo;
  Select * From foo;  ->  SELECT * FROM foo;

such that e.g. a developer does not have to worry about case sensitivity aspects, hence avoiding bloated grammar definitions. Or yet other things to correct automatically, like:

  SELECT * FROM Main Table;  ->  SELECT * FROM 'Main Table';

> > And BTW this was also the 1st compilation issue with a recent Bison
> > version in more than 6 years. That's quite good I would say, so also
> > thanks for taking care of not breaking things! :)
>
> Nothing *public* is expected to break :)

Mhm, with many other projects it is actually quite normal that things break in every n-th public release.
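The keyword case auto-correction from point c) could be sketched roughly like this. The keyword set and the function name are invented for illustration; a real implementation would take the terminal names from the grammar instead of hard-coding them, and would have to skip string literals and quoted identifiers, which this naive word-by-word version ignores.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Toy keyword set; a real implementation would obtain the keyword
// terminals from the grammar definition instead of hard-coding them.
static const std::set<std::string> keywords = {
    "SELECT", "FROM", "WHERE", "CREATE", "TABLE", "UNIQUE"
};

// Uppercase every word that matches a known keyword case-insensitively,
// leave everything else (identifiers, operators) untouched.
// Naive sketch: does not handle string literals or quoted identifiers.
std::string fixKeywordCase(const std::string& line) {
    std::istringstream in(line);
    std::string word, out;
    while (in >> word) {
        std::string upper = word;
        for (char& c : upper)
            c = (char) std::toupper((unsigned char) c);
        if (!out.empty()) out += ' ';
        out += keywords.count(upper) ? upper : word;
    }
    return out;
}
```

This keeps the grammar itself clean: it only ever needs to define the canonical uppercase keywords, and the correction layer maps user input onto them before (or while) feeding the parser.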
So having such long API stability (where I'm even using not-so-'public' parts of the skeleton) is definitely not standard, and I appreciate that!

> >> I'm curious: why don't you require a modern Bison, *and ship the
> >> generated files*? Waiting for the end users to have installed recent
> >> versions of the generators does not buy you a lot of freedom, and
> >> forces you to uglify your parser.
> >
> > Reason for still supporting Bison 2: the license. Remember I also use
> > this for commercial projects.
>
> Given that the GPL does not apply to the generated parser, I don't see
> what worries you here.

First of all, it does not reflect my personal legal opinion. If you are in the supply chain of other companies, you simply have to adapt to their requirements. AFAICS there are two issues:

There are use cases where the language needs to be adjustable / extensible, which requires shipping a parser generator along.

Probably the most relevant issue though: many companies simply have a strict policy that GPLv3 components must not be used. Period. Now you might say this does not make sense in this or that case, but who would evaluate that? Individual developers themselves, probably including apprentices or freelancers? People from the legal department, who mostly cannot read code at all? And when would they do that, for every commit?

> > About shipping pregenerated parsers: I already do, for release tarballs
> > that is (not for development versions though, which these reports were
> > about).
> >
> > Actually most reports about parser related compilation errors were
> > always about users compiling a pregenerated one (i.e. release tarballs),
>
> I don't understand this. If you released a pregenerated parser, it
> obviously works, you wouldn't have released otherwise. So what kind of
> failure can users find that would be fixed by regenerating?
>
> I can see one scenario: the tarball is old and newer compilers generate
> new warnings.
> In which case regenerating with a more recent Bison would probably
> address the issue.
>
> But I sense you are referring to something different.

I really meant compiler errors, not warnings. I could not tell you the details of these issues right now, as I often don't get detailed information about the precise circumstances that led to these compilation errors (e.g. precise versions of compiler, flex, bison, ... possibly even local source changes, who knows). I usually don't ask though, because the fix is so easy: just regenerate.

Best regards,
Christian Schoenebeck