On Wednesday, 10 June 2020 07:29:33 CEST Akim Demaille wrote:
> > So my code looks at the individual grammar rules to decide whether the
> > corresponding symbol should be interpreted as terminal or non-terminal
> > accordingly.
> >
> > Then the other thing is that my auto completion code actually auto
> > completes as much as possible, not just the very next token. So if you
> > have a highly redundant language (as is often the case with human
> > readable protocols), then it would auto complete several tokens up to
> > the point where an actual input decision would have to be made. For
> > instance, say you were writing an SQL parser and the user typed
> > ('^' illustrating the cursor position):
> >
> >   CREATE TABLE foo (id bigint UNIQUE U^
> >
> > then it would auto complete it to:
> >
> >   CREATE TABLE foo (id bigint UNIQUE USING INDEX TABLESPACE ^
>
> Ain't the two features the same one? I mean, the completion of U as USING
> could be a repeated concatenation of S I N G.
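To make the "complete as much as possible" behavior above concrete, here is a minimal sketch of the idea. Everything in it is made up for illustration: the real implementation would query the generated LALR tables, whereas the hypothetical `nextTokens()` oracle here is just a hard-coded toy model of the SQL fragment from the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical oracle: for a given sequence of already accepted tokens,
// return all terminals the grammar would allow next. A real implementation
// would derive this from the generated parser tables; this toy table only
// models the SQL fragment from the example above.
static std::vector<std::string>
nextTokens(const std::vector<std::string>& prefix) {
    static const std::map<std::string, std::vector<std::string>> table = {
        { "CREATE TABLE foo ( id bigint UNIQUE",              { "USING" } },
        { "CREATE TABLE foo ( id bigint UNIQUE USING",        { "INDEX" } },
        { "CREATE TABLE foo ( id bigint UNIQUE USING INDEX",  { "TABLESPACE" } },
        // after TABLESPACE the user must decide (tablespace name) -> no entry
    };
    std::string key;
    for (const std::string& t : prefix) {
        if (!key.empty()) key += ' ';
        key += t;
    }
    auto it = table.find(key);
    return it != table.end() ? it->second : std::vector<std::string>{};
}

// Greedy multi-token completion: first resolve the partially typed word,
// then keep appending tokens as long as the grammar allows exactly one
// continuation, i.e. up to the point where an actual input decision
// would have to be made.
std::vector<std::string>
autoComplete(std::vector<std::string> tokens, const std::string& partial) {
    std::vector<std::string> cand;
    for (const std::string& t : nextTokens(tokens))
        if (t.compare(0, partial.size(), partial) == 0) cand.push_back(t);
    if (cand.size() != 1) return tokens;  // ambiguous: nothing to complete
    tokens.push_back(cand[0]);
    for (;;) {
        std::vector<std::string> next = nextTokens(tokens);
        if (next.size() != 1) break;      // decision point reached
        tokens.push_back(next[0]);
    }
    return tokens;
}
```

So the answer to "is it just repeated single-token completion?" is essentially yes for the final completion action itself; the loop above is exactly that repetition, stopping as soon as more than one continuation is possible.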
Depends. If you just look at it as the user clicking a button that fills the command line, then yes, you are right. In practice, though, an application most certainly also wants to know more details, i.e. what is a non-terminal, what is a terminal, and where one particular terminal starts and stops in the raw character sequence. Sticking to the auto completion example:

a) In a modern UI you probably want to show a dropdown list with the next tokens in real time, e.g. while typing, probably also with different visual markup for terminals vs. non-terminals in that list, and the app probably also wants to filter out some of the non-terminals like <newline>, <space>, <eof> or <string> from that list.

b) When showing the next possible options to the user in real time (unlike the actual final auto completion action), you probably don't want to concatenate each option in that list to its *full* possible extent, but instead e.g. limit each to only the one next token (depending on the situation), as the number of next options can be quite large and things can become very slow otherwise (due to potential polynomial complexity).

c) For language keywords the application probably wants to handle obvious auto corrections automatically, e.g.:

  select * from foo;  ->  SELECT * FROM foo;
  Select * From foo;  ->  SELECT * FROM foo;

such that e.g. a developer does not have to worry about case sensitivity aspects, hence avoiding bloated grammar definitions. Or yet other things to correct automatically, like:

  SELECT * FROM Main Table;  ->  SELECT * FROM 'Main Table';

> > And BTW this was also the 1st compilation issue with a recent Bison
> > version in more than 6 years. That's quite good I would say, so also
> > thanks for taking care of not breaking things! :)
>
> Nothing *public* is expected to break :)

Mhm, with many other projects it is actually quite normal that things break in every n-th public release.
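The keyword case auto-correction from point c) could be sketched roughly like this. The keyword set and the function name are invented for illustration; a real implementation would take the terminal names from the grammar instead of hard-coding them, and would have to skip string literals and quoted identifiers, which this naive word-by-word version ignores.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Toy keyword set; a real implementation would obtain the keyword
// terminals from the grammar definition instead of hard-coding them.
static const std::set<std::string> keywords = {
    "SELECT", "FROM", "WHERE", "CREATE", "TABLE", "UNIQUE"
};

// Uppercase every word that matches a known keyword case-insensitively,
// leave everything else (identifiers, operators) untouched.
// Naive sketch: does not handle string literals or quoted identifiers.
std::string fixKeywordCase(const std::string& line) {
    std::istringstream in(line);
    std::string word, out;
    while (in >> word) {
        std::string upper = word;
        for (char& c : upper)
            c = (char) std::toupper((unsigned char) c);
        if (!out.empty()) out += ' ';
        out += keywords.count(upper) ? upper : word;
    }
    return out;
}
```

This keeps the grammar itself clean: it only ever needs to define the canonical uppercase keywords, and the correction layer maps user input onto them before (or while) feeding the parser.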
So having such long API stability (where I'm even using not-so-'public' parts of the skeleton) is definitely not standard, and I appreciate that!

> >> I'm curious: why don't you require a modern Bison, *and ship the
> >> generated files*? Waiting for the end users to have installed recent
> >> versions of the generators does not buy you a lot of freedom, and
> >> forces you to uglify your parser.
> >
> > Reason for still supporting Bison 2: the license. Remember I also use
> > this for commercial projects.
>
> Given that the GPL does not apply to the generated parser, I don't see
> what worries you here.

First of all, it does not reflect my personal legal opinion. If you are in the supply chain of other companies, you simply have to adapt to their requirements. AFAICS there are two issues:

There are use cases where the language needs to be adjustable / extensible, which requires shipping a parser generator along.

Probably the most relevant issue though: many companies simply have a strict policy that GPLv3 components must not be used. Period. Now you might say this does not make sense in this or that case, but who would evaluate that? Individual developers themselves, probably including apprentices or freelancers? People from the legal department, who mostly cannot read code at all? And when would they do that, for every commit?

> > About shipping pregenerated parsers: I already do, for release tarballs
> > that is (not for development versions though, which these reports were
> > about).
> >
> > Actually most reports about parser related compilation errors were
> > always about users compiling a pregenerated one (i.e. release tarballs),
>
> I don't understand this. If you released a pregenerated parser, it
> obviously works, you wouldn't have released otherwise. So what kind of
> failure can users find that would be fixed by regenerating?
>
> I can see one scenario: the tarball is old and newer compilers generate
> new warnings.
> In which case regenerating with a more recent Bison would probably
> address the issue.
>
> But I sense you are referring to something different.

I really meant compiler errors, not warnings. I could not tell you the details of these issues right now, as I often don't get detailed information about the precise circumstances that led to these compilation errors (e.g. precise versions of compiler, flex, bison, ... possibly even local source changes, who knows). I usually don't ask though, because the fix is so easy: just regenerate.

Best regards,
Christian Schoenebeck