On Monday, 20 March 2023 at 00:15 +0100, David Kastrup wrote:
> Jean Abou Samra <j...@abou-samra.fr> writes:
>
> > On Sunday, 19 March 2023 at 17:51 +0100, David Kastrup wrote:
> >
> > > So how to better involve others? The parser may be one of those
> > > areas with an awful amount of shoestring and glue, namely fiddling
> > > around until things happen to work. All that fiddling happens in
> > > private before commits end up in master, meaning that it has no
> > > opportunity to end up contagious the way it happens now.
> > >
> > > That's not really fabulous regarding the "bus factor" in that area.
> >
> > I would feel a lot more comfortable with modifying the parser if there
> > was an explanation, in code comments or in the CG, of how the
> > parser/lexer interplay works, when lookahead is OK or bad, and how to
> > avoid it when necessary. Things like the comment above MYBACKUP
> >
> > ```
> > // The following are somewhat precarious constructs as they may change
> > // the value of the lookahead token. That implies that the lookahead
> > // token must not yet have made an impact on the state stack other
> > // than causing the reduction of the current rule, or switching the
> > // lookahead token while Bison is mulling it over will cause trouble.
> > ```
> >
> > are obscure to me.
>
> Well, Bison creates LALR(1) parsers. That means that the parser always
> is in a certain state. It looks at the next token, the "lookahead"
> token (only one, that's what the 1 in LALR(1) is about) and then
> transitions into another state while either shifting the current state
> onto some stack, or by using a rule for reducing the current stack into
> a production.
>
> The above comment is fearful about the possibility that the
> state machine processes the current lookahead token without eating it,
> but then gets the lookahead token switched out under its radar and
> ends up in a state that is not able to process the switched-out token.
>
> So far, the fears expressed in that comment have not materialized.
>
> The parser is only able to process a certain subset of languages. Since
> the parser makes deterministic progress by either consuming a lookahead
> token while growing the stack by 1 or by consuming stack material, it
> ends up O(n), namely linear in the size of its input.
>
> When the parser applies a rule, you can specify code that will be
> executed in the reduction.
>
> The MYBACKUP and MYREPARSE stuff messes with the input in order to trigger
> syntactic decisions based on expression values. That's a bit more than
> usually expected from a Bison-generated parser.
Yes, I understand the basic way Bison parsers work. What I don't understand is what other “effects” the lookahead can have, and why having caused the reduction of the current rule is never a problem. AFAIU, the parser works as a loop:

- Get the next token from the lexer.
- Decide whether to shift or to reduce some rule. Use a lookahead token if necessary.
- Do the shift or the reduction and execute the semantic action.

The lookahead token gets switched during the semantic action. Isn't it a problem if the previous lookahead token says the current rule should be reduced, but the new one would have required shifting? Or is that just not a useful use of MYBACKUP/MYREPARSE?
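
To make the question concrete, here is a toy model in Python. It is neither Bison output nor LilyPond's grammar: it is a hand-built SLR(1) table for `E -> E + T | T; T -> T * n | n`, with a hypothetical `swap` hook standing in for a semantic action that replaces the pending lookahead. The names `parse` and `swap` and the state numbers are made up for illustration, and I am only assuming that a MYBACKUP-style action boils down to "replace the token the parser is currently looking at".

```python
# Toy SLR(1) parser for:  r1: E -> E + T   r2: E -> T
#                         r3: T -> T * n   r4: T -> n
# rule number -> (left-hand side, number of symbols on the right-hand side)
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1)}

# (state, lookahead) -> action; missing entries are syntax errors.
ACTION = {
    (0, "n"): ("shift", 3),
    (1, "+"): ("shift", 4), (1, "$"): ("accept", None),
    (2, "*"): ("shift", 5), (2, "+"): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, "+"): ("reduce", 4), (3, "*"): ("reduce", 4), (3, "$"): ("reduce", 4),
    (4, "n"): ("shift", 3),
    (5, "n"): ("shift", 7),
    (6, "*"): ("shift", 5), (6, "+"): ("reduce", 1), (6, "$"): ("reduce", 1),
    (7, "+"): ("reduce", 3), (7, "*"): ("reduce", 3), (7, "$"): ("reduce", 3),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "T"): 6}


def parse(tokens, swap=None):
    """Parse a token list; return (accepted, trace).

    swap is an optional (rule_number, new_token) pair (hypothetical hook):
    the first time that rule is reduced, its "semantic action" replaces
    the pending lookahead token with new_token.
    """
    stream = iter(list(tokens) + ["$"])  # "$" marks end of input
    stack = [0]                          # state stack
    lookahead = next(stream)
    trace = []
    while True:
        action = ACTION.get((stack[-1], lookahead))
        if action is None:
            trace.append(f"error in state {stack[-1]} on {lookahead!r}")
            return False, trace
        kind, arg = action
        if kind == "accept":
            return True, trace
        if kind == "shift":
            stack.append(arg)
            trace.append(f"shift {lookahead!r}")
            lookahead = next(stream)
        else:  # reduce: pop the right-hand side, then follow GOTO
            lhs, length = RULES[arg]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            trace.append(f"reduce r{arg}")
            if swap and swap[0] == arg:
                lookahead = swap[1]  # the action switches the lookahead
                swap = None


# Harmless: state 3 reduces T -> n on every lookahead, so "+" had no other impact.
print(parse("n + n + n".split(), swap=(4, "*")))
# Trouble: E -> E + T was reduced *because* the lookahead was "+".
print(parse("n + n + n".split(), swap=(1, "*")))
```

In this model, switching the lookahead inside `T -> n` works, because the old token did nothing except trigger a reduction that would have happened anyway. Switching it from `+` to `*` inside `E -> E + T` fails: with `*` as the original lookahead the parser would have shifted, but after the reduction it sits in state 1, which has no action for `*`. Whether the MYBACKUP/MYREPARSE uses in the real grammar can ever end up in the second situation is what I am asking.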