On Wed, May 13, 2015 at 7:56 AM, Keean Schupke <ke...@fry-it.com> wrote:
> The problem I can see for performance is that you want to statically > generate and resolve the AST, so that: > > 'a', 'b', 'c' | 'a', 'b', 'd' > In LR, that would be a standard left-factoring problem. I agree with you that this is a case where back-tracking recursive descent (BTRD) tends to work poorly. Or rather: it's a place where programmers tend to *use* BTRD poorly, in that they fail to hand-implement the appropriate left factoring. Though it should be noted that (a) these cases are rare in real grammars (b) these cases are *dynamically* rare in real programs, (c) memoizing parse trees may be a sufficient solution in practice, and therefore (d) we should favor expressive clarity over performance. The nasty case is when these arise recursively, as in: p1 -> a b c // p1 is your example p1 -> a b d p2 -> p1 d e // yet more backtracking over the input range p2 -> p1 d f the problem here is that a reduction of p1 or p2 is provisional until we know that backtracking will not "undo" that reduction. If the parser is built using combinators, there doesn't seem to be any obvious way to know when a reduction may be hazarded by backtracking. Or at least I'm not seeing it. Once you hit the point where the remaining options are "reduce or fail" you can discard the memoized trees, but until you hit that point you have to hang on to them. But I'm wondering whether we may be talking past each other about static generation. Though I spoke of building an AST for the parser and running some CFG-checking algorithm, and that's definitely a static thing, I was *not* contemplating this as a way to execute the parser. I was thinking of it as a thing that could be done in a testing program, after which the parser itself could be run in the usual way using combinators to specify the recursive descent. What you don't want is to loop over the syntax tree at runtime. > Well, you certainly don't want blind looping, but I think you mean backtrack here rather than loop. And it's not so much that you don't want to backtrack as that you don't want to redo the work you have already done when you *do* backtrack. But assuming your parser is functional, it's a performance issue rather than a correctness issue. Hmm. Though it really helps to be able to do some compile-time analysis so that you know when memoized subtress can safely be discarded as unreachable. shap
_______________________________________________ bitc-dev mailing list bitc-dev@coyotos.org http://www.coyotos.org/mailman/listinfo/bitc-dev