On Mon, Apr 16, 2012 at 07:14:19PM +0200, Kaspar Schiess wrote:
>
> Just reviewing your latest set of changes on the arboriculture branch of
> your parslet repository. What you basically propose is a different
> approach to what errors matter and how they should be displayed. Let me
> try to explain to you your own approach and see if I get your idea
> straight. Only then will I try to do a critique of it and then maybe we
> can find convergence.

Hello Kaspar,

great.

> You use the concept of deepest error, which is the error that happened
> at the parse position that was most advanced in the source file. The
> changes you propose would completely remove the stack-trace like
> error-trees and replace them with just one error message that is
> associated with that deepest parse. It would also be mostly associated
> with the most concrete parse at that position: the error would not say
> that a high level rule failed, but that a rule like match() or str() failed.
>
> If the above doesn't capture your idea, consider what is below
> irrelevant to discussion. Feel free to set me right.

You're right. My motivation comes from "end users" complaining about
error messages pointing at the haystack rather than at the needle. I was
already offering some pinpointing by reading the error_tree, but it fell
short because the error_tree felt truncated (hence my opening of
https://github.com/kschiess/parslet/issues/64 ).

> This approach seems to work well with the grammar you use. Have you
> thought about how this generalizes? It seems really easy to construct a
> pathological grammar where the deepest error carries no meaning to the
> user of your language.

Granted, my vision is certainly limited to the walls of my "grammar cubicle".

I have a couple of other grammars I use elsewhere, but they're not as fat as
the one I'm wrestling with now.

> How does the grammar writer know how the deepest error relates to the
> grammar? What should I fiddle with if I know the input is correct but
> the grammar is not? It seems that we have two set of needs here. As a
> grammar writer I want to know how my grammar failed to parse X; as a
> writer of X I might indeed just want to know about one position to
> twiddle. The error tree anchors the errors back into the structure of
> the grammar; but it leaves the problem of what to display to the user
> (writer of X) completely unsolved. I know I've gone half way only and
> solved my own problem there. Finally somebody notices.

When developing the grammar, I used the error_tree a lot. Now that I'm
handing grammar, parser and transformer to the users, I need a helpful error
message: they aren't as patient as I am.

The parser I work with most of my time is the Ruby one. Most of the time it's
providing me with a decent error message pointing at my mistake, the rest of
the time

> Another concern I've been having (that you probably didn't think of
> here) is the time parslet is spending in the management of all those
> error objects. Even with efficient GC, constructing all those objects
> takes a lot of time when we probably don't need half of them. Your
> approach doesn't address the problem, it just filters what to keep
> differently.

Exactly.

"half of them": from what my exploration taught me, we could keep the error
with the deepest pos and discard errors with a smaller pos (i.e. never
instantiate them). Granted, some combinations of grammar and source could
still force us to instantiate 90% of the errors, and then the win would be
negligible.

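To make the "keep only the deepest error" idea concrete, here is a minimal
sketch in plain Ruby. It is not parslet code: DeepestErrorTracker, report and
the sample positions are all hypothetical names made up for illustration. The
block passed to report is only called when the failure is deeper than anything
seen so far, so shallower errors are never built.

```ruby
# Hypothetical sketch, not parslet's API: remember only the deepest failure.
class DeepestErrorTracker
  attr_reader :pos, :instantiated

  def initialize
    @pos = -1          # deepest failure position seen so far
    @error = nil       # the single error we keep
    @instantiated = 0  # how many errors were actually built
  end

  # The block builds the (costly) error; it runs only for a new deepest pos.
  def report(pos)
    return if pos <= @pos
    @pos = pos
    @error = yield
    @instantiated += 1
  end

  def message
    @error
  end
end

tracker = DeepestErrorTracker.new
# Made-up failures, in the order a parser might hit them.
[[3, "a"], [1, "b"], [7, "c"], [5, "d"]].each do |pos, expected|
  tracker.report(pos) { "expected #{expected.inspect} at #{pos}" }
end

puts tracker.message       # prints: expected "c" at 7
puts tracker.instantiated  # prints: 2
```

Note the bad case: if the failures arrive at strictly increasing positions,
every one of them is instantiated and nothing is saved.
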
> I am thinking: could we do a first parse for getting just results, and
> once that fails, do a second parse that constructs error information
> using a kind of aggregator? Aggregation could then implement either of
> our ideas about how errors should look like... We might be winning on
> more than one front at once. How does that sound?

I like the idea a lot. It sounds right: keep the happy path lean.
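
As I understand the proposal, it could look roughly like the sketch below. This
is a toy in plain Ruby, not parslet: ToySequence and attempt are hypothetical
names, and the "grammar" is just a fixed sequence of literals. The first pass
collects no error information at all; the second pass runs only after a
failure and feeds an aggregator (here a plain array).

```ruby
# Hypothetical toy parser: matches a fixed sequence of literals.
class ToySequence
  def initialize(*literals)
    @literals = literals
  end

  # Returns the end position on success, nil on failure.
  # `errors` is nil on the lean first pass, so no error text is built.
  def attempt(input, errors = nil)
    pos = 0
    @literals.each do |lit|
      unless input[pos, lit.size] == lit
        errors << "expected #{lit.inspect} at #{pos}" if errors
        return nil
      end
      pos += lit.size
    end
    pos
  end

  def parse(input)
    result = attempt(input)   # pass 1: results only
    return result if result

    errors = []               # pass 2: only reached after a failure
    attempt(input, errors)
    raise ArgumentError, errors.join("; ")
  end
end

puts ToySequence.new("foo", "bar").parse("foobar")  # prints: 6
```

The aggregator is the part that could implement either the error tree or the
deepest-error view; the successful parse never pays for it.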

> We'd finally be comparing different kinds of apples when benchmarking
> against Treetop, at least...
>
> I will now try to hack your grammar to produce better error messages,
> without changing parslet. Just because I think this might be doable ;)
> I'll report back.

Looking forward to the results.

Please take some time to look at the individual commits in my arboriculture
fork: it's not all misguided adventure ;-)

Thanks a ton!

John
