>> This is all the stuff that will almost certainly require separate
>> implementations from the engine's core parser. And maybe
>> that's fine. In my case, I wanted to implement a reflection of
>> our existing parser, because it's guaranteed to track the
>> behavior of SpiderMonkey's parser.
>
> Understood. But shouldn't separate parsers also implement
> the standard parser API? And shouldn't it therefore cover the
> information needed for such common use cases?
The problem is that our parsers don't produce the same AST, and the structure produced by parsing an arbitrary piece of JS _is_ API. Having a standard API requires a standardised AST, which seems unlikely to happen (for reasons laid out many times in the past).

> Browser parsers might then only support a partial profile of
> the full standard API - whatever they can support without
> negative impact on their main usage.

Partial support of an API means your code would have to deal with whatever is missing, and what is missing will vary between browsers.

JSC's parser is constructed in such a way that we could generate a parse tree directly into JS form (it would be a matter of jumping through hoops to create the correct builder). However, doing so probably wouldn't produce a tree you would actually want to manipulate: we drop var declarations (relying on other tracking instead), we don't track token locations beyond what is needed for a few specific cases, some "lists" in the grammar are represented as linked lists, others as arrays, and so on. I have a vague recollection that the SM parser strips out some characters prior to parsing (brendan?) and generates bytecode into the AST (though based on dherman's comments maybe that's no longer the case).

> Though it might not actually cost much to support the additional
> info in SpiderMonkey: most of it could be in the token stream,
> which is usually thrown away, but could be kept via a flag, and
> the AST's source locations can be used to extract segments of
> the token stream (such as any comments preceding a location).

Speaking again for JSC -- we don't have an actual "token stream" from the lexer. The lexer simply walks the input source one token at a time as requested, partly because how we lex is driven by the context and mode of the parser, and partly because creating a distinct token stream is nice in an academic context but would be a huge performance hole in practice.
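To make the "lexing driven by parser context" point concrete, here is a toy sketch (not JSC's actual code, names invented for illustration) of a pull-style lexer: the parser asks for one token at a time and tells the lexer whether a regexp may start here. This is the classic JS ambiguity -- after an identifier, `/` is division, but after `=` it begins a regexp literal -- so the lexer alone cannot decide, and a context-free "token stream" produced up front would get it wrong.

```javascript
// Toy pull-style lexer: tokens are produced on demand, and the lexing of
// `/` depends on a flag the parser supplies from its grammar context.
function makeLexer(source) {
  let pos = 0;
  return {
    // `regexAllowed` comes from the parser: true where an expression may
    // start (after `=`, `(`, `return`, ...), false after an operand.
    next(regexAllowed) {
      while (pos < source.length && source[pos] === " ") pos++;
      if (pos >= source.length) return { type: "eof" };
      const c = source[pos];
      if (c === "/" && regexAllowed) {
        const start = pos++;
        while (pos < source.length && source[pos] !== "/") pos++;
        pos++; // consume closing slash (no escape handling in this toy)
        return { type: "regexp", value: source.slice(start, pos) };
      }
      if (c === "/") { pos++; return { type: "div" }; }
      if (/[A-Za-z_$]/.test(c)) {
        const start = pos;
        while (pos < source.length && /[A-Za-z0-9_$]/.test(source[pos])) pos++;
        return { type: "ident", value: source.slice(start, pos) };
      }
      pos++;
      return { type: "punct", value: c };
    },
  };
}

// The same `/` character lexes differently depending on parser state:
const lex1 = makeLexer("a / b");
lex1.next(true);                    // ident `a`; parser now expects an operator
console.log(lex1.next(false).type); // "div"

const lex2 = makeLexer("x = /b/");
lex2.next(true);                    // ident `x`
lex2.next(false);                   // `=`; parser now expects an expression
console.log(lex2.next(true).type);  // "regexp"
```

A real engine also threads in mode (strict vs sloppy, template-literal state, etc.) the same way, which is why materialising a context-free token stream ahead of parsing doesn't fit this design.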
And the tokens that we do have don't necessarily contain all the information you would want (because each additional write the lexer makes to the token structure is actually measurable in some of our perf tests).

--Oliver

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

