On Wed, Jul 6, 2011 at 12:00 AM, David Herman <[email protected]> wrote:
>> the AST API strawman - given the positive discussions on this list, I
>> thought the idea was implicitly accepted last year, modulo details,
>> so I was surprised not to see a refined strawman promoted.
>
> It hasn't really been championed so far. I was concentrating on other
> proposals for ES.next.
>
>> - it does not support generic traversals, so it definitely needs a
>> pre-implemented traversal, sorting out each type of Node
>> (Array-based ASTs, like the es-lab version, make this slightly
>> easier - Arrays elements are ordered, unlike Object properties);
>
> I designed it to be easily JSON-{de}serializable, so no special prototype.
> However, you can use the builder API to construct your own format:
>
> https://developer.mozilla.org/en/SpiderMonkey/Parser_API#Builder_objects
>
> With a custom builder you can create objects with whatever methods you want,
> and builders for various formats can be shared in libraries.
>
>> at that stage, simple applications (such as tag generation)
>> may be better of working with hooks into the parser, rather
>> than hooks into an AST traversal? also, there is the risk that
>> one pre-implemented traversal might not cover all use cases,
>> in which case the boilerplate tax would have to be paid again;
>
> I don't understand any of this.
>
>> - it is slightly easier to manipulate than an Array-based AST, but
>
> More than slightly, IMO.
>
>> lack of pattern matching fall-through (alternative patterns for
>> destructuring) still hurts, and the selectors are lengthy, which
>> hampers visualization and construction; (this assumes that
>> fp-style AST processing is preferred over oo-style processing)
>
> If I'd defined a new object type with its own prototype, it still wouldn't
> define all operations anyone would ever want. So they'd either have to
> monkey-patch it or it would need a visitor. Which you could write anyway. So
> I don't see much benefit to pre-defining a node prototype.
>
> But again, see the builder API, where you can create your own custom node
> type.
>
>> - it is biased towards evaluation, which is a hindrance for other
>> uses (such as faithful unparsing, for program transformations);
>
> It's just a reflection of the built-in SpiderMonkey parser, which was
> designed for the sole purpose of evaluation. I didn't reimplement a new
> parser.
>
>> this can be seen clearly in Literals, which are evaluated (why
>> not evaluate Object, Array, Function Literals as well? eval should
>> be part of AST processing, not of AST construction), but it also
>> shows in other constructs (comments are not stored at all, and
>> if commas/semicolons are not stored, how does one know
>> where they were located - programmers tend to be picky
>> about their personal or project-wide style guides?);
>
> None of this data is available in a SpiderMonkey parse node.
>
>> - there are some minor oddities, from spelling differences to
>> the spec (Label(l)ed),
>
> Heh, I shouldn't've capitulated to my (excellent and meticulous!) reviewer,
> who was unfamiliar with the spec:
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=533874#c28
>
> I can probably change that.
>
>> to structuring decisions (why separate
>> UpdateExpression and LogicalExpression, when everything
>> else is in UnaryExpression and BinaryExpression?);
>
> I separated update expressions and logical expressions because they have
> different control structure from the other unary and binary operators.
>
>> btw, why alternate/consequent instead of then/else, and
>
> I was avoiding using keywords as property names, and consequent/alternate are
> standard terminology. I suppose .then/.else would be more convenient.
>
>> shouldn't that really be consequent->then and alternate->else
>> instead of the other way round (as the optional null for
>> consequent suggests)?
>
> Doc bug, thanks. Fixed.
>
>> My main issue is unparsing support for program transformations
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=590755
>
>> (though IDEs will similarly need more info, for comment extraction,
>> syntax highlighting, and syntax-based operations).
>
> This is all the stuff that will almost certainly require separate
> implementations from the engine's core parser. And maybe that's fine. In my
> case, I wanted to implement a reflection of our existing parser, because it's
> guaranteed to track the behavior of SpiderMonkey's parser.
>
>> What I did for now was to add a field to each Node, in which I
>> store an unprocessed Array of the sub-ASTs, including tokens.
>> Essentially, the extended AST Nodes provide both abstract info
>> for analysis and evaluation and a structured view of the token
>> stream belonging to each Node, for lower-level needs.
>>
>> Whitespace/comments are stored separately, indexed by the
>> start position of the following token (this is going to work better
>> for comment-before-token that for comment-after-token, but it
>> is a start, for unparsing or comment-extraction tools).
>
> You've lost me again. Are you describing a parser you wrote?
>
>> This allows for a generic traversal of the Array-based unprocessed
>> AST fragments, for unparsing, but I still have to rearrange things
>> so that I can actually store the information I need (can't add info
>> to null as an AST value) and distinguish meta-info ("computed"
>> and "prefix" properties) from sub-ASTs.
>
> I'm still lost.
>
>> Overall, the impression is that this AST was designed by someone
>> resigned to the fact of having to write Node-type-specific traversal
>> code for each purpose, with a limited number of purposes planned
>> (such as evaluation). This could be a burden for other uses of such
>> ASTs (boilerplate tax).
>
> It was designed to be minimal and serializable. It was a lot of code, so I
> figured I would just focus on a) making sure all the data was there and b)
> making it possible to provide a custom data format via the builder API. This
> is what I came up with, but I can revisit the API design if it's useful.
>
>> I hope these notes help - I'd really like to see a standard JS
>> parser API implemented across engines. For language
>> experimentation, we'd still need separate tweakable parsers,
>> but access to the efficient engine parsers for current JS would
>> give tool development a boost.
>
> I'm still not convinced this is such a big win. Reflect.parse gives you
> *some* performance, but it still requires two traversals (one to generate the
> internal C++ JSParseNode tree and then a second to convert that to a JS
> object tree). But part of the benefit is knowing you have exactly the
> SpiderMonkey parser. Once implementors have to write a separate parser, the
> possibility of divergence increases, and the maintenance cost of building a
> second parser in a low-level language is high. At that point, they might just
> want to write it in JS. But anybody could do that.
>
>>> But there are also tough questions about what the parser
>>> should do with engine-specific language extensions.
>>
>> Actually, that starts before the AST: I'd like to see feature-based
>> language versioning, instead of the current monolithic version
>> numbering - take generators as an example feature:
>>
>> Perhaps JS1.7 ("javascript;version=1.7") happens to be the first
>> JS version to support "yield", and is backwards compatible with
>> JS1.5, which might happen to match ES3; and JS1.8.5, which
>> happens to match ES5, might be backwards compatible with
>> JS1.7. But it is unlikely that the JSx which happens to match ES6
>> will be backwards compatible with JS1.7 (while ES5-breaking
>> changes will be limited, replacing experimental JS1.x features
>> with standardized variants is another matter).
>>
>> Whereas, if I was able to specify "use yield", and be similarly
>> selective about other language features, then either of JS1.7,
>> JS1.8.5 and ES6 engines might be able to do the job, depending
>> on what other language features my code depends on. Also,
>> other engines might want to implement some features -like
>> "yield"- selectively, without aiming to support all of JS1.7, and
>> long before being able to support all of ES6.
>
> That's asking for quite a modularized/configurable parser.
>
>>> I agree about the issue of multiple parsers. The reason I
>>> was able to do the SpiderMonkey library fairly easily was
>>> that I simply reflect exactly the parser that exists. But to
>>> have a standards-compliant parser, we'd probably have
>>> to write a separate parser. That's definitely a tall order.
>>
>> It should not be, provided one distinguishes between
>> standards-compliant and production use. If the ES grammar
>> is LR(1), it should really be specified in a parser tool format,
>
> Mainstream production JS engines have moved away from parser generators.
>
>> both for verification and to generate standards-compliant
>> tools to compare against. Depending on how efficient the
>> JS Bison implementation is, this might even lead to useable
>> parser performance.
>
> Again, this could be implemented by anyone as a pure JS library.

FWIW, I've implemented such a library here:
https://github.com/zaach/reflect.js

The grammar is based on the old JavaScriptCore Bison grammar with some tweaks
to make it LALR(1) (Jison doesn't do efficient LR(1) yet.)

>
>> There may be problems in finding a tool that generates all
>> the information needed for a useful AST (source locations,
>> comments, scope info, ..), but we do not need to solve every
>> issue immediately to make progress, right? And if the ES
>> committee were to ask ES parser generator implementors
>> whether their tools could be extended to serve an AST spec,
>> response might be favourable.
>>
>> It would be nice if the spec parser was generated in Javascript,
>> but any tool-usable standard grammar would be useful - once
>> the grammar can be processed by a freely available tool, it can
>> be translated to similar formats, some of which have Javascript
>> implementations (eg Jison, ANTLR).
>>
>> Having played a little with the ANTLRWorks environment, it
>> looks promising, is easy to install (just a .jar), has user-contributed
>> ES grammars, and can spot some ambiguities easily (though
>> I don't think its check is complete, and the ES grammar is too
>> complex to make naïve parse-tree visualization helpful). If other
>> tools have better ES grammar development support, I'd like to
>> hear about them.
>>
>> Without a standard spec-conformant tool-readable grammar,
>> such tools remain of limited use. With a tool-readable grammar,
>> adding AST generation might turn out to be an afternoon's work
>> (followed by years of testing/debugging;-).
>
> A standard, machine-processable grammar would be a nice-to-have. Agreed.
>
> I hate to complain, but can you try to trim your messages? It takes an
> enormous amount of time to read and respond to these huge messages.
>
> https://twitter.com/#!/statpumpkin/status/66187260407709696
>
> Dave
>
> _______________________________________________
> es-discuss mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/es-discuss
>

--
Zach Carter

