The goal is to parse the JS output of the fastcomp LLVM backend into an AST, and then run the optimization passes on that AST.
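To make the "run passes on the AST" step a bit more concrete, here is a minimal C++11 sketch of what a pass over a parsed tree might look like. Everything in it is an assumption made up for illustration: the `Node` struct, the `makeNum`/`makeName`/`makeBinary` helpers, `traversePost`, and the `foldConstantAdds` pass are hypothetical names, not code from the native optimizer, and the real AST there is JSON-based rather than a struct like this.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: a tagged tree loosely modelled on an array-style
// JSON AST such as ["binary", "+", ["num", 1], ["num", 2]].
struct Node {
  std::string type;  // "num", "name", "binary", ...
  std::string text;  // operator or identifier, when relevant
  double value = 0;  // numeric payload for "num" nodes
  std::vector<std::unique_ptr<Node>> children;
};

using NodePtr = std::unique_ptr<Node>;

NodePtr makeNum(double v) {
  NodePtr n(new Node());
  n->type = "num";
  n->value = v;
  return n;
}

NodePtr makeName(const std::string& id) {
  NodePtr n(new Node());
  n->type = "name";
  n->text = id;
  return n;
}

NodePtr makeBinary(const std::string& op, NodePtr left, NodePtr right) {
  NodePtr n(new Node());
  n->type = "binary";
  n->text = op;
  n->children.push_back(std::move(left));
  n->children.push_back(std::move(right));
  return n;
}

// Post-order traversal: visit all children first, then the node itself.
// The visitor receives the owning pointer so it can replace the node.
void traversePost(NodePtr& node, const std::function<void(NodePtr&)>& visit) {
  for (auto& child : node->children) traversePost(child, visit);
  visit(node);
}

// Illustrative pass: fold "num + num" into a single "num" node.
void foldConstantAdds(NodePtr& root) {
  traversePost(root, [](NodePtr& node) {
    if (node->type != "binary" || node->text != "+") return;
    assert(node->children.size() == 2);
    Node* left = node->children[0].get();
    Node* right = node->children[1].get();
    if (left->type == "num" && right->type == "num") {
      // The sum is computed before the old node is released.
      node = makeNum(left->value + right->value);
    }
  });
}
```

For example, building `(1 + 2) + x` with `makeBinary` and calling `foldConstantAdds` on it leaves a "binary" node whose left child is a single "num" node holding 3. The lambda-based visitor is the kind of thing that C++11 makes convenient when porting JS code that leans heavily on closures.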
Thanks for the pointer to TinyJS, it looks interesting!

Ok, at this point I am considering 3 options:

1. Modify the TinyJS parser (already in C++, which is good)
2. Port the Higgs parser from D (the most nicely written code of all the options)
3. Port the Acorn parser from JS

I am leaning towards the last, because it seems the most actively maintained and already supports parsing ES6 (we don't need that immediately, but eventually we might). Also, it is the only one that has focused on parsing speed, as far as I can tell.

- Alon

On Fri, Nov 14, 2014 at 7:44 PM, Marc <[email protected]> wrote:

> This one is not bad:
> https://code.google.com/p/tiny-js/source/browse/trunk/TinyJS.h
>
> There are only two files to include.
>
> The licence is ok (MIT-like).
>
> Which part of the JS files do you want to parse? Is it the generated
> "LLVM as JS" output or any of the libraries you've made (like
> "parseTools.js" or "analyzer.js")?
>
> I've looked a bit at ANTLR but the grammar files for Javascript are
> old.
>
> There is a more "exotic" alternative I can imagine. It is to use this
> Haskell parser:
>
> https://hackage.haskell.org/package/language-javascript
>
> The grammar file is really pretty:
>
> https://github.com/alanz/language-javascript/blob/master/src/Language/JavaScript/Parser/Grammar5.y
>
> I know that GHC generates a kind of C (some "C--") as an intermediate
> code. It may be possible to wrap a function around it.
>
> It's a crazy idea :-)
>
> On Fri, 14 Nov 2014 16:43:55 -0800,
> Alon Zakai <[email protected]> wrote:
>
> > I wasn't familiar with that, thanks. Looks interesting, however the
> > GPL license is a problem as we do want the option to run the parser
> > on the client machine, linked to other code, and this would limit the
> > number of people that would use it.
> >
> > - Alon
> >
> > On Fri, Nov 14, 2014 at 3:04 AM, Marc <[email protected]> wrote:
> >
> > > Do you know this one?
> > > https://github.com/cesanta/v7
> > >
> > > On Thu, 13 Nov 2014 17:19:46 -0800,
> > > Alon Zakai <[email protected]> wrote:
> > >
> > > > Early this year the fastcomp project replaced the core compiler,
> > > > which was written in JS, with an LLVM backend in C++, and that
> > > > brought large compilation speedups. However, the late JS
> > > > optimization passes were still run in JS, which meant optimized
> > > > builds could be slow (in unoptimized builds, we typically don't
> > > > run those JS optimizations). Especially in very large projects,
> > > > this could be annoying.
> > > >
> > > > Progress towards speeding up those JS optimization passes just
> > > > landed on incoming, turned off. This is not yet stable or ready,
> > > > so it is *not* enabled by default. Feel free to test it though
> > > > and report bugs. To use it, build with
> > > >
> > > >     EMCC_NATIVE_OPTIMIZER=1
> > > >
> > > > in the environment, e.g.
> > > >
> > > >     EMCC_NATIVE_OPTIMIZER=1 emcc -O2 tests/hello_world.c
> > > >
> > > > It only matters when building to JS (not when building C++ to
> > > > object/bitcode). When EMCC_DEBUG=1 is used, you should see it
> > > > mention that it uses the native optimizer. The first time you use
> > > > it, it will also say it is compiling it, which can take several
> > > > seconds.
> > > >
> > > > The native optimizer is basically a port of the JS optimizer
> > > > passes from JS into C++11. C++11 features like lambdas made this
> > > > much easier than it would have been otherwise, as the JS code has
> > > > lots of lambdas. The ported code uses the same JSON-based AST,
> > > > implemented in C++.
> > > >
> > > > Using C++11 is a little risky. We build the code natively, using
> > > > clang from fastcomp, but we do use the system C++ standard
> > > > libraries. In principle, if those are not C++11-friendly, problems
> > > > could happen. It seems to work fine where I have tested so far.
> > > >
> > > > Not all passes have been converted, but the main time-consuming
> > > > passes in -O2 have been (eliminator, simplifyExpressions,
> > > > registerize). (Note that in -O3 the registerizeHarder pass has
> > > > *not* yet been converted.) The toolchain can handle running some
> > > > passes in JS and some passes natively, using JSON to serialize
> > > > the AST between them.
> > > >
> > > > Potentially this approach can speed us up very significantly, but
> > > > it isn't quite there yet. JSON parsing/unparsing and running the
> > > > passes themselves can be done natively, and in tests I see that
> > > > part running 4x faster and using about half as much memory.
> > > > However, there is overhead from serializing JSON between native
> > > > and JS, which will remain until 100% of the passes you use are
> > > > native. Also, and more significantly, we do not have a parser
> > > > from JS - the output of fastcomp - to the JSON AST. That means
> > > > that we send fastcomp output into JS to be parsed, it emits JSON,
> > > > and we read that in the native optimizer.
> > > >
> > > > For those reasons, the current speedup is not dramatic. I see
> > > > around a 10% improvement, far from how much we could reach.
> > > >
> > > > Further speedups will happen as the final passes are converted.
> > > > The bigger issue is to write a JS parser in C++ for this. This is
> > > > not trivial, as parsing JS has some corner cases and
> > > > ambiguities. I'm looking into existing code for this, but I'm not
> > > > sure there is anything we can easily use - JS engine parsers are
> > > > in C++ but tend not to be easy to detach. If anyone has good
> > > > ideas here, that would be useful.
> > > >
> > > > - Alon
> > > >
> > >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "emscripten-discuss" group.
> > > To unsubscribe from this group and stop receiving emails from it,
> > > send an email to [email protected].
> > > For more options, visit https://groups.google.com/d/optout.
