Sounds like that would require modifying the JS engine? - Alon
On Tue, Nov 18, 2014 at 5:19 AM, <[email protected]> wrote:

> Could you avoid serialization by modifying the Acorn parser to fill in a C
> struct representation of the AST inside a typed array? You could then save
> the typed array to a file and mmap it into the address space of your C++
> program.
>
> I've had a quick play around with your cpp optimizer. According to
> Valgrind it does seem to be spending a lot of time in malloc/free/fwrite.
>
> Thanks
> Liam Wilson
>
> On Sunday, November 16, 2014 7:02:47 PM UTC, Alon Zakai wrote:
>>
>> The goal is to parse the JS output of the fastcomp LLVM backend. Then we
>> run optimization passes on that AST.
>>
>> Thanks for the TinyJS pointer, looks interesting! Ok, at this point I am
>> considering 3 options:
>>
>> 1. Modify TinyJS parser (already in C++, which is good)
>> 2. Port Higgs parser from D (nicest written code of all the options)
>> 3. Port Acorn parser from JS
>>
>> I am leaning toward the last, because it seems the most active and
>> maintained, and has support for parsing ES6 already (we don't need that
>> immediately, but eventually we might). Also it is the only one that has
>> focused on parsing speed, as far as I can tell.
>>
>> - Alon
>>
>> On Fri, Nov 14, 2014 at 7:44 PM, Marc <[email protected]> wrote:
>>
>>> This one is not bad:
>>> https://code.google.com/p/tiny-js/source/browse/trunk/TinyJS.h
>>>
>>> There are only two files to include.
>>>
>>> The licence is ok (MIT like).
>>>
>>> Which part of the js files do you want to parse? Is it the generated
>>> "LLVM as JS" output or any of the libraries you've made (like
>>> "parseTools.js" or "analyzer.js")?
>>>
>>> I've looked a bit at ANTLR but the grammar files for Javascript are
>>> old.
>>>
>>> There is a more "exotic" alternative I can imagine. It is to use this
>>> Haskell parser:
>>>
>>> https://hackage.haskell.org/package/language-javascript
>>>
>>> The grammar file is really pretty:
>>>
>>> https://github.com/alanz/language-javascript/blob/master/src/Language/JavaScript/Parser/Grammar5.y
>>>
>>> I know that GHC generates a kind of C (some "C--") as an intermediate
>>> code. It may be possible to wrap a function around it.
>>>
>>> It's a crazy idea :-)
>>>
>>> On Fri, 14 Nov 2014 16:43:55 -0800,
>>> Alon Zakai <[email protected]> wrote:
>>>
>>> > I wasn't familiar with that, thanks. Looks interesting, however the
>>> > GPL license is a problem as we do want the option to run the parser
>>> > on the client machine, linked to other code, and this would limit the
>>> > number of people that would use it.
>>> >
>>> > - Alon
>>> >
>>> > On Fri, Nov 14, 2014 at 3:04 AM, Marc <[email protected]> wrote:
>>> >
>>> > > Do you know this one?
>>> > > https://github.com/cesanta/v7
>>> > >
>>> > > On Thu, 13 Nov 2014 17:19:46 -0800,
>>> > > Alon Zakai <[email protected]> wrote:
>>> > >
>>> > > > Early this year the fastcomp project replaced the core compiler,
>>> > > > which was written in JS, with an LLVM backend in C++, and that
>>> > > > brought large compilation speedups. However, the late JS
>>> > > > optimization passes were still run in JS, which meant optimized
>>> > > > builds could be slow (in unoptimized builds, we don't run those
>>> > > > JS optimizations, typically). Especially in very large projects,
>>> > > > this could be annoying.
>>> > > >
>>> > > > Progress towards speeding up those JS optimization passes just
>>> > > > landed on incoming, turned off. This is not yet stable or ready,
>>> > > > so it is *not* enabled by default. Feel free to test it though
>>> > > > and report bugs. To use it, build with
>>> > > >
>>> > > >   EMCC_NATIVE_OPTIMIZER=1
>>> > > >
>>> > > > in the environment, e.g.
>>> > > >
>>> > > >   EMCC_NATIVE_OPTIMIZER=1 emcc -O2 tests/hello_world.c
>>> > > >
>>> > > > It only matters when building to JS (not building C++ to
>>> > > > object/bitcode). When EMCC_DEBUG=1 is used, you should see it
>>> > > > mention it uses the native optimizer. The first time you use it,
>>> > > > it will also say it is compiling it, which can take several
>>> > > > seconds.
>>> > > >
>>> > > > The native optimizer is basically a port of the JS optimizer
>>> > > > passes from JS into C++11. C++11 features like lambdas made this
>>> > > > much easier than it would have been otherwise, as the JS code has
>>> > > > lots of lambdas. The ported code uses the same JSON-based AST,
>>> > > > implemented in C++.
>>> > > >
>>> > > > Using C++11 is a little risky. We build the code natively, using
>>> > > > clang from fastcomp, but we do use the system C++ standard
>>> > > > libraries. In principle if those are not C++11-friendly, problems
>>> > > > could happen. It seems to work fine where I tested so far.
>>> > > >
>>> > > > Not all passes have been converted, but the main time-consuming
>>> > > > passes in -O2 have been (eliminator, simplifyExpressions,
>>> > > > registerize). (Note that in -O3 the registerizeHarder pass has
>>> > > > *not* yet been converted.) The toolchain can handle running some
>>> > > > passes in JS and some passes natively, using JSON to serialize
>>> > > > them.
>>> > > >
>>> > > > Potentially this approach can speed us up very significantly, but
>>> > > > it isn't quite there yet. JSON parsing/unparsing and running the
>>> > > > passes themselves can be done natively, and in tests I see that
>>> > > > running 4x faster, and using about half as much memory. However,
>>> > > > there is overhead from serializing JSON between native and JS,
>>> > > > which will remain until 100% of the passes you use are native.
>>> > > > Also, and more significantly, we do not have a parser from JS -
>>> > > > the output of fastcomp - to the JSON AST. That means that we send
>>> > > > fastcomp output into JS to be parsed, it emits JSON, and we read
>>> > > > that in the native optimizer.
>>> > > >
>>> > > > For those reasons, the current speedup is not dramatic. I see
>>> > > > around a 10% improvement, far from how much we could reach.
>>> > > >
>>> > > > Further speedups will happen as the final passes are converted.
>>> > > > The bigger issue is to write a JS parser in C++ for this. This is
>>> > > > not easy, as parsing JS has some corner cases and ambiguities.
>>> > > > I'm looking into existing code for this, but not sure there is
>>> > > > anything we can easily use - JS engine parsers are in C++ but
>>> > > > tend not to be easy to detach. If anyone has good ideas here
>>> > > > that would be useful.
>>> > > >
>>> > > > - Alon
>>> > > >
>>> > >
>>> > > --
>>> > > You received this message because you are subscribed to the Google
>>> > > Groups "emscripten-discuss" group.
>>> > > To unsubscribe from this group and stop receiving emails from it,
>>> > > send an email to [email protected].
>>> > > For more options, visit https://groups.google.com/d/optout.
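
For reference, the typed-array idea in Liam's message (have the parser fill a C struct representation of the AST inside a typed array, save it to a file, and mmap it from C++) could look roughly like the sketch below on the C++ side. This is only an illustration under stated assumptions, not code from the native optimizer: the `FlatNode` layout, its field names, the `kind` tags, and the helpers `writeNodes`, `mapNodes`, `unmapNodes` and `evaluate` are all hypothetical. `writeNodes` stands in for the JS side, which would write the same bytes from an `Int32Array`. The sketch assumes a POSIX system and that the file is written and read on the same machine, since typed arrays use the platform byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

// Hypothetical fixed-size node record. Children are referenced by index
// into the same array, not by pointer, so the bytes mean the same thing
// in a JS typed array, in a file, and in a C++ mapping.
struct FlatNode {
  int32_t kind;   // hypothetical tag: 0 = number literal, 1 = binary '+'
  int32_t left;   // index of left child, or -1
  int32_t right;  // index of right child, or -1
  int32_t value;  // payload for number literals
};
static_assert(sizeof(FlatNode) == 16, "layout must match the writer");

// Stand-in for the JS side: dump the node array to a file.
bool writeNodes(const char* path, const std::vector<FlatNode>& nodes) {
  FILE* f = std::fopen(path, "wb");
  if (!f) return false;
  size_t n = std::fwrite(nodes.data(), sizeof(FlatNode), nodes.size(), f);
  return std::fclose(f) == 0 && n == nodes.size();
}

// Read-only view of a mapped node file.
struct MappedAst {
  const FlatNode* nodes = nullptr;
  size_t count = 0;
  size_t bytes = 0;
};

// Map the file into memory; returns an empty view on any failure.
MappedAst mapNodes(const char* path) {
  MappedAst ast;
  int fd = open(path, O_RDONLY);
  if (fd < 0) return ast;
  struct stat st;
  if (fstat(fd, &st) == 0 && st.st_size > 0 &&
      static_cast<size_t>(st.st_size) % sizeof(FlatNode) == 0) {
    size_t bytes = static_cast<size_t>(st.st_size);
    void* p = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
      ast.nodes = static_cast<const FlatNode*>(p);
      ast.bytes = bytes;
      ast.count = bytes / sizeof(FlatNode);
    }
  }
  close(fd);  // the mapping stays valid after the descriptor is closed
  return ast;
}

void unmapNodes(MappedAst& ast) {
  if (ast.nodes) munmap(const_cast<FlatNode*>(ast.nodes), ast.bytes);
  ast = MappedAst();
}

// Walk the tree by index; no per-node allocation or parsing is needed.
int32_t evaluate(const MappedAst& ast, int32_t index) {
  assert(index >= 0 && static_cast<size_t>(index) < ast.count);
  const FlatNode& n = ast.nodes[index];
  if (n.kind == 0) return n.value;
  return evaluate(ast, n.left) + evaluate(ast, n.right);
}
```

Because nodes refer to each other by index, the reader does no per-node malloc/free and no JSON parsing; whether that helps in practice depends on how much time the real passes spend in serialization compared with the transformations themselves.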
