Early this year the fastcomp project replaced the core compiler, which was
written in JS, with an LLVM backend in C++, and that brought large
compilation speedups. However, the late JS optimization passes were still
run in JS, which meant optimized builds could still be slow (unoptimized
builds typically skip those JS optimizations). Especially in very large
projects, this could be annoying.

Progress towards speeding up those JS optimization passes just landed on
incoming, turned off by default. This is not yet stable or ready, so it is
*not* enabled unless you opt in. Feel free to test it though and report
bugs. To use it, build with

EMCC_NATIVE_OPTIMIZER=1

in the environment, e.g.

EMCC_NATIVE_OPTIMIZER=1 emcc -O2 tests/hello_world.c

It only matters when compiling to JS (not when compiling C++ to
object/bitcode). With EMCC_DEBUG=1 set, you should see a message saying the
native optimizer is being used. The first time you use it, it will also say
it is compiling the optimizer itself, which can take several seconds.

The native optimizer is basically a port of the JS optimizer passes from JS
to C++11. C++11 features like lambdas made the port much easier than it
would have been otherwise, as the JS code uses lots of lambdas. The ported
code uses the same JSON-based AST, implemented in C++.

Using C++11 is a little risky. We build the optimizer natively, using clang
from fastcomp, but we do use the system C++ standard library. In principle,
if that library is not C++11-friendly, problems could happen. It seems to
work fine everywhere I have tested so far.

Not all passes have been converted, but the main time-consuming passes in
-O2 have been (eliminator, simplifyExpressions, registerize). (Note that in
-O3 the registerizeHarder pass has *not* yet been converted.) The toolchain
can handle running some passes in JS and some passes natively, using JSON
to serialize the AST between them.

Potentially this approach can speed us up very significantly, but it isn't
quite there yet. JSON parsing/unparsing and running the passes themselves
can be done natively, and in tests I see that running 4x faster and using
about half as much memory. However, there is overhead from serializing JSON
between native code and JS, which will remain until 100% of the passes you
use are native. Also, and more significantly, we do not yet have a parser
from JS - the output of fastcomp - to the JSON AST. That means we send
fastcomp's output into JS to be parsed, JS emits JSON, and the native
optimizer reads that in.

For those reasons, the current speedup is not dramatic. I see around a 10%
improvement, far from what we could eventually reach.

Further speedups will happen as the remaining passes are converted. The
bigger task is to write a JS parser in C++ for this. That is not trivial -
parsing JS has some corner cases and ambiguities. I'm looking into existing
code for this, but I'm not sure there is anything we can easily reuse - JS
engine parsers are written in C++ but tend not to be easy to detach from
their engines. If anyone has good ideas here, that would be useful.

- Alon
