Floh: the analysis works well in many cases, but the main problem is indirect calls. In a big enough codebase with enough of those, any indirect call looks like it can lead to something that sleeps :( We may indeed need a whitelist approach; some thinking is happening here: https://github.com/WebAssembly/binaryen/issues/2218
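
To illustrate why indirect calls hurt so much, here is a minimal sketch of a conservative "may sleep" analysis over a call graph. It is a simplified model written for this thread, not Binaryen's actual implementation: the function `may_sleep_functions`, its parameters, and all the function names in the example (`game_loop`, `draw_model`, `load_file`, and so on) are hypothetical. The one assumption that matters is that an indirect call is treated as able to reach any function whose address is taken.

```python
from collections import deque


def may_sleep_functions(direct_calls, indirect_callers, address_taken, sleepers):
    """Return the set of functions that may reach a sleeping function.

    direct_calls:     dict mapping caller -> set of directly called functions
    indirect_callers: set of functions that contain an indirect call
    address_taken:    set of functions that an indirect call could target
    sleepers:         set of functions that start a sleep
    """
    # Conservative call graph: every indirect call may reach every
    # address-taken function (assumption of this sketch).
    calls = {f: set(targets) for f, targets in direct_calls.items()}
    for f in indirect_callers:
        calls.setdefault(f, set()).update(address_taken)

    # Reverse the edges so we can walk from callees back to callers.
    callers = {}
    for f, targets in calls.items():
        for t in targets:
            callers.setdefault(t, set()).add(f)

    # Breadth-first walk backwards from the sleeping functions.
    may_sleep = set(sleepers)
    queue = deque(sleepers)
    while queue:
        f = queue.popleft()
        for caller in callers.get(f, ()):
            if caller not in may_sleep:
                may_sleep.add(caller)
                queue.append(caller)
    return may_sleep


# Hypothetical program: only the main loop and a file loader sleep.
direct_calls = {
    "main": {"game_loop"},
    "game_loop": {"render", "wait_frame"},
    "render": {"draw_model"},
    "load_file": {"wait_io"},
}
sleepers = {"wait_frame", "wait_io"}

# Direct calls only: the rendering path stays uninstrumented.
precise = may_sleep_functions(direct_calls, set(), set(), sleepers)

# One indirect call in draw_model, and load_file has its address taken:
# the whole rendering path now looks like it can sleep.
conservative = may_sleep_functions(
    direct_calls, {"draw_model"}, {"load_file", "shader_cb"}, sleepers
)
print(sorted(conservative - precise))
```

In this toy graph a single indirect call is enough to pull `render` and `draw_model` into the instrumented set. A whitelist would let the user override that conservative assumption for functions known never to sleep.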
On Mon, Jul 22, 2019 at 4:36 AM Floh <[email protected]> wrote:

> Many thanks for the detailed breakdown :)
>
> Is this only using asyncify for an 'infinite game loop' (instead of a
> frame callback), or is this also used for other synchronous calls like
> file- and network-I/O? I'm trying to understand the reason behind the 50%
> size increase. As far as I understand Alon's recent blog post on the topic,
> there's a control-flow analysis happening, so only functions along
> call-stacks which need 'asyncification' would need to be instrumented, but
> that might just be my overly optimistic interpretation ;) (so for instance,
> if only the main loop would need to be asyncified, the size increase should
> be very small, but if there are synchronous IO calls all over the place,
> much more code would need to be instrumented, adding to the code size).
>
> Cheers!
> -Floh.
>
> On Monday, 22 July 2019 08:40:13 UTC+2, Gabriel CV wrote:
>>
>> Hi!
>>
>> I did some tests with the new Upstream/Asyncify feature (ie. "Bysyncify")
>> on the Doom 3 port.
>>
>> I am using Chrome 75/Ubuntu 18.04/nVidia binary drivers, and used the
>> "timedemo demo1" command to measure the FPS (not available on the D3 demo
>> though, too bad. I had to do this with the full version of the game).
>>
>> The good news: UPSTREAM/ASYNCIFY is working well! And easier to use than
>> Emterpreter. However there is a catch on the final wasm size. Here are the
>> raw results:
>>
>> TARGET                      FPS   SIZE (MB)
>> O2/FASTCOMP/EMTERPRETER     50    4,55
>> O2/UPSTREAM/ASYNCIFY        50    6,81
>> O2/UPSTREAM (no Asyncify)   50    3,90
>> O3/UPSTREAM/ASYNCIFY        51    6,96
>> Os/UPSTREAM/ASYNCIFY        41    5,56
>> Oz/UPSTREAM/ASYNCIFY        40    5,56
>>
>> (O2/FASTCOMP/EMTERPRETER is for reference. NB: I am using the
>> whitelisting feature on EMTERPRETER.)
>>
>> What to read from these numbers:
>> - Performance
>> -- FASTCOMP/EMTERPRETER and UPSTREAM/ASYNCIFY have a similar performance
>> profile: with O2 optimization, there are 50 FPS on average for both targets.
>> -- ASYNCIFY has no impact on performance: with O2 optimization, there are
>> 50 FPS on average both with and without it (NB: on the D3 port, I
>> really tried to 'yield' as little as possible)
>> -- There is however an important gap between Os/Oz and O2/O3: using Os
>> leads to a 20% performance hit compared to O2 (50 FPS with O2/O3 => 40 FPS
>> with Os/Oz)
>> -- O3 compared to O2 does not bring a significant performance improvement
>> -- Same thing for Oz compared to Os: both are almost the same
>>
>> - Binary size
>> -- UPSTREAM/ASYNCIFY does have a big impact on final binary size: this is
>> roughly a +50% increase (from 4,55 MB with O2/FASTCOMP/EMTERPRETER => 6,81
>> MB with O2/UPSTREAM/ASYNCIFY)
>> -- It is really ASYNCIFY that causes this binary size increase, as
>> without ASYNCIFY, UPSTREAM produces a binary that is 15% smaller than
>> FASTCOMP (from 4,55 MB with FASTCOMP/EMTERPRETER => 3,90 MB with UPSTREAM)
>> -- Using Os compared to O2 brings a binary size improvement (from 6,81 MB
>> with O2 => 5,56 MB with Os), but this does not match FASTCOMP (4,55
>> MB)
>> -- Oz compared to Os does not bring a significant binary size improvement
>>
>> So, all in all, my observation is that ASYNCIFY works well, but the
>> binary size increase is not negligible (+50%).
>> Using Os/Oz instead of O2/O3 reduces that overhead to some
>> extent, but it is at the expense of a 20% performance hit (at least on the
>> D3 port), and still not on par with the FASTCOMP binary size.
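
The percentages quoted above can be rechecked from the raw table with a few lines of arithmetic. This is only a sketch: the helper `percent_change` is a hypothetical name, and the sizes are copied from the table with decimal commas written as points. Note that the "+50%" figure compares against the Emterpreter build. Measured against the same upstream toolchain without Asyncify (3,90 MB), the increase works out to roughly +75%.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Binary sizes in MB, taken from the results table.
sizes = {
    "O2/FASTCOMP/EMTERPRETER": 4.55,
    "O2/UPSTREAM/ASYNCIFY": 6.81,
    "O2/UPSTREAM": 3.90,
    "Os/UPSTREAM/ASYNCIFY": 5.56,
}

comparisons = [
    ("Emterpreter -> Asyncify (O2)", "O2/FASTCOMP/EMTERPRETER", "O2/UPSTREAM/ASYNCIFY"),
    ("Fastcomp -> upstream, no Asyncify", "O2/FASTCOMP/EMTERPRETER", "O2/UPSTREAM"),
    ("Upstream -> upstream + Asyncify", "O2/UPSTREAM", "O2/UPSTREAM/ASYNCIFY"),
    ("Asyncify O2 -> Asyncify Os", "O2/UPSTREAM/ASYNCIFY", "Os/UPSTREAM/ASYNCIFY"),
]

for label, old, new in comparisons:
    print(f"{label}: {percent_change(sizes[old], sizes[new]):+.1f}%")

# Frame rate going from O2/O3 (50 FPS) to Os/Oz (40 FPS).
print(f"FPS O2 -> Os: {percent_change(50, 40):+.1f}%")
```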
>>
>> As it appears it is really the Asyncify transformation that brings the
>> binary size increase, the whitelisting feature could really bring the best
>> of both worlds:
>> - By default (that is, without whitelisting):
>>   - Ease of use of ASYNCIFY compared to EMTERPRETER (this works *by
>>     default*, without having to do some extra work)
>>   - No performance impact of using ASYNCIFY (at least, when using
>>     yield/sleep carefully)
>>   - Cons: +50% binary size
>> - With whitelisting:
>>   - The binary size issue could be mitigated a lot, as UPSTREAM gives
>>     smaller binary size than FASTCOMP (-15% on D3)
>>   - Cons: obviously, some work to do with whitelisting, but this is the
>>     same as with EMTERPRETER
>>
>> Here it is!
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "emscripten-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/emscripten-discuss/c9d94058-7dc6-4c3f-9d56-59edbde20955%40googlegroups.com.
>

--
You received this message because you are subscribed to the Google Groups
"emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/emscripten-discuss/CAEX4NpQRXCL6-L99U%2Bw%3DBCYg4kRkYt3m-HR8df77YiE2zTCO4A%40mail.gmail.com.
