I see, makes sense! Pretty much the same problem as virtual
methods / jump tables which disable dead-code-elimination I guess.
On Monday, 22 July 2019 18:45:22 UTC+2, Alon Zakai wrote:
Floh: the analysis works well in many cases, but the main
problem is indirect calls - in a big enough codebase with
enough of those, any indirect call looks like it can lead to
something that sleeps :( We may indeed need a whitelist
approach, some thinking is happening here:
https://github.com/WebAssembly/binaryen/issues/2218
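To illustrate why indirect calls hurt so much, here is a toy model of a conservative "can this function be on the stack during a sleep?" analysis. It is only a sketch under stated assumptions, not Binaryen's actual implementation: the call graph, the function names (`game_loop`, `run_entity_think`, and so on) and the `assume_indirect_may_sleep` switch are all hypothetical.

```python
from typing import Dict, Set


def needs_instrumentation(
    direct_calls: Dict[str, Set[str]],
    indirect_callers: Set[str],
    sleepers: Set[str],
    assume_indirect_may_sleep: bool = True,
) -> Set[str]:
    """Return the functions that might be on the stack when a sleep happens.

    Toy model: a function is marked if it calls a sleeping import, calls a
    marked function, or (conservatively) performs any indirect call.
    """
    marked = set(sleepers)
    if assume_indirect_may_sleep:
        # Without knowing the call targets, every indirect call is suspect.
        marked |= indirect_callers

    # Propagate the mark from callees up to callers until nothing changes.
    changed = True
    while changed:
        changed = False
        for caller, callees in direct_calls.items():
            if caller not in marked and callees & marked:
                marked.add(caller)
                changed = True

    # The sleeping imports themselves are not instrumented code.
    return marked - sleepers


# Hypothetical call graph: only game_loop actually sleeps.
direct_calls = {
    "main": {"game_loop"},
    "game_loop": {"render", "update", "emscripten_sleep"},
    "render": {"draw_mesh"},
    "update": {"run_entity_think"},
    "run_entity_think": set(),  # calls through a function pointer
    "draw_mesh": set(),
}
indirect_callers = {"run_entity_think"}
sleepers = {"emscripten_sleep"}

conservative = needs_instrumentation(direct_calls, indirect_callers, sleepers)
optimistic = needs_instrumentation(
    direct_calls, indirect_callers, sleepers, assume_indirect_may_sleep=False
)

print("conservative:", sorted(conservative))
print("optimistic:  ", sorted(optimistic))
```

In this toy graph a single function-pointer call doubles the instrumented set (from `main` and `game_loop` to those plus `update` and `run_entity_think`). In a large codebase with many indirect calls the same effect spreads much further, which is where a whitelist would help.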
On Mon, Jul 22, 2019 at 4:36 AM Floh <[email protected]> wrote:
Many thanks for the detailed breakdown :)
Is this only using asyncify for an 'infinite game loop'
(instead of a frame callback), or is this also used for
other synchronous calls like file- and network-I/O? I'm
trying to understand the reason behind the 50% size
increase. As far as I understand Alon's recent blog post
on the topic, there's a control-flow analysis happening,
so only functions along call-stacks which need
'asyncification' would need to be instrumented, but that
might just be my overly optimistic interpretation ;) (so
for instance, if only the main loop would need to be
asyncified, the size increase should be very small, but
if there are synchronous IO calls all over the place,
much more code would need to be instrumented, adding to
the code size).
Cheers!
-Floh.
On Monday, 22 July 2019 08:40:13 UTC+2, Gabriel CV wrote:
Hi!
I did some tests with the new Upstream/Asyncify
feature (i.e. "Bysyncify") on the Doom 3 port.
I am using Chrome 75/Ubuntu 18.04/nVidia binary
drivers, and used the "timedemo demo1" command to
measure the FPS (unfortunately this is not available
in the D3 demo, so I had to use the full version of
the game).
The good news: UPSTREAM/ASYNCIFY is working well, and it
is easier to use than Emterpreter. However, there is a
catch regarding the final wasm size. Here are the raw results:
TARGET                       FPS   SIZE (MB)
O2/FASTCOMP/EMTERPRETER (*)  50    4,55
O2/UPSTREAM/ASYNCIFY         50    6,81
O2/UPSTREAM (no Asyncify)    50    3,90
O3/UPSTREAM/ASYNCIFY         51    6,96
Os/UPSTREAM/ASYNCIFY         41    5,56
Oz/UPSTREAM/ASYNCIFY         40    5,56
(*) For reference. NB: I am using the whitelisting feature
on EMTERPRETER.
What to read from these numbers:
- Performance
-- FASTCOMP/EMTERPRETER and UPSTREAM/ASYNCIFY have a
similar performance profile: with O2 optimization,
there are 50 FPS on average for both targets.
-- ASYNCIFY has no impact on performance: with O2
optimization, UPSTREAM reaches 50 FPS on average both
with and without it (NB: on the D3 port, I really
tried to 'yield' as little as possible)
-- There is however an important gap between Os/Oz
and O2/O3: using Os leads to a 20% performance hit
compared to O2 (50 FPS with O2/O3 => 40 FPS with Os/Oz)
-- O3 compared to O2 does not bring a significant
performance improvement
-- Same thing for Oz compared to Os: both are almost
the same
- Binary size
-- UPSTREAM/ASYNCIFY does have a big impact on final
binary size: this is roughly a +50% increase (from 4,55
MB with O2/FASTCOMP/EMTERPRETER => 6,81 MB with
O2/UPSTREAM/ASYNCIFY)
-- It is really ASYNCIFY that causes this binary
size increase, as without ASYNCIFY, UPSTREAM produces
a binary that is about 15% smaller than FASTCOMP (from 4,55
MB with FASTCOMP/EMTERPRETER => 3,90 MB with UPSTREAM)
-- Using Os compared to O2 brings a binary size
improvement (from 6,81 MB with O2 => 5,56 MB with
Os), but this still does not match FASTCOMP (4,55 MB)
-- Oz compared to Os does not bring a significant
binary size improvement
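For reference, the percentages quoted in this breakdown can be recomputed from the raw table. This is just a small arithmetic sketch: the helper `pct_change` and the dictionary keys are illustrative names, and the numbers are copied from the table above (decimal commas converted to points).

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Binary sizes in MB, taken from the results table.
size_mb = {
    "O2/FASTCOMP/EMTERPRETER": 4.55,
    "O2/UPSTREAM/ASYNCIFY": 6.81,
    "O2/UPSTREAM": 3.90,
    "Os/UPSTREAM/ASYNCIFY": 5.56,
}

# Asyncify build compared to the Emterpreter build (the "+50%" figure).
asyncify_vs_emterpreter = pct_change(
    size_mb["O2/FASTCOMP/EMTERPRETER"], size_mb["O2/UPSTREAM/ASYNCIFY"]
)
# Asyncify build compared to the same toolchain without Asyncify.
asyncify_vs_plain_upstream = pct_change(
    size_mb["O2/UPSTREAM"], size_mb["O2/UPSTREAM/ASYNCIFY"]
)
# Plain UPSTREAM compared to FASTCOMP/EMTERPRETER (the "15% smaller" figure).
upstream_vs_fastcomp = pct_change(
    size_mb["O2/FASTCOMP/EMTERPRETER"], size_mb["O2/UPSTREAM"]
)
# Os compared to O2, both with Asyncify.
os_vs_o2 = pct_change(
    size_mb["O2/UPSTREAM/ASYNCIFY"], size_mb["Os/UPSTREAM/ASYNCIFY"]
)
# Frame rate drop from O2/O3 (50 FPS) to Os/Oz (about 40 FPS).
fps_os_vs_o2 = pct_change(50, 40)

print(f"Asyncify vs Emterpreter build:  {asyncify_vs_emterpreter:+.1f}%")
print(f"Asyncify vs plain UPSTREAM:     {asyncify_vs_plain_upstream:+.1f}%")
print(f"Plain UPSTREAM vs FASTCOMP:     {upstream_vs_fastcomp:+.1f}%")
print(f"Os vs O2 (with Asyncify):       {os_vs_o2:+.1f}%")
print(f"FPS, Os/Oz vs O2/O3:            {fps_os_vs_o2:+.1f}%")
```

Note that the "+50%" figure is measured against the FASTCOMP/EMTERPRETER build; measured against the same UPSTREAM build without Asyncify (3,90 MB), the increase from the transformation itself comes out larger.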
So, all in all, my observation is that ASYNCIFY works
well, but the binary size increase is not negligible
(+50%).
Using Os/Oz instead of O2/O3 reduces that
overhead to some extent, but at the expense of
a 20% performance hit (at least on the D3 port), and
the result is still not on par with the FASTCOMP binary size.
As it appears that it is really the Asyncify
transformation that causes the binary size increase,
a whitelisting feature could really bring the best
of both worlds:
- By default (that is, without whitelisting):
  - Ease of use of ASYNCIFY compared to EMTERPRETER
    (this works *by default*, without having to do any
    extra work)
  - No performance impact from using ASYNCIFY (at
    least, when using yield/sleep carefully)
  - Cons: +50% binary size
- With whitelisting:
  - The binary size issue could be mitigated a lot,
    as UPSTREAM gives a smaller binary size than FASTCOMP
    (-15% on D3)
  - Cons: obviously, some work to do to set up the
    whitelist, but this is the same as with EMTERPRETER
Here it is!
--
You received this message because you are subscribed to
the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails
from it, send an email to
[email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/emscripten-discuss/c9d94058-7dc6-4c3f-9d56-59edbde20955%40googlegroups.com
<https://groups.google.com/d/msgid/emscripten-discuss/c9d94058-7dc6-4c3f-9d56-59edbde20955%40googlegroups.com?utm_medium=email&utm_source=footer>.