Re: [v8-dev] Callee-saved registers and splintering.

Yang Guo Mon, 08 Oct 2018 06:20:45 -0700

Hey Pierre,

thanks for sharing your findings with us. This data sounds very useful in
evaluating possible paths V8 may take in the future. I'm adding a few
colleagues who may have better insights into V8's compiler and register
allocator.


Cheers,

Yang

On Thu, Oct 4, 2018 at 7:58 PM Pierre Langlois <[email protected]>
wrote:

>
> Hello V8 devs!
>
> Recently we've been investigating how introducing callee-saved registers
> as part of the JavaScript calling convention would affect generated
> code. We understand it opens up questions on how different parts of V8
> would need to be adapted, specifically the GC, and it would require a
> lot of investigation and long term planning. In particular, having the GC
> walk stack frames to dynamically find where callee-saved registers were
> saved would surely have a performance impact. But maybe that could be
> balanced out by generating better code; we can't know until we
> investigate :-).
>
> So here we've only looked at the register allocator. And more
> specifically how it copes with not having to save and restore all
> registers around calls. It's already given us food for thought so we
> decided to share it.
>
> This is a long email, so TL;DR: Adding callee-saved registers introduces
> longer live ranges assigned to registers. This is very good for
> non-deferred blocks where a *lot* of gap moves are removed (up to
> half!). But, we get a *lot* more moves at deferred block boundaries. So
> much so that code size increases by 6% when running typescript on
> arm64. It looks like the register allocator could be improved to deal
> with longer register ranges in general.
>
> So, we've built a prototype that takes the list of callee-saved
> registers from a call descriptor and propagates it down so the register
> allocator can look it up for each call. And then it can decide to only
> clobber certain registers. After this all we needed to do was to define
> sets of callee-saved registers for each type of call descriptor:
>
>   * Direct calls to C: Use callee-saved registers defined by the C
>     ABI. This also includes calls generated by the code-generator to
>     implement IEEE functions.
>
>   * Call to the runtime: Clobber everything unless the call does not
>     have a frame state. A GC could be triggered and it needs everything
>     in memory.
>
>   * Call to JS, stubs and builtins: We can define our own set of
>     callee-saved registers. When investigating we've picked the same set
>     as for C calls though.
>
>   * We haven't investigated WASM calls yet.
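
The per-call-type rules listed above can be sketched roughly as follows. This is only an illustration, not code from the prototype: the register names, the `CallKind` enum and both helper functions are hypothetical, and the choice of the C ABI set for runtime calls without a frame state is an assumption (the email does not say which set applies there).

```python
from enum import Enum, auto


class CallKind(Enum):
    """Hypothetical call descriptor kinds, mirroring the list above."""
    C_DIRECT = auto()         # direct calls to C, incl. IEEE helper functions
    RUNTIME = auto()          # calls to the runtime
    JS_STUB_BUILTIN = auto()  # calls to JS, stubs and builtins


# Made-up register file r0..r15; r8..r15 stand in for "the C ABI set".
ALL_REGISTERS = frozenset(f"r{i}" for i in range(16))
C_ABI_CALLEE_SAVED = frozenset(f"r{i}" for i in range(8, 16))


def callee_saved_for(kind, has_frame_state=True):
    """Return the registers a call of this kind is assumed to preserve."""
    if kind is CallKind.RUNTIME and has_frame_state:
        # A GC could be triggered, so everything must be in memory.
        return frozenset()
    # Assumption: every other case uses the same set as C calls, which is
    # what the experiment picked for JS, stubs and builtins.
    return C_ABI_CALLEE_SAVED


def clobbered_by(kind, has_frame_state=True):
    """Registers the allocator must treat as clobbered across the call."""
    return ALL_REGISTERS - callee_saved_for(kind, has_frame_state)
```

With this shape, the allocator would only need to ask `clobbered_by(...)` at each call site instead of assuming every register is lost.
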
>
> Of course, except for direct C calls, the generated code isn't
> functional. This was just an experiment to see what it looked like. What
> we can do though is make it a run-time option and gather statistics. We
> can force V8 to compile every optimised JS function twice, the second
> time with callee-saved registers, and then discard the latter. Finally
> we can compare them!
>
> Here is a link to the prototype if you want to take a look:
> https://chromium-review.googlesource.com/c/v8/v8/+/1261643 it's in the
> form of a series of 6 patches, marked as abandoned now.
>
> So without further ado, let's look at some numbers! We've looked at how
> gap moves and code size were affected. We ran the typescript benchmark
> from web-tooling 0.5.2 and displayed statistics in percentages. We've
> purposely split moves into deferred and non-deferred blocks after
> realising they were not affected in the same way at all.
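
For reading the percentages that follow, a minimal sketch of how such a figure could be derived, assuming each one is the relative change of a counter (instructions or moves) between the baseline compile and the callee-saved compile of the same functions. The email does not spell out the exact formula, and the function name and the counts in the example are made up.

```python
def percent_change(baseline, with_callee_saved):
    """Relative change, in percent, of a counter against the baseline."""
    if baseline == 0:
        raise ValueError("baseline count must be non-zero")
    return 100.0 * (with_callee_saved - baseline) / baseline


# Made-up counts: 1000 moves in the baseline, 480 with callee-saved registers.
example = round(percent_change(1000, 480), 2)
```

Under this reading a negative value means fewer instructions or moves than the baseline, and a positive value means more.
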
>
> So here we are, numbers of instructions, register/stack slot moves and
> register/constant moves:
>
> | arch                    | instructions (%) | R <-> S (%) | deferred R <-> S (%) | R <- CST (%) | deferred R <- CST (%) |
> |-------------------------+------------------+-------------+----------------------+--------------+-----------------------|
> | arm64 (12 cs registers) |             6.30 |      -51.86 |               138.56 |       -19.48 |                144.35 |
> | arm   (7 cs registers)  |             1.54 |      -34.60 |                57.53 |       -12.37 |                 42.93 |
> | x64   (5 cs registers)  |             1.37 |      -32.10 |                49.19 |        -8.13 |                 47.51 |
> | ia32  (3 cs registers)  |            -0.04 |       -7.92 |                 7.79 |        -1.54 |                  4.00 |
>
> We were hoping to get fewer moves in general but instead we got more!
> And especially on arm64 where we have a lot of registers. We cannot
> accept such a code size increase.
>
> That being said, if we forget about the deferred columns, we've
> *theoretically* gotten rid of up to half of register/stack moves! This
> is promising.
>
> But now we're a bit confused as to why the more callee-saved registers
> we have, the more moves we get in deferred blocks. We eventually linked
> it to splintering. Indeed, if we disable it with the
> --turbo-no-preprocess-ranges flag, we get the following results:
>
> | arch                    | instructions (%) | R <-> S (%) | deferred R <-> S (%) | R <- CST (%) | deferred R <- CST (%) |
> |-------------------------+------------------+-------------+----------------------+--------------+-----------------------|
> | arm64 (12 cs registers) |            -2.82 |      -37.20 |                -5.32 |       -13.24 |                 -8.29 |
> | arm   (7 cs registers)  |            -2.35 |      -29.24 |               -18.87 |       -11.04 |                -10.53 |
> | x64   (5 cs registers)  |            -1.14 |      -24.74 |                 0.89 |        -6.52 |                 -0.56 |
> | ia32  (3 cs registers)  |            -0.47 |      -10.30 |               -11.01 |        -2.18 |                 -2.74 |
>
> It looks much more like what we'd like to get.
>
> You can find the full data sets attached to this email.
>
> --
> --
> v8-dev mailing list
> [email protected]
> http://groups.google.com/group/v8-dev
> ---
> You received this message because you are subscribed to the Google Groups
> "v8-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
