As a follow-up, I wrote another benchmark and ran it 'properly' this time: on a custom-built Erlang/OTP v 24.2.0 which supports both the JIT and EMU `emu_flavor`s.

This updated benchmark compares three different implementations of a GenServer `handle_cast` callback, which seemed like a more realistic scenario to me. See here for the three implementations

In this example, `Then`, `ThenInlined` and `Manual` are similarly efficient, and all use the same amount of memory. So I guess that, at least when `Kernel.then/2` is in tail position (which is probably the common case), it is optimized well by the Erlang compiler. :-)
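
To give an idea of what is being compared, here is a minimal sketch of the three variants. It is not the benchmarked code itself: the `{:put, key, value}` message and the map-based state are made up for illustration, and only the module names `Then`, `ThenInlined` and `Manual` come from the benchmark.

```elixir
# Hypothetical sketch: the message shape and state are assumptions,
# not the code that was actually benchmarked.

defmodule Manual do
  use GenServer

  @impl true
  def init(state), do: {:ok, state}

  # Builds the reply tuple by hand, without `then/2`.
  @impl true
  def handle_cast({:put, key, value}, state) do
    state = Map.put(state, key, value)
    {:noreply, state}
  end
end

defmodule Then do
  use GenServer

  @impl true
  def init(state), do: {:ok, state}

  # `then/2` is the last expression, so the anonymous function
  # it expands to is called in tail position.
  @impl true
  def handle_cast({:put, key, value}, state) do
    state
    |> Map.put(key, value)
    |> then(&{:noreply, &1})
  end
end

defmodule ThenInlined do
  use GenServer

  # Same body as `Then`, but compiled with the `inline` option.
  @compile :inline

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_cast({:put, key, value}, state) do
    state
    |> Map.put(key, value)
    |> then(&{:noreply, &1})
  end
end
```

All three callbacks return the same `{:noreply, state}` tuple for the same message. They differ only in how that tuple is built, which is what the benchmark is trying to isolate.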

For those who want to dig deeper into this benchmark themselves, see:


On 03-01-2022 22:18, José Valim wrote:
Unfortunately I don't know if there is a way to see the JIT code. But given that regular profiling tools like perf now work with the BEAM, maybe it is also possible to use similar tools to see the JITed code?

In any case, I tracked the commit: - none of the work is happening in the loader, unfortunately. Sorry for the red herring. The commit makes it so a function object is no longer allocated but you still have to perform a local call and perhaps that's the additional cost? I guess a further pass would be to eliminate the function call altogether if the invoked function does not define any variable, but that should be done by the Erlang Compiler.

On Mon, Jan 3, 2022 at 10:05 PM Wiebe-Marten Wijnja <> wrote:

    No worries, thanks a lot for your guidance in this matter! ^_^

    I will try to come up with some other, more 'real-world'-like
    examples to double-check whether the benchmark's results apply
    only on quick snippets or across the board.

    Do you happen to know if there is any way to inspect the result of
    the JIT-pass?

    On 03-01-2022 20:47, José Valim wrote:
    Sorry, for the short replies, I was on my phone. :)

    What I mean is, are the measurements across examples guaranteed
    to have the same amount of garbage collector calls (or no calls
    at all)? I am worried that, for quick snippets, the memory
    measurements are being influenced by other factors. But according
    to my understanding the anonymous function should not be
    allocated on Erlang/OTP 24 (and I think some further improvements
    are coming on 25).

    Plus comparing against OTP 23 and 24 will be tough due to the JIT.

    On Mon, Jan 3, 2022 at 8:38 PM Wiebe-Marten Wijnja
    <> wrote:

        Yes, across benchmark runs the memory measurements are the same.

        On 03-01-2022 20:17, José Valim wrote:
        Ah, df has no effect on a JIT system, I forgot about that.
        Are the memory measurements guaranteed to see a consistent
        effect of the GC across benchmarks?

        On Mon, Jan 3, 2022 at 20:06 Wiebe-Marten Wijnja
        <> wrote:

            I have run some benchmarks (comparing OTP23 with
            JIT-enabled OTP24).
            Full results here:

            It compares, in a situation where no tail recursion
            optimization is possible, `Kernel.then/2` vs. writing
            the same code manually vs. using `Kernel.then/2` with
            `@compile :inline`.

            A brief summary of the results:

            - OTP24 is able to get roughly twice as many iterations
            per second as OTP23. However:
            - On OTP24:
              - using `Kernel.then/2` requires (when tail recursion
                is not possible) 2.5x the memory of the other two
                variants.
              - using `Kernel.then/2` is roughly 30% slower than the
                other two variants.
            - On OTP23:
              - all three techniques use the same amount of memory.
              - using `Kernel.then/2` is roughly 8% slower than the
                other two variants.


            I also took a look at the disassembled code using
            :erts_debug.df as you suggested.
            Details here:
            /(Note that under OTP24 the *.dis-files only contained
            1-5 empty lines, so the output is from OTP23. Should I
            file a bug with the OTP team for this?)/

            It seems that also during loading, no optimization of
            immediately-called anonymous functions is taking place.
            The benchmarks above seem to support this, although the
            results w.r.t. memory usage and the difference in
            slowdown between OTP23 and OTP24 seem very odd to me.

            How to continue?


            On 03-01-2022 17:30, José Valim wrote:
            The optimization may happen on the loader. Use
            erts_debug:df(Mod, Fun, Arity) and see that.

            On Mon, Jan 3, 2022 at 5:03 PM Wiebe-Marten Wijnja
            <> wrote:

                I've been running my tests on Elixir v1.13.1 built
                for OTP24 with OTP 24.1.2.
                When decompiling the resulting BEAM bytecode, the
                anonymous functions are still visible.

                I will do some benchmarks to see how the resulting
                performance is. Maybe the JIT will do something
                which is not visible in the BEAM bytecode.

                On 03-01-2022 16:57, José Valim wrote:
                then/2 is a macro and the emitted code should be
                optimized from Erlang/OTP 24+.

                On Mon, Jan 3, 2022 at 4:28 PM
                <> wrote:

                    Since v1.12 we have the macro
                    `Kernel.then(value, function)` which expects
                    an arity-1 function and will call it with the
                    given value.

                    This makes code which used to be written as

                    def update(params, socket) do
                      socket =
                        socket
                        |> assign(:myvar, params["myvar"])
                        |> assign_new(:some_default, fn -> 42 end)

                      {:noreply, socket}
                    end

                    more readable, by allowing it to be written as:

                    def update(params, socket) do
                      socket
                      |> assign(:myvar, params["myvar"])
                      |> assign_new(:some_default, fn -> 42 end)
                      |> then(&{:noreply, &1})
                    end

                    This pattern seems to be common in codebases
                    using Elixir 1.12 and up (At least according
                    to anecdotal evidence).

                    All is well, except for a little snag: the new
                    code does not have the same runtime
                    characteristics (both in performance and in
                    memory usage), because `then` desugars to
                    `(function).(value)`: an anonymous function is
                    created and immediately run (and then garbage
                    collected soon after).

                    The Erlang compiler is clever enough to
                    optimize these immediately-called anonymous
                    functions away, but it will only do so when
                    `@compile :inline` is set in the given module,
                    to not mess with the call stack that might be
                    returned when an exception is thrown.

                    Now `@compile :inline` is quite the
                    sledgehammer, as it will inline /all/
                    functions in the current module (as long as
                    they are not 'too big', which can also be
                    configured, and only in the places where they
                    are called statically).
                    But since we're dealing with anonymous
                    functions here which do not have clear names,
                    there is no way to predict the name one should
                    pass to the `@compile` option.

                    It seems like this situation could be
                    improved, although I am not sure how.

                    Is there a way to mark these anonymous
                    functions in some kind of way, to allow only
                    them to be inlined?
                    Or is there maybe a way to have the
                    Elixir-compiler already inline common patterns
                    like a capture with a datatype, rather than
                    relying on the Erlang compiler for this?
                    Your input is greatly appreciated.

-- You received this message because you are
                    subscribed to the Google Groups
                    "elixir-lang-core" group.
                    To unsubscribe from this group and stop
                    receiving emails from it, send an email to
                    To view this discussion on the web visit
