Fascinating! I ran through a similar exercise for the Go SDK when I was fixing the batch Load Tests, but optimizing some specialized DoFn execution nodes calls for DoFns that don't observe windows, or return
For comparison, the Go SDK already has a Returning DoFn, as you call it, but I found that emitting had lower overhead than returning. That could change as we use Go generics and the Go compiler to enable techniques for simple DoFns similar to those you've described in your doc for Java's JIT. All the necessary abstraction adds up.

On Mon, Jun 13, 2022, 12:23 PM Andrew Pilloud <[email protected]> wrote:

> I ran some experiments with our existing DoFn API trying to get the Java
> JIT to Auto-Vectorize DoFn ProcessContext calls. Unfortunately that was
> unsuccessful, it appears we will need a new Returning DoFn API to support
> Auto-Vectorization (in addition to coder and runner changes).
>
> I also found that function calls through Java interfaces can be relatively
> expensive and will not work with many optimizations supported by the Java
> JIT. Our DoFn calling convention involves several layers of these interface
> calls. I am proposing we add an interface to Beam Core to generate
> flattened DoFn calls to concrete types, which would reduce our calling
> overhead and allow the JIT to perform inlining optimization across DoFns.
>
> Please take a look and provide feedback:
> https://docs.google.com/document/d/12XkHLcE0HpOS0fs0FekDzh68fMPCEZ5uGCh00kPZf0I/edit
>
> (If you are interested in this, it is one of the topics we will discuss at
> our Beam Summit talk "Relational Beam: Process columns, not rows!"
> <https://2022.beamsummit.org/sessions/relational-beam/>. Hope to see
> everyone there!)
>
> Andrew
