Re: FlatBuffers, would it be possible to work with the FlatBuffers maintainers
to upstream the (hypothetical) implementation there?

For stewardship, if there is a significant amount of code to transfer, we may 
want to treat it as a software grant (several implementations started out this 
way). As part of that you would file paperwork with the ASF certifying that you 
have the rights to transfer the code. (See for example 
https://incubator.apache.org/ip-clearance/arrow-flight-sql-odbc.html.)

If you are doing this implementation from the ground up, it may be easier to
submit it as a series of PRs in the first place and avoid this process.

On Sat, Aug 9, 2025, at 01:11, Benjamin Philip wrote:
> Hi,
>
> I am working on an Erlang implementation for Apache Arrow, and I am
> interested in submitting it to the Apache Foundation as an official
> implementation for Erlang and Elixir, once it is ready.
>
> If you haven't heard of Erlang[1], you can read more about it here[2]. It's
> most famous for powering telecom switches, RabbitMQ, and instant messaging
> apps like WhatsApp, Discord, and ejabberd. The important thing to keep in
> mind is it's used in highly parallel and distributed environments. It runs
> on a runtime called the BEAM/Erlang Virtual Machine (analogous to Java and
> the JVM). Being a functional language[3], all values are immutable and
> there is no SIMD support.
>
> Initial work[4] was started 2 years ago for compliance with some new
> OpenTelemetry specifications. However, my focus so far has only been
> (de)serialization and not operating on/manipulating Arrow Arrays since that
> was the only requirement in OpenTelemetry.
>
> The trouble with Erlang is that natively producing and decoding binaries
> in pure Erlang is more effective than through a C FFI. This has also been
> the case with plaintext formats like JSON and XML, and with parsing markup
> like HTML and Markdown. This has meant that we've had to write an Erlang
> Arrow implementation from the ground up. The lack of an Erlang FlatBuffers
> implementation (for IPC), SIMD support in the Erlang Virtual Machine (for
> efficient operations) and mutability (for zero-copy access; all values in
> Erlang are immutable) make a complete Arrow implementation in Erlang
> especially challenging.
>
> An alternative could be to handle serializations in Erlang and operations
> with the C bindings. We could also start with a minimal implementation with
> bindings to nanoarrow and deprecate that in favour of the Erlang one later.
>
> Upstreaming a fully compliant Erlang implementation could potentially be a
> multi-year project. This might also include writing an Erlang FlatBuffers
> implementation. This will also be an additional implementation for the
> Arrow team to maintain, though I would be happy to aid in developing and
> maintaining it. What are the steps to get this going?
>
> How are implementations outside the monorepo tested? Is there any guide for
> setting up integration testing and benchmarking in third-party
> implementations? So far I've had to roll my own minimal tooling for what
> archery supports, and I would prefer if I could integrate with
> archery instead.
>
> Additionally, the initial work for this project was sponsored by the Erlang
> Ecosystem Foundation[5]. Would this be an issue when transferring
> stewardship to the ASF?
>
> [1]: https://en.wikipedia.org/wiki/Erlang_(programming_language)
> [2]: https://www.erlang.org/
> [3]: https://en.wikipedia.org/wiki/Functional_programming
> [4]: https://github.com/Benjamin-Philip/serde_arrow
> [5]: https://erlef.org/
>
> -- bp
