Hi,

Because <benjamin.philip...@gmail.com> isn't subscribed to dev@arrow.apache.org, all your e-mails are pending by default. I'm accepting them manually as a mailing list administrator.
Could you subscribe to dev@arrow.apache.org with <benjamin.philip...@gmail.com>?

Thanks,
-- kou

In <CAMEXYWdwgLsJi371NTg3zGLqMMAVtQ5LfruUtxt-CpYXx8FH=a...@mail.gmail.com>
  "Re: [DISCUSS][Erlang] Erlang Apache Arrow Implementation" on Wed, 13 Aug 2025 23:12:29 +0530,
  Benjamin Philip <benjamin.philip...@gmail.com> wrote:

> This is a little off topic, but I noticed David and Sotou's replies today
> (13/08/25) when I checked lists.apache.dev. However, I haven't received
> their reply via email yet (no, it's not in my spam nor in my bin).
> Additionally, my original email took a few days to appear on Pony Mail. Is
> there a way to debug this so that I can give a proper quoted response from
> my email client?
>
> -- bp
>
> On Fri, 8 Aug, 2025, 9:41 pm Benjamin Philip, <benjamin.philip...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am working on an Erlang implementation for Apache Arrow, and I am
>> interested in submitting it to the Apache Software Foundation as an official
>> implementation for Erlang and Elixir, once it is ready.
>>
>> If you haven't heard of Erlang[1], you can read more about it here[2].
>> It's most famous for powering telecom switches, RabbitMQ, and instant
>> messaging apps like WhatsApp, Discord, and ejabberd. The important thing to
>> keep in mind is that it's used in highly parallel and distributed environments.
>> It runs on a runtime called the BEAM/Erlang Virtual Machine (analogous to
>> Java and the JVM). Because Erlang is a functional language[3], all values are
>> immutable, and there is no SIMD support.
>>
>> Initial work[4] was started 2 years ago for compliance with some new
>> OpenTelemetry specifications. However, my focus so far has only been
>> (de)serialization and not operating on/manipulating Arrow Arrays, since that
>> was the only requirement in OpenTelemetry.
>>
>> The trouble with Erlang is that natively producing and decoding binaries
>> in pure Erlang is more effective than through a C FFI. This has also been
>> the case with plaintext formats like JSON and XML, and with parsing markup
>> like HTML and Markdown. This has meant that we've had to write an Erlang
>> Arrow implementation from the ground up. The lack of an Erlang FlatBuffers
>> implementation (for IPC), SIMD support in the Erlang Virtual Machine (for
>> efficient operations) and mutability (for zero-copy access; all values in
>> Erlang are immutable) makes a complete Arrow implementation in Erlang
>> especially challenging.
>>
>> An alternative could be to handle serialization in Erlang and operations
>> with the C bindings. We could also start with a minimal implementation with
>> bindings to nanoarrow and deprecate that in favour of the Erlang one later.
>>
>> Upstreaming a fully compliant Erlang implementation could potentially be a
>> multi-year project. This might also include writing an Erlang FlatBuffers
>> implementation. This will also be an additional implementation for the
>> Arrow team to maintain, though I would be happy to aid in developing and
>> maintaining it. What are the steps to get this going?
>>
>> How are implementations outside the monorepo tested? Is there any guide
>> for setting up integration testing and benchmarking in third-party
>> implementations? So far I've had to roll my own minimal tooling for what
>> archery supports, and I would prefer to integrate with
>> archery instead.
>>
>> Additionally, the initial work for this project was sponsored by the
>> Erlang Ecosystem Foundation[5]. Would this be an issue when transferring
>> stewardship to the ASF?
>>
>> [1]: https://en.wikipedia.org/wiki/Erlang_(programming_language)
>> [2]: https://www.erlang.org/
>> [3]: https://en.wikipedia.org/wiki/Functional_programming
>> [4]: https://github.com/Benjamin-Philip/serde_arrow
>> [5]: https://erlef.org/
>>
>> -- bp
>>