Benjamin-Philip commented on PR #1717:
URL: https://github.com/apache/arrow-adbc/pull/1717#issuecomment-2900996196

   > There have been efforts for direct Arrow support (e.g. due to OTel) but I 
am only slightly involved in that. So we would be talking specifically about 
ADBC and not Arrow, in case that changes any of what you said.
   
   @lidavidm Sorry for bumping this year-old thread, but I am in charge of 
direct Arrow support in Erlang and Elixir. I would definitely be interested in 
official upstream inclusion. However, my focus so far has been only on 
(de)serialization, not on operating on or manipulating Arrow arrays, since that 
was the only requirement in OpenTelemetry.
   
   The trouble with Erlang is that natively producing and decoding binaries in 
pure Erlang is more effective than going through a C FFI. This has also been 
the case with plaintext formats like JSON and XML, and with parsing markup like 
HTML and Markdown. This has meant that we've had to write an Erlang Arrow 
implementation from the ground up. Three gaps make a complete Arrow 
implementation in Erlang especially challenging: the lack of an Erlang 
FlatBuffers implementation (for IPC), the lack of SIMD support in the Erlang 
virtual machine (for efficient operations), and the lack of mutability (for 
zero-copy access; all values in Erlang are immutable).
   
   An alternative could be to handle serialization in Erlang and operations 
with the C bindings, though this could be unsafe, especially given Erlang's 
parallel nature (Erlang doesn't like mutations). We could also start with a 
minimal implementation built on bindings to nanoarrow and deprecate that in 
favour of the Erlang one later.
   
   Upstreaming a fully compliant Erlang implementation could potentially be a 
multi-year project, and might also include writing an Erlang FlatBuffers 
implementation. It would also be an additional implementation for the Arrow 
team to maintain, though I would be happy to aid in developing and maintaining 
it. What are the steps to get this going?
   
   How are implementations outside the monorepo tested? Is there any guide for 
setting up integration testing and benchmarking in third-party implementations? 
So far I've had to roll my own minimal tooling for what `archery` supports, and 
I would prefer to integrate with `archery` instead.
   
   I would be happy to shift this conversation somewhere else.


