Hi Sutou,

Sorry about the long delay, but I wanted to follow up on this. I finally
filled out the forms and some other documents, and I sent an initial
draft in a new thread almost a week ago. Could you please have a look?

-- bp

Sutou Kouhei <[email protected]> writes:

> Hi,
>
> https://incubator.apache.org/ip-clearance/
>
> We need to fill the IP clearance template:
> https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/ip-clearance-template.xml
>
> (It's linked from the IP clearance page above.)
>
> https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/arrow-flight-sql-odbc.xml
> is one of the templates we have already filled in.
>
> Could you try filling the template as much as possible?
>
>
> Thanks,
> --
> kou
>
> In <camexywdchjdtjvdutjq4-zzc+faq0u3xhdwcwg_1xyhpetk...@mail.gmail.com>
>   "Re: [DISCUSS][Erlang] Erlang Apache Arrow Implementation" on Mon, 1 Sep 2025 10:29:40 +0530,
>   Benjamin Philip <[email protected]> wrote:
>
>> Any update on this? If you can send me a link to the IP clearance process
>> and the guidelines and development practices for Apache repositories, I can
>> notify the other stakeholders in the EEF and start the transfer process.
>>
>> -- bp
>>
>> On Wed, 20 Aug, 2025, 1:51 pm Benjamin Philip, <[email protected]>
>> wrote:
>>
>>> On Wed, 20 Aug 2025 at 04:08, Jacob Wujciak <[email protected]> wrote:
>>>
>>>> > Secondly, this will be the first time I will be maintaining an Apache
>>>> > project, and I am not very familiar with the internal processes you
>>>> > use. I feel I might move faster with a repo under my own user
>>>>
>>>> This does sound like it might be another use case for the 'arrow-contrib'
>>>> org:
>>>> Apache Datafusion has a community run, non-apache org called
>>>> 'datafusion-contrib' [1], where unofficial extensions and datafusion
>>>> related crates are developed. Once a project is mature/used enough it
>>>> can be donated to the ASF Datafusion TLP (so that is not a necessity).
>>>> This was for example done for Datafusion for Ray [2]. Though
>>>> apparently it will now be archived due to a lack of maintenance [3].
>>>> (So maybe not the best example xD)
>>>>
>>>> The idea of creating a similar org for arrow has been brought up a
>>>> number of times in the community meeting. This would not come with the
>>>> 'red tape' of an ASF project and would allow faster initial
>>>> development for the Erlang implementation.
>>>>
>>>>
>>> That sounds like a good option. However, I don't want to eliminate
>>> developing this as an ASF project from the start. I figure that this will
>>> eventually become a regular ASF project, so I might as well get accustomed
>>> to it now. Is there a document with all the "red tape" an ASF project
>>> entails?
>>>
>>> If we were to do this, would the Erlang implementation be considered
>>> "official" and linked from the docs? I would like to improve awareness of
>>> the project, and I'd prefer it be mentioned in the official docs even as an
>>> alpha release. I think that is important in addition to promoting it on
>>> Elixir/Erlang specific channels.
>>>
>>> I also forgot to mention this in my previous email, but would any Arrow
>>> maintainer be able to review PRs to this project, maybe multiple times a
>>> week? I remember having many arrow specific doubts while working on this,
>>> and I think it would be wise to have someone re-check my work to ensure I
>>> haven't misinterpreted anything in the specifications and generally keep an
>>> eye from the Apache side. I also have 2 other reviewers from the Erlang
>>> Ecosystem Foundation reviewing my Erlang code, so that part is already
>>> taken care of.
>>>
>>>> Regarding the ip clearance process (that as you say will need to
>>>> happen at some point of moving the implementation into
>>>> apache/arrow-erlang), IIRC as long as the code has always been
>>>> licensed under ASL 2.0 the process is more of a formality and
>>>> shouldn't be too hard to do.
>>>>
>>>
>>> The code is indeed licensed under ASL 2.0, so I think we can go with the
>>> ip clearance process then. Are there any other legal matters that need to
>>> be addressed?
>>>
>>> On Tue, 19 Aug 2025 at 14:09, Antoine Pitrou <[email protected]> wrote:
>>>
>>>> There isn't an official criterion for declaring an implementation
>>>> "complete" (and we don't really use that term, either).
>>>>
>>>> What is important is to address the most common needs that your users
>>>> may have (such as OpenTelemetry data payloads).
>>>
>>>
>>> That makes sense.
>>>
>>>
>>>> I would personally suggest:
>>>>
>>>> - support the most common data types (all primitive types + at least
>>>> list and struct + dictionary + basic support for extension types)
>>>> - support either the C Data Interface or the IPC format (preferably both)
>>>>
>>>> In the IPC format, you don't need to support everything (tensors are
>>>> rarely used, for example; endianness conversion is only useful if you
>>>> plan to exchange data with big-endian systems...).
>>>>
>>>>
>>> As of right now, we support about half of all primitive types and most of
>>> the lists (under nested types), but none of the special or extension types.
>>> We also have some rudimentary support for IPC (since that's needed for
>>> OTel). I plan to add support for everything under the Columnar Format
>>> anyway, so it's just a matter of time. Are Flight and friends handled by the
>>> Arrow team? How often and where is Flight used?
>>>
>>>> Hi Benjamin,
>>>>
>>>> Le 14/08/2025 à 20:17, Benjamin Philip a écrit :
>>>> >
>>>> >> serialization/deserialization features but arrow-rs provides
>>>> >> more features such as computation features.
>>>> >
>>>> > This reminds me. What features will I have to support out of
>>>> > (de)serialization
>>>> > for an implementation to be considered complete?
>>>>
>>>> You're probably aware of https://arrow.apache.org/docs/dev/status.html ,
>>>> otherwise it will give you an idea of the variety of features that *can*
>>>> be implemented.
>>>>
>>>
>>> This list only lists support for serialization and deserialization of
>>> various data types, whether that be the Columnar Format, the IPC Format or
>>> Flight. I realize that the words "out of" weren't very clear, but what I
>>> meant was what should I support *apart from* serde? For example, Sutou
>>> mentioned computation. I don't see a list of supported computations
>>> anywhere, what computations must I provide? I'm guessing serde (i.e. R/W of
>>> Arrow arrays) and computations (i.e. transformations of Arrow arrays) are
>>> it, but are there any other high-level features I should support?
>>>
>>> -- bp
>>>
