[ANNOUNCE] DataFusion Comet regular meetup

2024-04-30 Thread Andy Grove
*Note: I had previously sent a version of this email to the new DataFusion dev@ mailing list, but I don't think many people have migrated to that yet, so I am sending it to dev@ arrow as well.* Hello, I would like to invite anyone interested to join a regular meetup to discuss the DataFusion

Re: [DISCUSS] Drop Java 8 support

2024-04-30 Thread martin . traverse
Speaking for my own product, we would like to see Java 11 support: we rely heavily on Arrow and have Java 11 as our minimum supported version. We’d like to keep doing that if possible. Our clients are big enterprises with notoriously sluggish update cycles, so we want to offer maximum

Re: [VOTE][Format] UUID canonical extension type

2024-04-30 Thread Antoine Pitrou
+1 (binding) On 19/04/2024 at 22:22, Rok Mihevc wrote: Hi all, Following initial requests [1][2] and recent tangential ML discussion [3] I would like to propose a vote to add language for UUID canonical extension type to CanonicalExtensions.rst as in PR [4] and written below. A draft C++

Re: [VOTE][Format] JSON canonical extension type

2024-04-30 Thread Antoine Pitrou
+1 (binding) for the current proposal, i.e. with the RFC 8259 requirement and the 3 current String types allowed. Regards Antoine. On 30/04/2024 at 19:26, Rok Mihevc wrote: Hi all, thanks for the votes and comments so far. I've amended [1] the proposed language with the RFC-8259

Re: [VOTE][Format] JSON canonical extension type

2024-04-30 Thread Jacob Wujciak
+1 (non-binding) Thanks for moving these two forward, Rok! On Tue, 30 Apr 2024 at 19:26, Rok Mihevc wrote: > Hi all, thanks for the votes and comments so far. > I've amended [1] the proposed language with the RFC-8259 requirement as it > seems to be almost unanimously requested. New

Re: [VOTE][Format] UUID canonical extension type

2024-04-30 Thread Joris Van den Bossche
+1 (binding) On Tue, 30 Apr 2024 at 19:52, Jacob Wujciak wrote: > +1 (non-binding) > > On Tue, 30 Apr 2024 at 17:48, Weston Pace < > weston.p...@gmail.com> wrote: > > > +1 (binding) > > > > On Tue, Apr 30, 2024 at 7:53 AM Rok Mihevc wrote: > > > > > Thanks for all the reviews and

Re: [VOTE][Format] UUID canonical extension type

2024-04-30 Thread Jacob Wujciak
+1 (non-binding) On Tue, 30 Apr 2024 at 17:48, Weston Pace < weston.p...@gmail.com> wrote: > +1 (binding) > > On Tue, Apr 30, 2024 at 7:53 AM Rok Mihevc wrote: > > > Thanks for all the reviews and comments! I've included the big-endian > > requirement so the proposed language is now as

Re: [VOTE][Format] JSON canonical extension type

2024-04-30 Thread Rok Mihevc
Hi all, thanks for the votes and comments so far. I've amended [1] the proposed language with the RFC-8259 requirement as it seems to be almost unanimously requested. New language is below. To Micah's comment regarding rejecting Binary arrays [2] - please discuss in the PR. Let's leave the vote
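
To make the RFC-8259 requirement concrete, here is a minimal sketch of what a check on a single stored string might look like. It assumes Python's standard `json` module is an acceptable approximation of an RFC 8259 parser once its non-standard constants are rejected; the helper names `_reject_constant` and `is_rfc8259_json` are made up for illustration and are not part of any Arrow API.

```python
import json


def _reject_constant(name: str):
    # json.loads calls this hook for "NaN", "Infinity" and "-Infinity".
    # Python accepts these by default, but RFC 8259 does not define them.
    raise ValueError(f"{name} is not valid JSON under RFC 8259")


def is_rfc8259_json(text: str) -> bool:
    """Return True if `text` parses as a single JSON value.

    Illustrative helper only: it relies on the standard library parser
    and is not a certified RFC 8259 validator.
    """
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return False
    return True
```

The point of the hook is that a lenient parser can silently accept values that another implementation would reject, which is presumably why reviewers asked for the RFC to be named explicitly.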

Re: [Discuss] Extension types based on canonical extension types?

2024-04-30 Thread Dewey Dunnington
I don't think there is any current barrier to using implementation features of one extension type to help with another. In Python, for example, one might be able to do: class GeoJSONExtensionType(pa.ExtensionType): def __init__(self): self._json_ext = pa.JSONExtensionType() def
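
The composition idea in this message can be sketched without any Arrow dependency. The classes below are hypothetical stand-ins, not pyarrow classes, and the extension names are placeholders; the only thing the sketch is meant to show is one type holding another and delegating to it.

```python
import json


class JSONType:
    """Hypothetical stand-in for a JSON extension type (not a pyarrow class)."""

    extension_name = "example.json"  # placeholder name, an assumption

    def decode(self, text: str):
        # Parse one stored string value into a Python object.
        return json.loads(text)


class GeoJSONType:
    """Hypothetical type that reuses JSONType by composition."""

    extension_name = "example.geojson"  # placeholder name, an assumption

    def __init__(self):
        # Hold the JSON type instead of inheriting from it.
        self._json_ext = JSONType()

    def decode(self, text: str) -> dict:
        # Delegate the generic JSON handling, then add a GeoJSON-specific check:
        # every GeoJSON object is a JSON object with a "type" member.
        obj = self._json_ext.decode(text)
        if not isinstance(obj, dict) or "type" not in obj:
            raise ValueError("not a GeoJSON object: missing 'type' member")
        return obj
```

Nothing here is visible to a consumer of the data: the reuse lives entirely inside the implementation, which matches the observation that there is no barrier on the implementation side.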

Re: [Discuss] Extension types based on canonical extension types?

2024-04-30 Thread Ian Cook
But consider that a user might want to define a non-canonical HLLSKETCH extension type and make use of Arrow implementations' features for handling JSON canonical extension type columns in order to handle HLLSKETCH extension type columns. The spec currently does not provide any means to enable

Re: [Discuss] Extension types based on canonical extension types?

2024-04-30 Thread Weston Pace
I think "inheritance" and "composition" are more concerns for implementations than they are for spec (I could be wrong here). So it seems that it would be sufficient to write the HLLSKETCH's canonical definition as "this is an extension of the JSON logical type and supports all the same storage

Re: [VOTE][Format] UUID canonical extension type

2024-04-30 Thread Weston Pace
+1 (binding) On Tue, Apr 30, 2024 at 7:53 AM Rok Mihevc wrote: > Thanks for all the reviews and comments! I've included the big-endian > requirement so the proposed language is now as below. > I'll leave the vote open until after the May holiday. > > Rok > > UUID > > > * Extension name:

Re: [VOTE][Format] JSON canonical extension type

2024-04-30 Thread Weston Pace
+1 (binding) I agree we should be explicit about RFC-8259 On Mon, Apr 29, 2024 at 4:46 PM David Li wrote: > +1 (binding) > > assuming we explicitly state RFC-8259 > > On Tue, Apr 30, 2024, at 08:02, Matt Topol wrote: > > +1 (binding) > > > > On Mon, Apr 29, 2024 at 5:36 PM Ian Cook wrote: > >

Re: [DISCUSS] Drop Java 8 support

2024-04-30 Thread Jacob Wujciak
Hello everyone! Great to see this move forward! +1 on dropping both 8 and 11 unless there is a very good reason to keep 11 around. Otherwise people will just move to 11 and then have the pain of migration again when we drop that (which will happen soon regardless imo). On Tue, 30 Apr 2024 at

[CROWDSOURCING] May 2024 ASF Board Report

2024-04-30 Thread Andrew Lamb
As part of being a new project, we need to submit reports to the board every month for the first three months[1]. In the tradition of Apache Arrow, I hope the community can help draft this report. Please take a look and add anything that might be relevant[2]. Thanks, Andrew [1]:

Re: [VOTE][Format] UUID canonical extension type

2024-04-30 Thread Rok Mihevc
Thanks for all the reviews and comments! I've included the big-endian requirement so the proposed language is now as below. I'll leave the vote open until after the May holiday. Rok UUID * Extension name: `arrow.uuid`. * The storage type of the extension is ``FixedSizeBinary`` with a
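
As an illustration of why the big-endian requirement matters, the sketch below uses Python's standard `uuid` module, which exposes both a big-endian and a mixed-endian byte layout for the same value. It assumes each stored value is the 128 bits (16 bytes) of a UUID; the storage width in the proposal is cut off in the excerpt above and is not restated here. The helpers `to_storage` and `from_storage` are hypothetical names.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# UUID.bytes is the 128-bit value in big-endian (network) byte order.
big_endian = u.bytes

# UUID.bytes_le stores the first three fields little-endian instead,
# so the same UUID has a different byte sequence.
mixed_endian = u.bytes_le


def to_storage(value: uuid.UUID) -> bytes:
    """Hypothetical helper: encode a UUID as big-endian bytes."""
    return value.bytes


def from_storage(raw: bytes) -> uuid.UUID:
    """Hypothetical helper: decode big-endian bytes back into a UUID."""
    return uuid.UUID(bytes=raw)
```

Without an agreed byte order, a writer using one layout and a reader assuming the other would both see valid-looking but different UUIDs.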

Re: [Discuss] Extension types based on canonical extension types?

2024-04-30 Thread Matt Topol
I think the biggest blocker to doing this is the way that we pass extension types through IPC. Extension types are sent as their underlying storage type with metadata key-value pairs of specific keys "ARROW:extension:name" and "ARROW:extension:metadata". Since you can't have multiple values for
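
A rough sketch of the constraint described here, modelling a field's metadata as a plain Python dict. That model, the `tag_field` helper, and the extension names `example.json` and `example.hllsketch` are all assumptions for illustration; only the two metadata keys come from the message.

```python
EXT_NAME_KEY = "ARROW:extension:name"
EXT_METADATA_KEY = "ARROW:extension:metadata"


def tag_field(storage_type, ext_name, ext_metadata="", existing=None):
    """Hypothetical helper: describe a field as storage type plus metadata.

    The extension is carried only through the two reserved keys, so
    tagging a field that is already tagged replaces the earlier name.
    """
    metadata = dict(existing or {})
    metadata[EXT_NAME_KEY] = ext_name
    metadata[EXT_METADATA_KEY] = ext_metadata
    return {"storage_type": storage_type, "metadata": metadata}


# First tag the field as a (placeholder) JSON extension type...
json_field = tag_field("utf8", "example.json")

# ...then try to layer a second extension type on top of it.
sketch_field = tag_field(
    "utf8", "example.hllsketch", existing=json_field["metadata"]
)
```

In this model `sketch_field` carries only the second name: the reader sees the storage type and one extension name, with no record that the type was built on another extension type.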

Re: [DISCUSS] Drop Java 8 support

2024-04-30 Thread Dane Pitkin
Thanks, JB. Are we aware of any downstream dependencies that would benefit from maintaining Java 11 support? Apache Spark jumped straight to Java 17. It seems other projects are dropping both 8 and 11 at the same time as mentioned by Fokko. From a maintenance perspective, it would be nice to drop

[Discuss] Extension types based on canonical extension types?

2024-04-30 Thread Ian Cook
The vote on adding a JSON canonical extension type [1] got me wondering: Is it possible to define an extension type that is based on a canonical extension type? If so, how? For example, say I wanted to define a (non-canonical) HLLSKETCH extension type that corresponds to the type that Redshift