Re: [DISCUSSION] New Flags for Arrow C Interface Schema

2024-05-15 Thread Matt Topol
a single column? > >>>>> > >>>>> I suppose that I would have expected two functions (one to create a > >>>>> table and one to create a column). As a consumer I can't envision a > >>>>> situation where I would want to import an

Re: Fwd: [C++] Parquet and Arrow overlap

2024-05-10 Thread Matt Topol
I just wanted to also poke the question of non-Java developers who have worked on the other parquet implementations potentially being recognized as committers or otherwise on the Parquet project (speaking as the primary developer of the Go parquet implementation which also lives in the Arrow

Re: [Discuss] Extension types based on canonical extension types?

2024-04-30 Thread Matt Topol
I think the biggest blocker to doing this is the way that we pass extension types through IPC. Extension types are sent as their underlying storage type with metadata key-value pairs of specific keys "ARROW:extension:name" and "ARROW:extension:metadata". Since you can't have multiple values for
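
The constraint described in this snippet can be sketched in plain Python. This is only an illustration: field metadata is modeled as a `dict` (a simplified stand-in, not a real Arrow API), and it assumes the truncated sentence is about each metadata key holding a single value. The two key names come from the message; the helper `tag_extension` and the extension names `example.inner` / `example.outer` are hypothetical.

```python
# Simplified stand-in for a field's key-value metadata as described in the
# message above. This is NOT pyarrow or any real Arrow API.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_META_KEY = "ARROW:extension:metadata"


def tag_extension(metadata: dict, name: str, serialized: str) -> dict:
    """Return a copy of `metadata` tagged with one extension type (hypothetical helper)."""
    tagged = dict(metadata)
    tagged[EXT_NAME_KEY] = name
    tagged[EXT_META_KEY] = serialized
    return tagged


# The storage type travels as-is; the extension identity rides along in metadata.
field_meta = tag_extension({}, "example.inner", "{}")

# Tagging a second time overwrites the first extension instead of stacking on it,
# because each key can hold only one value.
field_meta = tag_extension(field_meta, "example.outer", '{"wraps": "example.inner"}')

print(field_meta[EXT_NAME_KEY])
```

Under this model, an extension type layered on another extension type would have to encode the inner type inside its own serialized metadata, since the outer name replaces the inner one.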

Re: [VOTE][Format] UUID canonical extension type

2024-04-29 Thread Matt Topol
+1 (binding) pending agreement on the endianness which I agree needs to be specified in the docs. While I lean towards big-endian as it appears most implementations of UUID use a big-endian byte order, I don't much mind what endianness we use as long as we explicitly specify it in the spec. On
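
To make the byte-order question concrete, here is a small sketch using Python's standard `uuid` module, which exposes both a big-endian and a mixed-endian serialization of the same value. The UUID value is arbitrary and chosen only so the byte swaps are easy to see; it says nothing about what the vote ultimately specified.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# Big-endian: the 16 bytes appear in the same order as the hex string.
big_endian = u.bytes

# bytes_le: time_low, time_mid and time_hi_version are stored little-endian,
# while the remaining 8 bytes are unchanged.
mixed_endian = u.bytes_le

print(big_endian.hex())
print(mixed_endian.hex())
```

Two producers that disagree on this ordering would write different 16-byte values for the same UUID, which is why the spec needs to state the order explicitly.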

Re: [VOTE][Format] JSON canonical extension type

2024-04-29 Thread Matt Topol
+1 (binding) On Mon, Apr 29, 2024 at 5:36 PM Ian Cook wrote: > +1 (non-binding) > > I added a comment in the PR suggesting that we explicitly refer to RFC-8259 > in CanonicalExtensions.rst. > > On Mon, Apr 29, 2024 at 1:21 PM Micah Kornfield > wrote: > > > +1, I added a comment to the PR

Re: ADBC - OS-level driver manager

2024-04-23 Thread Matt Topol
>> to use Power BI with Oracle, they either need a way to install Oracle > > >> drivers onto their machine in a standard way which lets us find them > or > > we > > >> need to go through a painful and sometimes expensive "biz dev" effort > to > >

Re: [DISCUSSION] New Flags for Arrow C Interface Schema

2024-04-21 Thread Matt Topol
s, > > -dewey > > On Fri, Apr 19, 2024 at 6:34 PM Matt Topol wrote: > > > > Hey everyone, > > > > With some of the other developments surrounding libraries adopting the > > Arrow C Data interfaces, there's been a consistent question about > handling

[DISCUSSION] New Flags for Arrow C Interface Schema

2024-04-19 Thread Matt Topol
Hey everyone, With some of the other developments surrounding libraries adopting the Arrow C Data interfaces, there's been a consistent question about handling tables (record batch) vs columns vs scalars. Right now, a Record Batch is sent through the C interface as a struct column whose children
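
The ambiguity raised here can be sketched with a toy model. The class `SchemaSketch` below is a hypothetical Python mirror of a few `ArrowSchema` fields (`format`, `name`, `children`); it is not the C struct itself. The format strings `+s` (struct), `l` (int64) and `u` (utf8) are from the C Data Interface; the column names are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaSketch:
    """Toy mirror of a few ArrowSchema fields; illustration only."""
    format: str
    name: str = ""
    children: List["SchemaSketch"] = field(default_factory=list)


def describe(schema: SchemaSketch) -> str:
    """Summarize a schema by its format string and its children."""
    cols = ", ".join(f"{c.name}:{c.format}" for c in schema.children)
    return f"{schema.format}({cols})"


def make_children() -> List[SchemaSketch]:
    return [SchemaSketch("l", "id"), SchemaSketch("u", "label")]


# A record batch is exported as a struct whose children are the columns...
record_batch = SchemaSketch("+s", "", make_children())
# ...and a genuine struct-typed column is exported the same way.
struct_column = SchemaSketch("+s", "payload", make_children())

print(describe(record_batch))
print(describe(record_batch) == describe(struct_column))
```

Nothing in the format string or the children tells a consumer which of the two the producer meant, which is the gap the proposed flags are intended to close.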

[Go][Release][Discussion] Backporting something to a previous version

2024-04-15 Thread Matt Topol
Hey all, There was a request to backport a fix for Go Arrow from v13 back to v12 [1] to improve a dependency situation for a user of Go Arrow where the databricks-sql-go driver is using Arrow v12 currently, thus exhibiting the bug they want to backport a fix for. To be fair, they have also made a

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-04-12 Thread Matt Topol
/arrow/pull/41180 On Tue, Apr 9, 2024, 10:39 AM Matt Topol wrote: > Hey JB, > > The next step for me is going to be converting the document into a > markdown version with more prose that we can add to the Arrow documentation > site (marked as Experimental of course). > > >

Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-10 Thread Matt Topol
Hi John, I recently proposed on the mailing list an experimental extension of the Arrow IPC protocol that would make it easier to leverage disaggregated shared memory along with non-cpu memory via utilities such as UCX and libfabric [1]. I'll be putting together a more formal description of it

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-04-09 Thread Matt Topol
us, what's the next step (if I can help in any way :) ) ? > > Regards > JB > > On Tue, Feb 27, 2024 at 6:35 PM Matt Topol wrote: > > > > Hey all, > > > > I'd like to propose a vote for us to officially adopt the protocol > > described in the google doc[1] f

Re: [VOTE] Add new info codes and options keys to ADBC specification

2024-04-06 Thread Matt Topol
+1 On Sat, Apr 6, 2024, 4:54 AM Andrew Lamb wrote: > +1 > > On Fri, Apr 5, 2024 at 9:55 PM Jacob Wujciak > wrote: > > > + 1 (non-binding) > > > > Am Sa., 6. Apr. 2024 um 01:57 Uhr schrieb Joel Lubinitsky < > > joell...@gmail.com>: > > > > > Yes, just updated both the issue and the PR. > > > >

[RESULT] Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-04-05 Thread Matt Topol
> > fine voting +1 on this (I'm not sure a formal vote is even needed). I > > would want to see at least 2 implementations if we wanted to remove the > > experimental label. > > > > On Sun, Mar 31, 2024 at 2:43 PM Joel Lubinitsky > > wrote: > > > >

Re: [VOTE] Bulk ingestion support for Flight SQL (vote #2)

2024-04-05 Thread Matt Topol
+1 (binding) On Fri, Apr 5, 2024, 5:25 AM Joel Lubinitsky wrote: > Thanks David, > > Just a minor correction: The reference implementation is at [1]. The link > in your message is to an earlier version of the PR that has been closed. > > My vote: +1 > > [1]:

[ANNOUNCE] New Committer Joel Lubinitsky

2024-04-01 Thread Matt Topol
On behalf of the Arrow PMC, I'm happy to announce that Joel Lubinitsky has accepted an invitation to become a committer on Apache Arrow. Welcome, and thank you for your contributions! --Matt

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-03-28 Thread Matt Topol
oach is useful and generally > applicable. > > So a big +1 for the idea of disassociated transports but I'm not sure why > we need a vote to start working on it (but I'm not opposed if a vote helps) > > [1] > > https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-co

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-03-28 Thread Matt Topol
I'll keep this new vote open for at least the next 72 hours. As before please reply with: [ ] +1 Accept this Proposal [ ] +0 [ ] -1 Do not accept this proposal because... Thanks everyone! On Wed, Mar 27, 2024 at 7:51 PM Benjamin Kietzman wrote: > +1 > > On Tue, Mar 26, 2024, 18:36 M

Re: [VOTE] Release Apache Arrow ADBC 0.11.0 - RC0

2024-03-28 Thread Matt Topol
+1 (binding) Verified on Pop!_OS 22.04 amd64 using Conda with: USE_CONDA=1 ./dev/release/verify-release-candidate.sh 0.11.0 0 Though there's one issue that I don't think should block the release: > Running the tests in ‘tests/testthat.R’ failed. > Last 13 lines of output: > ── Error

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-03-26 Thread Matt Topol
Should I start a new thread for a new vote? Or repeat the original vote email here? Just asking since there haven't been any responses so far. --Matt On Thu, Mar 21, 2024 at 11:46 AM Matt Topol wrote: > Absolutely, it will be marked experimental until we see some people using > it and c

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-03-21 Thread Matt Topol
Li wrote: > I think let's try again. Would it be reasonable to declare this > 'experimental' for the time being, just as we did with Flight/Flight > SQL/etc? > > On Tue, Mar 19, 2024, at 15:24, Matt Topol wrote: > > Hey All, It's been another month and we've gotten a whol

Re: [VOTE] Stateless prepared statements in FlightSQL

2024-03-21 Thread Matt Topol
+1 (binding) I'm gonna give the Go impl another review and once-over, but in general it looks good and the idea is sound. On Thu, Mar 21, 2024, 10:12 AM Andrew Lamb wrote: > +1 (binding) > > I reviewed the spec proposal and the rust implementation and I think they > look good to go. I am not

Re: Apache Arrow Flight - From Rust to Javascript (FlightData)

2024-03-20 Thread Matt Topol
I don't think there is currently a direct equivalent to `FlightRecordBatchStream` in the arrow javascript library, but you should be able to combine the data header + body and then read it using the `fromIPC` functions since it's just the Arrow IPC format On Fri, Mar 15, 2024 at 5:39 AM Alexander

Re: ADBC - OS-level driver manager

2024-03-20 Thread Matt Topol
> it seems like the current driver manager work has been largely targeting an app-specific implementation. Yup, that was the intention. So far discussions of ADBC having a system-wide driver registration paradigm like ODBC have mostly been to discuss how much we dislike that paradigm and would

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-03-19 Thread Matt Topol
ok at the proposal and don’t think there’s anything > preventing in-place updating in the future - ultimately the data body could > just be in the same location for subsequent messages. > > Thanks! > Paul > > On Fri, Mar 1, 2024 at 5:28 PM Matt Topol wrote: > > >

Re: [ANNOUNCE] New Arrow committer: Bryce Mecum

2024-03-17 Thread Matt Topol
Congrats!!! Well deserved!! On Sun, Mar 17, 2024, 11:19 PM Weston Pace wrote: > Congratulations! > > On Sun, Mar 17, 2024, 8:01 PM Jacob Wujciak wrote: > > > Congrats, well deserved! > > > > Nic Crane schrieb am Mo., 18. März 2024, 03:24: > > > > > On behalf of the Arrow PMC, I'm happy to

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-03-01 Thread Matt Topol
reading the proposal initially, I gleaned that the most > important > > audience was those writing interfaces to GPUs/remote memory/non-standard > > transports/etc. And it wasn't clear to me whether updating batches in > > place (and the producer/consumer coordination that comes with that)

Re: [VOTE] Move Arrow DataFusion Subproject to new Top Level Apache Project

2024-03-01 Thread Matt Topol
+1 (binding) On Fri, Mar 1, 2024, 12:58 PM QP Hou wrote: > +1 (binding) > > exciting milestone :) > > On Fri, Mar 1, 2024 at 9:49 AM David Li wrote: > > > > +1 > > > > On Fri, Mar 1, 2024, at 12:06, Jorge Cardoso Leitão wrote: > > > +1 - great work!!! > > > > > > On Fri, Mar 1, 2024 at 5:49 PM

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Matt Topol
't really have any solution for > > generating engagement except nagging and pinging people explicitly :-) > > > > > > > > Le 27/02/2024 à 19:09, Matt Topol a écrit : > > > I would like to see the same Antoine, currently given the lack of > > > engagemen

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Matt Topol
parties before this is formally adopted as an Arrow spec. > > Regards > > Antoine. > > > Le 27/02/2024 à 18:35, Matt Topol a écrit : > > Hey all, > > > > I'd like to propose a vote for us to officially adopt the protocol > > described in the google do

[VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Matt Topol
Hey all, I'd like to propose a vote for us to officially adopt the protocol described in the google doc[1] for Dissociated Arrow IPC Transports. This proposal was originally discussed at [2]. Once this proposal is adopted, I will work on adding the necessary documentation to the Arrow website

Re: [DISCUSS] Proposal to expand Arrow Communications

2024-02-12 Thread Matt Topol
> > > Thanks ! > Regards > JB > > On Sat, Feb 3, 2024 at 12:22 AM Matt Topol wrote: > > > > Hey all, > > > > In my current work I've been experimenting and playing around with > > utilizing Arrow and non-cpu memory data. While the creation of the >

Re: [DISCUSS] Flight RPC: add 'fallback' URI scheme

2024-02-12 Thread Matt Topol
> (Correct me if I'm wrong Matt, but as I recall, UCX addresses aren't hostnames but rather opaque byte blobs, for instance.) You can use a hostname and port to create a ucx connection, but there is separately an address object. A UCX address object is an opaque byte blob that includes a whole

[DISCUSS] Proposal to expand Arrow Communications

2024-02-02 Thread Matt Topol
Hey all, In my current work I've been experimenting and playing around with utilizing Arrow and non-cpu memory data. While the creation of the ArrowDeviceArray struct and the enhancements to the Arrow library Device abstractions were necessary, there is also a need to extend the communications

Re: [VOTE] Accept donation of Comet Spark native engine

2024-01-27 Thread Matt Topol
+1 (binding) On Sat, Jan 27, 2024, 6:00 PM Wes McKinney wrote: > +1 (binding) > > On Sat, Jan 27, 2024 at 12:26 PM Micah Kornfield > wrote: > > > +1 Binding > > > > On Sat, Jan 27, 2024 at 10:21 AM David Li wrote: > > > > > +1 (binding) > > > > > > On Sat, Jan 27, 2024, at 13:03, L. C. Hsieh

Re: [VOTE] Release Apache Arrow 15.0.0 - RC1

2024-01-19 Thread Matt Topol
on Debian 12 'bookworm'. I had issues with binaries but > that was because of AlmaLinux failing to verify its own GPG key for some > reason. > > On Thu, Jan 18, 2024, at 04:40, Raúl Cumplido wrote: > > El mié, 17 ene 2024 a las 23:37, Matt Topol () > escribió: > >> > >

Re: [VOTE] Release Apache Arrow 15.0.0 - RC1

2024-01-17 Thread Matt Topol
o increase the chance? > > [1] > > https://github.com/apache/arrow/blob/c170af41ba0c30b80aa4172da0b3637206368cf2/go/arrow/flight/flightsql/driver/utils_test.go#L90 > > *Regards,* > *Rossi* > > > Matt Topol 于2024年1月18日周四 02:55写道: > > > @pitrou Looks like

Re: [VOTE] Release Apache Arrow 15.0.0 - RC1

2024-01-17 Thread Matt Topol
and I don't have access to a mac at the moment. Would anyone happen to have a mac they can try to dig into and check out that unit test on? Otherwise I can spin up an AWS instance and try replicating and debugging on that if necessary. --Matt On Wed, Jan 17, 2024 at 1:30 PM Matt Topol wrote

Re: [VOTE] Release Apache Arrow 15.0.0 - RC1

2024-01-17 Thread Matt Topol
I'll take a look at that Go test failure in a bit. As for the ubuntu 22.04 verification failure, I'll double check that we're installing Go 1.19 for the verification and using the right PATH to it, I thought we addressed this but I guess something must have been overlooked. --Matt On Wed, Jan

Re: [DISCUSS] Flight SQL as experimental

2023-12-06 Thread Matt Topol
+1, I agree with everyone else On Wed, Dec 6, 2023 at 7:49 PM James Duong wrote: > +1 from me. It's used in a good number of databases now. > > Get Outlook for Android > > From: David Li > Sent: Wednesday, December 6, 2023 9:59:54 AM >

Re: [ANNOUNCE] New Arrow PMC chair: Andy Grove

2023-11-27 Thread Matt Topol
Congrats Andy! On Mon, Nov 27, 2023 at 9:44 AM Gavin Ray wrote: > Yay, congrats Andy! Well-deserved! > > On Mon, Nov 27, 2023 at 9:13 AM Kevin Gurney > > wrote: > > > Congratulations, Andy! > > > > From: Raúl Cumplido > > Sent: Monday, November 27, 2023 8:58

Re: [VOTE][FORMAT] Bulk ingestion support for Flight SQL

2023-11-15 Thread Matt Topol
+1 On Wed, Nov 15, 2023, 10:44 AM Jean-Baptiste Onofré wrote: > +1 (non binding) > > Regards > JB > > On Wed, Nov 15, 2023 at 4:37 PM David Li wrote: > > > > My vote: +1 > > > > Are any PMC members able to give this a look? > > > > On Thu, Nov 9, 2023, at 04:36, Antoine Pitrou wrote: > > > For

Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

2023-11-13 Thread Matt Topol
Congratulations Raul!! On Mon, Nov 13, 2023, 3:09 PM Antoine Pitrou wrote: > > Welcome Raul, we're glad to have you! > > Regards > > Antoine. > > > Le 13/11/2023 à 20:27, Andrew Lamb a écrit : > > The Project Management Committee (PMC) for Apache Arrow has invited > > Raúl Cumplido to become a

Re: [VOTE][Format] C data interface format strings for Utf8View and BinaryView

2023-10-18 Thread Matt Topol
+1 On Wed, Oct 18, 2023 at 1:05 PM Antoine Pitrou wrote: > +1 > > Le 18/10/2023 à 19:02, Benjamin Kietzman a écrit : > > Hello all, > > > > I propose "vu" and "vz" as format strings for the Utf8View and > > BinaryView types in the Arrow C data interface [1]. > > > > The vote will be open for at

Re: Apache Arrow file format

2023-10-17 Thread Matt Topol
One benefit of the feather format (i.e. Arrow IPC file format) is the ability to mmap the file to easily handle reading sections of a larger than memory file of data. Since, as Felipe mentioned, the format is focused on in-memory representation, you can easily and simply mmap the file and use the
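
The memory-mapping idea can be sketched with the Python standard library alone. This is not an Arrow IPC reader: the file below is a throwaway file of fixed-width integers (the name `records.bin` and the 8-byte record layout are assumptions made for the example), used only to show that a slice of a mapped file can be read without loading the rest.

```python
import mmap
import os
import tempfile

# Write a throwaway file of 1000 fixed-width 8-byte little-endian records.
path = os.path.join(tempfile.mkdtemp(), "records.bin")
with open(path, "wb") as f:
    for i in range(1000):
        f.write(i.to_bytes(8, "little"))

with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    # Slicing the mapping only touches the pages backing records 500..502;
    # the operating system pages data in on demand.
    window = mm[500 * 8 : 503 * 8]
    values = [
        int.from_bytes(window[i : i + 8], "little")
        for i in range(0, len(window), 8)
    ]

print(values)
```

A format whose on-disk layout matches its in-memory layout can use the mapped bytes directly in this way, whereas a format that needs decoding must first materialize the decoded data.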

Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

2023-10-14 Thread Matt Topol
Congrats Jon!!! On Sat, Oct 14, 2023, 1:42 PM David Li wrote: > Congrats Jon! > > On Sat, Oct 14, 2023, at 13:25, Ian Cook wrote: > > Congratulations Jonathan! > > > > On Sat, Oct 14, 2023 at 13:24 Andrew Lamb wrote: > > > >> The Project Management Committee (PMC) for Apache Arrow has invited

Re: [Vote][Format] (new proposal) C data interface format string for ListView and LargeListView arrays

2023-10-06 Thread Matt Topol
+1 On Fri, Oct 6, 2023, 6:55 PM Benjamin Kietzman wrote: > +1 > > On Fri, Oct 6, 2023, 17:27 Felipe Oliveira Carvalho > wrote: > > > Hello, > > > > I'm writing to propose "+vl" and "+vL" as format strings for list-view > and > > large list-view arrays passing through the Arrow C data interface

[RESULT][VOTE][Format] Add ListView and LargeListView Arrays to Arrow Format

2023-10-04 Thread Matt Topol
38C17-L238C17 > > > On Tue, 3 Oct 2023 at 00:22 Micah Kornfield wrote: > > > Sorry to chime in late. In practice I'm not sure how much LargeList is > > used? Are we doing this just for symmetry purposes? Is there a known > > use-case for it? > > > > On Mon,

[RESULT] [VOTE] [Format] Add app_metadata to FlightInfo and FlightEndpoint

2023-10-03 Thread Matt Topol
htEndpoint`s and `FlightData` chunks. > > > Le 12/09/2023 à 17:48, Matt Topol a écrit : > > Hey all, > > > > I would like to propose adding a new app_metadata field to both the > > FlightInfo and FlightEndpoint message types of the Arrow Flight protocol. > > There has

Re: [DISCUSS][C++] Raw pointer string views

2023-10-02 Thread Matt Topol
Given the benchmarks that Ben provided, I think I still have one concern if we only support the offset-based representation: @Raphael: > Conversion between the two view representations is relatively fast, especially for small strings I think this is a bit of an oversimplification given Ben's

Re: [VOTE][Format] Add ListView and LargeListView Arrays to Arrow Format

2023-10-02 Thread Matt Topol
Should have expanded my messages, I forgot that I already +1'd this, d'oh! Sorry for the spam! --Matt On Mon, Oct 2, 2023 at 2:19 PM Matt Topol wrote: > +1 > > On Mon, Oct 2, 2023 at 8:54 AM Raphael Taylor-Davies > wrote: > >> +1 >> >> On 02/10

Re: [VOTE][Format] Add ListView and LargeListView Arrays to Arrow Format

2023-10-02 Thread Matt Topol
+1 On Mon, Oct 2, 2023 at 8:54 AM Raphael Taylor-Davies wrote: > +1 > > On 02/10/2023 13:53, Antoine Pitrou wrote: > > > > Hello, > > > > +1 and thanks for working on this! > > > > There'll probably be some minor comments to the format PR, but those > > don't deter from accepting these new

Re: [VOTE][Format] Variable shape tensor canonical extension type

2023-09-29 Thread Matt Topol
+1 Thanks for all the work here! On Fri, Sep 29, 2023 at 11:04 AM Dewey Dunnington wrote: > +1! Thank you for iterating on this with all of us! > > On Fri, Sep 29, 2023 at 11:28 AM Alenka Frim > wrote: > > > > +1 > > Thanks for pushing this through! > > > > On Wed, Sep 27, 2023 at 2:44 PM Rok

Re: [VOTE][Format] Add ListView and LargeListView Arrays to Arrow Format

2023-09-29 Thread Matt Topol
+1, thanks Felipe for your perseverance here! On Fri, Sep 29, 2023, 12:55 PM wish maple wrote: > +1 > > LGTM, thanks! > > Ian Cook 于2023年9月30日周六 00:49写道: > > > +1 (non-binding) > > > > Thanks very much Felipe for your persistence and your commitment to > > addressing the numerous questions and

Re: [DISCUSS][C++] Raw pointer string views

2023-09-26 Thread Matt Topol
I believe the motivation is to avoid the cost of the data copy that would have to happen to convert from a pointer based to offset based scenario. Allowing the pointer-based implementation will ensure that we can maintain zero-copy communication with both DuckDB and Velox in a common workflow

Re: [LAST CALL][DISCUSS] Unsigned integers in Utf8View

2023-09-20 Thread Matt Topol
Just to chime in (and add yet another voice into the mix here), I'd have a preference for it being signed integers for the same reasons as most everyone else: consistency with everything else in the spec. Since we use signed integers everywhere, I'd prefer to keep it consistent rather than

Re: [VOTE] [Format] Add app_metadata to FlightInfo and FlightEndpoint

2023-09-14 Thread Matt Topol
The PR has been updated for a bit with both C++ and Go implementations, hopefully I can get some more votes on this thread? On Tue, Sep 12, 2023 at 12:16 PM Matt Topol wrote: > The C++ code gets auto-generated during build right? Ah, fair point the > C++ still uses its own objects. I'll

Re: [VOTE] [Format] Add app_metadata to FlightInfo and FlightEndpoint

2023-09-12 Thread Matt Topol
tation)? > > On Tue, Sep 12, 2023, at 11:48, Matt Topol wrote: > > Hey all, > > > > I would like to propose adding a new app_metadata field to both the > > FlightInfo and FlightEndpoint message types of the Arrow Flight protocol. > > There has been discussion of doing

Re: [VOTE] Release Apache Arrow Flight SQL adapter for PostgreSQL 0.1.0 - RC6

2023-09-12 Thread Matt Topol
+1 Though I ran into the same issue as David, but the verify script ran successfully On Tue, Sep 12, 2023 at 10:56 AM David Li wrote: > +1 > > Though, I couldn't figure out how to get run-postgresql.sh to work for my > setup (postgres installed via Conda), as initdb complained about the >

[VOTE] [Format] Add app_metadata to FlightInfo and FlightEndpoint

2023-09-12 Thread Matt Topol
Hey all, I would like to propose adding a new app_metadata field to both the FlightInfo and FlightEndpoint message types of the Arrow Flight protocol. There has been discussion of doing so for a while and has now been brought back up in regards to [1]. More specifically, this enables adding

Re: [Vote][Format] C Data Interface Format string for REE

2023-08-22 Thread Matt Topol
: > > > > +1 (binding) > > > > Cheers, > > > > -Jacob > > > > On Wed, Aug 16, 2023 at 8:16 AM Matt Topol > > > wrote: > > > > > Hey All, > > > > > > As proposed by Felipe [1] I'm starting a vote on the propos

Re: [Vote][Format] C Data Interface Format string for REE

2023-08-16 Thread Matt Topol
t would be nice to get approval from authors of other implementations > such as Rust, C#, Javascript... > > Thanks for doing this! > > > Le 16/08/2023 à 16:16, Matt Topol a écrit : > > Hey All, > > > > As proposed by Felipe [1] I'm starting a vote on the pro

[Vote][Format] C Data Interface Format string for REE

2023-08-16 Thread Matt Topol
Hey All, As proposed by Felipe [1] I'm starting a vote on the proposed update to the Format Spec of adding "+r" as the format string for passing Run-End Encoded arrays through the Arrow C Data Interface. A PR containing an update to the C++ Arrow implementation to add support for this format

Re: [Format] C data interface format string for run-end encoded arrays

2023-08-15 Thread Matt Topol
Sounds good, I'll send out an email starting the vote On Tue, Aug 15, 2023 at 2:30 PM Antoine Pitrou wrote: > > I think we should. > > Regards > > Antoine. > > > Le 15/08/2023 à 19:58, Matt Topol a écrit : > > I'm in favor of this as the C

Re: [Format] C data interface format string for run-end encoded arrays

2023-08-15 Thread Matt Topol
I'm in favor of this as the C Data format string. Though since this is technically a format/spec change do others think we should take a vote on this? --Matt On Tue, Aug 15, 2023, 12:19 PM Felipe Oliveira Carvalho wrote: > Hello, > > I'm writing to inform you that I'm proposing "+r" as format

Re: [VOTE] Apache Arrow ADBC (API) 1.1.0

2023-08-14 Thread Matt Topol
will be open for at least 72 hours. > > > > [ ] +1 Adopt the ADBC 1.1.0 specification > > [ ] 0 > > [ ] -1 Do not adopt the specification because... > > > > Thanks to Sutou Kouhei, Matt Topol, Dewey Dunnington, Antoine Pitrou, > Will > > Ayd, a

Re: [DISCUSS][Format] Draft implementation of string view array format

2023-07-31 Thread Matt Topol
defer selection than > >>>> baking > >>>>>> it into the array, but I also don't have any workloads where this is > >> the > >>>>>> major bottleneck so can't speak authoritatively here. > >>>>>> > >>>

Re: [QUESTION][BLOG] Contributing a Blog Post

2023-07-14 Thread Matt Topol
I think this would be a great idea! It's been great seeing various organizations posting on the Arrow blog and this would be a great contribution. Assuming that no one objects, you can contribute a PR to https://github.com/apache/arrow-site --Matt On Fri, Jul 14, 2023 at 10:17 AM Christopher

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Matt Topol
I don't have much to add but I do want to second Jacob's comments. I agree that this is a good way to avoid the fragmentation while keeping Arrow relevant, and likely something we need to do so that we can ensure Arrow remains the way to do this data integration and interoperability. On Wed, Jul

Re: Do we need CODEOWNERS ?

2023-07-04 Thread Matt Topol
I've found it useful for me so far since it auto adds me on any Go related PRs so I don't need to sift through the notifications or active PRs, and instead can easily find them in my reviews on GitHub notifications. But if everyone else finds it more detrimental than helpful I can set up a custom

Re: [ANNOUNCE] New Arrow committer: Kevin Gurney

2023-07-04 Thread Matt Topol
Welcome! On Tue, Jul 4, 2023, 11:06 AM Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > Congrats Kevin! > > On Tue, 4 Jul 2023 at 13:47, David Li wrote: > > > > Welcome Kevin! > > > > On Tue, Jul 4, 2023, at 05:55, Raúl Cumplido wrote: > > > Congratulations Kevin!!! > > > > > > El

Re: [VOTE][Format][Flight] Result set expiration support

2023-06-28 Thread Matt Topol
+1 Thanks kou! On Wed, Jun 28, 2023, 10:33 AM David Li wrote: > +1 > > Thanks Kou! > > On Tue, Jun 27, 2023, at 21:31, Sutou Kouhei wrote: > > +1 > > > > In <20230628.103017.2111667987485891680@clear-code.com> > > "[VOTE][Format][Flight] Result set expiration support" on Wed, 28 Jun > >

Re: [VOTE] Release Apache Arrow ADBC 0.5.1 - RC1

2023-06-23 Thread Matt Topol
+1 Tested on Pop!_OS 22.04 with Go 1.19 On Fri, Jun 23, 2023, 4:52 PM Sutou Kouhei wrote: > +1 > > I ran the following on Debian GNU/Linux sid: > > JAVA_HOME=/usr/lib/jvm/default-java \ > dev/release/verify-release-candidate.sh 0.5.1 1 > > with: > > * Python 3.11.4 > * g++ (Debian

Re: [ANNOUNCE] New Arrow PMC member: Dewey Dunnington

2023-06-23 Thread Matt Topol
Congrats Dewey!! On Fri, Jun 23, 2023, 9:35 AM Dane Pitkin wrote: > Congrats Dewey! > > On Fri, Jun 23, 2023 at 9:15 AM Nic Crane wrote: > > > Well-deserved Dewey, congratulations! > > > > On Fri, 23 Jun 2023 at 11:53, Vibhatha Abeykoon > > wrote: > > > > > Congratulations Dewey! > > > > > >

Re: [DISCUSS][Format][Flight] Result set expiration support

2023-06-22 Thread Matt Topol
> That said, I think it's reasonable to only have Cancel at the protocol level. I'd be in favor of only having Cancel too. In theory calling Cancel on something that has already completed should just be equivalent to calling Close anyways rather than requiring a client to guess and call Close if

[DISCUSS] ADBC 0.5.1 patch release?

2023-06-21 Thread Matt Topol
Given the upcoming Snowflake Summit talk on ADBC with the Snowflake driver, and potential deadlock condition addressed by [1], it might make sense for us to do a v0.5.1 patch release of ADBC. Unfortunately I only discovered the issue just as the voting for 0.5.0 closed and the release was

Re: [ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread Matt Topol
Congrats Ben! On Tue, Jun 20, 2023, 11:00 AM Weston Pace wrote: > Congratulations Ben! > > On Tue, Jun 20, 2023 at 7:38 AM Jacob Quinn > wrote: > > > Yay! Congrats Ben! Love to see more Julia folks here! > > > > -Jacob > > > > On Tue, Jun 20, 2023 at 4:15 AM Andrew Lamb > wrote: > > > > > The

Re: [VOTE] Release Apache Arrow ADBC 0.5.0 - RC0

2023-06-19 Thread Matt Topol
+1 Tested on Pop!_OS (Ubuntu 22.04) x86_64 On Mon, Jun 19, 2023, 10:55 AM Jacob Wujciak-Jens wrote: > +1 (nb) with conda on ubuntu > > On Mon, Jun 19, 2023 at 2:18 PM David Li wrote: > > > My vote: +1 (Ubuntu Linux 20.04/x86_64) > > > > On Fri, Jun 16, 2023, at 05:24, Raúl Cumplido wrote: > >

Re: [DISCUSS][Format] Draft implementation of string view array format

2023-06-15 Thread Matt Topol
Based on my understanding, in theory a buffer *could* be shared within a batch since the flatbuffers message just uses an offset and length to identify the buffers. That said, I don't believe any current implementation actually does this or takes advantage of this in any meaningful way. --Matt
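
The point about offsets and lengths can be sketched as follows. This is a toy model, not the flatbuffers metadata itself: `BufferRef` is a hypothetical stand-in for a buffer descriptor holding an offset and a length into a message body, and the 32-byte body is arbitrary.

```python
from typing import NamedTuple


class BufferRef(NamedTuple):
    """Hypothetical stand-in for a buffer descriptor: (offset, length) into the body."""
    offset: int
    length: int


# A 32-byte message body and three descriptors; the first and third
# point at the same region of the body.
body = bytes(range(32))
refs = [BufferRef(0, 16), BufferRef(16, 16), BufferRef(0, 16)]

view = memoryview(body)
buffers = [view[r.offset : r.offset + r.length] for r in refs]

# The descriptors describe 48 bytes in total, but the body holds only 32.
described = sum(r.length for r in refs)
shared = bytes(buffers[0]) == bytes(buffers[2])

print(described, len(body), shared)
```

Because a descriptor is just a position and a size, nothing in this model prevents two of them from overlapping, which matches the "in theory" caveat in the message.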

Re: [ANNOUNCE] New Arrow PMC member: Jie Wen (jakevin / jackwener)

2023-06-12 Thread Matt Topol
Congrats Jie! On Sun, Jun 11, 2023 at 9:20 AM Andrew Lamb wrote: > The Project Management Committee (PMC) for Apache Arrow has invited > Jie Wen to become a PMC member and we are pleased to announce > that Jie Wen has accepted. > > Congratulations and welcome! >

Re: [ANNOUNCE] New Arrow committer: Mehmet Ozan Kabak

2023-06-08 Thread Matt Topol
Congrats! Welcome Ozan! On Thu, Jun 8, 2023 at 8:53 AM Raúl Cumplido wrote: > Congratulations and welcome! > > El jue, 8 jun 2023 a las 14:45, Metehan Yıldırım > () escribió: > > > > Congrats Ozan! > > > > On Thu, Jun 8, 2023 at 1:09 PM Andrew Lamb wrote: > > > > > On behalf of the Arrow PMC,

Re: [VOTE][Format] Add experimental ArrowDeviceArray to C-Data API

2023-06-05 Thread Matt Topol
from > > other stakeholder communities. > > > > On Mon, May 22, 2023 at 12:02 PM Matt Topol > wrote: > > > > > Hello, > > > > > > Now that there's a rough consensus and a toy example POC[1], I would > like > > > to propose a

Re: [VOTE][Format] Add experimental ArrowDeviceArray to C-Data API

2023-05-26 Thread Matt Topol
> > >> > > > >> > Antoine. > > >> > > > >> > > > >> > Le 23/05/2023 à 16:32, Antoine Pitrou a écrit : > > >> > > > > >> > > Depends on what we're voting on? > > >> >

Re: [VOTE][Format] Add experimental ArrowDeviceArray to C-Data API

2023-05-23 Thread Matt Topol
> > > The C declarations seem fine to me (I'm a bit lukewarm on the reserved > > > bits, but I understand the motivation), however I've posted comments as > > > to how to document the interface. The current PR entirely lacks a prose > > > description of the C Devi

[VOTE][Format] Add experimental ArrowDeviceArray to C-Data API

2023-05-22 Thread Matt Topol
Hello, Now that there's a rough consensus and a toy example POC[1], I would like to propose an official enhancement to the Arrow C-Data API specification as described in the PR[2]. The new ArrowDeviceArray/ArrowDeviceArrayStream structs would be considered "experimental" and the documentation

Re: [DISCUSS] Interest in a 12.0.1 patch?

2023-05-18 Thread Matt Topol
I think it's worthwhile enough to justify the work for the patch. If we do end up doing the patch, then we should also include this [1] change for the Go side which, while significant, I didn't believe to be significant enough to warrant a patch on its own. But it is definitely a good idea to

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-05-17 Thread Matt Topol
ntered this problem and have > > > proposed similar workarounds. > > > > > > * The changes to the stream interface are more than just "metadata" > > > > > > I did not look closely enough and realize that these changes are more > > > sub

Re: [Go] Scalar Question

2023-05-11 Thread Matt Topol
I don't know how many people are using the scalar package directly, but I'm definitely open to chatting about refactoring it. On Thu, May 11, 2023, 10:35 AM Yevgeny Pats wrote: > Hi Folks, > > I'm curious if anyone here is using the Go scalar >

Re: [ANNOUNCE] New Arrow committer: Marco Neumann

2023-05-11 Thread Matt Topol
Congrats Marco! On Thu, May 11, 2023 at 9:18 AM Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > Congrats Marco! > > On Thu, 11 May 2023 at 15:05, Weston Pace wrote: > > > > Congratulations! > > > > On Thu, May 11, 2023 at 4:28 AM vin jake wrote: > > > > > Congratulations Marco!

Re: [VOTE] Release Apache Arrow ADBC 0.4.0 - RC0

2023-05-10 Thread Matt Topol
Using a Manjaro Linux image (in honor of the issues we found for Arrow v12 rc) I ran: USE_CONDA=1 ./dev/release/verify-release-candidate.sh 0.4.0 0 My first attempt failed because the default base image doesn't have make and such installed. Should we install that via conda too since we install

[WEBSITE] [DISCUSS] Arrow-Site blog post

2023-04-28 Thread Matt Topol
Hey All, Yevgeny Pats has contributed a blog post to the Arrow Site via PR[1], detailing his company's usage of Arrow for their type system. I've reviewed it and it looks good to me, but as I'm not a PMC member I didn't want to go merging it and having it get published without input from others

Re: [VOTE] Formalize how to change format

2023-04-26 Thread Matt Topol
+1 (Non-binding) On Wed, Apr 26, 2023 at 5:16 AM Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > +1 > > On Wed, 26 Apr 2023 at 04:18, Sutou Kouhei wrote: > > > > Hi, > > > > I've added one more note about documentation: > > > > We must update the corresponding documentation

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-24 Thread Matt Topol
anks, > -- > kou > > In > "Re: [VOTE] Release Apache Arrow 12.0.0 - RC0" on Mon, 24 Apr 2023 > 20:08:59 -0400, > Matt Topol wrote: > > > I was able to replicate the same llvm issue that Jacob saw, does v12 make > > llvm-16 a requirement now? It looks

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-24 Thread Matt Topol
reason that would be the cause of this failure. Though looking through the cmake modules, I don't see why it would be requiring LLVM-16 and discounting 15.0.7, so I'm not sure what's going on yet. I'll try to dig a bit and see if I can come up with something. On Mon, Apr 24, 2023 at 5:27 PM Matt

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-24 Thread Matt Topol
@Jacob I'm currently seeing if I can replicate the Manjaro failure you found via a docker image for Manjaro. I'll report back if I am and what I figure out. On Mon, Apr 24, 2023 at 3:12 PM Raúl Cumplido wrote: > El lun, 24 abr 2023 a las 18:53, Will Jones > () escribió: > > > > I'm seeing

Re: [DISCUSS] Migrate s390x from Travis to ASF Jenkins

2023-04-20 Thread Matt Topol
ourse, a choice, > although I imagine it would be more work/require more input to do so > than to migrate a CI job. > > I use Arrow on s390x, although it's a bit of circular logic because > I'm using it to make sure nanoarrow works on big endian. > > On Thu, Apr 20, 2023 at 4:07 

Re: [DISCUSS] Migrate s390x from Travis to ASF Jenkins

2023-04-20 Thread Matt Topol
I just wanted to add on that there was a Go on s390x job too that needs to get migrated and wasn't on the list in Raul's original email. On Thu, Apr 20, 2023 at 2:42 PM Benson Muite wrote: > Might also consider testing farm for Centos Stream, Fedora and/or RHEL > builds[1][2]. > > 1)

Re: [ANNOUNCE] New Arrow committer: Ruihang Xia

2023-04-11 Thread Matt Topol
Congrats!! Welcome! On Tue, Apr 11, 2023, 11:29 PM Jacob Wujciak wrote: > Congratulations and welcome! > > On Mon, Apr 10, 2023 at 8:13 AM Wang Xudong > wrote: > > > Congratulations! > > > > Yang Jiang 于2023年4月10日周一 13:37写道: > > > > > > > > Congratulations !!! > > > > > > On 2023/04/09

Re: [CROWDSOURCING] Apache Arrow Board Report - April 12, 2023

2023-04-11 Thread Matt Topol
My apologies, I forgot to add updates for the Go section previously; I've now added the Go updates to the Google doc. On Tue, Apr 11, 2023 at 9:29 AM Andrew Lamb wrote: > As a reminder, I will submit the ASF board report [1] tomorrow summarizing > the state of the project. Thank you to

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Matt Topol
rrow? > * instead of just repeating/vendoring the enum can we simply refer to it > and treat this as an opaque integer?) > 2. Providing an example of how you can tag arrays with metadata > > > > On Mon, Apr 10, 2023 at 9:30 AM Matt Topol wrote: > > > > The ArrowArray str
