[VOTE] Release Apache Arrow 8.0.0 - RC1

2022-04-27 Thread Krisztián Szűcs
Hi, I would like to propose the following release candidate (RC1) of Apache Arrow version 8.0.0. This is a release consisting of 581 resolved JIRA issues[1]. This release candidate is based on commit: 13625026ed4f1c6f4bbb136a0015580d2bf51506 [2] The source release rc1 is hosted at [3]. The binar

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread L. C. Hsieh
Thanks all! Looking forward to working with you on the project! On Wed, Apr 27, 2022 at 9:26 PM Gidon Gershinsky wrote: > > Congrats Liang-Chi! > > Cheers, Gidon > > > On Thu, Apr 28, 2022 at 4:17 AM Yang hao <1371656737...@gmail.com> wrote: > > > Congratulations Liang-Chi! > > > > From: Weston P

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread Gidon Gershinsky
Congrats Liang-Chi! Cheers, Gidon On Thu, Apr 28, 2022 at 4:17 AM Yang hao <1371656737...@gmail.com> wrote: > Congratulations Liang-Chi! > > From: Weston Pace > Date: Thursday, April 28, 2022 at 05:19 > To: dev@arrow.apache.org > Subject: Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh >

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread Yang hao
Congratulations Liang-Chi! From: Weston Pace Date: Thursday, April 28, 2022 at 05:19 To: dev@arrow.apache.org Subject: Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh Congratulations Liang-Chi! On Wed, Apr 27, 2022 at 9:54 AM Chao Sun wrote: > > Congrats Liang-Chi! well deserved! > > On We

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread Weston Pace
Congratulations Liang-Chi! On Wed, Apr 27, 2022 at 9:54 AM Chao Sun wrote: > > Congrats Liang-Chi! well deserved! > > On Wed, Apr 27, 2022 at 12:49 PM L. C. Hsieh wrote: > > > > Thank you, Andrew and Bryan, > > I'm pleased to become an Arrow committer. Looking forward to > > contributing more on

Re: [Compute][C++] Question on compute scheduler

2022-04-27 Thread Li Jin
I see. Yeah, spill to disk seems to be a reasonable approach. Hard back pressure does seem like it can lead to deadlocks. On Wed, Apr 27, 2022 at 4:55 PM Weston Pace wrote: > Our backpressure is best-effort. A push downstream will never > fail/block. Eventually, when sinks (or pipeline breakers)

Re: [Compute][C++] Question on compute scheduler

2022-04-27 Thread Weston Pace
Our backpressure is best-effort. A push downstream will never fail/block. Eventually, when sinks (or pipeline breakers) start to fill up, a pause message is sent to the source nodes. However, anything in progress will continue and should not be prevented from completing and pushing results upwards.
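The best-effort behaviour described above can be illustrated with a small, self-contained simulation. This is only a sketch and not Arrow's C++ implementation: the `Source` and `Sink` classes, the `high_mark` / `low_mark` thresholds, and the tick-based loop are all hypothetical names invented for illustration. It shows that a push is never refused, that the pause signal only stops new work from starting, and that the buffer can therefore briefly exceed the threshold.

```python
from collections import deque


class Source:
    """Hypothetical source node: only *starting* new work honours the pause flag."""

    def __init__(self, total):
        self.total = total
        self.next_id = 0
        self.paused = False

    def start_batch(self):
        # Return a new batch id, or None if paused or exhausted.
        if self.paused or self.next_id >= self.total:
            return None
        batch = self.next_id
        self.next_id += 1
        return batch


class Sink:
    """Hypothetical sink: accepts every push, signals pause/resume by buffer size."""

    def __init__(self, source, high_mark, low_mark):
        self.source = source
        self.high_mark = high_mark
        self.low_mark = low_mark
        self.buffer = deque()
        self.delivered = []
        self.peak = 0
        self.pauses = 0

    def push(self, batch):
        # Best-effort: the push itself never fails or blocks.
        self.buffer.append(batch)
        self.peak = max(self.peak, len(self.buffer))
        if len(self.buffer) >= self.high_mark and not self.source.paused:
            self.source.paused = True  # stands in for the "pause message"
            self.pauses += 1

    def drain_one(self):
        if self.buffer:
            self.delivered.append(self.buffer.popleft())
        if len(self.buffer) <= self.low_mark:
            self.source.paused = False  # stands in for a "resume message"


def simulate(total=10, in_flight=3, high_mark=4, low_mark=1):
    source = Source(total)
    sink = Sink(source, high_mark, low_mark)
    while source.next_id < total or sink.buffer:
        # Several batches are started before any of them is pushed ...
        started = []
        for _ in range(in_flight):
            batch = source.start_batch()
            if batch is None:
                break
            started.append(batch)
        # ... so all of them complete and push, even if a pause arrives midway.
        for batch in started:
            sink.push(batch)
        sink.drain_one()
    return {"delivered": sink.delivered, "peak": sink.peak, "pauses": sink.pauses}


if __name__ == "__main__":
    print(simulate())
```

In this sketch the peak buffer size can exceed `high_mark` by up to the number of batches already in flight, which is the trade-off of never blocking a push.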

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread Chao Sun
Congrats Liang-Chi! well deserved! On Wed, Apr 27, 2022 at 12:49 PM L. C. Hsieh wrote: > > Thank you, Andrew and Bryan, > I'm pleased to become an Arrow committer. Looking forward to > contributing more on Apache Arrow! > > On Wed, Apr 27, 2022 at 12:34 PM Bryan Cutler wrote: > > > > Congratulat

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread L. C. Hsieh
Thank you, Andrew and Bryan, I'm pleased to become an Arrow committer. Looking forward to contributing more on Apache Arrow! On Wed, Apr 27, 2022 at 12:34 PM Bryan Cutler wrote: > > Congratulations!! That's great news and really glad to have you on the > project! > > On Wed, Apr 27, 2022, 11:44 A

Re: [ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread Bryan Cutler
Congratulations!! That's great news and really glad to have you on the project! On Wed, Apr 27, 2022, 11:44 AM Andrew Lamb wrote: > On behalf of the Arrow PMC, I'm happy to announce that Liang-Chi Hsieh > has accepted an invitation to become a committer on Apache > Arrow. Welcome, and thank you

[ANNOUNCE] New Arrow committer: Liang-Chi Hsieh

2022-04-27 Thread Andrew Lamb
On behalf of the Arrow PMC, I'm happy to announce that Liang-Chi Hsieh has accepted an invitation to become a committer on Apache Arrow. Welcome, and thank you for your contributions! Andrew

Re: [Compute][C++] Question on compute scheduler

2022-04-27 Thread Li Jin
Thanks both! The ExecPlan Sequencing doc is interesting and close to the problem that we are trying to solve. (Ordered progressing) One thought is that I can see some cases for deadlock if we are not careful, for example (Filter Node -> Asof Join Node, assuming Asof Join node requires ordered inpu

Re: Arrow, Flight, Streaming, and Watermarking

2022-04-27 Thread David Li
Hey Matt, For Flight: for DoGet/DoPut/DoExchange, you can accomplish with the app_metadata fields built in to these methods. For instance, in DoGet/DoExchange, you could send some batches of data, then send a message with only an app_metadata field encoding the watermark. (The app_metadata fiel
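The pattern suggested above, sending data batches and then a message that carries only an app_metadata field encoding the watermark, can be sketched in plain Python. This deliberately does not use the Flight API: the `Message` class, the JSON encoding with a `"watermark"` key, and the rule "watermark = largest event time seen so far" are all assumptions made for illustration, and a real stream would carry Arrow record batches rather than lists of tuples.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[int, str]  # (event_time, value); hypothetical row shape


@dataclass
class Message:
    """Stand-in for one stream message: data, metadata, or both."""

    rows: Optional[List[Row]] = None
    app_metadata: Optional[bytes] = None


def encode_watermark(ts: int) -> bytes:
    # Assumed encoding; app_metadata is opaque bytes, so any format could be used.
    return json.dumps({"watermark": ts}).encode("utf-8")


def produce(batches: Iterable[List[Row]]) -> Iterator[Message]:
    """Yield each data batch, followed by a metadata-only watermark message."""
    max_seen: Optional[int] = None
    for rows in batches:
        yield Message(rows=rows)
        for event_time, _ in rows:
            max_seen = event_time if max_seen is None else max(max_seen, event_time)
        if max_seen is not None:
            # Assumes in-order data, so the largest time seen is a valid watermark.
            yield Message(app_metadata=encode_watermark(max_seen))


def consume(stream: Iterable[Message]) -> Tuple[List[Row], Optional[int]]:
    """Collect rows and track the most recent watermark seen in the stream."""
    rows: List[Row] = []
    watermark: Optional[int] = None
    for msg in stream:
        if msg.rows is not None:
            rows.extend(msg.rows)
        if msg.app_metadata is not None:
            watermark = json.loads(msg.app_metadata)["watermark"]
    return rows, watermark


if __name__ == "__main__":
    batches = [[(1, "a"), (2, "b")], [(5, "c")]]
    print(consume(produce(batches)))
```

Keeping the watermark in a metadata-only message means the data schema does not need an extra column, at the cost of both sides agreeing on the metadata encoding out of band.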

Arrow, Flight, Streaming, and Watermarking

2022-04-27 Thread Matt Rudary
Hi, We're looking at using Arrow as part of our solution to ship tabular data between different streaming systems, potentially implemented using different technologies, like Spark, Beam, Flink, etc. Some of these systems contain "watermarks" as a key concept. Briefly, a watermark is a promise t

Re: Arrow sync call April 27 at 12:00 US/Eastern, 16:00 UTC

2022-04-27 Thread Benson Muite
Attendees: Ian Joiner Matthew Topol Benson Muite Discussion points: 1) New book on Arrow - covers C++, Python and Go, out in June 2) Building ORC bindings in R would be useful, extensions to parallel R? 3) Comparing ORC and Parquet for IO 4) IO optimization vs SIMD optimization - Parquet seems we

Re: Flight/FlightSQL Optimization for Small Results?

2022-04-27 Thread Micah Kornfield
Yes, next step is implementation which I've been delayed on. I hope to have a little time this week to work on it and will post an update. On Wed, Apr 27, 2022 at 7:48 AM David Li wrote: > Following up here - what are the next steps? The RFC PR looks fairly > complete, maybe we can help build o

Re: Flight/FlightSQL Optimization for Small Results?

2022-04-27 Thread David Li
Following up here - what are the next steps? The RFC PR looks fairly complete, maybe we can help build out implementations in C++/Java/other languages in preparation for a vote? On Wed, Mar 9, 2022, at 00:23, Micah Kornfield wrote: >> >> The operation flow would be like this, or what would it lo

Re: [VOTE] Release Apache Arrow 8.0.0 - RC0

2022-04-27 Thread Krisztián Szűcs
On Wed, Apr 27, 2022 at 5:03 AM Sutou Kouhei wrote: > > -1 > > There are some problems for RPM package and C GLib. I've > fixed them: > > * https://github.com/apache/arrow/pull/13002 > * https://github.com/apache/arrow/pull/13006 Thanks Kou! I'm going to cut another RC now. > > I'm still veri

Re: Arrow sync call April 27 at 12:00 US/Eastern, 16:00 UTC

2022-04-27 Thread Ian Cook
Thanks Benson! The Zoom meeting URL for this and other biweekly Arrow sync calls is: https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09 Alternatively, enter this information into the Zoom website or app to join the call: Meeting ID: 876 4903 3008 Passcode: 958092 The Zoom meeting

Re: Arrow sync call April 27 at 12:00 US/Eastern, 16:00 UTC

2022-04-27 Thread David Li
Thanks Benson. If you are able to take notes this week that would be much appreciated. And thanks Joris for the clarification. On Wed, Apr 27, 2022, at 09:34, Joris Van den Bossche wrote: > As a small clarification: the zoom meeting link itself should still work > for anyone to join, it's only t

Re: Arrow sync call April 27 at 12:00 US/Eastern, 16:00 UTC

2022-04-27 Thread Joris Van den Bossche
As a small clarification: the zoom meeting link itself should still work for anyone to join, it's only that there is no one from Voltron Data to lead the meeting / take notes (so I also won't be present today). Joris On Wed, 27 Apr 2022 at 13:05, Benson Muite wrote: > Hi, > > Can host if required, t

Re: Arrow sync call April 13 at 12:00 US/Eastern, 16:00 UTC

2022-04-27 Thread Benson Muite
On 4/25/22 2:49 PM, David Li wrote: Following up here: N.B. The Voltron Data folks have a scheduling conflict on 4/27 and will not be able to host the fortnightly sync call. Is anyone available to run the meeting that day? Is anyone available to run the sync call this Wednesday? On Wed, Ap

Re: Arrow sync call April 27 at 12:00 US/Eastern, 16:00 UTC

2022-04-27 Thread Benson Muite
Hi, Can host if required, though the timing is not ideal for me. It may be helpful to vary the timing in future. Benson On 4/25/22 2:49 PM, David Li wrote: Following up here: N.B. The Voltron Data folks have a scheduling conflict on 4/27 and will not be able to host the fortnightly sync c