[DataFusion] Projection pushdown and pushed down filters

2023-04-11 Thread Markus Appel
Hello, I hope this is the right place to ask this. While working on a project based on arrow-datafusion, I came across some weird behavior where a projection did not get eliminated as expected, thus breaking a custom optimizer rule's assumption (into which I won't go further, as it's not

Re: [ANNOUNCE] New Arrow committer: Ruihang Xia

2023-04-11 Thread Matt Topol
Congrats!! Welcome! On Tue, Apr 11, 2023, 11:29 PM Jacob Wujciak wrote: > Congratulations and welcome! > > On Mon, Apr 10, 2023 at 8:13 AM Wang Xudong > wrote: > > > Congratulations! > > > > Yang Jiang wrote on Mon, Apr 10, 2023 at 13:37: > > > > > > > > Congratulations !!! > > > > > > On 2023/04/09

Re: [ANNOUNCE] New Arrow committer: Ruihang Xia

2023-04-11 Thread Jacob Wujciak
Congratulations and welcome! On Mon, Apr 10, 2023 at 8:13 AM Wang Xudong wrote: > Congratulations! > > Yang Jiang wrote on Mon, Apr 10, 2023 at 13:37: > > > > > Congratulations !!! > > > > On 2023/04/09 11:25:19 Andrew Lamb wrote: > > > On behalf of the Arrow PMC, I'm happy to announce that Ruihang Xia > >

Re: [VOTE][Julia] Release Apache Arrow Julia 2.5.1 RC1

2023-04-11 Thread Jacob Quinn
Hmmm, I'm also on macOS M1, but didn't have any issues running tests. David, is the error reproducible? We fixed an issue for this in [this commit](https://github.com/apache/arrow-julia/commit/6d0ac4946f062414e2b60aa3d67c2875bb2e7958), but it's possible that our check for this condition wasn't

Re: [VOTE][Julia] Release Apache Arrow Julia 2.5.1 RC1

2023-04-11 Thread David Li
I had an issue during verification (macOS/AArch64) [1] The gist seems to be: ``` nested task error: ArgumentError: unsafe_wrap: pointer 0x293389438 is not properly aligned to 16 bytes Stacktrace: [1] #unsafe_wrap#100 @ ./pointer.jl:92 [inlined] [2]

[VOTE][Julia] Release Apache Arrow Julia 2.5.1 RC1

2023-04-11 Thread Sutou Kouhei
Hi, I would like to propose the following release candidate (RC1) of Apache Arrow Julia version 2.5.1. This release candidate is based on commit: 22088f1cb59bcd99fbffbf9d8248e491690dbfd9 [1] The source release rc1 is hosted at [2]. Please download, verify checksums and signatures, run the unit

Arrow community meeting April 12 at 16:00 UTC

2023-04-11 Thread Ian Cook
Hi all, Our biweekly Arrow community meeting is tomorrow at 16:00 UTC / 12:00 EDT. Zoom meeting URL: https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09 Meeting ID: 876 4903 3008 Passcode: 958092 The notes for this and future instances of this meeting will be captured in this

Re: [CROWDSOURCING] Apache Arrow Board Report - April 12, 2023

2023-04-11 Thread Matt Topol
My apologies, I forgot to add updates for the Go section previously; I've now added the Go updates to the Google doc. On Tue, Apr 11, 2023 at 9:29 AM Andrew Lamb wrote: > As a reminder, I will submit the ASF board report [1] tomorrow summarizing > the state of the project. Thank you to

Re: [VOTE][RUST][DataFusion] Release DataFusion Python Bindings 22.0.0 RC1

2023-04-11 Thread Jeremy Dyer
+1 (non-binding) Ran through verification script. Built conda packages manually and validated. Also included in 3rd party library and validated in working order. Thanks Andy! On Tue, Apr 11, 2023 at 9:39 AM Andrew Lamb wrote: > +1 > > Verified on x86 mac > > Thanks Andy > > Andrew > > On Mon,

Re: OpenTelemetry + Arrow

2023-04-11 Thread Laurent Quérel
Thank you very much Andrew. I should be able to work on the second article next week and I will follow the same process. Cheers, Laurent On Tue, Apr 11, 2023 at 4:31 AM Andrew Lamb wrote: > The blog post is now live on the arrow site [1] > > Thanks again Laurent > > [1]: > >

Re: [VOTE][RUST][DataFusion] Release DataFusion Python Bindings 22.0.0 RC1

2023-04-11 Thread Andrew Lamb
+1 Verified on x86 mac Thanks Andy Andrew On Mon, Apr 10, 2023 at 8:10 PM L. C. Hsieh wrote: > +1 (binding) > > Verified on Intel Mac. > > Thanks Andy. > > On Mon, Apr 10, 2023 at 4:47 PM Andy Grove wrote: > > > > Hi, > > > > I would like to propose a release of Apache Arrow DataFusion

Re: [CROWDSOURCING] Apache Arrow Board Report - April 12, 2023

2023-04-11 Thread Andrew Lamb
As a reminder, I will submit the ASF board report [1] tomorrow summarizing the state of the project. Thank you to everyone who has contributed content already. I encourage everyone who is interested in the goings on with Arrow to check it out -- there is lots going on in this project. Andrew

Re: Best practice on populating from VectorSchemaRoot to VectorSchemaRoot, ArrowStreamReader to ArrowStreamWriter

2023-04-11 Thread David Dali Susanibar Arce
Hi Wenbo Hu, Sorry to join late. Wenbo, what about the proposal mentioned in the Java Flight Cookbook (1)? The acceptPut method will be the upstream side, where a VectorUnloader is needed, and the getStream method will be the downstream side, where a VectorLoader is needed. Initially this cookbook uses ArrowRecordBatch.

Re: OpenTelemetry + Arrow

2023-04-11 Thread Andrew Lamb
The blog post is now live on the arrow site [1] Thanks again Laurent [1]: https://arrow.apache.org/blog/2023/04/11/our-journey-at-f5-with-apache-arrow-part-1/ On Sun, Apr 2, 2023 at 9:07 PM Laurent Quérel wrote: > Hi Andrew, > > The feedback seems to be good so I created a PR. > >

Re: [DISCUSS] Acero roadmap / philosophy

2023-04-11 Thread Weston Pace
Yes, you could use Acero for this. However, I would hope that someday you could also use DuckDB and DataFusion to do the combining as well. In my mind an "engine" is something that takes a plan (Substrait) and zero or more input streams (Arrow C stream interface[1]) and has one output stream