Re: Question about `minibatch`

2023-06-20 Thread Ruoxi Sun
Thanks Weston, that makes a lot of sense. Please let me rephrase to make sure I get this right. So the main purpose of minibatch is actually about keeping the working set within L1 (in addition to the side benefit of more chances to shortcut). This requires splitting the input batch into

Re: Question about `minibatch`

2023-06-20 Thread Weston Pace
Those goals are somewhat compatible. Sasha can probably correct me if I get this wrong but my understanding is that the minibatch is just large enough to ensure reliable vectorized execution. It is used in some innermost critical sections to both keep the working set small (fit in L1) and
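
To illustrate the idea discussed in this thread, here is a minimal sketch of minibatch-style processing. It is not Acero's code: the names `MINIBATCH_SIZE`, `minibatch_ranges`, and `hash_batch`, the size 1024, and the multiplicative hash are all assumptions made for illustration. The point it shows is that a fixed-size temporary buffer is allocated once and reused for every slice of the input batch, so the working set stays small regardless of the batch length.

```python
from typing import Iterator, List, Tuple

# Hypothetical minibatch length; the real value used by Acero is not stated in the thread.
MINIBATCH_SIZE = 1024


def minibatch_ranges(num_rows: int,
                     minibatch_size: int = MINIBATCH_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield (start, length) pairs that cover num_rows in fixed-size slices."""
    if minibatch_size <= 0:
        raise ValueError("minibatch_size must be positive")
    start = 0
    while start < num_rows:
        length = min(minibatch_size, num_rows - start)
        yield start, length
        start += length


def hash_batch(keys: List[int],
               minibatch_size: int = MINIBATCH_SIZE) -> List[int]:
    """Hash every key while touching only one small temp buffer at a time."""
    # Allocated once and reused for each minibatch, so there is no
    # per-batch allocation and the scratch space has a fixed size.
    temp = [0] * minibatch_size
    out: List[int] = []
    for start, length in minibatch_ranges(len(keys), minibatch_size):
        for i in range(length):
            # Illustrative 32-bit multiplicative hash (an assumption, not Acero's hash).
            temp[i] = (keys[start + i] * 2654435761) & 0xFFFFFFFF
        out.extend(temp[:length])
    return out
```

With these assumed names, `list(minibatch_ranges(2500))` splits a 2500-row batch into two full slices of 1024 rows and a final slice of 452 rows.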

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Sutou Kouhei
Hi, I think that you needed to specify "-DCMAKE_INSTALL_RPATH=${CONDA_PREFIX}/lib" when you build Apache Arrow C++. (Or "LD_LIBRARY_PATH=${CONDA_PREFIX}/lib dev/release/verify-release-candidate.sh ..." may work.) Thanks, -- kou In <8bfb0384-46f0-07f7-a510-2f2eb3134...@python.org> "Re:

Question about `minibatch`

2023-06-20 Thread Ruoxi Sun
Hi, Looking at the Acero code, I'm curious about the concept of `minibatch` used in the swiss join and grouper. I wonder if its purpose is to proactively limit the memory size of the working set? Or is it a consequence of the temp vector needing to be fixed-size (to avoid costly memory

Re: [DISCUSS][C++] Can we require CMake 3.16+ since 13.0.0?

2023-06-20 Thread Sutou Kouhei
Hi, > FYI3: We'll support Amazon Linux 2023 in Apache Arrow C++ > 13.0.0: > > https://github.com/apache/arrow/pull/36081 Merged. It seems that there is no objection to this. I'll proceed with this next week. Thanks, -- kou In <20230616.061904.689752341473121297@clear-code.com>

Re: [DISCUSS][Format][Flight] Result set expiration support

2023-06-20 Thread Sutou Kouhei
Hi, David provided the Java implementation. Thanks! If anyone has any comments about this proposal, please share them. Thanks, -- kou In <20230619.151511.1159782462289578136@clear-code.com> "[DISCUSS][Format][Flight] Result set expiration support" on Mon, 19 Jun 2023 15:15:11 +0900

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Dane Pitkin
+1 (non-binding) Verified on macOS (M1) using conda. A couple of nuances: * Had to uninstall gnupg in conda and used brew's gnupg instead (same issue Will found). * I initially encountered some intermittent CMake build timeouts with gtest, but haven't been able to reproduce. On Tue, Jun 20,

Re: [DISCUSS][Format] Draft implementation of string view array format

2023-06-20 Thread Weston Pace
Before I say anything else I'll say that I am in favor of this new layout. There is some existing literature on the idea (e.g. umbra) and your benchmarks show some nice improvements. Compared to some of the other layouts we've discussed recently (REE, list view) I do think this layout is more

Re: [ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread Matt Topol
Congrats Ben! On Tue, Jun 20, 2023, 11:00 AM Weston Pace wrote: > Congratulations Ben! > > On Tue, Jun 20, 2023 at 7:38 AM Jacob Quinn > wrote: > > > Yay! Congrats Ben! Love to see more Julia folks here! > > > > -Jacob > > > > On Tue, Jun 20, 2023 at 4:15 AM Andrew Lamb > wrote: > > > > > The

Re: [ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread Weston Pace
Congratulations Ben! On Tue, Jun 20, 2023 at 7:38 AM Jacob Quinn wrote: > Yay! Congrats Ben! Love to see more Julia folks here! > > -Jacob > > On Tue, Jun 20, 2023 at 4:15 AM Andrew Lamb wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Ben Baumgold, to

Re: [ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread Jacob Quinn
Yay! Congrats Ben! Love to see more Julia folks here! -Jacob On Tue, Jun 20, 2023 at 4:15 AM Andrew Lamb wrote: > The Project Management Committee (PMC) for Apache Arrow has invited > Ben Baumgold, to become a PMC member and we are pleased to announce > that Ben Baumgold has accepted. > >

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Antoine Pitrou
I don't have much time to investigate and I don't think it's a blocker either way. Perhaps there's room for improvement on the Arrow C++ side as well... Le 20/06/2023 à 15:40, Dewey Dunnington a écrit : Thanks for verifying! I don't *think* there is anything non-standard about the

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Dewey Dunnington
Thanks for verifying! I don't *think* there is anything non-standard about the `find_package(Arrow)` / `target_link_libraries(..., arrow_shared)` sequence used to link the tests (although clearly they aren't working as intended!). You can pass extra arguments to CMake to help it find the right

[RESULT][VOTE][RUST] Release Apache Arrow Rust 42.0.0 RC1

2023-06-20 Thread Andrew Lamb
With 5 +1 votes (4 binding) the release is approved! The release is available here: https://dist.apache.org/repos/dist/release/arrow/arrow-rs-42.0.0 As well as crates.io: https://crates.io/crates/arrow/42.0.0 (and similar) Thanks to everyone who contributed and voted on this release. Andrew

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Antoine Pitrou
Ok, now running from the right repo :-), I get linker errors against Arrow C++ dependencies: [ 44%] Linking CXX executable utils_test /home/antoine/mambaforge/envs/pyarrow/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: warning: libcrypto.so.3, needed

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Antoine Pitrou
Ouch, please disregard this message, I was running the script from the wrong repo :-( Le 20/06/2023 à 14:24, Antoine Pitrou a écrit : Hello, I tried to run the verification script and got the following error: https://gist.github.com/pitrou/b2c77f3d7836d92cb6d589c735f98d5d """ gpg: Total

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Antoine Pitrou
Hello, I tried to run the verification script and got the following error: https://gist.github.com/pitrou/b2c77f3d7836d92cb6d589c735f98d5d """ gpg: Total number processed: 18 gpg: unchanged: 18 curl: (22) The requested URL returned error: 404 Failed to verify release candidate.

[ANNOUNCE] Apache Arrow ADBC 0.5.0 released

2023-06-20 Thread David Li
The Apache Arrow community is pleased to announce the 0.5.0 release of the Apache Arrow ADBC libraries. It includes 37 resolved GitHub issues ([1]). The release is available now from [2] and [3]. Release notes are available at:

Re: [ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread Alenka Frim
Congratulations Ben! On Tue, Jun 20, 2023 at 1:54 PM David Li wrote: > Welcome Ben! > > On Tue, Jun 20, 2023, at 06:14, Andrew Lamb wrote: > > The Project Management Committee (PMC) for Apache Arrow has invited > > Ben Baumgold, to become a PMC member and we are pleased to announce > > that Ben

Re: [VOTE] Release Apache Arrow ADBC 0.5.0 - RC0

2023-06-20 Thread David Li
The vote passes with 7 +1 votes (3 binding, 4 non-binding). Thanks all! Post-release tasks: [x] Close the GitHub milestone/project [ ] Add the new release to the Apache Reporter System [ ] Upload source release artifacts to Subversion [ ] Create the final GitHub release [ ] Update website [ ]

Re: [ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread David Li
Welcome Ben! On Tue, Jun 20, 2023, at 06:14, Andrew Lamb wrote: > The Project Management Committee (PMC) for Apache Arrow has invited > Ben Baumgold, to become a PMC member and we are pleased to announce > that Ben Baumgold has accepted. > > Congratulations and welcome!

Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Raúl Cumplido
+1 (non-binding) I've run: ./verify-release-candidate.sh 0.2.0 1 on Ubuntu 22.04 with conda: * arrow-cpp 12.0.0 * gcc (conda-forge gcc 11.4.0-0) 11.4.0 * r-base 4.2.3 Thanks, Raúl El mar, 20 jun 2023 a las 1:55, Sutou Kouhei () escribió: > > +1 > > I ran the following command line on Debian

[ANNOUNCE] New Arrow PMC member: Ben Baumgold,

2023-06-20 Thread Andrew Lamb
The Project Management Committee (PMC) for Apache Arrow has invited Ben Baumgold, to become a PMC member and we are pleased to announce that Ben Baumgold has accepted. Congratulations and welcome!