some questions, please help

2019-10-30 Thread Yibo Cai
Hi, I'm new to Arrow and would like to ask for help with some questions. Any comments are welcome. - About the source code tree, my understanding is that "cpp" is the core Arrow libraries, and "c_glib, go, python, ..." are language bindings that ease integrating Arrow into apps developed in those languages.

Re: some questions, please help

2019-11-07 Thread Yibo Cai
Hi Wes, On 10/30/19 10:24 PM, Wes McKinney wrote: hi Yibo On Wed, Oct 30, 2019 at 2:16 AM Yibo Cai wrote: Hi, I'm new to Arrow and would like to ask for help with some questions. Any comments are welcome. - About the source code tree, my understanding is that "cpp" is the core arrow

Re: questions about Gandiva

2019-10-31 Thread Yibo Cai
Thanks Wes. Arrow is a very exciting project. I'm from Arm. We are interested in Arrow and would like to study and help improve it. Yibo On 11/1/19 1:25 AM, Wes McKinney wrote: hi On Thu, Oct 31, 2019 at 12:11 AM Yibo Cai wrote: Hi, Arrow cpp integrates Gandiva to provide low level

Re: some questions, please help

2019-10-30 Thread Yibo Cai
it will likely be important to have explicit dynamic/runtime SIMD dispatching on certain hot paths as we build binaries that need to be able to run on both newer and older CPUs On Wed, Oct 30, 2019 at 7:25 AM Wes McKinney wrote: hi Yibo On Wed, Oct 30, 2019 at 2:16 AM Yibo Cai wrote: Hi, I'm new

questions about Gandiva

2019-10-30 Thread Yibo Cai
Hi, Arrow cpp integrates Gandiva to provide low level operations on arrow buffers. [1][2] I have some questions, any help is appreciated: - Arrow cpp already has a compute kernel[3], does it duplicate what Gandiva provides? I see a Jira talk about it.[4] - Is Gandiva only for arrow cpp? What

[Gandiva] How to optimize per CPU feature

2019-12-13 Thread Yibo Cai
Hi, Thanks to pravindra's patch [1], Gandiva loop vectorization is okay now. Will Gandiva detect CPU features at runtime? My test CPU supports SSE through AVX2, but I only see "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" in the IR, and the final code doesn't use registers wider than 128 bits. [1]

[Gandiva] question about IR optimization

2019-12-11 Thread Yibo Cai
Hi, I'm trying to figure out how Gandiva works by tracing unit test TestSimpleArichmetic[1]. I ran into a problem with the Gandiva IR generator and optimizer, and would like to ask the community for help. I'm focusing on the case "b+1", which adds 1 to each element of an int32 vector. I see there's a

Re: [Gandiva] How to optimize per CPU feature

2019-12-15 Thread Yibo Cai
On 12/13/19 7:45 PM, Ravindra Pindikura wrote: On Fri, Dec 13, 2019 at 3:41 PM Yibo Cai wrote: Hi, Thanks to pravindra's patch [1], Gandiva loop vectorization is okay now. Will Gandiva detect CPU features at runtime? My test CPU supports SSE through AVX2, but I only see "target-features"

[C++][Compute] RFC: add SIMD support to C++ kernel

2019-12-19 Thread Yibo Cai
Hi, I'm investigating SIMD support to C++ compute kernel(not gandiva). A typical case is the sum kernel[1]. Below tight loop can be easily optimized with SIMD. for (int64_t i = 0; i < length; i++) { local.sum += values[i]; } Compiler already does loop vectorization. But it's done at

[C++]: cmake: about parallel build of third party modules

2020-01-01 Thread Yibo Cai
I noticed a fresh build always gets stuck compiling protobuf for a long time. We've decided to use single-job builds for each third party module [1], partly because different third party modules are built concurrently (protobuf is built concurrently with jemalloc, but protobuf itself is built

Re: [C++] Runtime SIMD dispatching for Arrow

2020-05-12 Thread Yibo Cai
Thanks Wes, I'm glad to see this feature coming. From past discussions, the main concern is that a runtime dispatcher may cause performance issues. Personally, I don't think it's a big problem. If we're using SIMD, it must be targeting some time-consuming code. But we do need to take care of some issues.

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2020-03-19 Thread Yibo Cai
Hi, I would recommend against reinventing the wheel. It would be possible to reuse an existing C++ SIMD library. There are several of them (Vc, xsimd, libsimdpp...). Of course, "just use Gandiva" is another possible

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2020-03-19 Thread Yibo Cai
On Thu, Mar 19, 2020 at 9:57 PM Yibo Cai wrote: I'm revisiting this old thread as I see some avx512 code merged recently[1]. Code maintenance will be non-trivial if we want to cover more hardware(sse/avx/avx512/neon/sve/...) and optimize more code in the future. #ifdef is obviously no-go. So I'm

[C++][Compute] question about aggregate kernels

2020-09-16 Thread Yibo Cai
Hi, I have a question about aggregate kernel implementation. Any help is appreciated. Aggregate kernel implements "consume" and "merge" interfaces. For a chunked array, "consume" is called for each array to get a temporary aggregated result, then "merge" it with previously consumed result.
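The "consume" then "merge" flow described in this message can be sketched as follows. All names here (`SumCountState`, `Consume`, `Merge`, `Aggregate`) are hypothetical and chosen for illustration; they are not Arrow's actual kernel interfaces, and a chunked array is modeled as a plain vector of vectors.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical aggregate state for a sum/count aggregation.
struct SumCountState {
  int64_t sum = 0;
  int64_t count = 0;

  // "consume": aggregate one array (chunk) into this state.
  void Consume(const std::vector<int64_t>& chunk) {
    for (int64_t v : chunk) {
      sum += v;
      ++count;
    }
  }

  // "merge": fold another partial result into this one.
  void Merge(const SumCountState& other) {
    sum += other.sum;
    count += other.count;
  }
};

// Aggregate a chunked array: consume each chunk into a temporary state,
// then merge that state into the previously accumulated result.
inline SumCountState Aggregate(const std::vector<std::vector<int64_t>>& chunks) {
  SumCountState total;
  for (const auto& chunk : chunks) {
    SumCountState partial;
    partial.Consume(chunk);
    total.Merge(partial);
  }
  return total;
}
```

This shape works for aggregations whose partial results combine cheaply, such as sum, count, or min/max; aggregations that need all inputs at once do not fit it as directly.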

Re: [C++][Compute] question about aggregate kernels

2020-09-17 Thread Yibo Cai
))/(count(x)-1))) (loosely translated from https://math.stackexchange.com/questions/102978/incremental-computation-of-standard-deviation ) On Wed, Sep 16, 2020 at 6:12 AM Yibo Cai wrote: Hi, I have a question about aggregate kernel implementation. Any help is appreciated. Aggregate kernel

Re: [C++][Compute] question about aggregate kernels

2020-09-17 Thread Yibo Cai
requires `update` and `merge` On Wed, Sep 16, 2020 at 12:12 PM Yibo Cai wrote: Hi, I have a question about aggregate kernel implementation. Any help is appreciated. Aggregate kernel implements "consume" and "merge" interfaces. For a chunked array, "consume" i

Re: [C++][Compute] question about aggregate kernels

2020-09-21 Thread Yibo Cai
for_calculating_variance#Computing_shifted_data [2] https://github.com/tdunning/t-digest On Thu, Sep 17, 2020 at 8:17 PM Yibo Cai wrote: Thanks Andrew. The link gives a cool method to calculate variance incrementally. I think the problem is that it's computationally too expensive (cannot leverage vector

Re: Flight benchmark question

2020-06-17 Thread Yibo Cai
Found a way to achieve reasonable benchmark results with multiple threads. Diff pasted below for a quick review or try. Tested on E5-2650, with this change: num_threads = 1, speed = 1996 num_threads = 2, speed = 3555 num_threads = 4, speed = 5828 When running `arrow_flight_benchmark`, I find

Flight benchmark question

2020-06-15 Thread Yibo Cai
I'm evaluating the Flight benchmark [1] on a single host and ran into one problem; I would like to ask for help. The Flight benchmark has a "num_threads" parameter [1] to set the "number of concurrent gets". Counter-intuitively, setting it to larger values drops performance, "arrow-flight-benchmark

Re: [DISCUSS][C++] Performance work and compiler standardization for linux

2020-06-22 Thread Yibo Cai
On 6/22/20 5:07 PM, Antoine Pitrou wrote: Le 22/06/2020 à 06:27, Micah Kornfield a écrit : There has been significant effort recently trying to optimize our C++ code. One thing that seems to come up frequently is different benchmark results between GCC and Clang. Even different versions of

Re: Flight benchmark question

2020-06-17 Thread Yibo Cai
Data Throughput over gRPC" in https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/ Kind Regards Chengxin Sent with ProtonMail Secure Email. ‐‐‐ Original Message ‐‐‐ On Wednesday, June 17, 2020 8:35 AM, Yibo Cai wrote: Found a way to achieve reasonable benchmark results with

Re: [C++][Discuss] Approaches for SIMD optimizations

2020-06-10 Thread Yibo Cai
suboptimal situation (that is, unless the number of full-time developers and maintainers on Arrow C++ inflates significantly). Personally, I would like interested developers and contributors (such as Micah, Frank, Yibo Cai) to hash out the various possible approaches, and propose a way forward (which

Re: [C++][Discuss] Approaches for SIMD optimizations

2020-06-12 Thread Yibo Cai
On 6/12/20 2:30 PM, Micah Kornfield wrote: Hi Frank, Are the performance numbers you published for the baseline directly from master? I'd like to look at this over the next few days to see if I can figure out what is going on. To all: I'd like to make sure we flush out things to consider in

pass input args directly to kernel

2020-12-14 Thread Yibo Cai
Current kernel framework divides inputs (e.g. arrays, chunked arrays) into batches and feeds to kernel code. Does it make sense to pass input args directly to kernel? I'm writing quantile kernel, need to allocate buffer to record all inputs and find nth at last. For chunked array, input is

Re: Arrow Dataset API on Ceph

2021-06-21 Thread Yibo Cai
/adapters/arrow-rados-cls/docs/deploy.md On 2021/06/07 10:36:08, Yibo Cai wrote: Hi Jayjeet, It is exciting to see a real world computational storage solution built upon Arrow and Ceph. Amazing work! We are interested in this project (I'm from Arm open source software team focusing on storage and big

Re: [ANNOUNCE] New Arrow PMC member: David M Li

2021-06-22 Thread Yibo Cai
Congrats David! On 6/22/21 8:56 PM, David Li wrote: Thanks everyone! I've learned a lot and had a great time contributing here, and I look forward to continuing to work with everybody. Best, David On 2021/06/22 10:54:08, Krisztián Szűcs wrote: Congrats David! On Tue, Jun 22, 2021 at 11:19

Re: [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-06-24 Thread Yibo Cai
chmark results https://issues.apache.org/jira/browse/ARROW-13170 On Thu, Jun 24, 2021 at 12:27 AM Yibo Cai wrote: Did a quick test. For random bitmaps and my trivial test code, the branch-less code is 3.5x faster than branch one. https://quick-bench.com/q/UD22IIdMgKO9HU1PsPezj05Kkro On 6

Re: C++ RecordBatch Debugging Segmentation Fault

2021-05-20 Thread Yibo Cai
ch of the columns as soon as you create the RecordBatch (from one thread) which will force the boxed columns to materialize. -Weston On Thu, May 20, 2021 at 11:40 AM Wes McKinney wrote: Also, is it possible that the field is not an Int64Array? On Wed, May 19, 2021 at 10:19 PM Yibo Cai wrote:

Re: C++ RecordBatch Debugging Segmentation Fault

2021-05-19 Thread Yibo Cai
On 5/20/21 4:15 AM, Rares Vernica wrote: Hello, I'm using Arrow for accessing data outside the SciDB database engine. It generally works fine but we are running into Segmentation Faults in a corner multi-threaded case. I identified two threads that work on the same Record Batch. I wonder if

Re: Arrow Dataset API on Ceph

2021-06-07 Thread Yibo Cai
Hi Jayjeet, It is exciting to see a real world computational storage solution built upon Arrow and Ceph. Amazing work! We are interested in this project (I'm from Arm open source software team focusing on storage and big data OSS), and would like to reproduce your work first, then evaluate

Re: [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-06-23 Thread Yibo Cai
Did a quick test. For random bitmaps and my trivial test code, the branch-less code is 3.5x faster than branch one. https://quick-bench.com/q/UD22IIdMgKO9HU1PsPezj05Kkro On 6/23/21 11:21 PM, Wes McKinney wrote: One project I was interested in getting to but haven't had the time was introducing

Re: [ANNOUNCE] New Arrow committer: Weston Pace

2021-07-09 Thread Yibo Cai
Congrats Weston! From: Wes McKinney Sent: Friday, July 9, 2021 8:47 PM To: dev Subject: [ANNOUNCE] New Arrow committer: Weston Pace On behalf of the Arrow PMC, I'm happy to announce that Weston has accepted an invitation to become a committer on Apache Arrow.

RE: [C++] Indeterminate poor performance of random number generator

2021-04-22 Thread Yibo Cai
Yes, this soft-float math (in libm.so) makes the Arm binary extremely slow. -Original Message- From: Antoine Pitrou Sent: Thursday, April 22, 2021 17:20 To: dev@arrow.apache.org Subject: Re: [C++] Indeterminate poor performance of random number generator Le 22/04/2021 à 03:38, Yibo Cai

Re: [C++] Indeterminate poor performance of random number generator

2021-04-22 Thread Yibo Cai
On 4/22/21 9:38 AM, Yibo Cai wrote: On 4/21/21 6:07 PM, Antoine Pitrou wrote: Le 21/04/2021 à 11:41, Yibo Cai a écrit : On 4/21/21 5:17 PM, Antoine Pitrou wrote: Le 21/04/2021 à 11:14, Yibo Cai a écrit : When running benchmarks on Arm64 servers, I find some benchmarks are extremely slow

[C++] adopting an SIMD library

2021-02-08 Thread Yibo Cai
This topic was discussed in an earlier thread [1], but nothing has landed yet. PR https://github.com/apache/arrow/pull/9424 optimizes ByteStreamSplit with Arm64 NEON, so maybe it's a good chance to evaluate the possibility of simplifying arch-dependent SIMD code with a SIMD library. I did a quick comparison

Re: [DISCUSS][C++] Reduce usage of KernelContext in compute::

2021-03-11 Thread Yibo Cai
Beside reporting errors, maybe a kernel wants to allocate memory through KernelContext::memory_pool [1] in Kernel::init? I'm not quite sure if this is a valid case. Would like to hear other comments. [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernel.h#L95 Yibo On

Re: Requirements on JIRA usage in Apache Arrow

2021-03-02 Thread Yibo Cai
I prefer keeping Jira, simply because I'm familiar with it and use it in daily work. I log detailed progress, findings and todos for non-trivial tasks in Jira comments. It does help me. Yibo From: Sutou Kouhei Sent: Tuesday, March 2, 2021 9:47 AM To:

RE: [Rust] Contributing to Apache Arrow

2021-03-03 Thread Yibo Cai
Hi Ivan, I guess you didn't log in to Jira? Otherwise you will see the "Assign to me" link in the right pane. You can click "Log In" at the upper right corner, or "Sign up" for an account if you don't have one. Yibo -Original Message- From: Ivan Vankov Sent: Wednesday, March 3, 2021 16:41 To:

Re: New committer: Yibo Cai

2021-03-07 Thread Yibo Cai
Sent from my iPhone On 06-Mar-2021, at 12:48 AM, Antoine Pitrou wrote: Hello, The Project Management Committee (PMC) for Apache Arrow has invited Yibo Cai to become a committer and we are pleased to announce that he has accepted. Yibo is a frequent contributor to the C++ Arrow implementation

[C++] Indeterminate poor performance of random number generator

2021-04-21 Thread Yibo Cai
When running benchmarks on Arm64 servers, I find some benchmarks are extremely slow when built with clang. E.g., "ModeKernelNarrow/1048576/1" costs 90s to finish. I find almost all the time is spent in generating random bits (prepare test data)[1], not the test itself. Below sample code is

Re: [VOTE] Release Apache Arrow 4.0.0 - RC1

2021-04-20 Thread Yibo Cai
'gandiva-decimal-test' hangs on my machine, not sure if it's a blocker issue. Details at https://issues.apache.org/jira/browse/ARROW-12476 Test command "TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1 dev/release/verify-release-candidate.sh source 4.0.0 1" On 4/19/21 10:50 PM, Krisztián Szűcs wrote:

Re: [C++] Indeterminate poor performance of random number generator

2021-04-21 Thread Yibo Cai
On 4/21/21 5:17 PM, Antoine Pitrou wrote: Le 21/04/2021 à 11:14, Yibo Cai a écrit : When running benchmarks on Arm64 servers, I find some benchmarks are extremely slow when built with clang. E.g., "ModeKernelNarrow/1048576/1" costs 90s to finish. I find almost all the tim

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-21 Thread Yibo Cai
+1 Verified C++ and Python on Arm64 Linux (Ubuntu-18.04). TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1 TEST_PYTHON=1 dev/release/verify-release-candidate.sh source 4.0.0 3 On 4/22/21 5:30 AM, Krisztián Szűcs wrote: Hi, I would like to propose the following release candidate (RC3) of Apache Arrow

Re: [C++] Indeterminate poor performance of random number generator

2021-04-21 Thread Yibo Cai
On 4/21/21 6:07 PM, Antoine Pitrou wrote: Le 21/04/2021 à 11:41, Yibo Cai a écrit : On 4/21/21 5:17 PM, Antoine Pitrou wrote: Le 21/04/2021 à 11:14, Yibo Cai a écrit : When running benchmarks on Arm64 servers, I find some benchmarks are extremely slow when built with clang. E.g

RE: [NIGHTLY] Arrow Build Report for Job nightly-2021-08-24-0

2021-08-24 Thread Yibo Cai
Did a quick review. Listed error positions. Removed duplicated failures. *-osx-* https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=10372=logs=cf796865-97b7-5cd1-be8e-6e00ce4fd8cf=9f7de14c-8ff0-55c4-a998-d852f888262c=15 test-conda-python-3.6-pandas-0.23 (I remember there's a PR

Arrow in HPC

2021-09-09 Thread Yibo Cai
Hi, We have some rough ideas of applying Flight in HPC (High Performance Computation). Would like to hear comments. HPC infrastructure normally leverages RDMA for fast data transfer among storage nodes and compute nodes. Computation tasks are dispatched to compute nodes with best fit

Re: [Question] Allocations along 64 byte cache lines

2021-09-07 Thread Yibo Cai
), and the alignment does not impact the benches. Best, Jorge [1] https://stackoverflow.com/a/27184001/931303 On Tue, Sep 7, 2021 at 4:29 AM Yibo Cai wrote: Did a quick bench of accessing a long buffer that is not 8-byte aligned. Given enough conditions, it looks like unaligned access does have some penalty over

Re: [Question] Allocations along 64 byte cache lines

2021-09-06 Thread Yibo Cai
Did a quick bench of accessing a long buffer that is not 8-byte aligned. Given enough conditions, it looks like unaligned access does have some penalty over aligned access. But I don't think this is an issue in practice. Please be very skeptical of this benchmark. It's hard to get it right given the

Re: Arm64 github runner

2021-10-21 Thread Yibo Cai
unner" on Wed, 20 Oct 2021 12:43:16 +0800, Yibo Cai wrote: Hi, We have free Arm64 instances (maintained by Arm) as github action self-hosted runners for open source projects. Arrow Arm CI is currently running on Travis. Is an additional Arm64 runner useful? I think we can build and v

Arm64 github runner

2021-10-19 Thread Yibo Cai
Hi, We have free Arm64 instances (maintained by Arm) as github action self-hosted runners for open source projects. Arrow Arm CI is currently running on Travis. Is an additional Arm64 runner useful? I think we can build and verify Arm64 Linux releases on it. Yibo

Re: [VOTE] Release Apache Arrow 6.0.0 - RC3

2021-10-21 Thread Yibo Cai
+1 Verified c++/python source on ubuntu 20.04, aarch64 ARROW_CMAKE_OPTIONS="-DCMAKE_CXX_COMPILER=/usr/bin/clang++-10 -DCMAKE_C_COMPILER=/usr/bin/clang-10" TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1 TEST_PYTHON=1 dev/release/verify-release-candidate.sh source 6.0.0 3 On 10/22/21 7:30 AM,

Re: Arrow in HPC

2021-12-29 Thread Yibo Cai
29, 2021, at 04:37, Yibo Cai wrote: Thanks David for initiating the UCX integration, great work! I think a 5Gbps network is too limited for performance evaluation. I will try the patch on a 100Gb RDMA network; hopefully we can see some improvements. I once benchmarked Flight over a 100Gb network [1]; gRPC-based

Re: Arrow in HPC

2021-12-29 Thread Yibo Cai
Thanks David for initiating the UCX integration, great work! I think a 5Gbps network is too limited for performance evaluation. I will try the patch on a 100Gb RDMA network; hopefully we can see some improvements. I once benchmarked Flight over a 100Gb network [1]; gRPC-based throughput is 2.4GB/s for one

Re: Arm64 github runner

2021-11-09 Thread Yibo Cai
runner. Please note for security reasons, the runner cannot be accessed directly from internet. Inbound connection requests are rejected. It can initiate connections to other hosts. Is it okay for crossbow? Any security concern from crossbow side? On 10/22/21 10:07 AM, Yibo Cai wrote: Thanks

Re: [VOTE] Release Apache Arrow 6.0.1 - RC1

2021-11-11 Thread Yibo Cai
+1. Verified c++ and python source, on ubuntu 20.04, aarch64. CC=clang-10 CXX=clang++-10 \ TEST_SOURCE=1 TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 \ dev/release/verify-release-candidate.sh source 6.0.1 1 On 11/11/21 10:39 AM, Sutou Kouhei wrote: Hi, I would like to propose the following

Re: Arrow in HPC

2021-10-26 Thread Yibo Cai
u, Sep 9, 2021, at 11:24, Jed Brown wrote: Yibo Cai writes: HPC infrastructure normally leverages RDMA for fast data transfer among storage nodes and compute nodes. Computation tasks are dispatched to compute nodes with best fit resources. Concretely, we are investigating porting UCX as Fl

Re: [DISCUSS][C++] Strategies for SIMD cross-compilation?

2021-07-18 Thread Yibo Cai
On 7/17/21 12:08 AM, Wes McKinney wrote: hi folks, I had a conversation with the developers of xsimd last week in Paris and was made aware that they are working on a substantial refactor of xsimd to improve its usability for cross-compilation and dynamic-dispatch based on runtime processor

Re: [VOTE] Release Apache Arrow 5.0.0 - RC1

2021-07-24 Thread Yibo Cai
+1 Verified C++ and Python on Arm64 Linux (Ubuntu 20.04, aarch64). ARROW_CMAKE_OPTIONS="-DCMAKE_CXX_COMPILER=/usr/bin/clang++-10 -DCMAKE_C_COMPILER=/usr/bin/clang-10" TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1 TEST_PYTHON=1 dev/release/verify-release-candidate.sh source 5.0.0 1 On 7/23/21

Re: Arm64 github runner

2022-02-14 Thread Yibo Cai
Unfortunately the runner may not be available. Will update if things change. Yibo On 11/12/21 6:57 PM, Krisztián Szűcs wrote: On Wed, Nov 10, 2021 at 2:55 AM Yibo Cai wrote: Some updates, @kou, @kszucs There are two kinds of runners provided. One is dynamic vm created on demand like travis

RE: [C++] Replacing xsimd with compiler autovectorization

2022-03-29 Thread Yibo Cai
Hi Sasha, Thanks for the advice. I didn't quite catch the point. Would you explain a bit the purpose of this proposal? We do prefer compiler auto-vectorization to explicit simd code, even if the c++ code is slower than simd one (20% is acceptable IMO). And we do support runtime dispatch

RE: [VOTE] Release Apache Arrow 7.0.0 - RC10

2022-01-29 Thread Yibo Cai
+1. Verified C++ and Python source on Arm64 ubuntu20.04. CC=clang-12 CXX=clang++-12 TEST_SOURCE=1 TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 dev/release/verify-release-candidate.sh source 7.0.0 10 -Original Message- From: Krisztián Szűcs Sent: Saturday, January 29, 2022 7:29 PM To: dev

RE: [VOTE] Release Apache Arrow 7.0.0 - RC8

2022-01-27 Thread Yibo Cai
BitUtilTests.TestCopyAndReverseBitmapPreAllocated test failure is tracked at https://issues.apache.org/jira/browse/ARROW-15461 It's due to clang-12 compiler bug. PR https://github.com/apache/arrow/pull/12276 fixes the issue. -Original Message- From: Jonathan Keane Sent: Friday, January

RE: [VOTE] Release Apache Arrow 7.0.0 - RC8

2022-01-28 Thread Yibo Cai
Gandiva unit test failed on Arm (TEST_SOURCE=1 TEST_CPP=1), but the bug is not arch dependent, should be fixed in 7.0 release. Details at https://issues.apache.org/jira/browse/ARROW-15493, PR is ready. -Original Message- From: Krisztián Szűcs Sent: Wednesday, January 26, 2022 9:24 PM

Re: Arrow in HPC

2022-01-18 Thread Yibo Cai
. -David On Wed, Dec 29, 2021, at 22:16, Yibo Cai wrote: On 12/29/21 11:03 PM, David Li wrote: Awesome, thanks for sharing this too! The refactoring you have with DataClientStream is what I would like to do as well - I think much of the existing code can be adapted to be more transport-agnostic

RE: [VOTE] Release Apache Arrow 7.0.0 - RC6

2022-01-26 Thread Yibo Cai
arrow-utility-test failed on both x86 and Arm if verify Arrow from source with clang-12. I believe it's a compiler bug and not a blocking issue. Details at https://issues.apache.org/jira/browse/ARROW-15461 -Original Message- From: Krisztián Szűcs Sent: Tuesday, January 25, 2022 2:03 AM

RE: [VOTE] Release Apache Arrow 14.0.0 - RC2

2023-10-25 Thread Yibo Cai
+1 Verified cpp/python/go on Arm64 ubuntu-20.04. No blocking issue found. TEST_DEFAULT=0 \ TEST_CPP=1 \ TEST_PYTHON=1 \ TEST_GO=1 \ dev/release/verify-release-candidate.sh 14.0.0 2 -Original Message- From: Sutou Kouhei Sent: Wednesday, October 25, 2023 14:03 To: dev@arrow.apache.org

RE: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-23 Thread Yibo Cai
Congrats Xuwei! -Original Message- From: Gang Wu Sent: Monday, October 23, 2023 13:29 To: dev@arrow.apache.org Subject: Re: [ANNOUNCE] New Arrow committer: Xuwei Fu Congrats Xuwei! Best, Gang On Mon, Oct 23, 2023 at 12:56 PM Sutou Kouhei wrote: > On behalf of the Arrow PMC, I'm

RE: Merge a pull request with GitHub API

2022-05-18 Thread Yibo Cai
+1 -Original Message- From: Sutou Kouhei Sent: Wednesday, May 18, 2022 11:43 AM To: dev@arrow.apache.org Subject: Merge a pull request with GitHub API Hi, How about using GitHub API instead of local "git merge" to merge a pull request? We use local "git merge" to merge a pull request

Re: [VOTE] Accept donation of Flight SQL JDBC driver

2022-06-30 Thread Yibo Cai
+1 On 7/1/22 04:20, David Li wrote: Hello, This vote is to determine if the Arrow PMC is in favor of accepting the donation of the Flight SQL JDBC driver. This process was deemed necessary since there was significant development prior to opening the pull request. This was discussed in a

RE: [VOTE] Release Apache Arrow 8.0.1 - RC0

2022-07-14 Thread Yibo Cai
+1, verified on arm64 -Original Message- From: Sutou Kouhei Sent: Friday, July 15, 2022 5:11 AM To: dev@arrow.apache.org Subject: [VOTE] Release Apache Arrow 8.0.1 - RC0 Hi, I would like to propose the following release candidate (RC0) of Apache Arrow version 8.0.1. This is a release

RE: [VOTE] Release Apache Arrow 7.0.1 - RC0

2022-07-14 Thread Yibo Cai
+1, verified on arm64 -Original Message- From: Sutou Kouhei Sent: Friday, July 15, 2022 5:11 AM To: dev@arrow.apache.org Subject: [VOTE] Release Apache Arrow 7.0.1 - RC0 Hi, I would like to propose the following release candidate (RC0) of Apache Arrow version 7.0.1. This is a release

RE: [VOTE] Release Apache Arrow 6.0.2 - RC0

2022-07-14 Thread Yibo Cai
+1, verified on arm64 -Original Message- From: Sutou Kouhei Sent: Friday, July 15, 2022 5:11 AM To: dev@arrow.apache.org Subject: [VOTE] Release Apache Arrow 6.0.2 - RC0 Hi, I would like to propose the following release candidate (RC0) of Apache Arrow version 6.0.2. This is a release

RE: [VOTE] Release Apache Arrow 8.0.0 - RC3

2022-05-04 Thread Yibo Cai
+1. Verified cpp/python/go source and apt binaries on ubuntu20.04, aarch64. TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 TEST_GO=1 dev/release/verify-release-candidate.sh 8.0.0 3 TEST_DEFAULT=0 TEST_APT=1 dev/release/verify-release-candidate.sh 8.0.0 3 -Original Message- From: Krisztián

Re: [VOTE] C++: switch to C++17

2022-08-24 Thread Yibo Cai
+1 (binding) On 8/25/22 04:03, Mauricio Vargas Sepúlveda wrote: +1 On 2022-08-24 12:50, Weston Pace wrote: +1 (non-binding) On Wed, Aug 24, 2022 at 9:24 AM Keith Kraus wrote: +1 (non-binding) On Wed, Aug 24, 2022 at 12:12 PM David Li wrote: +1 (binding) On Wed, Aug 24, 2022, at 12:06,

Re: [VOTE] Release Apache Arrow 10.0.0 - RC0

2022-10-23 Thread Yibo Cai
+1 Verified C++/Python/Go source on ubuntu20.04, aarch64. TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 TEST_GO=1 \ dev/release/verify-release-candidate.sh 10.0.0 0 On 10/21/22 14:06, Sutou Kouhei wrote: Hi, I would like to propose the following release candidate (RC0) of Apache Arrow version

nightly job failures

2022-09-26 Thread Yibo Cai
There are some nightly job failures [1]. Pasted some logs below, not sure if already reported. osx related --- https://github.com/ursacomputing/crossbow/actions/runs/3125303015/jobs/5069525838#step:7:747

Re: [ANNOUNCE] New Arrow PMC member: Weston Pace

2022-09-05 Thread Yibo Cai
Congratulations Weston! From: Andrew Lamb Sent: Monday, September 5, 2022 9:18 PM To: dev Subject: Re: [ANNOUNCE] New Arrow PMC member: Weston Pace Congratulations Weston! On Mon, Sep 5, 2022 at 8:09 AM Rok Mihevc wrote: > Congrats Weston! > > Rok > IMPORTANT

Re: [VOTE] Release Apache Arrow 9.0.0 - RC2

2022-08-02 Thread Yibo Cai
+1 (binding) Verified source (cpp/python/go) and wheels on ubuntu-20.04 aarch64. TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 TEST_GO=1 dev/release/verify-release-candidate.sh 9.0.0 2 TEST_DEFAULT=0 TEST_WHEELS=1 dev/release/verify-release-candidate.sh 9.0.0 2 On 7/30/22 07:10, Krisztián Szűcs

RE: [ANNOUNCE] New Arrow PMC chair: Andrew Lamb

2022-12-26 Thread Yibo Cai
Congratulations! -Original Message- From: Rok Mihevc Sent: Tuesday, December 27, 2022 7:57 AM To: dev@arrow.apache.org Subject: Re: [ANNOUNCE] New Arrow PMC chair: Andrew Lamb Congratulations Andrew! Rok On Mon, Dec 26, 2022 at 11:26 PM Neal Richardson < neal.p.richard...@gmail.com>

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-23 Thread Yibo Cai
+1 I ran the following on Ubuntu-22.04, aarch64. TEST_DEFAULT=0 \ TEST_CPP=1 \ TEST_PYTHON=1 \ TEST_GO=1 \ dev/release/verify-release-candidate.sh 12.0.0 0 TEST_DEFAULT=0 \ TEST_WHEELS=1 \ dev/release/verify-release-candidate.sh 12.0.0 0 On 4/23/23 14:40, Sutou Kouhei wrote: +1

Re: [DISCUSS] The default commit message for merge button

2023-01-31 Thread Yibo Cai
+1 for title and description, for a purely personal reason. I care about commit messages and often try to write informative messages (even following pedantic rules like the 72-character line length). It's a bit disappointing if the messages are not shown by `git log`. On 2/1/23 10:46, Jacob Wujciak wrote:

Re: [ANNOUNCE] New Arrow PMC member: Matt Topol

2023-05-03 Thread Yibo Cai
Congrats Matt! On 5/4/23 07:07, Krisztián Szűcs wrote: Congrats Matt! On Wed, May 3, 2023 at 11:44 PM Rok Mihevc wrote: Congrats Matt. Well deserved! Rok On Wed, May 3, 2023 at 11:03 PM David Li wrote: Congrats Matt! On Wed, May 3, 2023, at 16:06, Neal Richardson wrote:

Re: [VOTE] Release Apache Arrow 13.0.0 - RC0

2023-07-24 Thread Yibo Cai
+1. Verified c++/python/go source on Ubuntu-22.04 aarch64. TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 TEST_GO=1 \ dev/release/verify-release-candidate.sh 13.0.0 0 Met with a non-blocking issue: https://github.com/apache/arrow/issues/36860 On 7/21/23 17:49, Raúl Cumplido wrote: Hi, As discussed

RE: [VOTE] Release Apache Arrow 15.0.0 - RC1

2024-01-17 Thread Yibo Cai
+1 (binding) Verified cpp/python/go on ubuntu20.04, aarch64 TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 TEST_GO=1 dev/release/verify-release-candidate.sh 15.0.0 1 -Original Message- From: Raúl Cumplido Sent: Wednesday, January 17, 2024 18:58 To: dev@arrow.apache.org Subject: [VOTE]

Re: [ANNOUNCE] New Arrow PMC chair: Andy Grove

2023-11-27 Thread Yibo Cai
Congrats Andy! On 11/28/23 06:03, L. C. Hsieh wrote: Congrats Andy! Thanks Andrew for the efforts to lead the Arrow project in the past year! On Tue, Nov 28, 2023 at 3:51 AM Krisztián Szűcs wrote: Congrats Andy & Thanks Andrew! On Mon, Nov 27, 2023 at 6:55 PM Chao Sun wrote:

[jira] [Created] (ARROW-7404) [C++][Gandiva] Fix utf8 char length error on Arm64

2019-12-16 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7404: --- Summary: [C++][Gandiva] Fix utf8 char length error on Arm64 Key: ARROW-7404 URL: https://issues.apache.org/jira/browse/ARROW-7404 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-7403) [C++][JSON] Enable Rapidjson on Arm64 Neon

2019-12-16 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7403: --- Summary: [C++][JSON] Enable Rapidjson on Arm64 Neon Key: ARROW-7403 URL: https://issues.apache.org/jira/browse/ARROW-7403 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-7397) [C++] Json white space length detection error

2019-12-16 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7397: --- Summary: [C++] Json white space length detection error Key: ARROW-7397 URL: https://issues.apache.org/jira/browse/ARROW-7397 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-7464) [C++][util]: Refine CpuInfo singleton with std::call_once

2019-12-22 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7464: --- Summary: [C++][util]: Refine CpuInfo singleton with std::call_once Key: ARROW-7464 URL: https://issues.apache.org/jira/browse/ARROW-7464 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7526) [C++][Compute]: Optimize small integer sorting

2020-01-09 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7526: --- Summary: [C++][Compute]: Optimize small integer sorting Key: ARROW-7526 URL: https://issues.apache.org/jira/browse/ARROW-7526 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-7557) [C++][Compute] Validate sorting stability in random test

2020-01-12 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7557: --- Summary: [C++][Compute] Validate sorting stability in random test Key: ARROW-7557 URL: https://issues.apache.org/jira/browse/ARROW-7557 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7587) [C++][Compute] Add Top-k kernel

2020-01-15 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-7587: --- Summary: [C++][Compute] Add Top-k kernel Key: ARROW-7587 URL: https://issues.apache.org/jira/browse/ARROW-7587 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-8129) [C++][Compute] Refine compare sorting kernel

2020-03-16 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8129: --- Summary: [C++][Compute] Refine compare sorting kernel Key: ARROW-8129 URL: https://issues.apache.org/jira/browse/ARROW-8129 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-8440) Refine simd header files

2020-04-14 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8440: --- Summary: Refine simd header files Key: ARROW-8440 URL: https://issues.apache.org/jira/browse/ARROW-8440 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-8438) [C++] arrow-io-memory-benchmark crashes

2020-04-14 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8438: --- Summary: [C++] arrow-io-memory-benchmark crashes Key: ARROW-8438 URL: https://issues.apache.org/jira/browse/ARROW-8438 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-8537) [C++] Performance regression from ARROW-8523

2020-04-20 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8537: --- Summary: [C++] Performance regression from ARROW-8523 Key: ARROW-8537 URL: https://issues.apache.org/jira/browse/ARROW-8537 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-8523) [C++] Optimize BitmapReader

2020-04-19 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8523: --- Summary: [C++] Optimize BitmapReader Key: ARROW-8523 URL: https://issues.apache.org/jira/browse/ARROW-8523 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-8496) [C++] Refine ByteStreamSplitDecodeScalar

2020-04-17 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8496: --- Summary: [C++] Refine ByteStreamSplitDecodeScalar Key: ARROW-8496 URL: https://issues.apache.org/jira/browse/ARROW-8496 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-8126) [C++][Compute] Add Top-K kernel benchmark

2020-03-15 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8126: --- Summary: [C++][Compute] Add Top-K kernel benchmark Key: ARROW-8126 URL: https://issues.apache.org/jira/browse/ARROW-8126 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-8227) [C++] Propose refining SIMD code framework

2020-03-25 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-8227: --- Summary: [C++] Propose refining SIMD code framework Key: ARROW-8227 URL: https://issues.apache.org/jira/browse/ARROW-8227 Project: Apache Arrow Issue Type
