Re: [Release] Request for patch release of arrow 3.x

2021-03-12 Thread Andy Grove
Hi Prem, I'd like to learn more about your needs here. I mostly work on the Rust implementation but I work with the cuDF team in my day job at NVIDIA. I'll take a look at your PR on Monday. Thanks, Andy. On Fri, Mar 12, 2021, 3:47 PM Neal Richardson wrote: > I'll let others comment on a

Re: [Proposal] Improving PR tracking in Github with PR labels

2021-03-12 Thread Weston Pace
So I took some time to look this over today and consider the possibilities. I also explored the basic capabilities available from GitHub more and learned a few tricks and capabilities that I didn't realize existed. @Wes I'm not sure how the Spark dashboard is helpful. This may be a case of simpl

Re: [Release] Request for patch release of arrow 3.x

2021-03-12 Thread Neal Richardson
I'll let others comment on a patch release, but let me clarify: we do major releases quarterly, and 3.0.0 was in January, so 4.0.0 will be in April. Neal On Fri, Mar 12, 2021 at 2:45 PM Prem Sagar Gali wrote: > Hi Arrow Devs, > > I'm a maintainer of a project called cuDF ( > https://github.com/

[Release] Request for patch release of arrow 3.x

2021-03-12 Thread Prem Sagar Gali
Hi Arrow Devs, I'm a maintainer of a project called cuDF (https://github.com/rapidsai/cudf.git) that is based on the Arrow columnar format and depends on the Arrow C++ and Python libraries. Currently, we are pinned to version `1.0.1`, but we've gotten feedback from the community that they'd re

Re: [Rust][DataFusion] Query Engine Design / DataFusion Implementation talk

2021-03-12 Thread Aldrin
This is great, thanks! Aldrin Montana Computer Science PhD Student UC Santa Cruz On Fri, Mar 12, 2021 at 11:39 AM Andrew Lamb wrote: > Here are links to the content, should anyone be interested: > > Query Engine Design and the Rust-Based DataFusion in Apache Arrow > recording: https://www.yout

Re: [Rust][DataFusion] Query Engine Design / DataFusion Implementation talk

2021-03-12 Thread Andrew Lamb
Here are links to the content, should anyone be interested: Query Engine Design and the Rust-Based DataFusion in Apache Arrow recording: https://www.youtube.com/watch?v=K6eCAVEk4kU slides (DataFusion content starts on slide 6): https://www.slideshare.net/influxdata/influxdb-iox-tech-talks-query-e

Re: [JAVA] issues encountered during build

2021-03-12 Thread Micah Kornfield
> > I would think that this would show up in nightly builds. I guess I could > try older versions, or I'll keep tracking it down to the cause. I think by default most machines have a default timezone set to UTC, so it might not. On Fri, Mar 12, 2021 at 7:53 AM bobtins wrote: > > > On 2021/03/1

Re: [Rust] Patch release process

2021-03-12 Thread Neal Richardson
If you haven't already, you may need to re-create the 3.0.1 version in JIRA. I deleted it this week (along with 2.0.1) while doing some prep for 4.0.0 since it seemed like we weren't going to be doing a patch release given the proximity of the next major release. Apologies if that was a mistake. N

Re: [Python] Best practices when exposing options

2021-03-12 Thread Neal Richardson
Hi Ying, I'd suggest looking at how the other file readers and writers (CSV, Parquet, etc.) expose their options. I don't know pyarrow well enough myself to tell you what the answer is, but the answer is probably following whatever model is already there for those options. Neal On Fri, Mar 12, 20

Re: [JAVA] issues encountered during build

2021-03-12 Thread bobtins
On 2021/03/12 04:09:21, Micah Kornfield wrote: > > * Build does require Java 8, not "8 or later" as stated in java/README.md > > There's a reference to sun.misc.Unsafe > > in > > memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java > > which of course went away in

Re: [Rust] Patch release process

2021-03-12 Thread Andy Grove
Thanks for the reviews and edits so far on this. It looks like we have a process defined that should work well. I will create a PR to add this documentation to the repo this weekend and will create the release-rust-3.0 branch and start cherry-picking 3.0.1 bug fixes into that branch. If anyone has

Re: [DISCUSS][C++] Reduce usage of KernelContext in compute::

2021-03-12 Thread Antoine Pitrou
I wouldn't mind changing those APIs to return a Status. I'll also note that KernelContext::SetStatus() isn't thread-safe. Regards Antoine. On 12/03/2021 at 11:40, Benjamin Kietzman wrote: My primary point is that using KernelContext to hold error statuses is confusing since there are more

Re: [DISCUSS][C++] Reduce usage of KernelContext in compute::

2021-03-12 Thread Benjamin Kietzman
My primary point is that using KernelContext to hold error statuses is confusing since there are more places to check for an error condition. In the rest of the C++ library we use RETURN_NOT_OK or ARROW_ASSIGN_OR_RAISE to handle stack unwinding from an error, but in the presence of KernelContext it

[NIGHTLY] Arrow Build Report for Job nightly-2021-03-12-0

2021-03-12 Thread Crossbow
Arrow Build Report for Job nightly-2021-03-12-0 All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-03-12-0 Failed Tasks: - conda-linux-gcc-py36-aarch64: URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-03-12-0-drone-conda-linux

[Python] Best practices when exposing options

2021-03-12 Thread Ying Zhou
Hi, Currently I’m working on ARROW-11297 (https://github.com/mathyingzhou/arrow/tree/ARROW-11297) which will be filed as soon as the current PR is merged. I managed to reimplement orc::WriterOptions in Arrow (with naming conventions Arr