Re: Subject: [VOTE] Release Apache Arrow 0.15.0 - RC1

2019-09-30 Thread Sutou Kouhei
If we don't care about the rustfmt check for release, how about removing the check from dev/release/verify-release-candidate.sh? In "Re: Subject: [VOTE] Release Apache Arrow 0.15.0 - RC1" on Sun, 29 Sep 2019 17:40:20 -0700, Andy Grove wrote: > Actually, I think the RC was cut just before 1.40.0

Clarifying interpretation of Buffer "length" field in Arrow protocol

2019-09-30 Thread Wes McKinney
I just updated my pull request from May adding language to clarify what protocol writers are expected to set when producing the Arrow binary protocol https://github.com/apache/arrow/pull/4370 Implementations may allocate small buffers, or use memory which does not meet the 8-byte minimal padding
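As a rough illustration of the padding arithmetic mentioned in this message, here is a minimal sketch of rounding a buffer size up to a multiple of 8 bytes. The helper names `padded_length` and `pad_buffer` are hypothetical and are not Arrow APIs; the sketch does not claim to reflect the exact wording of the pull request.

```python
def padded_length(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment (hypothetical helper)."""
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    # Integer arithmetic: add (alignment - 1), then truncate to a multiple.
    return (nbytes + alignment - 1) // alignment * alignment


def pad_buffer(data: bytes, alignment: int = 8) -> bytes:
    """Return data followed by zero bytes up to the padded length."""
    return data + b"\x00" * (padded_length(len(data), alignment) - len(data))


if __name__ == "__main__":
    for n in (0, 1, 8, 9):
        print(n, "->", padded_length(n))
```

A buffer whose size is already a multiple of 8 is left unchanged; anything else gains between 1 and 7 trailing zero bytes.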

Re: [DISCUSS] C-level in-process array protocol

2019-09-30 Thread Wes McKinney
A couple things: * I think a C protocol / FFI for Arrow array/vectors would be better to have the same "shape" as an assembled array. Note that the C structs here have very nearly the same "shape" as the data structure representing a C++ Array object [1]. The disassembly and reassembly here is

Re: [DISCUSS] C-level in-process array protocol

2019-09-30 Thread Antoine Pitrou
FlatCC is still a dependency, with generated files etc. Perhaps you want to evaluate FlatCC on a schema-like example and see what the generated code and compile instructions look like? I'll point out again that the format string in my proposal uses an extremely simple mini-format, that should

Re: [DISCUSS] C-level in-process array protocol

2019-09-30 Thread Ben Kietzman
FlatCC seems germane: https://github.com/dvidelabs/flatcc It compiles flatbuffer schemas down to (idiomatic?) C Perhaps the schema and batch serialization problems should be solved by storing everything in the flatbuffer format. Then the results of running flatcc plus a few simple helpers can be

[jira] [Created] (ARROW-6747) [R] Bindings for Plasma object store

2019-09-30 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6747: --- Summary: [R] Bindings for Plasma object store Key: ARROW-6747 URL: https://issues.apache.org/jira/browse/ARROW-6747 Project: Apache Arrow Issue Type: New

[jira] [Created] (ARROW-6746) [CI] Run hadolint Dockerfile lint checks somewhere else

2019-09-30 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6746: --- Summary: [CI] Run hadolint Dockerfile lint checks somewhere else Key: ARROW-6746 URL: https://issues.apache.org/jira/browse/ARROW-6746 Project: Apache Arrow

[jira] [Created] (ARROW-6744) Export JsonEqual trait in the array module

2019-09-30 Thread Kyle McCarthy (Jira)
Kyle McCarthy created ARROW-6744: Summary: Export JsonEqual trait in the array module Key: ARROW-6744 URL: https://issues.apache.org/jira/browse/ARROW-6744 Project: Apache Arrow Issue Type:

Re: Parquet file reading performance

2019-09-30 Thread Wes McKinney
On Sat, Sep 28, 2019 at 3:16 PM Maarten Ballintijn wrote: > > Hi Joris, > > Thanks for your detailed analysis! > > We can leave the impact of the large DateTimeIndex on the performance for > another time. > (Notes: my laptop has sufficient memory to support it, no error is thrown, the >

Re: Unnesting ListArrays

2019-09-30 Thread Wes McKinney
hi Suhail -- well, unnesting produces an array of a different length. I would think that unnesting would mainly occur in the context of analytics, e.g. list_values.flatten().unique() We definitely would like to have APIs that help with doing analytics on nested data. I had hoped to get to work
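To make the point about length concrete, here is a minimal pure-Python sketch of the `list_values.flatten().unique()` idea. It models a list array as a Python list whose entries are lists or `None`; the functions `flatten` and `unique` below are illustrative stand-ins, not the Arrow API.

```python
from typing import Hashable, List, Optional, Sequence


def flatten(list_values: Sequence[Optional[Sequence[Hashable]]]) -> List[Hashable]:
    """Concatenate the child values of every non-null list entry."""
    out: List[Hashable] = []
    for item in list_values:
        if item is None:  # a null list contributes no child values
            continue
        out.extend(item)
    return out


def unique(values: Sequence[Hashable]) -> List[Hashable]:
    """Return distinct values in order of first appearance."""
    seen = set()
    out: List[Hashable] = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


if __name__ == "__main__":
    list_values = [[1, 2], None, [2, 3]]
    flat = flatten(list_values)
    print(len(list_values), len(flat), unique(flat))
```

The input here has three list entries while the flattened result has four values, which is why the unnested array cannot simply replace the original column in place.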

Re: Build issues on macOS [newbie]

2019-09-30 Thread Wes McKinney
Thanks for letting us know. If there are any improvements we can make to the developer documentation, please feel free to open a JIRA or a pull request to fix On Mon, Sep 30, 2019 at 8:13 AM Tarek Allam Jr. wrote: > > > Hi Wes, > > Thank you very much, that indeed fixed things and allowed me to

[jira] [Created] (ARROW-6742) [C++] Remove usage of boost::filesystem::path from arrow/io/hdfs_internal.cc

2019-09-30 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6742: --- Summary: [C++] Remove usage of boost::filesystem::path from arrow/io/hdfs_internal.cc Key: ARROW-6742 URL: https://issues.apache.org/jira/browse/ARROW-6742 Project:

[jira] [Created] (ARROW-6740) Unable to delete closed MemoryMappedFile on Windows

2019-09-30 Thread Sergey Mozharov (Jira)
Sergey Mozharov created ARROW-6740: -- Summary: Unable to delete closed MemoryMappedFile on Windows Key: ARROW-6740 URL: https://issues.apache.org/jira/browse/ARROW-6740 Project: Apache Arrow

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-09-30-0

2019-09-30 Thread Krisztián Szűcs
wheel-osx-cp27m has failed with a Travis deployment error. Created a JIRA to resolve it: https://issues.apache.org/jira/browse/ARROW-6739 On Mon, Sep 30, 2019 at 3:32 PM Crossbow wrote: > >

[jira] [Created] (ARROW-6739) [Packaging][Crossbow] Use crossbow.py upload-artifacts across all CI providers

2019-09-30 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-6739: -- Summary: [Packaging][Crossbow] Use crossbow.py upload-artifacts across all CI providers Key: ARROW-6739 URL: https://issues.apache.org/jira/browse/ARROW-6739

[NIGHTLY] Arrow Build Report for Job nightly-2019-09-30-0

2019-09-30 Thread Crossbow
Arrow Build Report for Job nightly-2019-09-30-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-30-0 Failed Tasks: - docker-r: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-30-0-circle-docker-r - docker-spark-integration:

Re: Build issues on macOS [newbie]

2019-09-30 Thread Tarek Allam Jr.
Hi Wes, Thank you very much, that indeed fixed things and allowed me to complete a build. After running conda install --file ci/conda_env_cpp.yml I was able to get past the above error, but then was faced with the error message akin to that found at

[jira] [Created] (ARROW-6738) [Java] Fix problems with current union comparison logic

2019-09-30 Thread Liya Fan (Jira)
Liya Fan created ARROW-6738: --- Summary: [Java] Fix problems with current union comparison logic Key: ARROW-6738 URL: https://issues.apache.org/jira/browse/ARROW-6738 Project: Apache Arrow Issue

Re: [DISCUSS][Java] Reduce the range of synchronized block when releasing an ArrowBuf

2019-09-30 Thread Antoine Pitrou
I will just point out that using an atomic counter or boolean /outside/ of a locked section is a common pattern in C++. The benefit comes up if the locked section is conditional and the condition is rarely true. Regards Antoine. Le 30/09/2019 à 06:24, Jacques Nadeau a écrit : > For others
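The pattern described here can be sketched roughly as follows. Python has no atomic boolean type, so a plain attribute stands in for what would be a `std::atomic<bool>` in C++; the class `Resource` and its methods are hypothetical names chosen for illustration and do not correspond to the ArrowBuf code under discussion.

```python
import threading


class Resource:
    """Checks a flag outside the lock so the rare slow path is the only one that locks."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._needs_cleanup = False  # stand-in for an atomic boolean
        self.cleanups = 0

    def mark_for_cleanup(self) -> None:
        self._needs_cleanup = True

    def release(self) -> bool:
        # Fast path: the condition is rarely true, so most calls return
        # here without ever touching the lock.
        if not self._needs_cleanup:
            return False
        with self._lock:
            # Re-check under the lock: another thread may have done the
            # cleanup between the unlocked check and lock acquisition.
            if not self._needs_cleanup:
                return False
            self._needs_cleanup = False
            self.cleanups += 1
            return True


if __name__ == "__main__":
    r = Resource()
    print(r.release())  # flag not set, lock never taken
    r.mark_for_cleanup()
    print(r.release())  # slow path runs the cleanup once
```

The second check inside the lock is what keeps the cleanup from running twice when several threads pass the unlocked check at the same time.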

Re: Subject: [VOTE] Release Apache Arrow 0.15.0 - RC1

2019-09-30 Thread Krisztián Szűcs
On Mon, Sep 30, 2019 at 12:27 AM Wes McKinney wrote: > OK. I think an RC2 can be based off of the current master branch to > make things simple. Do any more patches need to be cherry-picked? > There are some other C# protocol-related bug fixes but they seem > incomplete > I'll use the master

Re: Subject: [VOTE] Release Apache Arrow 0.15.0 - RC1

2019-09-30 Thread Krisztián Szűcs
Hey Micah! On Sun, Sep 29, 2019 at 10:23 PM Micah Kornfield wrote: > Krisztián do you have availability to cut a new RC (I won't be able to > sort out my key issues for at least a week)? > Yes, I can cut RC2 later today. > > To answer Wes's question earlier in the thread it would be nice to