Re: [CI] Docker-compose refactor and GitHub Actions

2019-11-08 Thread Wes McKinney
Just to be sure, if this PR is merged, how many GHA tasks will be run on each commit to master? On Fri, Nov 8, 2019 at 12:07 PM Krisztián Szűcs wrote: > > I've trimmed down the number of triggered builds on pull requests by > converting them to run on master only or cron builds. Also added > the

[jira] [Created] (ARROW-7103) [R] Various minor cleanups

2019-11-08 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7103: -- Summary: [R] Various minor cleanups Key: ARROW-7103 URL: https://issues.apache.org/jira/browse/ARROW-7103 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-7102) Make filesystem wrappers compatible with fsspec

2019-11-08 Thread Tom Augspurger (Jira)
Tom Augspurger created ARROW-7102: - Summary: Make filesystem wrappers compatible with fsspec Key: ARROW-7102 URL: https://issues.apache.org/jira/browse/ARROW-7102 Project: Apache Arrow Issue

Re: [Java] Call for reviewers

2019-11-08 Thread David Li
I took a look at #5630 (ARROW-6662) and #5751 (ARROW-7019). Best, David On 11/7/19, Micah Kornfield wrote: > There are a few open PRs that I think could either use a first or second > set of eyes: > > https://github.com/apache/arrow/pull/5630 > https://github.com/apache/arrow/pull/5645 > https:/

Re: [Discuss][FlightRPC] Extensions to Flight: "DoBidirectional"

2019-11-08 Thread David Li
I've updated the proposal. On the subject of Protobuf Any vs bytes, and how to handle errors/metadata, I still think using bytes is preferable: - It doesn't require (conditionally) exposing or wrapping Protobuf types, - We wouldn't be able to practically expose the Protobuf field to C++ users with

Re: [Java] Append multiple record batches together?

2019-11-08 Thread Bryan Cutler
I think having a chunked array with multiple vector buffers would be ideal, similar to C++. It might take a fair amount of work to add this but would open up a lot more functionality. As for the API, VectorSchemaRoot.concat(Collection) seems good to me. On Thu, Nov 7, 2019 at 12:09 AM Fan Liya wr

ConcatenateTables APIs

2019-11-08 Thread Zhuo Peng
Hi, https://github.com/apache/arrow/pull/5534 introduced ConcatenateTablesWithPromotion(). And there is already a ConcatenateTables() function which behaves differently (it requires the tables to have the same schema). Wes raised a concern in that PR [1] that we might end up having many Concatenate

Re: [CI] Docker-compose refactor and GitHub Actions

2019-11-08 Thread Krisztián Szűcs
I've trimmed down the number of triggered builds on pull requests by converting them to run on master only or cron builds. Also added the action filters including the changed path patterns. I've also collected ~30 follow-up JIRAs aggregating the problems I came across during the refactor and possi

[jira] [Created] (ARROW-7101) [CI] Refactor docker-compose setup and use it with GitHub Actions

2019-11-08 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-7101: -- Summary: [CI] Refactor docker-compose setup and use it with GitHub Actions Key: ARROW-7101 URL: https://issues.apache.org/jira/browse/ARROW-7101 Project: Apache A

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-11-08-0

2019-11-08 Thread Neal Richardson
The homebrew-cpp failure is a little hard (for me) to determine which is the real error, but https://travis-ci.org/ursa-labs/crossbow/builds/609138711#L6672 seems to implicate aws-sdk-cpp, which was upgraded on homebrew-core a couple of days ago: https://github.com/Homebrew/homebrew-core/commits/m

[jira] [Created] (ARROW-7100) libjvm not found on ubuntu 19.04 with java >8

2019-11-08 Thread Alexis Mignon (Jira)
Alexis Mignon created ARROW-7100: Summary: libjvm not found on ubuntu 19.04 with java >8 Key: ARROW-7100 URL: https://issues.apache.org/jira/browse/ARROW-7100 Project: Apache Arrow Issue Type

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-11-08-0

2019-11-08 Thread Francois Saint-Jacques
Lint and Rust failures fixed (https://github.com/apache/arrow/commit/aa9f5c95253ef1fe713c5010f0a8f740ef284109) Gandiva failures fixed (https://github.com/apache/arrow/commit/1d23ec42fd786141b7de58a057d91c74ca19c32e) Centos7 failure fixed (https://github.com/apache/arrow/commit/5a47c5e8c2d5dba5eac52

Re: Merged C++ Parquet Encryption implementation PARQUET-1300

2019-11-08 Thread Gidon Gershinsky
Wes, Thank you for reviewing and merging this project. Regarding the note - we'll have interop testers in parquet-mr, so that cpp-written files, encrypted in various modes, would be tested by java readers - and vice versa. These manual tests could be run during development and ahead of releases. F

[NIGHTLY] Arrow Build Report for Job nightly-2019-11-08-0

2019-11-08 Thread Crossbow
Arrow Build Report for Job nightly-2019-11-08-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-08-0 Failed Tasks: - centos-7: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-08-0-azure-centos-7 - docker-lint: URL: https:/

Re: [Java] Question About Vector Allocation

2019-11-08 Thread Fan Liya
Hi Azim, I think we should be aware of two distinct concepts: 1. vector capacity: the max number of values that can be stored in the vector, without reallocation 2. vector length: the number of values actually filled in the vector For any valid vector, we always have vector length <= vector capa
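
The capacity/length distinction above can be illustrated with a small toy model. This is only a sketch: `ToyVector` and its methods are hypothetical names chosen to echo the vocabulary of the thread, not Arrow's Java API, and the doubling growth policy is an assumption made for illustration.

```python
class ToyVector:
    """Toy stand-in for a growable vector (hypothetical, not Arrow's API)."""

    def __init__(self):
        self._slots = []       # allocated storage; its size is the capacity
        self.value_count = 0   # vector length: number of values actually filled

    @property
    def capacity(self):
        # Max number of values that fit without reallocation.
        return len(self._slots)

    def allocate_new(self, capacity):
        # Reserve storage; the vector is still logically empty.
        self._slots = [None] * capacity
        self.value_count = 0

    def _grow_to_fit(self, needed):
        # Assumed policy: double the storage until `needed` slots exist.
        while needed > self.capacity:
            self._slots.extend([None] * max(1, self.capacity))

    def set_safe(self, index, value):
        # Write a value, reallocating first if the index is past capacity.
        self._grow_to_fit(index + 1)
        self._slots[index] = value

    def set_value_count(self, count):
        # Declare how many slots hold valid values; keeps length <= capacity.
        self._grow_to_fit(count)
        self.value_count = count

    def values(self):
        # Only the first `value_count` slots are considered part of the vector.
        return self._slots[:self.value_count]


vec = ToyVector()
vec.allocate_new(4)          # capacity 4, length 0
for i in range(3):
    vec.set_safe(i, i * 10)  # fill three slots
vec.set_value_count(3)       # length becomes 3, capacity unchanged
```

In this toy model, allocating storage never changes the length; only the explicit `set_value_count` call does, which is why the invariant length <= capacity always holds.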

[Java] Question About Vector Allocation

2019-11-08 Thread azim afroozeh
Hi everyone, I have a question about the Java implementation of Apache Arrow. Should we always call setValueCount after creating a vector with allocateNew()? I can see that in some tests setValueCount is called immediately after allocateNew. For example here: https://github.com/apache/arrow