[Java] Append multiple record batches together?

2019-11-06 Thread Micah Kornfield
Hi, A colleague opened up https://issues.apache.org/jira/browse/ARROW-7048 for having similar functionality to the Python APIs that allow for creating one larger data structure from a series of record batches. I just wanted to surface it here in case: 1. An efficient solution already exists? It

[jira] [Created] (ARROW-7083) [C++] Determine the feasibility and build a prototype to replace compute/kernels with gandiva kernels

2019-11-06 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-7083: -- Summary: [C++] Determine the feasibility and build a prototype to replace compute/kernels with gandiva kernels Key: ARROW-7083 URL:

Re: [DISCUSS] Result vs Status

2019-11-06 Thread Micah Kornfield
This seems reasonable to me. Given the impact of the API changes, I think it might be worth keeping around for ~3 releases, but I think we are generally slow to delete deprecated APIs anyway. Any other thoughts on this? I can try to open up some tracking JIRAs for the work involved. On Wed, Oct

Re: Saving Binary Arrow memory objects as blobs in Cassandra

2019-11-06 Thread Wes McKinney
I suggest you use the IPC protocol (http://arrow.apache.org/docs/python/ipc.html). This protocol will be considered stable starting with the 1.0.0 release, but I would guess (without making any guarantees) that blobs written with 0.15.1 will be readable in 1.0.0 and beyond. On Wed, Nov 6, 2019 at

[jira] [Created] (ARROW-7082) [Packaging][deb] Add apache-arrow-archive-keyring

2019-11-06 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-7082: --- Summary: [Packaging][deb] Add apache-arrow-archive-keyring Key: ARROW-7082 URL: https://issues.apache.org/jira/browse/ARROW-7082 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7081) [R] Add methods for introspecting parquet files

2019-11-06 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-7081: --- Summary: [R] Add methods for introspecting parquet files Key: ARROW-7081 URL: https://issues.apache.org/jira/browse/ARROW-7081 Project: Apache Arrow Issue

Re: Achieving parity with Java extension types in Python

2019-11-06 Thread Justin Polchlopek
Hi. I'm looking into this issue and I have some questions as someone new to the project. The comment from Joris earlier in the thread suggests that the solution here is to create an Array subclass for each extension type that wants to use one. This will give a nice symmetry w.r.t. the Java

[jira] [Created] (ARROW-7080) [Python][Parquet] Expose parquet field_id in Schema objects

2019-11-06 Thread Ted Gooch (Jira)
Ted Gooch created ARROW-7080: Summary: [Python][Parquet] Expose parquet field_id in Schema objects Key: ARROW-7080 URL: https://issues.apache.org/jira/browse/ARROW-7080 Project: Apache Arrow

[jira] [Created] (ARROW-7079) [C++][Dataset] Implement ScalarAsStatisctics for non-primitive types

2019-11-06 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7079: - Summary: [C++][Dataset] Implement ScalarAsStatisctics for non-primitive types Key: ARROW-7079 URL: https://issues.apache.org/jira/browse/ARROW-7079

Saving Binary Arrow memory objects as blobs in Cassandra

2019-11-06 Thread Lee, David
Is there any way to save Arrow memory as a blob? I tried using Feather and Parquet, but neither one supports writing complex nested structures yet. I tried with the following test file. test.jsonl: {"a": 1, "b": "abc", "c": [1, 2], "d": {"e": true, "f": "1991-02-03"}, "g": [{"h": 1, "i": "a"},

Re: [DISCUSS] Dictionary Encoding Clarifications/Future Proofing

2019-11-06 Thread Wes McKinney
Just bumping this thread for more comments. On Wed, Oct 30, 2019 at 3:11 PM Wes McKinney wrote: > > Returning to this discussion as there seems to be a lack of consensus in the vote > thread > > Copying Micah's proposals in the VOTE thread here, I wanted to state > my opinions so we can discuss further

[jira] [Created] (ARROW-7078) [Developer] Add Windows utility script to use Dependencies.exe to dump DLL dependencies for diagnostic purposes

2019-11-06 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-7078: --- Summary: [Developer] Add Windows utility script to use Dependencies.exe to dump DLL dependencies for diagnostic purposes Key: ARROW-7078 URL:

[jira] [Created] (ARROW-7077) [C++] Unsupported Dict->T cast crashes instead of returning error

2019-11-06 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7077: - Summary: [C++] Unsupported Dict->T cast crashes instead of returning error Key: ARROW-7077 URL: https://issues.apache.org/jira/browse/ARROW-7077 Project: Apache

[NIGHTLY] Arrow Build Report for Job nightly-2019-11-06-0

2019-11-06 Thread Crossbow
Arrow Build Report for Job nightly-2019-11-06-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-06-0 Failed Tasks: - gandiva-jar-osx: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-06-0-travis-gandiva-jar-osx -

[jira] [Created] (ARROW-7076) `pip install pyarrow` with python 3.8 fail with message : Could not build wheels for pyarrow which use PEP 517 and cannot be installed directly

2019-11-06 Thread Fabien (Jira)
Fabien created ARROW-7076: - Summary: `pip install pyarrow` with python 3.8 fail with message : Could not build wheels for pyarrow which use PEP 517 and cannot be installed directly Key: ARROW-7076 URL:

[jira] [Created] (ARROW-7074) [C++] ASSERT_OK_AND_ASSIGN crashes when failing

2019-11-06 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7074: - Summary: [C++] ASSERT_OK_AND_ASSIGN crashes when failing Key: ARROW-7074 URL: https://issues.apache.org/jira/browse/ARROW-7074 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7073) [Java] Support concating vectors values in batch

2019-11-06 Thread Liya Fan (Jira)
Liya Fan created ARROW-7073: --- Summary: [Java] Support concating vectors values in batch Key: ARROW-7073 URL: https://issues.apache.org/jira/browse/ARROW-7073 Project: Apache Arrow Issue Type: New

[jira] [Created] (ARROW-7072) [Java] Support concating validity bits efficiently

2019-11-06 Thread Liya Fan (Jira)
Liya Fan created ARROW-7072: --- Summary: [Java] Support concating validity bits efficiently Key: ARROW-7072 URL: https://issues.apache.org/jira/browse/ARROW-7072 Project: Apache Arrow Issue Type: