Re: [Discuss][Java] 64-bit lengths for ValueVectors

2019-08-22 Thread Micah Kornfield
> > I don't think we should couple this discussion with the implementation of > large list, etc since I think those two concepts are independent. I'm still trying to balance in my mind which is a worse experience for consumers of the libraries for these types. Claiming that Java supports these ty

Binary compatibility of pyarrow.serialize

2019-08-22 Thread Yevgeni Litvin
In our system we are using Arrow serialization, as it showed excellent deserialization speed. However, it seems that we made a mistake by persisting the streams into long-term storage, as the serialized data appears to be incompatible between versions. According to the release notes of 0.14.0 it appea

[jira] [Created] (ARROW-6331) [Java] Incorporate ErrorProne into the java build

2019-08-22 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-6331: -- Summary: [Java] Incorporate ErrorProne into the java build Key: ARROW-6331 URL: https://issues.apache.org/jira/browse/ARROW-6331 Project: Apache Arrow Is

[jira] [Created] (ARROW-6330) [C++] Include missing headers in api.h

2019-08-22 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-6330: -- Summary: [C++] Include missing headers in api.h Key: ARROW-6330 URL: https://issues.apache.org/jira/browse/ARROW-6330 Project: Apache Arrow Issue Type: B

Re: [Discuss][Java] 64-bit lengths for ValueVectors

2019-08-22 Thread Jacques Nadeau
I don't think we should couple this discussion with the implementation of large list, etc., since I think those two concepts are independent. I've asked some others on my team their opinions on the risk here. I think we should probably review some of our more complex vector interactions and see how the

Re: [Discuss][Java] 64-bit lengths for ValueVectors

2019-08-22 Thread Jacques Nadeau
> > Hi Jacques, I hope you had a good rest. I did, thanks! On Fri, Aug 23, 2019 at 9:25 AM Jacques Nadeau wrote: > I don't think we should couple this discussion with the implementation of > large list, etc since I think those two concepts are independent. > > I've asked some others on my tea

Re: [DISCUSS][Java] Design of RLE vector

2019-08-22 Thread Fan Liya
Hi Micah, Sounds good. Thanks. I have prepared some initial code, in the hope that it will make discussions easier. Anyway, we can ignore it for now, until we have consensus. Best, Liya Fan On Fri, Aug 23, 2019 at 11:05 AM Micah Kornfield wrote: > I'm in favor of this, but still think we are

Re: [DISCUSS][Java] Design of RLE vector

2019-08-22 Thread Micah Kornfield
I'm in favor of this, but I think we are still gathering feedback on the proposal, so we should hold off on coding these up until we have consensus on the approach. Thanks, Micah On Wed, Aug 21, 2019 at 9:22 PM Fan Liya wrote: > Hi Micah, > > Thanks for the comments. > By storing the run-length end

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-22 Thread Micah Kornfield
I created https://issues.apache.org/jira/browse/ARROW-6313 as a tracking issue with sub-issues on the development work. So far no one has claimed the Java and JavaScript tasks. Would it make sense to have a separate dev branch for this work? Thanks, Micah On Thu, Aug 22, 2019 at 3:24 PM Wes McKinne

[jira] [Created] (ARROW-6329) [Format] Add 4-byte "stream continuation" to IPC message format to align Flatbuffers

2019-08-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6329: --- Summary: [Format] Add 4-byte "stream continuation" to IPC message format to align Flatbuffers Key: ARROW-6329 URL: https://issues.apache.org/jira/browse/ARROW-6329 Proj

[RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-22 Thread Wes McKinney
The vote carries with 4 binding +1 votes and 1 non-binding +1. I'll merge the specification patch later today, and we can begin working on implementations so we can get this done for 0.15.0. On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler wrote: > > +1 (non-binding) > > On Tue, Aug 20, 2019, 7:43 AM

[jira] [Created] (ARROW-6328) Click.option-s should have help text

2019-08-22 Thread Ulzii O (Jira)
Ulzii O created ARROW-6328: -- Summary: Click.option-s should have help text Key: ARROW-6328 URL: https://issues.apache.org/jira/browse/ARROW-6328 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-6327) [Python] Conversion of pandas.SparseArray columns in pandas.DataFrames to pyarrow.Table and back

2019-08-22 Thread Rok Mihevc (Jira)
Rok Mihevc created ARROW-6327: - Summary: [Python] Conversion of pandas.SparseArray columns in pandas.DataFrames to pyarrow.Table and back Key: ARROW-6327 URL: https://issues.apache.org/jira/browse/ARROW-6327

[jira] [Created] (ARROW-6326) [C++] Nullable fields when converting std::tuple to Table

2019-08-22 Thread Omer Ozarslan (Jira)
Omer Ozarslan created ARROW-6326: Summary: [C++] Nullable fields when converting std::tuple to Table Key: ARROW-6326 URL: https://issues.apache.org/jira/browse/ARROW-6326 Project: Apache Arrow

[jira] [Created] (ARROW-6325) [Python] wrong conversion of DataFrame with boolean values

2019-08-22 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6325: Summary: [Python] wrong conversion of DataFrame with boolean values Key: ARROW-6325 URL: https://issues.apache.org/jira/browse/ARROW-6325 Project: Apac

[jira] [Created] (ARROW-6324) [C++] File system API should expand paths

2019-08-22 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6324: -- Summary: [C++] File system API should expand paths Key: ARROW-6324 URL: https://issues.apache.org/jira/browse/ARROW-6324 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-6323) [R] Expand file paths when passing to readers

2019-08-22 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6323: -- Summary: [R] Expand file paths when passing to readers Key: ARROW-6323 URL: https://issues.apache.org/jira/browse/ARROW-6323 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6322) [C#] Implement a plasma client

2019-08-22 Thread Eric Erhardt (Jira)
Eric Erhardt created ARROW-6322: --- Summary: [C#] Implement a plasma client Key: ARROW-6322 URL: https://issues.apache.org/jira/browse/ARROW-6322 Project: Apache Arrow Issue Type: New Feature

Re: In-memory sorting of plasma objects

2019-08-22 Thread Wes McKinney
Hi Tanveer, IIUC there is logic for moving data that's managed by Plasma servers between nodes in the Ray project (https://github.com/ray-project/ray). If you need to move the bytes from one node to another, you need to use some kind of messaging / RPC tool. The Ray developers might have some advi

In-memory sorting of plasma objects

2019-08-22 Thread Tanveer Ahmad - EWI
Hi, I need some help regarding data exchange between Arrow-based Plasma shared memory objects on cluster nodes. I have two Plasma shared memory objects, each containing a RecordBatch, on different nodes of a cluster. I want to use pandas dataframes or something like that (dask) on a single node

[jira] [Created] (ARROW-6321) [Python] Ability to create ExtensionBlock on conversion to pandas

2019-08-22 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-6321: Summary: [Python] Ability to create ExtensionBlock on conversion to pandas Key: ARROW-6321 URL: https://issues.apache.org/jira/browse/ARROW-6321 Proje

[jira] [Created] (ARROW-6320) [C++] Arrow utilities are linked statically

2019-08-22 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6320: - Summary: [C++] Arrow utilities are linked statically Key: ARROW-6320 URL: https://issues.apache.org/jira/browse/ARROW-6320 Project: Apache Arrow Issue Type