Re: Using Plasma with xnd

2018-03-06 Thread Wes McKinney
hi Saul -- I think the easiest solution here is the buffer/memoryview protocol. You won't have to touch the Cython or C++ API from pyarrow if you do this. You can interact with a Buffer object like any other Python object implementing the buffer protocol. See numpy.frombuffer as an example of a
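A minimal sketch of the buffer/memoryview approach suggested above. The thread itself shows no code, so everything here is an assumption: a plain bytearray stands in for a pyarrow Buffer (or PlasmaBuffer), and ctypes is used only to illustrate how a raw address could be obtained from a writable object that implements the buffer protocol.

```python
import ctypes

# Stand-in for a pyarrow Buffer: any writable object exposing the buffer protocol.
# (Assumption for illustration; a real Plasma-backed buffer would be used instead.)
backing = bytearray(16)

# A memoryview is a zero-copy window onto the same memory; no Cython or C++ needed.
view = memoryview(backing)
view[0] = 42  # the write goes straight through to the backing memory

# If a C library needs a raw pointer, ctypes can wrap a writable buffer
# without copying and report the address of its first byte.
c_array = (ctypes.c_ubyte * len(backing)).from_buffer(backing)
address = ctypes.addressof(c_array)
```

The same pattern should apply to any object that supports the buffer protocol, which is why it avoids touching the Cython or C++ API directly. Note that `from_buffer` requires a writable buffer; a read-only buffer would need a different route.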

RE: Parquet to arrow java converter

2018-03-06 Thread Wenbo Zhao
Thanks Julien and Wes. There is an ongoing PR https://github.com/apache/parquet-mr/pull/443 (update Arrow version to 0.8.0) which I may be depending on. Should I wait for this? Wenbo -----Original Message----- From: Julien Le Dem [mailto:julien.le...@gmail.com] Sent: Tuesday, March 6, 2018

[jira] [Created] (ARROW-2283) [C++] Support Arrow C++ installed in /usr detection by pkg-config

2018-03-06 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2283: --- Summary: [C++] Support Arrow C++ installed in /usr detection by pkg-config Key: ARROW-2283 URL: https://issues.apache.org/jira/browse/ARROW-2283 Project: Apache Arrow

Re: [VOTE] Accept donation of Arrow Go implementation

2018-03-06 Thread Kouhei Sutou
+1 In "Re: [VOTE] Accept donation of Arrow Go implementation" on Tue, 6 Mar 2018 15:46:31 -0500, Li Jin wrote: > +1 > > On Tue, Mar 6, 2018 at 3:31 PM, Uwe L. Korn wrote: > >>

Re: Parquet to arrow java converter

2018-03-06 Thread Julien Le Dem
I would put it in the parquet-mr codebase. I have contributed the schema conversion code there. I’m happy to provide feedback on PRs in this area. Julien > On Mar 6, 2018, at 12:18, Wes McKinney wrote: > > When it had been discussed in the past, the thinking had been to >

Re: Using Plasma with xnd

2018-03-06 Thread Saul Shanabrook
Hey Wes, I don't have much experience doing C + Python + Cython development, so I am probably missing something obvious, but reading the Cython docs, it seems like I can only access types marked as

[jira] [Created] (ARROW-2282) [Python] Create StringArray from buffers

2018-03-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2282: -- Summary: [Python] Create StringArray from buffers Key: ARROW-2282 URL: https://issues.apache.org/jira/browse/ARROW-2282 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-2281) [Python] Expose MakeArray to construct arrays from buffers

2018-03-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2281: -- Summary: [Python] Expose MakeArray to construct arrays from buffers Key: ARROW-2281 URL: https://issues.apache.org/jira/browse/ARROW-2281 Project: Apache Arrow

Re: [Documentation] Incomplete Documentation

2018-03-06 Thread Wes McKinney
hi Alberto, We are volunteer developers developing new codebases with a quite large feature surface area. Please feel free to create JIRA issues pointing out missing API documentation so that members of the community can submit patches improving it. There is already a JIRA about concat_tables,

[Documentation] Incomplete Documentation

2018-03-06 Thread ALBERTO Bocchinfuso
Hi everyone, I am noticing more and more that the API documentation is missing some functions or some fields. I can testify about the Python APIs, which are the ones that I am using. For example, Batch.num_rows Batch.num_columns Batch.schema

Re: [VOTE] Accept donation of Arrow Go implementation

2018-03-06 Thread Li Jin
+1 On Tue, Mar 6, 2018 at 3:31 PM, Uwe L. Korn wrote: > +1 > > On Tue, Mar 6, 2018, at 9:28 PM, Jacques Nadeau wrote: > > +1 > > > > On Tue, Mar 6, 2018 at 10:57 AM, Wes McKinney > wrote: > > > > > Dear all, > > > > > > The Arrow PMC has been in contact

Re: [VOTE] Accept donation of Arrow Go implementation

2018-03-06 Thread Uwe L. Korn
+1 On Tue, Mar 6, 2018, at 9:28 PM, Jacques Nadeau wrote: > +1 > > On Tue, Mar 6, 2018 at 10:57 AM, Wes McKinney wrote: > > > Dear all, > > > > The Arrow PMC has been in contact with the developers of > > > > https://github.com/influxdata/arrow > > > > which is a native Go

Re: [VOTE] Accept donation of Arrow Go implementation

2018-03-06 Thread Jacques Nadeau
+1 On Tue, Mar 6, 2018 at 10:57 AM, Wes McKinney wrote: > Dear all, > > The Arrow PMC has been in contact with the developers of > > https://github.com/influxdata/arrow > > which is a native Go implementation of Apache Arrow. We are proposing > to accept this codebase into

Re: Parquet to arrow java converter

2018-03-06 Thread Wes McKinney
When it had been discussed in the past, the thinking had been to implement it in the Parquet Java codebase. I'd be interested in others' opinions about this (since I'm not an expert on Java matters) - Wes On Tue, Mar 6, 2018 at 2:27 PM, Wenbo Zhao wrote: > Hi, > > Sorry

Re: Parquet to arrow java converter

2018-03-06 Thread Li Jin
This definitely sounds like a useful tool. It seems like Julien started some of the work in Parquet-arrow a while back. Julien, I am wondering what your thoughts are on whether such code should live in the parquet-mr or arrow codebase? On Tue, Mar 6, 2018 at 2:27 PM, Wenbo Zhao

Re: Using Plasma with xnd

2018-03-06 Thread Wes McKinney
hi Saul, Are you able to use the buffer/memoryview protocol? Instances of pyarrow.Buffer, like PlasmaBuffer, support this https://github.com/apache/arrow/blob/master/python/pyarrow/plasma.pyx#L182 - Wes On Tue, Mar 6, 2018 at 3:09 PM, Saul Shanabrook wrote: > I am

Using Plasma with xnd

2018-03-06 Thread Saul Shanabrook
I am trying to use the Plasma store to back xnd objects. Xnd (https://xnd.readthedocs.io/en/latest/xnd/index.html) is a container library in C that has Python bindings. I would like to get a pointer to the allocated memory after creating or getting an object in Plasma. I see that this is supported in

[jira] [Created] (ARROW-2280) [Python] pyarrow.Array.buffers should also include the offsets

2018-03-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2280: -- Summary: [Python] pyarrow.Array.buffers should also include the offsets Key: ARROW-2280 URL: https://issues.apache.org/jira/browse/ARROW-2280 Project: Apache Arrow

[jira] [Created] (ARROW-2279) [Python] Better error message if lib cannot be found

2018-03-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2279: -- Summary: [Python] Better error message if lib cannot be found Key: ARROW-2279 URL: https://issues.apache.org/jira/browse/ARROW-2279 Project: Apache Arrow Issue

[VOTE] Accept donation of Arrow Go implementation

2018-03-06 Thread Wes McKinney
Dear all, The Arrow PMC has been in contact with the developers of https://github.com/influxdata/arrow which is a native Go implementation of Apache Arrow. We are proposing to accept this codebase into the Apache project. If the vote passes, the PMC and the authors of the code will work

[jira] [Created] (ARROW-2278) [Python] deserializing Numpy struct arrays raises

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2278: - Summary: [Python] deserializing Numpy struct arrays raises Key: ARROW-2278 URL: https://issues.apache.org/jira/browse/ARROW-2278 Project: Apache Arrow

[jira] [Created] (ARROW-2277) [Python] Tensor.from_numpy doesn't support struct arrays

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2277: - Summary: [Python] Tensor.from_numpy doesn't support struct arrays Key: ARROW-2277 URL: https://issues.apache.org/jira/browse/ARROW-2277 Project: Apache Arrow

[jira] [Created] (ARROW-2276) [Python] Tensor could implement the buffer protocol

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2276: - Summary: [Python] Tensor could implement the buffer protocol Key: ARROW-2276 URL: https://issues.apache.org/jira/browse/ARROW-2276 Project: Apache Arrow

[jira] [Created] (ARROW-2275) [C++] Buffer::mutable_data_ member uninitialized

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2275: - Summary: [C++] Buffer::mutable_data_ member uninitialized Key: ARROW-2275 URL: https://issues.apache.org/jira/browse/ARROW-2275 Project: Apache Arrow

[jira] [Created] (ARROW-2273) Cannot deserialize pandas SparseDataFrame

2018-03-06 Thread Mitar (JIRA)
Mitar created ARROW-2273: Summary: Cannot deserialize pandas SparseDataFrame Key: ARROW-2273 URL: https://issues.apache.org/jira/browse/ARROW-2273 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-2272) [Python] test_plasma spams /tmp

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2272: - Summary: [Python] test_plasma spams /tmp Key: ARROW-2272 URL: https://issues.apache.org/jira/browse/ARROW-2272 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-2271) [Python] test_plasma could make errors more diagnosable

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2271: - Summary: [Python] test_plasma could make errors more diagnosable Key: ARROW-2271 URL: https://issues.apache.org/jira/browse/ARROW-2271 Project: Apache Arrow

[jira] [Created] (ARROW-2270) [Python] ForeignBuffer doesn't tie Python object lifetime to C++ buffer lifetime

2018-03-06 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2270: - Summary: [Python] ForeignBuffer doesn't tie Python object lifetime to C++ buffer lifetime Key: ARROW-2270 URL: https://issues.apache.org/jira/browse/ARROW-2270

[jira] [Created] (ARROW-2269) Cannot build bdist_wheel for Python

2018-03-06 Thread Mitar (JIRA)
Mitar created ARROW-2269: Summary: Cannot build bdist_wheel for Python Key: ARROW-2269 URL: https://issues.apache.org/jira/browse/ARROW-2269 Project: Apache Arrow Issue Type: Bug