"Empty" slots

2016-03-09 Thread Daniel Robinson
Thanks, Wes. For hashing, my thought was that you could just hash the null bitmask concatenated with the values array (this should be doable in constant space for ordinary one-pass hash functions). If nulls are zeroed out, this will guarantee that equal primitive arrays hash to the same value. But
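
A minimal sketch of the hashing idea described in this snippet, assuming 64-bit integer values, a validity list where True means "not null", and SHA-256 as the one-pass hash. The function name `hash_primitive_array` and all of these choices are illustrative assumptions, not anything proposed on the thread. Null slots are written as zero, so two arrays that differ only in the bytes under a null slot hash to the same value.

```python
import hashlib
import struct


def hash_primitive_array(values, validity):
    """Hash the null bitmask followed by the values, zeroing null slots.

    Hypothetical helper: `values` is a sequence of ints, `validity` a
    sequence of bools of the same length (True means the slot is not null).
    """
    hasher = hashlib.sha256()

    # Pack the validity flags into a bitmask, least-significant bit first.
    bitmask = bytearray((len(values) + 7) // 8)
    for i, is_valid in enumerate(validity):
        if is_valid:
            bitmask[i // 8] |= 1 << (i % 8)
    hasher.update(bytes(bitmask))

    # Feed the values incrementally; the hasher keeps only a fixed-size state.
    for value, is_valid in zip(values, validity):
        hasher.update(struct.pack("<q", value if is_valid else 0))

    return hasher.hexdigest()
```

With this sketch, `hash_primitive_array([1, 99, 3], [True, False, True])` and `hash_primitive_array([1, -5, 3], [True, False, True])` agree, because the value under the null slot is never hashed.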

Understanding "shared" memory implications

2016-03-09 Thread Corey J Nolet
If I understand correctly, Arrow is using Netty underneath which is using Sun's Unsafe API in order to allocate direct byte buffers off heap. It is using Netty to communicate between "client" and "server", information about memory addresses for data that is being requested. I've never attempted

[jira] [Created] (ARROW-60) C++: Struct type builder API

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-60: - Summary: C++: Struct type builder API Key: ARROW-60 URL: https://issues.apache.org/jira/browse/ARROW-60 Project: Apache Arrow Issue Type: New Feature Co

[jira] [Created] (ARROW-59) Python: Boolean data support for builtin data structures

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-59: - Summary: Python: Boolean data support for builtin data structures Key: ARROW-59 URL: https://issues.apache.org/jira/browse/ARROW-59 Project: Apache Arrow Issue Typ

[jira] [Created] (ARROW-58) Format: Draft type metadata ("schemas") IDL

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-58: - Summary: Format: Draft type metadata ("schemas") IDL Key: ARROW-58 URL: https://issues.apache.org/jira/browse/ARROW-58 Project: Apache Arrow Issue Type: New Featu

[jira] [Created] (ARROW-57) Format: Draft data headers IDL for data interchange

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-57: - Summary: Format: Draft data headers IDL for data interchange Key: ARROW-57 URL: https://issues.apache.org/jira/browse/ARROW-57 Project: Apache Arrow Issue Type: Ne

[jira] [Created] (ARROW-56) Format: Specify LSB bit ordering in bit arrays

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-56: - Summary: Format: Specify LSB bit ordering in bit arrays Key: ARROW-56 URL: https://issues.apache.org/jira/browse/ARROW-56 Project: Apache Arrow Issue Type: New Fea
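
As a rough illustration of what LSB bit ordering means for a bit array, the sketch below assumes that element i lives in byte i // 8 at bit position i % 8, counting from the least-significant bit. The helper names `set_bit` and `get_bit` are hypothetical and are not taken from the issue.

```python
def set_bit(bitmap: bytearray, i: int) -> None:
    """Set element i, assuming least-significant-bit-first ordering."""
    bitmap[i // 8] |= 1 << (i % 8)


def get_bit(bitmap: bytes, i: int) -> bool:
    """Read element i under the same assumed ordering."""
    return (bitmap[i // 8] >> (i % 8)) & 1 == 1


# Element 0 maps to the lowest bit of byte 0, element 9 to bit 1 of byte 1.
example = bytearray(2)
set_bit(example, 0)
set_bit(example, 9)
```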

Re: "Empty" slots

2016-03-09 Thread Wes McKinney
hey Dan, You bring up a good point. It's unclear whether this would make hashing much less complex (if you ignore the null bits when computing the hash function, then it probably would, especially on repeated fields). We'd need to do some experiments to see if there are cases where skipping the nul

[jira] [Resolved] (ARROW-50) C++: Enable library builds for 3rd-party users without having to build thirdparty googletest

2016-03-09 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-50. --- Resolution: Fixed See https://github.com/apache/arrow/blob/master/python/doc/INSTALL.md from https:/

[jira] [Resolved] (ARROW-53) Python: Fix RPATH and add source installation instructions

2016-03-09 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-53?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-53. --- Resolution: Fixed I was able to resolve this essentially through permutation testing (seriously) in htt

[jira] [Assigned] (ARROW-54) Python: rename package to "pyarrow"

2016-03-09 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-54: - Assignee: Wes McKinney > Python: rename package to "pyarrow" > ---

[jira] [Resolved] (ARROW-54) Python: rename package to "pyarrow"

2016-03-09 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-54. --- Resolution: Fixed Issue resolved by pull request 23 [https://github.com/apache/arrow/pull/23] > Python:

[jira] [Created] (ARROW-55) Python: fix legacy Python (2.7) tests and add to Travis CI

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-55: - Summary: Python: fix legacy Python (2.7) tests and add to Travis CI Key: ARROW-55 URL: https://issues.apache.org/jira/browse/ARROW-55 Project: Apache Arrow Issue T

[jira] [Created] (ARROW-54) Python: rename package to "pyarrow"

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-54: - Summary: Python: rename package to "pyarrow" Key: ARROW-54 URL: https://issues.apache.org/jira/browse/ARROW-54 Project: Apache Arrow Issue Type: Improvement

[jira] [Commented] (ARROW-53) Python: Fix RPATH and add source installation instructions

2016-03-09 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-53?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187771#comment-15187771 ] Wes McKinney commented on ARROW-53: --- Well, this has ended up being a massive time sink. I'

Re: Distributed arrow?

2016-03-09 Thread Wes McKinney
For a system to use Arrow, it only needs to be able to understand the columnar memory layout (and typically the metadata -- still to be defined). Separately, many of us are interested in developing tools to share Arrow data through various shared memory mechanisms (for example, memory-mapped files

[jira] [Created] (ARROW-53) Python: Fix RPATH and add source installation instructions

2016-03-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-53: - Summary: Python: Fix RPATH and add source installation instructions Key: ARROW-53 URL: https://issues.apache.org/jira/browse/ARROW-53 Project: Apache Arrow Issue T

Re: Distributed arrow?

2016-03-09 Thread Yiannis Gkoufas
Hi Venkat, as far as I understand, arrow works on infrastructures which already support RDMA and it's not included in the core arrow library. Can the arrow developers confirm that assumption? Thanks On 5 March 2016 at 01:54, Venkat Krishnamurthy wrote: > All > > I've been following along with