[DISCUSS] Improving Arrow columnar implementation guidelines for third parties

2019-09-16 Thread Wes McKinney
hi folks, As Apache Arrow grows more popular, we may acquire some different kinds of third party developers: A. Developers who use and, in many cases, contribute to one of the project's reference implementations B. Developers who choose to implement the columnar format themselves, without

[jira] [Created] (ARROW-6571) [Developer] Provide means to "plug in" a third party Arrow implementation into the integration test suite for validation purposes

2019-09-16 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6571: --- Summary: [Developer] Provide means to "plug in" a third party Arrow implementation into the integration test suite for validation purposes Key: ARROW-6571 URL:

[jira] [Created] (ARROW-6572) [C++] Reading some Parquet data can return uninitialized memory

2019-09-16 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6572: - Summary: [C++] Reading some Parquet data can return uninitialized memory Key: ARROW-6572 URL: https://issues.apache.org/jira/browse/ARROW-6572 Project: Apache

[jira] [Created] (ARROW-6573) Segfault when writing to parquet

2019-09-16 Thread Josh Weinstock (Jira)
Josh Weinstock created ARROW-6573: - Summary: Segfault when writing to parquet Key: ARROW-6573 URL: https://issues.apache.org/jira/browse/ARROW-6573 Project: Apache Arrow Issue Type: Bug

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-16 Thread Sutou Kouhei
Hi, If this is circular, it's a problem. But this isn't circular for now. I think that we can use libarrow as the fundamental shared library to provide a common implementation like [1] if we need to provide a common implementation for templates. (I think that we don't provide common implementation

Re: [Discuss] [Java] DateMilliVector.getObject() return type (LocalDateTime vs LocalDate)

2019-09-16 Thread Micah Kornfield
Anyone have an opinion on this? Personally, I'm leaning toward keeping the existing API compatibility, but I don't feel too strongly about it. On Mon, Sep 9, 2019 at 7:39 PM Micah Kornfield wrote: > Yongbo Zhang, > Opened up a pull request to have DateMilliVector return a LocalDate > instead of a

[jira] [Created] (ARROW-6574) [JS] TypeError with utf8 and JSONVectorLoader.readData

2019-09-16 Thread Adam M Krebs (Jira)
Adam M Krebs created ARROW-6574: --- Summary: [JS] TypeError with utf8 and JSONVectorLoader.readData Key: ARROW-6574 URL: https://issues.apache.org/jira/browse/ARROW-6574 Project: Apache Arrow

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-16 Thread Sutou Kouhei
Hi, I understand what problems we want to solve. Especially template and DLL in ARROW-6244. I feel that one shared library is overkill because we have many namespaces. If we have only arrow:: namespace, it's reasonable. But we have arrow::, gandiva::, parquet:: and plasma:: namespaces. It's a

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-16 Thread Micah Kornfield
I don't have a strong opinion here, but had a question and comment: Are there any implications from a project governance perspective of packaging Parquet and Arrow into a single shared library? As a comment, I'm a big +1 on trying to tease apart the circular dependencies between

Re: [DISCUSS][Java] Design of the algorithm module

2019-09-16 Thread Micah Kornfield
Hi Liya Fan, Thank you for this writeup. It doesn't look like comments are enabled on the document. Could you allow for them? Thanks, Micah On Sat, Sep 14, 2019 at 6:57 AM Fan Liya wrote: > Dear all, > > We have prepared a document for discussing the requirements, design and > implementation

[jira] [Created] (ARROW-6575) [JS] decimal toString does not support negative values

2019-09-16 Thread Andong Zhan (Jira)
Andong Zhan created ARROW-6575: -- Summary: [JS] decimal toString does not support negative values Key: ARROW-6575 URL: https://issues.apache.org/jira/browse/ARROW-6575 Project: Apache Arrow

[Rust] DataFusion parallel query execution update

2019-09-16 Thread Andy Grove
I wanted to give a quick update to add some context to the work I am doing to add parallel query execution to DataFusion since I have been working on this largely in isolation. The current query execution code in DataFusion 0.14 is single-threaded and can only run against a single CSV or Parquet

Re: [DISCUSS] Improving Arrow columnar implementation guidelines for third parties

2019-09-16 Thread Micah Kornfield
1. Are there particular issues that have cropped up that we should be aware of? This might help inform how we go about this. 2. We should be publishing a matrix of current compliance with the standard for our existing implementations (this could be the basis of letting bespoke implementations

[jira] [Created] (ARROW-6576) [R] Fix sparklyr integration tests

2019-09-16 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6576: -- Summary: [R] Fix sparklyr integration tests Key: ARROW-6576 URL: https://issues.apache.org/jira/browse/ARROW-6576 Project: Apache Arrow Issue Type: Bug

[NIGHTLY] Arrow Build Report for Job nightly-2019-09-16-0

2019-09-16 Thread Crossbow
Arrow Build Report for Job nightly-2019-09-16-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-16-0 Failed Tasks: - ubuntu-cosmic: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-16-0-azure-ubuntu-cosmic -