[jira] [Created] (ARROW-2419) [Site] Website generation depends on local timezone

2018-04-09 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2419: - Summary: [Site] Website generation depends on local timezone Key: ARROW-2419 URL: https://issues.apache.org/jira/browse/ARROW-2419 Project: Apache Arrow Is

[jira] [Created] (ARROW-2420) [Rust] Memory is never released

2018-04-09 Thread Andy Grove (JIRA)
Andy Grove created ARROW-2420: - Summary: [Rust] Memory is never released Key: ARROW-2420 URL: https://issues.apache.org/jira/browse/ARROW-2420 Project: Apache Arrow Issue Type: Bug Comp

[jira] [Created] (ARROW-2421) [C++] Update LLVM version in cpp README

2018-04-09 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-2421: -- Summary: [C++] Update LLVM version in cpp README Key: ARROW-2421 URL: https://issues.apache.org/jira/browse/ARROW-2421 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-2422) Support more filter operators on Hive partitioned Parquet files

2018-04-09 Thread Julius Neuffer (JIRA)
Julius Neuffer created ARROW-2422: - Summary: Support more filter operators on Hive partitioned Parquet files Key: ARROW-2422 URL: https://issues.apache.org/jira/browse/ARROW-2422 Project: Apache Arrow

Rust Arrow status and plans for this week

2018-04-09 Thread Andy Grove
Over the weekend I added preliminary Parquet support to DataFusion (it only supports int/float primitives and UTF8 so far). This was possible due to the great work happening with the parquet-rs crate. Integrating this with the current Rust version of Arrow was simple and I have now started running

[jira] [Created] (ARROW-2423) [Python] PyArrow datatypes raise ValueError on equality checks against non-PyArrow objects

2018-04-09 Thread Dave Challis (JIRA)
Dave Challis created ARROW-2423: --- Summary: [Python] PyArrow datatypes raise ValueError on equality checks against non-PyArrow objects Key: ARROW-2423 URL: https://issues.apache.org/jira/browse/ARROW-2423
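
For context on the behaviour this issue title describes: a common Python convention is for `__eq__` to return `NotImplemented` when given a foreign type, so the comparison evaluates to `False` instead of raising. The sketch below only illustrates that convention. `DataTypeSketch` is a hypothetical stand-in class; it is not PyArrow's implementation, nor necessarily the fix adopted for ARROW-2423.

```python
class DataTypeSketch:
    """Hypothetical stand-in for a data type object (not a PyArrow class)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        # Returning NotImplemented lets Python fall back to its default
        # comparison, so foreign objects compare unequal without an error.
        if not isinstance(other, DataTypeSketch):
            return NotImplemented
        return self.name == other.name

    def __hash__(self) -> int:
        # Keep hashing consistent with equality.
        return hash(self.name)


same = DataTypeSketch("int64") == DataTypeSketch("int64")
foreign = DataTypeSketch("int64") == "int64"
```

With this convention, comparing against a string or `None` yields `False` and `!=` yields `True`, and no exception is raised.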

[jira] [Created] (ARROW-2424) Missing import causing broken build

2018-04-09 Thread Andy Grove (JIRA)
Andy Grove created ARROW-2424: - Summary: Missing import causing broken build Key: ARROW-2424 URL: https://issues.apache.org/jira/browse/ARROW-2424 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-2425) [Rust] Array::from missing mapping for u8 type

2018-04-09 Thread Andy Grove (JIRA)
Andy Grove created ARROW-2425: - Summary: [Rust] Array::from missing mapping for u8 type Key: ARROW-2425 URL: https://issues.apache.org/jira/browse/ARROW-2425 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-2426) [CI] glib build failure

2018-04-09 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2426: - Summary: [CI] glib build failure Key: ARROW-2426 URL: https://issues.apache.org/jira/browse/ARROW-2426 Project: Apache Arrow Issue Type: Bug Comp

[jira] [Created] (ARROW-2427) [C++] ReadAt implementations suboptimal

2018-04-09 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2427: - Summary: [C++] ReadAt implementations suboptimal Key: ARROW-2427 URL: https://issues.apache.org/jira/browse/ARROW-2427 Project: Apache Arrow Issue Type: Im

Tasks for upcoming Hackathons & Sprints

2018-04-09 Thread Uwe L. Korn
Hello all, in the next weeks and months some of us are taking part in Hackathons and Sprints and hope to attract new people to Arrow development. This includes: * MAN AHL Hackathon in 2 weeks: https://www.ahl.com/hackathon * PyCon US in May: https://us.pycon.org/2018/community/sprints/ * PyCon D

[jira] [Created] (ARROW-2428) [Python] Support ExtensionArrays in to_pandas conversion

2018-04-09 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2428: -- Summary: [Python] Support ExtensionArrays in to_pandas conversion Key: ARROW-2428 URL: https://issues.apache.org/jira/browse/ARROW-2428 Project: Apache Arrow Iss

[jira] [Created] (ARROW-2429) [Python] Timestamp unit in schema changes when writing to Parquet file then reading back

2018-04-09 Thread Dave Challis (JIRA)
Dave Challis created ARROW-2429: --- Summary: [Python] Timestamp unit in schema changes when writing to Parquet file then reading back Key: ARROW-2429 URL: https://issues.apache.org/jira/browse/ARROW-2429

Re: What do people think about a one day get together?

2018-04-09 Thread Jacques Nadeau
Hey all, given that several people are busy in June, let's wait until the fall. I'll take a look at the schedule of things and throw out a new idea in the next few months. Thanks! Jacques On Wed, Apr 4, 2018 at 10:17 AM, Wes McKinney wrote: > I'm +1 on the idea of an Arrow conference. June is a

Re: What do people think about a one day get together?

2018-04-09 Thread Julian Hyde
+1 The Arrow community would benefit greatly from a conference/unconference. Remember not to schedule it too close to ApacheCon. Julian > On Apr 9, 2018, at 10:18 AM, Jacques Nadeau wrote: > > Hey all, given that several people are busy in June, let's wait until the > fall. I'll take a look

[jira] [Created] (ARROW-2430) MVP for branch based packaging automation

2018-04-09 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-2430: -- Summary: MVP for branch based packaging automation Key: ARROW-2430 URL: https://issues.apache.org/jira/browse/ARROW-2430 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-2431) [Rust] Schema fidelity

2018-04-09 Thread Maximilian Roos (JIRA)
Maximilian Roos created ARROW-2431: -- Summary: [Rust] Schema fidelity Key: ARROW-2431 URL: https://issues.apache.org/jira/browse/ARROW-2431 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-2432) [Python] from_pandas fails when converting decimals if contain None

2018-04-09 Thread Bryan Cutler (JIRA)
Bryan Cutler created ARROW-2432: --- Summary: [Python] from_pandas fails when converting decimals if contain None Key: ARROW-2432 URL: https://issues.apache.org/jira/browse/ARROW-2432 Project: Apache Arrow

Tensor column types in arrow

2018-04-09 Thread Leif Walsh
Hi all, I’ve been doing some work lately with Spark’s ML interfaces, which include sparse and dense Vector and Matrix types, backed on the Scala side by Breeze. Using these interfaces, you can construct DataFrames whose column types are vectors and matrices, and though the API isn’t terribly rich,

Re: Tensor column types in arrow

2018-04-09 Thread Li Jin
As far as I know, there is an implementation of tensor type in C++/Python already. Should we just finalize the spec and add implementation to Java? On the Spark side, it's probably more complicated as Vector and Matrix are not "first class" types in Spark SQL. Spark ML implements them as UDT (user

Re: Tensor column types in arrow

2018-04-09 Thread Leif Walsh
The tensor type in the C++ API is a stand-alone object afaict; Phillip and I were unable to construct an Arrow column of them. I agree that it's a good starting point. One interpretation of what I'm suggesting is that we take it as the reference implementation, add it to the spec, and write the jav

Re: Tensor column types in arrow

2018-04-09 Thread Wes McKinney
> As far as I know, there is an implementation of tensor type in C++/Python > already. Should we just finalize the spec and add implementation to Java? There is nothing specified yet as far as a *column* of ndarrays/tensors. We defined Tensor metadata for the purposes of IPC/serialization but mad

[jira] [Created] (ARROW-2433) [Rust] Add Builder.push_slice(&[T])

2018-04-09 Thread Andy Grove (JIRA)
Andy Grove created ARROW-2433: - Summary: [Rust] Add Builder.push_slice(&[T]) Key: ARROW-2433 URL: https://issues.apache.org/jira/browse/ARROW-2433 Project: Apache Arrow Issue Type: Improvement

Re: Tensor column types in arrow

2018-04-09 Thread Leif Walsh
My gut feeling is that such a column type should specify both the shape and primitive type of all values in the column. I can’t think of a common use case that requires differently shaped tensors in a single column. Can anyone here come up with such a use case? If not, I can try to draft a propos
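
The idea in this message, a column type that fixes both the shape and the primitive type of every value, can be sketched roughly as below. This is only an illustration under stated assumptions: `FixedShapeTensorColumn` and its methods are hypothetical names, the flat row-major storage is an assumed layout, and none of it reflects the Arrow spec or any existing Arrow API.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Sequence, Tuple


@dataclass
class FixedShapeTensorColumn:
    """Hypothetical column whose tensors all share one shape and one primitive type."""

    shape: Tuple[int, ...]
    dtype: type  # e.g. int or float; stands in for an Arrow primitive type
    values: List = field(default_factory=list)  # flat, row-major storage (assumed)

    def __post_init__(self) -> None:
        if any(dim <= 0 for dim in self.shape):
            raise ValueError("every dimension must be positive")

    @property
    def elements_per_tensor(self) -> int:
        return prod(self.shape)

    def append(self, flat_tensor: Sequence) -> None:
        # Reject values that do not match the column-level shape or type.
        if len(flat_tensor) != self.elements_per_tensor:
            raise ValueError(
                f"expected {self.elements_per_tensor} elements, got {len(flat_tensor)}"
            )
        if not all(isinstance(x, self.dtype) for x in flat_tensor):
            raise TypeError(f"all elements must be {self.dtype.__name__}")
        self.values.extend(flat_tensor)

    def __len__(self) -> int:
        return len(self.values) // self.elements_per_tensor

    def __getitem__(self, index: int) -> List:
        if not 0 <= index < len(self):
            raise IndexError(index)
        n = self.elements_per_tensor
        return self.values[index * n:(index + 1) * n]


column = FixedShapeTensorColumn(shape=(2, 3), dtype=float)
column.append([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

Because shape and type live on the column rather than on each value, every tensor occupies the same number of slots and can be located by simple offset arithmetic. Supporting differently shaped tensors in one column would need per-value shape metadata instead.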

[jira] [Created] (ARROW-2434) [Rust] Add windows support

2018-04-09 Thread Paddy Horan (JIRA)
Paddy Horan created ARROW-2434: -- Summary: [Rust] Add windows support Key: ARROW-2434 URL: https://issues.apache.org/jira/browse/ARROW-2434 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-2435) [Rust] Add memory pool abstraction.

2018-04-09 Thread Renjie Liu (JIRA)
Renjie Liu created ARROW-2435: - Summary: [Rust] Add memory pool abstraction. Key: ARROW-2435 URL: https://issues.apache.org/jira/browse/ARROW-2435 Project: Apache Arrow Issue Type: Improvement

Re: Rust Arrow status and plans for this week

2018-04-09 Thread Jacques Nadeau
Super cool, congrats on the progress! The IPC/interop is top priority for me as well. On Mon, Apr 9, 2018 at 6:26 AM, Andy Grove wrote: > Over the weekend I added preliminary Parquet support to DataFusion (it only > supports int/float primitives and UTF8 so far). This was possible due to > the

Re: Rust Arrow status and plans for this week

2018-04-09 Thread Renjie Liu
Cool! I'm also trying to use arrow-rs in my project and would like to contribute to arrow-rs, can anybody give me contributor permission? On Tue, Apr 10, 2018 at 10:31 AM Jacques Nadeau wrote: > Super cool, congrats on the progress! > > The IPC/interop is top priority for me as well. > > On Mon,

[jira] [Created] (ARROW-2436) [Rust] Add windows CI

2018-04-09 Thread Paddy Horan (JIRA)
Paddy Horan created ARROW-2436: -- Summary: [Rust] Add windows CI Key: ARROW-2436 URL: https://issues.apache.org/jira/browse/ARROW-2436 Project: Apache Arrow Issue Type: Improvement Comp

Re: Rust Arrow status and plans for this week

2018-04-09 Thread Uwe L. Korn
Hello Renjie, I can give you contributor permissions on JIRA so you can assign issues to yourself. I would need to know your JIRA id for that. Code contributions happen via pull requests on GitHub. Just fork the project, open a new branch and, once it's ready, make a pull request to the main arro