[jira] [Created] (ARROW-5050) [C++] cares_ep should build before grpc_ep

2019-03-27 Thread Kenta Murata (JIRA)
Kenta Murata created ARROW-5050: --- Summary: [C++] cares_ep should build before grpc_ep Key: ARROW-5050 URL: https://issues.apache.org/jira/browse/ARROW-5050 Project: Apache Arrow Issue Type:

Re: [DISCUSS][Format] Time Interval Changes

2019-03-27 Thread Micah Kornfield
Hi Wes, Thanks for the feedback. I'm happy to update the PR to include C++ and Python once there is consensus on the format change. I'd also welcome feedback and an extra set of eyes on the issues I raised below, since it is hard to change once we make a release. Based on previous discussions,

[jira] [Created] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark

2019-03-27 Thread Tiger068 (JIRA)
Tiger068 created ARROW-5049: --- Summary: [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark Key: ARROW-5049 URL: https://issues.apache.org/jira/browse/ARROW-5049

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Kouhei Sutou
Hi, > TEST_DEFAULT=0 TEST_RUST=1 dev/release/verify-release-candidate.sh source > 0.13.0 3 > reports Rust format error: > https://issues.apache.org/jira/browse/ARROW-5044 I've fixed this. And other problems found in RC3 are also fixed. I'll create RC4. Thanks, -- kou In

[jira] [Created] (ARROW-5048) [Release][Rust] arrow-testing is missing in verification script

2019-03-27 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5048: --- Summary: [Release][Rust] arrow-testing is missing in verification script Key: ARROW-5048 URL: https://issues.apache.org/jira/browse/ARROW-5048 Project: Apache Arrow

[jira] [Created] (ARROW-5047) [Release] Always set up parquet-testing in verification script

2019-03-27 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5047: --- Summary: [Release] Always set up parquet-testing in verification script Key: ARROW-5047 URL: https://issues.apache.org/jira/browse/ARROW-5047 Project: Apache Arrow

[jira] [Created] (ARROW-5046) [Release][C++] Plasma test is fragile in verification script

2019-03-27 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5046: --- Summary: [Release][C++] Plasma test is fragile in verification script Key: ARROW-5046 URL: https://issues.apache.org/jira/browse/ARROW-5046 Project: Apache Arrow

[jira] [Created] (ARROW-5045) [Rust] Code coverage silently failing in CI

2019-03-27 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5045: - Summary: [Rust] Code coverage silently failing in CI Key: ARROW-5045 URL: https://issues.apache.org/jira/browse/ARROW-5045 Project: Apache Arrow Issue Type:

Re: [Rust] Code coverage

2019-03-27 Thread Andy Grove
I see now that code coverage stopped working a long time ago in CI ... it is silently failing with the following error. I'll file a JIRA and start researching this. error: could not execute process `target/kcov-master/build/src/kcov --verify --include-path=/home/travis/build/apache/arrow/rust

[Rust] Code coverage

2019-03-27 Thread Andy Grove
I'd like to add Rust code coverage to CI but so far I haven't had much luck even getting it working locally with "cargo kcov" (it produces an empty report). I was wondering if anyone else has this working and could share the commands / configs needed? Thanks, Andy.

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Wes McKinney
I ran the release verification with Ben's changes and it works fine, so I'll merge that patch. I also built and checked the Python wheel in an isolated environment on Python 3.7 and it seems to be OK (including Parquet and Gandiva extensions). So whatever is wrong in CI must be something else On

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Kouhei Sutou
Hi, TEST_DEFAULT=0 TEST_RUST=1 dev/release/verify-release-candidate.sh source 0.13.0 3 reports Rust format error: https://issues.apache.org/jira/browse/ARROW-5044 Could someone fix this? It seems that we need to use Rust stable to check format. Thanks, -- kou In

[jira] [Created] (ARROW-5044) [Release][Rust] Format error in verification script

2019-03-27 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5044: --- Summary: [Release][Rust] Format error in verification script Key: ARROW-5044 URL: https://issues.apache.org/jira/browse/ARROW-5044 Project: Apache Arrow Issue

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Krisztián Szűcs
Yes, check the Windows wheels please. We test them in a conda env [1], so theoretically they should be fine. [1]: https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/win-build.bat#L76-L88 On Wed, Mar 27, 2019 at 10:09 PM Wes McKinney wrote: > That's definitely a problem, we

[jira] [Created] (ARROW-5043) [Release][Ruby] red-arrow dependency can't be resolve in verification script

2019-03-27 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5043: --- Summary: [Release][Ruby] red-arrow dependency can't be resolve in verification script Key: ARROW-5043 URL: https://issues.apache.org/jira/browse/ARROW-5043 Project:

[jira] [Created] (ARROW-5042) [Release] Wrong ARROW_DEPENDENCY_SOURCE in verification script

2019-03-27 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5042: --- Summary: [Release] Wrong ARROW_DEPENDENCY_SOURCE in verification script Key: ARROW-5042 URL: https://issues.apache.org/jira/browse/ARROW-5042 Project: Apache Arrow

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Wes McKinney
That's definitely a problem, we can't ship broken wheels. I can take a look at the wheels locally in the next couple of hours if that helps, but getting them tested again in CI would be the best thing On Wed, Mar 27, 2019 at 4:03 PM Antoine Pitrou wrote: > > > Forgot the link, sorry: >

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Antoine Pitrou
Forgot the link, sorry: https://github.com/apache/arrow/pull/4015 Regards Antoine. Le 27/03/2019 à 22:02, Antoine Pitrou a écrit : > > Unsure this is related, but there is also a problem that Windows wheels > are not tested anymore on AppVeyor (and actually fail if you re-enable > the

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Antoine Pitrou
Unsure this is related, but there is also a problem that Windows wheels are not tested anymore on AppVeyor (and actually fail if you re-enable the test). Regards Antoine. Le 27/03/2019 à 22:00, Wes McKinney a écrit : > Thanks Kou. This is a good learning experience. We will continue to do >

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Wes McKinney
Thanks Kou. This is a good learning experience. We will continue to do better with each release On Wed, Mar 27, 2019 at 3:57 PM Kouhei Sutou wrote: > > Hi, > > > What controls do you think we can put in place to prevent the issues > > you had with RC0-2? For example, I merged a patch that

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Kouhei Sutou
Hi, > Technically such a bugfix in 02-source.sh does not require > invalidating an RC because the produced artifacts are unaffected. But > it's definitely a rough edge in the RM process and best avoided Yes. I found some problems for release scripts: *

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Kouhei Sutou
Hi, > What controls do you think we can put in place to prevent the issues > you had with RC0-2? For example, I merged a patch that affected the > dev/release/02-source.sh script, in the future I could test that out > (though perhaps we should develop an automated way to test these > scripts)

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Kouhei Sutou
Hi, > I'm leaning towards -1 because there are unforeseen CUDA memory leaks > in release mode. Related to > https://issues.apache.org/jira/browse/ARROW-5029. OK. I'll create RC4 that fixes this problem. We can also include other changes carefully until I create RC4. Thanks, -- kou In

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Kouhei Sutou
Hi, > However, source verification failed with the following messages during > the C++ build: > > CMake Error at > /usr/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:137 > (message): > Could NOT find DoubleConversion (missing: DoubleConversion_LIB >

Re: [DISCUSS][Format] Time Interval Changes

2019-03-27 Thread Wes McKinney
hi Micah, Sorry for the delay. I'm in favor of introducing the Duration/DurationInterval type to unblock the difference-of-timestamps / timedelta use case that many Arrow users have. I'd like Jacques or someone from the Java side to comment about this before starting a vote. We can merge these

[jira] [Created] (ARROW-5041) [C++] use vendored gtest and gmock in verify-release-candidate.bat

2019-03-27 Thread Benjamin Kietzman (JIRA)
Benjamin Kietzman created ARROW-5041: Summary: [C++] use vendored gtest and gmock in verify-release-candidate.bat Key: ARROW-5041 URL: https://issues.apache.org/jira/browse/ARROW-5041 Project:

Re: Java allocate buffer code

2019-03-27 Thread Hitesh
Hi Siddharth: Thanks. Yes, I am referring to the compound buffer as the extra buffer. Once we release it, can it be reused? Let's take an example of 1000 ints. Then it will need the following bytes. getValidityBufferSize: 125, value bufferSize: 4000, combinedSize: 4128, combinedSizeWith2ThePower: 8192
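
The figures quoted for 1000 ints can be reproduced with a few lines of arithmetic. The sketch below is only an illustration, not the Arrow Java allocator code: the function names are hypothetical, and padding the validity buffer to an 8-byte boundary is an assumption chosen because it yields the quoted combined size of 4128.

```python
def round_up_to_multiple_of_8(n: int) -> int:
    """Round n up to the next multiple of 8 (assumed padding rule)."""
    return (n + 7) // 8 * 8


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    if n <= 1:
        return 1
    return 1 << (n - 1).bit_length()


def fixed_width_buffer_sizes(value_count: int, type_width: int) -> dict:
    """Hypothetical helper mirroring the numbers in the message above."""
    validity = (value_count + 7) // 8   # one validity bit per value
    data = value_count * type_width     # type_width bytes per value
    # Assumption: the validity part is padded to 8 bytes before combining.
    combined = round_up_to_multiple_of_8(validity) + data
    allocated = next_power_of_two(combined)
    return {
        "validity": validity,
        "data": data,
        "combined": combined,
        "allocated": allocated,
        "slack": allocated - combined,
    }


sizes = fixed_width_buffer_sizes(value_count=1000, type_width=4)
print(sizes)
```

Under these assumptions, 1000 four-byte ints give 125, 4000, 4128 and 8192 bytes, leaving 4064 bytes of the power-of-two allocation beyond the combined size.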

[jira] [Created] (ARROW-5040) [C++] ArrayFromJSON can't parse Timestamp from strings

2019-03-27 Thread Benjamin Kietzman (JIRA)
Benjamin Kietzman created ARROW-5040: Summary: [C++] ArrayFromJSON can't parse Timestamp from strings Key: ARROW-5040 URL: https://issues.apache.org/jira/browse/ARROW-5040 Project: Apache Arrow

Re: Java allocate buffer code

2019-03-27 Thread Siddharth Teotia
Hi Hitesh, The code you referenced allocates data and validity buffers for a fixed width vector. It first determines the appropriate buffer size for a given value count and then allocates a compound buffer. The compound buffer is then sliced to get data and validity buffers and finally compound

Java allocate buffer code

2019-03-27 Thread Hitesh Khamesra
Hi All: I was looking at the following code, which releases the extra allocated buffer. It seems it should be treating actualCount as "valueCount*typeWidth", then calculating the extra buffer and releasing it. Right now, it calculates based on the actually allocated size, which does not match the intent. Any

[Discuss][Format] Arrow Flight URI scheme proposal

2019-03-27 Thread David Li
Hi all, We'd like to propose a URI scheme for Flight, in anticipation of supporting multiple transports, and different configurations of the gRPC transport. This will change Flight.proto[1] in format/ in a backwards-incompatible way. This aims to fix ARROW-4651[2]. The proposal can be found here

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Wes McKinney
Okay, reviewing the actual problem I agree with doing an RC4 and fixing the memory leak. I hope we can prevent this from happening in the future through some of the physical build cluster work that is ongoing. I suspect Kou is asleep right now so we should see if there are any other problems that

Re: tensorflow-io Arrow Datasets and thoughts on support for tensor columns

2019-03-27 Thread Bryan Cutler
Thanks Wes! I am most interested in the last option, adding Tensor as a logical type, but if it makes sense to embed as a BinaryArray for a first step then that would still be useful too. I'll work on a design doc with a use case and report back. I know there are a lot of different efforts going

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Francois Saint-Jacques
Wrong link, https://issues.apache.org/jira/browse/ARROW-5036 François On Wed, Mar 27, 2019 at 12:25 PM Francois Saint-Jacques < fsaintjacq...@gmail.com> wrote: > 0 (non-binding) > > The verification script doesn't compile out of the box on ubuntu 18.04 > (fixed by forcing

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Antoine Pitrou
There's also a leak in compressed streams, though it's probably smaller... See https://github.com/apache/arrow/pull/4049#pullrequestreview-219550038 Regards Antoine. Le 27/03/2019 à 18:54, Wes McKinney a écrit : > Since we aren't testing CUDA extensions in CI yet I am not sure we should >

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Antoine Pitrou
I think this is a regression, though. Regards Antoine. Le 27/03/2019 à 18:54, Wes McKinney a écrit : > Since we aren't testing CUDA extensions in CI yet I am not sure we should > block over issues with them. It seems like this (plus CI on physical > hardware) can be resolved for 0.14. > >

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Wes McKinney
Since we aren't testing CUDA extensions in CI yet I am not sure we should block over issues with them. It seems like this (plus CI on physical hardware) can be resolved for 0.14. On Wed, Mar 27, 2019, 3:42 PM Antoine Pitrou wrote: > On Wed, 27 Mar 2019 20:36:14 +0900 (JST) > Kouhei Sutou

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Wes McKinney
The Windows verification script has to be updated to use vendored gtest and gmock. I accounted for this in the new C++ developer documentation but have not updated the script yet. On Wed, Mar 27, 2019, 5:23 PM Ben Kietzman wrote: > Running verification script on Windows10 I get a link error

[jira] [Created] (ARROW-5039) [Rust] [DataFusion] Fix bugs in CAST support

2019-03-27 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5039: - Summary: [Rust] [DataFusion] Fix bugs in CAST support Key: ARROW-5039 URL: https://issues.apache.org/jira/browse/ARROW-5039 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-5038) [Rust] [DataFusion] Implement AVG aggregate function

2019-03-27 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5038: - Summary: [Rust] [DataFusion] Implement AVG aggregate function Key: ARROW-5038 URL: https://issues.apache.org/jira/browse/ARROW-5038 Project: Apache Arrow Issue

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Francois Saint-Jacques
0 (non-binding) The verification script doesn't compile out of the box on Ubuntu 18.04 (fixed by forcing ARROW_DEPENDENCY_SOURCE=BUNDLED). Afterward, I hit a failure I can't reproduce outside the verification script, e.g. in a local build with/without conda. 1:

[jira] [Created] (ARROW-5037) [Rust] [DataFusion] Refactor aggregate module

2019-03-27 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5037: - Summary: [Rust] [DataFusion] Refactor aggregate module Key: ARROW-5037 URL: https://issues.apache.org/jira/browse/ARROW-5037 Project: Apache Arrow Issue Type:

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Ben Kietzman
Running the verification script on Windows 10, I get a link error while compiling compute-test: unresolved external symbol "class testing::internal::Mutex testing::internal::g_gmock_mutex" On Wed, Mar 27, 2019 at 10:42 AM Antoine Pitrou wrote: > > On Wed, 27 Mar 2019 20:36:14 +0900 (JST) >

[jira] [Created] (ARROW-5036) [C++] Serialization tests resort to memcpy to check equality

2019-03-27 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-5036: - Summary: [C++] Serialization tests resort to memcpy to check equality Key: ARROW-5036 URL: https://issues.apache.org/jira/browse/ARROW-5036

[jira] [Created] (ARROW-5035) [C#] ArrowBuffer.Builder is broken

2019-03-27 Thread Eric Erhardt (JIRA)
Eric Erhardt created ARROW-5035: --- Summary: [C#] ArrowBuffer.Builder is broken Key: ARROW-5035 URL: https://issues.apache.org/jira/browse/ARROW-5035 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-5034) [C#] ArrowStreamWriter should expose synchronous Write methods

2019-03-27 Thread Eric Erhardt (JIRA)
Eric Erhardt created ARROW-5034: --- Summary: [C#] ArrowStreamWriter should expose synchronous Write methods Key: ARROW-5034 URL: https://issues.apache.org/jira/browse/ARROW-5034 Project: Apache Arrow

[jira] [Created] (ARROW-5033) [C++] JSON table writer

2019-03-27 Thread Benjamin Kietzman (JIRA)
Benjamin Kietzman created ARROW-5033: Summary: [C++] JSON table writer Key: ARROW-5033 URL: https://issues.apache.org/jira/browse/ARROW-5033 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-5032) [C++] Headers in vendored/datetime directory aren't installed

2019-03-27 Thread Kenta Murata (JIRA)
Kenta Murata created ARROW-5032: --- Summary: [C++] Headers in vendored/datetime directory aren't installed Key: ARROW-5032 URL: https://issues.apache.org/jira/browse/ARROW-5032 Project: Apache Arrow

[jira] [Created] (ARROW-5031) [Dev] Release verification script does not run CUDA tests in Python

2019-03-27 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-5031: - Summary: [Dev] Release verification script does not run CUDA tests in Python Key: ARROW-5031 URL: https://issues.apache.org/jira/browse/ARROW-5031 Project: Apache

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Antoine Pitrou
On Wed, 27 Mar 2019 20:36:14 +0900 (JST) Kouhei Sutou wrote: > Hi, > > I would like to propose the following release candidate (RC3) of Apache > Arrow version 0.13.0. This is a release consisting of 584 > resolved JIRA issues[1]. > > This release candidate is based on commit: >

[jira] [Created] (ARROW-5030) read_row_group fails with Nested data conversions not implemented for chunked array outputs

2019-03-27 Thread JIRA
Jakub Okoński created ARROW-5030: Summary: read_row_group fails with Nested data conversions not implemented for chunked array outputs Key: ARROW-5030 URL: https://issues.apache.org/jira/browse/ARROW-5030

[jira] [Created] (ARROW-5029) [C++] Compilation warnings in release mode

2019-03-27 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-5029: - Summary: [C++] Compilation warnings in release mode Key: ARROW-5029 URL: https://issues.apache.org/jira/browse/ARROW-5029 Project: Apache Arrow Issue

[jira] [Created] (ARROW-5028) Arrow->Parquet store drops and corrupts values

2019-03-27 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5028: Summary: Arrow->Parquet store drops and corrupts values Key: ARROW-5028 URL: https://issues.apache.org/jira/browse/ARROW-5028 Project: Apache Arrow Issue

Re: [VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Antoine Pitrou
Hi, Le 27/03/2019 à 12:36, Kouhei Sutou a écrit : > Hi, > > I would like to propose the following release candidate (RC3) of Apache > Arrow version 0.13.0. This is a release consisting of 584 > resolved JIRA issues[1]. > > This release candidate is based on commit: >

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Wes McKinney
Technically such a bugfix in 02-source.sh does not require invalidating an RC because the produced artifacts are unaffected. But it's definitely a rough edge in the RM process and best avoided On Wed, Mar 27, 2019 at 7:38 AM Francois Saint-Jacques wrote: > > An automated nightly release is for

Re: [Format] For all null arrays are data-buffers required?

2019-03-27 Thread Wes McKinney
Let me fix a typo On Wed, Mar 27, 2019 at 7:40 AM Wes McKinney wrote: > > hi Micah, > > I think it's most productive to view things through the lens of the > binary protocol (i.e. the IPC/RPC wire format) > > On Wed, Mar 27, 2019 at 1:59 AM Micah Kornfield wrote: > > > > Similar to how there is

Re: [Format] For all null arrays are data-buffers required?

2019-03-27 Thread Wes McKinney
hi Micah, I think it's most productive to view things through the lens of the binary protocol (i.e. the IPC/RPC wire format) On Wed, Mar 27, 2019 at 1:59 AM Micah Kornfield wrote: > > Similar to how there is no validity-buffer required in the format if > null_count == 0, is a similar

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Francois Saint-Jacques
An automated nightly release is for sure going to catch a lot of the tiny issues. Regarding the error in the script, do you need a full RC increase if one modifies the script, can you just modify it locally? François On Wed, Mar 27, 2019 at 8:30 AM Wes McKinney wrote: > Thanks Kou for being

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Wes McKinney
Thanks Kou for being the RM! What controls do you think we can put in place to prevent the issues you had with RC0-2? For example, I merged a patch that affected the dev/release/02-source.sh script, in the future I could test that out (though perhaps we should develop an automated way to test

[VOTE] Release Apache Arrow 0.13.0 - RC3

2019-03-27 Thread Kouhei Sutou
Hi, I would like to propose the following release candidate (RC3) of Apache Arrow version 0.13.0. This is a release consisting of 584 resolved JIRA issues[1]. This release candidate is based on commit: 6de758a9e5f74d50ec3458b2af17b7d2d892c573 [2] The source release rc3 is hosted at [3]. The

[jira] [Created] (ARROW-5027) [Python] Add JSON Reader

2019-03-27 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-5027: - Summary: [Python] Add JSON Reader Key: ARROW-5027 URL: https://issues.apache.org/jira/browse/ARROW-5027 Project: Apache Arrow Issue Type: Improvement

Re: Timeline for 0.13 Arrow release

2019-03-27 Thread Kouhei Sutou
Hi, I'm creating RC3(!). There were some problems in RC0-2. I hope that we can vote on RC3. We need to wait for the packages to be built and uploaded: https://github.com/kou/crossbow/branches/all?query=build-71 I'll be able to send the vote e-mail in 12 hours... Thanks, -- kou In "Re: Timeline

[Format] For all null arrays are data-buffers required?

2019-03-27 Thread Micah Kornfield
Similar to how there is no validity-buffer required in the format if null_count == 0, is a similar optimization for the "data buffer" allowed when (null_count == array length)? It seems that if all values are null, no data element should ever be accessed, but I couldn't find if this was ever