[jira] [Created] (ARROW-6942) [Developer] Improve github actions

2019-10-18 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-6942: -- Summary: [Developer] Improve github actions Key: ARROW-6942 URL: https://issues.apache.org/jira/browse/ARROW-6942 Project: Apache Arrow Issue Type: Impro

Re: [Discuss][FlightRPC] Extensions to Flight: "DoBidirectional"

2019-10-18 Thread Wes McKinney
I'm supportive of having a bidirectional API for the reasons stated in the document. There seem to be some details to work out but probably nothing insurmountable. It seems that how errors and metadata are handled is one of the open questions. On Wed, Oct 16, 2019 at 7:13 PM David Li wrote: > >

[jira] [Created] (ARROW-6941) [C++] Unpin gtest in build environment

2019-10-18 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6941: --- Summary: [C++] Unpin gtest in build environment Key: ARROW-6941 URL: https://issues.apache.org/jira/browse/ARROW-6941 Project: Apache Arrow Issue Type: Improve

[jira] [Created] (ARROW-6940) [C++] Expose Message-level IPC metadata in both read and write interfaces

2019-10-18 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6940: --- Summary: [C++] Expose Message-level IPC metadata in both read and write interfaces Key: ARROW-6940 URL: https://issues.apache.org/jira/browse/ARROW-6940 Project: Apache

Re: [DISCUSS] Result vs Status

2019-10-18 Thread Antoine Pitrou
Le 18/10/2019 à 20:58, Wes McKinney a écrit : I'm definitely uncomfortable with the idea of deprecating Status. We have a few kinds of functions that can fail: 1. Functions with no "out" arguments 2. Functions with one out argument 3. Functions with multiple out arguments IMHO functions in c

Re: [DISCUSS] Result vs Status

2019-10-18 Thread Micah Kornfield
Hi Wes, Sorry for the confusion; I agree completely with what you wrote. I was only thinking about scenarios 2 and 3 (where it makes sense) in my previous email. TL;DR: in the long term I don't think we should be supporting semantically equivalent APIs for both Status and Result. I'll see if I can

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-18-0

2019-10-18 Thread Wes McKinney
I appear to have fixed the wheels, will merge the fix once the Crossbow jobs complete. There are some other issues that need to get done for 0.15.1 so I'll look at those as soon as I can On Fri, Oct 18, 2019 at 3:35 PM Krisztián Szűcs wrote: > > On Fri, Oct 18, 2019 at 3:47 PM Wes McKinney wrote

Re: [DISCUSS] Result vs Status

2019-10-18 Thread Wes McKinney
On Fri, Oct 18, 2019 at 7:58 PM Wes McKinney wrote: > > I'm definitely uncomfortable with the idea of deprecating Status. > > We have a few kinds of functions that can fail: > > 1. Functions with no "out" arguments > 2. Functions with one out argument > 3. Functions with multiple out arguments > >

Re: [DISCUSS] Result vs Status

2019-10-18 Thread Wes McKinney
I'm definitely uncomfortable with the idea of deprecating Status. We have a few kinds of functions that can fail: 1. Functions with no "out" arguments 2. Functions with one out argument 3. Functions with multiple out arguments IMHO functions in category 2 are the best candidates for utilizing St

[jira] [Created] (ARROW-6939) [Packaging][Crossbow] Always upload binary artifacts regardless of the test result

2019-10-18 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-6939: -- Summary: [Packaging][Crossbow] Always upload binary artifacts regardless of the test result Key: ARROW-6939 URL: https://issues.apache.org/jira/browse/ARROW-6939

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-18-0

2019-10-18 Thread Krisztián Szűcs
On Fri, Oct 18, 2019 at 3:47 PM Wes McKinney wrote: > I'm going to look at the wheel now. > > Are the Crossbow builds configured to upload the artifact(s) to > Appveyor regardless of whether they are good or not? It would be nice > to always have the artifact available to help with diagnosing bui

[jira] [Created] (ARROW-6938) [Python] Windows wheel depends on zstd.dll and libbz2.dll, which are not bundled

2019-10-18 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6938: --- Summary: [Python] Windows wheel depends on zstd.dll and libbz2.dll, which are not bundled Key: ARROW-6938 URL: https://issues.apache.org/jira/browse/ARROW-6938 Project:

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-18-0

2019-10-18 Thread Wes McKinney
I'm going to look at the wheel now. Are the Crossbow builds configured to upload the artifact(s) to Appveyor regardless of whether they are good or not? It would be nice to always have the artifact available to help with diagnosing build failures Thanks On Fri, Oct 18, 2019 at 12:47 PM Krisztián

Re: Possible Arrow 0.15.1 release

2019-10-18 Thread Wes McKinney
hi Krisz, Thanks for the update. I'm going to look at the Windows wheels in my VM to see if I can determine what's wrong and fix it. I think because of the wheel issue we may have to include the compression library option changes in 0.15.1 because otherwise there may be some dependency between t

Re: Possible Arrow 0.15.1 release

2019-10-18 Thread Krisztián Szűcs
So it seems like there are still 5 unresolved issues in the release: https://issues.apache.org/jira/projects/ARROW/versions/12346358 I won't be available during the weekend, but I can cut the release on Monday. The packaging builds should be fixed except the Windows wheels, although this problem mig

Re: Horizontal scaling design suggestion: Apache arrow flight

2019-10-18 Thread Ryan Murray
Hey Vinay, This Spark source might be of interest [1]. We had discussed the possibility of it being moved into Arrow proper as a contrib module when more stable. This is doing something similar to what you are suggesting: talking to a cluster of Flight servers from Spark. This deals more with the

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-18-0

2019-10-18 Thread Krisztián Szűcs
wheel-osx-* and conda-linux-* tasks should be resolved by https://github.com/apache/arrow/pull/5647 The windows wheels are having a linking problem which is hard to investigate (error doesn't provide any context about which DLL is missing). I have produced a faulty wheel at [1]. [1]: https://ci.a

Re: [Discuss] Streaming: Differentiate between length of RecordBatch and utilized portion-- common use-case?

2019-10-18 Thread John Muehlhausen
Perhaps what I should do is study the batch creation process in the reference implementation and see whether an alternative approach can be a lot more efficient in time while being less efficient in space. And if so, whether this new approach also requires a differentiation between batch length an

Re: [DISCUSS] [Rust] Adding support for Flight protocol

2019-10-18 Thread Lucio Franco
Hi all! I am the author of Tonic, I'd love to see the rust flight implementation done with Tonic. David, it looks like what you implemented with tower-grpc should work just fine with tonic as well. I am also interested in helping out. Since, I guess most of my experience up to now is with Tonic

Re: [ANNOUNCE] New Arrow committer: Eric Erhardt

2019-10-18 Thread Krisztián Szűcs
Congrats! On Fri, Oct 18, 2019 at 12:35 PM Bryan Cutler wrote: > Congrats! > > On Thu, Oct 17, 2019, 6:26 PM Fan Liya wrote: > > > Congrats Eric! > > > > Best, > > Liya Fan > > > > On Fri, Oct 18, 2019 at 3:06 AM paddy horan > > wrote: > > > > > Congrats Eric! > > > > > > _

Re: [ANNOUNCE] New Arrow committer: Eric Erhardt

2019-10-18 Thread Bryan Cutler
Congrats! On Thu, Oct 17, 2019, 6:26 PM Fan Liya wrote: > Congrats Eric! > > Best, > Liya Fan > > On Fri, Oct 18, 2019 at 3:06 AM paddy horan > wrote: > > > Congrats Eric! > > > > > > From: Micah Kornfield > > Sent: Thursday, October 17, 2019 12:45:15 PM > > To

Horizontal scaling design suggestion: Apache arrow flight

2019-10-18 Thread Vinay Kesarwani
Hi, I am trying to establish the following architecture. My approach for flight horizontal scaling is to launch 1-Apache flight server in each node 2-one node declared as coordinator 3-Publish coordinator info to a shared service [zookeeper] 4-Launch worker node --> get coordinator node info from [zoo

[jira] [Created] (ARROW-6937) [Packaging][Python] Fix conda linux and OSX wheel nightly builds

2019-10-18 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-6937: -- Summary: [Packaging][Python] Fix conda linux and OSX wheel nightly builds Key: ARROW-6937 URL: https://issues.apache.org/jira/browse/ARROW-6937 Project: Apache Ar

[jira] [Created] (ARROW-6936) [Python] Improve error message when object of wrong type is given

2019-10-18 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6936: - Summary: [Python] Improve error message when object of wrong type is given Key: ARROW-6936 URL: https://issues.apache.org/jira/browse/ARROW-6936 Project: Apache Arr

Re: pyarrow and pyzmq no copy

2019-10-18 Thread Antoine Pitrou
Hi, Your example works here with the latest development version of Arrow, under Python 3.7. Regards Antoine. Le 18/10/2019 à 00:07, seshu yamajala a écrit : I would like to use pyarrow with pyzmq no copy to send dicts of arrays across the network without having to make copies of the arra

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-10-18-0

2019-10-18 Thread Neal Richardson
docker-r-sanitizer failure is a failure to install a dependency: https://circleci.com/gh/ursa-labs/crossbow/3945 I'll watch it and make a ticket if the failure persists. Neal On Fri, Oct 18, 2019 at 5:01 AM Crossbow wrote: > > > Arrow Build Report for Job nightly-2019-10-18-0 > > All tasks: >

[jira] [Created] (ARROW-6935) [Java] Improve the performance of comparing two blocks of heap data

2019-10-18 Thread Liya Fan (Jira)
Liya Fan created ARROW-6935: --- Summary: [Java] Improve the performance of comparing two blocks of heap data Key: ARROW-6935 URL: https://issues.apache.org/jira/browse/ARROW-6935 Project: Apache Arrow

[jira] [Created] (ARROW-6934) [Python] Choose string column encoding in csv reader

2019-10-18 Thread Sascha Hofmann (Jira)
Sascha Hofmann created ARROW-6934: - Summary: [Python] Choose string column encoding in csv reader Key: ARROW-6934 URL: https://issues.apache.org/jira/browse/ARROW-6934 Project: Apache Arrow I

[NIGHTLY] Arrow Build Report for Job nightly-2019-10-18-0

2019-10-18 Thread Crossbow
Arrow Build Report for Job nightly-2019-10-18-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-18-0 Failed Tasks: - conda-linux-gcc-py27: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-18-0-azure-conda-linux-gcc-py27 - cond

[jira] [Created] (ARROW-6933) [Java] Support linear dictionary encoder

2019-10-18 Thread Liya Fan (Jira)
Liya Fan created ARROW-6933: --- Summary: [Java] Support linear dictionary encoder Key: ARROW-6933 URL: https://issues.apache.org/jira/browse/ARROW-6933 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-6932) incorrect log on known extension type

2019-10-18 Thread stephane campinas (Jira)
stephane campinas created ARROW-6932: Summary: incorrect log on known extension type Key: ARROW-6932 URL: https://issues.apache.org/jira/browse/ARROW-6932 Project: Apache Arrow Issue Type

Re: [DISCUSS] Result vs Status

2019-10-18 Thread Micah Kornfield
Based on the call this week, I think there are a few related questions here. 1. Should we use Result at all? - IMO Result expresses APIs more naturally than Status + Single output parameter. I think most would agree if we had it from the beginning we would probably use it. - The downside to
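
For readers following the Result vs Status thread, here is a minimal sketch of the two API shapes under discussion for a function with a single output. The `Status` and `Result` classes below are toy stand-ins written for this illustration, not Arrow's actual classes, and `ParseDigits` is a hypothetical function that does not come from the thread.

```cpp
#include <string>
#include <utility>
#include <variant>

namespace sketch {

// Toy stand-in for a status type: either OK or an error message.
// (Hypothetical; not Arrow's real class.)
class Status {
 public:
  static Status OK() { return Status(); }
  static Status Invalid(std::string msg) {
    Status s;
    s.ok_ = false;
    s.msg_ = std::move(msg);
    return s;
  }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  bool ok_ = true;
  std::string msg_;
};

// Toy stand-in for a result type: holds either a value or a non-OK Status.
template <typename T>
class Result {
 public:
  Result(T value) : repr_(std::move(value)) {}
  Result(Status status) : repr_(std::move(status)) {}
  bool ok() const { return std::holds_alternative<T>(repr_); }
  // Throws std::bad_variant_access if the result holds an error.
  const T& value() const { return std::get<T>(repr_); }
  Status status() const {
    return ok() ? Status::OK() : std::get<Status>(repr_);
  }

 private:
  std::variant<T, Status> repr_;
};

// Shape A: Status return plus one "out" argument.
// Overflow is ignored to keep the sketch short.
Status ParseDigits(const std::string& text, int* out) {
  if (text.empty()) return Status::Invalid("empty input");
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return Status::Invalid("not a digit");
    value = value * 10 + (c - '0');
  }
  *out = value;
  return Status::OK();
}

// Shape B: the same operation, with the value carried in the return type.
Result<int> ParseDigits(const std::string& text) {
  int value = 0;
  Status st = ParseDigits(text, &value);
  if (!st.ok()) return st;
  return value;
}

}  // namespace sketch
```

With shape A the caller declares `int out` first and checks the returned status; with shape B the caller checks `ok()` on the returned object and then reads `value()`. The sketch only contrasts the call shapes and takes no position on which one the project should prefer.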