Re: [DISCUSS] Release cadence and release vote conventions

2019-07-31 Thread Sutou Kouhei
Hi, sorry for not replying to this thread. I think that the biggest problem is related to our Java package. We'll be able to resolve the GPG key problem by creating a GPG key used only for nightly release tests. We can share the test GPG key publicly because it's just for testing. It'll work for our

Re: [DISCUSS] Release cadence and release vote conventions

2019-07-31 Thread Wes McKinney
The PMC member and their GPG keys need to be in the loop at some point. The release artifacts can be produced by some kind of CI/CD system so long as the PMC member has confidence in the security of those artifacts before signing them. For example, we build the official binary packages on public CI

Re: [VOTE] Adopt FORMAT and LIBRARY SemVer-based version schemes for Arrow 1.0.0 and beyond

2019-07-31 Thread Bryan Cutler
+1 (non-binding) On Wed, Jul 31, 2019 at 8:59 AM Uwe L. Korn wrote: > +1 from me. > > I really like the separate versions > > Uwe > > On Tue, Jul 30, 2019, at 2:21 PM, Antoine Pitrou wrote: > > > > +1 from me. > > > > Regards > > > > Antoine. > > > > > > > > On Fri, 26 Jul 2019 14:33:30 -0500 >

Re: Ursabot configuration within Arrow

2019-07-31 Thread Krisztián Szűcs
We can now reproduce the builds locally (without the need of the web UI) with a single command. To demonstrate, building the master branch and building a pull request require the following commands: $ ursabot project build 'AMD64 Ubuntu 18.04 C++' $ ursabot project build -pr 'AMD64 Ubuntu 18.0

[jira] [Created] (ARROW-6091) Implement parallel execution for limit

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6091: - Summary: Implement parallel execution for limit Key: ARROW-6091 URL: https://issues.apache.org/jira/browse/ARROW-6091 Project: Apache Arrow Issue Type: Sub-task

[jira] [Created] (ARROW-6087) Implement parallel execution for CSV scan

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6087: - Summary: Implement parallel execution for CSV scan Key: ARROW-6087 URL: https://issues.apache.org/jira/browse/ARROW-6087 Project: Apache Arrow Issue Type: Sub-task

[jira] [Created] (ARROW-6090) Implement parallel execution for hash aggregate

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6090: - Summary: Implement parallel execution for hash aggregate Key: ARROW-6090 URL: https://issues.apache.org/jira/browse/ARROW-6090 Project: Apache Arrow Issue Type: Su

[jira] [Created] (ARROW-6089) Implement parallel execution for selection

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6089: - Summary: Implement parallel execution for selection Key: ARROW-6089 URL: https://issues.apache.org/jira/browse/ARROW-6089 Project: Apache Arrow Issue Type: Sub-tas

[jira] [Created] (ARROW-6088) Implement parallel execution for projection

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6088: - Summary: Implement parallel execution for projection Key: ARROW-6088 URL: https://issues.apache.org/jira/browse/ARROW-6088 Project: Apache Arrow Issue Type: Sub-ta

[jira] [Created] (ARROW-6086) Implement parallel execution for parquet scan

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6086: - Summary: Implement parallel execution for parquet scan Key: ARROW-6086 URL: https://issues.apache.org/jira/browse/ARROW-6086 Project: Apache Arrow Issue Type: Sub-

Re: New version(s) on JIRA

2019-07-31 Thread Antoine Pitrou
Ok, I've created it as well. Regards Antoine. On 31/07/2019 at 19:00, Wes McKinney wrote: > Yes, I think we need 0.15.0 for this > > On Wed, Jul 31, 2019 at 10:42 AM Antoine Pitrou wrote: >> >> >> Thanks. I created "2.0.0". >> Will we also need a "0.15.0" for the flatbuffers alignment fi

Re: Building on Arrow CUDA

2019-07-31 Thread Uwe L. Korn
Hello Paul, you might want to look into https://github.com/conda-forge/conda-forge.github.io/issues/687 where CUDA support on conda-forge is discussed. I'm not up to date on this anymore, but reading the whole issue should give you the current level of support. Once this is solved, adding cuda sup

Re: New version(s) on JIRA

2019-07-31 Thread Wes McKinney
Yes, I think we need 0.15.0 for this On Wed, Jul 31, 2019 at 10:42 AM Antoine Pitrou wrote: > > > Thanks. I created "2.0.0". > Will we also need a "0.15.0" for the flatbuffers alignment fix? > > Regards > > Antoine. > > > On 31/07/2019 at 03:00, Sutou Kouhei wrote: > > Hi, > > > > I think that

Re: [DISCUSS][Format] FixedSizeList w/ row-length not specified as part of the type

2019-07-31 Thread Brian Hulette
I'm a little confused about the proposal now. If the unknown dimension doesn't have to be the same within a record batch, how would you be able to deduce it with the approach you described (dividing the logical length of the values array by the length of the record batch)? On Wed, Jul 31, 2019 at

[jira] [Created] (ARROW-6085) Create traits for physical query plan

2019-07-31 Thread Andy Grove (JIRA)
Andy Grove created ARROW-6085: - Summary: Create traits for physical query plan Key: ARROW-6085 URL: https://issues.apache.org/jira/browse/ARROW-6085 Project: Apache Arrow Issue Type: Sub-task

[jira] [Created] (ARROW-6084) [Python] Support LargeList

2019-07-31 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-6084: - Summary: [Python] Support LargeList Key: ARROW-6084 URL: https://issues.apache.org/jira/browse/ARROW-6084 Project: Apache Arrow Issue Type: Improvement

Re: [DISCUSS] Release cadence and release vote conventions

2019-07-31 Thread Andy Grove
To what extent would it be possible to automate the release process via CI/CD? On Wed, Jul 31, 2019 at 9:19 AM Wes McKinney wrote: > I think one thing that would help would be improving the > reproducibility of the source release process. The RM has to have > their machine configured in a particu

Re: [VOTE] Adopt FORMAT and LIBRARY SemVer-based version schemes for Arrow 1.0.0 and beyond

2019-07-31 Thread Uwe L. Korn
+1 from me. I really like the separate versions Uwe On Tue, Jul 30, 2019, at 2:21 PM, Antoine Pitrou wrote: > > +1 from me. > > Regards > > Antoine. > > > > On Fri, 26 Jul 2019 14:33:30 -0500 > Wes McKinney wrote: > > hello, > > > > As discussed on the mailing list thread [1], Micah Korn

Re: New version(s) on JIRA

2019-07-31 Thread Antoine Pitrou
Thanks. I created "2.0.0". Will we also need a "0.15.0" for the flatbuffers alignment fix? Regards Antoine. On 31/07/2019 at 03:00, Sutou Kouhei wrote: > Hi, > > I think that "2.0.0" is better. Because we'll not release > "1.1.0". > > See also: > https://lists.apache.org/thread.html/d0a

Re: [DISCUSS][Format] FixedSizeList w/ row-length not specified as part of the type

2019-07-31 Thread Wes McKinney
I agree this sounds like a good application for ExtensionType. At minimum, ExtensionType can be used to develop a working version of what you need to help guide further discussions. On Mon, Jul 29, 2019 at 2:29 PM Francois Saint-Jacques wrote: > > Hello, > > if each record has a different size, t

Re: [DISCUSS] Release cadence and release vote conventions

2019-07-31 Thread Wes McKinney
I think one thing that would help would be improving the reproducibility of the source release process. The RM has to have their machine configured in a particular way for it to work. Before anyone says "Docker", it isn't an easy solution because the release scripts need to be able to create git co

[jira] [Created] (ARROW-6083) [Java] Refactor Jdbc adapter consume logic

2019-07-31 Thread Ji Liu (JIRA)
Ji Liu created ARROW-6083: - Summary: [Java] Refactor Jdbc adapter consume logic Key: ARROW-6083 URL: https://issues.apache.org/jira/browse/ARROW-6083 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-6082) [Python] create pa.dictionary() type with non-integer indices type crashes

2019-07-31 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-6082: Summary: [Python] create pa.dictionary() type with non-integer indices type crashes Key: ARROW-6082 URL: https://issues.apache.org/jira/browse/ARROW-6082

[jira] [Created] (ARROW-6081) FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmptb2ao6te_job_6e0a8ca1.parquet'

2019-07-31 Thread David Draper (JIRA)
David Draper created ARROW-6081: --- Summary: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmptb2ao6te_job_6e0a8ca1.parquet' Key: ARROW-6081 URL: https://issues.apache.org/jira/browse/ARROW-6081

[jira] [Created] (ARROW-6080) [Java] Support search operation for BaseRepeatedValueVector

2019-07-31 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6080: --- Summary: [Java] Support search operation for BaseRepeatedValueVector Key: ARROW-6080 URL: https://issues.apache.org/jira/browse/ARROW-6080 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6079) [Java] Implement/test UnionFixedSizeListWriter for FixedSizeListVector

2019-07-31 Thread Ji Liu (JIRA)
Ji Liu created ARROW-6079: - Summary: [Java] Implement/test UnionFixedSizeListWriter for FixedSizeListVector Key: ARROW-6079 URL: https://issues.apache.org/jira/browse/ARROW-6079 Project: Apache Arrow