Re: Timeline for 0.15.0 release

2019-09-22 Thread Micah Kornfield
>
> It's ideal if your GPG key is in the web of trust (i.e. you can get it
> signed by another PMC member), but is not 100% essential.

That won't be an option for me this week (it seems like I would need to
meet one face-to-face).  I'll try to get the GPG key checked in and the rest of
the prerequisites done tomorrow (Monday) so that I can start the release on
Tuesday (hopefully we can solve the last blocker/integration tests by then).

On Sat, Sep 21, 2019 at 7:12 PM Wes McKinney wrote:

> It's ideal if your GPG key is in the web of trust (i.e. you can get it
> signed by another PMC member), but is not 100% essential.
>
> Speaking of the release, there are at least 2 code changes I still
> want to get in
>
> ARROW-5717
> ARROW-6353
>
> I just pushed updates to ARROW-5717, will merge once the build is green.
>
> There are a couple of Rust patches still marked for 0.15. The rest
> seems to be documentation and a couple of integration test failures we
> should see about fixing in time.
>
> On Fri, Sep 20, 2019 at 11:26 PM Micah Kornfield wrote:
> >
> > Thanks Krisztián and Wes,
> > I've gone ahead and started registering myself on all the packaging sites.
> >
> > Is there any review process when adding my GPG key to the SVN file? [1]
> > doesn't seem to mention it explicitly.
> >
> > Thanks,
> > Micah
> >
> > [1] https://www.apache.org/dev/version-control.html#https-svn
> >
> > On Fri, Sep 20, 2019 at 5:01 PM Krisztián Szűcs <szucs.kriszt...@gmail.com> wrote:
> > >
> > > On Thu, Sep 19, 2019 at 5:52 PM Wes McKinney wrote:
> > > >
> > >> On Thu, Sep 19, 2019 at 12:13 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > >> >>
> > >> >> The process should be well documented at this point but there are a
> > >> >> number of steps.
> > >> >
> > >> > Is [1] the up-to-date documentation for the release? Are there
> > >> > instructions for adding the code signing key to SVN?
> > >> >
> > >> > I will make a go of it. I will try to mitigate any internet issues by
> > >> > doing the process on a cloud instance (I assume that isn't a problem?).
> > >> >
> > >>
> > >> Setting up a new cloud environment suitable for producing an RC may be
> > >> time consuming, but you are welcome to try. Krisztian -- are you
> > >> available next week to help Micah and potentially take over producing
> > >> the RC if there are issues?
> > >>
> > > Sure, I'll be available next week. We can also grant access to
> > > https://github.com/ursa-labs/crossbow because configuring all
> > > the CI backends can be time consuming.
> > >
> > >>
> > >> > Thanks,
> > >> > Micah
> > >> >
> > >> > [1] https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide
> > >> >
> > >> > On Wed, Sep 18, 2019 at 8:29 AM Wes McKinney wrote:
> > >> >>
> > >> >> The process should be well documented at this point but there are a
> > >> >> number of steps. Note that you need to add your code signing key to
> > >> >> the KEYS file in SVN (that's not very hard to do). I think it's fine
> > >> >> to hand off the process to others after the VOTE but it would be
> > >> >> tricky to have multiple RMs involved with producing the source and
> > >> >> binary artifacts for the vote
> > >> >>
> > >> >> On Tue, Sep 17, 2019 at 10:55 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > >> >> >
> > >> >> > SGTM, as well.
> > >> >> >
> > >> >> > I should have a little bit of time next week if I can help as RM but I have
> > >> >> > a couple of concerns:
> > >> >> > 1.  In the past I've had trouble downloading and validating releases. I'm a
> > >> >> > bit worried, that I might have similar problems doing the necessary uploads.
> > >> >> > 2.  My internet connection will likely be not great, I don't know if this
> > >> >> > would make it even less likely to be successful.
> > >> >> >
> > >> >> > Does it become problematic if somehow I would have to abandon the process
> > >> >> > mid-release?  Is there anyone who could serve as a backup?  Are the steps
> > >> >> > well documented?
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Micah
> > >> >> >
> > >> >> > On Tue, Sep 17, 2019 at 4:25 PM Neal Richardson <neal.p.richard...@gmail.com> wrote:
> > >> >> >
> > >> >> > > Sounds good to me.
> > >> >> > >
> > >> >> > > Do we have a release manager yet? Any volunteers?
> > >> >> > >
> > >> >> > > Neal
> > >> >> > >
> > >> >> > > On Tue, Sep 17, 2019 at 4:06 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > >> >> > >
> > >> >> > > > hi all,
> > >> >> > > >
> > >> >> > > > It looks like we're drawing close to being able to make the 0.15.0
> > >> >> > > > release. I would suggest "pencils down" at the end of this week and
> > >> >> > > > see if a release candidate can be produced next Monday September 23.
> > >> >> > > > Any thoughts or objections?
> > >> >> > > >
> > >> >> > > > Thanks,
> > >> >> > > > Wes
> > >> >> > > >
> > >> >> > > 

[jira] [Created] (ARROW-6664) [C++] Add option to build without SSE4.2

2019-09-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6664:
---

 Summary: [C++] Add option to build without SSE4.2
 Key: ARROW-6664
 URL: https://issues.apache.org/jira/browse/ARROW-6664
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.15.0


Child task of ARROW-5381



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-6663) [C++] Use software __builtin_popcountll when building without SSE4.2

2019-09-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6663:
---

 Summary: [C++] Use software __builtin_popcountll when building without SSE4.2
 Key: ARROW-6663
 URL: https://issues.apache.org/jira/browse/ARROW-6663
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 1.0.0


This is to be extra safe in the context of ARROW-5381
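
For readers unfamiliar with what a "software" population count involves, here is a minimal sketch of one that uses only shifts, masks, adds and a multiply, so it does not depend on a dedicated hardware instruction. It is written in Rust for illustration only; the issue itself concerns the C++ builtin, and the function name `software_popcount64` is hypothetical and not part of Arrow.

```rust
/// Count the set bits of a 64-bit word without a dedicated popcount
/// instruction, using the classic SWAR (SIMD-within-a-register) technique.
/// `software_popcount64` is a hypothetical name used only for this sketch.
pub fn software_popcount64(mut x: u64) -> u32 {
    // Each 2-bit field now holds the number of set bits it contained (0..=2).
    x = x - ((x >> 1) & 0x5555_5555_5555_5555);
    // Add adjacent 2-bit counts into 4-bit fields (0..=4).
    x = (x & 0x3333_3333_3333_3333) + ((x >> 2) & 0x3333_3333_3333_3333);
    // Add adjacent 4-bit counts into bytes (0..=8).
    x = (x + (x >> 4)) & 0x0f0f_0f0f_0f0f_0f0f;
    // Multiplying by 0x0101...01 accumulates every byte into the top byte.
    (x.wrapping_mul(0x0101_0101_0101_0101) >> 56) as u32
}
```

The result can be cross-checked against the standard library's `u64::count_ones`, which is what the test for this sketch does.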





[jira] [Created] (ARROW-6662) [Java] Implement equals/approxEquals API for VectorSchemaRoot

2019-09-22 Thread Ji Liu (Jira)
Ji Liu created ARROW-6662:
-

 Summary: [Java] Implement equals/approxEquals API for VectorSchemaRoot
 Key: ARROW-6662
 URL: https://issues.apache.org/jira/browse/ARROW-6662
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Ji Liu
Assignee: Ji Liu


With the newly added visitor APIs (ARROW-6211), we can now implement 
equals/approxEquals for VectorSchemaRoot.





[jira] [Created] (ARROW-6661) [Java] Implement APIs like slice to enhance VectorSchemaRoot

2019-09-22 Thread Ji Liu (Jira)
Ji Liu created ARROW-6661:
-

 Summary: [Java] Implement APIs like slice to enhance VectorSchemaRoot
 Key: ARROW-6661
 URL: https://issues.apache.org/jira/browse/ARROW-6661
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Ji Liu
Assignee: Ji Liu


Currently the Java implementation has no APIs such as slice for record 
batches, unlike C++/Python.

This issue is to implement slice/getVector/addVector/removeVector.





Re: Timeline for 0.15.0 release

2019-09-22 Thread Andy Grove
There's been quite a bit of activity in DataFusion over the past few weeks,
and there are currently two issues that I would like to see merged in time
for the release:

ARROW-6660: Minor docs update
ARROW-6089: Physical query plan for selection operator

ARROW-6089 is part of the new query execution implementation that I talked
about on the mailing list just recently and is new functionality that
doesn't impact any existing users, so maybe this one could be rubber stamp
approved if there are no objections. The equivalent PRs for the other
operators (projection and aggregate) were already merged.






[NIGHTLY] Arrow Build Report for Job nightly-2019-09-22-0

2019-09-22 Thread Crossbow


Arrow Build Report for Job nightly-2019-09-22-0

All tasks: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0

Failed Tasks:
- docker-r-sanitizer:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-r-sanitizer
- docker-dask-integration:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-dask-integration
- docker-cpp-fuzzit:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-cpp-fuzzit
- docker-spark-integration:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-spark-integration

Succeeded Tasks:
- docker-python-2.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-python-2.7
- docker-cpp-cmake32:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-cpp-cmake32
- docker-python-3.6-nopandas:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-python-3.6-nopandas
- homebrew-cpp-autobrew:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-homebrew-cpp-autobrew
- wheel-manylinux2010-cp36m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux2010-cp36m
- conda-linux-gcc-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-conda-linux-gcc-py37
- docker-r-conda:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-r-conda
- debian-buster-arm64:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-debian-buster-arm64
- docker-cpp-release:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-cpp-release
- wheel-manylinux1-cp27m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux1-cp27m
- conda-linux-gcc-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-conda-linux-gcc-py27
- docker-docs:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-docs
- wheel-manylinux1-cp36m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux1-cp36m
- docker-r:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-r
- docker-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-cpp
- conda-osx-clang-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-conda-osx-clang-py36
- docker-python-3.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-python-3.7
- docker-iwyu:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-iwyu
- wheel-osx-cp27m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-osx-cp27m
- wheel-manylinux2010-cp35m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux2010-cp35m
- docker-rust:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-rust
- conda-osx-clang-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-conda-osx-clang-py37
- wheel-osx-cp36m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-osx-cp36m
- docker-python-3.6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-python-3.6
- wheel-manylinux1-cp37m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux1-cp37m
- debian-stretch-arm64:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-debian-stretch-arm64
- wheel-manylinux2010-cp27mu:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux2010-cp27mu
- debian-stretch:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-debian-stretch
- wheel-manylinux2010-cp37m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-travis-wheel-manylinux2010-cp37m
- docker-clang-format:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-circle-docker-clang-format
- conda-osx-clang-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-22-0-azure-conda-osx-clang-py27
- conda-win-vs2015-py36:
  URL: 

[jira] [Created] (ARROW-6660) [Rust] [DataFusion] Minor docs update for 0.15.0 release

2019-09-22 Thread Andy Grove (Jira)
Andy Grove created ARROW-6660:
-

 Summary: [Rust] [DataFusion] Minor docs update for 0.15.0 release
 Key: ARROW-6660
 URL: https://issues.apache.org/jira/browse/ARROW-6660
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 0.15.0


Minor docs update for 0.15.0 release





[jira] [Created] (ARROW-6659) [Rust] [DataFusion] Refactor of HashAggregateExec to support custom merge

2019-09-22 Thread Andy Grove (Jira)
Andy Grove created ARROW-6659:
-

 Summary: [Rust] [DataFusion] Refactor of HashAggregateExec to support custom merge
 Key: ARROW-6659
 URL: https://issues.apache.org/jira/browse/ARROW-6659
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Rust, Rust - DataFusion
Reporter: Andy Grove


HashAggregateExec currently creates one HashPartition per input partition for the 
initial aggregate per partition, and then explicitly calls MergeExec and then 
creates another HashPartition for the final reduce operation.

This is fine for in-memory queries in DataFusion but is not extensible. For 
example, it is not possible to provide a different MergeExec implementation 
that would distribute queries to a cluster.

A better design would be to move the logic into the query planner so that the 
physical plan contains explicit steps such as:

 
{code:java}
- HashAggregate // final aggregate
  - MergeExec
    - HashAggregate // aggregate per partition
{code}
This would then make it easier to customize the plan in other projects, to 
support distributed execution:
{code:java}
- HashAggregate // final aggregate
  - MergeExec
    - DistributedExec
      - HashAggregate // aggregate per partition
{code}
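
The three explicit steps of the first plan can be sketched in plain Rust as follows. This is only an illustration of the data flow under simplifying assumptions: it does not use DataFusion's actual types, it aggregates with a per-key SUM, and the names `Partition`, `partial_aggregate`, `merge`, `final_aggregate` and `run_plan` are all hypothetical.

```rust
use std::collections::HashMap;

/// One input partition as (group key, value) rows. Hypothetical stand-in
/// for a stream of record batches.
type Partition = Vec<(String, i64)>;

/// "HashAggregate // aggregate per partition": sum values per key within
/// a single partition.
fn partial_aggregate(partition: &Partition) -> HashMap<String, i64> {
    let mut sums = HashMap::new();
    for (key, value) in partition {
        *sums.entry(key.clone()).or_insert(0) += *value;
    }
    sums
}

/// "MergeExec": combine the per-partition results into a single stream.
/// A distributed implementation could replace just this step.
fn merge(partials: Vec<HashMap<String, i64>>) -> Vec<(String, i64)> {
    partials.into_iter().flatten().collect()
}

/// "HashAggregate // final aggregate": reduce the merged partial sums.
fn final_aggregate(merged: Vec<(String, i64)>) -> HashMap<String, i64> {
    let mut sums = HashMap::new();
    for (key, value) in merged {
        *sums.entry(key).or_insert(0) += value;
    }
    sums
}

/// Run the three steps in plan order (innermost step first).
fn run_plan(partitions: &[Partition]) -> HashMap<String, i64> {
    let partials: Vec<HashMap<String, i64>> =
        partitions.iter().map(partial_aggregate).collect();
    final_aggregate(merge(partials))
}
```

Because each step is a separate function, swapping `merge` for a different implementation leaves the two aggregate steps untouched, which is the extensibility the issue is asking for.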





[jira] [Created] (ARROW-6658) [Rust] [DataFusion] Implement AVG aggregate expression

2019-09-22 Thread Andy Grove (Jira)
Andy Grove created ARROW-6658:
-

 Summary: [Rust] [DataFusion] Implement AVG aggregate expression
 Key: ARROW-6658
 URL: https://issues.apache.org/jira/browse/ARROW-6658
 Project: Apache Arrow
  Issue Type: Sub-task
Reporter: Andy Grove
 Fix For: 1.0.0


Implement AVG aggregate expression. See COUNT and SUM for inspiration.
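
One way to read "see COUNT and SUM for inspiration" is that AVG can keep a running sum and a running count and divide at the end. The sketch below assumes that design and the usual SQL convention that NULL inputs are skipped and an empty input yields NULL. It is not DataFusion code; `AvgAccumulator` and its methods are hypothetical names.

```rust
/// Running state for AVG: a SUM and a COUNT kept side by side.
/// Hypothetical type, not part of DataFusion.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct AvgAccumulator {
    sum: f64,
    count: u64,
}

impl AvgAccumulator {
    /// Fold in one input value; `None` models a NULL and is skipped.
    fn update(&mut self, value: Option<f64>) {
        if let Some(v) = value {
            self.sum += v;
            self.count += 1;
        }
    }

    /// Combine with the state computed on another partition.
    fn merge(&mut self, other: &AvgAccumulator) {
        self.sum += other.sum;
        self.count += other.count;
    }

    /// Final result: `None` (NULL) if no non-NULL value was seen.
    fn evaluate(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}
```

Keeping sum and count separate until `evaluate` is what makes the state mergeable across partitions; averaging per-partition averages would give a wrong result when partitions differ in size.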





[jira] [Created] (ARROW-6657) [Rust] [DataFusion] Implement COUNT aggregate expression

2019-09-22 Thread Andy Grove (Jira)
Andy Grove created ARROW-6657:
-

 Summary: [Rust] [DataFusion] Implement COUNT aggregate expression
 Key: ARROW-6657
 URL: https://issues.apache.org/jira/browse/ARROW-6657
 Project: Apache Arrow
  Issue Type: Sub-task
Reporter: Andy Grove
 Fix For: 1.0.0


Implement COUNT aggregate expressions. See the SUM implementation for 
inspiration.





[jira] [Created] (ARROW-6656) [Rust] [DataFusion] Implement MIN and MAX

2019-09-22 Thread Andy Grove (Jira)
Andy Grove created ARROW-6656:
-

 Summary: [Rust] [DataFusion] Implement MIN and MAX
 Key: ARROW-6656
 URL: https://issues.apache.org/jira/browse/ARROW-6656
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Rust, Rust - DataFusion
Reporter: Andy Grove
 Fix For: 1.0.0


Implement MIN and MAX aggregate expressions. See the SUM implementation for 
inspiration.


