Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-18 Thread Uwe L. Korn
+1

Uwe

On Fri, Aug 17, 2018, at 5:02 AM, Jeff Zhang wrote:
> +1
> 
> On Fri, Aug 17, 2018 at 10:59 AM, Phillip Cloud wrote:
> 
> > +1
> >
> > On Thu, Aug 16, 2018 at 9:26 PM Andy Grove wrote:
> >
> > > +1
> > >
> > > On Thu, Aug 16, 2018 at 9:56 AM Wes McKinney wrote:
> > >
> > > > Dear all,
> > > >
> > > > The developers of Gandiva, an LLVM-based vectorized expression
> > > > evaluation engine for Arrow columnar memory, are proposing to donate
> > > > the project to Apache Arrow at some point in the near future, as has
> > > > been discussed on the dev@ mailing list [1].
> > > >
> > > > The Gandiva codebase is located at:
> > > >
> > > > https://github.com/dremio/gandiva
> > > >
> > > > This work is not yet in a patch-ready state, but I wish to determine
> > > > if the Arrow PMC is in favor of accepting this donation, subject to
> > > > the fulfillment of the ASF IP Clearance process.
> > > >
> > > > [ ] +1 : Accept contribution of Gandiva
> > > > [ ]  0 : No opinion
> > > > [ ] -1 : Reject contribution because...
> > > >
> > > > Here is my vote: +1
> > > >
> > > > The vote will be open for at least 72 hours.
> > > >
> > > > Thanks,
> > > > Wes
> > > >
> > > > [1]:
> > > > https://lists.apache.org/thread.html/cded0b511c68da21246cd25e99b4ad77092d17219629f73e0dc85cad@%3Cdev.arrow.apache.org%3E
> > > >
> > >
> >


Re: [DISCUSS] Re-think CI strategy?

2018-08-18 Thread Wes McKinney
Our CI is looking much healthier now after recent work (thank you!),
example build:

https://travis-ci.org/apache/arrow/builds/417700344

I think we've bought ourselves a few months at least. We'll have to
see what impact adding a couple more things has on CI health:

* parquet-cpp unit tests (per [1])
* Gandiva build + tests

I suspect at some point in the future we may need to have a
combination of "fast Travis CI builds" and more exhaustive / longer
running builds in Jenkins. Projects like Apache Kudu have much more
intense testing procedures and these are run on dedicated
infrastructure rather than CI.

I also think that more parts of our CI could be handled by creating an
"Arrow test bot" that can respond to directions. There are a number of
frameworks and examples now for writing GitHub bots; we could create a
bot that can execute on-demand tests of optional components using the
crossbow tool.

Other things that we run on every commit, like the Python manylinux1
build, could be run on-demand and nightly. That being said, I just
worked on a PR that broke the manylinux1 build
(https://github.com/apache/arrow/pull/2428), so we risk having to
hunt down the root cause of a broken build if we don't run such tests
on every commit. I'm not sure we can simultaneously have fast CI
builds while also catching all possible problems.
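
As a rough sketch of the per-commit versus nightly split described above,
the snippet below picks a job list from the Travis CI event type.
TRAVIS_EVENT_TYPE is an environment variable that Travis CI sets (to
"cron" for scheduled builds); the job names and the select_jobs helper
are hypothetical and are not taken from the Arrow build scripts.

```python
import os

# Hypothetical job names for illustration only; the real Arrow CI matrix
# is defined in the project's own configuration.
FAST_JOBS = ["cpp", "python-3.6"]
EXHAUSTIVE_JOBS = FAST_JOBS + ["python-2.7", "python-3.5", "manylinux1"]


def select_jobs(event_type):
    """Return the jobs to run for a given Travis CI event type."""
    # Scheduled (cron) builds get the exhaustive list; everything else
    # (pushes, pull requests) gets the fast per-commit list.
    if event_type == "cron":
        return list(EXHAUSTIVE_JOBS)
    return list(FAST_JOBS)


if __name__ == "__main__":
    # Default to "push" when running outside Travis CI.
    print(select_jobs(os.environ.get("TRAVIS_EVENT_TYPE", "push")))
```

The trade-off is the one noted above: anything moved to the cron list can
break silently between nightly runs.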

- Wes

[1]: 
https://lists.apache.org/thread.html/53f77f9f1f04b97709a0286db1b73a49b7f1541d8f8b2cb32db5c922@%3Cdev.parquet.apache.org%3E

On Tue, Aug 7, 2018 at 2:55 AM, Antoine Pitrou wrote:
>
> It would be good to test all Python versions in a cron build, but I
> agree we may not need to test all Python 3 versions in per-commit builds.
>
> Regards
>
> Antoine.
>
>
> On 07/08/2018 at 03:14, Robert Nishihara wrote:
>> Thanks Wes.
>>
>> As for Python 3.5, 3.6, and 3.7, I think testing any one of them should be
>> sufficient (I can't recall any errors that happened with one version and
>> not the other).
>>
>> On Mon, Aug 6, 2018 at 12:01 PM Wes McKinney wrote:
>>
>>> @Robert, it looks like NumPy is making LTS releases until Jan 1, 2020
>>>
>>>
>>> https://docs.scipy.org/doc/numpy-1.14.0/neps/dropping-python2.7-proposal.html
>>>
>>> Based on this, I think it's fine for us to continue to support Python
>>> 2.7 until then. It's only 16 months away; are you all ready for the
>>> next decade?
>>>
>>> We should also discuss if we want to continue to build and test Python
>>> 3.5. From download statistics it appears that there are 5-10x as many
>>> Python 3.6 users as 3.5. I would prefer to drop 3.5 and begin
>>> supporting 3.7 soon.
>>>
>>> @Antoine, I think we can avoid building the C++ codebase 3 times, but
>>> it will require a bit of retooling of the scripts. ccache probably
>>> isn't working properly because the Python include directory is being
>>> included even for compilation units that do not use the Python C API:
>>> https://github.com/apache/arrow/blob/master/cpp/CMakeLists.txt#L721
>>> I'm opening a JIRA about fixing this:
>>> https://issues.apache.org/jira/browse/ARROW-2994
>>>
>>> Created https://issues.apache.org/jira/browse/ARROW-2995 about
>>> removing the redundant build cycle
>>>
>>> On Mon, Aug 6, 2018 at 2:19 PM, Robert Nishihara wrote:
>>>
>>>>> Also, at this point we're sometimes hitting the 50 minutes time limit on
>>>>> our slowest Travis-CI matrix job, which means we have to restart it...
>>>>> making the build even slower.
>>>>>
>>>> Only a short-term fix, but Travis can lengthen the max build time if you
>>>> email them and ask them to.
>>>
>>


[jira] [Created] (ARROW-3084) [Python] Do we need to build both unicode variants of pyarrow wheels?

2018-08-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3084:
---

 Summary: [Python] Do we need to build both unicode variants of 
pyarrow wheels?
 Key: ARROW-3084
 URL: https://issues.apache.org/jira/browse/ARROW-3084
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Wes McKinney
 Fix For: 0.11.0


I noticed that pandas does not provide a UCS2 wheel for Python 2.7. We're
building both UCS2 and UCS4. I am curious whether the UCS2 wheels are used
widely enough to make this worthwhile.
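
For context, a minimal sketch of how to tell which variant a given
interpreter is: on Python 2.7, sys.maxunicode is 0xFFFF on a narrow (UCS2)
build and 0x10FFFF on a wide (UCS4) build. The unicode_width helper name is
hypothetical. On Python 3.3 and later the distinction no longer exists and
sys.maxunicode is always 0x10FFFF.

```python
import sys


def unicode_width(maxunicode=sys.maxunicode):
    """Return "UCS2" for a narrow build and "UCS4" for a wide build."""
    # Narrow (UCS2) builds report 0xFFFF; wide (UCS4) builds report 0x10FFFF.
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"


if __name__ == "__main__":
    print(unicode_width())
```

The two variants correspond to the cp27m (UCS2) and cp27mu (UCS4) wheel ABI
tags.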



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-3083) [Python] Version in manylinux1 wheel builds is wrong

2018-08-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3083:
---

 Summary: [Python] Version in manylinux1 wheel builds is wrong
 Key: ARROW-3083
 URL: https://issues.apache.org/jira/browse/ARROW-3083
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Wes McKinney
 Fix For: 0.11.0


Not sure if this is a regression but I noticed:

https://travis-ci.org/apache/arrow/jobs/417498434#L2665


