[jira] [Created] (ARROW-1339) [C++] Use boost::filesystem for handling of platform-specific file path encodings

2017-08-07 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1339:
---

 Summary: [C++] Use boost::filesystem for handling of 
platform-specific file path encodings
 Key: ARROW-1339
 URL: https://issues.apache.org/jira/browse/ARROW-1339
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.7.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ARROW-1338) [Python] Investigate non-deterministic core dump on Python 2.7, Travis CI builds

2017-08-07 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1338:
---

 Summary: [Python] Investigate non-deterministic core dump on 
Python 2.7, Travis CI builds
 Key: ARROW-1338
 URL: https://issues.apache.org/jira/browse/ARROW-1338
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Wes McKinney
 Fix For: 0.6.0


{code}
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_python_file_write
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_python_file_read
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_bytes_reader
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_bytes_reader_non_bytes
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_bytes_reader_retains_parent_reference
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_buffer_bytes
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_buffer_memoryview
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_buffer_bytearray
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_buffer_numpy
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_buffer_memoryview_is_immutable
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_memory_output_stream
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_inmemory_write_after_closed
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_buffer_protocol_ref_counting
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_nativefile_write_memoryview
 PASSED
pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_io.py::test_mock_output_stream
 /Users/travis/build/apache/arrow/ci/travis_script_python.sh: line 81:  8186 
Segmentation fault: 11  (core dumped) python -m pytest -vv -r sxX -s 
$PYARROW_PATH --parquet
{code}





[jira] [Created] (ARROW-1337) [Python] User reports pkg-config does not work properly in FindArrow.cmake for pyarrow

2017-08-07 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1337:
---

 Summary: [Python] User reports pkg-config does not work properly 
in FindArrow.cmake for pyarrow
 Key: ARROW-1337
 URL: https://issues.apache.org/jira/browse/ARROW-1337
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Wes McKinney
 Fix For: 0.6.0


{code}
-- Checking for module 'arrow'
--   Found arrow, version 0.5.0
-- Arrow ABI version: 0.0.0
-- Arrow SO version: 0
-- Found the Arrow core library: 
/gnu/store/h3cb0ynq76cmzs2vp2syqd42kkdh9paa-apache-arrow-0.5.0/lib/libarrow.so
-- Found the Arrow Python library: 
/gnu/store/h3cb0ynq76cmzs2vp2syqd42kkdh9paa-apache-arrow-0.5.0/lib/libarrow_python.so
CMake Error at cmake_modules/BuildUtils.cmake:88 (message):
  No static or shared library provided for arrow
Call Stack (most recent call first):
  CMakeLists.txt:263 (ADD_THIRDPARTY_LIB)


-- Configuring incomplete, errors occurred!
{code}





Re: pyarrow versioning

2017-08-07 Thread Colin Nichols
Ah ok, makes sense -- thanks Wes!

- Colin


*Colin Nichols | Senior Software Engineer
335 Madison Avenue, 16F | New York, NY, 10017
+1 (646) 912 2018 | BAM.ai*

On Mon, Aug 7, 2017 at 3:55 PM, Wes McKinney  wrote:

> hi Colin,
>
> Sorry about that. Yes, I pulled the 0.5.0 packages from PyPI (which is
> an unofficial package, because the Arrow PMC has not voted on it)
> because of the problems with jemalloc in ARROW-1282 -- it is not
> possible to replace a broken package without also changing the version
> number. This is a pretty exceptional case because the jemalloc
> allocator was causing hung processes in some cases; this was something
> we could disable at build time without making a new release.
>
> I recommend that you pin to a minor version, but not a patch version, so
>
> pyarrow==0.5.*
>
> Patch versions as a rule will not contain API changes. This conflicts
> with the installation advice in http://arrow.apache.org/install/, so I
> will update this at next opportunity.
>
> - Wes
>
> On Mon, Aug 7, 2017 at 3:40 PM, Colin Nichols  wrote:
> > Hi all,
> >
> > I noticed today that pyarrow==0.5.0 has disappeared from PyPI, replaced by
> > 0.5.0.post2.  Just wanted to make sure that was intended.  If so, is the
> > expectation that users put e.g., pyarrow~=0.5.0 in their requirements file
> > as opposed to pyarrow==0.5.0?
> >
> > Thank you,
> > Colin
> >
> >
> >
> > *Colin Nichols | Senior Software Engineer
> > 335 Madison Avenue, 16F | New York, NY, 10017
> > +1 (646) 912 2018 | Narrativ*
>


Re: pyarrow versioning

2017-08-07 Thread Wes McKinney
hi Colin,

Sorry about that. Yes, I pulled the 0.5.0 packages from PyPI (which is
an unofficial package, because the Arrow PMC has not voted on it)
because of the problems with jemalloc in ARROW-1282 -- it is not
possible to replace a broken package without also changing the version
number. This is a pretty exceptional case because the jemalloc
allocator was causing hung processes in some cases; this was something
we could disable at build time without making a new release.

I recommend that you pin to a minor version, but not a patch version, so

pyarrow==0.5.*

Patch versions as a rule will not contain API changes. This conflicts
with the installation advice in http://arrow.apache.org/install/, so I
will update this at next opportunity.

- Wes

On Mon, Aug 7, 2017 at 3:40 PM, Colin Nichols  wrote:
> Hi all,
>
> I noticed today that pyarrow==0.5.0 has disappeared from PyPI, replaced by
> 0.5.0.post2.  Just wanted to make sure that was intended.  If so, is the
> expectation that users put e.g., pyarrow~=0.5.0 in their requirements file
> as opposed to pyarrow==0.5.0?
>
> Thank you,
> Colin
>
>
>
> *Colin Nichols | Senior Software Engineer
> 335 Madison Avenue, 16F | New York, NY, 10017
> +1 (646) 912 2018 | Narrativ*


pyarrow versioning

2017-08-07 Thread Colin Nichols
Hi all,

I noticed today that pyarrow==0.5.0 has disappeared from PyPI, replaced by
0.5.0.post2.  Just wanted to make sure that was intended.  If so, is the
expectation that users put e.g., pyarrow~=0.5.0 in their requirements file
as opposed to pyarrow==0.5.0?

Thank you,
Colin



*Colin Nichols | Senior Software Engineer
335 Madison Avenue, 16F | New York, NY, 10017
+1 (646) 912 2018 | Narrativ*


Re: Arrow Plasma Object Store - IP clearance

2017-08-07 Thread Robert Nishihara
Thanks! This is great!

On Mon, Aug 7, 2017 at 11:30 AM Wes McKinney  wrote:

> Thanks to the Plasma developers for their code contribution and
> efforts integrating it with the Arrow codebase! It's a powerful and
> useful tool that will help the project grow.
>
> - Wes
>
> On Mon, Aug 7, 2017 at 2:24 PM, Philipp Moritz  wrote:
> > Great to hear! Thanks a lot to everybody involved with this for their help.
> >
> > On Mon, Aug 7, 2017 at 11:19 AM, Julian Hyde  wrote:
> >
> >> The vote for IP clearance of the Plasma Object Store on the Incubator list
> >> has passed[1].
> >>
> >> We can now proceed with a release.
> >>
> >> Julian
> >>
> >> [1] https://s.apache.org/arrow-plasma-object-store-clearance-result
> >>
> >>
> >>
>


Re: Arrow Plasma Object Store - IP clearance

2017-08-07 Thread Wes McKinney
Thanks to the Plasma developers for their code contribution and
efforts integrating it with the Arrow codebase! It's a powerful and
useful tool that will help the project grow.

- Wes

On Mon, Aug 7, 2017 at 2:24 PM, Philipp Moritz  wrote:
> Great to hear! Thanks a lot to everybody involved with this for their help.
>
> On Mon, Aug 7, 2017 at 11:19 AM, Julian Hyde  wrote:
>
>> The vote for IP clearance of the Plasma Object Store on the Incubator list
>> has passed[1].
>>
>> We can now proceed with a release.
>>
>> Julian
>>
>> [1] https://s.apache.org/arrow-plasma-object-store-clearance-result
>>
>>
>>


Re: Arrow Plasma Object Store - IP clearance

2017-08-07 Thread Philipp Moritz
Great to hear! Thanks a lot to everybody involved with this for their help.

On Mon, Aug 7, 2017 at 11:19 AM, Julian Hyde  wrote:

> The vote for IP clearance of the Plasma Object Store on the Incubator list
> has passed[1].
>
> We can now proceed with a release.
>
> Julian
>
> [1] https://s.apache.org/arrow-plasma-object-store-clearance-result
>
>
>


Arrow Plasma Object Store - IP clearance

2017-08-07 Thread Julian Hyde
The vote for IP clearance of the Plasma Object Store on the Incubator list has 
passed[1].

We can now proceed with a release.

Julian

[1] https://s.apache.org/arrow-plasma-object-store-clearance-result




[jira] [Created] (ARROW-1336) [C++] Add arrow::schema factory function

2017-08-07 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1336:
---

 Summary: [C++] Add arrow::schema factory function
 Key: ARROW-1336
 URL: https://issues.apache.org/jira/browse/ARROW-1336
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.6.0


Because {{std::make_shared}} cannot deduce its arguments from a braced 
initializer list, it would be useful to have a factory function that builds a 
schema from an initializer list of fields, which would make user code more concise.
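
A minimal sketch of the idea, assuming simplified stand-in types: the `Field` and `Schema` classes and the `field`/`schema` helpers below are hypothetical illustrations, not the Arrow API. A call such as `std::make_shared<Schema>({f0, f1})` fails to compile because a braced initializer list has no type that the forwarding template parameter can be deduced from; a plain function that takes `std::initializer_list` avoids the problem.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a field; NOT the Arrow class.
struct Field {
  std::string name;
  std::string type;
};

// Hypothetical stand-in for a schema; NOT the Arrow class.
class Schema {
 public:
  explicit Schema(std::vector<std::shared_ptr<Field>> fields)
      : fields_(std::move(fields)) {}

  int num_fields() const { return static_cast<int>(fields_.size()); }

  const std::shared_ptr<Field>& field(int i) const {
    assert(i >= 0 && i < num_fields());
    return fields_[static_cast<std::size_t>(i)];
  }

 private:
  std::vector<std::shared_ptr<Field>> fields_;
};

// Illustrative helper that builds a shared field.
inline std::shared_ptr<Field> field(std::string name, std::string type) {
  return std::make_shared<Field>(Field{std::move(name), std::move(type)});
}

// Factory function: the initializer_list parameter gives the braced list a
// concrete type, so the vector can be built and then forwarded normally.
inline std::shared_ptr<Schema> schema(
    std::initializer_list<std::shared_ptr<Field>> fields) {
  return std::make_shared<Schema>(std::vector<std::shared_ptr<Field>>(fields));
}
```

With such a helper, user code could read `auto s = schema({field("f0", "int32"), field("f1", "utf8")});` instead of constructing the vector explicitly.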





[jira] [Created] (ARROW-1335) [C++] PrimitiveArray::raw_values has inconsistent semantics re: offsets compared with subclasses

2017-08-07 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1335:
---

 Summary: [C++] PrimitiveArray::raw_values has inconsistent 
semantics re: offsets compared with subclasses
 Key: ARROW-1335
 URL: https://issues.apache.org/jira/browse/ARROW-1335
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.6.0


{{NumericArray::raw_values}} accounts for the array offset, while 
{{PrimitiveArray::raw_values}} does not. This inconsistency makes it easy to 
shoot oneself in the foot. It may be better to remove 
{{PrimitiveArray::raw_values}} altogether.
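
To illustrate the hazard, here is a small sketch that assumes a simplified model of a sliced array. The `Int32Slice` struct, its two accessors, and the `slice` helper are hypothetical names chosen for this example and do not mirror the Arrow classes. It shows how two accessors with similar names can point at different elements once an offset is non-zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a sliced array; NOT Arrow's classes.
struct Int32Slice {
  const std::vector<std::int32_t>* buffer;  // underlying storage (not owned)
  std::int64_t offset;                      // index of first logical element
  std::int64_t length;                      // number of logical elements

  // Ignores the offset: always points at the start of the buffer.
  const std::int32_t* raw_values_no_offset() const { return buffer->data(); }

  // Accounts for the offset: points at the first logical element.
  const std::int32_t* raw_values_with_offset() const {
    return buffer->data() + offset;
  }
};

// Illustrative helper that creates a view over part of the storage.
inline Int32Slice slice(const std::vector<std::int32_t>& storage,
                        std::int64_t offset, std::int64_t length) {
  assert(offset >= 0 && length >= 0 &&
         offset + length <= static_cast<std::int64_t>(storage.size()));
  return Int32Slice{&storage, offset, length};
}
```

For a slice starting at offset 2 of the storage `{10, 20, 30, 40}`, the offset-aware accessor yields 30 at index 0 while the other yields 10, so a caller who mixes the two reads the wrong elements without any error.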





Re: Arrow 0.6.0 release planning and timeline

2017-08-07 Thread Wes McKinney
hi all,

It looks like we should be able to close out 0.6.0 issues today and
cut a release candidate tomorrow if there are no objections. Please
take a moment to complete any code reviews for patches that should go
in. We have a number of Java library dependency upgrades pending; do
these need to go in?

https://github.com/apache/arrow/pull/929
https://github.com/apache/arrow/pull/873

There is another Java patch that needs to be merged if someone could review:

https://github.com/apache/arrow/pull/898

There are a handful of C++ or Python patches pending, and I'll be
opening a few more PRs today; any code reviews would be appreciated.

Thanks
Wes

On Fri, Aug 4, 2017 at 3:57 PM, Siddharth Teotia  wrote:
> Reviewed https://github.com/apache/arrow/pull/915 for ARROW-1296
>
>
> On Fri, Aug 4, 2017 at 11:45 AM, Siddharth Teotia wrote:
>
>> I will review it by EOD.
>>
>> On Fri, Aug 4, 2017 at 11:15 AM, Li Jin  wrote:
>>
>>> On the Java side I have https://issues.apache.org/jira/browse/ARROW-1296,
>>> which is a small bug fix.
>>>
>>> If someone could help review it, that would be great. Otherwise, if it
>>> doesn't get reviewed by the 0.6 RC cut, we can take it off the 0.6 release.
>>>
>>> Li
>>>
>>> On Fri, Aug 4, 2017 at 2:02 PM, Wes McKinney  wrote:
>>>
>>> > hi all,
>>> >
>>> > If there are no problems with the Plasma IP Clearance, I would like to
>>> > cut a release candidate for 0.6.0 at the beginning of next week. There
>>> > are a handful of issues pending on the Java and C++ side that I'll be
>>> > working to complete over the next several days. Please keep an eye on
>>> > the release page on JIRA:
>>> >
>>> > https://issues.apache.org/jira/projects/ARROW/versions/12341088
>>> >
>>> > There are a number of outstanding Java patches; if you would like to
>>> > include any of these in the 0.6.0 release, could someone review?
>>> >
>>> > Thanks,
>>> > Wes
>>> >
>>> > On Tue, Aug 1, 2017 at 10:59 PM, Wes McKinney wrote:
>>> > > It seems that ARROW-1282 is causing some users problems. We have the
>>> > > option of making a 0.5.1 release, but given how much work has reached
>>> > > master (or is about to reach master) I would be in favor of
>>> > > accelerating 0.6.0, cutting a release candidate within the next couple
>>> > > of days. We could aim for another release within 2-3 weeks after
>>> > > completing the Plasma IP clearance.
>>> > >
>>> > > Thoughts?
>>> > >
>>> > > On Tue, Aug 1, 2017 at 9:44 AM, Uwe L. Korn  wrote:
>>> > >> Hello,
>>> > >>
>>> > >> from my side we're mostly fine for a 0.6.0 release. Currently I'm facing
>>> > >> a problem with https://issues.apache.org/jira/browse/ARROW-1302 in the
>>> > >> 0.5.0 OSX wheels. We need to fix this before 0.6.0. Also I would like to
>>> > >> look a bit more into the jemalloc issues that came up with 0.5.0 to get
>>> > >> some of them solved in the next release.
>>> > >>
>>> > >> Uwe
>>> > >>
>>> > >> On Mon, Jul 31, 2017, at 04:55 PM, Wes McKinney wrote:
>>> > >>> hi all,
>>> > >>>
>>> > >>> We're already 40 patches into the next Arrow version. I just created
>>> > >>> https://issues.apache.org/jira/browse/ARROW-1297 as a tracking issue
>>> > >>> so that any blocking issues can be tracked as we push forward to 0.6.0
>>> > >>>
>>> > >>> You can track the status of the release here (accessible from the
>>> > >>> "Projects" tab --> Releases in JIRA):
>>> > >>>
>>> > >>> https://issues.apache.org/jira/projects/ARROW/versions/12341088
>>> > >>>
>>> > >>> We don't have any more data types slated for integration testing for
>>> > >>> this release, but it might be nice to try to finish one or more of
>>> > >>> them in the next week or two:
>>> > >>>
>>> > >>> - Fixed size binary
>>> > >>> - Fixed size lists
>>> > >>> - Decimal
>>> > >>> - Union
>>> > >>>
>>> > >>> As far as timeline for 0.6.0, I would like to push for an RC the week
>>> > >>> of 8/14 at latest (assuming we are ready to ship the Plasma C++ code),
>>> > >>> reducing scope if needed. Any contributions of code, documentation, or
>>> > >>> JIRA prioritization would be much appreciated.
>>> > >>>
>>> > >>> Thanks,
>>> > >>> Wes
>>> >
>>>
>>
>>