[jira] [Created] (ARROW-7291) [Dev]Fix FORMAT_DIR in update-flatbuffers.sh

2019-12-01 Thread Kenta Murata (Jira)
Kenta Murata created ARROW-7291:
---

 Summary: [Dev]Fix FORMAT_DIR in update-flatbuffers.sh
 Key: ARROW-7291
 URL: https://issues.apache.org/jira/browse/ARROW-7291
 Project: Apache Arrow
  Issue Type: Bug
  Components: Developer Tools
Reporter: Kenta Murata
Assignee: Kenta Murata






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-7290) Implement ListArray Builder in C#

2019-12-01 Thread Takashi Hashida (Jira)
Takashi Hashida created ARROW-7290:
--

 Summary: Implement ListArray Builder in C#
 Key: ARROW-7290
 URL: https://issues.apache.org/jira/browse/ARROW-7290
 Project: Apache Arrow
  Issue Type: Improvement
Affects Versions: 0.15.1
Reporter: Takashi Hashida


[https://github.com/apache/arrow/blob/master/csharp/src/Apache.Arrow/Arrays/ListArray.cs]

 

Implement "ListArray.Builder" in arrow/csharp.
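
To make the request concrete, here is a minimal sketch of what a list builder has to accumulate, assuming the usual Arrow list layout (an offsets buffer, a validity bitmap and a flattened child array). It is written in Python for brevity; the class and method names are hypothetical and are not the C# API.

```python
from typing import List, Optional, Sequence


class ListBuilderSketch:
    """Hypothetical builder that accumulates an Arrow-style list array of ints."""

    def __init__(self) -> None:
        self.offsets: List[int] = [0]   # offsets[i]..offsets[i + 1] delimits list i
        self.validity: List[bool] = []  # False marks a null list
        self.values: List[int] = []     # flattened child values

    def append(self, items: Sequence[int]) -> "ListBuilderSketch":
        self.values.extend(items)
        self.offsets.append(len(self.values))
        self.validity.append(True)
        return self

    def append_null(self) -> "ListBuilderSketch":
        # A null list adds no child values, so the previous offset repeats.
        self.offsets.append(len(self.values))
        self.validity.append(False)
        return self

    def to_pylist(self) -> List[Optional[List[int]]]:
        out: List[Optional[List[int]]] = []
        for i, valid in enumerate(self.validity):
            start, end = self.offsets[i], self.offsets[i + 1]
            out.append(self.values[start:end] if valid else None)
        return out


builder = ListBuilderSketch().append([1, 2]).append_null().append([]).append([3])
print(builder.offsets)
print(builder.to_pylist())
```

A real builder would also need a nested value builder so that the child type is not fixed to integers; that part is left out of the sketch.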





[jira] [Created] (ARROW-7289) ListType constructor argument is redundant

2019-12-01 Thread Takashi Hashida (Jira)
Takashi Hashida created ARROW-7289:
--

 Summary: ListType constructor argument is redundant
 Key: ARROW-7289
 URL: https://issues.apache.org/jira/browse/ARROW-7289
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C#
Affects Versions: 0.15.1
Reporter: Takashi Hashida


[https://github.com/apache/arrow/blob/master/csharp/src/Apache.Arrow/Types/ListType.cs#L28]

 

The ListType constructor has two arguments, but 'ValueDataType' can be 
determined by 'Field.DataType' and 'ValueField' can be created from valueDataType.

It seems to me that the constructor should be separated to "ListType(Field 
valueField)" and "ListType(Field valueField)"
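
The redundancy can be illustrated with a small sketch in which each constructor takes a single argument and the other piece is derived. This is Python rather than C#, and the names (Field, ListType, from_value_type, the default child name "item") are assumptions made for the illustration, not the actual Apache.Arrow types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    data_type: str
    nullable: bool = True


@dataclass(frozen=True)
class ListType:
    value_field: Field

    @property
    def value_data_type(self) -> str:
        # Derived from the field, so it never has to be passed separately.
        return self.value_field.data_type

    @classmethod
    def from_value_type(cls, data_type: str, name: str = "item") -> "ListType":
        # Hypothetical second constructor: build the field from the data type.
        return cls(Field(name, data_type))


by_field = ListType(Field("scores", "int32"))
by_type = ListType.from_value_type("utf8")
print(by_field.value_data_type, by_type.value_field.name)
```

With this shape the two values cannot disagree, which is the risk of accepting both in one constructor.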

 





RE: Issues with installation on docker / R

2019-12-01 Thread Christian Klar
Perfect, this seems to have worked (did a brief test on one .arrow file).

Thanks a lot for all the help! Really appreciate it.

Christian

-Original Message-
From: Wes McKinney [mailto:wesmck...@gmail.com] 
Sent: Sunday, December 1, 2019 9:16 PM
To: dev
Cc: Anthony Abate; Jason De Biasio
Subject: Re: Issues with installation on docker / R

OK, if you are installing using what is on CRAN, you need to use the
corresponding version of Apache Arrow, so please use the released
0.15.1 version instead of master, which contains API changes.

On Sun, Dec 1, 2019 at 7:51 PM Christian Klar  wrote:
>
> Apologies! I missed that line.
>
> Okay this worked (make install finishes with some warnings), and the R part 
> gets further, however I'm getting a filesystem error now after around 15 
> steps.
>
> https://gist.github.com/klarchristian/2947eaf93c0e7ef9dd0d6ed9d0be990b
>
> Christian
>
> -Original Message-
> From: Wes McKinney [mailto:wesmck...@gmail.com]
> Sent: Sunday, December 1, 2019 8:14 PM
> To: dev
> Cc: Anthony Abate; Jason De Biasio
> Subject: Re: Issues with installation on docker / R
>
> Building Thrift from source requires the "flex" and "bison" packages,
> which you need to install with apt-get, this is mentioned here
>
> http://arrow.apache.org/docs/developers/cpp.html#apache-parquet-development
>
> On Sun, Dec 1, 2019 at 7:09 PM Christian Klar  wrote:
> >
> > Okay I ran it. With the additional tag the make install errored out.
> >
> > Here's the link to the log. Please let me know if you need further data.
> >
> > https://gist.github.com/klarchristian/272d0a9bffa03d4ff2265e4e181a26ef
> >
> > Also one more point: The arrow-master folder is in a shared volume. I don't 
> > think that should be an issue but just wanted to point it out (first line 
> > in the log).
> >
> >
> >
> >
> >
> > -Original Message-
> > From: Wes McKinney [mailto:wesmck...@gmail.com]
> > Sent: Sunday, December 1, 2019 7:48 PM
> > To: dev
> > Cc: Anthony Abate; Jason De Biasio
> > Subject: Re: Issues with installation on docker / R
> >
> > Can you also pass -DARROW_VERBOSE_THIRDPARTY_BUILD=ON ? That will show
> > us what's wrong with the Thrift build in this environment (I
> > tried locally on my Ubuntu 18.04 machine -- without the influence of
> > conda or any other library toolchains -- and it works fine). You can
> > upload the whole logs on https://gist.github.com/ if you want
> >
> > On Sun, Dec 1, 2019 at 6:02 PM Christian Klar  wrote:
> > >
> > > No luck with cmake -DARROW_PARQUET=ON either. Getting another error. 
> > > Please see below the log from the docker session.
> > >
> > > root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake 
> > > -DARROW_PARQUET=ON ..
> > > -- Building using CMake version: 3.7.2
> > > -- The C compiler identification is GNU 6.3.0
> > > -- The CXX compiler identification is GNU 6.3.0
> > > -- Check for working C compiler: /usr/bin/cc
> > > -- Check for working C compiler: /usr/bin/cc -- works
> > > -- Detecting C compiler ABI info
> > > -- Detecting C compiler ABI info - done
> > > -- Detecting C compile features
> > > -- Detecting C compile features - done
> > > -- Check for working CXX compiler: /usr/bin/c++
> > > -- Check for working CXX compiler: /usr/bin/c++ -- works
> > > -- Detecting CXX compiler ABI info
> > > -- Detecting CXX compiler ABI info - done
> > > -- Detecting CXX compile features
> > > -- Detecting CXX compile features - done
> > > -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> > > -- Arrow SO version: 100 (full: 100.0.0)
> > > -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> > > -- clang-tidy not found
> > > -- clang-format not found
> > > -- infer not found
> > > -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> > > -- Found cpplint executable at 
> > > /hello/arrow-master/cpp/build-support/cpplint.py
> > > -- Performing Test CXX_SUPPORTS_SSE4_2
> > > -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> > > -- Performing Test CXX_SUPPORTS_ALTIVEC
> > > -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> > > -- Performing Test CXX_SUPPORTS_ARMCRC
> > > -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> > > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> > > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> > > -- Arrow build warning level: PRODUCTION
> > > Using ld linker
> > > Configured for RELEASE build (set with cmake 
> > > -DCMAKE_BUILD_TYPE={release,debug,...})
> > > -- Build Type: RELEASE
> > > -- Using AUTO approach to find dependencies
> > > -- AWSSDK_VERSION: 1.7.160
> > > -- BOOST_VERSION: 1.67.0
> > > -- BROTLI_VERSION: v1.0.7
> > > -- BZIP2_VERSION: 1.0.8
> > > -- CARES_VERSION: 1.15.0
> > > -- GBENCHMARK_VERSION: v1.5.0
> > > -- GFLAGS_VERSION: v2.2.0
> > > -- GLOG_VERSION: v0.3.5
> > > -- GRPC_VERSION: v1.24.3
> > > -- GTEST_VERSION: 1.8.1
> > > -- JEMALLOC_VERSION: 5.2.1
> > > -- LZ4_VERSION: v1.9.2
> > > -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> > > -- ORC_VERSION: 

Re: [Result] [VOTE] Clarifications and forward compatibility changes for Dictionary Encoding (second iteration)

2019-12-01 Thread Ji Liu
Thanks Micah, I'll take the Java side implementation.

Thanks,
Ji Liu


--
From:Micah Kornfield 
Send Time: 2019-12-02 (Monday) 09:25
To:dev 
Subject:Re: [Result] [VOTE] Clarifications and forward compatibility changes 
for Dictionary Encoding (second iteration)

I've merged the PR and created ARROW-7283 [1] to track
implementation for languages currently in the integration test.


[1] https://issues.apache.org/jira/browse/ARROW-7283

On Wed, Nov 27, 2019 at 1:03 AM Micah Kornfield 
wrote:

> The vote carries with 3 binding +1 votes, 1 non-binding +1 vote and
> 1 non-binding +.5 vote.
>
> To follow up, I will:
> 1.  Open up JIRAs for work items in reference implementations (C++/Java)
> 2.  Merge the pull request containing the specification changes.
>
> Thanks,
> Micah
>
> On Tue, Nov 26, 2019 at 12:50 AM Sutou Kouhei  wrote:
>
>> +1 (binding)
>>
>> In 
>>   "[VOTE] Clarifications and forward compatibility changes for Dictionary
>> Encoding (second iteration)" on Wed, 20 Nov 2019 20:41:57 -0800,
>>   Micah Kornfield  wrote:
>>
>> > Hello,
>> > As discussed on [1], I've proposed clarifications in a PR [2] that
>> > clarifies:
>> >
>> > 1.  It is not required that all dictionary batches occur at the
>> > beginning of the IPC stream format (if the first record batch has an
>> > all-null dictionary-encoded column, the null column's dictionary might
>> > not be sent until later in the stream).
>> >
>> > 2.  A second dictionary batch for the same ID that is not a "delta
>> > batch" in an IPC stream indicates the dictionary should be replaced.
>> >
>> > 3.  Clarifies that the file format can only contain 1 "NON-delta"
>> > dictionary batch and multiple "delta" dictionary batches. Dictionary
>> > replacement is not supported in the file format.
>> >
>> > 4.  Add an enum to dictionary metadata for possible future changes in
>> > what format dictionary batches can be sent. (the most likely would be an
>> > array Map).  An enum is needed as a placeholder to allow for forward
>> > compatibility past the release 1.0.0.
>> >
>> > If accepted there will be work in all implementations to make sure that
>> > they cover the edge cases highlighted and additional integration testing
>> > will be needed.
>> >
>> > Please vote whether to accept these additions. The vote will be open
>> > for at least 72 hours.
>> >
>> > [ ] +1 Accept these changes to the specification
>> > [ ] +0
>> > [ ] -1 Do not accept the changes because...
>> >
>> > Thanks,
>> > Micah
>> >
>> >
>> > [1] https://lists.apache.org/thread.html/d0f137e9db0abfcfde2ef879ca517a710f620e5be4dd749923d22c37@%3Cdev.arrow.apache.org%3E
>> > [2] https://github.com/apache/arrow/pull/5585
>>
>
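
The file-format rule from the quoted vote (one non-delta dictionary batch per dictionary, any number of delta batches, no replacement) can be sketched as a simple check. This is only an illustration: the function name and the (dictionary id, is_delta) tuple representation are assumptions, not part of any Arrow implementation, and it reads the rule as applying per dictionary ID.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def check_file_dictionaries(batches: Iterable[Tuple[int, bool]]) -> None:
    """Reject a file whose dictionary batches would imply a replacement.

    Each item is (dictionary_id, is_delta). Delta batches are unrestricted;
    a second non-delta batch for the same id is treated as an error.
    """
    non_delta = Counter(dict_id for dict_id, is_delta in batches if not is_delta)
    repeated: List[int] = sorted(i for i, n in non_delta.items() if n > 1)
    if repeated:
        raise ValueError(
            f"dictionary replacement is not supported in the file format: ids {repeated}"
        )


# One non-delta batch per id plus deltas passes silently.
check_file_dictionaries([(0, False), (0, True), (0, True), (1, False)])
```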


Re: Issues with installation on docker / R

2019-12-01 Thread Wes McKinney
OK, if you are installing using what is on CRAN, you need to use the
corresponding version of Apache Arrow, so please use the released
0.15.1 version instead of master, which contains API changes.
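
As a rough sketch of the version matching described above, the check below compares an R package version with a C++ library version. It assumes that the first three components of the CRAN package version name the corresponding C++ release (for example 0.15.1.1 for 0.15.1); that convention and the function name are assumptions of this example.

```python
def matches_release(r_package_version: str, cpp_version: str) -> bool:
    """Return True when the R package and the C++ library share major.minor.patch."""
    r_parts = r_package_version.split(".")[:3]
    # Drop a suffix such as "-SNAPSHOT" before comparing.
    cpp_parts = cpp_version.split("-")[0].split(".")[:3]
    return r_parts == cpp_parts


print(matches_release("0.15.1.1", "0.15.1"))
print(matches_release("0.15.1.1", "1.0.0-SNAPSHOT"))
```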

On Sun, Dec 1, 2019 at 7:51 PM Christian Klar  wrote:
>
> Apologies! I missed that line.
>
> Okay this worked (make install finishes with some warnings), and the R part 
> gets further, however I'm getting a filesystem error now after around 15 
> steps.
>
> https://gist.github.com/klarchristian/2947eaf93c0e7ef9dd0d6ed9d0be990b
>
> Christian
>
> -Original Message-
> From: Wes McKinney [mailto:wesmck...@gmail.com]
> Sent: Sunday, December 1, 2019 8:14 PM
> To: dev
> Cc: Anthony Abate; Jason De Biasio
> Subject: Re: Issues with installation on docker / R
>
> Building Thrift from source requires the "flex" and "bison" packages,
> which you need to install with apt-get, this is mentioned here
>
> http://arrow.apache.org/docs/developers/cpp.html#apache-parquet-development
>
> On Sun, Dec 1, 2019 at 7:09 PM Christian Klar  wrote:
> >
> > Okay I ran it. With the additional tag the make install errored out.
> >
> > Here's the link to the log. Please let me know if you need further data.
> >
> > https://gist.github.com/klarchristian/272d0a9bffa03d4ff2265e4e181a26ef
> >
> > Also one more point: The arrow-master folder is in a shared volume. I don't 
> > think that should be an issue but just wanted to point it out (first line 
> > in the log).
> >
> >
> >
> >
> >
> > -Original Message-
> > From: Wes McKinney [mailto:wesmck...@gmail.com]
> > Sent: Sunday, December 1, 2019 7:48 PM
> > To: dev
> > Cc: Anthony Abate; Jason De Biasio
> > Subject: Re: Issues with installation on docker / R
> >
> > Can you also pass -DARROW_VERBOSE_THIRDPARTY_BUILD=ON ? That will show
> > us what's wrong with the Thrift build in this environment (I
> > tried locally on my Ubuntu 18.04 machine -- without the influence of
> > conda or any other library toolchains -- and it works fine). You can
> > upload the whole logs on https://gist.github.com/ if you want
> >
> > On Sun, Dec 1, 2019 at 6:02 PM Christian Klar  wrote:
> > >
> > > No luck with cmake -DARROW_PARQUET=ON either. Getting another error. 
> > > Please see below the log from the docker session.
> > >
> > > root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake 
> > > -DARROW_PARQUET=ON ..
> > > -- Building using CMake version: 3.7.2
> > > -- The C compiler identification is GNU 6.3.0
> > > -- The CXX compiler identification is GNU 6.3.0
> > > -- Check for working C compiler: /usr/bin/cc
> > > -- Check for working C compiler: /usr/bin/cc -- works
> > > -- Detecting C compiler ABI info
> > > -- Detecting C compiler ABI info - done
> > > -- Detecting C compile features
> > > -- Detecting C compile features - done
> > > -- Check for working CXX compiler: /usr/bin/c++
> > > -- Check for working CXX compiler: /usr/bin/c++ -- works
> > > -- Detecting CXX compiler ABI info
> > > -- Detecting CXX compiler ABI info - done
> > > -- Detecting CXX compile features
> > > -- Detecting CXX compile features - done
> > > -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> > > -- Arrow SO version: 100 (full: 100.0.0)
> > > -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> > > -- clang-tidy not found
> > > -- clang-format not found
> > > -- infer not found
> > > -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> > > -- Found cpplint executable at 
> > > /hello/arrow-master/cpp/build-support/cpplint.py
> > > -- Performing Test CXX_SUPPORTS_SSE4_2
> > > -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> > > -- Performing Test CXX_SUPPORTS_ALTIVEC
> > > -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> > > -- Performing Test CXX_SUPPORTS_ARMCRC
> > > -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> > > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> > > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> > > -- Arrow build warning level: PRODUCTION
> > > Using ld linker
> > > Configured for RELEASE build (set with cmake 
> > > -DCMAKE_BUILD_TYPE={release,debug,...})
> > > -- Build Type: RELEASE
> > > -- Using AUTO approach to find dependencies
> > > -- AWSSDK_VERSION: 1.7.160
> > > -- BOOST_VERSION: 1.67.0
> > > -- BROTLI_VERSION: v1.0.7
> > > -- BZIP2_VERSION: 1.0.8
> > > -- CARES_VERSION: 1.15.0
> > > -- GBENCHMARK_VERSION: v1.5.0
> > > -- GFLAGS_VERSION: v2.2.0
> > > -- GLOG_VERSION: v0.3.5
> > > -- GRPC_VERSION: v1.24.3
> > > -- GTEST_VERSION: 1.8.1
> > > -- JEMALLOC_VERSION: 5.2.1
> > > -- LZ4_VERSION: v1.9.2
> > > -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> > > -- ORC_VERSION: 1.5.7
> > > -- PROTOBUF_VERSION: v3.7.1
> > > -- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
> > > -- RE2_VERSION: 2019-08-01
> > > -- SNAPPY_VERSION: 1.1.7
> > > -- THRIFT_VERSION: 0.12.0
> > > -- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
> > > -- ZLIB_VERSION: 1.2.11
> > > -- ZSTD_VERSION: v1.4.3
> > > -- Looking for 

[jira] [Created] (ARROW-7288) [R] read_parquet() freezes on Windows

2019-12-01 Thread Hiroaki Yutani (Jira)
Hiroaki Yutani created ARROW-7288:
-

 Summary: [R] read_parquet() freezes on Windows
 Key: ARROW-7288
 URL: https://issues.apache.org/jira/browse/ARROW-7288
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Affects Versions: 0.15.1
 Environment: R 3.6.1 on Windows 10
Reporter: Hiroaki Yutani


The following example in read_parquet()'s documentation freezes (it seems to
wait for the result forever) on my Windows machine.

df <- read_parquet(system.file("v0.7.1.parquet", package="arrow"))

The CRAN checks are all fine, which means the example is successfully executed 
on the CRAN Windows machines. So I have no idea why it doesn't work on my local machine.

https://cran.r-project.org/web/checks/check_results_arrow.html

Here's my session info in case it helps:


{code}
> sessioninfo::session_info()

- Session info ---------------------------------------------------------------
 setting  value
 version  R version 3.6.1 (2019-07-05)
 os       Windows 10 x64
 system   x86_64, mingw32
 ui       RStudio
 language en
 collate  Japanese_Japan.932
 ctype    Japanese_Japan.932
 tz       Asia/Tokyo
 date     2019-12-01

- Packages -------------------------------------------------------------------
 package     * version  date       lib source
 arrow       * 0.15.1.1 2019-11-05 [1] CRAN (R 3.6.1)
 assertthat    0.2.1    2019-03-21 [1] CRAN (R 3.6.0)
 bit           1.1-14   2018-05-29 [1] CRAN (R 3.6.0)
 bit64         0.9-7    2017-05-08 [1] CRAN (R 3.6.0)
 cli           1.1.0    2019-03-19 [1] CRAN (R 3.6.0)
 crayon        1.3.4    2017-09-16 [1] CRAN (R 3.6.0)
 fs            1.3.1    2019-05-06 [1] CRAN (R 3.6.0)
 glue          1.3.1    2019-03-12 [1] CRAN (R 3.6.0)
 magrittr      1.5      2014-11-22 [1] CRAN (R 3.6.0)
 purrr         0.3.3    2019-10-18 [1] CRAN (R 3.6.1)
 R6            2.4.1    2019-11-12 [1] CRAN (R 3.6.1)
 Rcpp          1.0.3    2019-11-08 [1] CRAN (R 3.6.1)
 reprex        0.3.0    2019-05-16 [1] CRAN (R 3.6.0)
 rlang         0.4.2    2019-11-23 [1] CRAN (R 3.6.1)
 rstudioapi    0.10     2019-03-19 [1] CRAN (R 3.6.0)
 sessioninfo   1.1.1    2018-11-05 [1] CRAN (R 3.6.0)
 tidyselect    0.2.5    2018-10-11 [1] CRAN (R 3.6.0)
 withr         2.1.2    2018-03-15 [1] CRAN (R 3.6.0)

[1] C:/Users/hiroaki-yutani/Documents/R/win-library/3.6
[2] C:/Program Files/R/R-3.6.1/library
{code}






RE: Issues with installation on docker / R

2019-12-01 Thread Christian Klar
Apologies! I missed that line.

Okay this worked (make install finishes with some warnings), and the R part 
gets further, however I'm getting a filesystem error now after around 15 steps.

https://gist.github.com/klarchristian/2947eaf93c0e7ef9dd0d6ed9d0be990b

Christian

-Original Message-
From: Wes McKinney [mailto:wesmck...@gmail.com] 
Sent: Sunday, December 1, 2019 8:14 PM
To: dev
Cc: Anthony Abate; Jason De Biasio
Subject: Re: Issues with installation on docker / R

Building Thrift from source requires the "flex" and "bison" packages,
which you need to install with apt-get. This is mentioned here:

http://arrow.apache.org/docs/developers/cpp.html#apache-parquet-development

On Sun, Dec 1, 2019 at 7:09 PM Christian Klar  wrote:
>
> Okay I ran it. With the additional tag the make install errored out.
>
> Here's the link to the log. Please let me know if you need further data.
>
> https://gist.github.com/klarchristian/272d0a9bffa03d4ff2265e4e181a26ef
>
> Also one more point: The arrow-master folder is in a shared volume. I don't 
> think that should be an issue but just wanted to point it out (first line in 
> the log).
>
>
>
>
>
> -Original Message-
> From: Wes McKinney [mailto:wesmck...@gmail.com]
> Sent: Sunday, December 1, 2019 7:48 PM
> To: dev
> Cc: Anthony Abate; Jason De Biasio
> Subject: Re: Issues with installation on docker / R
>
> Can you also pass -DARROW_VERBOSE_THIRDPARTY_BUILD=ON ? That will show
> us what's wrong with the Thrift build in this environment (I
> tried locally on my Ubuntu 18.04 machine -- without the influence of
> conda or any other library toolchains -- and it works fine). You can
> upload the whole logs on https://gist.github.com/ if you want
>
> On Sun, Dec 1, 2019 at 6:02 PM Christian Klar  wrote:
> >
> > No luck with cmake -DARROW_PARQUET=ON either. Getting another error. Please 
> > see below the log from the docker session.
> >
> > root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake -DARROW_PARQUET=ON 
> > ..
> > -- Building using CMake version: 3.7.2
> > -- The C compiler identification is GNU 6.3.0
> > -- The CXX compiler identification is GNU 6.3.0
> > -- Check for working C compiler: /usr/bin/cc
> > -- Check for working C compiler: /usr/bin/cc -- works
> > -- Detecting C compiler ABI info
> > -- Detecting C compiler ABI info - done
> > -- Detecting C compile features
> > -- Detecting C compile features - done
> > -- Check for working CXX compiler: /usr/bin/c++
> > -- Check for working CXX compiler: /usr/bin/c++ -- works
> > -- Detecting CXX compiler ABI info
> > -- Detecting CXX compiler ABI info - done
> > -- Detecting CXX compile features
> > -- Detecting CXX compile features - done
> > -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> > -- Arrow SO version: 100 (full: 100.0.0)
> > -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> > -- clang-tidy not found
> > -- clang-format not found
> > -- infer not found
> > -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> > -- Found cpplint executable at 
> > /hello/arrow-master/cpp/build-support/cpplint.py
> > -- Performing Test CXX_SUPPORTS_SSE4_2
> > -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> > -- Performing Test CXX_SUPPORTS_ALTIVEC
> > -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> > -- Performing Test CXX_SUPPORTS_ARMCRC
> > -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> > -- Arrow build warning level: PRODUCTION
> > Using ld linker
> > Configured for RELEASE build (set with cmake 
> > -DCMAKE_BUILD_TYPE={release,debug,...})
> > -- Build Type: RELEASE
> > -- Using AUTO approach to find dependencies
> > -- AWSSDK_VERSION: 1.7.160
> > -- BOOST_VERSION: 1.67.0
> > -- BROTLI_VERSION: v1.0.7
> > -- BZIP2_VERSION: 1.0.8
> > -- CARES_VERSION: 1.15.0
> > -- GBENCHMARK_VERSION: v1.5.0
> > -- GFLAGS_VERSION: v2.2.0
> > -- GLOG_VERSION: v0.3.5
> > -- GRPC_VERSION: v1.24.3
> > -- GTEST_VERSION: 1.8.1
> > -- JEMALLOC_VERSION: 5.2.1
> > -- LZ4_VERSION: v1.9.2
> > -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> > -- ORC_VERSION: 1.5.7
> > -- PROTOBUF_VERSION: v3.7.1
> > -- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
> > -- RE2_VERSION: 2019-08-01
> > -- SNAPPY_VERSION: 1.1.7
> > -- THRIFT_VERSION: 0.12.0
> > -- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
> > -- ZLIB_VERSION: 1.2.11
> > -- ZSTD_VERSION: v1.4.3
> > -- Looking for pthread.h
> > -- Looking for pthread.h - found
> > -- Looking for pthread_create
> > -- Looking for pthread_create - not found
> > -- Check if compiler accepts -pthread
> > -- Check if compiler accepts -pthread - yes
> > -- Found Threads: TRUE
> > -- Boost version: 1.62.0
> > -- Found the following Boost libraries:
> > --   regex
> > --   system
> > --   filesystem
> > -- Boost include dir: /usr/include
> > -- Boost libraries: Boost::system;Boost::filesystem
> > -- Building 

Re: MIME type

2019-12-01 Thread Micah Kornfield
>
> Should we register our MIME types to IANA?

Are there any downsides to doing this? Should we wait for 1.0.0?

On Thu, Nov 21, 2019 at 5:04 AM Sutou Kouhei  wrote:

> I found Apache Thrift registers the following MIME types:
>
>   * application/vnd.apache.thrift.binary
>   * application/vnd.apache.thrift.compact
>   * application/vnd.apache.thrift.json
>
> https://www.iana.org/assignments/media-types/media-types.xhtml
>
> Thrift uses "vnd.apache." prefix[1].
>
> [1] https://tools.ietf.org/html/rfc6838
> > Vendor-tree registrations will be distinguished by the leading facet
> > "vnd.".  That may be followed, at the discretion of the registrant,
> > by either a media subtype name from a well-known producer (e.g.,
> > "vnd.mudpie") or by an IANA-approved designation of the producer's
> > name that is followed by a media type or product designation (e.g.,
> > vnd.bigcompany.funnypictures).
>
> vnd.apache.thrift.binary was registered on 2014-09-09:
>
>
> https://www.iana.org/assignments/media-types/application/vnd.apache.thrift.binary
>
> Should we register our MIME types to IANA?
>
> It seems that Apache Thrift used application/x-thift (typo?)
> before it registered these MIME types.
>
> > The application/x-thift media type is currently used to describe multiple
> > formats/protocols. Communications endpoints need to the format/protocol
> > used, so this media type should be used preferentially when it is
> > appropriate to do so.
>
> In 
>   "Re: MIME type" on Wed, 20 Nov 2019 12:01:54 +0100,
>   Antoine Pitrou  wrote:
>
> >
> > If it's not standardized, shouldn't it be prefixed with x-?
> >
> > e.g. application/x-apache-arrow-stream
> >
> >
> > Le 20/11/2019 à 08:29, Micah Kornfield a écrit :
> >> I would propose:
> >> application/apache-arrow-stream
> >> application/apache-arrow-file
> >>
> >> I'm not attached to those names but I think there should be two
> >> different mime-types, since the formats are not interchangeable.
> >>
> >> On Tue, Nov 19, 2019 at 10:31 PM Sutou Kouhei wrote:
> >>
> >>> Hi,
> >>>
> >>> What MIME type should be used for Apache Arrow data?
> >>> application/arrow?
> >>>
> >>> Should we use the same MIME type for IPC Streaming Format[1]
> >>> and IPC File Format[2]? Or should we use different MIME
> >>> types for them?
> >>>
> >>> [1] https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format
> >>> [2] https://arrow.apache.org/docs/format/Columnar.html#ipc-file-format
> >>>
> >>>
> >>> Thanks,
> >>> --
> >>> kou
> >>>
> >>
>
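
Since the two formats are not interchangeable, a server would have to pick a media type per payload. The sketch below does that by looking for the "ARROW1" magic string that begins an IPC file. The two media type names are the ones proposed in this thread and were not registered with IANA at the time, so treat them as placeholders; the function name and the fallback to the stream type are assumptions of the sketch.

```python
ARROW_FILE_MAGIC = b"ARROW1"

# Names proposed in the thread; placeholders, not registered media types.
PROPOSED_STREAM_TYPE = "application/apache-arrow-stream"
PROPOSED_FILE_TYPE = "application/apache-arrow-file"


def guess_media_type(payload: bytes) -> str:
    """Pick a media type for Arrow IPC bytes.

    Simplification: anything that does not start with the file magic is
    assumed to be the streaming format; no further validation is done.
    """
    if payload.startswith(ARROW_FILE_MAGIC):
        return PROPOSED_FILE_TYPE
    return PROPOSED_STREAM_TYPE


print(guess_media_type(b"ARROW1\x00\x00rest-of-file"))
print(guess_media_type(b"\xff\xff\xff\xff\x10\x00\x00\x00"))
```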


[jira] [Created] (ARROW-7287) [Javascript] Ensure Javascript implementation implements clarified dictionary spec

2019-12-01 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-7287:
--

 Summary: [Javascript] Ensure Javascript implementation implements 
clarified dictionary spec
 Key: ARROW-7287
 URL: https://issues.apache.org/jira/browse/ARROW-7287
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: JavaScript
Reporter: Micah Kornfield


See parent issue for description.





[jira] [Created] (ARROW-7286) [Go] Ensure go implementation implements clarified dictionary spec

2019-12-01 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-7286:
--

 Summary: [Go] Ensure go implementation implements clarified 
dictionary spec
 Key: ARROW-7286
 URL: https://issues.apache.org/jira/browse/ARROW-7286
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Go
Reporter: Micah Kornfield


See parent issue for description.





[jira] [Created] (ARROW-7284) [Java] ensure java implementation meets clarified dictionary spec

2019-12-01 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-7284:
--

 Summary: [Java] ensure java implementation meets clarified 
dictionary spec
 Key: ARROW-7284
 URL: https://issues.apache.org/jira/browse/ARROW-7284
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Java
Reporter: Micah Kornfield
 Fix For: 1.0.0


see parent issue.

 

CC [~tianchen92]





[jira] [Created] (ARROW-7285) [C++] ensure C++ implementation meets clarified dictionary spec

2019-12-01 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-7285:
--

 Summary: [C++] ensure C++ implementation meets clarified 
dictionary spec
 Key: ARROW-7285
 URL: https://issues.apache.org/jira/browse/ARROW-7285
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Java
Reporter: Micah Kornfield
 Fix For: 1.0.0


see parent issue.

 

CC [~tianchen92]





[jira] [Created] (ARROW-7283) Ensure dictionary IPC implementations match spec clarifications

2019-12-01 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-7283:
--

 Summary: Ensure dictionary IPC implementations match spec 
clarifications
 Key: ARROW-7283
 URL: https://issues.apache.org/jira/browse/ARROW-7283
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Micah Kornfield


Parent tracking issue to ensure the clarifications in PR 
[https://github.com/apache/arrow/pull/5585#pullrequestreview-324979419] are 
correctly implemented.  

 

Specifically:

1.  Dictionary replacement in streams.

2.  Not requiring dictionaries to be present at the beginning of the stream for 
all-null columns.

3.  Dictionary replacement isn't supported in the file format.

 

Some implementations might already have some or all of these.
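
As an illustration of the stream-side behaviour listed above, here is a minimal sketch of a reader-side dictionary tracker. It is written in Python with hypothetical names and plain lists instead of Arrow arrays. It assumes that a delta batch for an ID that has not been seen yet is an error; the issue text does not say so.

```python
from typing import Dict, List, Optional


class DictionaryTracker:
    """Hypothetical reader state for dictionaries in an IPC stream."""

    def __init__(self) -> None:
        self.dictionaries: Dict[int, List[str]] = {}

    def on_dictionary_batch(self, dict_id: int, values: List[str], is_delta: bool) -> None:
        if is_delta:
            if dict_id not in self.dictionaries:
                # Assumption of this sketch: a delta needs an existing dictionary.
                raise ValueError(f"delta batch for unknown dictionary {dict_id}")
            self.dictionaries[dict_id].extend(values)
        else:
            # The first non-delta batch defines the dictionary; a later one replaces it.
            self.dictionaries[dict_id] = list(values)

    def decode(self, dict_id: int, indices: List[Optional[int]]) -> List[Optional[str]]:
        # An all-null column may arrive before its dictionary has been sent.
        if all(i is None for i in indices):
            return [None] * len(indices)
        if dict_id not in self.dictionaries:
            raise KeyError(f"dictionary {dict_id} has not been received yet")
        values = self.dictionaries[dict_id]
        return [None if i is None else values[i] for i in indices]


tracker = DictionaryTracker()
early = tracker.decode(0, [None, None])       # dictionary not sent yet
tracker.on_dictionary_batch(0, ["a", "b"], is_delta=False)
tracker.on_dictionary_batch(0, ["c"], is_delta=True)
after_delta = tracker.decode(0, [2, None, 0])
tracker.on_dictionary_batch(0, ["x"], is_delta=False)  # replacement
after_replace = tracker.decode(0, [0])
print(early, after_delta, after_replace)
```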





Re: Issues with installation on docker / R

2019-12-01 Thread Wes McKinney
Building Thrift from source requires the "flex" and "bison" packages,
which you need to install with apt-get. This is mentioned here:

http://arrow.apache.org/docs/developers/cpp.html#apache-parquet-development
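
A quick way to confirm the prerequisite before re-running the build is to look the tools up on PATH. The helper below is a small sketch; its name is made up for this example, and it only checks that the executables exist, not their versions.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["flex", "bison"])
if missing:
    print("Install before building Thrift: apt-get install -y " + " ".join(missing))
else:
    print("flex and bison are available")
```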

On Sun, Dec 1, 2019 at 7:09 PM Christian Klar  wrote:
>
> Okay I ran it. With the additional tag the make install errored out.
>
> Here's the link to the log. Please let me know if you need further data.
>
> https://gist.github.com/klarchristian/272d0a9bffa03d4ff2265e4e181a26ef
>
> Also one more point: The arrow-master folder is in a shared volume. I don't 
> think that should be an issue but just wanted to point it out (first line in 
> the log).
>
>
>
>
>
> -Original Message-
> From: Wes McKinney [mailto:wesmck...@gmail.com]
> Sent: Sunday, December 1, 2019 7:48 PM
> To: dev
> Cc: Anthony Abate; Jason De Biasio
> Subject: Re: Issues with installation on docker / R
>
> Can you also pass -DARROW_VERBOSE_THIRDPARTY_BUILD=ON ? That will show
> us what's wrong with the Thrift build in this environment (I
> tried locally on my Ubuntu 18.04 machine -- without the influence of
> conda or any other library toolchains -- and it works fine). You can
> upload the whole logs on https://gist.github.com/ if you want
>
> On Sun, Dec 1, 2019 at 6:02 PM Christian Klar  wrote:
> >
> > No luck with cmake -DARROW_PARQUET=ON either. Getting another error. Please 
> > see below the log from the docker session.
> >
> > root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake -DARROW_PARQUET=ON 
> > ..
> > -- Building using CMake version: 3.7.2
> > -- The C compiler identification is GNU 6.3.0
> > -- The CXX compiler identification is GNU 6.3.0
> > -- Check for working C compiler: /usr/bin/cc
> > -- Check for working C compiler: /usr/bin/cc -- works
> > -- Detecting C compiler ABI info
> > -- Detecting C compiler ABI info - done
> > -- Detecting C compile features
> > -- Detecting C compile features - done
> > -- Check for working CXX compiler: /usr/bin/c++
> > -- Check for working CXX compiler: /usr/bin/c++ -- works
> > -- Detecting CXX compiler ABI info
> > -- Detecting CXX compiler ABI info - done
> > -- Detecting CXX compile features
> > -- Detecting CXX compile features - done
> > -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> > -- Arrow SO version: 100 (full: 100.0.0)
> > -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> > -- clang-tidy not found
> > -- clang-format not found
> > -- infer not found
> > -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> > -- Found cpplint executable at 
> > /hello/arrow-master/cpp/build-support/cpplint.py
> > -- Performing Test CXX_SUPPORTS_SSE4_2
> > -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> > -- Performing Test CXX_SUPPORTS_ALTIVEC
> > -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> > -- Performing Test CXX_SUPPORTS_ARMCRC
> > -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> > -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> > -- Arrow build warning level: PRODUCTION
> > Using ld linker
> > Configured for RELEASE build (set with cmake 
> > -DCMAKE_BUILD_TYPE={release,debug,...})
> > -- Build Type: RELEASE
> > -- Using AUTO approach to find dependencies
> > -- AWSSDK_VERSION: 1.7.160
> > -- BOOST_VERSION: 1.67.0
> > -- BROTLI_VERSION: v1.0.7
> > -- BZIP2_VERSION: 1.0.8
> > -- CARES_VERSION: 1.15.0
> > -- GBENCHMARK_VERSION: v1.5.0
> > -- GFLAGS_VERSION: v2.2.0
> > -- GLOG_VERSION: v0.3.5
> > -- GRPC_VERSION: v1.24.3
> > -- GTEST_VERSION: 1.8.1
> > -- JEMALLOC_VERSION: 5.2.1
> > -- LZ4_VERSION: v1.9.2
> > -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> > -- ORC_VERSION: 1.5.7
> > -- PROTOBUF_VERSION: v3.7.1
> > -- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
> > -- RE2_VERSION: 2019-08-01
> > -- SNAPPY_VERSION: 1.1.7
> > -- THRIFT_VERSION: 0.12.0
> > -- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
> > -- ZLIB_VERSION: 1.2.11
> > -- ZSTD_VERSION: v1.4.3
> > -- Looking for pthread.h
> > -- Looking for pthread.h - found
> > -- Looking for pthread_create
> > -- Looking for pthread_create - not found
> > -- Check if compiler accepts -pthread
> > -- Check if compiler accepts -pthread - yes
> > -- Found Threads: TRUE
> > -- Boost version: 1.62.0
> > -- Found the following Boost libraries:
> > --   regex
> > --   system
> > --   filesystem
> > -- Boost include dir: /usr/include
> > -- Boost libraries: Boost::system;Boost::filesystem
> > -- Building without OpenSSL support. Minimum OpenSSL version 1.0.2 required.
> > -- Checking for module 'thrift'
> > --   No package 'thrift' found
> > -- Could NOT find Thrift (missing:  THRIFT_STATIC_LIB THRIFT_INCLUDE_DIR 
> > THRIFT_COMPILER)
> > Building Apache Thrift from source
> > /hello/arrow-master/cpp/build-support/get_apache_mirror.py:46: 
> > RuntimeWarning: Failed loading 
> > 'https://www.apache.org/dyn/closer.cgi?as_json=1':  > CERTIFICATE_VERIFY_FAILED] certificate verify failed 

RE: Issues with installation on docker / R

2019-12-01 Thread Christian Klar
Okay, I ran it. With the additional flag, make install errored out.

Here's the link to the log. Please let me know if you need further data.

https://gist.github.com/klarchristian/272d0a9bffa03d4ff2265e4e181a26ef

Also one more point: The arrow-master folder is in a shared volume. I don't 
think that should be an issue but just wanted to point it out (first line in 
the log).





-Original Message-
From: Wes McKinney [mailto:wesmck...@gmail.com] 
Sent: Sunday, December 1, 2019 7:48 PM
To: dev
Cc: Anthony Abate; Jason De Biasio
Subject: Re: Issues with installation on docker / R

Can you also pass -DARROW_VERBOSE_THIRDPARTY_BUILD=ON? That will show
us what's wrong with the Thrift build in this environment (I
tried locally on my Ubuntu 18.04 machine -- without the influence of
conda or any other library toolchains -- and it works fine). You can
upload the whole logs on https://gist.github.com/ if you want.

On Sun, Dec 1, 2019 at 6:02 PM Christian Klar  wrote:
>
> No luck with cmake -DARROW_PARQUET=ON either. Getting another error. Please 
> see below the log from the docker session.
>
> root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake -DARROW_PARQUET=ON ..
> -- Building using CMake version: 3.7.2
> -- The C compiler identification is GNU 6.3.0
> -- The CXX compiler identification is GNU 6.3.0
> -- Check for working C compiler: /usr/bin/cc
> -- Check for working C compiler: /usr/bin/cc -- works
> -- Detecting C compiler ABI info
> -- Detecting C compiler ABI info - done
> -- Detecting C compile features
> -- Detecting C compile features - done
> -- Check for working CXX compiler: /usr/bin/c++
> -- Check for working CXX compiler: /usr/bin/c++ -- works
> -- Detecting CXX compiler ABI info
> -- Detecting CXX compiler ABI info - done
> -- Detecting CXX compile features
> -- Detecting CXX compile features - done
> -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> -- Arrow SO version: 100 (full: 100.0.0)
> -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> -- clang-tidy not found
> -- clang-format not found
> -- infer not found
> -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> -- Found cpplint executable at 
> /hello/arrow-master/cpp/build-support/cpplint.py
> -- Performing Test CXX_SUPPORTS_SSE4_2
> -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> -- Performing Test CXX_SUPPORTS_ALTIVEC
> -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> -- Performing Test CXX_SUPPORTS_ARMCRC
> -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> -- Arrow build warning level: PRODUCTION
> Using ld linker
> Configured for RELEASE build (set with cmake 
> -DCMAKE_BUILD_TYPE={release,debug,...})
> -- Build Type: RELEASE
> -- Using AUTO approach to find dependencies
> -- AWSSDK_VERSION: 1.7.160
> -- BOOST_VERSION: 1.67.0
> -- BROTLI_VERSION: v1.0.7
> -- BZIP2_VERSION: 1.0.8
> -- CARES_VERSION: 1.15.0
> -- GBENCHMARK_VERSION: v1.5.0
> -- GFLAGS_VERSION: v2.2.0
> -- GLOG_VERSION: v0.3.5
> -- GRPC_VERSION: v1.24.3
> -- GTEST_VERSION: 1.8.1
> -- JEMALLOC_VERSION: 5.2.1
> -- LZ4_VERSION: v1.9.2
> -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> -- ORC_VERSION: 1.5.7
> -- PROTOBUF_VERSION: v3.7.1
> -- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
> -- RE2_VERSION: 2019-08-01
> -- SNAPPY_VERSION: 1.1.7
> -- THRIFT_VERSION: 0.12.0
> -- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
> -- ZLIB_VERSION: 1.2.11
> -- ZSTD_VERSION: v1.4.3
> -- Looking for pthread.h
> -- Looking for pthread.h - found
> -- Looking for pthread_create
> -- Looking for pthread_create - not found
> -- Check if compiler accepts -pthread
> -- Check if compiler accepts -pthread - yes
> -- Found Threads: TRUE
> -- Boost version: 1.62.0
> -- Found the following Boost libraries:
> --   regex
> --   system
> --   filesystem
> -- Boost include dir: /usr/include
> -- Boost libraries: Boost::system;Boost::filesystem
> -- Building without OpenSSL support. Minimum OpenSSL version 1.0.2 required.
> -- Checking for module 'thrift'
> --   No package 'thrift' found
> -- Could NOT find Thrift (missing:  THRIFT_STATIC_LIB THRIFT_INCLUDE_DIR 
> THRIFT_COMPILER)
> Building Apache Thrift from source
> /hello/arrow-master/cpp/build-support/get_apache_mirror.py:46: 
> RuntimeWarning: Failed loading 
> 'https://www.apache.org/dyn/closer.cgi?as_json=1': <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)>
>   RuntimeWarning)
> Downloading Apache Thrift from 
> http://apache.osuosl.org//thrift/0.12.0/thrift-0.12.0.tar.gz
> -- Building (vendored) jemalloc from source
> -- Could NOT find RapidJSONAlt (missing:  RAPIDJSON_INCLUDE_DIR) (found 
> suitable version "2bbd33b33217ff4a73434ebf10cdac41e2ef5e34", minimum required 
> is "1.1.0")
> -- Building rapidjson from source
> -- Found hdfs.h at: /hello/arrow-master/cpp/thirdparty/hadoop/include/hdfs.h
> -- CMAKE_C_FLAGS:  -O3 -DNDEBUG  

Re: Issues with installation on docker / R

2019-12-01 Thread Wes McKinney
Can you also pass -DARROW_VERBOSE_THIRDPARTY_BUILD=ON? That will show
us what's wrong with the Thrift build in this environment (I
tried locally on my Ubuntu 18.04 machine -- without the influence of
conda or any other library toolchains -- and it works fine). You can
upload the whole logs on https://gist.github.com/ if you want.

On Sun, Dec 1, 2019 at 6:02 PM Christian Klar  wrote:
>
> No luck with cmake -DARROW_PARQUET=ON either. Getting another error. Please 
> see below the log from the docker session.
>
> root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake -DARROW_PARQUET=ON ..
> -- Building using CMake version: 3.7.2
> -- The C compiler identification is GNU 6.3.0
> -- The CXX compiler identification is GNU 6.3.0
> -- Check for working C compiler: /usr/bin/cc
> -- Check for working C compiler: /usr/bin/cc -- works
> -- Detecting C compiler ABI info
> -- Detecting C compiler ABI info - done
> -- Detecting C compile features
> -- Detecting C compile features - done
> -- Check for working CXX compiler: /usr/bin/c++
> -- Check for working CXX compiler: /usr/bin/c++ -- works
> -- Detecting CXX compiler ABI info
> -- Detecting CXX compiler ABI info - done
> -- Detecting CXX compile features
> -- Detecting CXX compile features - done
> -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> -- Arrow SO version: 100 (full: 100.0.0)
> -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> -- clang-tidy not found
> -- clang-format not found
> -- infer not found
> -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> -- Found cpplint executable at 
> /hello/arrow-master/cpp/build-support/cpplint.py
> -- Performing Test CXX_SUPPORTS_SSE4_2
> -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> -- Performing Test CXX_SUPPORTS_ALTIVEC
> -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> -- Performing Test CXX_SUPPORTS_ARMCRC
> -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> -- Arrow build warning level: PRODUCTION
> Using ld linker
> Configured for RELEASE build (set with cmake 
> -DCMAKE_BUILD_TYPE={release,debug,...})
> -- Build Type: RELEASE
> -- Using AUTO approach to find dependencies
> -- AWSSDK_VERSION: 1.7.160
> -- BOOST_VERSION: 1.67.0
> -- BROTLI_VERSION: v1.0.7
> -- BZIP2_VERSION: 1.0.8
> -- CARES_VERSION: 1.15.0
> -- GBENCHMARK_VERSION: v1.5.0
> -- GFLAGS_VERSION: v2.2.0
> -- GLOG_VERSION: v0.3.5
> -- GRPC_VERSION: v1.24.3
> -- GTEST_VERSION: 1.8.1
> -- JEMALLOC_VERSION: 5.2.1
> -- LZ4_VERSION: v1.9.2
> -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> -- ORC_VERSION: 1.5.7
> -- PROTOBUF_VERSION: v3.7.1
> -- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
> -- RE2_VERSION: 2019-08-01
> -- SNAPPY_VERSION: 1.1.7
> -- THRIFT_VERSION: 0.12.0
> -- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
> -- ZLIB_VERSION: 1.2.11
> -- ZSTD_VERSION: v1.4.3
> -- Looking for pthread.h
> -- Looking for pthread.h - found
> -- Looking for pthread_create
> -- Looking for pthread_create - not found
> -- Check if compiler accepts -pthread
> -- Check if compiler accepts -pthread - yes
> -- Found Threads: TRUE
> -- Boost version: 1.62.0
> -- Found the following Boost libraries:
> --   regex
> --   system
> --   filesystem
> -- Boost include dir: /usr/include
> -- Boost libraries: Boost::system;Boost::filesystem
> -- Building without OpenSSL support. Minimum OpenSSL version 1.0.2 required.
> -- Checking for module 'thrift'
> --   No package 'thrift' found
> -- Could NOT find Thrift (missing:  THRIFT_STATIC_LIB THRIFT_INCLUDE_DIR 
> THRIFT_COMPILER)
> Building Apache Thrift from source
> /hello/arrow-master/cpp/build-support/get_apache_mirror.py:46: 
> RuntimeWarning: Failed loading 
> 'https://www.apache.org/dyn/closer.cgi?as_json=1': <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)>
>   RuntimeWarning)
> Downloading Apache Thrift from 
> http://apache.osuosl.org//thrift/0.12.0/thrift-0.12.0.tar.gz
> -- Building (vendored) jemalloc from source
> -- Could NOT find RapidJSONAlt (missing:  RAPIDJSON_INCLUDE_DIR) (found 
> suitable version "2bbd33b33217ff4a73434ebf10cdac41e2ef5e34", minimum required 
> is "1.1.0")
> -- Building rapidjson from source
> -- Found hdfs.h at: /hello/arrow-master/cpp/thirdparty/hadoop/include/hdfs.h
> -- CMAKE_C_FLAGS:  -O3 -DNDEBUG   -Wall -msse4.2
> -- CMAKE_CXX_FLAGS:  -Wno-noexcept-type  -fdiagnostics-color=always -O3 
> -DNDEBUG  -Wall -msse4.2
> -- Looking for backtrace
> -- Looking for backtrace - found
> -- backtrace facility detected in default set of libraries
> -- Found Backtrace: /usr/include
> -- -
> -- Arrow version: 1.0.0-SNAPSHOT
> --
> -- Build configuration summary:
> --   Generator: Unix Makefiles
> --   Build type: RELEASE
> --   Source directory: /hello/arrow-master/cpp
> --   Install prefix: 

RE: Issues with installation on docker / R

2019-12-01 Thread Christian Klar
No luck with cmake -DARROW_PARQUET=ON either. Getting another error. Please see 
below the log from the docker session.

root@873cfc2bc7b1:/hello/arrow-master/cpp/release# cmake -DARROW_PARQUET=ON ..
-- Building using CMake version: 3.7.2
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
-- Arrow SO version: 100 (full: 100.0.0)
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29") 
-- clang-tidy not found
-- clang-format not found
-- infer not found
-- Found PythonInterp: /usr/bin/python (found version "2.7.13") 
-- Found cpplint executable at /hello/arrow-master/cpp/build-support/cpplint.py
-- Performing Test CXX_SUPPORTS_SSE4_2
-- Performing Test CXX_SUPPORTS_SSE4_2 - Success
-- Performing Test CXX_SUPPORTS_ALTIVEC
-- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
-- Performing Test CXX_SUPPORTS_ARMCRC
-- Performing Test CXX_SUPPORTS_ARMCRC - Failed
-- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
-- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
-- Arrow build warning level: PRODUCTION
Using ld linker
Configured for RELEASE build (set with cmake 
-DCMAKE_BUILD_TYPE={release,debug,...})
-- Build Type: RELEASE
-- Using AUTO approach to find dependencies
-- AWSSDK_VERSION: 1.7.160
-- BOOST_VERSION: 1.67.0
-- BROTLI_VERSION: v1.0.7
-- BZIP2_VERSION: 1.0.8
-- CARES_VERSION: 1.15.0
-- GBENCHMARK_VERSION: v1.5.0
-- GFLAGS_VERSION: v2.2.0
-- GLOG_VERSION: v0.3.5
-- GRPC_VERSION: v1.24.3
-- GTEST_VERSION: 1.8.1
-- JEMALLOC_VERSION: 5.2.1
-- LZ4_VERSION: v1.9.2
-- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
-- ORC_VERSION: 1.5.7
-- PROTOBUF_VERSION: v3.7.1
-- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
-- RE2_VERSION: 2019-08-01
-- SNAPPY_VERSION: 1.1.7
-- THRIFT_VERSION: 0.12.0
-- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
-- ZLIB_VERSION: 1.2.11
-- ZSTD_VERSION: v1.4.3
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- Boost version: 1.62.0
-- Found the following Boost libraries:
--   regex
--   system
--   filesystem
-- Boost include dir: /usr/include
-- Boost libraries: Boost::system;Boost::filesystem
-- Building without OpenSSL support. Minimum OpenSSL version 1.0.2 required.
-- Checking for module 'thrift'
--   No package 'thrift' found
-- Could NOT find Thrift (missing:  THRIFT_STATIC_LIB THRIFT_INCLUDE_DIR 
THRIFT_COMPILER) 
Building Apache Thrift from source
/hello/arrow-master/cpp/build-support/get_apache_mirror.py:46: RuntimeWarning: 
Failed loading 'https://www.apache.org/dyn/closer.cgi?as_json=1': <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)>
  RuntimeWarning)
Downloading Apache Thrift from 
http://apache.osuosl.org//thrift/0.12.0/thrift-0.12.0.tar.gz
-- Building (vendored) jemalloc from source
-- Could NOT find RapidJSONAlt (missing:  RAPIDJSON_INCLUDE_DIR) (found 
suitable version "2bbd33b33217ff4a73434ebf10cdac41e2ef5e34", minimum required 
is "1.1.0")
-- Building rapidjson from source
-- Found hdfs.h at: /hello/arrow-master/cpp/thirdparty/hadoop/include/hdfs.h
-- CMAKE_C_FLAGS:  -O3 -DNDEBUG   -Wall -msse4.2
-- CMAKE_CXX_FLAGS:  -Wno-noexcept-type  -fdiagnostics-color=always -O3 
-DNDEBUG  -Wall -msse4.2 
-- Looking for backtrace
-- Looking for backtrace - found
-- backtrace facility detected in default set of libraries
-- Found Backtrace: /usr/include  
-- -
-- Arrow version: 1.0.0-SNAPSHOT
-- 
-- Build configuration summary:
--   Generator: Unix Makefiles
--   Build type: RELEASE
--   Source directory: /hello/arrow-master/cpp
--   Install prefix: /usr/local
-- 
-- Compile and link options:
--   Compiler flags to append when compiling Arrow  ""  
  [default] [ARROW_CXXFLAGS]
--   Build static libraries ON  
  [default] [ARROW_BUILD_STATIC]
--   Build shared libraries ON  
  [default] [ARROW_BUILD_SHARED]
--   Exclude deprecated APIs from build OFF 
  [default] [ARROW_NO_DEPRECATED_API]
--   Use ccache when compiling (if available)   ON  
  [default] [ARROW_USE_CCACHE]
--   Use ld.gold for linking on Linux (if available)  

RE: Issues with installation on docker / R

2019-12-01 Thread Christian Klar
Thanks!

When running "make install" it finishes fine too, but then the R installation 
errors out. See below.

Are there any other libraries I need to install?

(Would potentially "cmake -DARROW_PARQUET=ON" solve this?)



Here is the log.

> install.packages('arrow')
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 
'https://mran.microsoft.com/snapshot/2019-11-17/src/contrib/arrow_0.15.1.1.tar.gz'
Content type 'application/octet-stream' length 147277 bytes (143 KB)
==
downloaded 143 KB

* installing *source* package ‘arrow’ ...
** package ‘arrow’ successfully unpacked and MD5 sums checked
** using staged installation
PKG_CFLAGS= -DARROW_R_WITH_ARROW
PKG_LIBS=-larrow -lparquet
** libs
g++ -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -DARROW_R_WITH_ARROW 
-I"/usr/local/lib/R/site-library/Rcpp/include" -I/usr/local/include  -fpic  -g 
-O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time 
-D_FORTIFY_SOURCE=2 -g  -c array.cpp -o array.o
In file included from array.cpp:18:0:
./arrow_types.h:198:34: fatal error: parquet/arrow/reader.h: No such file or 
directory
 #include <parquet/arrow/reader.h>
                                  ^
compilation terminated.
/usr/local/lib/R/etc/Makeconf:176: recipe for target 'array.o' failed
make: *** [array.o] Error 1
ERROR: compilation failed for package ‘arrow’
* removing ‘/usr/local/lib/R/site-library/arrow’
* restoring previous ‘/usr/local/lib/R/site-library/arrow’

The downloaded source packages are in
‘/tmp/Rtmp2POLjt/downloaded_packages’
Warning message:
In install.packages("arrow") :
  installation of package ‘arrow’ had non-zero exit status



Here is the Docker image again.

FROM rocker/tidyverse
MAINTAINER Christian Klar 
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get install -y -V \
 build-essential \
 cmake \
 libboost-filesystem-dev \
 libboost-regex-dev \
 libboost-system-dev
RUN apt update && \
  apt install -y -V \
autoconf-archive \
gtk-doc-tools \
libgirepository1.0-dev \
libglib2.0-doc \
libtool \
pkg-config && \
  apt clean && \
  rm -rf /var/lib/apt/lists/*
RUN R -e "install.packages('remotes', repos = c(CRAN = 'http://cran.us.r-project.org'))"

-Original Message-
From: Wes McKinney [mailto:wesmck...@gmail.com] 
Sent: Sunday, December 1, 2019 6:34 PM
To: dev@arrow.apache.org
Cc: Anthony Abate; Jason De Biasio
Subject: Re: Issues with installation on docker / R

That is a “minimal release build” but it does not install the results. Try
“make install” instead of just “make”. We can improve the documentation to
clarify this issue

On Sun, Dec 1, 2019 at 5:22 PM Christian Klar  wrote:

>
> Thanks for the quick response!
>
> I attached the log to my first email, but maybe it didn't get through
> because of the mailing list. Copying it below the
> "#"-line.
>
> I'm using the commands from the "Minimal release build" on
> https://arrow.apache.org/docs/developers/cpp.html.
>
> cd arrow/cpp
> mkdir release
> cd release
> cmake ..
> make
>
> The corresponding line below for the last make is here copied separately.
>
> "root@66311c2b2dc5:/hello/arrow-master/cpp/release# make"
>
> Please let me know if that's not the one you meant.
>
> #
>
> cklar@nyc-poly-tci-03:~/Desktop/Docker/arrow$ docker run -v
> /home/cklar/Desktop/tmpshare:/hello -it arrowtest /bin/bash
> root@66311c2b2dc5:/# cd hello
> root@66311c2b2dc5:/hello# cd hell^C
> root@66311c2b2dc5:/hello# ls
> arrow-master  arrow-master.zip  CLOPROCESS_TrancheHistory_Expense.arrow
> root@66311c2b2dc5:/hello# cd arrow-master
> root@66311c2b2dc5:/hello/arrow-master# ls
> appveyor.yml  cmake-format.py     csharp              format       java             matlab      README.md            testing
> c_glib        CODE_OF_CONDUCT.md  dev                 go           js               NOTICE.txt  ruby
> CHANGELOG.md  CONTRIBUTING.md     docker-compose.yml  header       LICENSE.txt      python      run-cmake-format.py
> ci            cpp                 docs                integration  Makefile.docker  r           rust
> root@66311c2b2dc5:/hello/arrow-master# cd cpp
> root@66311c2b2dc5:/hello/arrow-master/cpp# mkdir release
> root@66311c2b2dc5:/hello/arrow-master/cpp# cd release
> root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake
> Usage
>
>   cmake [options] <path-to-source>
>   cmake [options] <path-to-existing-build>
>
> Specify a source directory to (re-)generate a build system for it in the
> current working directory.  Specify an existing build directory to
> re-generate its build system.
>
> Run 'cmake --help' for more information.
>
> root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake ..
> -- Building using CMake version: 3.7.2
> -- The C compiler identification is GNU 6.3.0
> -- The CXX compiler identification is GNU 6.3.0
> -- Check for working C compiler: /usr/bin/cc
> -- Check for working C compiler: /usr/bin/cc -- works
> -- 

Re: Issues with installation on docker / R

2019-12-01 Thread Wes McKinney
That is a “minimal release build” but it does not install the results. Try
“make install” instead of just “make”. We can improve the documentation to
clarify this issue

On Sun, Dec 1, 2019 at 5:22 PM Christian Klar  wrote:

>
> Thanks for the quick response!
>
> I attached the log to my first email, but maybe it didn't get through
> because of the mailing list. Copying it below the
> "#"-line.
>
> I'm using the commands from the "Minimal release build" on
> https://arrow.apache.org/docs/developers/cpp.html.
>
> cd arrow/cpp
> mkdir release
> cd release
> cmake ..
> make
>
> The corresponding line below for the last make is here copied separately.
>
> "root@66311c2b2dc5:/hello/arrow-master/cpp/release# make"
>
> Please let me know if that's not the one you meant.
>
> #
>
> cklar@nyc-poly-tci-03:~/Desktop/Docker/arrow$ docker run -v
> /home/cklar/Desktop/tmpshare:/hello -it arrowtest /bin/bash
> root@66311c2b2dc5:/# cd hello
> root@66311c2b2dc5:/hello# cd hell^C
> root@66311c2b2dc5:/hello# ls
> arrow-master  arrow-master.zip  CLOPROCESS_TrancheHistory_Expense.arrow
> root@66311c2b2dc5:/hello# cd arrow-master
> root@66311c2b2dc5:/hello/arrow-master# ls
> appveyor.yml  cmake-format.py     csharp              format       java             matlab      README.md            testing
> c_glib        CODE_OF_CONDUCT.md  dev                 go           js               NOTICE.txt  ruby
> CHANGELOG.md  CONTRIBUTING.md     docker-compose.yml  header       LICENSE.txt      python      run-cmake-format.py
> ci            cpp                 docs                integration  Makefile.docker  r           rust
> root@66311c2b2dc5:/hello/arrow-master# cd cpp
> root@66311c2b2dc5:/hello/arrow-master/cpp# mkdir release
> root@66311c2b2dc5:/hello/arrow-master/cpp# cd release
> root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake
> Usage
>
>   cmake [options] <path-to-source>
>   cmake [options] <path-to-existing-build>
>
> Specify a source directory to (re-)generate a build system for it in the
> current working directory.  Specify an existing build directory to
> re-generate its build system.
>
> Run 'cmake --help' for more information.
>
> root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake ..
> -- Building using CMake version: 3.7.2
> -- The C compiler identification is GNU 6.3.0
> -- The CXX compiler identification is GNU 6.3.0
> -- Check for working C compiler: /usr/bin/cc
> -- Check for working C compiler: /usr/bin/cc -- works
> -- Detecting C compiler ABI info
> -- Detecting C compiler ABI info - done
> -- Detecting C compile features
> -- Detecting C compile features - done
> -- Check for working CXX compiler: /usr/bin/c++
> -- Check for working CXX compiler: /usr/bin/c++ -- works
> -- Detecting CXX compiler ABI info
> -- Detecting CXX compiler ABI info - done
> -- Detecting CXX compile features
> -- Detecting CXX compile features - done
> -- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
> -- Arrow SO version: 100 (full: 100.0.0)
> -- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
> -- clang-tidy not found
> -- clang-format not found
> -- infer not found
> -- Found PythonInterp: /usr/bin/python (found version "2.7.13")
> -- Found cpplint executable at
> /hello/arrow-master/cpp/build-support/cpplint.py
> -- Performing Test CXX_SUPPORTS_SSE4_2
> -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
> -- Performing Test CXX_SUPPORTS_ALTIVEC
> -- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
> -- Performing Test CXX_SUPPORTS_ARMCRC
> -- Performing Test CXX_SUPPORTS_ARMCRC - Failed
> -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
> -- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
> -- Arrow build warning level: PRODUCTION
> Using ld linker
> Configured for RELEASE build (set with cmake
> -DCMAKE_BUILD_TYPE={release,debug,...})
> -- Build Type: RELEASE
> -- Using AUTO approach to find dependencies
> -- AWSSDK_VERSION: 1.7.160
> -- BOOST_VERSION: 1.67.0
> -- BROTLI_VERSION: v1.0.7
> -- BZIP2_VERSION: 1.0.8
> -- CARES_VERSION: 1.15.0
> -- GBENCHMARK_VERSION: v1.5.0
> -- GFLAGS_VERSION: v2.2.0
> -- GLOG_VERSION: v0.3.5
> -- GRPC_VERSION: v1.24.3
> -- GTEST_VERSION: 1.8.1
> -- JEMALLOC_VERSION: 5.2.1
> -- LZ4_VERSION: v1.9.2
> -- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
> -- ORC_VERSION: 1.5.7
> -- PROTOBUF_VERSION: v3.7.1
> -- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
> -- RE2_VERSION: 2019-08-01
> -- SNAPPY_VERSION: 1.1.7
> -- THRIFT_VERSION: 0.12.0
> -- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
> -- ZLIB_VERSION: 1.2.11
> -- ZSTD_VERSION: v1.4.3
> -- Looking for pthread.h
> -- Looking for pthread.h - found
> -- Looking for pthread_create
> -- Looking for pthread_create - not found
> -- Check if compiler accepts -pthread
> -- Check if compiler accepts -pthread - yes
> -- Found Threads: TRUE
> -- Boost version: 1.62.0
> -- Found the following Boost libraries:
> --   regex
> --   system
> --   filesystem
> -- Boost include dir: /usr/include
> -- Boost libraries: 

[C++] CSV string column category to dictionary/indices?

2019-12-01 Thread ntfs hard
Hello

I'm a newcomer and not quite sure about the library usage. I tried to find
some documentation about it but failed.

I have a dataset in a CSV file where one column (let's call it colour) is a
string category. I'd like to get integer indices instead of the text values so
that I can pass them to an algorithm.
I tried setting column_types in ConvertOptions to
{{"colour", arrow::dictionary(std::make_shared(),
arrow::utf8()) }} but that does not seem to be the right API usage; a run-time
error appears: NotImplemented: CSV conversion to dictionary is not supported
I also found a merged PR, #5785, but I'm
not sure it applies to my case.

So, my question is: can I get the indices of a category column using only the
library API? And if yes, what am I doing wrong? :)

*In other words,* I'd like to do something like this Python pandas code:
df[column] = df[column].cat.codes # if str(column_data_type) == "category"

Thank you!
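
For readers unfamiliar with the term, the index/dictionary split asked about here can be sketched in plain Python. This is only an illustration of the concept: `dictionary_encode` is a hypothetical helper, not an Arrow or pandas API, and it assigns codes in order of first appearance. pandas, as far as I know, sorts categories by default, so `cat.codes` would typically number the same values differently.

```python
def dictionary_encode(values):
    """Return (indices, dictionary) for a sequence of hashable values.

    Hypothetical helper for illustration only; codes are assigned in
    order of first appearance.
    """
    dictionary = []   # unique values, in first-seen order
    positions = {}    # value -> index into `dictionary`
    indices = []      # one integer code per input value
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


colours = ["red", "green", "red", "blue", "green"]
indices, dictionary = dictionary_encode(colours)
print(indices)     # [0, 1, 0, 2, 1]
print(dictionary)  # ['red', 'green', 'blue']
```

The `indices` list is what one would hand to an algorithm that expects integer category codes; `dictionary` maps each code back to its original string.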


RE: Issues with installation on docker / R

2019-12-01 Thread Christian Klar

Thanks for the quick response!

I attached the log to my first email, but maybe it didn't get through because of
the mailing list. Copying it below the "#"-line.

I'm using the commands from the "Minimal release build" on 
https://arrow.apache.org/docs/developers/cpp.html.

cd arrow/cpp
mkdir release
cd release
cmake ..
make

The corresponding line below for the last make is here copied separately.

"root@66311c2b2dc5:/hello/arrow-master/cpp/release# make"

Please let me know if that's not the one you meant.

#

cklar@nyc-poly-tci-03:~/Desktop/Docker/arrow$ docker run -v 
/home/cklar/Desktop/tmpshare:/hello -it arrowtest /bin/bash
root@66311c2b2dc5:/# cd hello
root@66311c2b2dc5:/hello# cd hell^C
root@66311c2b2dc5:/hello# ls
arrow-master  arrow-master.zip  CLOPROCESS_TrancheHistory_Expense.arrow
root@66311c2b2dc5:/hello# cd arrow-master
root@66311c2b2dc5:/hello/arrow-master# ls
appveyor.yml  cmake-format.py     csharp              format       java             matlab      README.md            testing
c_glib        CODE_OF_CONDUCT.md  dev                 go           js               NOTICE.txt  ruby
CHANGELOG.md  CONTRIBUTING.md     docker-compose.yml  header       LICENSE.txt      python      run-cmake-format.py
ci            cpp                 docs                integration  Makefile.docker  r           rust
root@66311c2b2dc5:/hello/arrow-master# cd cpp
root@66311c2b2dc5:/hello/arrow-master/cpp# mkdir release
root@66311c2b2dc5:/hello/arrow-master/cpp# cd release
root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake
Usage

  cmake [options] <path-to-source>
  cmake [options] <path-to-existing-build>

Specify a source directory to (re-)generate a build system for it in the
current working directory.  Specify an existing build directory to
re-generate its build system.

Run 'cmake --help' for more information.

root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake ..
-- Building using CMake version: 3.7.2
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
-- Arrow SO version: 100 (full: 100.0.0)
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29") 
-- clang-tidy not found
-- clang-format not found
-- infer not found
-- Found PythonInterp: /usr/bin/python (found version "2.7.13") 
-- Found cpplint executable at /hello/arrow-master/cpp/build-support/cpplint.py
-- Performing Test CXX_SUPPORTS_SSE4_2
-- Performing Test CXX_SUPPORTS_SSE4_2 - Success
-- Performing Test CXX_SUPPORTS_ALTIVEC
-- Performing Test CXX_SUPPORTS_ALTIVEC - Failed
-- Performing Test CXX_SUPPORTS_ARMCRC
-- Performing Test CXX_SUPPORTS_ARMCRC - Failed
-- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO
-- Performing Test CXX_SUPPORTS_ARMV8_CRC_CRYPTO - Failed
-- Arrow build warning level: PRODUCTION
Using ld linker
Configured for RELEASE build (set with cmake 
-DCMAKE_BUILD_TYPE={release,debug,...})
-- Build Type: RELEASE
-- Using AUTO approach to find dependencies
-- AWSSDK_VERSION: 1.7.160
-- BOOST_VERSION: 1.67.0
-- BROTLI_VERSION: v1.0.7
-- BZIP2_VERSION: 1.0.8
-- CARES_VERSION: 1.15.0
-- GBENCHMARK_VERSION: v1.5.0
-- GFLAGS_VERSION: v2.2.0
-- GLOG_VERSION: v0.3.5
-- GRPC_VERSION: v1.24.3
-- GTEST_VERSION: 1.8.1
-- JEMALLOC_VERSION: 5.2.1
-- LZ4_VERSION: v1.9.2
-- MIMALLOC_VERSION: 270e765454f98e8bab9d42609b153425f749fff6
-- ORC_VERSION: 1.5.7
-- PROTOBUF_VERSION: v3.7.1
-- RAPIDJSON_VERSION: 2bbd33b33217ff4a73434ebf10cdac41e2ef5e34
-- RE2_VERSION: 2019-08-01
-- SNAPPY_VERSION: 1.1.7
-- THRIFT_VERSION: 0.12.0
-- THRIFT_MD5_CHECKSUM: 3deebbb4d1ca77dd9c9e009a1ea02183
-- ZLIB_VERSION: 1.2.11
-- ZSTD_VERSION: v1.4.3
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- Boost version: 1.62.0
-- Found the following Boost libraries:
--   regex
--   system
--   filesystem
-- Boost include dir: /usr/include
-- Boost libraries: Boost::system;Boost::filesystem
-- Building without OpenSSL support. Minimum OpenSSL version 1.0.2 required.
-- Building (vendored) jemalloc from source
-- Could NOT find RapidJSONAlt (missing:  RAPIDJSON_INCLUDE_DIR) (found 
suitable version "2bbd33b33217ff4a73434ebf10cdac41e2ef5e34", minimum required 
is "1.1.0")
-- Building rapidjson from source
-- Found hdfs.h at: /hello/arrow-master/cpp/thirdparty/hadoop/include/hdfs.h
-- CMAKE_C_FLAGS:  -O3 

Re: Issues with installation on docker / R

2019-12-01 Thread Wes McKinney
Your instructions do not include “make install” when building the C++
library, which will cause this problem. Can you provide a full build log if
you are actually installing the C++ library and the R build cannot find it
for some reason?

Thanks

On Sun, Dec 1, 2019 at 4:55 PM Christian Klar  wrote:

> Hi –
>
>
>
> (Resending because I was not subscribed to the mailing list prior.)
>
>
>
> I’m trying to install arrow on a docker image with R on it but R gives me
> always the
>
>
>
> “error in io___ReadableFile__Open(clean_path_abs(path)) :
>
>   Cannot call io___ReadableFile__Open(). Please use arrow::install_arrow()
> to install required runtime libraries.”
>
>
>
> error.
>
>
>
>
>
> Below the details:
>
>
>
>
>
> 1.   The docker image.
>
> FROM rocker/tidyverse
> ENV DEBIAN_FRONTEND noninteractive
> RUN apt-get install -y -V \
>   build-essential \
>   cmake \
>   libboost-filesystem-dev \
>   libboost-regex-dev \
>   libboost-system-dev
> RUN apt update && \
>   apt install -y -V \
>     autoconf-archive \
>     gtk-doc-tools \
>     libgirepository1.0-dev \
>     libglib2.0-doc \
>     libtool \
>     pkg-config && \
>   apt clean && \
>   rm -rf /var/lib/apt/lists/*
>
> 2.   Then, within an interactive session (interactive so that I can debug
> more easily), I run the actual cmake (I downloaded the arrow package
> manually). This is straight from
> https://arrow.apache.org/docs/developers/cpp.html.
>
> cd arrow/cpp
> mkdir release
> cd release
> cmake ..
> make
>
> 3.   After step 2 finishes successfully, I start R and try
> “install.packages('arrow')”, which succeeds (builds it from source), but
> then I get the error above again.
>
> 4.   install_arrow() just directs me to the link in step 2.
>
> I have the feeling the R package installation can’t find the path to the
> arrow installation; however, I can’t find any environment variable that
> would handle this.
>
> Please let us know. At this point we have spent many hours trying to get it
> to work, but this has to be an issue other people have come across.
>
> I’m not using conda, so this (the only comparable thing I could find) is
> not applicable to me:
> https://github.com/apache/arrow/issues/4399
>
> Happy to provide further details. I also attached the full console log.
>
> Christian
>
> Christian Klar
> TFG Asset Management
> Tetragon Financial Management
> 399 Park Avenue, 22nd Floor | New York, NY 10022 | United States
> Direct: +1 212 359 7369 | Main: +1 212 359 7300 | Mobile: +1 607 216 5045
> ck...@tetragoninv.com
> www.tetragoninv.com
>
> This communication and all or some of the information contained therein
> may be confidential. If you have received this communication in error,
> please destroy all electronic and paper copies and notify the sender
> immediately. Unless specifically indicated, this communication is not a
> confirmation, an offer to sell or solicitation of any offer to buy any
> financial product, or an official statement of Tetragon Financial Group or
> its affiliates. TFG Asset Management L.P. and Tetragon Financial Management
> LP are registered as investment advisers under the U.S. Investment Advisers
> Act of 1940.
>


Issues with installation on docker / R

2019-12-01 Thread Christian Klar
Hi –

(Resending because I was not subscribed to the mailing list prior.)

I’m trying to install arrow on a docker image with R on it, but R always
gives me the error

“error in io___ReadableFile__Open(clean_path_abs(path)) :
  Cannot call io___ReadableFile__Open(). Please use arrow::install_arrow() to 
install required runtime libraries.”


Below are the details:



1.   The docker image.

FROM rocker/tidyverse
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get install -y -V \
  build-essential \
  cmake \
  libboost-filesystem-dev \
  libboost-regex-dev \
  libboost-system-dev
RUN apt update && \
  apt install -y -V \
    autoconf-archive \
    gtk-doc-tools \
    libgirepository1.0-dev \
    libglib2.0-doc \
    libtool \
    pkg-config && \
  apt clean && \
  rm -rf /var/lib/apt/lists/*



2.   Then, within an interactive session (interactive so that I can debug
more easily), I run the actual cmake (I downloaded the arrow package
manually). This is straight from
https://arrow.apache.org/docs/developers/cpp.html.

cd arrow/cpp
mkdir release
cd release
cmake ..
make



3.   After step 2 finishes successfully, I start R and try
“install.packages('arrow')”, which succeeds (builds it from source), but then
I get the error above again.



4.   install_arrow() just directs me to the link in step 2.



I have the feeling the R package installation can’t find the path to the
arrow installation; however, I can’t find any environment variable that would
handle this.

Please let us know. At this point we have spent many hours trying to get it to
work, but this has to be an issue other people have come across.

I’m not using conda, so this (the only comparable thing I could find) is
not applicable to me:
https://github.com/apache/arrow/issues/4399

Happy to provide further details. I also attached the full console log.

Christian



Christian Klar
TFG Asset Management
Tetragon Financial Management
399 Park Avenue, 22nd Floor | New York, NY 10022 | United States
Direct: +1 212 359 7369 | Main: +1 212 359 7300 | Mobile: +1 607 216 5045
ck...@tetragoninv.com
www.tetragoninv.com




cklar@nyc-poly-tci-03:~/Desktop/Docker/arrow$ docker run -v 
/home/cklar/Desktop/tmpshare:/hello -it arrowtest /bin/bash
root@66311c2b2dc5:/# cd hello
root@66311c2b2dc5:/hello# cd hell^C
root@66311c2b2dc5:/hello# ls
arrow-master  arrow-master.zip  CLOPROCESS_TrancheHistory_Expense.arrow
root@66311c2b2dc5:/hello# cd arrow-master
root@66311c2b2dc5:/hello/arrow-master# ls
appveyor.yml  cmake-format.py     csharp              format       java             matlab      README.md            testing
c_glib        CODE_OF_CONDUCT.md  dev                 go           js               NOTICE.txt  ruby
CHANGELOG.md  CONTRIBUTING.md     docker-compose.yml  header       LICENSE.txt      python      run-cmake-format.py
ci            cpp                 docs                integration  Makefile.docker  r           rust
root@66311c2b2dc5:/hello/arrow-master# cd cpp
root@66311c2b2dc5:/hello/arrow-master/cpp# mkdir release
root@66311c2b2dc5:/hello/arrow-master/cpp# cd release
root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake
Usage

  cmake [options] <path-to-source>
  cmake [options] <path-to-existing-build>

Specify a source directory to (re-)generate a build system for it in the
current working directory.  Specify an existing build directory to
re-generate its build system.

Run 'cmake --help' for more information.

root@66311c2b2dc5:/hello/arrow-master/cpp/release# cmake ..
-- Building using CMake version: 3.7.2
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Arrow version: 1.0.0 (full: '1.0.0-SNAPSHOT')
-- Arrow SO version: 100 (full: 100.0.0)
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29") 
-- clang-tidy not found
-- clang-format not found
-- infer not 

[jira] [Created] (ARROW-7282) [Python] IOError subclassing

2019-12-01 Thread Scott Gigante (Jira)
Scott Gigante created ARROW-7282:


 Summary: [Python] IOError subclassing
 Key: ARROW-7282
 URL: https://issues.apache.org/jira/browse/ARROW-7282
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.15.1
 Environment: Arch Linux, Python 3.7
Reporter: Scott Gigante


I get the following error when trying to open a file that does not exist.

 

```
pyarrow.lib.ArrowIOError: Failed to open local file 'filename', error: No such 
file or directory

```

 

Two issues here:
 # pyarrow.lib.ArrowIOError doesn't inherit from Python IOError
 # This particular error should also subclass from Python FileNotFoundError
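
To illustrate the two points above, here is a minimal sketch of the exception hierarchy being asked for. It does not use pyarrow at all: ArrowIOErrorSketch, ArrowFileNotFoundSketch and open_local_file are hypothetical names made up for this example, and the real pyarrow classes may be organised differently.

```python
import errno


class ArrowIOErrorSketch(IOError):
    """Hypothetical stand-in for pyarrow.lib.ArrowIOError (not the real class)."""


class ArrowFileNotFoundSketch(ArrowIOErrorSketch, FileNotFoundError):
    """Hypothetical error for the 'No such file or directory' case."""


def open_local_file(path):
    """Open a file, re-raising a missing-file error as the sketch subclass."""
    try:
        return open(path, "rb")
    except FileNotFoundError as exc:
        # Keep errno and filename so callers can still inspect them.
        raise ArrowFileNotFoundSketch(
            errno.ENOENT, "Failed to open local file", path
        ) from exc
```

With a hierarchy like this, existing code that catches IOError or FileNotFoundError would also catch the library-specific error, which is the behaviour the two items above ask for.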





[NIGHTLY] Arrow Build Report for Job nightly-2019-12-01-0

2019-12-01 Thread Crossbow


Arrow Build Report for Job nightly-2019-12-01-0

All tasks: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0

Failed Tasks:
- macos-r-autobrew:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-macos-r-autobrew
- test-debian-10-rust-nightly-2019-09-25:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-debian-10-rust-nightly-2019-09-25
- wheel-manylinux1-cp27m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux1-cp27m
- wheel-manylinux1-cp27mu:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux1-cp27mu
- wheel-manylinux1-cp35m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux1-cp35m
- wheel-manylinux1-cp36m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux1-cp36m
- wheel-manylinux1-cp37m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux1-cp37m
- wheel-manylinux2010-cp27m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux2010-cp27m
- wheel-manylinux2010-cp27mu:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux2010-cp27mu
- wheel-manylinux2010-cp35m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux2010-cp35m
- wheel-manylinux2010-cp36m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux2010-cp36m
- wheel-manylinux2010-cp37m:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-wheel-manylinux2010-cp37m

Succeeded Tasks:
- centos-6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-centos-6
- centos-7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-centos-7
- centos-8:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-centos-8
- conda-linux-gcc-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-linux-gcc-py27
- conda-linux-gcc-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-linux-gcc-py36
- conda-linux-gcc-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-linux-gcc-py37
- conda-osx-clang-py27:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-osx-clang-py27
- conda-osx-clang-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-osx-clang-py36
- conda-osx-clang-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-osx-clang-py37
- conda-win-vs2015-py36:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-win-vs2015-py36
- conda-win-vs2015-py37:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-conda-win-vs2015-py37
- debian-buster:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-debian-buster
- debian-stretch:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-azure-debian-stretch
- gandiva-jar-osx:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-gandiva-jar-osx
- gandiva-jar-trusty:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-gandiva-jar-trusty
- homebrew-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-travis-homebrew-cpp
- test-conda-cpp:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-conda-cpp
- test-conda-python-2.7-pandas-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-conda-python-2.7-pandas-latest
- test-conda-python-2.7:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-conda-python-2.7
- test-conda-python-3.6:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-conda-python-3.6
- test-conda-python-3.7-dask-latest:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-conda-python-3.7-dask-latest
- test-conda-python-3.7-hdfs-2.9.2:
  URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-01-0-circle-test-conda-python-3.7-hdfs-2.9.2
- test-conda-python-3.7-pandas-latest:
  URL: