[jira] [Created] (PARQUET-1221) [C++] Extend release README

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1221: Summary: [C++] Extend release README Key: PARQUET-1221 URL: https://issues.apache.org/jira/browse/PARQUET-1221 Project: Parquet Issue Type: Task

[jira] [Assigned] (PARQUET-1220) [C++] Don't build Thrift examples and tutorials in the ExternalProject

2018-02-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1220: Assignee: Uwe L. Korn > [C++] Don't build Thrift examples and tutori

[jira] [Created] (PARQUET-1220) [C++] Don't build Thrift examples and tutorials in the ExternalProject

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1220: Summary: [C++] Don't build Thrift examples and tutorials in the ExternalProject Key: PARQUET-1220 URL: https://issues.apache.org/jira/browse/PARQUET-1220 Project

[VOTE] Release Apache Parquet C++ 1.4.0 RC0

2018-02-18 Thread Uwe L. Korn
All, I propose that we accept the following release candidate as the official Apache Parquet C++ 1.4.0 release. Parquet C++ 1.4.0-rc0 includes the following: --- The CHANGELOG for the release is available at:

[jira] [Created] (PARQUET-1219) [C++] Update release-candidate script links to gitbox

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1219: Summary: [C++] Update release-candidate script links to gitbox Key: PARQUET-1219 URL: https://issues.apache.org/jira/browse/PARQUET-1219 Project: Parquet

[jira] [Created] (PARQUET-1218) [C++] More informative error message on too short pages

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1218: Summary: [C++] More informative error message on too short pages Key: PARQUET-1218 URL: https://issues.apache.org/jira/browse/PARQUET-1218 Project: Parquet

[jira] [Updated] (PARQUET-1036) parquet file created via pyarrow 0.4.0 ; version 1.0 - incompatible with Spark

2018-02-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1036: - Fix Version/s: cpp-1.5.0 > parquet file created via pyarrow 0.4.0 ; version 1.0 - incompati

[jira] [Resolved] (PARQUET-1196) [C++] Provide a parquet_arrow example project incl. CMake setup

2018-02-15 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1196. -- Resolution: Fixed Issue resolved by pull request 436 [https://github.com/apache/parquet-cpp

[jira] [Resolved] (PARQUET-1200) [C++] Support reading a single Arrow column from a Parquet file

2018-02-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1200. -- Resolution: Fixed Issue resolved by pull request 434 [https://github.com/apache/parquet-cpp

[jira] [Resolved] (PARQUET-1210) [C++] Boost 1.66 compilation fails on Windows on linkage stage

2018-02-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1210. -- Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 437 [https

Releasing parquet-cpp 1.4.0

2018-02-12 Thread Uwe L. Korn
Hello all, I would like to start a vote for the 1.4.0 release this week. We have not yet implemented everything we wanted in 1.4.0, but as the Arrow release it depends on came out quite some time ago, I would like to release 1.4.0 now and push all other things to 1.5.0, which should also

[jira] [Resolved] (PARQUET-1205) Fix msvc static build

2018-02-10 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1205. -- Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 435 [https

[jira] [Commented] (PARQUET-1205) Fix msvc static build

2018-02-08 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356685#comment-16356685 ] Uwe L. Korn commented on PARQUET-1205: -- [~rip@gmail.com] Can you post the failure? > Fix m

[jira] [Assigned] (PARQUET-1099) [C++] Add Travis CI entry that uses parquet-cpp as a library

2018-02-07 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1099: Assignee: Uwe L. Korn > [C++] Add Travis CI entry that uses parquet-cpp as a libr

Re: Date and time for next parquet sync

2018-01-29 Thread Uwe L. Korn
+1, Tuesday to Thursday are ok for me but I would prefer Tuesday this week. Uwe On Mon, Jan 29, 2018, at 12:54 PM, Zoltan Ivanfi wrote: > +1 for Tuesday, this week I can't attend on Wednesday. > > Zoltan > > On Mon, Jan 29, 2018 at 7:29 AM Lars Volker wrote: > > > I'm good

[jira] [Resolved] (PARQUET-1179) [C++] Support Apache Thrift 0.11

2018-01-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1179. -- Resolution: Fixed Issue resolved by pull request 433 [https://github.com/apache/parquet-cpp

Re: Moving parquet-mr and parquet-format to Apache's GitBox service

2018-01-25 Thread Uwe L. Korn
> > On Jan 24, 2018 08:48, "Ryan Blue" <rb...@netflix.com.invalid> wrote: > > > > > +1 > > > > > > On Wed, Jan 24, 2018 at 5:13 AM, Uwe L. Korn <uw...@xhochy.com> wrote: > > > > > > > Hello all, > > > > > >

[jira] [Updated] (PARQUET-1200) [C++] Support reading a single Arrow column from a Parquet file

2018-01-24 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1200: - Fix Version/s: (was: cpp-1.5.0) cpp-1.4.0 > [C++] Support read

[jira] [Resolved] (PARQUET-1151) [C++] Add build options / configuration to use static runtime libraries with MSVC

2018-01-24 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1151. -- Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 429 [https

[jira] [Resolved] (PARQUET-1193) [CPP] Implement ColumnOrder to support min_value and max_value

2018-01-24 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1193. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 430 [https

Moving parquet-mr and parquet-format to Apache's GitBox service

2018-01-24 Thread Uwe L. Korn
Hello all, it seems that we recently had some issues with the retriggering of Travis builds, and we would also like to be able to work a bit more with the GitHub UI. For the Apache Arrow and parquet-cpp repositories, we have already migrated to https://gitbox.apache.org/ which gives us near full

[jira] [Created] (PARQUET-1200) [C++] Support reading a single Arrow column from a Parquet file

2018-01-21 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1200: Summary: [C++] Support reading a single Arrow column from a Parquet file Key: PARQUET-1200 URL: https://issues.apache.org/jira/browse/PARQUET-1200 Project: Parquet

[jira] [Created] (PARQUET-1196) [C++] Provide a parquet_arrow example project incl. CMake setup

2018-01-17 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1196: Summary: [C++] Provide a parquet_arrow example project incl. CMake setup Key: PARQUET-1196 URL: https://issues.apache.org/jira/browse/PARQUET-1196 Project: Parquet

[jira] [Commented] (PARQUET-1127) [C++] Fix AssertArraysEqual call

2018-01-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325135#comment-16325135 ] Uwe L. Korn commented on PARQUET-1127: -- [~cpcloud] Can you give more insight into what this issue is about

[jira] [Commented] (PARQUET-1169) Segment fault when using NextBatch of parquet::arrow::ColumnReader in parquet-cpp

2018-01-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325134#comment-16325134 ] Uwe L. Korn commented on PARQUET-1169: -- I'm unable to reproduce this issue locally using {{parquet

[jira] [Resolved] (PARQUET-1113) [C++] Incorporate fix from ARROW-1601 on bitmap read path

2018-01-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1113. -- Resolution: Fixed Assignee: Rene Sugar Issue resolved by pull request https

[jira] [Resolved] (PARQUET-1147) [C++] Account for API deprecation / change in ARROW-1671

2018-01-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1147. -- Resolution: Fixed Assignee: Wes McKinney These were already all incorporated. >

[jira] [Resolved] (PARQUET-1097) [C++] Account for Arrow API deprecation in ARROW-1511

2018-01-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1097. -- Resolution: Fixed Assignee: Wes McKinney Issue resolved by pull request https

[jira] [Assigned] (PARQUET-1086) [C++] Remove usage of arrow/util/compiler-util.h after 1.3.0 release

2018-01-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1086: Assignee: Uwe L. Korn > [C++] Remove usage of arrow/util/compiler-util.h after 1.

[jira] [Created] (PARQUET-1187) [C++] Add abi-compliance-checker to the CI build

2018-01-07 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1187: Summary: [C++] Add abi-compliance-checker to the CI build Key: PARQUET-1187 URL: https://issues.apache.org/jira/browse/PARQUET-1187 Project: Parquet Issue

Re: Apache Parquet for .NET

2018-01-07 Thread Uwe L. Korn
Hello all, I would like to bump this thread again. It would be nice to have the .NET implementation as part of the official Apache project. From the Apache project perspective I would volunteer to help to integrate the project into the infrastructure, sadly I have also no experience with the

[jira] [Resolved] (PARQUET-1182) Parquet-cpp version 1.3.1 not tagged in git repo

2018-01-03 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1182. -- Resolution: Invalid > Parquet-cpp version 1.3.1 not tagged in git r

[jira] [Commented] (PARQUET-1182) Parquet-cpp version 1.3.1 not tagged in git repo

2018-01-03 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309615#comment-16309615 ] Uwe L. Korn commented on PARQUET-1182: -- It is, have a look at https://github.com/apache/parquet

[jira] [Resolved] (PARQUET-1015) Object categoricals are not serialized when only None is present

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1015. -- Resolution: Fixed > Object categoricals are not serialized when only None is pres

[jira] [Resolved] (PARQUET-1101) [C++] Build against arrow master in CI

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1101. -- Resolution: Fixed Fix Version/s: (was: cpp-1.3.0) cpp-1.3.1

[jira] [Commented] (PARQUET-1101) [C++] Build against arrow master in CI

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308195#comment-16308195 ] Uwe L. Korn commented on PARQUET-1101: -- This should be fixed in the meantime by bumping to a newer

[jira] [Updated] (PARQUET-1127) [C++] Fix AssertArraysEqual call

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1127: - Fix Version/s: (was: cpp-1.3.0) cpp-1.4.0 > [C++] Fix AssertArraysEq

[jira] [Updated] (PARQUET-1097) [C++] Account for Arrow API deprecation in ARROW-1511

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1097: - Fix Version/s: (was: cpp-1.3.0) cpp-1.4.0 > [C++] Account for Arrow

[jira] [Updated] (PARQUET-979) [C++] Limit size of min, max or disable stats for long binary types

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-979: Fix Version/s: (was: cpp-1.3.0) cpp-1.4.0 > [C++] Limit size of min,

[jira] [Updated] (PARQUET-1113) [C++] Incorporate fix from ARROW-1601 on bitmap read path

2018-01-02 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1113: - Fix Version/s: (was: cpp-1.3.1) cpp-1.4.0 > [C++] Incorporate fix f

[jira] [Created] (PARQUET-1180) C++: Fix behaviour of num_children element of primitive nodes

2017-12-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1180: Summary: C++: Fix behaviour of num_children element of primitive nodes Key: PARQUET-1180 URL: https://issues.apache.org/jira/browse/PARQUET-1180 Project: Parquet

[jira] [Resolved] (PARQUET-1092) [C++] Write Arrow tables with chunked columns

2017-12-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1092. -- Resolution: Fixed Fix Version/s: (was: cpp-1.3.0) cpp-1.4.0

[jira] [Commented] (PARQUET-1084) Parquet-C++ doesn't selectively read columns with mmap'ed files

2017-12-12 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288202#comment-16288202 ] Uwe L. Korn commented on PARQUET-1084: -- Recently we have verified that selective column reads work

[jira] [Updated] (PARQUET-1084) Parquet-C++ doesn't selectively read columns with mmap'ed files

2017-12-12 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1084: - Summary: Parquet-C++ doesn't selectively read columns with mmap'ed files (was: Parquet-C

[jira] [Commented] (PARQUET-1169) Segment fault when using NextBatch of parquet::arrow::ColumnReader in parquet-cpp

2017-12-06 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280238#comment-16280238 ] Uwe L. Korn commented on PARQUET-1169: -- [~frankfang] can you run the above code in {{gdb

[jira] [Created] (PARQUET-1171) [C++] Support RLE and BITPACKED as encodings for data

2017-12-06 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1171: Summary: [C++] Support RLE and BITPACKED as encodings for data Key: PARQUET-1171 URL: https://issues.apache.org/jira/browse/PARQUET-1171 Project: Parquet

[jira] [Resolved] (PARQUET-970) Add Lz4 and Zstd compression codecs

2017-11-23 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-970. - Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 419 [https

[jira] [Assigned] (PARQUET-1165) [C++] Pin clang-format version to 4.0

2017-11-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1165: Assignee: Uwe L. Korn > [C++] Pin clang-format version to

Re: Some question about parquet schema

2017-11-21 Thread Uwe L. Korn
Hello, it seems that you have attached images in your mail. These were not sent over the mailing list (it strips these kinds of attachments). Can you post the schemas either as text or link to the images instead of attaching them to the mail? Uwe On Mon, Nov 20, 2017, at 08:00 AM,

Re: Codec value missing from Turbodbc files? Format issue?

2017-11-20 Thread Uwe L. Korn
The files are produced by Parquet C++ through pyarrow. Turbodbc cannot itself write Parquet, it only talks ODBC with a database and then returns Arrow tables/Pandas Dataframes. The conversion Arrow -> Parquet is done in pyarrow. Additionally I would add zstandard +... that were recently added

[jira] [Commented] (PARQUET-1162) C++: Update dev/README after migration to Gitbox

2017-11-19 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258516#comment-16258516 ] Uwe L. Korn commented on PARQUET-1162: -- PR: https://github.com/apache/parquet-cpp/pull/417 >

[jira] [Created] (PARQUET-1162) C++: Update dev/README after migration to Gitbox

2017-11-19 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1162: Summary: C++: Update dev/README after migration to Gitbox Key: PARQUET-1162 URL: https://issues.apache.org/jira/browse/PARQUET-1162 Project: Parquet Issue

[jira] [Resolved] (PARQUET-1146) C++: Add macOS-compatible sha512sum call to release verify script

2017-11-19 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1146. -- Resolution: Fixed Issue resolved by pull request 414 [https://github.com/apache/parquet-cpp

[jira] [Created] (PARQUET-1158) C++: Basic RowGroup filtering

2017-11-10 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1158: Summary: C++: Basic RowGroup filtering Key: PARQUET-1158 URL: https://issues.apache.org/jira/browse/PARQUET-1158 Project: Parquet Issue Type: Improvement

Re: parquet-cpp build issues on a mac

2017-11-09 Thread Uwe L. Korn
Hello Rahul, it could be that Thrift in the first version was built against an old OpenSSL version that does not support the newer TLS methods which are required by Thrift 0.10. It is hard to guess what went wrong without seeing the build log. If you are able to provide us with it, we can help

Re: Issues using TypedColumnReader::ReadBatchSpaced

2017-11-07 Thread Uwe L. Korn
Hello William, it seems you have hit the problem Felipe mentioned earlier. My response to that was: the parquet::ByteArray instances don't own the data, so their internal pointer might become invalid on the next call to ReadBatchSpaced. This should actually make no difference if you use that

Re: Problem reading ByteArray data when reusing buffers

2017-11-03 Thread Uwe L. Korn
Hello Felipe, the parquet::ByteArray instances don't own the data, so their internal pointer might become invalid on the next call to ReadBatchSpaced. This should actually make no difference whether you use that intermediateBuffer or not. Thus the second code snippet might also fail. In general, I can
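
The ownership point in the message above can be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not parquet-cpp code: `ByteArrayView` is a hypothetical stand-in assumed to mirror the length-plus-pointer shape of parquet::ByteArray, and `FakeBatchReader` is a hypothetical reader that reuses one scratch buffer between reads, which is one way a pointer from an earlier call could end up dangling. The helper names `CopyOut` and `ReadAllOwned` are likewise made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-owning byte view (length + pointer).
// It does not manage the lifetime of the bytes it points to.
struct ByteArrayView {
  uint32_t len;
  const uint8_t* ptr;
};

// Hypothetical reader that decodes every value into the same scratch buffer.
// A view handed out by one ReadNext call is overwritten by the next call.
class FakeBatchReader {
 public:
  explicit FakeBatchReader(std::vector<std::string> values)
      : values_(std::move(values)) {}

  // Decodes the next value into the scratch buffer and points `out` at it.
  // Returns false once all values have been consumed.
  bool ReadNext(ByteArrayView* out) {
    if (pos_ >= values_.size()) return false;
    const std::string& v = values_[pos_++];
    scratch_.assign(v.begin(), v.end());
    out->len = static_cast<uint32_t>(scratch_.size());
    out->ptr = scratch_.data();
    return true;
  }

 private:
  std::vector<std::string> values_;
  std::vector<uint8_t> scratch_;
  std::size_t pos_ = 0;
};

// Copies the viewed bytes into an owning std::string.
std::string CopyOut(const ByteArrayView& view) {
  assert(view.len == 0 || view.ptr != nullptr);
  if (view.len == 0) return std::string();
  return std::string(reinterpret_cast<const char*>(view.ptr), view.len);
}

// Reads all values, copying each one out before the next read can
// invalidate the view.
std::vector<std::string> ReadAllOwned(FakeBatchReader* reader) {
  std::vector<std::string> owned;
  ByteArrayView view{0, nullptr};
  while (reader->ReadNext(&view)) {
    owned.push_back(CopyOut(view));
  }
  return owned;
}
```

The design choice here is simply to copy each value while its view is still valid; keeping the raw views across reads would only be safe if the underlying buffer were guaranteed to outlive them.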

Re: Record Conversion API in parquet-cpp

2017-11-03 Thread Uwe L. Korn
w to some > json > record ? > My questions are specific to the cpp version of Arrow and Parquet. > > -Sandeep > > On Thu, Nov 2, 2017 at 11:07 PM, Uwe L. Korn <uw...@xhochy.com> wrote: > > > Hello Sandeep, > > > > we don't require the same class str

Re: Record Conversion API in parquet-cpp

2017-11-02 Thread Uwe L. Korn
Hello Sandeep, we don't require the same class structure as in parquet-mr. Preferably they are very similar but they may differ. Some of parquet-mr's interfaces are specifically tailored to fit Hadoop whereas we don't have this requirement in the C++ implementation. Still, the interfaces should

Re: Best way to read parquet files from HDFS parquet-cpp

2017-10-30 Thread Uwe L. Korn
Hello Felipe, from a brief look, this code seems to be fine. Note that in the recent version of Arrow, HdfsClient was merged into HadoopFileSystem and thus the code will slightly differ there (should be only naming changes for you). You should update to avoid running into any difficulties because

[jira] [Resolved] (PARQUET-1150) C++: Hide statically linked boost symbols

2017-10-30 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1150. -- Resolution: Fixed Issue resolved by pull request 416 [https://github.com/apache/parquet-cpp

[jira] [Commented] (PARQUET-1150) C++: Hide statically linked boost symbols

2017-10-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223536#comment-16223536 ] Uwe L. Korn commented on PARQUET-1150: -- PR: https://github.com/apache/parquet-cpp/pull/416 >

[jira] [Created] (PARQUET-1150) C++: Hide statically linked boost symbols

2017-10-28 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1150: Summary: C++: Hide statically linked boost symbols Key: PARQUET-1150 URL: https://issues.apache.org/jira/browse/PARQUET-1150 Project: Parquet Issue Type

[jira] [Assigned] (PARQUET-1150) C++: Hide statically linked boost symbols

2017-10-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1150: Assignee: Uwe L. Korn > C++: Hide statically linked boost symb

[RESULT][VOTE] Release Apache Parquet C++ 1.3.1 RC1

2017-10-28 Thread Uwe L. Korn
Also, should we start packaging convenience binaries in the Apache release? > >> That way, we could test those as well. > >> > >> On Sun, Oct 22, 2017 at 4:17 AM, Uwe L. Korn <uw...@xhochy.com> wrote: > >> > >> > +1 (binding) > >> >

[jira] [Commented] (PARQUET-1146) C++: Add macOS-compatible sha512sum call to release verify script

2017-10-22 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214288#comment-16214288 ] Uwe L. Korn commented on PARQUET-1146: -- PR: https://github.com/apache/parquet-cpp/pull/414 >

[jira] [Created] (PARQUET-1146) C++: Add macOS-compatible sha512sum call to release verify script

2017-10-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1146: Summary: C++: Add macOS-compatible sha512sum call to release verify script Key: PARQUET-1146 URL: https://issues.apache.org/jira/browse/PARQUET-1146 Project: Parquet

Re: [VOTE] Release Apache Parquet C++ 1.3.1 RC1

2017-10-22 Thread Uwe L. Korn
+1 (binding) * Tested & built on macOS Sierra * Tested & built on Ubuntu 16.04 * Verified signature On Sun, Oct 22, 2017, at 01:15 PM, Uwe L. Korn wrote: > All, > > I propose that we accept the following release candidate as the official > Apache Parquet C++ 1.3.1 rel

[VOTE] Release Apache Parquet C++ 1.3.1 RC1

2017-10-22 Thread Uwe L. Korn
All, I propose that we accept the following release candidate as the official Apache Parquet C++ 1.3.1 release. Parquet C++ 1.3.1-rc1 includes the following: --- The CHANGELOG for the release is available at:

[jira] [Resolved] (PARQUET-1139) Add license to cmake_modules/parquet-cppConfig.cmake.in

2017-10-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1139. -- Resolution: Fixed Fix Version/s: cpp-1.3.1 Issue resolved by pull request 411 [https

[jira] [Resolved] (PARQUET-1138) [C++] Fix compilation with Arrow 0.7.1

2017-10-16 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1138. -- Resolution: Fixed Issue resolved by pull request 410 [https://github.com/apache/parquet-cpp

[jira] [Updated] (PARQUET-1140) [C++] Fail on RAT errors in CI

2017-10-16 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1140: - Fix Version/s: (was: cpp-1.4.0) cpp-1.3.1 > [C++] Fail on RAT err

[jira] [Assigned] (PARQUET-1140) [C++] Fail on RAT errors in CI

2017-10-16 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1140: Assignee: Uwe L. Korn > [C++] Fail on RAT errors in

[jira] [Updated] (PARQUET-1140) [C++] Fail on RAT errors in CI

2017-10-16 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1140: - Summary: [C++] Fail on RAT errors in CI (was: [C++] Run RAT checks in CI) > [C++] Fail on

Re: [VOTE] Release Apache Parquet C++ 1.3.1 RC0

2017-10-16 Thread Uwe L. Korn
to me > > > > except for apache-parquet-cpp-1.3.1/cmake_modules/parquet- > > > cppConfig.cmake.in, > > > > which I think needs a license header. > > > > > > > > Would +1 after a license has been added to that file. I created > > > &g

Re: [VOTE] Release Apache Parquet C++ 1.3.1 RC0

2017-10-16 Thread Uwe L. Korn
+1 * Ran verify-release-candidate on Ubuntu 16.04 * Ran verify-release-candidate on macOS Sierra -- Uwe L. Korn uw...@xhochy.com On Mon, Oct 16, 2017, at 02:16 AM, Wes McKinney wrote: > +1 > > * Ran verify-release-candidate on Ubuntu 14.04 > > In trying to verify the re

[jira] [Assigned] (PARQUET-1121) C++: DictionaryArrays of NullType cannot be written

2017-10-05 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1121: Assignee: Uwe L. Korn > C++: DictionaryArrays of NullType cannot be writ

[jira] [Commented] (PARQUET-1121) C++: DictionaryArrays of NullType cannot be written

2017-10-05 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193288#comment-16193288 ] Uwe L. Korn commented on PARQUET-1121: -- PR: https://github.com/apache/parquet-cpp/pull/407 >

[jira] [Created] (PARQUET-1121) C++: DictionaryArrays of NullType cannot be written

2017-10-05 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1121: Summary: C++: DictionaryArrays of NullType cannot be written Key: PARQUET-1121 URL: https://issues.apache.org/jira/browse/PARQUET-1121 Project: Parquet

Re: Minor release for Parquet C++ (1.3.1) based the Arrow bugfix (0.7.1) this week

2017-10-03 Thread Uwe L. Korn
ney <wesmck...@gmail.com>: > > I would like to see PARQUET-1095 resolved > (https://github.com/apache/parquet-cpp/pull/403), and then we can cut > cpp 1.3.1. > >> On Tue, Sep 26, 2017 at 9:29 AM, Uwe L. Korn <uw...@xhochy.com> wrote: >> Hello, >> >

[jira] [Commented] (PARQUET-1084) Parquet-C++ doesn't selectively read columns

2017-09-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184260#comment-16184260 ] Uwe L. Korn commented on PARQUET-1084: -- I have debugged this using a more verbose FileReader

[jira] [Resolved] (PARQUET-1105) [CPP] Remove libboost_system dependency

2017-09-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1105. -- Resolution: Fixed Issue resolved by pull request 406 [https://github.com/apache/parquet-cpp

Re: pyarrow hang

2017-09-26 Thread Uwe L. Korn
Hello Mike, this is a known issue with jemalloc. Using the Arrow 0.6.0/0.7.0 release should avoid this. Uwe On Tue, Sep 26, 2017, at 06:10 PM, Katelman, Michael wrote: > Hi, > > I sometimes see pyarrow.parquet.write_table hang and was wondering if > this is a known issue or specific to me. I

Re: Minor release for Parquet C++ (1.3.1) based the Arrow bugfix (0.7.1) this week

2017-09-26 Thread Uwe L. Korn
Hello, a small heads-up that we will also make a bugfix release for parquet-cpp this week once the Arrow release is out. Uwe On Mon, Sep 25, 2017, at 04:14 PM, Wes McKinney wrote: > hi folks, > > We just fixed a bug (https://issues.apache.org/jira/browse/ARROW-1601) > causing some crashes for

[jira] [Resolved] (PARQUET-1111) dev/release/verify-release-candidate has stale help

2017-09-25 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1111. -- Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 402 [https

[jira] [Resolved] (PARQUET-1109) C++: Update release verification script to SHA512

2017-09-25 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1109. -- Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 400 [https

[jira] [Resolved] (PARQUET-1110) [C++] Release verification script for Windows

2017-09-25 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1110. -- Resolution: Fixed Resolved by pull request https://github.com/apache/parquet-cpp/pull/401

Re: [VOTE] Release Apache Parquet C++ 1.3.0 RC0

2017-09-25 Thread Uwe L. Korn
Ran tests on Ubuntu 16.04.2 with gcc 5.4.0 > > > > On Thu, Sep 21, 2017 at 6:24 PM, Wes McKinney <wesmck...@gmail.com> wrote: > > > > > +1 (binding) > > > > > > * Checked signature, checksum > > > * Ran tests on Ubuntu 14.04 / gcc 4.8

Re: [VOTE] Release Apache Parquet C++ 1.3.0 RC0

2017-09-21 Thread Uwe L. Korn
repos/dist/dev/parquet/KEYS > > The release is based on the commit hash > 96f868f7817275f06ded618731c01d6f861bd8b6. > > Please download, verify, and test. > > The vote will close on So 24. Sep 13:33:26 CEST 2017 > > [ ] +1 Release this as Apache Parquet C++ 1.3.0 > [ ] +0 > [ ] -1 Do not release this as Apache Parquet C++ 1.3.0 because... > > Uwe > > -- > Uwe L. Korn > uw...@xhochy.com

[jira] [Commented] (PARQUET-1109) C++: Update release verification script to SHA512

2017-09-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174620#comment-16174620 ] Uwe L. Korn commented on PARQUET-1109: -- PR: https://github.com/apache/parquet-cpp/pull/400 >

[jira] [Created] (PARQUET-1109) C++: Update release verification script to SHA512

2017-09-21 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1109: Summary: C++: Update release verification script to SHA512 Key: PARQUET-1109 URL: https://issues.apache.org/jira/browse/PARQUET-1109 Project: Parquet Issue

[VOTE] Release Apache Parquet C++ 1.3.0 RC0

2017-09-21 Thread Uwe L. Korn
will close on So 24. Sep 13:33:26 CEST 2017 [ ] +1 Release this as Apache Parquet C++ 1.3.0 [ ] +0 [ ] -1 Do not release this as Apache Parquet C++ 1.3.0 because... Uwe -- Uwe L. Korn uw...@xhochy.com

[jira] [Assigned] (PARQUET-1037) Allow final RowGroup to be unfilled

2017-09-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1037: Assignee: Toby Shaw > Allow final RowGroup to be unfil

[jira] [Resolved] (PARQUET-1108) [C++] Fix Int96 comparators

2017-09-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1108. -- Resolution: Fixed Issue resolved by pull request 399 [https://github.com/apache/parquet-cpp

[jira] [Resolved] (PARQUET-1037) Allow final RowGroup to be unfilled

2017-09-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1037. -- Resolution: Fixed Fix Version/s: cpp-1.3.0 Issue resolved by pull request 378 [https

[jira] [Commented] (PARQUET-1015) Object categoricals are not serialized when only None is present

2017-09-12 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163709#comment-16163709 ] Uwe L. Korn commented on PARQUET-1015: -- PR: https://github.com/apache/parquet-cpp/pull/393

[jira] [Assigned] (PARQUET-929) [C++] Handle arrow::DictionaryArray when writing Arrow data

2017-09-11 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-929: --- Assignee: Uwe L. Korn > [C++] Handle arrow::DictionaryArray when writing Arrow d

[jira] [Updated] (PARQUET-1095) [C++] Read and write Arrow decimal values

2017-09-10 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1095: - Fix Version/s: cpp-1.4.0 > [C++] Read and write Arrow decimal val

[jira] [Commented] (PARQUET-1096) C++: Update sha{1, 256, 512} checksums per latest ASF release policy

2017-09-10 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160238#comment-16160238 ] Uwe L. Korn commented on PARQUET-1096: -- PR: https://github.com/apache/parquet-cpp/pull/392 >

[jira] [Created] (PARQUET-1096) C++: Update sha{1, 256, 512} checksums per latest ASF release policy

2017-09-10 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1096: Summary: C++: Update sha{1, 256, 512} checksums per latest ASF release policy Key: PARQUET-1096 URL: https://issues.apache.org/jira/browse/PARQUET-1096 Project
