[jira] [Resolved] (PARQUET-1346) [C++] Protect against null values data in empty Arrow array

2018-07-12 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1346. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 474 [https

[jira] [Commented] (PARQUET-1343) Unable to read a parquet file

2018-06-29 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527609#comment-16527609 ] Uwe L. Korn commented on PARQUET-1343: -- This sounds like your file really got corrupted. When

[jira] [Updated] (PARQUET-1343) Unable to read a parquet file

2018-06-29 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1343: - Priority: Minor (was: Blocker) > Unable to read a parquet f

[jira] [Updated] (PARQUET-1333) [C++] Reading of files with dictionary size 0 fails on Windows with bad_alloc

2018-06-25 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1333: - Fix Version/s: cpp-1.5.0 > [C++] Reading of files with dictionary size 0 fails on Wind

[jira] [Assigned] (PARQUET-1158) C++: Basic RowGroup filtering

2018-06-14 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1158: Assignee: (was: Uwe L. Korn) > C++: Basic RowGroup filter

[jira] [Commented] (PARQUET-1158) C++: Basic RowGroup filtering

2018-06-14 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512624#comment-16512624 ] Uwe L. Korn commented on PARQUET-1158: -- [~keithgchapman] No, I'm not actively working

[jira] [Assigned] (PARQUET-1319) [C++] Pass BISON_EXECUTABLE to Thrift EP for MacOS

2018-06-14 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1319: Assignee: Tham > [C++] Pass BISON_EXECUTABLE to Thrift EP for Ma

[jira] [Resolved] (PARQUET-1319) [C++] Pass BISON_EXECUTABLE to Thrift EP for MacOS

2018-06-14 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1319. -- Resolution: Fixed Issue resolved by pull request 470 [https://github.com/apache/parquet-cpp

[jira] [Updated] (PARQUET-1159) Compatibility with C++ iterators

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1159: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > Compatibility wit

[jira] [Updated] (PARQUET-1276) [C++] Reduce the amount of memory used for writing null decimal values

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1276: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Reduce the amo

[jira] [Updated] (PARQUET-1148) [C++] Code coverage has been broken since June 23

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1148: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Code coverage

[jira] [Updated] (PARQUET-1106) [C++] Add include-what-you-use setup, fix IWYU warnings

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1106: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Add include-what-

[jira] [Updated] (PARQUET-1158) C++: Basic RowGroup filtering

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1158: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > C++: Basic RowGr

[jira] [Updated] (PARQUET-1169) Segment fault when using NextBatch of parquet::arrow::ColumnReader in parquet-cpp

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1169: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > Segment fault when us

[jira] [Updated] (PARQUET-1127) [C++] Fix AssertArraysEqual call

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1127: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Fix AssertArraysEq

[jira] [Updated] (PARQUET-1122) [C++] Support 2-level list encoding in Arrow decoding

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1122: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Support 2-level l

[jira] [Updated] (PARQUET-1186) [C++] Handling Arrow reads that overflow a BinaryArray capacity

2018-06-09 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1186: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Handling Arrow re

[jira] [Updated] (PARQUET-1319) [C++] Pass BISON_EXECUTABLE to Thrift EP for MacOS

2018-06-08 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1319: - Fix Version/s: cpp-1.5.0 > [C++] Pass BISON_EXECUTABLE to Thrift EP for Ma

[jira] [Commented] (PARQUET-1319) [C++] Pass BISON_EXECUTABLE to Thrift EP for MacOS

2018-06-08 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506303#comment-16506303 ] Uwe L. Korn commented on PARQUET-1319: -- [~thamha] That sounds like a reasonable approach

[jira] [Assigned] (PARQUET-1313) [C++] Compilation failure with VS2017

2018-05-31 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1313: Assignee: Antoine Pitrou > [C++] Compilation failure with VS2

[jira] [Resolved] (PARQUET-1313) [C++] Compilation failure with VS2017

2018-05-31 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1313. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 468 [https

Re: Move Dremel paper to parquet-format

2018-05-29 Thread Uwe L. Korn
est, > > Zoltan > > On Tue, May 29, 2018 at 1:21 PM Uwe L. Korn wrote: > > > Hello Nandor, > > > > as it seems that wiki contents were written by Julien and as they are on > > github wiki, they are markdown in the backend. > > > > The easiest t

Re: Move Dremel paper to parquet-format

2018-05-29 Thread Uwe L. Korn
Hello Nandor, as it seems that the wiki contents were written by Julien and as they are on the github wiki, they are markdown in the backend. The easiest thing from an IP side would be if Julien could contribute them as plain markdown files to the parquet-format repo. I don't think we want to/can enable

[jira] [Resolved] (PARQUET-1307) [C++] memory-test fails with latest Arrow

2018-05-26 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1307. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 466 [https

Re: Permissions for committers

2018-05-22 Thread Uwe L. Korn
With gitbox you can also push to github; it's a two-way sync nowadays. > On 22.05.2018 at 18:39, Julien Le Dem wrote: > > You don’t push commits to GitHub. You push them to the Apache git and they > get replicated to GitHub > >> On Tue, May 22, 2018 at 09:37 Julien

Re: Permissions for committers

2018-05-22 Thread Uwe L. Korn
Hello Gabor, Have you linked your github account on https://gitbox.apache.org/ and set up two-factor authentication on github? Uwe > On 22.05.2018 at 15:18, Gabor Szadovszky wrote: > > Hi, > > Could someone help me to have the required permissions on github so I can >

Re: Feature branch for column indexes

2018-05-17 Thread Uwe L. Korn
Hello Zoltan, > Can anyone advise what we need to do in order to have the regular > infrastructure (like Travis builds) work on the feature branch? You probably should get this for free if you make that branch on the main parquet-mr repository. This will then limit the people that can merge to

[jira] [Resolved] (PARQUET-1297) [Java] SchemaConverter should not convert from Timestamp(TimeUnit.SECOND) and Timestamp(TimeUnit.NANOSECOND) of Arrow

2018-05-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1297. -- Resolution: Fixed Issue resolved by PR https://github.com/apache/parquet-mr/pull/477 > [J

[jira] [Updated] (PARQUET-1297) [Java] SchemaConverter should not convert from Timestamp(TimeUnit.SECOND) and Timestamp(TimeUnit.NANOSECOND) of Arrow

2018-05-13 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1297: - Fix Version/s: (was: 1.10.0) 1.11 > [Java] SchemaConverter sho

Re: [VOTE] Release Apache Parquet MR 1.8.3 RC0

2018-05-07 Thread Uwe L. Korn
> > On 7 May 2018, at 09:46, Uwe L. Korn <uw...@xhochy.com> wrote: > > > > Hello, > > > > the build is failing for me with "[ERROR] Failed to execute goal > > org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process >

[jira] [Resolved] (PARQUET-1285) [Java] SchemaConverter should not convert from TimeUnit.SECOND AND TimeUnit.NANOSECOND of Arrow

2018-05-07 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1285. -- Resolution: Fixed Issue resolved by pull request https://github.com/apache/parquet-mr/pull

[jira] [Assigned] (PARQUET-1285) [Java] SchemaConverter should not convert from TimeUnit.SECOND AND TimeUnit.NANOSECOND of Arrow

2018-05-07 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1285: Assignee: Masayuki Takahashi > [Java] SchemaConverter should not convert f

[jira] [Updated] (PARQUET-1285) [Java] SchemaConverter should not convert from TimeUnit.SECOND AND TimeUnit.NANOSECOND of Arrow

2018-05-07 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1285: - Fix Version/s: 1.10.0 > [Java] SchemaConverter should not convert from TimeUnit.SEC

Re: [VOTE] Release Apache Parquet MR 1.8.3 RC0

2018-05-07 Thread Uwe L. Korn
Hello, the build is failing for me with "[ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project parquet-generator: Error rendering velocity resource.: NullPointerException", extended stack trace:

[jira] [Commented] (PARQUET-1290) Clarify maximum run lengths for RLE encoding

2018-05-03 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462016#comment-16462016 ] Uwe L. Korn commented on PARQUET-1290: -- [~tarmstrong] Assigned this to you and gave you the rights so

[jira] [Assigned] (PARQUET-1290) Clarify maximum run lengths for RLE encoding

2018-05-03 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1290: Assignee: Tim Armstrong > Clarify maximum run lengths for RLE encod

[jira] [Assigned] (PARQUET-1283) [C++] FormatStatValue appends trailing space to string and int96

2018-05-01 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1283: Assignee: Julius Neuffer > [C++] FormatStatValue appends trailing space to str

[jira] [Resolved] (PARQUET-1283) [C++] FormatStatValue appends trailing space to string and int96

2018-05-01 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1283. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 461 [https

Re: What is the maximum run length in the RLE encoding?

2018-05-01 Thread Uwe L. Korn
Hello Tim, taking a brief look at what we have in parquet-cpp (which is probably very similar to Impala), we would also have problems with runs that are longer than 2^31. While supporting arbitrarily long runs might be a really cool feature, I think it will come at a cost that we would have to

Re: Too verbose GitHub bot comments

2018-04-25 Thread Uwe L. Korn
We recently changed this in the Arrow project to be saved as Work Log instead of comments in JIRA. Just open a ticket with INFRA, they can switch this (+1 for this from me). On Wed, Apr 25, 2018, at 5:02 PM, Ryan Blue wrote: > +1. I'd rather not have them. > > On Wed, Apr 25, 2018 at 7:39 AM,

[jira] [Resolved] (PARQUET-1279) Use ASSERT_NO_FATAIL_FAILURE in C++ unit tests

2018-04-23 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1279. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 458 [https

[jira] [Updated] (PARQUET-1279) Use ASSERT_NO_FATAIL_FAILURE in C++ unit tests

2018-04-23 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1279: - Component/s: parquet-cpp > Use ASSERT_NO_FATAIL_FAILURE in C++ unit te

[jira] [Moved] (PARQUET-1279) Use ASSERT_NO_FATAIL_FAILURE in C++ unit tests

2018-04-23 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn moved ARROW-2497 to PARQUET-1279: - Workflow: patch-available, re-open possible (was: jira) Key

[jira] [Assigned] (PARQUET-1262) [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thrift

2018-04-22 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1262: Assignee: Uwe L. Korn > [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thr

[jira] [Assigned] (PARQUET-1128) [Java] Upgrade the Apache Arrow version to 0.8.0 for SchemaConverter

2018-04-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1128: Assignee: Masayuki Takahashi > [Java] Upgrade the Apache Arrow version to 0.

[jira] [Resolved] (PARQUET-1128) [Java] Upgrade the Apache Arrow version to 0.8.0 for SchemaConverter

2018-04-21 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1128. -- Resolution: Fixed Fix Version/s: 1.11 Issue resolved by pull request 443 [https

[jira] [Resolved] (PARQUET-1272) [C++] ScanFileContents reports wrong row count for nested columns

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1272. -- Resolution: Fixed Issue resolved by pull request 457 [https://github.com/apache/parquet-cpp

[jira] [Resolved] (PARQUET-1274) [Python] SegFault in pyarrow.parquet.write_table with specific options

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1274. -- Resolution: Fixed Issue resolved by pull request 456 [https://github.com/apache/parquet-cpp

[jira] [Assigned] (PARQUET-1274) [Python] SegFault in pyarrow.parquet.write_table with specific options

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1274: Assignee: Joshua Storck > [Python] SegFault in pyarrow.parquet.write_table with speci

[jira] [Moved] (PARQUET-1274) [Python] SegFault in pyarrow.parquet.write_table with specific options

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn moved ARROW-2082 to PARQUET-1274: - Fix Version/s: (was: 0.10.0) cpp-1.5.0

[jira] [Resolved] (PARQUET-1273) [Python] Error writing to partitioned Parquet dataset

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1273. -- Resolution: Fixed Issue resolved by pull request 453 [https://github.com/apache/parquet-cpp

[jira] [Assigned] (PARQUET-1273) [Python] Error writing to partitioned Parquet dataset

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1273: Assignee: Joshua Storck > [Python] Error writing to partitioned Parquet data

[jira] [Moved] (PARQUET-1273) [Python] Error writing to partitioned Parquet dataset

2018-04-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn moved ARROW-1938 to PARQUET-1273: - Fix Version/s: (was: 0.10.0) cpp-1.5.0

[jira] [Commented] (PARQUET-1272) [C++] ScanFileContents reports wrong row count for nested columns

2018-04-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441647#comment-16441647 ] Uwe L. Korn commented on PARQUET-1272: -- Yes, it is. I did miss that when looking for similar

[jira] [Assigned] (PARQUET-1270) [C++] Executable tools do not get installed

2018-04-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1270: Assignee: Antoine Pitrou > [C++] Executable tools do not get instal

[jira] [Resolved] (PARQUET-1270) [C++] Executable tools do not get installed

2018-04-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1270. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 455 [https

[jira] [Resolved] (PARQUET-1268) [C++] Conversion of Arrow null list columns fails

2018-04-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1268. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 454 [https

[jira] [Assigned] (PARQUET-1267) replace "unsafe" std::equal by std::memcmp

2018-04-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1267: Assignee: rip.nsk > replace "unsafe" std::equal b

[jira] [Resolved] (PARQUET-1267) replace "unsafe" std::equal by std::memcmp

2018-04-17 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1267. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 451 [https

[jira] [Created] (PARQUET-1272) [C++] ScanFileContents reports wrong row count for nested columns

2018-04-17 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1272: Summary: [C++] ScanFileContents reports wrong row count for nested columns Key: PARQUET-1272 URL: https://issues.apache.org/jira/browse/PARQUET-1272 Project: Parquet

Re: [VOTE] Release Apache Parquet Java 1.10.0 RC0

2018-04-06 Thread Uwe L. Korn
@netflix.com> wrote: > > > You can either put 0.9 earlier in your PATH, or set thrift.executable: > > > > mvn clean install -Pthrift.executable=/path/to/bin/thrift > > > > On Fri, Apr 6, 2018 at 9:11 AM, Uwe L. Korn <uw...@xhochy.com>

Re: [VOTE] Release Apache Parquet Java 1.10.0 RC0

2018-04-06 Thread Uwe L. Korn
The build is failing for me because it is picking up my installation of Thrift 0.11. Is there a variable I could set to point it to my Thrift 0.9 installation? Uwe On Fri, Apr 6, 2018, at 2:34 PM, Zoltan Ivanfi wrote: > I would have preferred waiting for the parquet-format release (which >

Re: [RESULT][VOTE] Release Apache Parquet Format 2.5.0 RC0

2018-04-06 Thread Uwe L. Korn
Having a small verification script that downloads the archive, verifies the signatures and runs the tests locally is also something that greatly helps in the process. We have that with Arrow and it really works smoothly (Arrow is much harder to build, so it is a necessity there). I had this

[jira] [Commented] (PARQUET-1265) Segfault on static ApplicationVersion initialization

2018-04-05 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426871#comment-16426871 ] Uwe L. Korn commented on PARQUET-1265: -- [~llchan] It is a known problem that statically linking

[jira] [Updated] (PARQUET-1256) [C++] Add --print-key-value-metadata option to parquet_reader tool

2018-04-04 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1256: - Fix Version/s: cpp-1.5.0 > [C++] Add --print-key-value-metadata option to parquet_reader t

Re: Solution to read/write multiple parquet files

2018-04-04 Thread Uwe L. Korn
; parquet files ? > Another question is that should we keep same RowGroup size for one > parquet file ?> > > > Thanks, > Lizhou > ------ Original -- > *From: * "Uwe L. Korn"<uw...@xhochy.com>; > *Date: * Tue, Apr 3, 20

Re: Solution to read/write multiple parquet files

2018-04-03 Thread Uwe L. Korn
Hello Lizhou, on the Python side there is http://dask.pydata.org/en/latest/ that can read large, distributed Parquet datasets. When using `engine=pyarrow`, it also uses parquet-cpp under the hood. On the pure C++ side, I know that https://github.com/thrill/thrill has experimental parquet

[jira] [Updated] (PARQUET-1262) [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thrift

2018-03-29 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1262: - Fix Version/s: cpp-1.5.0 > [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thr

[jira] [Updated] (PARQUET-1262) [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thrift

2018-03-29 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1262: - Component/s: parquet-cpp > [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thr

[jira] [Updated] (PARQUET-1262) [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thrift

2018-03-29 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1262: - Description: When building Thrift using the ExternalProject facility, we do not pass

[jira] [Created] (PARQUET-1262) [C++] Use the same BOOST_ROOT and Boost_NAMESPACE as

2018-03-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1262: Summary: [C++] Use the same BOOST_ROOT and Boost_NAMESPACE as Key: PARQUET-1262 URL: https://issues.apache.org/jira/browse/PARQUET-1262 Project: Parquet

[jira] [Updated] (PARQUET-1262) [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thrift

2018-03-29 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1262: - Summary: [C++] Use the same BOOST_ROOT and Boost_NAMESPACE for Thrift (was: [C++] Use

[jira] [Assigned] (PARQUET-1255) [C++] Exceptions thrown in some tests

2018-03-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1255: Assignee: Antoine Pitrou > [C++] Exceptions thrown in some te

[jira] [Assigned] (PARQUET-1071) [C++] parquet::arrow::FileWriter::Close is not idempotent

2018-03-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1071: Assignee: Antoine Pitrou > [C++] parquet::arrow::FileWriter::Close is not idempot

[jira] [Resolved] (PARQUET-1255) [C++] Exceptions thrown in some tests

2018-03-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1255. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 448 [https

[jira] [Resolved] (PARQUET-1071) [C++] parquet::arrow::FileWriter::Close is not idempotent

2018-03-28 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1071. -- Resolution: Fixed Fix Version/s: cpp-1.5.0 Issue resolved by pull request 449 [https

[jira] [Commented] (PARQUET-1255) [C++] Exceptions thrown in some tests

2018-03-27 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415585#comment-16415585 ] Uwe L. Korn commented on PARQUET-1255: -- This is the magic way of telling you to set

Re: Merging changes in the GitBox era

2018-03-25 Thread Uwe L. Korn
+1 On Fri, Mar 23, 2018, at 11:57 PM, Ryan Blue wrote: > +1 > > On Fri, Mar 23, 2018 at 11:01 AM, Wes McKinney wrote: > > > +1 > > > > On Fri, Mar 23, 2018 at 1:56 PM, Lars Volker wrote: > > > I checked with the Infra team and they can disable some of

[jira] [Created] (PARQUET-1252) [C++] Pass BOOST_ROOT and Boost_NAMESPACE on to Thrift EP

2018-03-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1252: Summary: [C++] Pass BOOST_ROOT and Boost_NAMESPACE on to Thrift EP Key: PARQUET-1252 URL: https://issues.apache.org/jira/browse/PARQUET-1252 Project: Parquet

[jira] [Updated] (PARQUET-1204) [C++] Less verbose logging from thirdparty toolchain

2018-03-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1204: - Fix Version/s: (was: cpp-1.4.0) cpp-1.5.0 > [C++] Less verbose logg

[jira] [Assigned] (PARQUET-1209) locally defined symbol ... imported in function ..

2018-03-12 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1209: Assignee: rip.nsk > locally defined symbol ... imported in funct

[ANNOUNCE] Apache Parquet C++ release 1.4.0

2018-03-07 Thread Uwe L. Korn
We are pleased to announce the release of Apache Parquet C++ 1.4.0! Parquet is a general-purpose columnar file format supporting nested data. It uses space-efficient encodings and a compressed and splittable structure for processing frameworks like Hadoop. The CHANGELOG for the release is

Re: [VOTE] Accept donation of Parquet Rust implementation

2018-03-06 Thread Uwe L. Korn
+1 On Tue, Mar 6, 2018, at 9:29 PM, Ryan Blue wrote: > +1 > > Thanks for starting a vote, Wes! > > On Tue, Mar 6, 2018 at 12:24 PM, Wes McKinney wrote: > > > Dear all, > > > > The Parquet PMC has been in contact with the developers of > > > >

[RESULT][VOTE] Release Apache Parquet C++ 1.4.0 RC1

2018-03-06 Thread Uwe L. Korn
Marroquín Mogrovejo > > <renatoj.marroq...@gmail.com> wrote: > >> +1 (non-binding) > >> > >> Run release verification successfully on Ubuntu 16.04 > >> > >> 2018-02-27 14:22 GMT+01:00 Uwe L. Korn <uw...@xhochy.com>: > >> > >

[jira] [Commented] (PARQUET-1099) [C++] Add Travis CI entry that uses parquet-cpp as a library

2018-02-27 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16378905#comment-16378905 ] Uwe L. Korn commented on PARQUET-1099: -- Issue resolved by PR https://github.com/apache/parquet-cpp

[jira] [Resolved] (PARQUET-1099) [C++] Add Travis CI entry that uses parquet-cpp as a library

2018-02-27 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1099. -- Resolution: Fixed > [C++] Add Travis CI entry that uses parquet-cpp as a libr

Re: [VOTE] Release Apache Parquet C++ 1.4.0 RC1

2018-02-27 Thread Uwe L. Korn
+1 (binding) Tested successfully on macOS and Ubuntu 16.04 using ./dev/release/verify-release-candidate 1.4.0 1 On Tue, Feb 27, 2018, at 2:21 PM, Uwe L. Korn wrote: > All, > > I propose that we accept the following release candidate as the official > Apache Parquet C++ 1

[VOTE] Release Apache Parquet C++ 1.4.0 RC1

2018-02-27 Thread Uwe L. Korn
All, I propose that we accept the following release candidate as the official Apache Parquet C++ 1.4.0 release. Parquet C++ 1.4.0-rc1 includes the following: --- The CHANGELOG for the release is available at:

[jira] [Resolved] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-24 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1225. -- Resolution: Fixed Fix Version/s: cpp-1.4.0 Issue resolved by pull request 444 [https

[jira] [Updated] (PARQUET-1224) Implement specification-compliant floating point comparison

2018-02-20 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1224: - Fix Version/s: (was: cpp-1.4.0) > Implement specification-compliant floating po

[jira] [Updated] (PARQUET-1224) Implement specification-compliant floating point comparison

2018-02-20 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1224: - Fix Version/s: cpp-1.4.0 > Implement specification-compliant floating point compari

Re: [VOTE] Release Apache Parquet C++ 1.4.0 RC0

2018-02-20 Thread Uwe L. Korn
che.org/jira/browse/PARQUET-1225> should be included in > >> the next release, even if it causes a delay. > >> > >> Br, > >> > >> Zoltan > >> > >> On Sun, Feb 18, 2018 at 10:10 PM Uwe L. Korn <uw...@xhochy.com> wrote: > >

[jira] [Created] (PARQUET-1221) [C++] Extend release README

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1221: Summary: [C++] Extend release README Key: PARQUET-1221 URL: https://issues.apache.org/jira/browse/PARQUET-1221 Project: Parquet Issue Type: Task

[jira] [Assigned] (PARQUET-1220) [C++] Don't build Thrift examples and tutorials in the ExternalProject

2018-02-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned PARQUET-1220: Assignee: Uwe L. Korn > [C++] Don't build Thrift examples and tutori

[jira] [Created] (PARQUET-1220) [C++] Don't build Thrift examples and tutorials in the ExternalProject

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1220: Summary: [C++] Don't build Thrift examples and tutorials in the ExternalProject Key: PARQUET-1220 URL: https://issues.apache.org/jira/browse/PARQUET-1220 Project

[VOTE] Release Apache Parquet C++ 1.4.0 RC0

2018-02-18 Thread Uwe L. Korn
All, I propose that we accept the following release candidate as the official Apache Parquet C++ 1.4.0 release. Parquet C++ 1.4.0-rc0 includes the following: --- The CHANGELOG for the release is available at:

[jira] [Created] (PARQUET-1219) [C++] Update release-candidate script links to gitbox

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1219: Summary: [C++] Update release-candidate script links to gitbox Key: PARQUET-1219 URL: https://issues.apache.org/jira/browse/PARQUET-1219 Project: Parquet

[jira] [Created] (PARQUET-1218) [C++] More informative error message on too short pages

2018-02-18 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-1218: Summary: [C++] More informative error message on too short pages Key: PARQUET-1218 URL: https://issues.apache.org/jira/browse/PARQUET-1218 Project: Parquet

[jira] [Updated] (PARQUET-1036) parquet file created via pyarrow 0.4.0 ; version 1.0 - incompatible with Spark

2018-02-18 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated PARQUET-1036: - Fix Version/s: cpp-1.5.0 > parquet file created via pyarrow 0.4.0 ; version 1.0 - incompati

[jira] [Resolved] (PARQUET-1196) [C++] Provide a parquet_arrow example project incl. CMake setup

2018-02-15 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved PARQUET-1196. -- Resolution: Fixed Issue resolved by pull request 436 [https://github.com/apache/parquet-cpp
