[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371375#comment-16371375 ] ASF GitHub Bot commented on PARQUET-1225: - majetideepak commented on a change in pull request

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371285#comment-16371285 ] ASF GitHub Bot commented on PARQUET-1225: - zivanfi commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1233) [CPP ]Enable option to switch between stl classes and boost classes for thrift header

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371312#comment-16371312 ] ASF GitHub Bot commented on PARQUET-1233: - majetideepak commented on a change in pull request

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1222: -- Fix Version/s: format-2.5.0 > Definition of float and double sort order is ambigious

[jira] [Assigned] (PARQUET-952) Avro union with single type fails with 'is not a group'

2018-02-21 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar reassigned PARQUET-952: - Assignee: Nandor Kollar > Avro union with single type fails with 'is not a group' >

parquet-mr review request

2018-02-21 Thread Zoltan Ivanfi
Dear All, Our users encountered a concerning bug in parquet-mr that causes partial statistics to trigger an NPE when using predicate push-down. We would like to solve this issue with urgency and Gabor already uploaded a fix in PR #458. I reviewed and

[jira] [Created] (PARQUET-1235) Parquet-tools cat mangles strings created by other clients

2018-02-21 Thread Michael McCarthy (JIRA)
Michael McCarthy created PARQUET-1235: - Summary: Parquet-tools cat mangles strings created by other clients Key: PARQUET-1235 URL: https://issues.apache.org/jira/browse/PARQUET-1235 Project:

[jira] [Commented] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371529#comment-16371529 ] Zoltan Ivanfi commented on PARQUET-1222: Moving the design alternatives to this comment so that

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1222: --- Description: Currently parquet-format specifies the sort order for floating point numbers

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1222: --- Attachment: ordering.png > Definition of float and double sort order is ambigious >

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371553#comment-16371553 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371548#comment-16371548 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371546#comment-16371546 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371547#comment-16371547 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371549#comment-16371549 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371552#comment-16371552 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371551#comment-16371551 ] ASF GitHub Bot commented on PARQUET-1225: - wesm commented on a change in pull request #444:

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1222: --- Description: Currently parquet-format specifies the sort order for floating point numbers

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1222: --- Description: Currently parquet-format specifies the sort order for floating point numbers

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1222: --- Description: Currently parquet-format specifies the sort order for floating point numbers

[jira] [Updated] (PARQUET-1222) Definition of float and double sort order is ambigious

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1222: --- Description: Currently parquet-format specifies the sort order for floating point numbers

Re: [VOTE] Release Apache Parquet C++ 1.4.0 RC0

2018-02-21 Thread Zoltan Borok-Nagy
Deepak, just for clarification, does it mean that parquet-cpp will also write statistics when all the values are NaN? On Wed, Feb 21, 2018 at 1:16 PM, Deepak Majeti wrote: > I am okay with this proposed fix for Impala. > > On Tue, Feb 20, 2018 at 5:46 PM, Zoltan

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371576#comment-16371576 ] ASF GitHub Bot commented on PARQUET-1225: - majetideepak commented on a change in pull request

Re: [VOTE] Release Apache Parquet C++ 1.4.0 RC0

2018-02-21 Thread Deepak Majeti
I am okay with this proposed fix for Impala. On Tue, Feb 20, 2018 at 5:46 PM, Zoltan Borok-Nagy wrote: > Hi, > > I'm implementing the quick fix for Impala. The current proposal for the > write path fix is to behave like the fmax()/fmin() functions in math.h, i.e. >

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371319#comment-16371319 ] ASF GitHub Bot commented on PARQUET-1225: - majetideepak commented on a change in pull request

[jira] [Created] (PARQUET-1234) Release Parquet format 2.5.0

2018-02-21 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1234: - Summary: Release Parquet format 2.5.0 Key: PARQUET-1234 URL: https://issues.apache.org/jira/browse/PARQUET-1234 Project: Parquet Issue Type: Task

[jira] [Updated] (PARQUET-1234) Release Parquet format 2.5.0

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1234: -- Fix Version/s: format-2.5.0 > Release Parquet format 2.5.0 >

[jira] [Updated] (PARQUET-1234) Release Parquet format 2.5.0

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1234: -- Affects Version/s: format-2.5.0 > Release Parquet format 2.5.0 >

[jira] [Updated] (PARQUET-1145) Add license to .gitignore and .travis.yml

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1145: -- Fix Version/s: format-2.5.0 > Add license to .gitignore and .travis.yml >

[jira] [Updated] (PARQUET-1064) Deprecate type-defined sort ordering for INTERVAL type

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1064: -- Fix Version/s: format-2.5.0 > Deprecate type-defined sort ordering for INTERVAL type

[jira] [Commented] (PARQUET-1233) [CPP ]Enable option to switch between stl classes and boost classes for thrift header

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371369#comment-16371369 ] ASF GitHub Bot commented on PARQUET-1233: - wesm commented on a change in pull request #443:

[jira] [Updated] (PARQUET-1197) Log rat failures

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1197: -- Fix Version/s: format-2.5.0 > Log rat failures > > >

[jira] [Updated] (PARQUET-1201) Write column indexes

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1201: -- Fix Version/s: format-2.5.0 > Write column indexes > > >

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371373#comment-16371373 ] ASF GitHub Bot commented on PARQUET-1225: - majetideepak commented on a change in pull request

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371376#comment-16371376 ] ASF GitHub Bot commented on PARQUET-1225: - majetideepak commented on a change in pull request

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371224#comment-16371224 ] ASF GitHub Bot commented on PARQUET-1225: - boroknagyz commented on a change in pull request

New parquet-format release 2.5.0

2018-02-21 Thread Gabor Szadovszky
Hi, I’ve created PARQUET-1234 to track the parquet-format release 2.5.0. Added “format-2.5.0” to the fix version of all the related JIRAs:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371223#comment-16371223 ] ASF GitHub Bot commented on PARQUET-1225: - boroknagyz commented on a change in pull request

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371225#comment-16371225 ] ASF GitHub Bot commented on PARQUET-1225: - boroknagyz commented on a change in pull request

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371282#comment-16371282 ] ASF GitHub Bot commented on PARQUET-1225: - zivanfi commented on a change in pull request #444:

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371281#comment-16371281 ] ASF GitHub Bot commented on PARQUET-1225: - zivanfi commented on a change in pull request #444:

[jira] [Updated] (PARQUET-1171) [C++] Clarify valid uses for RLE, BIT_PACKED encodings

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1171: -- Fix Version/s: (was: format-2.4.0) format-2.5.0 > [C++]

[jira] [Updated] (PARQUET-1224) [parquet-cpp] Implement specification-compliant floating point comparison

2018-02-21 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Ivanfi updated PARQUET-1224: --- Summary: [parquet-cpp] Implement specification-compliant floating point comparison (was:

[jira] [Updated] (PARQUET-1231) Not able to load the LocalFileSystem class

2018-02-21 Thread Persistent NGP (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Persistent NGP updated PARQUET-1231: Priority: Blocker (was: Major) > Not able to load the LocalFileSystem class >

[jira] [Updated] (PARQUET-1156) dev/merge_parquet_pr.py problems

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1156: -- Fix Version/s: format-2.5.0 > dev/merge_parquet_pr.py problems >

[jira] [Updated] (PARQUET-1065) Deprecate type-defined sort ordering for INT96 type

2018-02-21 Thread Gabor Szadovszky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1065: -- Fix Version/s: format-2.5.0 > Deprecate type-defined sort ordering for INT96 type >

[jira] [Resolved] (PARQUET-787) Add a size limit for heap allocations when reading

2018-02-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved PARQUET-787. --- Resolution: Fixed Fix Version/s: 1.10.0 Merged #390. > Add a size limit for heap allocations

FINAL REMINDER: CFP for Apache EU Roadshow Closes 25th February

2018-02-21 Thread Sharan F
Hello Apache Supporters and Enthusiasts, This is your FINAL reminder that the Call for Papers (CFP) for the Apache EU Roadshow is closing soon. Our Apache EU Roadshow will focus on Cloud, IoT, Apache Tomcat, Apache Http and will run from 13-14 June 2018 in Berlin. Note that the CFP deadline

[jira] [Commented] (PARQUET-1234) Release Parquet format 2.5.0

2018-02-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371749#comment-16371749 ] Ryan Blue commented on PARQUET-1234: Are we going to release a 2.4.1 with the changes for column

[jira] [Commented] (PARQUET-1233) [CPP ]Enable option to switch between stl classes and boost classes for thrift header

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371776#comment-16371776 ] ASF GitHub Bot commented on PARQUET-1233: - majetideepak commented on issue #443: PARQUET-1233:

[jira] [Commented] (PARQUET-1233) [CPP ]Enable option to switch between stl classes and boost classes for thrift header

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371793#comment-16371793 ] ASF GitHub Bot commented on PARQUET-1233: - wesm closed pull request #443: PARQUET-1233: Enable

[jira] [Commented] (PARQUET-1233) [CPP ]Enable option to switch between stl classes and boost classes for thrift header

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371762#comment-16371762 ] ASF GitHub Bot commented on PARQUET-1233: - wesm commented on issue #443: PARQUET-1233: Enable

[jira] [Resolved] (PARQUET-1233) [CPP ]Enable option to switch between stl classes and boost classes for thrift header

2018-02-21 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved PARQUET-1233. --- Resolution: Fixed Issue resolved by pull request 443

Re: Parquet modular encryption

2018-02-21 Thread Gidon Gershinsky
A first PR in the encryption series is sent (to parquet-format), please review. Regards, Gidon

Re: Contributing parquet-rs to Apache?

2018-02-21 Thread Wes McKinney
hi Ivan and Chao, Since this work has been ongoing for more than a year, it would be best to conduct an IP clearance to import it into the Apache Parquet project (http://incubator.apache.org/ip-clearance/). One or more members of the PMC will need to assist with this to prepare the documentation for

[jira] [Commented] (PARQUET-1225) NaN values may lead to incorrect filtering under certain circumstances

2018-02-21 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372008#comment-16372008 ] ASF GitHub Bot commented on PARQUET-1225: - majetideepak commented on issue #444: PARQUET-1225:

[jira] [Commented] (PARQUET-1234) Release Parquet format 2.5.0

2018-02-21 Thread Gidon Gershinsky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371806#comment-16371806 ] Gidon Gershinsky commented on PARQUET-1234: --- Would it make sense to include the encryption PRs

Re: [VOTE] Release Apache Parquet C++ 1.4.0 RC0

2018-02-21 Thread Deepak Majeti
Yes! The min/max will be set to NaN in the case when all the values are NaN. On Wed, Feb 21, 2018 at 10:54 AM, Zoltan Borok-Nagy wrote: > Deepak, just for clarification, does it mean that parquet-cpp will also > write statistics when all the values are NaN? > > > On
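
A minimal sketch of the NaN handling discussed in the "[VOTE] Release Apache Parquet C++ 1.4.0 RC0" thread, assuming only the semantics described in those messages: min/max statistics skip NaN the way fmin()/fmax() in math.h do, and are NaN only when every value is NaN. The function name nan_aware_min_max is hypothetical and is not part of parquet-cpp or Impala; returning None for an empty input is likewise an assumption made for illustration.

```python
import math


def nan_aware_min_max(values):
    """Compute (min, max) over floats, ignoring NaN like C's fmin()/fmax().

    Returns (nan, nan) when every value is NaN, and None for empty input
    (the empty-input behaviour is an assumption, not taken from the thread).
    """
    lo = math.nan
    hi = math.nan
    seen_any = False
    for v in values:
        seen_any = True
        if math.isnan(v):
            # A NaN never replaces a real value, mirroring fmin()/fmax().
            continue
        if math.isnan(lo) or v < lo:
            lo = v
        if math.isnan(hi) or v > hi:
            hi = v
    if not seen_any:
        return None
    return lo, hi


if __name__ == "__main__":
    print(nan_aware_min_max([1.0, math.nan, -2.0]))
    print(nan_aware_min_max([math.nan, math.nan]))
```

With statistics computed this way, a reader can still rely on min/max for filtering when a column mixes NaN with ordinary values, and has to treat NaN bounds as "no usable range" only in the all-NaN case.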