[jira] [Created] (PARQUET-1073) Hive failed to parse Parquet file generated by Spark SQL

2017-08-01 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1073: Summary: Hive failed to parse Parquet file generated by Spark SQL Key: PARQUET-1073 URL: https://issues.apache.org/jira/browse/PARQUET-1073 Project: Parquet

[jira] [Commented] (PARQUET-1061) parquet dictionary filter does not work.

2017-08-02 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111918#comment-16111918 ] Junjie Chen commented on PARQUET-1061: -- Hi [~blue_impala_48d6] Could you please try it out? >

[jira] [Resolved] (PARQUET-1073) Hive failed to parse Parquet file generated by Spark SQL

2017-08-01 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1073. -- Resolution: Not A Problem > Hive failed to parse Parquet file generated by Spark SQL >

[jira] [Commented] (PARQUET-1073) Hive failed to parse Parquet file generated by Spark SQL

2017-08-01 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110070#comment-16110070 ] Junjie Chen commented on PARQUET-1073: -- Thanks [~hyukjin.kwon], it works! > Hive failed to parse

[jira] [Created] (PARQUET-1061) parquet dictionary filter does not work.

2017-07-18 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1061: Summary: parquet dictionary filter does not work. Key: PARQUET-1061 URL: https://issues.apache.org/jira/browse/PARQUET-1061 Project: Parquet Issue Type: Bug

[jira] [Commented] (PARQUET-1061) parquet dictionary filter does not work.

2017-07-18 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091310#comment-16091310 ] Junjie Chen commented on PARQUET-1061: -- Hi [~blue_impala_48d6] Could you please help take a look?

[jira] [Comment Edited] (PARQUET-1061) parquet dictionary filter does not work.

2017-07-18 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092409#comment-16092409 ] Junjie Chen edited comment on PARQUET-1061 at 7/19/17 12:42 AM: Yes, I

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-07-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094006#comment-16094006 ] Junjie Chen commented on PARQUET-41: Thanks Jim Very useful links and example code! > Add bloom

[jira] [Resolved] (PARQUET-1061) parquet dictionary filter does not work.

2017-08-07 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1061. -- Resolution: Not A Problem > parquet dictionary filter does not work. >

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-18 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015449#comment-16015449 ] Junjie Chen commented on PARQUET-41: Hi [~rdblue] We have a real use case from a Telecom company which

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018218#comment-16018218 ] Junjie Chen edited comment on PARQUET-41 at 5/20/17 1:00 AM: - Hi [~rdblue] In

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018218#comment-16018218 ] Junjie Chen commented on PARQUET-41: Hi [~rdblue] In telecom example, query column is not unique if

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018218#comment-16018218 ] Junjie Chen edited comment on PARQUET-41 at 5/20/17 1:05 AM: - Hi [~rdblue] In

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018218#comment-16018218 ] Junjie Chen edited comment on PARQUET-41 at 5/20/17 12:58 AM: -- Hi [~rdblue]

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-22 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018218#comment-16018218 ] Junjie Chen edited comment on PARQUET-41 at 5/22/17 6:26 AM: - Hi [~rdblue] In

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-05-18 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016790#comment-16016790 ] Junjie Chen commented on PARQUET-41: Hi [~rdblue] The distinct values in each column is increasing

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2017-09-05 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143463#comment-16143463 ] Junjie Chen edited comment on PARQUET-41 at 9/6/17 3:07 AM: Hi

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-08-27 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143288#comment-16143288 ] Junjie Chen commented on PARQUET-41: Add related [design

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2017-08-28 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143463#comment-16143463 ] Junjie Chen commented on PARQUET-41: please see initial PR:

[jira] [Comment Edited] (PARQUET-41) Add bloom filters to parquet statistics

2017-08-28 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143463#comment-16143463 ] Junjie Chen edited comment on PARQUET-41 at 8/28/17 7:28 AM: - please see

[jira] [Commented] (PARQUET-1134) Release Parquet format 2.4.0

2017-10-10 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199631#comment-16199631 ] Junjie Chen commented on PARQUET-1134: -- Can we include PARQUET-41? > Release Parquet format 2.4.0

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-23 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486936#comment-16486936 ] Junjie Chen commented on PARQUET-41: [~jbapple], I understood your point, I will do benchmark to

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-12 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509737#comment-16509737 ] Junjie Chen commented on PARQUET-41: Hi Here is benchmark link:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-15 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513908#comment-16513908 ] Junjie Chen commented on PARQUET-41: Thanks [~jbapple] Since the jira may contains several

[jira] [Created] (PARQUET-1329) integrate parquet bloom filter into row group filter logic

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1329: Summary: integrate parquet bloom filter into row group filter logic Key: PARQUET-1329 URL: https://issues.apache.org/jira/browse/PARQUET-1329 Project: Parquet

[jira] [Created] (PARQUET-1328) parquet bloom filter writer implementation

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1328: Summary: parquet bloom filter writer implementation Key: PARQUET-1328 URL: https://issues.apache.org/jira/browse/PARQUET-1328 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1326) parquet bloom filter support in parquet cpp

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1326: Summary: parquet bloom filter support in parquet cpp Key: PARQUET-1326 URL: https://issues.apache.org/jira/browse/PARQUET-1326 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1327) parquet bloom filter reader implementation

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1327: Summary: parquet bloom filter reader implementation Key: PARQUET-1327 URL: https://issues.apache.org/jira/browse/PARQUET-1327 Project: Parquet Issue Type:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517638#comment-16517638 ] Junjie Chen commented on PARQUET-41: [~jbapple], I just created a new parquet-format PR since

[jira] [Assigned] (PARQUET-41) Add bloom filters to parquet statistics

2018-05-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-41: -- Assignee: Junjie Chen (was: Ferdinand Xu) > Add bloom filters to parquet statistics >

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338670#comment-16338670 ] Junjie Chen commented on PARQUET-41: Hi [~jbapple], AFAIK, we don't have benchmark progress to compare

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338706#comment-16338706 ] Junjie Chen commented on PARQUET-41: In Parquet-mr, when we set dictionary encoding to true, the

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-24 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338722#comment-16338722 ] Junjie Chen commented on PARQUET-41: Sure, it is feasible, then we are comparing bloom filter vs

[jira] [Created] (PARQUET-1332) Add bloom filter utility class

2018-06-21 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1332: Summary: Add bloom filter utility class Key: PARQUET-1332 URL: https://issues.apache.org/jira/browse/PARQUET-1332 Project: Parquet Issue Type: Sub-task

[jira] [Commented] (PARQUET-1332) Add bloom filter utility class

2018-06-21 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519914#comment-16519914 ] Junjie Chen commented on PARQUET-1332: -- PR for parquet-mr:

[jira] [Created] (PARQUET-1377) [C++] replace shared_ptr to unique_ptr in Bloom filter buffer allocation

2018-08-10 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1377: Summary: [C++] replace shared_ptr to unique_ptr in Bloom filter buffer allocation Key: PARQUET-1377 URL: https://issues.apache.org/jira/browse/PARQUET-1377 Project:

[jira] [Created] (PARQUET-1380) move Bloom filter test binary to parquet-testing repo

2018-08-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1380: Summary: move Bloom filter test binary to parquet-testing repo Key: PARQUET-1380 URL: https://issues.apache.org/jira/browse/PARQUET-1380 Project: Parquet

[jira] [Commented] (PARQUET-1380) move Bloom filter test binary to parquet-testing repo

2018-08-15 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581758#comment-16581758 ] Junjie Chen commented on PARQUET-1380: -- Hi [~wesmckinn], I created this to track following thing

[jira] [Commented] (PARQUET-1385) [C++] bloom_filter-test is very slow under valgrind

2018-08-17 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583425#comment-16583425 ] Junjie Chen commented on PARQUET-1385: -- The GetRandomString function is very slow, I can change to

[jira] [Commented] (PARQUET-1385) [C++] bloom_filter-test is very slow under valgrind

2018-08-17 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583489#comment-16583489 ] Junjie Chen commented on PARQUET-1385: -- std::seed_seq::generate takes more than 75% cpu cycles

[jira] [Updated] (PARQUET-1327) [C++]Bloom filter read/write implementation

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1327: - Summary: [C++]Bloom filter read/write implementation (was: parquet bloom filter reader

[jira] [Assigned] (PARQUET-1328) [java]Bloom filter read/write implementation

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1328: Assignee: Junjie Chen > [java]Bloom filter read/write implementation >

[jira] [Updated] (PARQUET-1328) [java]Bloom filter read/write implementation

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1328: - Summary: [java]Bloom filter read/write implementation (was: parquet bloom filter writer

[jira] [Updated] (PARQUET-1329) [C++] Integrate Bloom filter into row group filter logic

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1329: - Summary: [C++] Integrate Bloom filter into row group filter logic (was: integrate parquet

[jira] [Created] (PARQUET-1391) [java] Integrate Bloom filter logic

2018-08-19 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1391: Summary: [java] Integrate Bloom filter logic Key: PARQUET-1391 URL: https://issues.apache.org/jira/browse/PARQUET-1391 Project: Parquet Issue Type: Sub-task

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-07-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550201#comment-16550201 ] Junjie Chen commented on PARQUET-41: [~aniket486], Thanks for watching this. Yes, I 'm still

[jira] [Comment Edited] (PARQUET-1332) [C++] Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519914#comment-16519914 ] Junjie Chen edited comment on PARQUET-1332 at 6/29/18 8:28 AM: --- PR for

[jira] [Updated] (PARQUET-1332) [C++] Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1332: - Summary: [C++] Add bloom filter utility class (was: Add bloom filter utility class) > [C++]

[jira] [Updated] (PARQUET-1332) [C++] Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1332: - Component/s: parquet-cpp > [C++] Add bloom filter utility class >

[jira] [Commented] (PARQUET-1342) Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527315#comment-16527315 ] Junjie Chen commented on PARQUET-1342: -- PR: https://github.com/apache/parquet-mr/pull/425 > Add

[jira] [Updated] (PARQUET-1332) Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1332: - Fix Version/s: (was: 1.10.0) 1.11.0 > Add bloom filter utility class >

[jira] [Updated] (PARQUET-1326) [C++] Cross compatibility support with parquet-mr

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1326: - Summary: [C++] Cross compatibility support with parquet-mr (was: parquet bloom filter support

[jira] [Created] (PARQUET-1342) Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1342: Summary: Add bloom filter utility class Key: PARQUET-1342 URL: https://issues.apache.org/jira/browse/PARQUET-1342 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1453) Support nested column Bloom filter

2018-10-30 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1453: Summary: Support nested column Bloom filter Key: PARQUET-1453 URL: https://issues.apache.org/jira/browse/PARQUET-1453 Project: Parquet Issue Type: Sub-task

[jira] [Comment Edited] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744019#comment-16744019 ] Junjie Chen edited comment on PARQUET-1493 at 1/16/19 1:12 PM: --- Hi

[jira] [Commented] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744019#comment-16744019 ] Junjie Chen commented on PARQUET-1493: -- Hi [~gszadovszky] I just tried 3.6.0.2, still failed.

[jira] [Commented] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744050#comment-16744050 ] Junjie Chen commented on PARQUET-1493: -- Thanks I open a issue there:

[jira] [Created] (PARQUET-1495) Perform encoding before bloom filters write out

2019-01-21 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1495: Summary: Perform encoding before bloom filters write out Key: PARQUET-1495 URL: https://issues.apache.org/jira/browse/PARQUET-1495 Project: Parquet Issue

[jira] [Updated] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1493: - Description: I checked out master branch and executed "mvn clean install -DskipTests", it

[jira] [Created] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1493: Summary: maven protobuf plugin not work properly Key: PARQUET-1493 URL: https://issues.apache.org/jira/browse/PARQUET-1493 Project: Parquet Issue Type: Bug

[jira] [Commented] (PARQUET-1328) [java]Bloom filter read/write implementation

2019-01-11 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740965#comment-16740965 ] Junjie Chen commented on PARQUET-1328: -- [~zi], Jim had reviewed some on this and we need some more

[jira] [Created] (PARQUET-1516) Store Bloom filters near to footer.

2019-01-27 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1516: Summary: Store Bloom filters near to footer. Key: PARQUET-1516 URL: https://issues.apache.org/jira/browse/PARQUET-1516 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1553) Support xxHash in Bloom filter

2019-04-02 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1553: Summary: Support xxHash in Bloom filter Key: PARQUET-1553 URL: https://issues.apache.org/jira/browse/PARQUET-1553 Project: Parquet Issue Type: New Feature

[jira] [Created] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.7.0.1

2019-03-28 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1552: Summary: upgrade protoc-jar-maven-plugin to 3.7.0.1 Key: PARQUET-1552 URL: https://issues.apache.org/jira/browse/PARQUET-1552 Project: Parquet Issue Type:

[jira] [Assigned] (PARQUET-319) Define the parquet bloom filter statistics in parquet format

2019-02-14 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-319: --- Assignee: Junjie Chen (was: Ferdinand Xu) > Define the parquet bloom filter statistics in

[jira] [Created] (PARQUET-1592) update hash naming of bloom filter

2019-06-11 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1592: Summary: update hash naming of bloom filter Key: PARQUET-1592 URL: https://issues.apache.org/jira/browse/PARQUET-1592 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1609) support xxhash in bloom filter

2019-06-25 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1609: Summary: support xxhash in bloom filter Key: PARQUET-1609 URL: https://issues.apache.org/jira/browse/PARQUET-1609 Project: Parquet Issue Type: Improvement

[jira] [Commented] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.8.0

2019-06-20 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869108#comment-16869108 ] Junjie Chen commented on PARQUET-1552: -- v3.7.0.1 does not fix the problem, v3.8.0 fix it. >

[jira] [Updated] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.8.0

2019-06-20 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1552: - Summary: upgrade protoc-jar-maven-plugin to 3.8.0 (was: upgrade protoc-jar-maven-plugin to

[jira] [Updated] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.8.0

2019-06-20 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1552: - Description: Current protoc-jar-maven-plugin has a problem when building project after a proxy

[jira] [Created] (PARQUET-1617) Add more details to bloom filter spec

2019-07-05 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1617: Summary: Add more details to bloom filter spec Key: PARQUET-1617 URL: https://issues.apache.org/jira/browse/PARQUET-1617 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1625) Update parquet thrift to align with spec

2019-07-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1625: Summary: Update parquet thrift to align with spec Key: PARQUET-1625 URL: https://issues.apache.org/jira/browse/PARQUET-1625 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1630) Resolve Bloom filter spec concerns

2019-08-04 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1630: Summary: Resolve Bloom filter spec concerns Key: PARQUET-1630 URL: https://issues.apache.org/jira/browse/PARQUET-1630 Project: Parquet Issue Type: Sub-task

[jira] [Assigned] (PARQUET-1592) update hash naming of bloom filter

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1592: Assignee: Junjie Chen > update hash naming of bloom filter >

[jira] [Resolved] (PARQUET-1630) Resolve Bloom filter spec concerns

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1630. -- Resolution: Fixed > Resolve Bloom filter spec concerns > --

[jira] [Resolved] (PARQUET-1592) update hash naming of bloom filter

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1592. -- Resolution: Fixed > update hash naming of bloom filter > --

[jira] [Assigned] (PARQUET-1630) Resolve Bloom filter spec concerns

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1630: Assignee: Junjie Chen > Resolve Bloom filter spec concerns >

[jira] [Commented] (PARQUET-1570) Publish 1.11.0 to maven central

2019-09-02 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921113#comment-16921113 ] Junjie Chen commented on PARQUET-1570: -- We may need to resolve PARQUET-1434 at first. > Publish

[jira] [Resolved] (PARQUET-1617) Add more details to bloom filter spec

2019-09-09 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1617. -- Resolution: Fixed > Add more details to bloom filter spec >

[jira] [Resolved] (PARQUET-1609) support xxhash in bloom filter

2019-09-09 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1609. -- Resolution: Fixed > support xxhash in bloom filter > -- > >

[jira] [Assigned] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-07 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1632: Assignee: Junjie Chen > Negative initial size when writing large values in parquet-mr >

[jira] [Commented] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-07 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901892#comment-16901892 ] Junjie Chen commented on PARQUET-1632: -- I will take a look into this. > Negative initial size

[jira] [Commented] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-08 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902820#comment-16902820 ] Junjie Chen commented on PARQUET-1632: -- The CapacityByteArrayOutputStream is overflowed since it

[jira] [Commented] (PARQUET-1326) [C++] Cross compatibility support with parquet-mr

2019-07-31 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897323#comment-16897323 ] Junjie Chen commented on PARQUET-1326: -- We need to consider to make the integration test

[jira] [Commented] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-09 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903870#comment-16903870 ] Junjie Chen commented on PARQUET-1632: -- Reopen this first. I think the ByteInput get from

[jira] [Resolved] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-08 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1632. -- Resolution: Not A Problem it is a configuration issue. > Negative initial size when writing

[jira] [Commented] (PARQUET-1434) Release parquet-mr 1.11.0

2019-07-22 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890604#comment-16890604 ] Junjie Chen commented on PARQUET-1434: -- [~gszadovszky], What remaining contents are still in

[jira] [Commented] (PARQUET-1657) [C++] Change Bloom filter implementation to use xxhash

2019-09-18 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932998#comment-16932998 ] Junjie Chen commented on PARQUET-1657: -- Great, the Bloom filter thrift definition was agreed and

[jira] [Created] (PARQUET-1658) travis preparing script for bloom-filter branch failed

2019-09-20 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1658: Summary: travis preparing script for bloom-filter branch failed Key: PARQUET-1658 URL: https://issues.apache.org/jira/browse/PARQUET-1658 Project: Parquet

[jira] [Updated] (PARQUET-319) Define the parquet bloom filter statistics in parquet format

2019-10-10 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-319: Fix Version/s: format-2.7.0 > Define the parquet bloom filter statistics in parquet format >

[jira] [Created] (PARQUET-1795) merge bloom filter feature branch to master

2020-02-12 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1795: Summary: merge bloom filter feature branch to master Key: PARQUET-1795 URL: https://issues.apache.org/jira/browse/PARQUET-1795 Project: Parquet Issue Type:

[jira] [Assigned] (PARQUET-1453) Support nested column Bloom filter

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1453: Assignee: Junjie Chen > Support nested column Bloom filter >

[jira] [Resolved] (PARQUET-1328) [java]Bloom filter read/write implementation

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1328. -- Fix Version/s: 1.11.1 Resolution: Fixed > [java]Bloom filter read/write

[jira] [Resolved] (PARQUET-1453) Support nested column Bloom filter

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1453. -- Fix Version/s: 1.11.1 Resolution: Fixed > Support nested column Bloom filter >

[jira] [Resolved] (PARQUET-41) Add bloom filters to parquet statistics

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-41. Fix Version/s: 1.11.1 Resolution: Fixed > Add bloom filters to parquet statistics >

[jira] [Resolved] (PARQUET-1516) Store Bloom filters near to footer.

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1516. -- Fix Version/s: 1.11.1 Assignee: Junjie Chen Resolution: Fixed > Store Bloom

[jira] [Resolved] (PARQUET-1391) [java] Integrate Bloom filter logic

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1391. -- Fix Version/s: 1.11.1 Assignee: Junjie Chen Resolution: Fixed > [java]

[jira] [Assigned] (PARQUET-1795) merge bloom filter feature branch to master

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1795: Assignee: Junjie Chen > merge bloom filter feature branch to master >

[jira] [Resolved] (PARQUET-1795) merge bloom filter feature branch to master

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1795. -- Resolution: Not A Problem > merge bloom filter feature branch to master >

[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-12 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013979#comment-17013979 ] Junjie Chen commented on PARQUET-1758: -- It might be better to draft a discussion on mail list for