[jira] [Commented] (PARQUET-2365) Fixes NPE when rewriting column without column index

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776990#comment-17776990 ] ASF GitHub Bot commented on PARQUET-2365: - wgtmac commented on code in PR #1173: URL:

Re: [PR] PARQUET-2365 : Fixes NPE when rewriting column without column index [parquet-mr]

2023-10-18 Thread via GitHub
wgtmac commented on code in PR #1173: URL: https://github.com/apache/parquet-mr/pull/1173#discussion_r1364833487 ## parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java: ## @@ -543,6 +546,11 @@ public static ColumnIndex build(

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776977#comment-17776977 ] ASF GitHub Bot commented on PARQUET-2366: - wgtmac commented on code in PR #1174: URL:

[jira] [Resolved] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-18 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2361. -- Fix Version/s: 1.14.0 Assignee: Feng Jiajie Resolution: Fixed > Reduce failure rate

[jira] [Commented] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776978#comment-17776978 ] ASF GitHub Bot commented on PARQUET-2361: - wgtmac merged PR #1170: URL:

Re: [PR] PARQUET-2361: Reduce failure rate of unit test [parquet-mr]

2023-10-18 Thread via GitHub
wgtmac merged PR #1170: URL: https://github.com/apache/parquet-mr/pull/1170 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-18 Thread via GitHub
wgtmac commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1364799606 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/IndexCacher.java: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[jira] [Commented] (PARQUET-2355) Deprecate parquet-thrift

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776764#comment-17776764 ] ASF GitHub Bot commented on PARQUET-2355: - Fokko commented on code in PR #1175: URL:

Re: [PR] PARQUET-2355: Deprecate `parquet-thrift` [parquet-mr]

2023-10-18 Thread via GitHub
Fokko commented on code in PR #1175: URL: https://github.com/apache/parquet-mr/pull/1175#discussion_r1364139352 ## pom.xml: ## @@ -544,6 +544,17 @@ org.apache.parquet.hadoop.ColumnChunkPageWriteStore

[jira] [Commented] (PARQUET-2355) Deprecate parquet-thrift

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776762#comment-17776762 ] ASF GitHub Bot commented on PARQUET-2355: - Fokko commented on code in PR #1175: URL:

Re: [PR] PARQUET-2355: Deprecate `parquet-thrift` [parquet-mr]

2023-10-18 Thread via GitHub
Fokko commented on code in PR #1175: URL: https://github.com/apache/parquet-mr/pull/1175#discussion_r1364137514 ## pom.xml: ## @@ -544,6 +544,17 @@ org.apache.parquet.hadoop.ColumnChunkPageWriteStore

[jira] [Commented] (PARQUET-2355) Deprecate parquet-thrift

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776761#comment-17776761 ] ASF GitHub Bot commented on PARQUET-2355: - Fokko commented on code in PR #1175: URL:

[jira] [Commented] (PARQUET-2355) Deprecate parquet-thrift

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776763#comment-17776763 ] ASF GitHub Bot commented on PARQUET-2355: - Fokko commented on code in PR #1175: URL:

[PR] PARQUET-2368: Update japicmp to 1.18.1 [parquet-mr]

2023-10-18 Thread via GitHub
Fokko opened a new pull request, #1176: URL: https://github.com/apache/parquet-mr/pull/1176 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them in

Re: [PR] PARQUET-2355: Deprecate `parquet-thrift` [parquet-mr]

2023-10-18 Thread via GitHub
Fokko commented on code in PR #1175: URL: https://github.com/apache/parquet-mr/pull/1175#discussion_r1364138421 ## pom.xml: ## @@ -544,6 +544,17 @@ org.apache.parquet.hadoop.ColumnChunkPageWriteStore

Re: [PR] PARQUET-2355: Deprecate `parquet-thrift` [parquet-mr]

2023-10-18 Thread via GitHub
Fokko commented on code in PR #1175: URL: https://github.com/apache/parquet-mr/pull/1175#discussion_r1364137946 ## pom.xml: ## @@ -544,6 +544,17 @@ org.apache.parquet.hadoop.ColumnChunkPageWriteStore

[jira] [Commented] (PARQUET-2368) Update japicmp to 1.18.1

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776760#comment-17776760 ] ASF GitHub Bot commented on PARQUET-2368: - Fokko opened a new pull request, #1176: URL:

[jira] [Created] (PARQUET-2368) Update japicmp to 1.18.1

2023-10-18 Thread Fokko Driesprong (Jira)
Fokko Driesprong created PARQUET-2368: - Summary: Update japicmp to 1.18.1 Key: PARQUET-2368 URL: https://issues.apache.org/jira/browse/PARQUET-2368 Project: Parquet Issue Type:

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-18 Thread via GitHub
ConeyLiu commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1363996076 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java: ## @@ -92,16 +95,21 @@ public FileEncryptionProperties

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776719#comment-17776719 ] ASF GitHub Bot commented on PARQUET-2366: - ConeyLiu commented on code in PR #1174: URL:

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776718#comment-17776718 ] ASF GitHub Bot commented on PARQUET-2366: - ConeyLiu commented on code in PR #1174: URL:

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-18 Thread via GitHub
ConeyLiu commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1363994746 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/IndexCacher.java: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[PR] PARQUET-2355: Deprecate `parquet-thrift` [parquet-mr]

2023-10-18 Thread via GitHub
Fokko opened a new pull request, #1175: URL: https://github.com/apache/parquet-mr/pull/1175 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them in

[jira] [Commented] (PARQUET-2355) Deprecate parquet-thrift

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776689#comment-17776689 ] ASF GitHub Bot commented on PARQUET-2355: - Fokko opened a new pull request, #1175: URL:

[jira] [Updated] (PARQUET-2355) Deprecate parquet-thrift

2023-10-18 Thread Fokko Driesprong (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong updated PARQUET-2355: -- Summary: Deprecate parquet-thrift (was: Remove parquet-thrift) > Deprecate

[jira] [Commented] (PARQUET-411) Format: Add a flag when min/max are truncated

2023-10-18 Thread Raunaq Morarka (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776580#comment-17776580 ] Raunaq Morarka commented on PARQUET-411: I believe this issue is addressed by the changes done

[jira] [Commented] (PARQUET-2352) Update parquet format spec to allow truncation of row group min/max stats

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776577#comment-17776577 ] ASF GitHub Bot commented on PARQUET-2352: - raunaqmorarka commented on code in PR #216: URL:

Re: [PR] PARQUET-2352: Allow truncation of row group min_values/max_value statistics [parquet-format]

2023-10-18 Thread via GitHub
raunaqmorarka commented on code in PR #216: URL: https://github.com/apache/parquet-format/pull/216#discussion_r1363572486 ## src/main/thrift/parquet.thrift: ## @@ -216,7 +216,12 @@ struct Statistics { /** count of distinct values occurring */ 4: optional i64

[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776574#comment-17776574 ] ASF GitHub Bot commented on PARQUET-2042: - wgtmac commented on PR #900: URL:

Re: [PR] PARQUET-2042: Add support for unwrapping common Protobuf wrappers and… [parquet-mr]

2023-10-18 Thread via GitHub
wgtmac commented on PR #900: URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1768050027 I just noticed this PR and sorry to see it does not check in. @mwong38 Could you try rebase it one last time? Thanks!

Re: [PR] PARQUET-2352: Allow truncation of row group min_values/max_value statistics [parquet-format]

2023-10-18 Thread via GitHub
wgtmac commented on code in PR #216: URL: https://github.com/apache/parquet-format/pull/216#discussion_r1363545890 ## src/main/thrift/parquet.thrift: ## @@ -216,7 +216,12 @@ struct Statistics { /** count of distinct values occurring */ 4: optional i64 distinct_count;

[jira] [Commented] (PARQUET-2352) Update parquet format spec to allow truncation of row group min/max stats

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776571#comment-17776571 ] ASF GitHub Bot commented on PARQUET-2352: - wgtmac merged PR #216: URL:

Re: [PR] PARQUET-2352: Allow truncation of row group min_values/max_value statistics [parquet-format]

2023-10-18 Thread via GitHub
wgtmac merged PR #216: URL: https://github.com/apache/parquet-format/pull/216

[jira] [Resolved] (PARQUET-2352) Update parquet format spec to allow truncation of row group min/max stats

2023-10-18 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2352. -- Fix Version/s: format-2.10.0 Assignee: Raunaq Morarka Resolution: Fixed > Update

[jira] [Commented] (PARQUET-2352) Update parquet format spec to allow truncation of row group min/max stats

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776573#comment-17776573 ] ASF GitHub Bot commented on PARQUET-2352: - wgtmac commented on code in PR #216: URL:

[jira] [Commented] (PARQUET-1647) [Java] support for Arrow's float16

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776570#comment-17776570 ] ASF GitHub Bot commented on PARQUET-1647: - wgtmac commented on code in PR #1142: URL:

Re: [PR] PARQUET-1647: [Java][Parquet] Implement FLOAT16 logical type [parquet-mr]

2023-10-18 Thread via GitHub
wgtmac commented on code in PR #1142: URL: https://github.com/apache/parquet-mr/pull/1142#discussion_r1363539290 ## parquet-common/src/main/java/org/apache/parquet/type/Float16.java: ## @@ -0,0 +1,307 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776567#comment-17776567 ] ASF GitHub Bot commented on PARQUET-2366: - wgtmac commented on code in PR #1174: URL:

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-18 Thread via GitHub
wgtmac commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1363512338 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/ParquetRewriter.java: ## @@ -265,6 +265,10 @@ private void processBlocksFromReader() throws

[jira] [Commented] (PARQUET-2366) Optimize random seek during rewriting

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776536#comment-17776536 ] ASF GitHub Bot commented on PARQUET-2366: - ConeyLiu commented on code in PR #1174: URL:

Re: [PR] PARQUET-2366: Optimize random seek during rewriting [parquet-mr]

2023-10-18 Thread via GitHub
ConeyLiu commented on code in PR #1174: URL: https://github.com/apache/parquet-mr/pull/1174#discussion_r1363403671 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/ParquetRewriter.java: ## @@ -265,6 +265,10 @@ private void processBlocksFromReader() throws

[jira] [Commented] (PARQUET-1647) [Java] support for Arrow's float16

2023-10-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776524#comment-17776524 ] ASF GitHub Bot commented on PARQUET-1647: - gszadovszky commented on code in PR #1142: URL:

Re: [PR] PARQUET-1647: [Java][Parquet] Implement FLOAT16 logical type [parquet-mr]

2023-10-18 Thread via GitHub
gszadovszky commented on code in PR #1142: URL: https://github.com/apache/parquet-mr/pull/1142#discussion_r1363371525 ## parquet-common/src/main/java/org/apache/parquet/type/Float16.java: ## @@ -0,0 +1,307 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one +