[jira] [Commented] (PARQUET-1830) Vectorized API to support Column Index in Apache Spark

2020-03-30 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070971#comment-17070971 ] Gabor Szadovszky commented on PARQUET-1830: --- [~FelixKJose], you said you would prefer option

[jira] [Assigned] (PARQUET-1827) UUID type currently not supported by parquet-mr

2020-03-30 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1827: - Assignee: Gabor Szadovszky > UUID type currently not supported by parquet-mr

[jira] [Commented] (PARQUET-1830) Vectorized API to support Column Index in Apache Spark

2020-03-30 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070830#comment-17070830 ] Gabor Szadovszky commented on PARQUET-1830: --- [~FelixKJose], agreed. So this jira is to track

[jira] [Resolved] (PARQUET-1817) Crypto Properties Factory

2020-03-30 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1817. --- Resolution: Fixed > Crypto Properties Factory > - > >

[jira] [Resolved] (PARQUET-1805) Refactor the configuration for bloom filters

2020-03-30 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1805. --- Resolution: Fixed > Refactor the configuration for bloom filters >

[jira] [Commented] (PARQUET-1830) Vectorized API to support Column Index in Apache Spark

2020-03-27 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068431#comment-17068431 ] Gabor Szadovszky commented on PARQUET-1830: --- [~FelixKJose], the feature of having a

[jira] [Updated] (PARQUET-1828) Add a SSE2 path for the ByteStreamSplit encoder implementation

2020-03-26 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1828: -- Component/s: parquet-cpp > Add a SSE2 path for the ByteStreamSplit encoder

[jira] [Assigned] (PARQUET-1826) Document hadoop configuration options

2020-03-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1826: - Assignee: Walid Gara Based on our discussion in the Parquet sync I'm

[jira] [Assigned] (PARQUET-1787) Expected distinct numbers is not parsed correctly

2020-03-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1787: - Assignee: Walid Gara > Expected distinct numbers is not parsed correctly >

[jira] [Assigned] (PARQUET-1815) Add union API to BloomFilter interface

2020-03-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1815: - Assignee: Walid Gara > Add union API to BloomFilter interface >

[jira] [Assigned] (PARQUET-1816) Add intersection API to BloomFilter interface

2020-03-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1816: - Assignee: Walid Gara > Add intersection API to BloomFilter interface >

[jira] [Assigned] (PARQUET-1743) Add equals to BlockSplitBloomFilter

2020-03-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1743: - Assignee: Walid Gara > Add equals to BlockSplitBloomFilter >

[jira] [Created] (PARQUET-1826) Document hadoop configuration options

2020-03-25 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1826: - Summary: Document hadoop configuration options Key: PARQUET-1826 URL: https://issues.apache.org/jira/browse/PARQUET-1826 Project: Parquet Issue

[jira] [Commented] (PARQUET-1815) Add union API to BloomFilter interface

2020-03-18 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061641#comment-17061641 ] Gabor Szadovszky commented on PARQUET-1815: --- If one would like to use bloom filters out of

[jira] [Commented] (PARQUET-1816) Add intersection API to BloomFilter interface

2020-03-18 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061581#comment-17061581 ] Gabor Szadovszky commented on PARQUET-1816: --- Please, find my comment at PARQUET-1815. > Add

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2020-03-18 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061558#comment-17061558 ] Gabor Szadovszky commented on PARQUET-41: - [~junma], the target release for this feature is

[jira] [Commented] (PARQUET-1815) Add union API to BloomFilter interface

2020-03-18 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061545#comment-17061545 ] Gabor Szadovszky commented on PARQUET-1815: --- The currently implemented filters in parquet-mr

[jira] [Resolved] (PARQUET-1811) Update download links

2020-03-18 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1811. --- Resolution: Fixed > Update download links > - > >

[jira] [Created] (PARQUET-1811) Update download links

2020-03-05 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1811: - Summary: Update download links Key: PARQUET-1811 URL: https://issues.apache.org/jira/browse/PARQUET-1811 Project: Parquet Issue Type: Task

[jira] [Commented] (PARQUET-1809) Add new APIs for nested predicate pushdown

2020-03-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051959#comment-17051959 ] Gabor Szadovszky commented on PARQUET-1809: --- It would be nice to use string arrays (or maybe

[jira] [Assigned] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1808: - Assignee: Shankar Koirala > SimpleGroup.toString() uses String += and so has

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051933#comment-17051933 ] Gabor Szadovszky commented on PARQUET-1808: --- [~tiddman], I agree that the current project

[jira] [Commented] (PARQUET-1809) Add new APIs for nested predicate pushdown

2020-03-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051023#comment-17051023 ] Gabor Szadovszky commented on PARQUET-1809: --- I am afraid, it is not only the filter API that

[jira] [Commented] (PARQUET-1808) SimpleGroup.toString() uses String += and so has poor performance

2020-03-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050013#comment-17050013 ] Gabor Szadovszky commented on PARQUET-1808: --- [~tiddman], Thanks for filing this issue.

[jira] [Resolved] (PARQUET-1803) Could not find FilleInputSplit in ParquetInputSplit

2020-02-28 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1803. --- Resolution: Fixed > Could not find FilleInputSplit in ParquetInputSplit >

[jira] [Assigned] (PARQUET-1803) Could not find FilleInputSplit in ParquetInputSplit

2020-02-28 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1803: - Assignee: Shankar Koirala > Could not find FilleInputSplit in

[jira] [Updated] (PARQUET-1803) Could not find FilleInputSplit in ParquetInputSplit

2020-02-28 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1803: -- Affects Version/s: (was: format-2.7.0) 1.11.0 > Could not

[jira] [Created] (PARQUET-1805) Refactor the configuration for bloom filters

2020-02-26 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1805: - Summary: Refactor the configuration for bloom filters Key: PARQUET-1805 URL: https://issues.apache.org/jira/browse/PARQUET-1805 Project: Parquet

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2020-02-26 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045530#comment-17045530 ] Gabor Szadovszky commented on PARQUET-41: - [~junjie], feature branch for parquet-mr has been

[jira] [Resolved] (PARQUET-1784) Column-wise configuration

2020-02-26 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1784. --- Resolution: Fixed > Column-wise configuration > - > >

[jira] [Resolved] (PARQUET-1791) Add 'prune' command to parquet-tools

2020-02-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1791. --- Resolution: Fixed > Add 'prune' command to parquet-tools >

[jira] [Commented] (PARQUET-1381) Add merge blocks command to parquet-tools

2020-02-24 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043505#comment-17043505 ] Gabor Szadovszky commented on PARQUET-1381: --- I don't think anyone is working on it. Feel free

[jira] [Resolved] (PARQUET-1802) CompressionCodec class not found if the codec class is not in the same defining classloader as the CodecFactory class

2020-02-24 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1802. --- Resolution: Fixed > CompressionCodec class not found if the codec class is not in

[jira] [Assigned] (PARQUET-1802) CompressionCodec class not found if the codec class is not in the same defining classloader as the CodecFactory class

2020-02-20 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1802: - Assignee: Terence Yim > CompressionCodec class not found if the codec class

[jira] [Commented] (PARQUET-1774) Release parquet 1.11.1

2020-02-19 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039845#comment-17039845 ] Gabor Szadovszky commented on PARQUET-1774: --- Waiting for Spark to confirm that

[jira] [Updated] (PARQUET-1796) Bump Apache Avro to 1.9.2

2020-02-19 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1796: -- Fix Version/s: 1.11.1 > Bump Apache Avro to 1.9.2 > - > >

[jira] [Resolved] (PARQUET-1794) Random data generation may cause flaky tests

2020-02-17 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1794. --- Resolution: Fixed > Random data generation may cause flaky tests >

[jira] [Commented] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-02-17 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17038172#comment-17038172 ] Gabor Szadovszky commented on PARQUET-1801: --- Currently, only column indexes are the special

[jira] [Resolved] (PARQUET-1796) Bump Apache Avro to 1.9.2

2020-02-14 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1796. --- Resolution: Fixed > Bump Apache Avro to 1.9.2 > - > >

[jira] [Assigned] (PARQUET-1790) ParquetFileWriter missing Api for DataPageV2

2020-02-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1790: - Assignee: Brian Mwambazi > ParquetFileWriter missing Api for DataPageV2 >

[jira] [Resolved] (PARQUET-1790) ParquetFileWriter missing Api for DataPageV2

2020-02-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1790. --- Resolution: Fixed > ParquetFileWriter missing Api for DataPageV2 >

[jira] [Resolved] (PARQUET-1622) Add BYTE_STREAM_SPLIT encoding

2020-02-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1622. --- Fix Version/s: 1.12.0 Resolution: Fixed > Add BYTE_STREAM_SPLIT encoding >

[jira] [Created] (PARQUET-1794) Random data generation may cause flaky tests

2020-02-12 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1794: - Summary: Random data generation may cause flaky tests Key: PARQUET-1794 URL: https://issues.apache.org/jira/browse/PARQUET-1794 Project: Parquet

[jira] [Commented] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-02-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034294#comment-17034294 ] Gabor Szadovszky commented on PARQUET-1792: --- If you are talking about one file at a time you

[jira] [Commented] (PARQUET-1784) Column-wise configuration

2020-02-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032196#comment-17032196 ] Gabor Szadovszky commented on PARQUET-1784: --- [~garawalid], Thanks for the research and the

[jira] [Commented] (PARQUET-1787) Expected distinct numbers is not parsed correctly

2020-02-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17031399#comment-17031399 ] Gabor Szadovszky commented on PARQUET-1787: --- I'm working on a general concept of allowing

[jira] [Comment Edited] (PARQUET-1787) Expected distinct numbers is not parsed correctly

2020-02-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17031399#comment-17031399 ] Gabor Szadovszky edited comment on PARQUET-1787 at 2/6/20 9:26 AM: ---

[jira] [Commented] (PARQUET-1784) Column-wise configuration

2020-02-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17031377#comment-17031377 ] Gabor Szadovszky commented on PARQUET-1784: --- [~garawalid], The idea is to use a "root" key

[jira] [Updated] (PARQUET-1784) Column-wise configuration

2020-02-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1784: -- Description: After adding some new statistics and encodings into Parquet it is

[jira] [Updated] (PARQUET-1784) Column-wise configuration

2020-02-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1784: -- Description: After adding some new statistics and encodings into Parquet it is

[jira] [Updated] (PARQUET-1784) Column-wise configuration

2020-02-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1784: -- Description: After adding some new statistics and encodings into Parquet it is

[jira] [Created] (PARQUET-1784) Column-wise configuration

2020-02-05 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1784: - Summary: Column-wise configuration Key: PARQUET-1784 URL: https://issues.apache.org/jira/browse/PARQUET-1784 Project: Parquet Issue Type: New

[jira] [Created] (PARQUET-1774) Release parquet 1.11.1

2020-01-22 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1774: - Summary: Release parquet 1.11.1 Key: PARQUET-1774 URL: https://issues.apache.org/jira/browse/PARQUET-1774 Project: Parquet Issue Type: Task

[jira] [Resolved] (PARQUET-1745) No result for partition key included in Parquet file

2020-01-20 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1745. --- Resolution: Not A Bug Closing this issue as "Not a Bug". See my previous comment

[jira] [Resolved] (PARQUET-1746) Changed the data order after DataFrame reuse

2020-01-20 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1746. --- Resolution: Not A Problem The related Spark test generates 22 parquet files. The

[jira] [Resolved] (PARQUET-1765) Invalid filteredRowCount in InternalParquetRecordReader

2020-01-16 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1765. --- Resolution: Fixed > Invalid filteredRowCount in InternalParquetRecordReader >

[jira] [Commented] (PARQUET-1746) Changed the data order after DataFrame reuse

2020-01-15 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015736#comment-17015736 ] Gabor Szadovszky commented on PARQUET-1746: --- For me the issue is reproducible with the

[jira] [Created] (PARQUET-1765) Invalid filteredRowCount in InternalParquetRecordReader

2020-01-13 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1765: - Summary: Invalid filteredRowCount in InternalParquetRecordReader Key: PARQUET-1765 URL: https://issues.apache.org/jira/browse/PARQUET-1765 Project: Parquet

[jira] [Commented] (PARQUET-1745) No result for partition key included in Parquet file

2020-01-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014309#comment-17014309 ] Gabor Szadovszky commented on PARQUET-1745: --- The problem here is Spark sets a projection to

[jira] [Commented] (PARQUET-1746) Changed the data order after DataFrame reuse

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011872#comment-17011872 ] Gabor Szadovszky commented on PARQUET-1746: --- What exactly is reordered here? If it is a list

[jira] [Commented] (PARQUET-1745) No result for partition key included in Parquet file

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011868#comment-17011868 ] Gabor Szadovszky commented on PARQUET-1745: --- Unfortunately, I don't understand what exactly

[jira] [Assigned] (PARQUET-1740) Make ParquetFileReader.getFilteredRecordCount public

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1740: - Assignee: Yuming Wang > Make ParquetFileReader.getFilteredRecordCount public

[jira] [Updated] (PARQUET-1740) Make ParquetFileReader.getFilteredRecordCount public

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1740: -- Fix Version/s: 1.11.1 > Make ParquetFileReader.getFilteredRecordCount public >

[jira] [Resolved] (PARQUET-1740) Make ParquetFileReader.getFilteredRecordCount public

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1740. --- Resolution: Fixed > Make ParquetFileReader.getFilteredRecordCount public >

[jira] [Updated] (PARQUET-1744) Some filters throws ArrayIndexOutOfBoundsException

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1744: -- Fix Version/s: 1.11.1 > Some filters throws ArrayIndexOutOfBoundsException >

[jira] [Commented] (PARQUET-1744) Some filters throws ArrayIndexOutOfBoundsException

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011792#comment-17011792 ] Gabor Szadovszky commented on PARQUET-1744: --- Thanks for creating this issue. The problem is

[jira] [Assigned] (PARQUET-1744) Some filters throws ArrayIndexOutOfBoundsException

2020-01-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1744: - Assignee: Gabor Szadovszky > Some filters throws

[jira] [Updated] (PARQUET-1739) Make Spark SQL support Column indexes

2020-01-08 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1739: -- Fix Version/s: 1.11.1 > Make Spark SQL support Column indexes >

[jira] [Resolved] (PARQUET-1703) Update API compatibility check

2020-01-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1703. --- Resolution: Fixed > Update API compatibility check >

[jira] [Updated] (PARQUET-1622) Add BYTE_STREAM_SPLIT encoding

2019-12-16 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1622: -- Summary: Add BYTE_STREAM_SPLIT encoding (was: Adding an encoding for FP data) >

[jira] [Updated] (PARQUET-1622) Adding an encoding for FP data

2019-12-16 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1622: -- Issue Type: New Feature (was: Wish) > Adding an encoding for FP data >

[jira] [Updated] (PARQUET-1622) Adding an encoding for FP data

2019-12-16 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1622: -- Fix Version/s: format-2.8.0 > Adding an encoding for FP data >

[jira] [Assigned] (PARQUET-1703) Update API compatibility check

2019-12-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1703: - Assignee: Gabor Szadovszky > Update API compatibility check >

[jira] [Updated] (PARQUET-1672) [DOC] Broken link to "How To Contribute" section in Parquet-MR project

2019-12-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1672: -- Fix Version/s: format-2.8.0 > [DOC] Broken link to "How To Contribute" section in

[jira] [Resolved] (PARQUET-1708) Fix Thrift compiler warning

2019-12-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1708. --- Fix Version/s: format-2.8.0 Resolution: Fixed > Fix Thrift compiler warning

[jira] [Assigned] (PARQUET-1708) Fix Thrift compiler warning

2019-12-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1708: - Assignee: Jiajia Li > Fix Thrift compiler warning >

[jira] [Created] (PARQUET-1714) Release parquet format 2.8.0

2019-12-10 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1714: - Summary: Release parquet format 2.8.0 Key: PARQUET-1714 URL: https://issues.apache.org/jira/browse/PARQUET-1714 Project: Parquet Issue Type: Task

[jira] [Resolved] (PARQUET-1694) Restore ColumnChunkPageWriter constructors

2019-12-10 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1694. --- Fix Version/s: (was: 1.11.0) Resolution: Not A Problem Closing this

[jira] [Resolved] (PARQUET-1434) Release parquet-mr 1.11.0

2019-12-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1434. --- Resolution: Fixed > Release parquet-mr 1.11.0 > - > >

[jira] [Commented] (PARQUET-1622) Adding an encoding for FP data

2019-12-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988736#comment-16988736 ] Gabor Szadovszky commented on PARQUET-1622: --- That's fine. I've just asked if you have a

[jira] [Commented] (PARQUET-1622) Adding an encoding for FP data

2019-12-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988610#comment-16988610 ] Gabor Szadovszky commented on PARQUET-1622: --- parquet-format 2.8.0 is not released yet. To

[jira] [Assigned] (PARQUET-1622) Adding an encoding for FP data

2019-12-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1622: - Assignee: Martin Radev > Adding an encoding for FP data >

[jira] [Created] (PARQUET-1703) Update API compatibility check

2019-11-26 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1703: - Summary: Update API compatibility check Key: PARQUET-1703 URL: https://issues.apache.org/jira/browse/PARQUET-1703 Project: Parquet Issue Type:

[jira] [Updated] (PARQUET-1667) Close InputStream after usage

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1667: -- Fix Version/s: (was: 1.11.0) > Close InputStream after usage >

[jira] [Updated] (PARQUET-1351) Travis builds fail for parquet-format

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1351: -- Fix Version/s: (was: 1.11.0) format-2.6.0 > Travis builds

[jira] [Updated] (PARQUET-1533) TestSnappy() throws OOM exception with Parquet-1485 change

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1533: -- Fix Version/s: 1.11.0 > TestSnappy() throws OOM exception with Parquet-1485 change

[jira] [Updated] (PARQUET-1135) upgrade thrift and protobuf dependencies

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1135: -- Fix Version/s: 1.11.0 > upgrade thrift and protobuf dependencies >

[jira] [Updated] (PARQUET-1691) Build fails due to missing hadoop-lzo

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1691: -- Fix Version/s: 1.11.0 > Build fails due to missing hadoop-lzo >

[jira] [Updated] (PARQUET-1687) Update release process

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1687: -- Fix Version/s: 1.11.0 format-2.8.0 > Update release process >

[jira] [Updated] (PARQUET-1551) Support Java 11 - top-level JIRA

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1551: -- Fix Version/s: 1.11.0 > Support Java 11 - top-level JIRA >

[jira] [Resolved] (PARQUET-1551) Support Java 11 - top-level JIRA

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1551. --- Resolution: Fixed > Support Java 11 - top-level JIRA >

[jira] [Updated] (PARQUET-1687) Update release process

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1687: -- Issue Type: Task (was: Improvement) > Update release process >

[jira] [Resolved] (PARQUET-1556) Problem with Maven repo specifications in POMs of dependencies in some development environments

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1556. --- Fix Version/s: (was: 1.12.0) Resolution: Duplicate > Problem with Maven

[jira] [Updated] (PARQUET-1364) Column Indexes: Invalid row indexes for pages starting with nulls

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1364: -- Fix Version/s: (was: 1.11.0) > Column Indexes: Invalid row indexes for pages

[jira] [Updated] (PARQUET-1690) Integer Overflow of BinaryStatistics#isSmallerThan()

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1690: -- Fix Version/s: (was: 1.11.0) Removing 1.11.0 target because this is an

[jira] [Resolved] (PARQUET-1674) The announcement email on the web site does not comply with ASF rules

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1674. --- Resolution: Fixed > The announcement email on the web site does not comply with

[jira] [Resolved] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1685. --- Resolution: Fixed > Truncate the stored min and max for String statistics to

[jira] [Updated] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1685: -- Fix Version/s: (was: 1.12.0) 1.11.0 > Truncate the stored min

[jira] [Resolved] (PARQUET-1687) Update release process

2019-11-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1687. --- Resolution: Fixed > Update release process > -- > >

[jira] [Resolved] (PARQUET-1691) Build fails due to missing hadoop-lzo

2019-11-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1691. --- Resolution: Fixed > Build fails due to missing hadoop-lzo >
