[jira] [Commented] (PARQUET-1895) Update jackson-databind to 2.9.10.5

2020-08-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176905#comment-17176905 ] Gabor Szadovszky commented on PARQUET-1895: --- 2.11.2 is later (in terms of release date) than

[jira] [Updated] (PARQUET-1895) Update jackson-databind

2020-08-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1895: -- Summary: Update jackson-databind (was: Update jackson-databind to 2.9.10.5) >

[jira] [Assigned] (PARQUET-1895) Update jackson-databind to 2.9.10.5

2020-08-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1895: - Assignee: Gabor Szadovszky > Update jackson-databind to 2.9.10.5 >

[jira] [Commented] (PARQUET-1800) Add 'prune' command to parquet-cli

2020-08-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176884#comment-17176884 ] Gabor Szadovszky commented on PARQUET-1800: --- [~sha...@uber.com], do you want to do that for

[jira] [Created] (PARQUET-1898) Release parquet-mr 1.12.0

2020-08-13 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1898: - Summary: Release parquet-mr 1.12.0 Key: PARQUET-1898 URL: https://issues.apache.org/jira/browse/PARQUET-1898 Project: Parquet Issue Type: Task

[jira] [Commented] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-08-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176883#comment-17176883 ] Gabor Szadovszky commented on PARQUET-1792: --- [~sha...@uber.com], is this still targeted for

[jira] [Commented] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-08-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176882#comment-17176882 ] Gabor Szadovszky commented on PARQUET-1801: --- [~sha...@uber.com], do you think you can work on

[jira] [Commented] (PARQUET-1559) Add way to manually commit already written data to disk

2020-08-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171442#comment-17171442 ] Gabor Szadovszky commented on PARQUET-1559: --- [~wxmimperio], as I've said, parquet need to

[jira] [Commented] (PARQUET-1559) Add way to manually commit already written data to disk

2020-08-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172169#comment-17172169 ] Gabor Szadovszky commented on PARQUET-1559: --- The file writing logic is mainly implemented in

[jira] [Resolved] (PARQUET-1793) Support writing INT96 timestamp from avro

2020-08-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1793. --- Resolution: Won't Fix > Support writing INT96 timestamp from avro >

[jira] [Commented] (PARQUET-1894) Please fix the related Shaded Jackson Databind CVEs

2020-08-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169795#comment-17169795 ] Gabor Szadovszky commented on PARQUET-1894: --- To summarize Avro has breaking changes between

[jira] [Commented] (PARQUET-1881) How to enable sorted array flag while writing a column using parquet-mr

2020-07-08 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153344#comment-17153344 ] Gabor Szadovszky commented on PARQUET-1881: --- Thanks for the reference. I did not find

[jira] [Commented] (PARQUET-1739) Make Spark SQL support Column indexes

2020-07-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151859#comment-17151859 ] Gabor Szadovszky commented on PARQUET-1739: --- Removed 1.11.1 as target release because we are

[jira] [Updated] (PARQUET-1739) Make Spark SQL support Column indexes

2020-07-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1739: -- Fix Version/s: (was: 1.11.1) > Make Spark SQL support Column indexes >

[jira] [Assigned] (PARQUET-1879) Apache Arrow can not read a Parquet File written with Parquet-Avro 1.11.0 with a Map field

2020-07-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1879: - Assignee: Matthew McMahon > Apache Arrow can not read a Parquet File written

[jira] [Resolved] (PARQUET-1864) How to generate a file with UUID as a Logical type

2020-07-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1864. --- Fix Version/s: (was: 1.11.1) Assignee: Gabor Szadovszky

[jira] [Resolved] (PARQUET-1853) Minimize the parquet-avro fastutil shaded jar

2020-07-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1853. --- Fix Version/s: 1.11.1 Resolution: Fixed > Minimize the parquet-avro

[jira] [Commented] (PARQUET-1881) How to enable sorted array flag while writing a column using parquet-mr

2020-07-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152745#comment-17152745 ] Gabor Szadovszky commented on PARQUET-1881: --- Could you please be more specific about this? I

[jira] [Commented] (PARQUET-1883) int96 support in parquet-avro

2020-07-10 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155322#comment-17155322 ] Gabor Szadovszky commented on PARQUET-1883: --- [~sha...@uber.com], [~satishkotha], INT96 IS

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-06-17 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138248#comment-17138248 ] Gabor Szadovszky commented on PARQUET-1872: --- [~sha...@uber.com], if I understand correctly

[jira] [Commented] (PARQUET-1774) Release parquet 1.11.1

2020-06-23 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143031#comment-17143031 ] Gabor Szadovszky commented on PARQUET-1774: --- Hi [~maccamlc], Originally 1.11.1 was planned

[jira] [Commented] (PARQUET-1876) Port ZSTD-JNI support to 1.10.x branch

2020-06-15 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135557#comment-17135557 ] Gabor Szadovszky commented on PARQUET-1876: --- I don't think this is a bug. This is a feature

[jira] [Resolved] (PARQUET-1866) Replace Hadoop ZSTD with JNI-ZSTD

2020-06-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1866. --- Resolution: Fixed > Replace Hadoop ZSTD with JNI-ZSTD >

[jira] [Commented] (PARQUET-1870) Handle INT96 more gracefully in parquet-avro

2020-06-08 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127963#comment-17127963 ] Gabor Szadovszky commented on PARQUET-1870: --- We've already had a couple of discussions

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-06-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134030#comment-17134030 ] Gabor Szadovszky commented on PARQUET-1872: --- [~sha...@uber.com], I don't know why the PR was

[jira] [Resolved] (PARQUET-1827) UUID type currently not supported by parquet-mr

2020-06-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1827. --- Resolution: Fixed > UUID type currently not supported by parquet-mr >

[jira] [Commented] (PARQUET-1842) Update Jackson Databind version to address CVE

2020-06-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124675#comment-17124675 ] Gabor Szadovszky commented on PARQUET-1842: --- [~pofriel], we are intensively working on

[jira] [Commented] (PARQUET-1684) [parquet-protobuf] default protobuf field values are stored as nulls

2020-07-29 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166971#comment-17166971 ] Gabor Szadovszky commented on PARQUET-1684: --- [~dossett], thanks a lot for the analysis.

[jira] [Created] (PARQUET-1890) Upgrade to Avro 1.10.0

2020-07-29 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1890: - Summary: Upgrade to Avro 1.10.0 Key: PARQUET-1890 URL: https://issues.apache.org/jira/browse/PARQUET-1890 Project: Parquet Issue Type: Improvement

[jira] [Commented] (PARQUET-1888) Provide guidance on number of file descriptors needed to read Parquet file

2020-07-28 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166259#comment-17166259 ] Gabor Szadovszky commented on PARQUET-1888: --- I guess this issue is about the rust

[jira] [Commented] (PARQUET-1684) [parquet-protobuf] default protobuf field values are stored as nulls

2020-07-28 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166275#comment-17166275 ] Gabor Szadovszky commented on PARQUET-1684: --- Let's move this discussion to this jira.

[jira] [Commented] (PARQUET-1580) Page-level CRC checksum verification for DataPageV1

2020-07-27 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165589#comment-17165589 ] Gabor Szadovszky commented on PARQUET-1580: --- [~natang.salgia], I would rather not backport it

[jira] [Commented] (PARQUET-1852) Array Index OutOf Bounds Exception when fall Back Dictionary Encoded Data

2020-07-27 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165597#comment-17165597 ] Gabor Szadovszky commented on PARQUET-1852: --- I think, we need more info here. First, it is

[jira] [Commented] (PARQUET-1894) Please fix the related Shaded Jackson Databind CVEs

2020-07-31 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168509#comment-17168509 ] Gabor Szadovszky commented on PARQUET-1894: --- We are planning to upgrade our dependencies for

[jira] [Assigned] (PARQUET-1666) Remove Unused Modules

2021-01-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1666: - Assignee: Gabor Szadovszky > Remove Unused Modules > --

[jira] [Resolved] (PARQUET-1666) Remove Unused Modules

2021-01-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1666. --- Resolution: Fixed > Remove Unused Modules > -- > >

[jira] [Commented] (PARQUET-1960) Parquet tools build error in the latest released version `apache-parquet-1.11.1`

2021-01-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262568#comment-17262568 ] Gabor Szadovszky commented on PARQUET-1960: --- [~jecihjoy], yes {{parquet-tools}} require

[jira] [Assigned] (PARQUET-1951) Allow different strategies to combine key values when merging parquet files

2021-01-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1951: - Assignee: satish > Allow different strategies to combine key values when

[jira] [Resolved] (PARQUET-1951) Allow different strategies to combine key values when merging parquet files

2021-01-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1951. --- Resolution: Fixed > Allow different strategies to combine key values when merging

[jira] [Assigned] (PARQUET-1928) Interpret Parquet INT96 type as FIXED[12] AVRO Schema

2020-12-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1928: - Assignee: Anant Damle > Interpret Parquet INT96 type as FIXED[12] AVRO Schema

[jira] [Resolved] (PARQUET-1947) DeprecatedParquetInputFormat in CombineFileInputFormat would produce wrong data

2020-12-07 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1947. --- Resolution: Fixed > DeprecatedParquetInputFormat in CombineFileInputFormat would

[jira] [Commented] (PARQUET-1946) Parquet File not readable by Google big query (works with Spark)

2020-11-23 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237422#comment-17237422 ] Gabor Szadovszky commented on PARQUET-1946: --- There are no statistics/metadata in the parquet

[jira] [Created] (PARQUET-1950) Define core features / compliance level

2020-12-10 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1950: - Summary: Define core features / compliance level Key: PARQUET-1950 URL: https://issues.apache.org/jira/browse/PARQUET-1950 Project: Parquet Issue

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-12-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243881#comment-17243881 ] Gabor Szadovszky commented on PARQUET-1872: --- [~sha...@uber.com], based on the git logs this

[jira] [Commented] (PARQUET-1948) TransCompressionCommand Inoperable

2020-12-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243879#comment-17243879 ] Gabor Szadovszky commented on PARQUET-1948: --- [~vanhooser], this feature is to be released in

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2020-12-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243916#comment-17243916 ] Gabor Szadovszky commented on PARQUET-1666: --- Thanks, [~daijy]. But you also submitted a PR

[jira] [Updated] (PARQUET-1901) Add filter null check for ColumnIndex

2020-12-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1901: -- Fix Version/s: (was: 1.12.0) > Add filter null check for ColumnIndex >

[jira] [Commented] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-12-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243076#comment-17243076 ] Gabor Szadovszky commented on PARQUET-1801: --- [~pavitheran], happy to hear that. Sure, we can

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2020-12-03 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243092#comment-17243092 ] Gabor Szadovszky commented on PARQUET-1666: --- To summarize the actions we would like to take

[jira] [Commented] (PARQUET-1677) Bump Apache Pig from 0.16.0 to 0.17.0

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242542#comment-17242542 ] Gabor Szadovszky commented on PARQUET-1677: --- [~fokko], do you want to pick this up and

[jira] [Commented] (PARQUET-1942) Bump Apache Arrow 2.0.0

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242540#comment-17242540 ] Gabor Szadovszky commented on PARQUET-1942: --- [~fokko], do you want to work on it for 1.12.0?

[jira] [Updated] (PARQUET-1800) Add 'prune' command to parquet-cli

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1800: -- Fix Version/s: (was: 1.12.0) Removed the target 1.12.0 > Add 'prune' command to

[jira] [Resolved] (PARQUET-1941) Bump Commons CLI from 1.3.1 to 1.4

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1941. --- Resolution: Fixed > Bump Commons CLI from 1.3.1 to 1.4 >

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242538#comment-17242538 ] Gabor Szadovszky commented on PARQUET-1927: --- What is the current status of this one? Is it a

[jira] [Commented] (PARQUET-1676) Remove hive modules

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242545#comment-17242545 ] Gabor Szadovszky commented on PARQUET-1676: --- [~fokko], any updates on this? > Remove hive

[jira] [Resolved] (PARQUET-1714) Release parquet format 2.8.0

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1714. --- Resolution: Fixed Parquet format 2.8.0 is already released only that I've missed

[jira] [Commented] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242521#comment-17242521 ] Gabor Szadovszky commented on PARQUET-1792: --- Removed the target 1.12.0. > Add 'mask' command

[jira] [Updated] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1792: -- Fix Version/s: (was: 1.12.0) > Add 'mask' command to parquet-tools/parquet-cli >

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242547#comment-17242547 ] Gabor Szadovszky commented on PARQUET-1666: --- [~julienledem], [~sha...@uber.com], what is the

[jira] [Commented] (PARQUET-1901) Add filter null check for ColumnIndex

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242535#comment-17242535 ] Gabor Szadovszky commented on PARQUET-1901: --- Still want to work on this for 1.12.0? > Add

[jira] [Commented] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-12-02 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242537#comment-17242537 ] Gabor Szadovszky commented on PARQUET-1801: --- As of prune is already implemented in

[jira] [Commented] (PARQUET-1951) Allow different strategies to combine key values when merging parquet files

2020-12-14 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248854#comment-17248854 ] Gabor Szadovszky commented on PARQUET-1951: --- [~satishkotha], I would not suggest using

[jira] [Assigned] (PARQUET-1952) Upgrade Avro to 1.10.1

2020-12-14 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1952: - Assignee: Yuming Wang > Upgrade Avro to 1.10.1 > -- > >

[jira] [Resolved] (PARQUET-1952) Upgrade Avro to 1.10.1

2020-12-14 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1952. --- Resolution: Fixed > Upgrade Avro to 1.10.1 > -- > >

[jira] [Commented] (PARQUET-1954) TCP connection leak in parquet dump

2020-12-15 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249689#comment-17249689 ] Gabor Szadovszky commented on PARQUET-1954: --- Good catch, [~xiepengjie]. Would you like to

[jira] [Commented] (PARQUET-1953) hadoop-common is not an optional dependency

2020-12-15 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249700#comment-17249700 ] Gabor Szadovszky commented on PARQUET-1953: --- There are a couple of jiras about similar

[jira] [Created] (PARQUET-1944) Unable to download transitive dependency hadoop-lzo

2020-11-11 Thread Gabor Szadovszky (Jira)
Gabor Szadovszky created PARQUET-1944: - Summary: Unable to download transitive dependency hadoop-lzo Key: PARQUET-1944 URL: https://issues.apache.org/jira/browse/PARQUET-1944 Project: Parquet

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-11-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17228729#comment-17228729 ] Gabor Szadovszky commented on PARQUET-1927: --- [~sha...@uber.com], yes, it may occur that the

[jira] [Resolved] (PARQUET-1944) Unable to download transitive dependency hadoop-lzo

2020-11-12 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1944. --- Resolution: Fixed > Unable to download transitive dependency hadoop-lzo >

[jira] [Commented] (PARQUET-1944) Unable to download transitive dependency hadoop-lzo

2020-11-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231569#comment-17231569 ] Gabor Szadovszky commented on PARQUET-1944: --- [~jsh], closing/re-opening the PR also helps

[jira] [Assigned] (PARQUET-1917) [parquet-proto] default values are stored in oneOf fields that aren't set

2020-10-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1917: - Assignee: Aaron Blake Niskode-Dossett > [parquet-proto] default values are

[jira] [Resolved] (PARQUET-1917) [parquet-proto] default values are stored in oneOf fields that aren't set

2020-10-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1917. --- Resolution: Fixed > [parquet-proto] default values are stored in oneOf fields that

[jira] [Resolved] (PARQUET-1893) H2SeekableInputStream readFully() doesn't respect start and len

2020-10-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1893. --- Resolution: Fixed > H2SeekableInputStream readFully() doesn't respect start and

[jira] [Resolved] (PARQUET-1914) Allow ProtoParquetReader To Support InputFile

2020-10-22 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1914. --- Resolution: Fixed > Allow ProtoParquetReader To Support InputFile >

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-21 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218176#comment-17218176 ] Gabor Szadovszky commented on PARQUET-1927: --- I think, it is fine extending the current API if

[jira] [Resolved] (PARQUET-1954) TCP connection leak in parquet dump

2021-01-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1954. --- Resolution: Fixed > TCP connection leak in parquet dump >

[jira] [Assigned] (PARQUET-1954) TCP connection leak in parquet dump

2021-01-05 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1954: - Assignee: xiepengjie > TCP connection leak in parquet dump >

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2021-01-06 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17259933#comment-17259933 ] Gabor Szadovszky commented on PARQUET-1666: --- That's a good question. We've had a discussion a

[jira] [Commented] (PARQUET-1898) Release parquet-mr 1.12.0

2021-01-08 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261290#comment-17261290 ] Gabor Szadovszky commented on PARQUET-1898: --- [~pavithraramachandran], we still have a couple

[jira] [Commented] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2021-01-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262519#comment-17262519 ] Gabor Szadovszky commented on PARQUET-1851: --- I would not agree. An empty row group shall

[jira] [Commented] (PARQUET-1960) Parquet tools build error in the latest released version `apache-parquet-1.11.1`

2021-01-11 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262512#comment-17262512 ] Gabor Szadovszky commented on PARQUET-1960: --- I am not sure why it does not work. I guess the

[jira] [Resolved] (PARQUET-1963) DeprecatedParquetInputFormat in CombineFileInputFormat throw NPE when the first sub-split is empty

2021-01-20 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1963. --- Resolution: Fixed > DeprecatedParquetInputFormat in CombineFileInputFormat throw

[jira] [Resolved] (PARQUET-1949) Mark Parquet-1872 with not support bloom filter yet

2021-01-15 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1949. --- Resolution: Fixed > Mark Parquet-1872 with not support bloom filter yet >

[jira] [Resolved] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2021-01-15 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1801. --- Resolution: Fixed > Add column index support for 'prune' command in

[jira] [Resolved] (PARQUET-1964) Properly handle missing/null filter

2021-01-21 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1964. --- Resolution: Fixed > Properly handle missing/null filter >

[jira] [Commented] (PARQUET-1960) Parquet tools build error in the latest released version `apache-parquet-1.11.1`

2021-01-14 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264718#comment-17264718 ] Gabor Szadovszky commented on PARQUET-1960: --- [~jecihjoy], I'm sorry I haven't noticed you've

[jira] [Commented] (PARQUET-1960) Parquet tools build error in the latest released version `apache-parquet-1.11.1`

2021-01-14 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264768#comment-17264768 ] Gabor Szadovszky commented on PARQUET-1960: --- [~jecihjoy], you need the thrift library to

[jira] [Assigned] (PARQUET-1926) Add LogicalType support to ThriftType.I64Type

2021-01-25 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1926: - Assignee: Joshua Martone > Add LogicalType support to ThriftType.I64Type >

[jira] [Updated] (PARQUET-1964) Properly handle missing/null filter

2021-01-19 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1964: -- Description: How to reproduce this issue: {code:scala} val hadoopInputFile =

[jira] [Updated] (PARQUET-1964) Properly handle missing/null filter

2021-01-19 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1964: -- Summary: Properly handle missing/null filter (was: Add null check for

[jira] [Assigned] (PARQUET-1964) Add null check for getFilteredRecordCount

2021-01-19 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1964: - Assignee: Gabor Szadovszky > Add null check for getFilteredRecordCount >

[jira] [Resolved] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2021-01-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky resolved PARQUET-1851. --- Resolution: Fixed > ParquetMetadataConveter throws NPE in an Iceberg unit test >

[jira] [Assigned] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2021-01-13 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1851: - Assignee: Junjie Chen > ParquetMetadataConveter throws NPE in an Iceberg unit

[jira] [Commented] (PARQUET-1827) UUID type currently not supported by parquet-mr

2021-01-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258287#comment-17258287 ] Gabor Szadovszky commented on PARQUET-1827: --- [~vitalii], there are some open points for

[jira] [Updated] (PARQUET-1827) UUID type currently not supported by parquet-mr

2021-01-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1827: -- Fix Version/s: 1.12.0 > UUID type currently not supported by parquet-mr >

[jira] [Updated] (PARQUET-1717) parquet-thrift converts Thrift i16 to parquet INT32 instead of INT_16

2021-01-27 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky updated PARQUET-1717: -- Fix Version/s: (was: 1.11.0) 1.12.0 > parquet-thrift converts

[jira] [Assigned] (PARQUET-1396) Example of using EncryptionPropertiesFactory and DecryptionPropertiesFactory

2021-01-27 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1396: - Assignee: Xinli Shang > Example of using EncryptionPropertiesFactory and

[jira] [Assigned] (PARQUET-1660) [java] Align Bloom filter implementation with format

2021-01-27 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-1660: - Assignee: Junjie Chen > [java] Align Bloom filter implementation with format

[jira] [Commented] (PARQUET-1805) Refactor the configuration for bloom filters

2021-02-01 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276149#comment-17276149 ] Gabor Szadovszky commented on PARQUET-1805: --- [~yumwang], I think this performance issue is

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-02-01 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276421#comment-17276421 ] Gabor Szadovszky commented on PARQUET-1968: --- This one sounds great. Meanwhile, we were
