[jira] [Created] (PARQUET-1325) Parquet column access control support

2018-06-13 Thread Xinli Shang (JIRA)
Xinli Shang created PARQUET-1325: Summary: Parquet column access control support Key: PARQUET-1325 URL: https://issues.apache.org/jira/browse/PARQUET-1325 Project: Parquet Issue Type: New

[jira] [Updated] (PARQUET-1325) Parquet column access control support

2018-06-13 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1325: - Description: PARQUET-1178 proposed a mechanism for modular encryption and decryption of

[jira] [Updated] (PARQUET-1325) Parquet column access control support

2018-06-13 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1325: - Description: PARQUET-1178 proposed a mechanism for modular encryption and decryption of

[jira] [Updated] (PARQUET-1325) Parquet column access control support

2018-06-13 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1325: - Description: PARQUET-1178 proposed a mechanism for modular encryption and decryption of

[jira] [Updated] (PARQUET-1325) Flexible and finer-grained column level access control through encryption

2018-06-14 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1325: - Description: This JIRA is an extension to Modular Encryption Jira(PARQUET-1178) that will

[jira] [Updated] (PARQUET-1325) High-level flexible and fine-grained column level access control through encryption with pluggable key access

2018-06-19 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1325: - Description: This JIRA is an extension to Parquet Modular Encryption Jira(PARQUET-1178) that

[jira] [Resolved] (PARQUET-1325) High-level flexible and fine-grained column level access control through encryption with pluggable key access

2018-08-20 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang resolved PARQUET-1325. -- Resolution: Duplicate This Jira is replaced by Parquet-1396. > High-level flexible and

[jira] [Created] (PARQUET-1396) Cryptodata Interface for no-API Activation of Parquet Encryption

2018-08-20 Thread Xinli Shang (JIRA)
Xinli Shang created PARQUET-1396: Summary: Cryptodata Interface for no-API Activation of Parquet Encryption Key: PARQUET-1396 URL: https://issues.apache.org/jira/browse/PARQUET-1396 Project: Parquet

[jira] [Commented] (PARQUET-1396) Cryptodata Interface for no-API Activation of Parquet Encryption

2018-08-20 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586264#comment-16586264 ] Xinli Shang commented on PARQUET-1396: -- This is the design of Parquet-1396.  > Cryptodata

[jira] [Updated] (PARQUET-1397) Sample of usage Parquet-1396 and Parquet-1178 for column level encryption with pluggable key access

2018-08-20 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1397: - Description: This Jira provides a sample to use Parquet-1396 and Parquet-1178 column level

[jira] [Created] (PARQUET-1397) Sample of usage Parquet-1396 and Parquet-1178 for column level encryption with pluggable key access

2018-08-20 Thread Xinli Shang (JIRA)
Xinli Shang created PARQUET-1397: Summary: Sample of usage Parquet-1396 and Parquet-1178 for column level encryption with pluggable key access Key: PARQUET-1397 URL:

[jira] [Commented] (PARQUET-1432) ACID support

2018-10-01 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634187#comment-16634187 ] Xinli Shang commented on PARQUET-1432: -- Had same thought earlier. Look forward to the design.  >

[jira] [Updated] (PARQUET-1396) Cryptodata Interface for Schema Activation of Parquet Encryption

2019-01-17 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1396: - Description: This JIRA is an extension to Parquet Modular Encryption Jira(PARQUET-1178) that

[jira] [Updated] (PARQUET-1396) Cryptodata Interface for Schema Activation of Parquet Encryption

2019-01-17 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1396: - Summary: Cryptodata Interface for Schema Activation of Parquet Encryption (was: Cryptodata

[jira] [Updated] (PARQUET-1396) Cryptodata Interface for Schema Activation of Parquet Encryption

2019-01-17 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1396: - Description: This JIRA is an extension to Parquet Modular Encryption Jira(PARQUET-1178) that

[jira] [Updated] (PARQUET-1396) Cryptodata Interface for no-API Activation of Parquet Encryption

2018-09-17 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1396: - Description: This JIRA is an extension to Parquet Modular Encryption Jira(PARQUET-1178) that

[jira] [Commented] (PARQUET-1396) Cryptodata Interface for no-API Activation of Parquet Encryption

2018-09-17 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617910#comment-16617910 ] Xinli Shang commented on PARQUET-1396: -- Design of Parquet-1396 > Cryptodata Interface for no-API

[jira] [Created] (PARQUET-1533) TestSnappy() throws OOM exception with Parquet-1485 change

2019-02-18 Thread Xinli Shang (JIRA)
Xinli Shang created PARQUET-1533: Summary: TestSnappy() throws OOM exception with Parquet-1485 change Key: PARQUET-1533 URL: https://issues.apache.org/jira/browse/PARQUET-1533 Project: Parquet

[jira] [Updated] (PARQUET-1533) TestSnappy() throws OOM exception with Parquet-1485 change

2019-02-18 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1533: - Description: Parquet-1485 initialize the buffer size(inputBuffer and outputBuffer) from 0 to

[jira] [Updated] (PARQUET-1533) TestSnappy() throws OOM exception with Parquet-1485 change

2019-02-18 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1533: - Description: Parquet-1485 initialize the buffer size(inputBuffer and outputBuffer) from 0 to

[jira] [Updated] (PARQUET-1533) TestSnappy() throws OOM exception with Parquet-1485 change

2019-02-18 Thread Xinli Shang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1533: - Description: Parquet-1485 initialize the buffer size(inputBuffer and outputBuffer) from 0 to

[jira] [Created] (PARQUET-1659) Add AES-CTR to Parquet Encryption

2019-09-20 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1659: Summary: Add AES-CTR to Parquet Encryption Key: PARQUET-1659 URL: https://issues.apache.org/jira/browse/PARQUET-1659 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1666) Remove Unused Modules

2019-09-24 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1666: Summary: Remove Unused Modules Key: PARQUET-1666 URL: https://issues.apache.org/jira/browse/PARQUET-1666 Project: Parquet Issue Type: Improvement

[jira] [Commented] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-11-07 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969483#comment-16969483 ] Xinli Shang commented on PARQUET-1681: -- Hi [~rdblue], do you still remember or document it

[jira] [Assigned] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-11-07 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang reassigned PARQUET-1681: Assignee: Xinli Shang > Avro's isElementType() change breaks the reading of some

[jira] [Comment Edited] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-11-07 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969483#comment-16969483 ] Xinli Shang edited comment on PARQUET-1681 at 11/7/19 6:28 PM: --- Hi

[jira] [Created] (PARQUET-1690) Integer Overflow of BinaryStatistics#isSmallerThan()

2019-11-10 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1690: Summary: Integer Overflow of BinaryStatistics#isSmallerThan() Key: PARQUET-1690 URL: https://issues.apache.org/jira/browse/PARQUET-1690 Project: Parquet

[jira] [Commented] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-10-28 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961207#comment-16961207 ] Xinli Shang commented on PARQUET-1685: -- Sounds good [~gszadovszky] and [~rdblue]. I will broadcast

[jira] [Comment Edited] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-10-28 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961165#comment-16961165 ] Xinli Shang edited comment on PARQUET-1685 at 10/28/19 3:35 PM: Hi

[jira] [Commented] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-10-28 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961165#comment-16961165 ] Xinli Shang commented on PARQUET-1685: -- Hi [~gszadovszky] Thanks for your reply!   Regarding "an

[jira] [Comment Edited] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-10-28 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961165#comment-16961165 ] Xinli Shang edited comment on PARQUET-1685 at 10/28/19 3:35 PM: Hi

[jira] [Updated] (PARQUET-1683) Remove unnecessary string converting in readFooter method

2019-10-21 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1683: - Description: The method (String filePath = file.toString()) is always called even filePath is

[jira] [Updated] (PARQUET-1683) Remove unnecessary string converting in readFooter method

2019-10-21 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1683: - Description: The method (String filePath = file.toString()) is always called even filePath is

[jira] [Created] (PARQUET-1683) Remove unnecessary string converting in readFooter method

2019-10-21 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1683: Summary: Remove unnecessary string converting in readFooter method Key: PARQUET-1683 URL: https://issues.apache.org/jira/browse/PARQUET-1683 Project: Parquet

[jira] [Created] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-10-18 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1681: Summary: Avro's isElementType() change breaks the reading of some parquet(1.8.1) files Key: PARQUET-1681 URL: https://issues.apache.org/jira/browse/PARQUET-1681

[jira] [Updated] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-10-18 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1681: - Description: When using the Avro schema below to write a parquet(1.8.1) file and then read

[jira] [Created] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

2019-10-25 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1685: Summary: Truncate the stored min and max for String statistics to reduce the footer size Key: PARQUET-1685 URL: https://issues.apache.org/jira/browse/PARQUET-1685

[jira] [Commented] (PARQUET-1656) Schema change results in exception - java.lang.ClassCastException

2019-09-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934516#comment-16934516 ] Xinli Shang commented on PARQUET-1656: -- I am working on the fix of this issue in 1.8.1(upgrading

[jira] [Commented] (PARQUET-1659) Add AES-CTR to Parquet Encryption

2019-09-22 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935553#comment-16935553 ] Xinli Shang commented on PARQUET-1659: -- Yes, I understood the philosophy of designing GCM_CTR. The

[jira] [Commented] (PARQUET-1656) Schema change results in exception - java.lang.ClassCastException

2019-12-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001328#comment-17001328 ] Xinli Shang commented on PARQUET-1656: -- https://issues.apache.org/jira/browse/PARQUET-1681   >

[jira] [Assigned] (PARQUET-1656) Schema change results in exception - java.lang.ClassCastException

2019-12-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang reassigned PARQUET-1656: External issue ID: PARQUET-1681 External issue URL:

[jira] [Comment Edited] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-12-18 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999420#comment-16999420 ] Xinli Shang edited comment on PARQUET-1681 at 12/18/19 6:47 PM:

[jira] [Commented] (PARQUET-1681) Avro's isElementType() change breaks the reading of some parquet(1.8.1) files

2019-12-18 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999420#comment-16999420 ] Xinli Shang commented on PARQUET-1681: -- [~rdblue], I verified that this issue is not related with

[jira] [Created] (PARQUET-1812) Use airlift codecs for zstd

2020-03-05 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1812: Summary: Use airlift codecs for zstd Key: PARQUET-1812 URL: https://issues.apache.org/jira/browse/PARQUET-1812 Project: Parquet Issue Type: Improvement

[jira] [Commented] (PARQUET-1764) The ParquetProperties constructor parameter list is so long

2020-02-07 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032832#comment-17032832 ] Xinli Shang commented on PARQUET-1764: -- This PR has the refactor implemented 

[jira] [Resolved] (PARQUET-1764) The ParquetProperties constructor parameter list is so long

2020-02-07 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang resolved PARQUET-1764. -- Resolution: Duplicate > The ParquetProperties constructor parameter list is so long >

[jira] [Created] (PARQUET-1791) Add 'prune' command to parquet-tools

2020-02-09 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1791: Summary: Add 'prune' command to parquet-tools Key: PARQUET-1791 URL: https://issues.apache.org/jira/browse/PARQUET-1791 Project: Parquet Issue Type:

[jira] [Updated] (PARQUET-1791) Add 'prune' command to parquet-tools

2020-02-09 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1791: - Description: During data retention, there is a need to remove unused or personal columns.

[jira] [Updated] (PARQUET-1791) Add 'prune' command to parquet-tools

2020-02-09 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1791: - Description: During data retention, there is a need to remove unused or personal columns.

[jira] [Commented] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-02-11 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034588#comment-17034588 ] Xinli Shang commented on PARQUET-1792: -- [~gershinsky], this is just a simple offline tool to

[jira] [Comment Edited] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-02-11 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034553#comment-17034553 ] Xinli Shang edited comment on PARQUET-1792 at 2/11/20 3:47 PM: ---

[jira] [Commented] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-02-11 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034553#comment-17034553 ] Xinli Shang commented on PARQUET-1792: -- [~gszadovszky] the tool can be run in parallel in a

[jira] [Comment Edited] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-02-11 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034588#comment-17034588 ] Xinli Shang edited comment on PARQUET-1792 at 2/11/20 4:36 PM: ---

[jira] [Created] (PARQUET-1800) Add 'prune' command to parquet-cli

2020-02-16 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1800: Summary: Add 'prune' command to parquet-cli Key: PARQUET-1800 URL: https://issues.apache.org/jira/browse/PARQUET-1800 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-02-16 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1801: Summary: Add column index support for 'prune' command in Parquet-tools/cli Key: PARQUET-1801 URL: https://issues.apache.org/jira/browse/PARQUET-1801 Project: Parquet

[jira] [Commented] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-02-18 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039187#comment-17039187 ] Xinli Shang commented on PARQUET-1801: -- I think that is a great idea! I will share with you once I

[jira] [Created] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-02-10 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1792: Summary: Add 'mask' command to parquet-tools/parquet-cli Key: PARQUET-1792 URL: https://issues.apache.org/jira/browse/PARQUET-1792 Project: Parquet Issue

[jira] [Created] (PARQUET-1764) The ParquetProperties constructor parameter list is so long

2020-01-12 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1764: Summary: The ParquetProperties constructor parameter list is so long Key: PARQUET-1764 URL: https://issues.apache.org/jira/browse/PARQUET-1764 Project: Parquet

[jira] [Commented] (PARQUET-1836) why the last chunk might be larger than descriptor.size?

2020-04-09 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079512#comment-17079512 ] Xinli Shang commented on PARQUET-1836: -- I just searched the change history and see the fix comes

[jira] [Created] (PARQUET-1821) Add 'column-size' command to parquet-cli and parquet-tools

2020-03-19 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1821: Summary: Add 'column-size' command to parquet-cli and parquet-tools Key: PARQUET-1821 URL: https://issues.apache.org/jira/browse/PARQUET-1821 Project: Parquet

[jira] [Updated] (PARQUET-1866) Replace Hadoop ZSTD with JNI-ZSTD

2020-05-21 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1866: - Description: The parquet-mr repo has been using

[jira] [Created] (PARQUET-1866) Replace Hadoop ZSTD with JNI-ZSTD

2020-05-21 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1866: Summary: Replace Hadoop ZSTD with JNI-ZSTD Key: PARQUET-1866 URL: https://issues.apache.org/jira/browse/PARQUET-1866 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-17 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1927: Summary: ColumnIndex should provide number of records skipped Key: PARQUET-1927 URL: https://issues.apache.org/jira/browse/PARQUET-1927 Project: Parquet

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-19 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216849#comment-17216849 ] Xinli Shang commented on PARQUET-1927: -- [~gszadovszky], the way that Iceberg Parquet reader

[jira] [Created] (PARQUET-1901) Add filter null check for ColumnIndex

2020-08-22 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1901: Summary: Add filter null check for ColumnIndex Key: PARQUET-1901 URL: https://issues.apache.org/jira/browse/PARQUET-1901 Project: Parquet Issue Type: Bug

[jira] [Commented] (PARQUET-1901) Add filter null check for ColumnIndex

2020-08-27 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186211#comment-17186211 ] Xinli Shang commented on PARQUET-1901: -- I have the initial version of Iceberg integration working

[jira] [Comment Edited] (PARQUET-1901) Add filter null check for ColumnIndex

2020-08-27 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186211#comment-17186211 ] Xinli Shang edited comment on PARQUET-1901 at 8/28/20, 2:23 AM: I have

[jira] [Commented] (PARQUET-1901) Add filter null check for ColumnIndex

2020-08-24 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183352#comment-17183352 ] Xinli Shang commented on PARQUET-1901: -- Hi [~rdblue], please comment on this if you have different

[jira] [Created] (PARQUET-1916) Add hash functionality

2020-09-23 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1916: Summary: Add hash functionality Key: PARQUET-1916 URL: https://issues.apache.org/jira/browse/PARQUET-1916 Project: Parquet Issue Type: Sub-task

[jira] [Assigned] (PARQUET-1915) Add null command

2020-09-23 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang reassigned PARQUET-1915: Assignee: Xinli Shang > Add null command > - > > Key:

[jira] [Created] (PARQUET-1915) Add null command

2020-09-23 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1915: Summary: Add null command Key: PARQUET-1915 URL: https://issues.apache.org/jira/browse/PARQUET-1915 Project: Parquet Issue Type: Sub-task

[jira] [Commented] (PARQUET-1396) Example of using EncryptionPropertiesFactory and DecryptionPropertiesFactory

2020-10-21 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218717#comment-17218717 ] Xinli Shang commented on PARQUET-1396: -- Most of the functionality of this Jira has been addressed

[jira] [Updated] (PARQUET-1396) Example of using EncryptionPropertiesFactory and DecryptionPropertiesFactory

2020-10-21 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1396: - Summary: Example of using EncryptionPropertiesFactory and DecryptionPropertiesFactory (was:

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217827#comment-17217827 ] Xinli Shang commented on PARQUET-1927: -- Add [~rdblue],[~shardulm] as FYI** > ColumnIndex should

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217774#comment-17217774 ] Xinli Shang commented on PARQUET-1927: -- That is correct [~gszadovszky]! We need a finer-grained

[jira] [Assigned] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-23 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang reassigned PARQUET-1927: Assignee: Xinli Shang > ColumnIndex should provide number of records skipped >

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-26 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17221036#comment-17221036 ] Xinli Shang commented on PARQUET-1927: -- ParquetFileReader.getFilteredRecordCount() cannot be used

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-10-27 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17221481#comment-17221481 ] Xinli Shang commented on PARQUET-1927: -- Thanks [~gszadovszky] for the explanation. I see it now.

[jira] [Commented] (PARQUET-1739) Make Spark SQL support Column indexes

2020-07-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161566#comment-17161566 ] Xinli Shang commented on PARQUET-1739: -- [~yumwang], Can you share is the implementation is done in

[jira] [Commented] (PARQUET-1830) Vectorized API to support Column Index in Apache Spark

2020-07-20 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161573#comment-17161573 ] Xinli Shang commented on PARQUET-1830: -- [~FelixKJose]Do we have Spark task created for

[jira] [Created] (PARQUET-1893) H2SeekableInputStream readFully() doesn't respect start and len

2020-07-29 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1893: Summary: H2SeekableInputStream readFully() doesn't respect start and len Key: PARQUET-1893 URL: https://issues.apache.org/jira/browse/PARQUET-1893 Project: Parquet

[jira] [Commented] (PARQUET-1801) Add column index support for 'prune' command in Parquet-tools/cli

2020-08-13 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176961#comment-17176961 ] Xinli Shang commented on PARQUET-1801: -- I will try to do it in 1.12.0. The feature works great!

[jira] [Commented] (PARQUET-1792) Add 'mask' command to parquet-tools/parquet-cli

2020-08-13 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176959#comment-17176959 ] Xinli Shang commented on PARQUET-1792: -- We might want to push it for next release. > Add 'mask'

[jira] [Commented] (PARQUET-1883) int96 support in parquet-avro

2020-07-09 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154888#comment-17154888 ] Xinli Shang commented on PARQUET-1883: -- [~gszadovszky], Do you still have links for INT96 will be

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-06-17 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138607#comment-17138607 ] Xinli Shang commented on PARQUET-1872: -- That is correct understanding [~gszadovszky]. > Add

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-06-16 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17137967#comment-17137967 ] Xinli Shang commented on PARQUET-1872: -- [~gszadovszky]Thanks for the reply! I just manually linked

[jira] [Assigned] (PARQUET-1874) Add to parquet-cli

2020-06-16 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang reassigned PARQUET-1874: Assignee: Xinli Shang > Add to parquet-cli > -- > >

[jira] [Updated] (PARQUET-1872) Add TransCompression command

2020-06-12 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang updated PARQUET-1872: - Description: When ZSTD becomes more popular, there is a need to translate existing data to

[jira] [Resolved] (PARQUET-1821) Add 'column-size' command to parquet-cli and parquet-tools

2020-06-11 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinli Shang resolved PARQUET-1821. -- Resolution: Fixed > Add 'column-size' command to parquet-cli and parquet-tools >

[jira] [Created] (PARQUET-1872) Add TransCompression command

2020-06-11 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1872: Summary: Add TransCompression command Key: PARQUET-1872 URL: https://issues.apache.org/jira/browse/PARQUET-1872 Project: Parquet Issue Type: Improvement

[jira] [Created] (PARQUET-1875) Add bloom filter support

2020-06-11 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1875: Summary: Add bloom filter support Key: PARQUET-1875 URL: https://issues.apache.org/jira/browse/PARQUET-1875 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1873) Add to Parquet-tools

2020-06-11 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1873: Summary: Add to Parquet-tools Key: PARQUET-1873 URL: https://issues.apache.org/jira/browse/PARQUET-1873 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1874) Add to parquet-cli

2020-06-11 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1874: Summary: Add to parquet-cli Key: PARQUET-1874 URL: https://issues.apache.org/jira/browse/PARQUET-1874 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1876) Port ZSTD-JNI support to 1.10.x branch

2020-06-14 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1876: Summary: Port ZSTD-JNI support to 1.10.x branch Key: PARQUET-1876 URL: https://issues.apache.org/jira/browse/PARQUET-1876 Project: Parquet Issue Type: Bug

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-12-04 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244334#comment-17244334 ] Xinli Shang commented on PARQUET-1872: -- Thanks [~gszadovszky] for working on this! I just created

[jira] [Created] (PARQUET-1949) Mark Parquet-1872 with note support bloom filter yet

2020-12-04 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1949: Summary: Mark Parquet-1872 with note support bloom filter yet Key: PARQUET-1949 URL: https://issues.apache.org/jira/browse/PARQUET-1949 Project: Parquet

[jira] [Comment Edited] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-12-02 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242631#comment-17242631 ] Xinli Shang edited comment on PARQUET-1927 at 12/2/20, 7:05 PM: It is

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2020-12-02 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242625#comment-17242625 ] Xinli Shang commented on PARQUET-1666: -- I think adding "-deprecated" is a good idea.

[jira] [Commented] (PARQUET-1901) Add filter null check for ColumnIndex

2020-12-02 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242634#comment-17242634 ] Xinli Shang commented on PARQUET-1901: -- For now, I think we can move it to the next release. >

[jira] [Commented] (PARQUET-1927) ColumnIndex should provide number of records skipped

2020-12-02 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242631#comment-17242631 ] Xinli Shang commented on PARQUET-1927: -- It is still not decided yet in the last Iceberg meeting.
