Parquet sync meeting notes - 9/22/2020

2020-09-22 Thread Xinli Shang
9/22/2020 Hi all,
Attendees: Ashish Singh, Julien, Gidon, Gabor, Xinli
1. Column Encryption
   1. PRs are all merged.
2. Data masking
   1. This feature should take a top-down approach, starting from the service. Maybe we can start with parquet-tool.

[jira] [Resolved] (PARQUET-1878) [C++] lz4 codec is not compatible with Hadoop Lz4Codec

2020-09-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved PARQUET-1878. Fix Version/s: cpp-1.6.0. Resolution: Fixed. Issue resolved by pull request 7789

[jira] [Assigned] (PARQUET-1878) [C++] lz4 codec is not compatible with Hadoop Lz4Codec

2020-09-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned PARQUET-1878. Assignee: Patrick Pai

[jira] [Comment Edited] (PARQUET-1878) [C++] lz4 codec is not compatible with Hadoop Lz4Codec

2020-09-22 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199917#comment-17199917 ] Antoine Pitrou edited comment on PARQUET-1878 at 9/22/20, 8:22 AM:

[GitHub] [parquet-mr] winningsix commented on pull request #803: PARQUET-1886 CompressionCodec Provider-aware Compression Codec Lookup…

2020-09-22 Thread GitBox
winningsix commented on pull request #803: URL: https://github.com/apache/parquet-mr/pull/803#issuecomment-696610461 @xhochy It's overloading the built-in compression implementation, and it's retrieving data from the footer without introducing new spec info. Do you think we need to add it as part

[jira] [Commented] (PARQUET-1878) [C++] lz4 codec is not compatible with Hadoop Lz4Codec

2020-09-22 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199917#comment-17199917 ] Antoine Pitrou commented on PARQUET-1878: Responding to myself: parquet-mr doesn't use the

[GitHub] [parquet-mr] winningsix commented on pull request #803: PARQUET-1886 CompressionCodec Provider-aware Compression Codec Lookup…

2020-09-22 Thread GitBox
winningsix commented on pull request #803: URL: https://github.com/apache/parquet-mr/pull/803#issuecomment-696611079 Connect to https://github.com/apache/arrow/pull/8229 This is an automated message from the Apache Git

[jira] [Commented] (PARQUET-1886) CompressionCodec Provider-aware Compression Codec Lookup for parquet-mr

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199947#comment-17199947 ] ASF GitHub Bot commented on PARQUET-1886: winningsix commented on pull request #803: URL:

[jira] [Commented] (PARQUET-1886) CompressionCodec Provider-aware Compression Codec Lookup for parquet-mr

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199948#comment-17199948 ] ASF GitHub Bot commented on PARQUET-1886: winningsix commented on pull request #803: URL:

[GitHub] [parquet-mr] belugabehr commented on pull request #815: PARQUET-1776: NIO wrapper for Output/Input File

2020-09-22 Thread GitBox
belugabehr commented on pull request #815: URL: https://github.com/apache/parquet-mr/pull/815#issuecomment-696754782 @HunterL OK, so I've been recently diving into Java NIO a bit and let me add some thoughts that relate to this task: Please change the naming convention and

[jira] [Commented] (PARQUET-1776) Add Java NIO Avro OutputFile InputFile

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200110#comment-17200110 ] ASF GitHub Bot commented on PARQUET-1776: belugabehr edited a comment on pull request #815:

[jira] [Commented] (PARQUET-1776) Add Java NIO Avro OutputFile InputFile

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200109#comment-17200109 ] ASF GitHub Bot commented on PARQUET-1776: belugabehr commented on pull request #815: URL:

[GitHub] [parquet-mr] belugabehr edited a comment on pull request #815: PARQUET-1776: NIO wrapper for Output/Input File

2020-09-22 Thread GitBox
belugabehr edited a comment on pull request #815: URL: https://github.com/apache/parquet-mr/pull/815#issuecomment-696754782 @HunterL OK, so I've been recently diving into Java NIO a bit and let me add some thoughts that relate to this task: Please change the naming

[jira] [Commented] (PARQUET-1776) Add Java NIO Avro OutputFile InputFile

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200111#comment-17200111 ] ASF GitHub Bot commented on PARQUET-1776: belugabehr commented on pull request #815: URL:

[GitHub] [parquet-mr] belugabehr commented on pull request #815: PARQUET-1776: NIO wrapper for Output/Input File

2020-09-22 Thread GitBox
belugabehr commented on pull request #815: URL: https://github.com/apache/parquet-mr/pull/815#issuecomment-69670 Also requires unit tests.

[jira] [Commented] (PARQUET-112) RunLengthBitPackingHybridDecoder: Reading past RLE/BitPacking stream.

2020-09-22 Thread Tristan Davolt (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200419#comment-17200419 ] Tristan Davolt commented on PARQUET-112: I am facing the same issue with Parquet 1.10.0. Data is

[jira] [Resolved] (PARQUET-1773) Parquet file in invalid state while writing to S3 when calling ParquetWriter.write

2020-09-22 Thread Tristan Davolt (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Davolt resolved PARQUET-1773. Resolution: Not A Problem. This was caused by an implementation issue. _ParquetWriter_

[jira] [Commented] (PARQUET-1773) Parquet file in invalid state while writing to S3 when calling ParquetWriter.write

2020-09-22 Thread Tristan Davolt (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200413#comment-17200413 ] Tristan Davolt commented on PARQUET-1773: It appears the issue was indeed due to the race

[jira] [Comment Edited] (PARQUET-1773) Parquet file in invalid state while writing to S3 when calling ParquetWriter.write

2020-09-22 Thread Tristan Davolt (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200413#comment-17200413 ] Tristan Davolt edited comment on PARQUET-1773 at 9/22/20, 11:05 PM:

[jira] [Commented] (PARQUET-1776) Add Java NIO Avro OutputFile InputFile

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200329#comment-17200329 ] ASF GitHub Bot commented on PARQUET-1776: belugabehr commented on pull request #815: URL:

[GitHub] [parquet-mr] belugabehr commented on pull request #815: PARQUET-1776: NIO wrapper for Output/Input File

2020-09-22 Thread GitBox
belugabehr commented on pull request #815: URL: https://github.com/apache/parquet-mr/pull/815#issuecomment-696929711 Here are a couple of starters: ``` private final ByteBuffer oneByteBuffer = ByteBuffer.allocate(1); @Override public int read() throws

[GitHub] [parquet-mr] julienledem commented on pull request #222: Parquet-313: Implement 3 level list writing rule for Parquet-Thrift

2020-09-22 Thread GitBox
julienledem commented on pull request #222: URL: https://github.com/apache/parquet-mr/pull/222#issuecomment-697053961 @ttim and @tlazaro, do you mind taking a look and letting us know if this looks ok to you? We'd want to get this merged.

[jira] [Commented] (PARQUET-1776) Add Java NIO Avro OutputFile InputFile

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200500#comment-17200500 ] ASF GitHub Bot commented on PARQUET-1776: belugabehr commented on pull request #815: URL:

[GitHub] [parquet-mr] belugabehr commented on pull request #815: PARQUET-1776: NIO wrapper for Output/Input File

2020-09-22 Thread GitBox
belugabehr commented on pull request #815: URL: https://github.com/apache/parquet-mr/pull/815#issuecomment-696754782

[GitHub] [parquet-mr] belugabehr edited a comment on pull request #815: PARQUET-1776: NIO wrapper for Output/Input File

2020-09-22 Thread GitBox
belugabehr edited a comment on pull request #815: URL: https://github.com/apache/parquet-mr/pull/815#issuecomment-696754782 @HunterL OK, so I've been recently diving into Java NIO a bit and let me add some thoughts that relate to this task: Please change the naming

[jira] [Commented] (PARQUET-1776) Add Java NIO Avro OutputFile InputFile

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200505#comment-17200505 ] ASF GitHub Bot commented on PARQUET-1776: belugabehr edited a comment on pull request #815:

[jira] [Commented] (PARQUET-1886) CompressionCodec Provider-aware Compression Codec Lookup for parquet-mr

2020-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200546#comment-17200546 ] ASF GitHub Bot commented on PARQUET-1886: winningsix commented on pull request #803: URL:

[GitHub] [parquet-mr] winningsix commented on pull request #803: PARQUET-1886 CompressionCodec Provider-aware Compression Codec Lookup…

2020-09-22 Thread GitBox
winningsix commented on pull request #803: URL: https://github.com/apache/parquet-mr/pull/803#issuecomment-696610461