[jira] [Created] (PARQUET-1117) ParquetRecordWriter does not provide interface like getRowCount(),getRawDataSize() like org.apache.orc.Writer

2017-09-27 Thread liyunzhang_intel (JIRA)
liyunzhang_intel created PARQUET-1117: Summary: ParquetRecordWriter does not provide interface like getRowCount(),getRawDataSize() like org.apache.orc.Writer Key: PARQUET-1117 URL:

Re: Compression test data

2017-09-27 Thread Ryan Blue
For anyone who would also like to test the compression codecs, I’ve uploaded a copy of parquet-cli that can read and write zstd, lz4, and brotli to my Apache public folder: http://home.apache.org/~blue/ There’s also a copy of hadoop-common that has all the codec bits for testing zstd. LZ4

Compression test data

2017-09-27 Thread Ryan Blue
Hi everyone, I ran some tests using 4 of our large tables to compare compression codecs. I tested gzip, brotli, lz4, and zstd, all with the default configuration. You can find the raw data and summary tables/graphs in this spreadsheet:

[jira] [Created] (PARQUET-1116) Add Yetus InterfaceAudience annotations to Parquet

2017-09-27 Thread Zoltan Ivanfi (JIRA)
Zoltan Ivanfi created PARQUET-1116: Summary: Add Yetus InterfaceAudience annotations to Parquet Key: PARQUET-1116 URL: https://issues.apache.org/jira/browse/PARQUET-1116 Project: Parquet

[jira] [Created] (PARQUET-1115) Prevent users from misusing parquet-tools merge

2017-09-27 Thread Zoltan Ivanfi (JIRA)
Zoltan Ivanfi created PARQUET-1115: Summary: Prevent users from misusing parquet-tools merge Key: PARQUET-1115 URL: https://issues.apache.org/jira/browse/PARQUET-1115 Project: Parquet

parquet sync

2017-09-27 Thread Julien Le Dem
starting now at: https://meet.google.com/wgv-qske-hzs