Sergio Peña created PARQUET-315:
-----------------------------------
Summary: Add PARQUET_1_0 and non-repeated data performance tests
to parquet-benchmarks
Key: PARQUET-315
URL: https://issues.apache.org/jira/browse/PARQUET-315
Project: Parquet
Issue Type: Test
Components: parquet-mr
Reporter: Sergio Peña
Priority: Minor
The current parquet-benchmarks module runs performance tests across different
block and page sizes for the PARQUET_2_0 version only. We should also run these
tests with the PARQUET_1_0 version, both to measure the enhancements in the new
Parquet version and to catch possible overheads early by comparing against the
old file format.
In addition, this module benchmarks the settings with repeated data only. We
should also use random data to see how the current and new encodings perform
on data that is closer to real-world data.
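As a rough illustration of why the shape of the input matters, the sketch
below compares how well repeated and random bytes compress. It is not
parquet-benchmarks code: the class and method names are hypothetical, and
java.util.zip.Deflater is used only as a stand-in for Parquet's own encodings
and codecs.

```java
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical sketch, not part of parquet-benchmarks.
public class DataShapeSketch {

    // Builds a buffer made of a short repeating pattern (0, 1, 2, 3, 0, ...).
    static byte[] repeatedData(int size) {
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = (byte) (i % 4);
        }
        return data;
    }

    // Builds a buffer of pseudo-random bytes; the seed keeps runs reproducible.
    static byte[] randomData(int size, long seed) {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        return data;
    }

    // Returns the number of bytes produced by deflating the input.
    // Deflater is an assumption standing in for a real Parquet codec.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        int size = 1 << 16;
        System.out.println("repeated: " + deflatedSize(repeatedData(size)));
        System.out.println("random:   " + deflatedSize(randomData(size, 42L)));
    }
}
```

Repeated input compresses to a small fraction of its size while random input
barely shrinks, so a benchmark that uses only one of the two shapes shows only
one side of how an encoding behaves.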
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)