Sergio Peña created PARQUET-315:
-----------------------------------

             Summary: Add PARQUET_1_0 and non-repeated data performance tests 
to parquet-benchmarks
                 Key: PARQUET-315
                 URL: https://issues.apache.org/jira/browse/PARQUET-315
             Project: Parquet
          Issue Type: Test
          Components: parquet-mr
            Reporter: Sergio Peña
            Priority: Minor


The current parquet-benchmarks module runs performance tests across different 
block and page sizes for the PARQUET_2_0 version only. We should also run these 
tests with the PARQUET_1_0 version, so that we can measure the enhancements in 
the new Parquet version and catch possible overheads early by comparing against 
the old file format.

Also, this module uses repeated data to benchmark the settings. We should also 
use random (non-repeated) data, to see how the current and new encodings behave 
on data that is closer to real-world data.
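
To illustrate why repeated and random inputs can give very different results, 
here is a minimal sketch. It does not use parquet-mr or parquet-benchmarks at 
all: the class name DataShapeSketch and its helper methods are hypothetical, 
and java.util.zip.Deflater is only a stand-in for whatever encoding or 
compression the real benchmarks exercise. It simply compares how well the two 
kinds of data compress.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical illustration class; not part of parquet-benchmarks.
final class DataShapeSketch {

    // Builds a buffer made of `count` copies of the same short value,
    // mimicking a benchmark that writes the same record over and over.
    static byte[] repeatedData(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append("parquet-value");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Builds a buffer of pseudo-random bytes; a fixed seed keeps runs repeatable.
    static byte[] randomData(int size, long seed) {
        byte[] out = new byte[size];
        new Random(seed).nextBytes(out);
        return out;
    }

    // Returns the number of bytes produced by DEFLATE for the given input.
    // DEFLATE is an assumption used for illustration, not a Parquet encoding.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] repeated = repeatedData(10_000);
        byte[] random = randomData(repeated.length, 42L);
        System.out.println("input size:         " + repeated.length);
        System.out.println("repeated, deflated: " + deflatedSize(repeated));
        System.out.println("random, deflated:   " + deflatedSize(random));
    }
}
```

The repeated buffer should shrink to a small fraction of its size while the 
random one barely shrinks at all, which is why a benchmark fed only repeated 
data may say little about behavior on less regular data.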



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
