[ 
https://issues.apache.org/jira/browse/IMPALA-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v resolved IMPALA-7708.
-------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.1.0

> Switch to faster compression strategy for incremental stats
> -----------------------------------------------------------
>
>                 Key: IMPALA-7708
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7708
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 3.1.0
>            Reporter: bharath v
>            Assignee: bharath v
>            Priority: Major
>             Fix For: Impala 3.1.0
>
>
> Currently we set the Deflater mode to BEST_COMPRESSION by default.
> {noformat}
> public static byte[] deflateCompress(byte[] input) {
>     if (input == null) return null;
>     ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
>     // TODO: Benchmark other compression levels.
>     DeflaterOutputStream stream =
>         new DeflaterOutputStream(bos, new Deflater(Deflater.BEST_COMPRESSION));
> {noformat}
> In some experiments, we noticed that the fastest compression mode 
> (BEST_SPEED) is ~8x faster, with only a ~4 percentage point compression 
> ratio penalty (output is ~47% instead of ~43% of the uncompressed size). 
> Here are some results on a real-world table with 3000 partitions that 
> have incremental stats.
>  
> |Compression level|Time taken for serialization (seconds)|Output size (MB)|
> |Gzip best compression|92|194|
> |Gzip fastest compression|11|212|
> |Gzip default compression|57|195|
> |No compression|5|452|
>  
>  
>  
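The trade-off described above can be explored with a small standalone sketch. This is not Impala's actual code: the class name `DeflateLevels`, the extra `level` parameter, the `inflate` helper, and the synthetic input are all assumptions made for illustration. Timings and sizes depend heavily on the data and the JVM, so no expected output is given.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class; names are illustrative, not taken from Impala.
public class DeflateLevels {

  // Compresses input at the given level, e.g. Deflater.BEST_SPEED.
  public static byte[] deflateCompress(byte[] input, int level) {
    if (input == null) return null;
    Deflater deflater = new Deflater(level);
    ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
    try (DeflaterOutputStream stream = new DeflaterOutputStream(bos, deflater)) {
      stream.write(input);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      // A caller-supplied Deflater is not released by the stream's close().
      deflater.end();
    }
    return bos.toByteArray();
  }

  // Decompresses data produced by deflateCompress (round-trip check).
  public static byte[] inflate(byte[] compressed) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (InflaterInputStream in =
        new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) bos.write(buf, 0, n);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) {
    // Synthetic, repetitive input standing in for serialized partition stats.
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 200_000; i++) {
      sb.append("partition=").append(i % 3000).append(";ndv=").append(i * 31 % 977).append('\n');
    }
    byte[] sampleStats = sb.toString().getBytes(StandardCharsets.UTF_8);

    int[] levels = {
      Deflater.BEST_SPEED, Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION
    };
    String[] names = {"BEST_SPEED", "DEFAULT_COMPRESSION", "BEST_COMPRESSION"};
    for (int i = 0; i < levels.length; i++) {
      long start = System.nanoTime();
      byte[] out = deflateCompress(sampleStats, levels[i]);
      long micros = (System.nanoTime() - start) / 1_000;
      System.out.printf("%-20s %8d us %10d bytes (input %d bytes)%n",
          names[i], micros, out.length, sampleStats.length);
    }
  }
}
```

A single timed pass like this is only a rough indication; a proper comparison would warm up the JVM and run against real incremental stats, as in the table above.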



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
