[ 
https://issues.apache.org/jira/browse/KYLIN-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang resolved KYLIN-2442.
---------------------------
       Resolution: Fixed
         Assignee: liyang
    Fix Version/s: v2.0.0

> Re-calculate expansion rate, count raw data size regardless of flat table 
> compression
> -------------------------------------------------------------------------------------
>
>                 Key: KYLIN-2442
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2442
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: liyang
>            Assignee: liyang
>             Fix For: v2.0.0
>
>
> Right now the expansion rate is calculated as "Cube Size / Raw Data Size",
> where the Raw Data Size is the size of the intermediate hive table. This
> means the Raw Data Size depends on the compression format of the
> intermediate table, which affects the correctness of the expansion rate and
> of other estimates based on the raw data size.
> The change intends to calculate the Raw Data Size based on the uncompressed
> cell values of the intermediate hive table. Each cell is taken in its string
> form, and the byte sizes of those strings in UTF-8 encoding are summed up.
> The result serves as the Raw Data Size and is stable regardless of
> compression and other environment parameters.
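
For illustration only, the calculation described above might be sketched as
follows. This is not the code committed for KYLIN-2442: the class and method
names are hypothetical, rows are modelled as plain Java lists rather than
hive records, and counting a null cell as zero bytes is an assumption the
issue does not address.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names do not come from the Kylin code base.
final class RawDataSizeSketch {

    // Sum the UTF-8 byte length of the string form of every cell in one row.
    static long rowBytes(List<Object> row) {
        long bytes = 0;
        for (Object cell : row) {
            // Assumption: a null cell contributes zero bytes.
            if (cell != null) {
                bytes += String.valueOf(cell).getBytes(StandardCharsets.UTF_8).length;
            }
        }
        return bytes;
    }

    // Raw Data Size: total over all rows, independent of any storage compression.
    static long rawDataSize(List<List<Object>> rows) {
        long total = 0;
        for (List<Object> row : rows) {
            total += rowBytes(row);
        }
        return total;
    }

    // Expansion rate as defined in the issue: Cube Size / Raw Data Size.
    static double expansionRate(long cubeSizeBytes, long rawDataSizeBytes) {
        return (double) cubeSizeBytes / rawDataSizeBytes;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
                // "\u5317\u4eac" is two CJK characters, 3 bytes each in UTF-8.
                Arrays.<Object>asList("2017-02-10", 42, "\u5317\u4eac"),
                Arrays.<Object>asList("2017-02-11", 7, null));
        long raw = rawDataSize(rows);
        System.out.println("raw data size = " + raw);                      // 29
        System.out.println("expansion rate = " + expansionRate(58, raw)); // 2.0
    }
}
```

Because the sum is taken over cell values rather than over files, the same
rows give the same Raw Data Size whether the intermediate table is stored
compressed or not.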



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
