[
https://issues.apache.org/jira/browse/KYLIN-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liyang resolved KYLIN-2442.
---------------------------
Resolution: Fixed
Assignee: liyang
Fix Version/s: v2.0.0
> Re-calculate expansion rate, count raw data size regardless of flat table
> compression
> -------------------------------------------------------------------------------------
>
> Key: KYLIN-2442
> URL: https://issues.apache.org/jira/browse/KYLIN-2442
> Project: Kylin
> Issue Type: Improvement
> Reporter: liyang
> Assignee: liyang
> Fix For: v2.0.0
>
>
> Right now the expansion rate is calculated as "Cube Size / Raw Data Size",
> where the raw data size is the size of the intermediate hive table. This
> means the Raw Data Size depends on the compression format of the
> intermediate table, which affects the correctness of the expansion rate and
> of other estimates based on the raw data size.
> The change intends to calculate the Raw Data Size based on the uncompressed
> cell values of the intermediate hive table. Each cell is taken in its string
> form, and the byte sizes of those strings in UTF-8 encoding are summed up.
> The result serves as the Raw Data Size and is stable regardless of
> compression and other environment parameters.
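
The calculation described above can be sketched roughly as follows. This is only an illustration and not the code committed for KYLIN-2442: the class name RawDataSizeSketch and the methods rawDataSize and expansionRate are hypothetical, rows are assumed to arrive as plain Object arrays, and null cells are assumed to contribute zero bytes (the issue text does not say how nulls are counted).

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of Kylin's API.
class RawDataSizeSketch {

    // Sums the UTF-8 byte length of the string form of every cell.
    // The result does not depend on how the table is stored or compressed.
    static long rawDataSize(List<Object[]> rows) {
        long total = 0L;
        for (Object[] row : rows) {
            for (Object cell : row) {
                if (cell == null) {
                    continue; // assumption: a null cell counts as 0 bytes
                }
                total += String.valueOf(cell).getBytes(StandardCharsets.UTF_8).length;
            }
        }
        return total;
    }

    // Expansion rate as stated in the issue: Cube Size / Raw Data Size.
    static double expansionRate(long cubeSizeBytes, long rawDataSizeBytes) {
        if (rawDataSizeBytes <= 0) {
            throw new IllegalArgumentException("raw data size must be positive");
        }
        return (double) cubeSizeBytes / rawDataSizeBytes;
    }
}
```

Under these assumptions, a row holding the integer 2017, a two-character CJK string and the double 3.5 counts 4 + 6 + 3 = 13 bytes, whatever file format or codec the intermediate table uses.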