kangkaisen created KYLIN-1694:
---------------------------------

             Summary: make multiply coefficient configurable when estimating 
cuboid size
                 Key: KYLIN-1694
                 URL: https://issues.apache.org/jira/browse/KYLIN-1694
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v1.5.1, v1.5.0
            Reporter: kangkaisen
            Assignee: Dong Li


In the current version of the MRv2 build engine, CubeStatsReader estimates 
cuboid size with a hard-coded rule: "cube is memory hungry, storage size 
estimation multiply 0.05" and "cube is not memory hungry, storage size 
estimation multiply 0.25".

This has one major problem: the default multiply coefficient is too small, so 
the estimated cuboid size is much smaller than the actual cuboid size. As a 
result, both the number of HBase regions and the number of CubeHFileJob 
reducers are too small, which makes CubeHFileJob much slower.
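
To illustrate the effect, here is a minimal sketch of how a scaled estimate 
could drive the region (and reducer) count. It is not the actual Kylin code: 
the class name, the regionCount helper, the 3000 MB cut size and the 100000 MB 
raw estimate are all assumptions made up for this example. Only the 0.05 and 
0.25 coefficients come from the description above.

```java
public class RegionCountSketch {

    // Hypothetical helper: one region (and one reducer) per regionCutMB of
    // estimated data, with at least one region. Not the real Kylin logic.
    static int regionCount(double rawEstimateMB, double coefficient, double regionCutMB) {
        double scaledMB = rawEstimateMB * coefficient;
        return Math.max(1, (int) Math.ceil(scaledMB / regionCutMB));
    }

    public static void main(String[] args) {
        double rawMB = 100_000; // assumed raw estimate, for illustration only
        double cutMB = 3_000;   // assumed region cut size, for illustration only

        // Coefficient 0.05 (memory hungry), 0.25 (not memory hungry), 1.0 (no scaling)
        System.out.println(regionCount(rawMB, 0.05, cutMB)); // 2
        System.out.println(regionCount(rawMB, 0.25, cutMB)); // 9
        System.out.println(regionCount(rawMB, 1.0, cutMB));  // 34
    }
}
```

With these assumed numbers, the smaller the coefficient, the fewer regions and 
reducers there are to share the same amount of actual data.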

After we removed the default multiply coefficient, CubeHFileJob became much 
faster.

We should make the multiply coefficient configurable, which would be more 
friendly for users.
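
A rough sketch of what this could look like, assuming the coefficients are 
read from a properties object and fall back to the current values when unset. 
The class name and both property keys are hypothetical and chosen only for 
this example; they are not existing Kylin configuration keys.

```java
import java.util.Properties;

public class CuboidSizeCoefficients {

    // Hypothetical property keys, invented for this sketch.
    static final String RATIO_KEY = "example.cuboid.size.ratio";
    static final String MEM_HUNGRY_RATIO_KEY = "example.cuboid.size.memhungry.ratio";

    private final Properties props;

    CuboidSizeCoefficients(Properties props) {
        this.props = props;
    }

    // Returns the configured coefficient, or the current hard-coded default
    // (0.05 for memory-hungry cubes, 0.25 otherwise) when the key is not set.
    double coefficient(boolean memoryHungry) {
        String key = memoryHungry ? MEM_HUNGRY_RATIO_KEY : RATIO_KEY;
        String fallback = memoryHungry ? "0.05" : "0.25";
        return Double.parseDouble(props.getProperty(key, fallback));
    }

    // Scales a raw size estimate by the applicable coefficient.
    double estimateMB(double rawEstimateMB, boolean memoryHungry) {
        return rawEstimateMB * coefficient(memoryHungry);
    }
}
```

Keeping 0.05 and 0.25 as fallbacks would leave existing deployments unchanged, 
while a user who sees undersized estimates could set a coefficient of 1.0 to 
disable the scaling.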



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
