Hi Zhuoran,
I faced a similar problem with cube build time. I think it depends on
the cardinality of the 2 dimensions you added. If either of them has a
high cardinality (e.g. in my use case, a Customer dimension with about
500,000 rows), the number of combinations Kylin needs to build for the
cube increases a lot.
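To give a rough idea of how quickly the combinations grow, here is a minimal sketch in Python. It assumes a single aggregation group in which every normal dimension is optional, and a hierarchy of n dimensions contributes n + 1 combinations instead of 2^n. It ignores mandatory and joint dimensions and any other pruning, and the function name is only illustrative, so treat the numbers as an upper-bound estimate rather than what Kylin reports.

```python
def cuboid_count(normal_dims, hierarchy_sizes=()):
    """Rough cuboid count for one aggregation group (illustrative helper).

    normal_dims     -- total number of normal dimensions
    hierarchy_sizes -- sizes of the hierarchies taken from those dimensions
    """
    free_dims = normal_dims - sum(hierarchy_sizes)
    if free_dims < 0:
        raise ValueError("hierarchies use more dimensions than are defined")
    # Each free dimension is either present or absent in a cuboid.
    count = 2 ** free_dims
    # A hierarchy of n levels only allows its n prefixes plus "none".
    for size in hierarchy_sizes:
        count *= size + 1
    return count


for n in (13, 15):
    print(n, cuboid_count(n), cuboid_count(n, hierarchy_sizes=(4,)))
# 13 8192 2560
# 15 32768 10240
```

Under these assumptions, going from 13 to 15 normal dimensions multiplies the number of cuboids by 4, and each cuboid that contains a high-cardinality dimension also has many more rows.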
Some things you could try to reduce cube building time and size:
* Define all dimension table attributes as Derived Dimensions. In
this case you cannot use the Hierarchy optimization in the Agg Group.
Queries that use derived attributes will be slower than with Agg
Group Hierarchies (with Normal Dimensions), but in some cases the
difference in query latency is acceptable (in my case between 2 and
6 seconds more, depending on the query). Cube size and build time
will be reduced.
* Use "Shard By" in the Rowkey for high-cardinality dimensions. I have
not been able to test it yet, but according to
https://kylin.apache.org/docs16/howto/howto_optimize_build.html
it should work fine. This helps to reduce cube build time.
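For the "Shard By" option, this is roughly what I would expect the rowkey part of the cube descriptor to look like, written here as a small Python sketch that prints the JSON. The column names CUSTOMER_ID and ORDER_DATE are hypothetical, and the key names (rowkey_columns, encoding, isShardBy) are as I remember them from the cube JSON, so please check them against a descriptor exported from your own Kylin version before relying on this.

```python
import json

# Hypothetical columns; CUSTOMER_ID stands for the high-cardinality dimension.
# Key names are an assumption based on the cube descriptor JSON.
rowkey = {
    "rowkey_columns": [
        {"column": "CUSTOMER_ID", "encoding": "dict", "isShardBy": True},
        {"column": "ORDER_DATE", "encoding": "dict", "isShardBy": False},
    ]
}

# Columns flagged as the shard-by column in this sketch.
shard_columns = [
    col["column"] for col in rowkey["rowkey_columns"] if col["isShardBy"]
]

print(json.dumps({"rowkey": rowkey}, indent=2))
print("Shard by:", shard_columns)
```

In the web UI this corresponds to setting "Shard By" to true for that column in the Rowkeys section of the cube's Advanced Setting step.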
I hope this helps. I'm also still learning to use Kylin.
Kind Regards,
On 27/04/2017 at 12:46, 吕卓然 wrote:
Hi all,
Currently I am using Kylin 1.6.1 and I have a problem with cube
build time. I have one fact table and two lookup tables. When I set
13 normal dimensions, 15 derived dimensions and two measures (count
and count distinct), step 3 of the build takes around 20 minutes and
the entire build takes around 1 hour. This is good.
However, when I increase this to 15 normal dimensions, 15 derived
dimensions and two measures (count and count distinct), step 3 of the
build takes around 240 minutes and the entire build takes forever.
BTW, I have a hierarchy dimension which has 4 normal dimensions.
I am really confused about this. Do 13 normal dimensions become a
bottleneck when building a cube?
Thanks a lot!
Zhuoran
--
*Roberto Tardío Olmos*
/Senior Big Data & Business Intelligence Consultant/
Avenida de Brasil, 17, Planta 16.
28020 Madrid
Landline: 91.788.34.10