Re: The performance of cube with large dimension table

hongbin ma Thu, 21 Apr 2016 04:03:07 -0700

By "a large number of candidates", do you mean that the dimension has very
high cardinality? I assume so.

How best to optimize a cube with a high-cardinality dimension depends on
your query pattern:

When queries filter on the high-cardinality dimension, it's best to put that
dimension at the beginning of the row key, so that the filters can quickly
exclude unwanted cube rows.
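
To illustrate why the row key order matters, here is a minimal sketch in
Python. It is a toy model, not Kylin's actual storage code: the cube is
assumed to be a sorted list of (user_id, country) tuples, and the names
build_rows, scan_leading and scan_trailing are hypothetical. A filter on the
leading dimension becomes a narrow range scan, while a filter on a trailing
dimension has to touch every row.

```python
from bisect import bisect_left

def build_rows(user_ids, countries):
    """Toy cube rows keyed by (user_id, country), kept in row key order."""
    return sorted((u, c) for u in user_ids for c in countries)

def scan_leading(rows, user_id):
    """Filter on the leading dimension: binary-search a contiguous range."""
    lo = bisect_left(rows, (user_id,))      # first row for this user_id
    hi = bisect_left(rows, (user_id + 1,))  # first row for the next user_id
    return rows[lo:hi], hi - lo             # (matches, rows touched)

def scan_trailing(rows, country):
    """Filter on a trailing dimension: every row must be examined."""
    matches = [r for r in rows if r[1] == country]
    return matches, len(rows)               # (matches, rows touched)

# Assumed sample data: 1000 users (high cardinality), 3 countries.
rows = build_rows(range(1000), ["CN", "DE", "US"])
_, touched_leading = scan_leading(rows, 42)
_, touched_trailing = scan_trailing(rows, "US")
print(touched_leading, touched_trailing)
```

In this toy setup the leading-dimension filter touches only the rows for one
user, whereas the trailing-dimension filter touches all of them.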

If your queries filter on other dimensions and group by the high-cardinality
dimension, they can easily return a massive number of results. This scenario
used to be a weakness of Kylin. However, we have recently been working on
multiple improvements for such scenarios; see
https://issues.apache.org/jira/browse/KYLIN-1428.  After these improvements
are released, I'll write a blog post to explain all of them.
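
As a rough illustration of why this pattern returns so many results, the
sketch below (Python, with made-up data and a hypothetical group_by helper,
not anything from Kylin) filters on one dimension and groups by another. The
number of result rows is bounded by the number of distinct values of the
group-by dimension that survive the filter, so grouping by the
high-cardinality dimension can yield as many rows as it has distinct values.

```python
from collections import defaultdict

def group_by(rows, filter_idx, filter_value, group_idx):
    """Sum the last column per group, keeping rows that pass the filter."""
    totals = defaultdict(int)
    for row in rows:
        if row[filter_idx] == filter_value:
            totals[row[group_idx]] += row[-1]
    return dict(totals)

# Assumed sample data: (user_id, country, measure) with 1000 distinct users.
rows = [(u, c, 1) for u in range(1000) for c in ("CN", "US")]

# Filter on country, group by the high-cardinality user_id: one row per user.
by_user = group_by(rows, filter_idx=1, filter_value="US", group_idx=0)

# Filter on user_id, group by the low-cardinality country: a handful of rows.
by_country = group_by(rows, filter_idx=0, filter_value=42, group_idx=1)

print(len(by_user), len(by_country))
```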

You can also check
http://apache-kylin.74782.x6.nabble.com/How-to-use-kylin-with-high-cardinality-dimensions-td3661.html,
which might give you some ideas.

On Thu, Apr 21, 2016 at 11:02 AM, huawang <[email protected]> wrote:

> hi,  I found that if a dimension of the cube has a large number of
> candidates, the query will be very slow. Is there any solution to this
> condition?

-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
