Agree with feng yu: you need to think about whether you really need to build such a
high-cardinality dimension into the Cube.
For example, if the column is something like a free-text description or a
timestamp column, it doesn't make sense to have it in the Cube, as Kylin is
an OLAP engine, not a general-purpose database. If you
assume the average size of this column is 32 bytes, a cardinality of 50 million
means about 1.5 GB. In the 'Extract Fact Table Distinct Columns' step, the mappers
read from the intermediate table and remove duplicate values (in a Combiner);
however, this job starts more than one mapper but just one reducer,
there
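The memory estimate above can be sketched as a quick back-of-the-envelope
calculation (the 32-byte average value size and 50 million cardinality are the
assumed figures from this thread, not measured values):

```python
# Rough estimate of the data volume the single reducer must hold when
# deduplicating an ultra-high-cardinality column.
cardinality = 50_000_000     # distinct values (assumed, per the thread)
avg_value_bytes = 32         # average column value size (assumed)

total_bytes = cardinality * avg_value_bytes
total_gib = total_bytes / 2**30

print(f"~{total_gib:.2f} GiB for the distinct values alone")
```

That is before any per-object overhead in the JVM, which is why a single
reducer with a default heap runs into "GC overhead limit exceeded".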
Hi All,
There are two ultra-high-cardinality columns in our cube, both of them
over 50 million cardinality. When building the cube, it keeps giving us
"Error: GC overhead limit exceeded" for the reduce jobs at the step 'Extract
Fact Table Distinct Columns'.
We've just updated to version 1