count distinct

Ruslan Dautkhanov Wed, 27 Jul 2016 15:05:59 -0700

Hello,

1) How efficient is Kylin at materializing count distinct in its cubes?
We're more interested in exact count distinct.


2) How efficient is Kylin for wide datasets? We have around 700 dimensions.
Size of dataset: tens of billions of records.
Is it feasible to run such a workload on, for example, a 10-node Hadoop
cluster?

3) (This is a less critical question than the first two.)
Does Kylin have a session-level setting to switch between approx and exact
count distinct?
Impala, for example, has a session-level setting, APPX_COUNT_DISTINCT.
That way, without changing application queries, users can choose whether
they want approx or exact counts.


Thank you,
Ruslan Dautkhanov
