Hello,

1) How efficient is Kylin at materializing count distinct in its cubes? We're more interested in exact count distinct.

2) How efficient is Kylin for wide datasets? We have around 700 dimensions, and the dataset has tens of billions of records. Is it feasible to run such a workload on, for example, a 10-node Hadoop cluster?

3) (This is a less critical question than the first two.) Does Kylin have a session-level setting to switch between approximate and exact count distinct? Impala, for example, has a session-level setting, APPX_COUNT_DISTINCT, so users can switch between approximate and exact counts without changing application queries.

Thank you,
Ruslan Dautkhanov
