Querying raw data / lowest granularity with Kylin

alex schufo Mon, 10 Aug 2015 02:36:01 -0700

I have some scenarios where I would like to drill down to the lowest
granularity of my table. Does Kylin handle this?


If I am not mistaken, at least one "group by" should always be used.

So I tried a query that groups by all my dimensions at the same time:
"select dim1, dim2, ..., dimN, sum(measure1), ..., sum(measureN) from ...
where ... group by dim1, dim2, ..., dimN". This gives me the expected
results. Is this the correct way to do it?
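
For illustration, here is a minimal sketch of how such a query could be
generated for an arbitrary list of dimensions and measures. The helper name
build_lowest_granularity_query, the table name fact_table and the column
names are all hypothetical placeholders, not anything defined by Kylin; the
function only assembles a SQL string and does not talk to a server.

```python
def build_lowest_granularity_query(table, dimensions, measures, where=None):
    """Build a SQL string that groups by every dimension and sums every measure.

    All names passed in are placeholders chosen by the caller; nothing here
    is specific to Kylin beyond the shape of the query.
    """
    if not dimensions:
        raise ValueError("at least one dimension is required for the group by")

    dim_list = ", ".join(dimensions)
    sum_list = ", ".join(f"sum({m})" for m in measures)
    # Skip the measure part cleanly if no measures were given.
    select_list = ", ".join(part for part in (dim_list, sum_list) if part)

    sql = f"select {select_list} from {table}"
    if where:
        sql += f" where {where}"
    sql += f" group by {dim_list}"
    return sql


# Hypothetical table and column names, for illustration only.
query = build_lowest_granularity_query(
    "fact_table",
    ["dim1", "dim2", "dim3"],
    ["measure1", "measure2"],
    where="dim1 = 'a'",
)
print(query)
```

The same list of dimensions is used for both the select list and the
"group by" clause, so the two cannot drift apart as dimensions are added.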

Although this seems to work, with several dimensions it would mean building
a lot of cubes and using a lot of space, even though in this case they would
not necessarily be used. I know that aggregation groups can be used to
reduce this. Using the same example, I created one aggregation group for
each dimension and the expansion rate is 200%, but I tested with only 5
dimensions. Again, is this the correct way to do it?
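
To make the space concern concrete, here is a rough back-of-the-envelope
sketch of how the number of dimension combinations (cuboids) grows. The
full-cube figure of 2^N is standard; the grouped figure rests on a
simplified assumption of mine (each aggregation group contributes the
non-empty combinations of its own dimensions, plus one base cuboid over all
dimensions) and may not match what Kylin actually builds. The function
names are hypothetical.

```python
def full_cube_cuboids(n_dimensions):
    """Number of dimension combinations in a full cube: 2^N."""
    return 2 ** n_dimensions


def grouped_cuboids(group_sizes):
    """Rough estimate when dimensions are split into aggregation groups.

    ASSUMPTION (not taken from Kylin's planner): each group of k dimensions
    contributes its 2^k - 1 non-empty combinations, and one base cuboid
    containing all dimensions is always kept.
    """
    return sum(2 ** k - 1 for k in group_sizes) + 1


for n in (5, 10, 20):
    one_group_per_dimension = [1] * n
    print(n, full_cube_cuboids(n), grouped_cuboids(one_group_per_dimension))
```

Under this simplified model, one group per dimension grows linearly with
the number of dimensions, while the full cube doubles with each dimension
added.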

Related to this topic, I saw:

v0.7.x: InvertedIndex (HybridOLAP)
Goal:
Introduce InvertedIndex to optimise queries on raw data and low level
aggregation

on https://issues.apache.org/jira/browse/KYLIN-577

Is this something that is currently available in 0.7.2? This ticket dates
back to the beginning of 2015, so I am not sure whether it reflects Kylin's
current plan.
