Hi all, I have experienced the wrong result for quite a long time since
kylin-0.6x, thus, really hope that someone can do a help.

CUBE:
1, the cube is based on just one big fact table. there are 10 attributes,
and 3 metrics.
for exmaple: my_date, col2, col3, col4, col5, col6, col7, col8, col9,
col10, sum(f1), sum(f2), sum(f2)

2, my_date, col2 is normal columns, and my_date is the partition attribute,
which is date type in HIVE,

3, (col3, col4), (col5, col6), (col7, col8), (col9, col10) are 4
hirerarchies, and col3, col5, col7, col9 are parents of its hirarchy
respectively.

4, the aggregation groups are: my_date, col2, col3, col4, col5, col6, col7,
col8, col9, col10

5, the rowkeys are: my_date, col2, col3, col4, col5, col6, col7, col8,
col9, col10


NOTE THAT: in the cube, the range of my_date is between 2013-10-01 and
2015-03-04


2 kinds of issue as below:

QUERY:
1, select my_date, sum(f1), sum(f2), sum(f3) from test where
my_date>='2013-01-01' and my_date<= '2015-01-31'
note "my_date" is HIVE date type. and the error: Cannot apply '>=' to
arguments of type '<DATE> >= <CHAR(10)>'. Supported form(s):
'<COMPARABLE_TYPE> >= <COMPARABLE_TYPE>' while executing SQL, how should I
modify the SQL?

2, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group by
my_date, col3, col4, order by my_date desc, col3 desc, col4 desc;

suppose that the actual row count of the result is GREATER THAN the default
limit 50000;

then, the result is:

my_date, col3, col4, sum(f1), sum(f2), sum(f3)
2013-10-23
2013-10-23
...
2013-10-22
2013-10-22
...
2013-10-21
2013-10-21
...
...
...
2013-10-01
2013-10-01(note 2013-10-01 is the earliest day of the cube)

totally, there are 50000 rows for the result,  but it is obvious that the
order by and limit doesn't work correctly.

3, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group by
my_date, col3, col4, order by my_date desc, col3 desc, col4 desc LIMIT 100

just adding "LIMIT 100" to the tail of the 2nd query above, then, the
result is:

my_date, col3, col4, sum(f1), sum(f2), sum(f3)
2013-10-05
2013-10-05
....
2013-10-05

totally, there are 100 rows. all of which are "2013-10-05", apparently, the
result is not correct as well.

4, select my_date, sum(f1), sum(f2), sum(f3) from test group by my_date
order by my_date desc LIMIT 100
if I just add "my_date" as "aggreation level" as above, the result of the
4th query is correct.

Reply via email to