Re: get wrong result for the query(both in kyin-0.71 and kylin-0.6x)

hongbin ma Wed, 04 Mar 2015 19:07:59 -0800

for #2 to #4, will Kylin return correct answers if order by ascending order?
And what it the total result count of #4?


On Thu, Mar 5, 2015 at 10:56 AM, Zhou, Qianhao <[email protected]> wrote:

> #1 when add condition for date type, you should add date before the time
> string
>
> the sql should be like:
> select my_date, sum(f1), sum(f2), sum(f3) from test where
> my_date >= date'2013-01-01' and my_date <= date'2015-01-31'
>
>
>
> Best Regard
> Zhou QianHao
>
>
>
>
>
> On 3/5/15, 10:40 AM, "dong wang" <[email protected]> wrote:
>
> >Hi all, I have experienced the wrong result for quite a long time since
> >kylin-0.6x, thus, really hope that someone can do a help.
> >
> >CUBE:
> >1, the cube is based on just one big fact table. there are 10 attributes,
> >and 3 metrics.
> >for exmaple: my_date, col2, col3, col4, col5, col6, col7, col8, col9,
> >col10, sum(f1), sum(f2), sum(f2)
> >
> >2, my_date, col2 is normal columns, and my_date is the partition
> >attribute,
> >which is date type in HIVE,
> >
> >3, (col3, col4), (col5, col6), (col7, col8), (col9, col10) are 4
> >hirerarchies, and col3, col5, col7, col9 are parents of its hirarchy
> >respectively.
> >
> >4, the aggregation groups are: my_date, col2, col3, col4, col5, col6,
> >col7,
> >col8, col9, col10
> >
> >5, the rowkeys are: my_date, col2, col3, col4, col5, col6, col7, col8,
> >col9, col10
> >
> >
> >NOTE THAT: in the cube, the range of my_date is between 2013-10-01 and
> >2015-03-04
> >
> >
> >2 kinds of issue as below:
> >
> >QUERY:
> >1, select my_date, sum(f1), sum(f2), sum(f3) from test where
> >my_date>='2013-01-01' and my_date<= '2015-01-31'
> >note "my_date" is HIVE date type. and the error: Cannot apply '>=' to
> >arguments of type '<DATE> >= <CHAR(10)>'. Supported form(s):
> >'<COMPARABLE_TYPE> >= <COMPARABLE_TYPE>' while executing SQL, how should I
> >modify the SQL?
> >
> >2, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group
> >by
> >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc;
> >
> >suppose that the actual row count of the result is GREATER THAN the
> >default
> >limit 50000;
> >
> >then, the result is:
> >
> >my_date, col3, col4, sum(f1), sum(f2), sum(f3)
> >2013-10-23
> >2013-10-23
> >...
> >2013-10-22
> >2013-10-22
> >...
> >2013-10-21
> >2013-10-21
> >...
> >...
> >...
> >2013-10-01
> >2013-10-01(note 2013-10-01 is the earliest day of the cube)
> >
> >totally, there are 50000 rows for the result,  but it is obvious that the
> >order by and limit doesn't work correctly.
> >
> >3, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group
> >by
> >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc LIMIT 100
> >
> >just adding "LIMIT 100" to the tail of the 2nd query above, then, the
> >result is:
> >
> >my_date, col3, col4, sum(f1), sum(f2), sum(f3)
> >2013-10-05
> >2013-10-05
> >....
> >2013-10-05
> >
> >totally, there are 100 rows. all of which are "2013-10-05", apparently,
> >the
> >result is not correct as well.
> >
> >4, select my_date, sum(f1), sum(f2), sum(f3) from test group by my_date
> >order by my_date desc LIMIT 100
> >if I just add "my_date" as "aggreation level" as above, the result of the
> >4th query is correct.
>
>

Re: get wrong result for the query(both in kyin-0.71 and kylin-0.6x)

Reply via email to