Re: get wrong result for the query(both in kyin-0.71 and kylin-0.6x)

dong wang Wed, 04 Mar 2015 19:30:07 -0800

HI hongbin,
1, yes, as you guessed,  the result of the above query is CORRECT if we use
ASC order, thus, the issue only occurs when DESC order.
2, the total row count of the result is GREATER THAN 50000 definitely, even
much more


2015-03-05 11:06 GMT+08:00 hongbin ma <[email protected]>:

> for #2 to #4, will Kylin return correct answers if order by ascending
> order?
> And what it the total result count of #4?
>
> On Thu, Mar 5, 2015 at 10:56 AM, Zhou, Qianhao <[email protected]> wrote:
>
> > #1 when add condition for date type, you should add date before the time
> > string
> >
> > the sql should be like:
> > select my_date, sum(f1), sum(f2), sum(f3) from test where
> > my_date >= date'2013-01-01' and my_date <= date'2015-01-31'
> >
> >
> >
> > Best Regard
> > Zhou QianHao
> >
> >
> >
> >
> >
> > On 3/5/15, 10:40 AM, "dong wang" <[email protected]> wrote:
> >
> > >Hi all, I have experienced the wrong result for quite a long time since
> > >kylin-0.6x, thus, really hope that someone can do a help.
> > >
> > >CUBE:
> > >1, the cube is based on just one big fact table. there are 10
> attributes,
> > >and 3 metrics.
> > >for exmaple: my_date, col2, col3, col4, col5, col6, col7, col8, col9,
> > >col10, sum(f1), sum(f2), sum(f2)
> > >
> > >2, my_date, col2 is normal columns, and my_date is the partition
> > >attribute,
> > >which is date type in HIVE,
> > >
> > >3, (col3, col4), (col5, col6), (col7, col8), (col9, col10) are 4
> > >hirerarchies, and col3, col5, col7, col9 are parents of its hirarchy
> > >respectively.
> > >
> > >4, the aggregation groups are: my_date, col2, col3, col4, col5, col6,
> > >col7,
> > >col8, col9, col10
> > >
> > >5, the rowkeys are: my_date, col2, col3, col4, col5, col6, col7, col8,
> > >col9, col10
> > >
> > >
> > >NOTE THAT: in the cube, the range of my_date is between 2013-10-01 and
> > >2015-03-04
> > >
> > >
> > >2 kinds of issue as below:
> > >
> > >QUERY:
> > >1, select my_date, sum(f1), sum(f2), sum(f3) from test where
> > >my_date>='2013-01-01' and my_date<= '2015-01-31'
> > >note "my_date" is HIVE date type. and the error: Cannot apply '>=' to
> > >arguments of type '<DATE> >= <CHAR(10)>'. Supported form(s):
> > >'<COMPARABLE_TYPE> >= <COMPARABLE_TYPE>' while executing SQL, how
> should I
> > >modify the SQL?
> > >
> > >2, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group
> > >by
> > >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc;
> > >
> > >suppose that the actual row count of the result is GREATER THAN the
> > >default
> > >limit 50000;
> > >
> > >then, the result is:
> > >
> > >my_date, col3, col4, sum(f1), sum(f2), sum(f3)
> > >2013-10-23
> > >2013-10-23
> > >...
> > >2013-10-22
> > >2013-10-22
> > >...
> > >2013-10-21
> > >2013-10-21
> > >...
> > >...
> > >...
> > >2013-10-01
> > >2013-10-01(note 2013-10-01 is the earliest day of the cube)
> > >
> > >totally, there are 50000 rows for the result,  but it is obvious that
> the
> > >order by and limit doesn't work correctly.
> > >
> > >3, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group
> > >by
> > >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc LIMIT
> 100
> > >
> > >just adding "LIMIT 100" to the tail of the 2nd query above, then, the
> > >result is:
> > >
> > >my_date, col3, col4, sum(f1), sum(f2), sum(f3)
> > >2013-10-05
> > >2013-10-05
> > >....
> > >2013-10-05
> > >
> > >totally, there are 100 rows. all of which are "2013-10-05", apparently,
> > >the
> > >result is not correct as well.
> > >
> > >4, select my_date, sum(f1), sum(f2), sum(f3) from test group by my_date
> > >order by my_date desc LIMIT 100
> > >if I just add "my_date" as "aggreation level" as above, the result of
> the
> > >4th query is correct.
> >
> >
>

Re: get wrong result for the query(both in kyin-0.71 and kylin-0.6x)

Reply via email to