for #2 to #4, will Kylin return correct answers if order by ascending order? And what it the total result count of #4?
On Thu, Mar 5, 2015 at 10:56 AM, Zhou, Qianhao <[email protected]> wrote: > #1 when add condition for date type, you should add date before the time > string > > the sql should be like: > select my_date, sum(f1), sum(f2), sum(f3) from test where > my_date >= date'2013-01-01' and my_date <= date'2015-01-31' > > > > Best Regard > Zhou QianHao > > > > > > On 3/5/15, 10:40 AM, "dong wang" <[email protected]> wrote: > > >Hi all, I have experienced the wrong result for quite a long time since > >kylin-0.6x, thus, really hope that someone can do a help. > > > >CUBE: > >1, the cube is based on just one big fact table. there are 10 attributes, > >and 3 metrics. > >for exmaple: my_date, col2, col3, col4, col5, col6, col7, col8, col9, > >col10, sum(f1), sum(f2), sum(f2) > > > >2, my_date, col2 is normal columns, and my_date is the partition > >attribute, > >which is date type in HIVE, > > > >3, (col3, col4), (col5, col6), (col7, col8), (col9, col10) are 4 > >hirerarchies, and col3, col5, col7, col9 are parents of its hirarchy > >respectively. > > > >4, the aggregation groups are: my_date, col2, col3, col4, col5, col6, > >col7, > >col8, col9, col10 > > > >5, the rowkeys are: my_date, col2, col3, col4, col5, col6, col7, col8, > >col9, col10 > > > > > >NOTE THAT: in the cube, the range of my_date is between 2013-10-01 and > >2015-03-04 > > > > > >2 kinds of issue as below: > > > >QUERY: > >1, select my_date, sum(f1), sum(f2), sum(f3) from test where > >my_date>='2013-01-01' and my_date<= '2015-01-31' > >note "my_date" is HIVE date type. and the error: Cannot apply '>=' to > >arguments of type '<DATE> >= <CHAR(10)>'. Supported form(s): > >'<COMPARABLE_TYPE> >= <COMPARABLE_TYPE>' while executing SQL, how should I > >modify the SQL? > > > >2, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group > >by > >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc; > > > >suppose that the actual row count of the result is GREATER THAN the > >default > >limit 50000; > > > >then, the result is: > > > >my_date, col3, col4, sum(f1), sum(f2), sum(f3) > >2013-10-23 > >2013-10-23 > >... > >2013-10-22 > >2013-10-22 > >... > >2013-10-21 > >2013-10-21 > >... > >... > >... > >2013-10-01 > >2013-10-01(note 2013-10-01 is the earliest day of the cube) > > > >totally, there are 50000 rows for the result, but it is obvious that the > >order by and limit doesn't work correctly. > > > >3, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test group > >by > >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc LIMIT 100 > > > >just adding "LIMIT 100" to the tail of the 2nd query above, then, the > >result is: > > > >my_date, col3, col4, sum(f1), sum(f2), sum(f3) > >2013-10-05 > >2013-10-05 > >.... > >2013-10-05 > > > >totally, there are 100 rows. all of which are "2013-10-05", apparently, > >the > >result is not correct as well. > > > >4, select my_date, sum(f1), sum(f2), sum(f3) from test group by my_date > >order by my_date desc LIMIT 100 > >if I just add "my_date" as "aggreation level" as above, the result of the > >4th query is correct. > >
