Sorry hongbin, it's my fault, for the 2 question, the row count is LESS
THAN 50000

2015-03-05 11:28 GMT+08:00 dong wang <[email protected]>:

> HI hongbin,
> 1, yes, as you guessed,  the result of the above query is CORRECT if we
> use ASC order, thus, the issue only occurs when DESC order.
> 2, the total row count of the result is GREATER THAN 50000 definitely,
> even much more
>
> 2015-03-05 11:06 GMT+08:00 hongbin ma <[email protected]>:
>
>> for #2 to #4, will Kylin return correct answers if order by ascending
>> order?
>> And what it the total result count of #4?
>>
>> On Thu, Mar 5, 2015 at 10:56 AM, Zhou, Qianhao <[email protected]> wrote:
>>
>> > #1 when add condition for date type, you should add date before the time
>> > string
>> >
>> > the sql should be like:
>> > select my_date, sum(f1), sum(f2), sum(f3) from test where
>> > my_date >= date'2013-01-01' and my_date <= date'2015-01-31'
>> >
>> >
>> >
>> > Best Regard
>> > Zhou QianHao
>> >
>> >
>> >
>> >
>> >
>> > On 3/5/15, 10:40 AM, "dong wang" <[email protected]> wrote:
>> >
>> > >Hi all, I have experienced the wrong result for quite a long time since
>> > >kylin-0.6x, thus, really hope that someone can do a help.
>> > >
>> > >CUBE:
>> > >1, the cube is based on just one big fact table. there are 10
>> attributes,
>> > >and 3 metrics.
>> > >for exmaple: my_date, col2, col3, col4, col5, col6, col7, col8, col9,
>> > >col10, sum(f1), sum(f2), sum(f2)
>> > >
>> > >2, my_date, col2 is normal columns, and my_date is the partition
>> > >attribute,
>> > >which is date type in HIVE,
>> > >
>> > >3, (col3, col4), (col5, col6), (col7, col8), (col9, col10) are 4
>> > >hirerarchies, and col3, col5, col7, col9 are parents of its hirarchy
>> > >respectively.
>> > >
>> > >4, the aggregation groups are: my_date, col2, col3, col4, col5, col6,
>> > >col7,
>> > >col8, col9, col10
>> > >
>> > >5, the rowkeys are: my_date, col2, col3, col4, col5, col6, col7, col8,
>> > >col9, col10
>> > >
>> > >
>> > >NOTE THAT: in the cube, the range of my_date is between 2013-10-01 and
>> > >2015-03-04
>> > >
>> > >
>> > >2 kinds of issue as below:
>> > >
>> > >QUERY:
>> > >1, select my_date, sum(f1), sum(f2), sum(f3) from test where
>> > >my_date>='2013-01-01' and my_date<= '2015-01-31'
>> > >note "my_date" is HIVE date type. and the error: Cannot apply '>=' to
>> > >arguments of type '<DATE> >= <CHAR(10)>'. Supported form(s):
>> > >'<COMPARABLE_TYPE> >= <COMPARABLE_TYPE>' while executing SQL, how
>> should I
>> > >modify the SQL?
>> > >
>> > >2, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test
>> group
>> > >by
>> > >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc;
>> > >
>> > >suppose that the actual row count of the result is GREATER THAN the
>> > >default
>> > >limit 50000;
>> > >
>> > >then, the result is:
>> > >
>> > >my_date, col3, col4, sum(f1), sum(f2), sum(f3)
>> > >2013-10-23
>> > >2013-10-23
>> > >...
>> > >2013-10-22
>> > >2013-10-22
>> > >...
>> > >2013-10-21
>> > >2013-10-21
>> > >...
>> > >...
>> > >...
>> > >2013-10-01
>> > >2013-10-01(note 2013-10-01 is the earliest day of the cube)
>> > >
>> > >totally, there are 50000 rows for the result,  but it is obvious that
>> the
>> > >order by and limit doesn't work correctly.
>> > >
>> > >3, select my_date, col3, col4, sum(f1), sum(f2), sum(f3) from test
>> group
>> > >by
>> > >my_date, col3, col4, order by my_date desc, col3 desc, col4 desc LIMIT
>> 100
>> > >
>> > >just adding "LIMIT 100" to the tail of the 2nd query above, then, the
>> > >result is:
>> > >
>> > >my_date, col3, col4, sum(f1), sum(f2), sum(f3)
>> > >2013-10-05
>> > >2013-10-05
>> > >....
>> > >2013-10-05
>> > >
>> > >totally, there are 100 rows. all of which are "2013-10-05", apparently,
>> > >the
>> > >result is not correct as well.
>> > >
>> > >4, select my_date, sum(f1), sum(f2), sum(f3) from test group by my_date
>> > >order by my_date desc LIMIT 100
>> > >if I just add "my_date" as "aggreation level" as above, the result of
>> the
>> > >4th query is correct.
>> >
>> >
>>
>
>

Reply via email to