Forwarding this problem to the dev mailing list for further discussion.
@nce, please also reply to the dev list so that more participants can join in.

---------- Forwarded message ----------
From: nce <[email protected]>
Date: Mon, Apr 27, 2015 at 3:50 PM
Subject: Re:Re: Re: Kylin issue 363
To: "[email protected]" <[email protected]>


Hi,
Thanks for your response. I have checked the issue you mentioned in your
letter, but I suspect the problem I met is different from his.
Please see my query below (it was run on Kylin against a built cube):
select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
rstmy.theday <='2014-12-31'  group by rstmy.GROUPID
With this query, I got 2808 rows of results, which is correct: I got the same
result with HIVE.
But the problem came after I added "IN". With the query:
select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
'00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
'00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
'00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
'00010000020018' , '00010000020008' , '00010000020010' , '00010000020014'
 , '00010000020006' , '00010000020030' , '00010000020029' ,
'00010000020015' , '00010000020005' , '00010000020027' , '00010000020001' ,
'00010000020023' , '00010000020003' , '00010000020026' , '00010000020031' ,
'00010000020025' , '00010000020011' , '00010000020022' ) group by
rstmy.GROUPID
I got 24 rows of results, but with HIVE I got 26 rows. After checking the 2
rows, I was sure that Kylin lost the 2 rows.
My Kylin version is 0.6.6. Is this because Kylin doesn't support "IN" very
well? Is there any fix now?
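
A minimal sketch of how the two result sets could be compared to pin down the missing rows, assuming both are exported as (GROUPID, transAt) pairs. The function name and the sample rows are hypothetical, not data from this report.

```python
def missing_groups(reference_rows, candidate_rows):
    """Return group IDs present in reference_rows but absent from candidate_rows."""
    reference_ids = {group_id for group_id, _ in reference_rows}
    candidate_ids = {group_id for group_id, _ in candidate_rows}
    return sorted(reference_ids - candidate_ids)


# Made-up rows for illustration only (not real query output).
hive_rows = [("A", 10.0), ("B", 20.0), ("C", 30.0)]
kylin_rows = [("A", 10.0), ("C", 30.0)]

# Group IDs that the reference (Hive) returned but the candidate (Kylin) did not.
print(missing_groups(hive_rows, kylin_rows))
```

Posting the IDs found this way together with kylin.log makes the lost rows easier to trace.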

Thanks,

George Ni


At 2015-04-27 10:29:22, "hongbin ma" <[email protected]> wrote:

hi,

this might be similar to a previous user's question: "get wrong result for
the query(both in kyin-0.71 and kylin-0.6x)"
(find the context at
http://mail-archives.apache.org/mod_mbox/kylin-dev/201503.mbox/thread)

in short:
*By default if you're making queries on the web client, a mode called
"AcceptPartialResults" is enabled, this is a protection mechanism that will
only return part of the results to reduce server overhead. Honestly it
might hurt the correctness of order by queries. *
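
To illustrate why a partial result can break an "order by ... limit" query, here is a small sketch. It is not Kylin's actual code: the row cap and the way the scan is cut short are assumptions made only to show the effect, and all names and numbers are made up.

```python
from collections import defaultdict


def top_n(rows, n, row_cap=None):
    """Sum amounts per group and return the n largest groups.

    row_cap, if set, simulates a scan that stops early after row_cap rows
    (an assumed stand-in for a partial-result mode).
    """
    scanned = rows if row_cap is None else rows[:row_cap]
    totals = defaultdict(float)
    for group_id, amount in scanned:
        totals[group_id] += amount
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up detail rows: (group id, amount).
rows = [("g1", 5.0), ("g2", 4.0), ("g3", 1.0), ("g3", 9.0), ("g4", 3.0)]

full = top_n(rows, 2)                # every row contributes to the sums
partial = top_n(rows, 2, row_cap=3)  # the scan stops after three rows

print(full)
print(partial)
```

With the cap, g3 loses most of its total and drops out of the top 2, so both the ranking and the sums differ from the full scan.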

*If you're seeking 100% correctness, after running the query you will find
a notification:*
*Note: Current results are partial, please click 'Show All' button to get
all results.*
*Click the "Show All" button to disable the "AcceptPartialResults" mode,
and you'll get the correct result.*

*Note that "AcceptPartialResults" is only enabled by default in the web client;
you won't run into such problems if you're using JDBC, ODBC or the standard REST
API.*
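
For callers going through the REST API, here is a sketch of a request body with partial results turned off. The field names (sql, offset, limit, acceptPartial, project) follow the query API as documented for later Apache Kylin releases; whether 0.6.6 accepts exactly the same fields is an assumption, and the project name and limit below are hypothetical.

```python
import json


def build_query_payload(sql, project, limit=50000):
    """Build a query request body that asks for complete (non-partial) results."""
    return {
        "sql": sql,
        "offset": 0,
        "limit": limit,            # hypothetical cap on the number of returned rows
        "acceptPartial": False,    # assumed flag name: do not accept partial results
        "project": project,
    }


payload = build_query_payload(
    "select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt "
    "from report_group_day_yoy as rstmy group by rstmy.GROUPID",
    project="my_project",  # hypothetical project name
)

# The JSON text would be sent as the body of a POST to the server's query
# endpoint (assumed to be /kylin/api/query) with HTTP basic authentication.
print(json.dumps(payload, indent=2))
```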

if this does not solve your problem, please paste your log at
KYLIN_HOME/logs/kylin.log and let us check more details


On Fri, Apr 24, 2015 at 7:47 PM, nce <[email protected]> wrote:

> Hi, thanks for your response. I have another question about query in kylin.
>
> After successfully building the cube, I did my query:
>
> select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014'
>  , '00010000020006' , '00010000020030' , '00010000020029' ,
> '00010000020015' , '00010000020005' , '00010000020027' , '00010000020001' ,
> '00010000020023' , '00010000020003' , '00010000020026' , '00010000020031' ,
> '00010000020025' , '00010000020011' , '00010000020022' ) group by
> rstmy.GROUPID order by sum(rstmy.TRANS_AT) desc limit 10
> And it gave the following result:
> ID,TRANSAT
> 00010000020007,4.189393142999999E7
> 00010000020015,4.077921398E7
> 00010000020001,2.0328372849999998E7
> 00010000020009,1.248375287E7
> 00010000020017,1.1806773620000001E7
> 00010000020002,1.175256495E7
> 00010000020004,1.13462425E7
> 00010000020020,1.006050126E7
> 00010000020006,8660487.07
> 00010000020027,5094202.02
>
> Actually, this result is not accurate: it differs from the Hive query
> result. The top 9 are right but not the 10th; 7421355.34 was
> supposed to be 10th, not 5094202.02.
>
> What's more, I deleted the ID 00010000020007 from the "IN" list to get
> the following SQL:
> select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014' ,
> '00010000020006' , '00010000020030' , '00010000020029' , '00010000020015' ,
> '00010000020005' , '00010000020027' , '00010000020001' , '00010000020023' ,
> '00010000020003' , '00010000020026' , '00010000020031' , '00010000020025' ,
> '00010000020011' , '00010000020022' ) group by rstmy.GROUPID order by
>  sum(rstmy.TRANS_AT) desc limit 10
>
> So I expected that the top 1 would not appear in the result and the 10th
> would be 9th this time. But that was not the case, as shown below:
> ID,TRANSAT
> 00010000020015,4.077921398E7
> 00010000020001,2.0328372849999998E7
> 00010000020009,1.248375287E7
> 00010000020017,1.1806773620000001E7
> 00010000020002,1.175256495E7
> 00010000020004,1.13462425E7
> 00010000020020,1.006050126E7
> 00010000020006,8660487.07
> 00010000020023,6543811.8100000005
> 00010000020027,5094202.02
>
> The 10th is still the 10th, but a new 9th is inserted.
> Have you ever met bugs like this? If so, is there any fix now?
>
> Thanks very much!
>
> George Ni
>
>
> At 2015-04-24 18:55:44, "hongbin ma" <[email protected]> wrote:
>
> forward your question to dev list.
>
> select * is not very well supported in Kylin, because its result is
> inaccurate given that Kylin has only pre-aggregated data (i.e. the cube)
>
> 2015-04-24 16:25 GMT+08:00 nce <[email protected]>:
>
>> Hi hongbin,
>>
>> While using Kylin I ran into a problem similar to
>> https://github.com/KylinOLAP/Kylin/issues/363
>> Its status is closed, but I could not find the corresponding patch on GitHub.
>> Which commit fixed it? Or where is the root cause in the code?
>>
>> Thanks!
>>
>> George Ni
>>
>>
>>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
>
>
>


-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone






