Forwarding this problem to the dev mailing list for further discussion. @nce, please also reply on the dev list so that more participants can join in.

---------- Forwarded message ----------
From: nce <[email protected]>
Date: Mon, Apr 27, 2015 at 3:50 PM
Subject: Re:Re: Re: Kylin issue 363
To: "[email protected]" <[email protected]>

Hi,

Thanks for your response. I have checked the issue you mentioned in your letter, but I suspect the problem I met is different from his. Please see my query below (it was run on Kylin against a built cube):

select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt
from report_group_day_yoy as rstmy
where rstmy.theday >= '2014-12-21' and rstmy.theday <= '2014-12-31'
group by rstmy.GROUPID

With this query I got 2808 rows of results, which is correct; I got the same result with Hive. But the problem came after I added "IN". With the query:

select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt
from report_group_day_yoy as rstmy
where rstmy.theday >= '2014-12-21' and rstmy.theday <= '2014-12-31'
and rstmy.GROUPID in (
  '00010000020033', '00010000020012', '00010000020002', '00010000020013',
  '00010000020028', '00010000020019', '00010000020020', '00010000020024',
  '00010000020021', '00010000020004', '00010000020016', '00010000020009',
  '00010000020017', '00010000020018', '00010000020008', '00010000020010',
  '00010000020014', '00010000020006', '00010000020030', '00010000020029',
  '00010000020015', '00010000020005', '00010000020027', '00010000020001',
  '00010000020023', '00010000020003', '00010000020026', '00010000020031',
  '00010000020025', '00010000020011', '00010000020022')
group by rstmy.GROUPID

I got 24 rows of results, but with Hive I got 26 rows. After checking the 2 missing rows, I was sure that Kylin lost them. My Kylin version is 0.6.6. Is this because Kylin doesn't support "IN" very well? Is there any fix now?

Thanks,
George Ni

At 2015-04-27 10:29:22, "hongbin ma" <[email protected]> wrote:

hi,

this might be similar to a previous user's question: "get wrong result for the query(both in kyin-0.71 and kylin-0.6x)" (find the context at http://mail-archives.apache.org/mod_mbox/kylin-dev/201503.mbox/thread)

in short:

*By default if you're making queries on the web client, a mode called "AcceptPartialResults" is enabled, this is a protection mechanism that will only return part of the results to reduce server overhead. Honestly it might hurt the correctness of order by queries.*

*If you're seeking 100% correctness, after running the query you will find a notification:*

*Note: Current results are partial, please click 'Show All' button to get all results.*

*Click the "Show All" button to disable the "AcceptPartialResults" mode, and you'll get a right result.*

*Notice "AcceptPartialResults" is only enabled by default at web client, you'll not meet such problems if you're using JDBC, ODBC or standard REST API.*

if this does not solve your problem, please paste your log at KYLIN_HOME/logs/kylin.log and let us check more details

On Fri, Apr 24, 2015 at 7:47 PM, nce <[email protected]> wrote:

> Hi, thanks for your response. I have another question about query in kylin.
>
> After successfully building the cube, I ran my query:
>
> select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt
> from report_group_day_yoy as rstmy
> where rstmy.theday >= '2014-12-21' and rstmy.theday <= '2014-12-31'
> and rstmy.GROUPID in (
>   '00010000020033', '00010000020012', '00010000020002', '00010000020013',
>   '00010000020028', '00010000020019', '00010000020020', '00010000020024',
>   '00010000020021', '00010000020004', '00010000020016', '00010000020009',
>   '00010000020017', '00010000020018', '00010000020008', '00010000020010',
>   '00010000020014', '00010000020006', '00010000020030', '00010000020029',
>   '00010000020015', '00010000020005', '00010000020027', '00010000020001',
>   '00010000020023', '00010000020003', '00010000020026', '00010000020031',
>   '00010000020025', '00010000020011', '00010000020022')
> group by rstmy.GROUPID
> order by sum(rstmy.TRANS_AT) desc limit 10
>
> It gave the following result:
>
> ID, TRANSAT
> 00010000020007,4.189393142999999E7
> 00010000020015,4.077921398E7
> 00010000020001,2.0328372849999998E7
> 00010000020009,1.248375287E7
> 00010000020017,1.1806773620000001E7
> 00010000020002,1.175256495E7
> 00010000020004,1.13462425E7
> 00010000020020,1.006050126E7
> 00010000020006,8660487.07
> 00010000020027,5094202.02
>
> This result is not accurate; it differs from the Hive query result. The
> top 9 are right but not the 10th: 7421355.34 was supposed to be 10th,
> not 5094202.02.
>
> What's more, I deleted the ID 00010000020007 from the "IN" list to get
> the following SQL:
>
> select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt
> from report_group_day_yoy as rstmy
> where rstmy.theday >= '2014-12-21' and rstmy.theday <= '2014-12-31'
> and rstmy.GROUPID in (
>   '00010000020033', '00010000020012', '00010000020002', '00010000020013',
>   '00010000020028', '00010000020019', '00010000020020', '00010000020024',
>   '00010000020021', '00010000020004', '00010000020016', '00010000020009',
>   '00010000020017', '00010000020018', '00010000020008', '00010000020010',
>   '00010000020014', '00010000020006', '00010000020030', '00010000020029',
>   '00010000020015', '00010000020005', '00010000020027', '00010000020001',
>   '00010000020023', '00010000020003', '00010000020026', '00010000020031',
>   '00010000020025', '00010000020011', '00010000020022')
> group by rstmy.GROUPID
> order by sum(rstmy.TRANS_AT) desc limit 10
>
> I expected that the top 1 would no longer appear in the result and that
> the 10th would become the 9th this time. But that was not the case, as
> shown below:
>
> ID,TRANSAT
> 00010000020015,4.077921398E7
> 00010000020001,2.0328372849999998E7
> 00010000020009,1.248375287E7
> 00010000020017,1.1806773620000001E7
> 00010000020002,1.175256495E7
> 00010000020004,1.13462425E7
> 00010000020020,1.006050126E7
> 00010000020006,8660487.07
> 00010000020023,6543811.8100000005
> 00010000020027,5094202.02
>
> The 10th is still the 10th, but a new 9th has been inserted.
> Have you ever met bugs like this? If so, is there any fix now?
>
> Thanks very much!
>
> George Ni
>
> At 2015-04-24 18:55:44, "hongbin ma" <[email protected]> wrote:
>
> forward your question to dev list.
>
> select * is not very well supported in Kylin, because its result is
> inaccurate given that Kylin has only pre-aggregated data (i.e. the cube)
>
> 2015-04-24 16:25 GMT+08:00 nce <[email protected]>:
>
>> Hi hongbin,
>>
>> While using Kylin I ran into a problem similar to
>> https://github.com/KylinOLAP/Kylin/issues/363
>> Its status is already closed, but I could not find the corresponding
>> patch on GitHub. Which commit fixed it? Or where in the code is the
>> root cause?
>>
>> Thanks!
>>
>> George Ni
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone

--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
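
The thread notes that "AcceptPartialResults" is only on by default in the web client and that the standard REST API does not have this problem. As a rough sketch of what a REST query with partial results explicitly turned off might look like, the Python below only builds the request and does not send it. It assumes the query endpoint is POST /kylin/api/query and that the JSON body accepts an "acceptPartial" field, as described in Kylin's REST documentation for later releases; whether 0.6.6 behaves identically is not confirmed here. The host name, credentials, and project name are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_kylin_query(base_url, user, password, sql, project, limit=50000):
    """Build (but do not send) a POST request for Kylin's query endpoint.

    Assumption: the endpoint path and the body field names follow Kylin's
    published REST API; verify them against the version you are running.
    """
    payload = {
        "sql": sql,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,  # ask for complete results, not a partial set
        "project": project,
    }
    # HTTP Basic authentication header built from the supplied credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


# Hypothetical host, credentials, and project; the SQL is the first query
# from the thread (the one without the IN list).
request = build_kylin_query(
    "http://kylin.example.com:7070",
    "ADMIN",
    "KYLIN",
    "select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt "
    "from report_group_day_yoy as rstmy "
    "where rstmy.theday >= '2014-12-21' and rstmy.theday <= '2014-12-31' "
    "group by rstmy.GROUPID",
    project="my_project",
)
```

Passing the returned object to urllib.request.urlopen would actually submit the query. Comparing the row count of that response with the Hive result is one way to check whether the missing rows come from partial results or from the "IN" handling itself.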
