I'll try to reproduce this locally first.

On Mon, Apr 27, 2015 at 5:10 PM, hongbin ma <[email protected]> wrote:
> forward this problem to dev mail list for further discussion,
> @nce, please also reply to the dev list to engage more participants
>
> ---------- Forwarded message ----------
> From: nce <[email protected]>
> Date: Mon, Apr 27, 2015 at 3:50 PM
> Subject: Re:Re: Re: Kylin issue 363
> To: "[email protected]" <[email protected]>
>
>
> Hi,
> Thanks for your response. I have checked the issue you mentioned in your
> letter, but I suspect the problem I met is different from his.
> Please see my query below (it was run on Kylin against a built cube):
> select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> rstmy.theday <='2014-12-31' group by rstmy.GROUPID
> With this query, I got 2808 rows of results, which is correct; I got the
> same result with HIVE.
> But the problem came after I added "IN". With the query:
> select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014' ,
> '00010000020006' , '00010000020030' , '00010000020029' ,
> '00010000020015' , '00010000020005' , '00010000020027' , '00010000020001' ,
> '00010000020023' , '00010000020003' , '00010000020026' , '00010000020031' ,
> '00010000020025' , '00010000020011' , '00010000020022' ) group by
> rstmy.GROUPID
> I got 24 rows of results, but with HIVE, I got 26 rows. After checking the 2
> rows, I was sure that Kylin lost the 2 rows.
> My Kylin version is 0.6.6. Is this because Kylin doesn't support "IN" very
> well? Any fix now?
>
> Thanks,
>
> George Ni
>
>
> At 2015-04-27 10:29:22, "hongbin ma" <[email protected]> wrote:
>
> hi,
>
> this might be similar to a previous user's question: "get wrong result for
> the query(both in kyin-0.71 and kylin-0.6x)"
> (find the context at
> http://mail-archives.apache.org/mod_mbox/kylin-dev/201503.mbox/thread)
>
> in short:
> *By default, if you're making queries on the web client, a mode called
> "AcceptPartialResults" is enabled. This is a protection mechanism that will
> only return part of the results to reduce server overhead. Honestly, it
> might hurt the correctness of order by queries.*
>
> *If you're seeking 100% correctness, after running the query you will find
> a notification:*
> *Note: Current results are partial, please click 'Show All' button to get
> all results.*
> *Click the "Show All" button to disable the "AcceptPartialResults" mode,
> and you'll get the correct result.*
>
> *Notice that "AcceptPartialResults" is only enabled by default in the web
> client; you will not hit such problems if you're using JDBC, ODBC or the
> standard REST API.*
>
> if this does not solve your problem, please paste your log at
> KYLIN_HOME/logs/kylin.log and let us check more details
>
>
> On Fri, Apr 24, 2015 at 7:47 PM, nce <[email protected]> wrote:
>
> > Hi, thanks for your response. I have another question about queries in
> > Kylin.
> >
> > After successfully building the cube, I did my query:
> >
> > select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> > report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> > rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> > '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> > '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> > '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> > '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014' ,
> > '00010000020006' , '00010000020030' , '00010000020029' ,
> > '00010000020015' , '00010000020005' , '00010000020027' , '00010000020001' ,
> > '00010000020023' , '00010000020003' , '00010000020026' , '00010000020031' ,
> > '00010000020025' , '00010000020011' , '00010000020022' ) group by
> > rstmy.GROUPID order by sum(rstmy.TRANS_AT) desc limit 10
> > And it gave the result as
> > ID, TRANSAT
> > 00010000020007,4.189393142999999E7
> > 00010000020015,4.077921398E7
> > 00010000020001,2.0328372849999998E7
> > 00010000020009,1.248375287E7
> > 00010000020017,1.1806773620000001E7
> > 00010000020002,1.175256495E7
> > 00010000020004,1.13462425E7
> > 00010000020020,1.006050126E7
> > 00010000020006,8660487.07
> > 00010000020027,5094202.02
> >
> > Actually, this result is not accurate: it differs from the Hive query
> > result. The top 9 are right but not the 10th; 7421355.34 was
> > supposed to be 10th, not 5094202.02.
> >
> > What's more, I deleted the ID 00010000020007 from the "IN" list to get
> > the following SQL:
> > select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> > report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> > rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> > '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> > '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> > '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> > '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014' ,
> > '00010000020006' , '00010000020030' , '00010000020029' , '00010000020015' ,
> > '00010000020005' , '00010000020027' , '00010000020001' , '00010000020023' ,
> > '00010000020003' , '00010000020026' , '00010000020031' , '00010000020025' ,
> > '00010000020011' , '00010000020022' ) group by rstmy.GROUPID order by
> > sum(rstmy.TRANS_AT) desc limit 10
> >
> > So I expected that the top 1 would not appear in the result and the 10th
> > would become the 9th this time. But that was not the case, as shown below:
> > ID,TRANSAT
> > 00010000020015,4.077921398E7
> > 00010000020001,2.0328372849999998E7
> > 00010000020009,1.248375287E7
> > 00010000020017,1.1806773620000001E7
> > 00010000020002,1.175256495E7
> > 00010000020004,1.13462425E7
> > 00010000020020,1.006050126E7
> > 00010000020006,8660487.07
> > 00010000020023,6543811.8100000005
> > 00010000020027,5094202.02
> >
> > The 10th is still the 10th, but a new 9th is inserted.
> > Have you ever met bugs like this? If so, is there any fix now?
> >
> > Thanks very much!
> >
> > George Ni
> >
> >
> > At 2015-04-24 18:55:44, "hongbin ma" <[email protected]> wrote:
> >
> > forward your question to dev list.
> >
> > select * is not very well supported in Kylin, because its result is
> > inaccurate given that Kylin has only pre-aggregated data (i.e. the cube)
> >
> > 2015-04-24 16:25 GMT+08:00 nce <[email protected]>:
> >
> >> Hi hongbin,
> >>
> >> While using Kylin I ran into a problem similar to
> >> https://github.com/KylinOLAP/Kylin/issues/363
> >> Its status is already closed, but I could not find the corresponding
> >> patch on GitHub.
> >> Which commit fixed it? Or where is the root cause in the code?
> >>
> >> Thanks!
> >>
> >> George Ni
> >>
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
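
For anyone else comparing the two engines: it may help to pin down exactly which GROUPIDs are missing or differ, rather than only counting rows. Below is a minimal sketch in Python, assuming both result sets have been exported as CSV text with an ID,TRANSAT header like the output quoted above. The helper names (load_sums, diff_results) and the sample rows A/B/C are placeholders made up for illustration; they are not real data from the cube and not part of Kylin or Hive.

```python
import csv
import io
import math


def load_sums(csv_text):
    """Parse 'ID,TRANSAT' CSV text into a dict {group_id: sum_as_float}."""
    rows = [r for r in csv.reader(io.StringIO(csv_text.strip())) if r]
    # rows[0] is the header line, so it is skipped
    return {r[0].strip(): float(r[1]) for r in rows[1:]}


def diff_results(kylin, hive, rel_tol=1e-9):
    """Compare two {group_id: sum} dicts.

    Returns (missing, extra, mismatched):
      missing    - ids present in hive but absent from kylin
      extra      - ids present in kylin but absent from hive
      mismatched - ids present in both whose sums differ beyond rel_tol
    """
    missing = sorted(set(hive) - set(kylin))
    extra = sorted(set(kylin) - set(hive))
    mismatched = sorted(
        k for k in set(kylin) & set(hive)
        if not math.isclose(kylin[k], hive[k], rel_tol=rel_tol)
    )
    return missing, extra, mismatched


if __name__ == "__main__":
    # Placeholder rows only, to show the call shape.
    kylin = load_sums("ID,TRANSAT\nA,1.0\nB,2.5\n")
    hive = load_sums("ID,TRANSAT\nA,1.0\nB,2.0\nC,3.0\n")
    missing, extra, mismatched = diff_results(kylin, hive)
    print("missing from kylin:", missing)
    print("only in kylin:", extra)
    print("sum differs:", mismatched)
```

The sums are compared with a relative tolerance because values such as 4.189393142999999E7 in the quoted output suggest floating-point rounding, so an exact equality check could report false differences.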

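To rule out the "AcceptPartialResults" mode mentioned in the quoted reply, the same SQL could also be sent through the REST API instead of the web client. Here is a rough sketch in Python that only builds the request and does not send it. The endpoint path /kylin/api/query and the acceptPartial field are as I remember them from the Kylin REST documentation, so please treat them as assumptions and check them against the installed version; the host, project name and credentials are placeholders.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, sql, project, user, password, limit=50000):
    """Build (but do not send) a POST request for a Kylin SQL query.

    Assumptions to verify against your Kylin version: the endpoint path,
    the JSON field names, and HTTP Basic authentication.
    """
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        # assumed field name: ask the server for full, not partial, results
        "acceptPartial": False,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host, project and credentials.
    req = build_query_request(
        "http://localhost:7070",
        "select rstmy.GROUPID as id, sum(rstmy.TRANS_AT) as transAt "
        "from report_group_day_yoy as rstmy group by rstmy.GROUPID",
        project="my_project",
        user="USER",
        password="PASSWORD",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually run it: urllib.request.urlopen(req)
```

If the row count from this path matches Hive while the web client does not, that would point at the partial-results mode rather than at "IN" handling.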