I'll try to reproduce this locally first.
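
For the IN-list case quoted below (24 rows from Kylin versus 26 from Hive), a minimal sketch of how the two result sets could be compared once both are exported as id-to-total mappings. The function name, the tolerance and the sample data are made up for illustration; they are not taken from the real tables.

```python
def diff_results(hive_rows, kylin_rows, tolerance=1e-6):
    """Compare two {group_id: total} mappings (hypothetical export format).

    Returns the ids missing from Kylin, the ids only Kylin returned, and the
    ids present in both whose totals differ by more than a relative tolerance.
    """
    hive_ids = set(hive_rows)
    kylin_ids = set(kylin_rows)
    missing = sorted(hive_ids - kylin_ids)
    extra = sorted(kylin_ids - hive_ids)
    mismatched = sorted(
        gid
        for gid in hive_ids & kylin_ids
        if abs(hive_rows[gid] - kylin_rows[gid])
        > tolerance * max(1.0, abs(hive_rows[gid]))
    )
    return missing, extra, mismatched


# Made-up sample data, only to show the shape of the comparison.
hive = {"A": 10.0, "B": 20.0, "C": 30.0}
kylin = {"A": 10.0, "C": 30.5}
missing, extra, mismatched = diff_results(hive, kylin)
print("missing from Kylin:", missing)
print("only in Kylin:", extra)
print("different totals:", mismatched)
```

The ids reported as missing are the ones worth looking up in kylin.log.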

On Mon, Apr 27, 2015 at 5:10 PM, hongbin ma <[email protected]> wrote:

> Forwarding this problem to the dev mailing list for further discussion.
> @nce, please also reply to the dev list to engage more participants.
>
> ---------- Forwarded message ----------
> From: nce <[email protected]>
> Date: Mon, Apr 27, 2015 at 3:50 PM
> Subject: Re:Re: Re: Kylin issue 363
> To: "[email protected]" <[email protected]>
>
>
> Hi,
> Thanks for your response. I have checked the issue you mentioned in your
> letter, but I suspect the problem I ran into is different from his.
> Please see my query below (it was run on Kylin against a built cube):
> select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> rstmy.theday <='2014-12-31'  group by rstmy.GROUPID
> With this query I got 2808 rows, which is correct: Hive returns the same
> result.
> But the problem appeared after I added an "IN" clause. With the query:
> select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014'
>  , '00010000020006' , '00010000020030' , '00010000020029' ,
> '00010000020015' , '00010000020005' , '00010000020027' , '00010000020001' ,
> '00010000020023' , '00010000020003' , '00010000020026' , '00010000020031' ,
> '00010000020025' , '00010000020011' , '00010000020022' ) group by
> rstmy.GROUPID
> I got 24 rows from Kylin, but 26 rows from Hive. After checking the 2
> extra rows, I am sure that Kylin dropped them.
> My Kylin version is 0.6.6. Is this because Kylin doesn't support "IN" very
> well? Is there a fix available?
>
> Thanks,
>
> George Ni
>
>
> At 2015-04-27 10:29:22, "hongbin ma" <[email protected]> wrote:
>
> hi,
>
> this might be similar to a previous user's question: "get wrong result for
> the query(both in kyin-0.71 and kylin-0.6x)"
> (find the context at
> http://mail-archives.apache.org/mod_mbox/kylin-dev/201503.mbox/thread)
>
> in short:
> *By default, if you're making queries on the web client, a mode called
> "AcceptPartialResults" is enabled. This is a protection mechanism that
> returns only part of the results to reduce server overhead, and it can
> hurt the correctness of order by queries.*
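
To illustrate why a partial scan can change an "order by ... limit" answer, here is a toy model. It is not Kylin's implementation: the scan_limit parameter is a hypothetical stand-in for a mode that stops reading rows early, and the rows are made up.

```python
from collections import defaultdict


def top_n(rows, n, scan_limit=None):
    """Sum the amount per group id and return the n largest totals.

    scan_limit is a hypothetical stand-in for a partial-result mode:
    when set, only the first scan_limit rows are read before aggregating.
    """
    scanned = rows if scan_limit is None else rows[:scan_limit]
    totals = defaultdict(float)
    for gid, amount in scanned:
        totals[gid] += amount
    # Sort by total, largest first, then keep the first n groups.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up rows: g3 only becomes the largest group once its second row is read.
rows = [("g1", 5.0), ("g2", 4.0), ("g3", 1.0), ("g3", 9.0), ("g4", 3.0)]
full = top_n(rows, 2)
partial = top_n(rows, 2, scan_limit=3)
print("full scan:   ", full)
print("partial scan:", partial)
```

With the full scan g3 ranks first; with the truncated scan it drops out of the top 2 entirely, because its total is computed from incomplete input.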
>
> *If you're seeking 100% correctness, after running the query you will find
> a notification:*
> *Note: Current results are partial, please click 'Show All' button to get
> all results.*
> *Click the "Show All" button to disable the "AcceptPartialResults" mode,
> and you'll get the correct result.*
>
> *Note that "AcceptPartialResults" is only enabled by default in the web
> client; you won't hit this problem if you're using JDBC, ODBC or the
> standard REST API.*
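
A sketch of what a REST query with partial results switched off might look like. The endpoint path (/kylin/api/query) and the body fields (sql, offset, limit, acceptPartial, project) are as I recall them from Kylin's query API documentation for later releases; whether 0.6.6 accepts exactly the same request is an assumption. The host, credentials and project name are placeholders.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, user, password, sql, project, limit=50000):
    """Build (but do not send) a POST request for the query endpoint.

    Assumption: the server accepts a JSON body with an "acceptPartial"
    flag and HTTP basic authentication.
    """
    payload = {
        "sql": sql,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,  # ask for the complete result set
        "project": project,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host, credentials and project name.
req = build_query_request(
    "http://localhost:7070",
    "user",
    "secret",
    "select GROUPID, sum(TRANS_AT) from report_group_day_yoy group by GROUPID",
    "my_project",
)
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing req to urllib.request.urlopen and decoding the JSON response.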
>
> if this does not solve your problem, please paste your log from
> KYLIN_HOME/logs/kylin.log and let us check more details
>
>
> On Fri, Apr 24, 2015 at 7:47 PM, nce <[email protected]> wrote:
>
> > Hi, thanks for your response. I have another question about queries in
> > Kylin.
> >
> > After successfully building the cube, I did my query:
> >
> > select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> > report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> > rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> > '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> > '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> > '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> > '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014' ,
> > '00010000020006' , '00010000020030' , '00010000020029' , '00010000020015' ,
> > '00010000020005' , '00010000020027' , '00010000020001' , '00010000020023' ,
> > '00010000020003' , '00010000020026' , '00010000020031' , '00010000020025' ,
> > '00010000020011' , '00010000020022' ) group by
> > rstmy.GROUPID order by sum(rstmy.TRANS_AT) desc limit 10
> > and got the following result:
> > ID,TRANSAT
> > 00010000020007,4.189393142999999E7
> > 00010000020015,4.077921398E7
> > 00010000020001,2.0328372849999998E7
> > 00010000020009,1.248375287E7
> > 00010000020017,1.1806773620000001E7
> > 00010000020002,1.175256495E7
> > 00010000020004,1.13462425E7
> > 00010000020020,1.006050126E7
> > 00010000020006,8660487.07
> > 00010000020027,5094202.02
> >
> > This result is not accurate: it differs from the Hive query result. The
> > top 9 rows are right but the 10th is not; 7421355.34 should be 10th,
> > not 5094202.02.
> >
> > What's more, I deleted the ID 00010000020007 from the "IN" list to get
> > the following SQL:
> > select rstmy.GROUPID as id,sum(rstmy.TRANS_AT) as transAt from
> > report_group_day_yoy as rstmy where rstmy.theday >= '2014-12-21' and
> > rstmy.theday <='2014-12-31' and rstmy.GROUPID in ('00010000020033' ,
> > '00010000020012' , '00010000020002' , '00010000020013' , '00010000020028' ,
> > '00010000020019' , '00010000020020' , '00010000020024' , '00010000020021' ,
> > '00010000020004' , '00010000020016' , '00010000020009' , '00010000020017' ,
> > '00010000020018' , '00010000020008' , '00010000020010' , '00010000020014' ,
> > '00010000020006' , '00010000020030' , '00010000020029' , '00010000020015' ,
> > '00010000020005' , '00010000020027' , '00010000020001' , '00010000020023' ,
> > '00010000020003' , '00010000020026' , '00010000020031' , '00010000020025' ,
> > '00010000020011' , '00010000020022' ) group by rstmy.GROUPID order by
> > sum(rstmy.TRANS_AT) desc limit 10
> >
> > So I expected the top 1 to disappear from the result and the old 10th to
> > become 9th this time. But that is not what happened, as shown below:
> > ID,TRANSAT
> > 00010000020015,4.077921398E7
> > 00010000020001,2.0328372849999998E7
> > 00010000020009,1.248375287E7
> > 00010000020017,1.1806773620000001E7
> > 00010000020002,1.175256495E7
> > 00010000020004,1.13462425E7
> > 00010000020020,1.006050126E7
> > 00010000020006,8660487.07
> > 00010000020023,6543811.8100000005
> > 00010000020027,5094202.02
> >
> > The 10th is still the 10th, but a new 9th has been inserted.
> > Have you ever seen bugs like this? If so, is there a fix now?
> >
> > Thanks very much!
> >
> > George Ni
> >
> >
> > At 2015-04-24 18:55:44, "hongbin ma" <[email protected]> wrote:
> >
> > Forwarding your question to the dev list.
> >
> > select * is not very well supported in Kylin, because its result is
> > inaccurate given that Kylin has only pre-aggregated data (i.e. the cube).
> >
> > 2015-04-24 16:25 GMT+08:00 nce <[email protected]>:
> >
> >> Hi hongbin,
> >>
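> >> While using Kylin I ran into a problem similar to
> >> https://github.com/KylinOLAP/Kylin/issues/363
> >> The issue is marked closed, but I could not find the corresponding patch
> >> on GitHub. Which commit fixed it? Or where is the root cause in the code?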
> >>
> >> Thanks!
> >>
> >> George Ni
> >>
> >>
> >>
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
> >
> >
> >
> >
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
>
>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
