This is a bug: https://issues.apache.org/jira/browse/KYLIN-1053
As a workaround for now, you can use dictionary encoding for all non-string
dimensions.
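
For anyone following along, here is a minimal sketch of why the swapped scan
quoted below returns rows while the original one returns none. It is based only
on the scan keys in Vadim's message and assumes that a non-dictionary integer
dimension lands in the row key as its decimal text, padded with \x09 to a fixed
length (6 bytes in those keys). The names encode_dim and scan_range are
hypothetical and used for illustration only; this is not Kylin code. HBase
compares row keys byte by byte, so "70" sorts after "200".

```python
# Hypothetical helpers that mimic the \x09-padded text encoding visible in the
# scan keys quoted in this thread. This is an illustration, not Kylin's code.
PAD = b"\x09"


def encode_dim(value: int, length: int = 6) -> bytes:
    """Encode an integer as ASCII digits, right-padded with \\x09 bytes."""
    raw = str(value).encode("ascii")
    return raw + PAD * (length - len(raw))


def scan_range(values, length: int = 6):
    """Pick start/stop by numeric min/max, then encode them as row key parts."""
    return encode_dim(min(values), length), encode_dim(max(values), length)


start, stop = scan_range([70, 200])

# Python compares bytes lexicographically, the same way HBase orders row keys.
# b"7" (0x37) is greater than b"2" (0x32), so the start key sorts after the
# stop key and the range start..stop is empty.
print(start > stop)  # True
```

With a dictionary, the row key holds a dictionary ID rather than the raw
digits, which is why the workaround above avoids the problem.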

Thanks for your observations and report; this is really helpful.


On Tue, Sep 29, 2015 at 5:49 PM, Vadim Semenov <[email protected]> wrote:

> integer,
>
> and the cube was built without using dictionaries.
>
> On September 29, 2015 at 5:45:49 AM, hongbin ma ([email protected])
> wrote:
>
> interesting observation
>
> what is the data type of the dimension?
>
> On Tue, Sep 29, 2015 at 5:40 PM, Vadim Semenov <[email protected]> wrote:
>
> > I found something:
> >
> > I executed a query on one partition with a filter on one dimension with
> > two values (70, 200) and got the following in the logs:
> > http://i.imgur.com/b3zthZ3.png
> >
> > You see the scan range and 0 rows in the result set.
> >
> > I tried the same scan in hbase shell and got 0 rows:
> > scan 'KYLIN_JH6RCA65S7', {STARTROW => "\x00\x00\x00\x00\x00\x00\x01
> > 2015-09-2670\x09\x09\x09\x09", STOPROW => "\x00\x00\x00\x00\x00\x00\x01
> > 2015-09-26200\x09\x09\x09\x00"}
> >
> > then I swapped the start & stop row:
> > scan 'KYLIN_JH6RCA65S7', {STARTROW => "\x00\x00\x00\x00\x00\x00\x01
> > 2015-09-26200\x09\x09\x09", STOPROW => "\x00\x00\x00\x00\x00\x00\x01
> > 2015-09-2670\x09\x09\x09\x09\x00"}
> > and got 659 rows.
> >
> > And just to confirm I changed the order of the stop & start row in
> > HBaseKeyRange and got the results:
> > http://i.imgur.com/mT0qI4I.png
> >
> > On September 28, 2015 at 11:19:41 PM, Li Yang ([email protected]) wrote:
> > I too cannot reproduce this one. Tried IN() on the LSTG_SITE_ID column of
> > TEST_KYLIN_FACT (the test cube used by regression). Everything is good.
> > The query I used:
> >
> > select LSTG_SITE_ID, sum(price) as GMV
> > from test_kylin_fact
> > inner JOIN edw.test_cal_dt as test_cal_dt
> > ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
> > where test_cal_dt.week_beg_dt between DATE '2013-09-01' and DATE
> > '2013-10-01' and LSTG_SITE_ID in (0, 3, 15, 23, 100)
> > group by LSTG_SITE_ID
> >
> > It is meant to be similar to Vadim's query: it has a date condition, and
> > LSTG_SITE_ID is of int type. I tested many combinations of the ID values,
> > and all results are correct...
> >
> > Anyone else tried similar queries on 0.7 or 1.x releases?
> >
> >
> >
> > On Thu, Sep 24, 2015 at 1:28 PM, hongbin ma <[email protected]> wrote:
> >
> > > It would help if you could provide just some sample data and make sure
> > > the issue is reproducible on that sample data.
> > >
> > > On Thu, Sep 24, 2015 at 12:28 PM, vipul jhawar <[email protected]> wrote:
> > >
> > > > hi
> > > >
> > > > For the cube
> > > >
> > > > JSON model is
> > > > http://www.jsoneditoronline.org/?id=1da42fe018b14c7522d3937ba81cd37e
> > > > JSON cube is
> > > > http://www.jsoneditoronline.org/?id=97008080361a2388888acd128004753d
> > > >
> > > > As for minimal data for the cube: even if we generate it, how would we
> > > > share it with you? Or do you want a sample fact table?
> > > > If you want, I can show you the Kylin query running against our current
> > > > hosted cube.
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Sep 23, 2015 at 1:54 PM, hongbin ma <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We cannot reproduce it with our test cases.
> > > > >
> > > > > However, we'd love to help analyze the problem for you. If possible,
> > > > > can you please try to use a minimal cube definition (maybe with a
> > > > > little sample data) that will pinpoint the issue?
> > > > >
> > > > > The cube desc consists of three parts:
> > > > > 1. Cube Desc (json file)
> > > > > 2. Model Desc (json file)
> > > > > 3. Hive table schema
> > > > >
> > > > > The first two files can be checked directly in the Kylin web UI: go to
> > > > > the "Cubes" tab, click the cube, and check the contents of
> > > > > "Json(Cube)" and "Json(Model)".
> > > > >
> > > > >
> > > > > On Wed, Sep 23, 2015 at 3:11 PM, hongbin ma <[email protected]> wrote:
> > > > >
> > > > > > hi vipul
> > > > > >
> > > > > > I'm looking into this
> > > > > >
> > > > > > On Wed, Sep 23, 2015 at 3:10 PM, vipul jhawar <[email protected]> wrote:
> > > > > >
> > > > > >> hi
> > > > > >>
> > > > > > >> Just wanted to check if someone has had a chance to look at this case.
> > > > > >>
> > > > > >> Thanks
> > > > > >>
> > > > > > >> On Tue, Sep 22, 2015 at 10:33 PM, vipul jhawar <[email protected]> wrote:
> > > > > >>
> > > > > >> > Hi
> > > > > >> >
> > > > > > >> > Please let us know if you need more details on this, as it is
> > > > > > >> > affecting the results and it's not predictable.
> > > > > > >> > We are on 0.7.2
> > > > > >> >
> > > > > >> > Thanks
> > > > > >> >
> > > > > > >> > On Tue, Sep 22, 2015 at 10:04 AM, Vadim Semenov <[email protected]> wrote:
> > > > > >> >
> > > > > >> >> Hi,
> > > > > >> >>
> > > > > > >> >> We've found issues while running some queries:
> > > > > > >> >> they return inconsistent results, i.e. in some cases we don't get
> > > > > > >> >> any rows, and in other cases we get some rows but never all that we
> > > > > > >> >> expected to get.
> > > > > >> >>
> > > > > > >> >> I was able to pinpoint the queries, so here are the cases:
> > > > > >> >>
> > > > > > >> >> 1. SELECT dim, measure FROM table WHERE partition = one partition
> > > > > > >> >> AND dim IN (a,b,c) GROUP BY dim
> > > > > >> >> We get results for a,c only
> > > > > >> >> http://i.imgur.com/SZu6f2E.png
> > > > > >> >>
> > > > > >> >> 2. IN (b)
> > > > > >> >> We get results for b as expected
> > > > > >> >> http://i.imgur.com/8c8UMWj.png
> > > > > >> >>
> > > > > >> >> 3. IN (a,b)
> > > > > >> >> We don't get any results
> > > > > >> >> http://i.imgur.com/qIepe8d.png
> > > > > >> >>
> > > > > >> >> 4. IN (b,c)
> > > > > >> >> We get results for b,c as expected
> > > > > >> >> http://i.imgur.com/Qq6yuuS.png
> > > > > >> >>
> > > > > > >> >> We tried to run the queries with acceptPartial=false and with an
> > > > > > >> >> empty cache, and the issues are still the same.
> > > > > >> >>
> > > > > >> >> What should we do to debug this?
> > > > > >> >> What might be causing these issues?
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > >
> > > > > > *Bin Mahone | 马洪宾*
> > > > > > Apache Kylin: http://kylin.io
> > > > > > Github: https://github.com/binmahone
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > >
> > > > > *Bin Mahone | 马洪宾*
> > > > > Apache Kylin: http://kylin.io
> > > > > Github: https://github.com/binmahone
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > *Bin Mahone | 马洪宾*
> > > Apache Kylin: http://kylin.io
> > > Github: https://github.com/binmahone
> > >
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone