I found something.
I ran a query against a single partition with a filter on one dimension with
two values (70, 200) and got the following in the logs:
http://i.imgur.com/b3zthZ3.png
The log shows the scan range and 0 rows in the result set.
I ran the same scan in the hbase shell and also got 0 rows:
scan 'KYLIN_JH6RCA65S7', {STARTROW => "\x00\x00\x00\x00\x00\x00\x01
2015-09-2670\x09\x09\x09\x09", STOPROW => "\x00\x00\x00\x00\x00\x00\x01
2015-09-26200\x09\x09\x09\x00"}
Then I swapped the start and stop rows:
scan 'KYLIN_JH6RCA65S7', {STARTROW => "\x00\x00\x00\x00\x00\x00\x01
2015-09-26200\x09\x09\x09", STOPROW => "\x00\x00\x00\x00\x00\x00\x01
2015-09-2670\x09\x09\x09\x09\x00"}
and got 659 rows.
To confirm, I swapped the order of the start and stop rows in HBaseKeyRange
and the query returned results:
http://i.imgur.com/mT0qI4I.png
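
To illustrate why the first scan comes back empty, here is a minimal Python
sketch. It assumes the dimension values are written into the row key as plain
ASCII text, the way they appear in the scan commands above. The PREFIX bytes
and the row_key/scan helpers are made up for the example and are not taken
from the Kylin code; scan only mimics the start-inclusive, stop-exclusive,
byte-ordered behaviour of an HBase range scan.

```python
# Simplified row keys: a key prefix + date + dimension value as ASCII text.
# PREFIX is a stand-in for illustration, not the exact bytes Kylin writes.
PREFIX = b"\x00\x00\x00\x00\x00\x00\x01" + b"2015-09-26"


def row_key(value: int) -> bytes:
    """Hypothetical helper: append the value as text to the prefix."""
    return PREFIX + str(value).encode("ascii")


def scan(rows, start: bytes, stop: bytes):
    """Mimic an HBase range scan: byte order, start inclusive, stop exclusive."""
    return [r for r in sorted(rows) if start <= r < stop]


rows = [row_key(70), row_key(200)]
key_70, key_200 = row_key(70), row_key(200)

# Numerically 70 < 200, but as bytes "70" sorts after "200" because "7" > "2".
# Using the numeric order for the bounds gives start > stop, so nothing matches.
empty_result = scan(rows, key_70, key_200 + b"\x00")

# Ordering the bounds by their bytes gives a valid range covering both rows.
byte_start, byte_stop = sorted([key_70, key_200])
full_result = scan(rows, byte_start, byte_stop + b"\x00")

print(key_70 > key_200)
print(len(empty_result), len(full_result))
```

If that assumption holds, it would be consistent with what I see: the range
built from (70, 200) has its start row after its stop row in byte order, and
swapping them makes the scan return rows.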
On September 28, 2015 at 11:19:41 PM, Li Yang ([email protected]) wrote:
I too cannot reproduce this one. Tried IN() on the LSTG_SITE_ID column of
TEST_KYLIN_FACT (the test cube used by regression). Everything is good.
The query I used:
select LSTG_SITE_ID, sum(price) as GMV
from test_kylin_fact
inner JOIN edw.test_cal_dt as test_cal_dt
ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
where test_cal_dt.week_beg_dt between DATE '2013-09-01' and DATE
'2013-10-01' and LSTG_SITE_ID in (0, 3, 15, 23, 100)
group by LSTG_SITE_ID
It is meant to be similar to Vadim's query: it has a date condition, and
LSTG_SITE_ID is of int type. I tested many combinations of the ID values, and
all results are correct.
Has anyone else tried similar queries on 0.7 or 1.x releases?
On Thu, Sep 24, 2015 at 1:28 PM, hongbin ma <[email protected]> wrote:
> Could you provide just some sample data, and make sure the issue is
> reproducible on that sample data?
>
> On Thu, Sep 24, 2015 at 12:28 PM, vipul jhawar <[email protected]>
> wrote:
>
> > hi
> >
> > For the cube
> >
> > JSON model is
> > http://www.jsoneditoronline.org/?id=1da42fe018b14c7522d3937ba81cd37e
> > JSON cube is
> > http://www.jsoneditoronline.org/?id=97008080361a2388888acd128004753d
> >
> > As for minimal data for the cube: even if we generate it, how would we
> > share it with you? Or do you want a sample fact table?
> > If you want, I can show you the Kylin query running on our current
> > hosted cube.
> >
> > Thanks
> >
> > On Wed, Sep 23, 2015 at 1:54 PM, hongbin ma <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > We cannot reproduce it with our test cases.
> > >
> > > However, we'd love to help analyze the problem. If possible, can you
> > > please try a minimal cube definition (maybe with a little sample data)
> > > that pinpoints the issue?
> > >
> > > The cube description consists of three parts:
> > > 1. Cube Desc (json file)
> > > 2. Model Desc (json file)
> > > 3. Hive table schema
> > >
> > > The first two files can be viewed directly in the Kylin web UI: go to
> > > the "Cubes" tab, click the cube, and check the contents of "Json(Cube)"
> > > and "Json(Model)".
> > >
> > >
> > > On Wed, Sep 23, 2015 at 3:11 PM, hongbin ma <[email protected]>
> > > wrote:
> > >
> > > > hi vipul
> > > >
> > > > I'm looking into this
> > > >
> > > > On Wed, Sep 23, 2015 at 3:10 PM, vipul jhawar <[email protected]>
> > > > wrote:
> > > >
> > > >> hi
> > > >>
> > > >> Just wanted to check if someone has had a chance to look at this
> > > >> case.
> > > >>
> > > >> Thanks
> > > >>
> > > >> On Tue, Sep 22, 2015 at 10:33 PM, vipul jhawar <[email protected]>
> > > >> wrote:
> > > >>
> > > >> > Hi
> > > >> >
> > > >> > Please let us know if you need more details on this, as it is
> > > >> > affecting the results and is not predictable.
> > > >> > We are on 0.7.2
> > > >> >
> > > >> > Thanks
> > > >> >
> > > >> > On Tue, Sep 22, 2015 at 10:04 AM, Vadim Semenov <[email protected]>
> > > >> > wrote:
> > > >> >
> > > >> >> Hi,
> > > >> >>
> > > >> >> We've found issues while running some queries:
> > > >> >> they return inconsistent results, i.e. in some cases we don't get
> > > >> >> any rows, in some cases we get some rows but never all that we
> > > >> >> expected to get.
> > > >> >>
> > > >> >> I was able to pin-point the queries, so here're the cases:
> > > >> >>
> > > >> >> 1. SELECT dim, measure FROM table WHERE partition = one partition
> > > >> >> AND dim IN (a,b,c) GROUP BY dim
> > > >> >> We get results for a,c only
> > > >> >> http://i.imgur.com/SZu6f2E.png
> > > >> >>
> > > >> >> 2. IN (b)
> > > >> >> We get results for b as expected
> > > >> >> http://i.imgur.com/8c8UMWj.png
> > > >> >>
> > > >> >> 3. IN (a,b)
> > > >> >> We don't get any results
> > > >> >> http://i.imgur.com/qIepe8d.png
> > > >> >>
> > > >> >> 4. IN (b,c)
> > > >> >> We get results for b,c as expected
> > > >> >> http://i.imgur.com/Qq6yuuS.png
> > > >> >>
> > > >> >> We tried to run the queries with acceptPartial=false and with an
> > > >> >> empty cache, and the issues are still the same.
> > > >> >>
> > > >> >> What should we do to debug this?
> > > >> >> What might be causing these issues?
> > > >> >
> > > >> >
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > >
> > > > *Bin Mahone | 马洪宾*
> > > > Apache Kylin: http://kylin.io
> > > > Github: https://github.com/binmahone
> > > >
> > >
> > >
> > >
> > >
> >
>
>
>
>