except for what Luke asked, can you also share which version are you using?
We'll look into this

On Wed, Aug 5, 2015 at 11:08 PM, Luke Han <[email protected]> wrote:

> What's the data type in your hive table of that column?
> We recommend to use same as your measure definition.
>
> For data quality, we have approach to ensure it will 100%
> match with source data, there's H2 comparison in our testing case.
> But yes, let's double check the test case about huge number like your case
>
> Is it possible to share your data and model to us if not sensitive?
> It will be better to using that to do the verify and testing here.
>
> Thanks.
>
>
>
>
>
>
>
>
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Wed, Aug 5, 2015 at 10:57 PM, alex schufo <[email protected]> wrote:
>
> > I have a simple cube with a Fact Table and a Lookup Table. The Fact Table
> > includes some counter. The Lookup Table includes a region property which
> is
> > the highest level of the hierarchy and therefore can accumulate quite a
> lot
> > of data when summing the counter by region.
> >
> > I am doing a query like this:
> >
> > select sum(MY_COUNTER) as tot_count, REGION
> >
> > from FACT_TABLE
> >
> > join LOOKUP_TABLE
> >
> > on FACT_TABLE.FK = LOOKUP_TABLE.PK
> >
> > where date = '2015-01-01'
> >
> > group by REGION
> >
> >
> > I get the following result:
> >
> > -917,164,421 EMEA --> negative result!
> >
> > 777,795,808 AMER
> >
> > 514,879,134 APAC
> >
> > 117,814,485 LATAM
> >
> > I ran the exact same query in Hive and got the following result:
> >
> > 3,381,591,228  EMEA  --> big number but smaller than the BIGINT limit
> >
> > 778,718,563    AMER  --> 922,755 difference with the Kylin result
> >
> > 520,253,610    APAC   --> 5,374,476 difference with the Kylin result
> >
> > 117,913,857    LATAM --> 99,372 difference with the Kylin result
> >
> > Based on that it seems that the limit is the int limit 2^31-1
> > (2,147,483,647) and that my negative result comes from an overflow.
> >
> > In my cube I defined this measure as a bigint:
> >
> > Expression    Param Type    Param Value       Return Type
> >
> > SUM              column            AVAIL_COUNT    bigint
> >
> > The other thing that worries me a bit is that the other numbers are not
> > 100% accurate either (>100,000,000 difference!)
> >
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Reply via email to