A few things I noted:

- Your DATE column is perhaps not of type DATE, but a string or varchar.
The "Can't translate value 2015-01-01 to dictionary ID" error only happens
for string types. I suggest correcting the column type so that Kylin can
index the data better.

- It seems the cuboid (DATE, REGION) is not pre-calculated; otherwise the
query should return quickly (since there are not many rows to scan). My
guess is that your cube is not defined in the most appropriate way.
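
To illustrate what a pre-calculated (DATE, REGION) cuboid buys you, here is a
toy sketch in Python. It is not Kylin code: the rows, the `build_cuboid` and
`query` helpers, and all values are made up for illustration, and the fact
table is assumed to be already joined to the lookup table.

```python
from collections import defaultdict

# Hypothetical fact rows, already joined to the lookup table:
# (date, region, some_id, my_counter). All values are made up.
ROWS = [
    ("2015-01-01", "EU", 1, 10),
    ("2015-01-01", "EU", 2, 5),
    ("2015-01-01", "US", 1, 7),
    ("2015-01-02", "EU", 1, 3),
    ("2015-01-02", "US", 2, 4),
]


def build_cuboid(rows):
    """Pre-aggregate sum(my_counter) per (date, region), dropping some_id."""
    cuboid = defaultdict(int)
    for date, region, _some_id, counter in rows:
        cuboid[(date, region)] += counter
    return dict(cuboid)


def query(cuboid, date):
    """Answer 'sum(MY_COUNTER) ... where date = ? group by REGION'
    by reading only the pre-aggregated entries."""
    return {region: total for (d, region), total in cuboid.items() if d == date}


CUBOID = build_cuboid(ROWS)
RESULT = query(CUBOID, "2015-01-01")
print(RESULT)
```

With such a cuboid available, the query reads one entry per region for the
given date instead of every fact row, which is why a missing cuboid would
typically show up as a long scan.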

We can provide more concrete suggestions if you could share the cube
definition in JSON.

Cheers
Yang

On Fri, Aug 14, 2015 at 11:13 AM, Luke Han <[email protected]> wrote:

> Hi Alex,
>    Could you please open a JIRA for this issue so that it can be tracked
> properly and fixed with a patch?
>
>    Thanks.
>
> Luke
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Thu, Aug 13, 2015 at 10:35 PM, alex schufo <[email protected]>
> wrote:
>
> > I am running the following query:
> >
> > select sum(MY_COUNTER) as tot_count, REGION
> >
> > from FACT_TABLE
> >
> > join LOOKUP_TABLE
> >
> > on FACT_TABLE.FK = LOOKUP_TABLE.PK
> >
> > where date = '2015-01-01'
> >
> > group by REGION
> >
> > (for reference, this is the same query as "Negative number in SUM result
> > and Kylin results not matching exactly Hive results")
> >
> > On a cube that contains 6 months of data.
> >
> > I see the following error in the logs:
> >
> > [http-bio-7070-exec-3]:[2015-08-13 03:23:17,008][ERROR][org.apache.kylin.cube.kv.RowKeyColumnIO.writeColumn(RowKeyColumnIO.java:80)]
> > - Can't translate value 2015-01-01 to dictionary ID, roundingFlag 0. Using
> > default value \xFF
> >
> > and the query takes forever to process and finally ends with a
> > CallTimeoutException error.
> >
> > The same query for the same cube but with only 15 days of data was giving
> > me a result.
> >
> > If I add a more restrictive "where" clause, such as "FACT_TABLE.someID = 1",
> > I get a result in a few seconds.
> >
> > But actually I noticed that with the "FACT_TABLE.someID = 1" the query
> > takes about the same time regardless of whether I add date = '2015-01-01'
> > or not.
> >
> > My guess would be that "where date = '2015-01-01'" is not recognized as
> > expected and that all the 6 months are scanned and then the output
> > filtered to return only that specific date.
> >
>
