I am running the following query:

select sum(MY_COUNTER) as tot_count, REGION
from FACT_TABLE
join LOOKUP_TABLE
on FACT_TABLE.FK = LOOKUP_TABLE.PK
where date = '2015-01-01'
group by REGION

(For reference, this is the same query as in the thread "Negative number in
SUM result and Kylin results not matching exactly Hive results".)

The cube I am querying contains 6 months of data.

I see the following error in the logs:

[http-bio-7070-exec-3]:[2015-08-13
03:23:17,008][ERROR][org.apache.kylin.cube.kv.RowKeyColumnIO.writeColumn(RowKeyColumnIO.java:80)]
- Can't translate value 2015-01-01 to dictionary ID, roundingFlag 0. Using
default value \xFF

and the query takes forever to process and finally ends with a
CallTimeoutException error.

The same query against the same cube, when it held only 15 days of data,
returned a result.

If I add a more restrictive "where" clause, such as "FACT_TABLE.someID = 1",
I get a result in a few seconds.

However, I noticed that with "FACT_TABLE.someID = 1" the query takes about
the same time whether or not I also add date = '2015-01-01'.
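
To make the comparison concrete, these are roughly the two variants I timed. This is only a sketch: someID is a stand-in for the real column name, and the rest is copied from the query at the top.

```sql
-- Variant A: restrictive filter plus the date filter
select sum(MY_COUNTER) as tot_count, REGION
from FACT_TABLE
join LOOKUP_TABLE
on FACT_TABLE.FK = LOOKUP_TABLE.PK
where FACT_TABLE.someID = 1   -- someID is a placeholder column name
and date = '2015-01-01'
group by REGION;

-- Variant B: restrictive filter only, no date filter
select sum(MY_COUNTER) as tot_count, REGION
from FACT_TABLE
join LOOKUP_TABLE
on FACT_TABLE.FK = LOOKUP_TABLE.PK
where FACT_TABLE.someID = 1   -- someID is a placeholder column name
group by REGION;
```

Both variants take about the same time, which suggests the date condition is not reducing the amount of data scanned.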

My guess is that "where date = '2015-01-01'" is not being applied as a scan
filter as expected: all 6 months are scanned, and only afterwards is the
output filtered down to that specific date.
