Re: On improving WHEN statements performance on other columns

Julian Hyde Mon, 06 Jul 2015 14:50:24 -0700

Does your use case look like this?

   …
   WHERE (CASE
                   WHEN condition1 THEN constant1
                   WHEN condition2 THEN constant2 …
                   END ) = constant1


If so, https://issues.apache.org/jira/browse/CALCITE-727 may help. (The fix is
not in current Kylin, but it may be included within a month or so.)
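
To illustrate why a filter of that shape is a candidate for simplification, here is a minimal sketch in Python. It is not the Calcite implementation and makes no claim about what the CALCITE-727 fix actually does. It assumes the two constants are distinct and the CASE has no ELSE branch; the helper names and the values "a" and "b" are hypothetical. It models SQL three-valued logic with None as NULL/UNKNOWN and checks, by brute force, that the CASE-based filter keeps the same rows as a filter on condition1 alone.

```python
from itertools import product


def case_when(branches):
    """Evaluate a SQL-style CASE without ELSE: the first TRUE condition wins."""
    for condition, value in branches:
        if condition is True:
            return value
    return None  # no branch matched, so the CASE yields NULL


def sql_equals(left, right):
    """SQL '=' under three-valued logic: NULL on either side gives UNKNOWN."""
    if left is None or right is None:
        return None
    return left == right


def where_keeps(predicate):
    """A WHERE clause keeps a row only when the predicate is TRUE."""
    return predicate is True


# Hypothetical stand-ins for constant1 and constant2 (assumed distinct).
CONSTANT1, CONSTANT2 = "a", "b"


def original_filter(condition1, condition2):
    """WHERE (CASE WHEN c1 THEN k1 WHEN c2 THEN k2 END) = k1"""
    value = case_when([(condition1, CONSTANT1), (condition2, CONSTANT2)])
    return where_keeps(sql_equals(value, CONSTANT1))


def simplified_filter(condition1, condition2):
    """WHERE c1 (condition2 no longer matters)"""
    return where_keeps(condition1)


# Compare both filters over every combination of TRUE, FALSE and UNKNOWN.
truth_values = [True, False, None]
mismatches = [
    (c1, c2)
    for c1, c2 in product(truth_values, repeat=2)
    if original_filter(c1, c2) != simplified_filter(c1, c2)
]
print(mismatches)  # prints []
```

Under these assumptions the two filters agree on all nine combinations, so an optimizer that recognizes the pattern could filter on condition1 directly instead of evaluating the whole CASE expression per row.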

Julian

On Jul 6, 2015, at 2:49 AM, Luca Costabello <[email protected]> wrote:

> Hello all,
> 
> In my adoption scenario (~50 M records) I must execute queries with WHEN
> statements. Such statements include EQ or IN operators and are not defined
> on rowkeys.
> 
> Unfortunately, the lack of secondary indexes in HBase leads to response
> times well above 1 minute. While this can be acceptable under many
> circumstances, it severely degrades the performance of the system I have
> built over Kylin (it is my understanding that each EQ condition or IN
> element triggers an HBase full scan).
> 
> I would like to know if anyone has come up with a solution or workaround.
> I think you guys already apply some client request filters [1] to some
> extent.
> Have any of you tried to integrate Kylin HBase client code with hindex [2]?
> I wonder if the coprocessor-based approach adopted by hindex might be
> effective - even though hindex does not come as a standalone jar, so
> deploying the hindex HBase fork is necessary (I do not know how reliable
> hindex is, and the latest commit is 6 months old). Besides, some change to
> Kylin HBase client code would be required (when creating cube HTables).
> I have also had a quick look at Phoenix [3], which comes with secondary
> indexes support, but I wonder if it makes sense to integrate that with
> Kylin (in this case I think Kylin HBase client code should be heavily
> modified to switch to Phoenix APIs.)
> 
> Long story short, I wonder if someone could give me a heads up and point me
> in the right direction.
> 
> 
> Cheers,
> luca
> 
> [1] http://hbase.apache.org/book.html#client.filter
> [2] https://github.com/Huawei-Hadoop/hindex/tree/hbase-0.98
> [3] https://phoenix.apache.org/secondary_indexing.html
