Re: how to optimize kylin query

hongbin ma Sat, 03 Sep 2016 00:22:02 -0700

how many results are being returned for the slow queries? can you attach
kylin.log under KYLIN_HOME/logs?


On Fri, Sep 2, 2016 at 7:17 PM, Mars J <[email protected]> wrote:

> Hi ,
>     How can I optimize my kylin query ?
>
>     Cluster env. : Our cluster has 5 nodes for 1 master and 4 slaves, the
> hadoop cluster and hbase cluster is on the same node. The memory is 32G /
> node.
>
>     Cube Design :
>        1. tables : A fact table (200,000,000 records in hive) left join a
> account dim table (200w records in hive) and left join a branch dim table.
>        2. dimensions& measures : There are 4 dimensions include 2 columns
> of fact table and 1 column of account dim and the other column of branch
> dim.  and just add 1 measure —— sum()
>        3. aggregation group : agg1. include column idate, and set this to
> mandatory, agg2 include other 3 dimensions.
>
>      Query SQL like this:
>          select a.acct_no,sum(amt) from f left join acct_dim a on
> f.acct_no=a.acct_no [ where idate>=... and idate<...] group by a.acct_no .
>   we may change acct_no to bran_code ,or add the idate optionally, when
> group by acct_no ,it costs about 5s.
>          if we add select columns and join more table like 'left join
> acct_dim ... left join branch_dim....' it may take more seconds.
>          and when we query some detail data like some column in fact table
> , it'll query time out.
>
>        Our query is every flexible, can we optimize something to query
> this group by sql in 1s and can query the detail data correctly?
>
>    Thanks .
>
>
>



-- 
Regards,

*Bin Mahone | 马洪宾*

Re: how to optimize kylin query

Reply via email to