how many results are being returned for the slow queries? can you attach kylin.log under KYLIN_HOME/logs?
On Fri, Sep 2, 2016 at 7:17 PM, Mars J <[email protected]> wrote: > Hi , > How can I optimize my kylin query ? > > Cluster env. : Our cluster has 5 nodes for 1 master and 4 slaves, the > hadoop cluster and hbase cluster is on the same node. The memory is 32G / > node. > > Cube Design : > 1. tables : A fact table (200,000,000 records in hive) left join a > account dim table (200w records in hive) and left join a branch dim table. > 2. dimensions& measures : There are 4 dimensions include 2 columns > of fact table and 1 column of account dim and the other column of branch > dim. and just add 1 measure —— sum() > 3. aggregation group : agg1. include column idate, and set this to > mandatory, agg2 include other 3 dimensions. > > Query SQL like this: > select a.acct_no,sum(amt) from f left join acct_dim a on > f.acct_no=a.acct_no [ where idate>=... and idate<...] group by a.acct_no . > we may change acct_no to bran_code ,or add the idate optionally, when > group by acct_no ,it costs about 5s. > if we add select columns and join more table like 'left join > acct_dim ... left join branch_dim....' it may take more seconds. > and when we query some detail data like some column in fact table > , it'll query time out. > > Our query is every flexible, can we optimize something to query > this group by sql in 1s and can query the detail data correctly? > > Thanks . > > > -- Regards, *Bin Mahone | 马洪宾*
