由于前端需要将多个指标合并展示,故查询kylin时,一个sql可能会嵌套多个多个子查询。生产环境发现,子查询严重影响查询速度。大家知道什么优化方法吗? 测试如下: 1.计算活跃设备 select count(distinct deviceid) dad from KYLIN_VIEW.KYLIN_VIEW_T_DWA_ACT_XXX_DEVICE_ACTIVE where par_dt>='2019-06-01' and par_dt<='2019-11-15' group by par_dt 2.计算活跃用户 select g.par_dt,count(distinct g.userid) "activeAccount" from KYLIN_VIEW.KYLIN_VIEW_T_DWA_ACT_XXX_USER_MULTIDIM_ACTIVE g where g.par_dt>='2019-06-01' and g.par_dt<='2019-11-01' group by g.par_dt 分别执行两个sql耗时基本在0.5s左右。
3.采用子查询: select g.par_dt,count(distinct g.userid) "activeAccount",a.dad "activeDevice" from KYLIN_VIEW.KYLIN_VIEW_T_DWA_ACT_XXX_USER_MULTIDIM_ACTIVE g left join( select par_dt,count(distinct deviceid) dad from KYLIN_VIEW.KYLIN_VIEW_T_DWA_ACT_XXX_DEVICE_ACTIVE where par_dt>='2019-06-01' and par_dt<='2019-11-01' group by par_dt ) a on g.par_dt = a.par_dt where g.par_dt>='2019-06-01' and g.par_dt<='2019-11-01' group by g.par_dt,a.dad 查询耗时多达1.5s,当子查询增多时,查询耗时高达几十秒。如下图,对于此结果不是很理解,因为子查询完全可以并行去查,然后子查询的结果(数据量已经超级小了)在kylin server端聚合,应该很快啊,为什么会这么慢?
