Hi Team,

Could you please confirm whether row filtering (the WHERE clause) is done on the coprocessor side?
Are there any APIs or logging to get the physical plan of a query? It would help us in optimising the cube.

On Mon, Oct 22, 2018 at 8:58 PM Shrikant Bang <b.shrikan...@gmail.com> wrote:

> Thanks ShaoFeng for the response. I will try this and update the results
> of my queries.
>
> I would like to learn how to identify the bottleneck in query execution.
> Can we trace the query execution in each stage with timestamps?
>
> Also, is there a way to get the physical plan of a query? This could help me
> design/tune my cube/queries for better response time.
>
> Regards,
> Shrikant Bang
>
> On Mon, Oct 22, 2018 at 8:01 PM ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
>> Hi Shrikant,
>>
>> What's the order of the dimensions in the rowkey? In this case, you need
>> to put "d3" at the head of the rowkey.
>>
>> Here is a good reference on how to design a cube; maybe we need to add it
>> to the FAQ or make it part of the documentation:
>> https://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
>>
>> Shrikant Bang <b.shrikan...@gmail.com> wrote on Mon, Oct 22, 2018 at 3:51 PM:
>>
>>> Hi Team,
>>>
>>> We are working on a benchmark test for Kylin v2.5-Hbase-1.x as part of a PoC.
>>>
>>> Here is my cube (pseudo):
>>>
>>> *Dimension Table* : D1
>>> *Fact Table* : F1, F2
>>>
>>> *Metrics* : SUM(D1.m1), SUM(D2.m2)
>>> *Dimension Columns* -- Normal (D1.d1, D1.d2, D1.d3, F1.a1, F2.b1)
>>>
>>> JOIN (D1.d1 = F1.a1 AND D2.d2 = F2.b1)
>>>
>>> When I run a query matching the cuboids, it runs very fast.
>>> Pseudo example query:
>>>
>>> SELECT SUM(D1.m1), SUM(D2.m2), d1, d2, d3
>>> FROM D1
>>> JOIN F1
>>> ON D1.d1 = F1.a1
>>> JOIN F2
>>> ON D1.d2 = F2.b1
>>> GROUP BY d1, d2, d3
>>>
>>>
>>> But when I add a WHERE clause to the query, it becomes very slow to respond.
>>> Pseudo example query:
>>>
>>> SELECT SUM(D1.m1), SUM(D2.m2), d1, d2, d3
>>> FROM D1
>>> JOIN F1
>>> ON D1.d1 = F1.a1
>>> JOIN F2
>>> ON D1.d2 = F2.b1
>>> *WHERE d3 > 100 AND d3 < 1000*
>>> GROUP BY d1, d2, d3
>>>
>>> *In my case d3 is a high-cardinality dimension which is part of the row key
>>> (Normal Dimension).*
>>>
>>> Here are my questions:
>>>
>>> 1. I have installed the Kylin Co-Processor
>>> <http://kylin.apache.org/docs20/howto/howto_update_coprocessor.html> before
>>> running queries. Do Kylin query results get filtered on the Co-Processor end?
>>>
>>> 2. How can I find query traces to identify the bottleneck in response time?
>>>
>>> 3. Even though I have enabled the Query Cache, it seems it is not being used
>>> when the query runs (even when it is run multiple times).
>>>
>>> 4. Are there any best practices for tuning queries with a WHERE clause?
>>>
>>>
>>> Thank You,
>>> Shrikant Bang.
>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>

--
Thanks & Regards
Sachin Aggarwal
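
The rowkey advice quoted in this thread (put "d3" at the head of the rowkey) can be illustrated with a toy model. The sketch below is not Kylin or HBase code. It assumes a store that keeps rowkeys sorted and can narrow a scan only by the leading rowkey dimension; the helper names build_rowkeys and rows_scanned and the cardinalities in VALUES are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left
from itertools import product

# Hypothetical cardinalities: d1 and d2 are small, d3 is high-cardinality.
VALUES = {"d1": range(10), "d2": range(10), "d3": range(2000)}


def build_rowkeys(order, values):
    """Return all rowkeys as sorted tuples, with dimensions laid out in `order`."""
    return sorted(product(*(values[name] for name in order)))


def rows_scanned(order, values, dim, low, high):
    """Count rows a scan reads to answer `low < dim < high` (integer values).

    Toy assumption: only a filter on the leading rowkey dimension can be
    turned into a contiguous key range; any other filter reads every row
    and discards non-matching rows afterwards.
    """
    rowkeys = build_rowkeys(order, values)
    if order[0] != dim:
        return len(rowkeys)  # full scan, filter applied row by row
    # (low + 1,) sorts before every key whose first field is low + 1,
    # and (high,) sorts before every key whose first field is high.
    start = bisect_left(rowkeys, (low + 1,))
    end = bisect_left(rowkeys, (high,))
    return end - start


if __name__ == "__main__":
    for order in (("d3", "d1", "d2"), ("d1", "d2", "d3")):
        print(order, rows_scanned(order, VALUES, "d3", 100, 1000))
```

In this model, the same filter touches far fewer rows when d3 leads the rowkey, because all matching keys are adjacent in sort order. A real engine may apply additional techniques for non-leading dimensions, so the full-scan branch is a deliberate simplification.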