Re: Kylin Cube Performance
Sure, see kylin.log below:

2016-08-04 00:47:35,839 INFO [http-bio-7070-exec-7] controller.QueryController:175 : The original query: SELECT SUM(clicks) FROM hpa_reporting2 GROUP BY site_id, child_id, search_type, hotel_id, report_date
2016-08-04 00:47:35,839 INFO [http-bio-7070-exec-7] service.QueryService:266 : The corrected query: SELECT SUM(clicks) FROM hpa_reporting2 GROUP BY site_id, child_id, search_type, hotel_id, report_date LIMIT 5
2016-08-04 00:47:35,908 INFO [http-bio-7070-exec-7] routing.QueryRouter:48 : The project manager's reference is org.apache.kylin.metadata.project.ProjectManager@3a3735a5
2016-08-04 00:47:35,909 INFO [http-bio-7070-exec-7] routing.QueryRouter:60 : Find candidates by table DEFAULT.HPA_REPORTING2 and project=KODDI_DEV : org.apache.kylin.query.routing.Candidate@51ed1b3b
2016-08-04 00:47:35,909 INFO [http-bio-7070-exec-7] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule, realizations before: [hpa_reporting2_cube_clone(CUBE)], realizations after: [hpa_reporting2_cube_clone(CUBE)]
2016-08-04 00:47:35,910 INFO [http-bio-7070-exec-7] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RealizationSortRule, realizations before: [hpa_reporting2_cube_clone(CUBE)], realizations after: [hpa_reporting2_cube_clone(CUBE)]
2016-08-04 00:47:35,910 INFO [http-bio-7070-exec-7] routing.QueryRouter:72 : The realizations remaining: [hpa_reporting2_cube_clone(CUBE)] And the final chosen one is the first one
2016-08-04 00:47:35,975 DEBUG [http-bio-7070-exec-7] enumerator.OLAPEnumerator:107 : query storage...
2016-08-04 00:47:35,976 INFO [http-bio-7070-exec-7] v2.CubeStorageQuery:239 : exactAggregation is true
2016-08-04 00:47:35,976 INFO [http-bio-7070-exec-7] v2.CubeStorageQuery:357 : Enable limit 5
2016-08-04 00:47:35,977 DEBUG [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:257 : New scanner for current segment hpa_reporting2_cube_clone[1970010100_2016082800] will use SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
2016-08-04 00:47:35,979 DEBUG [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:313 : Serialized scanRequestBytes 836 bytes, rawScanBytesString 56 bytes
2016-08-04 00:47:35,979 INFO [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:315 : The scan 31b2dd4c for segment hpa_reporting2_cube_clone[1970010100_2016082800] is as below with 1 separate raw scans, shard part of start/end key is set to 0
2016-08-04 00:47:35,980 INFO [http-bio-7070-exec-7] v2.CubeHBaseRPC:271 : Visiting hbase table KYLIN_RIK9O18H07: cuboid exact match, from 992 to 992 Start: \x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\x00\x00\x00\x00\x00\x00\x00\x00\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\x00\x00\x00\x00\x00\x00\x00\x00\x00) Stop: \x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00), No Fuzzy Key
2016-08-04 00:47:35,981 DEBUG [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:320 : Submitting rpc to 1 shards starting from shard 2, scan range count 1
2016-08-04 00:47:35,981 INFO [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:103 : Timeout for ExpectedSizeIterator is: 99000
2016-08-04 00:47:35,981 DEBUG [http-bio-7070-exec-7] enumerator.OLAPEnumerator:127 : return TupleIterator...
2016-08-04 00:47:52,773 INFO [pool-6-thread-1] v2.CubeHBaseEndpointRPC:351 : Endpoint RPC returned from HTable KYLIN_RIK9O18H07 Shard \x4B\x59\x4C\x49\x4E\x5F\x52\x49\x4B\x39\x4F\x31\x38\x48\x30\x37\x2C\x00\x02\x2C\x31\x34\x37\x30\x31\x35\x35\x33\x31\x34\x39\x33\x37\x2E\x61\x33\x61\x35\x34\x37\x39\x61\x32\x63\x37\x61\x61\x64\x30\x36\x33\x66\x30\x33\x64\x63\x34\x65\x31\x30\x36\x33\x61\x33\x61\x37\x2E on host: ip-10-0-0-157.ec2.internal.Total scanned row: 12306477. Total filtered/aggred row: 0. Time elapsed in EP: 16562(ms). Server CPU usage: 0.24348086721950246, server physical mem left: 7.195234304E9, server swap mem left:0.0.Etc message: start latency: 15@1,agg done@13760,compress done@16562,server stats done@16562, debugGitTag:cf4d2940b67d622eacd2ac9a913b221091a35c2e;.Normal Complete: true.
2016-08-04 00:47:54,068 DEBUG [pool-6-thread-1] util.CompressionUtils:67 : Original: 46465726 bytes. Decompressed: 150553629 bytes. Time: 1294
2016-08-04 00:48:29,303 INFO [pool-4-thread-1] threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running, 0 ready, 12 others
2016-08-04 00:48:31,990 INFO [http-bio-7070-exec-7] service.QueryService:399 : Scan count for each storageContext: 12306477,
2016-08-04 00:48:31,991 INFO [http-bio-7070-exec-7] controller.QueryController:197 : Stats of SQL response: isException: false, duration: 56152, total scan count 12306477
2016-08-04 00:48:32,000 WARN [http-bio-7070-exec-7] sizeof.ObjectGraphWalker:209 : The configured limit of 1,000 object references was reached while attempting to calculate the size of the object graph. Severe performance degradation
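For anyone reading along, here is a rough Python sketch that pulls the figures asked about in this thread (cuboid, scanned rows, elapsed times) out of the log above. The regular expressions are my own assumption, fitted to these particular lines rather than to any stable Kylin log format, and `summarize` and `set_bits` are hypothetical helper names. The bit decoding assumes, as far as I understand Kylin's convention, that a cuboid ID is a bitmask over the rowkey columns.

```python
import re

# Lines excerpted (and shortened) from the kylin.log above.
LOG_LINES = [
    "Visiting hbase table KYLIN_RIK9O18H07: cuboid exact match, from 992 to 992",
    "Total scanned row: 12306477. Total filtered/aggred row: 0. Time elapsed in EP: 16562(ms).",
    "Original: 46465726 bytes. Decompressed: 150553629 bytes. Time: 1294",
    "Stats of SQL response: isException: false, duration: 56152, total scan count 12306477",
]

# Assumption: these patterns match this excerpt only; adjust them for other versions.
PATTERNS = {
    "cuboid_id": r"cuboid exact match, from (\d+) to",
    "scanned_rows": r"Total scanned row: (\d+)",
    "endpoint_ms": r"Time elapsed in EP: (\d+)\(ms\)",
    "decompressed_bytes": r"Decompressed: (\d+) bytes",
    "decompress_ms": r"bytes\. Time: (\d+)",
    "total_ms": r"duration: (\d+)",
}


def summarize(lines):
    """Extract one integer per pattern from the joined log lines."""
    text = "\n".join(lines)
    return {name: int(re.search(pattern, text).group(1))
            for name, pattern in PATTERNS.items()}


def set_bits(cuboid_id):
    """Bit positions set in the cuboid ID, counted from the least significant bit."""
    return [i for i in range(cuboid_id.bit_length()) if (cuboid_id >> i) & 1]


summary = summarize(LOG_LINES)
# Time not explained by the coprocessor scan or by decompression.
other_ms = summary["total_ms"] - summary["endpoint_ms"] - summary["decompress_ms"]

print(summary)
print("cuboid bits:", set_bits(summary["cuboid_id"]))
print("ms outside endpoint scan and decompression:", other_ms)
```

Read this way, the cuboid ID has five bits set, which is consistent with an exact match for the five GROUP BY columns, and well over half of the 56 seconds falls outside the endpoint scan and the decompression step. Where that remaining time goes is not shown in the log, so treat any explanation of it as a guess.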
Re: cube query problem with kylin-1.5.3
“Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment.getRegion()Lorg/apache/hadoop/hbase/regionserver/Region;”

This shows that your current HBase is not compatible with this Kylin build. Try this: http://kylin.apache.org/docs15/howto/howto_update_coprocessor.html

If it still fails, you might need to upgrade your HBase, or use HDP/CDH, to make it work.

On 8/4/16, 12:14 PM, "王峰" wrote:

OK, I executed SQL on the table "KYLIN_CATEGORY_GROUPINGS" and it succeeded, like this:

==[QUERY]===
SQL: select * from KYLIN_CATEGORY_GROUPINGS
User: ADMIN
Success: true
Duration: 0.026
Project: learn_kylin
Realization Names: [kylin_sales_cube_desc_clone_clone]
Cuboid Ids: []
Total scan count: 0
Result row count: 144
Accept Partial: true
Is Partial Result: false
Hit Exception Cache: false
Storage cache used: false
Message: null
==[QUERY]===

2. select * from KYLIN_SALES fails:

2016-08-04 12:02:00,604 INFO [http-bio-7070-exec-5] controller.QueryController:174 : Using project: learn_kylin
2016-08-04 12:02:00,604 INFO [http-bio-7070-exec-5] controller.QueryController:175 : The original query: select * from KYLIN_SALES
2016-08-04 12:02:00,605 INFO [http-bio-7070-exec-5] service.QueryService:269 : The corrected query: select * from KYLIN_SALES LIMIT 5
2016-08-04 12:02:00,623 INFO [http-bio-7070-exec-5] routing.QueryRouter:48 : The project manager's reference is org.apache.kylin.metadata.project.ProjectManager@1a9b1372
2016-08-04 12:02:00,623 INFO [http-bio-7070-exec-5] routing.QueryRouter:60 : Find candidates by table DEFAULT.KYLIN_SALES and project=LEARN_KYLIN : org.apache.kylin.query.routing.Candidate@467295b7
2016-08-04 12:02:00,623 INFO [http-bio-7070-exec-5] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule, realizations before: [kylin_sales_cube_desc_clone_clone(CUBE)], realizations after: [kylin_sales_cube_desc_clone_clone(CUBE)]
2016-08-04 12:02:00,623 INFO [http-bio-7070-exec-5] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RealizationSortRule, realizations before: [kylin_sales_cube_desc_clone_clone(CUBE)], realizations after: [kylin_sales_cube_desc_clone_clone(CUBE)]
2016-08-04 12:02:00,623 INFO [http-bio-7070-exec-5] routing.QueryRouter:72 : The realizations remaining: [kylin_sales_cube_desc_clone_clone(CUBE)] And the final chosen one is the first one
2016-08-04 12:02:00,628 DEBUG [http-bio-7070-exec-5] enumerator.OLAPEnumerator:105 : query storage...
2016-08-04 12:02:00,628 INFO [http-bio-7070-exec-5] enumerator.OLAPEnumerator:170 : No group by and aggregation found in this query, will hack some result for better look of output...
2016-08-04 12:02:00,628 WARN [http-bio-7070-exec-5] enumerator.OLAPEnumerator:205 : SUM is not defined for measure column DEFAULT.KYLIN_SALES.SELLER_ID, output will be meaningless.
2016-08-04 12:02:00,629 INFO [http-bio-7070-exec-5] gtrecord.GTCubeStorageQueryBase:247 : exactAggregation is true
2016-08-04 12:02:00,629 INFO [http-bio-7070-exec-5] gtrecord.GTCubeStorageQueryBase:365 : Enable limit 5
2016-08-04 12:02:00,629 DEBUG [http-bio-7070-exec-5] v2.CubeHBaseEndpointRPC:271 : New scanner for current segment kylin_sales_cube_desc_clone_clone[2012010100_2016080100] will use SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
2016-08-04 12:02:00,630 DEBUG [http-bio-7070-exec-5] v2.CubeHBaseEndpointRPC:327 : Serialized scanRequestBytes 829 bytes, rawScanBytesString 72 bytes
2016-08-04 12:02:00,630 INFO [http-bio-7070-exec-5] v2.CubeHBaseEndpointRPC:329 : The scan 324ef549 for segment kylin_sales_cube_desc_clone_clone[2012010100_2016080100] is as below with 1 separate raw scans, shard part of start/end key is set to 0
2016-08-04 12:02:00,630 INFO [http-bio-7070-exec-5] v2.CubeHBaseRPC:271 : Visiting hbase table KYLIN_O3IJKXOPI6: cuboid exact match, from 99 to 99 Start: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x63\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00) Stop: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x63\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00c\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00), No Fuzzy Key
2016-08-04 12:02:00,630 DEBUG [http-bio-7070-exec-5] v2.CubeHBaseEndpointRPC:334 : Submitting rpc to 1 shards starting from shard 0, scan range count 1
2016-08-04 12:02:00,663 INFO [http-bio-7070-exec-5] v2.CubeHBaseEndpointRPC:110 : Timeout for ExpectedSizeIterator is: 99
2016-08-04 12:02:00,663 DEBUG [http-bio-7070-exec-5]
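A side note on reading the NoSuchMethodError quoted at the top of this message: the text after the method name is a JVM method descriptor, and the error means the calling code was compiled against a class that had a method with exactly this name and descriptor, while the class loaded at runtime does not. The sketch below (Python, with a hypothetical helper name `describe_missing_method`) turns the raw symbol into a readable signature. It is only a minimal illustration that handles a no-argument method returning an object type, which is all this error needs.

```python
def describe_missing_method(symbol):
    """Split 'pkg.Class.method()Lret/Type;' into (owner class, readable signature).

    Hypothetical helper: supports only no-argument methods that return an
    object type, as in the error quoted in this thread.
    """
    owner_and_name, descriptor = symbol.split("(", 1)
    owner, name = owner_and_name.rsplit(".", 1)
    args, ret = descriptor.split(")", 1)
    if args or not (ret.startswith("L") and ret.endswith(";")):
        raise ValueError("unsupported descriptor: " + symbol)
    # 'Lorg/apache/.../Region;' -> 'org.apache....Region'
    return_type = ret[1:-1].replace("/", ".")
    return owner, "{} {}()".format(return_type, name)


SYMBOL = ("org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment"
          ".getRegion()Lorg/apache/hadoop/hbase/regionserver/Region;")

owner, signature = describe_missing_method(SYMBOL)
print("class loaded at runtime:", owner)
print("method the caller expected:", signature)
```

So the coprocessor jar expects `getRegion()` to return the `Region` type, and the HBase classes on the region server do not offer that exact method, which is why a build mismatch between Kylin and HBase is the first thing to check.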
Re: cube query problem with kylin-1.5.3
Sorry, I did not describe it correctly. My Kylin version is apache-kylin-1.5.3-HBase1.x-bin; I did not use apache-kylin-1.5.3-bin.tar.gz ...

2016-08-04 11:37 GMT+08:00 Cheng Wang:

> I meant that you need to check your Kylin version. According to your environment, try this version:
> http://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-1.5.3/apache-kylin-1.5.3-HBase1.x-bin.tar.gz
>
> On 8/4/16, 11:03 AM, "wangfeng" wrote:
>
> as you say, I check the details logs of kylin-1.5.3,and found those info:
> ==[QUERY]===
> SQL: select * from KYLIN_SALES
> User: ADMIN
> Success: false
> Duration: 0.0
> Project: learn_kylin
> Realization Names: [kylin_sales_cube_desc_clone_clone]
> Cuboid Ids: [99]
> Total scan count: 0
> Result row count: 0
> Accept Partial: true
> Is Partial Result: false
> Hit Exception Cache: false
> Storage cache used: false
> Message: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
> ==[QUERY]===
>
> 2016-08-04 10:53:11,819 ERROR [http-bio-7070-exec-10] controller.BasicController:44 :
> org.apache.kylin.rest.exception.InternalErrorException: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
>     at org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:224)
>     at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:213)
>     at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:126)
>     at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:96)
>     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:617)
>     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:578)
>     at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80)
>     at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
>     at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
>     at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
>     at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
>
> --
>
> if hbase version 0.98 works well , can I compile the kylin-coprocessor-1.5.3-0.jar to meet the 0.98.
> btw, I used to set coprocessor to kylin-1.5---kylin-coprocessor-1.5.0-SNAPSHOT-0.jar but did not work.
>
> --
> View this message in context: http://apache-kylin.74782.x6.nabble.com/cube-query-problem-with-kylin-1-5-3-tp5484p5490.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
Re: cube query problem with kylin-1.5.3
I meant that you need to check your Kylin version. According to your environment, try this version:
http://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-1.5.3/apache-kylin-1.5.3-HBase1.x-bin.tar.gz

On 8/4/16, 11:03 AM, "wangfeng" wrote:

as you say, I check the details logs of kylin-1.5.3,and found those info:

==[QUERY]===
SQL: select * from KYLIN_SALES
User: ADMIN
Success: false
Duration: 0.0
Project: learn_kylin
Realization Names: [kylin_sales_cube_desc_clone_clone]
Cuboid Ids: [99]
Total scan count: 0
Result row count: 0
Accept Partial: true
Is Partial Result: false
Hit Exception Cache: false
Storage cache used: false
Message: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
==[QUERY]===

2016-08-04 10:53:11,819 ERROR [http-bio-7070-exec-10] controller.BasicController:44 :
org.apache.kylin.rest.exception.InternalErrorException: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
    at org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:224)
    at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:213)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:126)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:96)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:617)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:578)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)

--

if hbase version 0.98 works well , can I compile the kylin-coprocessor-1.5.3-0.jar to meet the 0.98.
btw, I used to set coprocessor to kylin-1.5---kylin-coprocessor-1.5.0-SNAPSHOT-0.jar but did not work.

--
View this message in context: http://apache-kylin.74782.x6.nabble.com/cube-query-problem-with-kylin-1-5-3-tp5484p5490.html
Sent from the Apache Kylin mailing list archive at Nabble.com.
Re: cube query problem with kylin-1.5.3
As you suggested, I checked the detailed logs of kylin-1.5.3 and found this:

==[QUERY]===
SQL: select * from KYLIN_SALES
User: ADMIN
Success: false
Duration: 0.0
Project: learn_kylin
Realization Names: [kylin_sales_cube_desc_clone_clone]
Cuboid Ids: [99]
Total scan count: 0
Result row count: 0
Accept Partial: true
Is Partial Result: false
Hit Exception Cache: false
Storage cache used: false
Message: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
==[QUERY]===

2016-08-04 10:53:11,819 ERROR [http-bio-7070-exec-10] controller.BasicController:44 :
org.apache.kylin.rest.exception.InternalErrorException: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
    at org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:224)
    at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:213)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:126)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:96)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:617)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:578)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)

--

If HBase version 0.98 works well, can I compile kylin-coprocessor-1.5.3-0.jar to match 0.98?
BTW, I previously set the coprocessor to kylin-1.5---kylin-coprocessor-1.5.0-SNAPSHOT-0.jar, but it did not work.

--
View this message in context: http://apache-kylin.74782.x6.nabble.com/cube-query-problem-with-kylin-1-5-3-tp5484p5490.html
Sent from the Apache Kylin mailing list archive at Nabble.com.
[jira] [Created] (KYLIN-1938) Document for Hive/HBase/Hadoop permission required
Billy(Yiming) Liu created KYLIN-1938:

Summary: Document for Hive/HBase/Hadoop permission required
Key: KYLIN-1938
URL: https://issues.apache.org/jira/browse/KYLIN-1938
Project: Kylin
Issue Type: Improvement
Components: Documentation
Affects Versions: v1.5.3
Reporter: Billy(Yiming) Liu
Priority: Minor

Kylin executes quite a few Hadoop commands when building a cube, such as dfs, set "mapreduce.job.reduces", set "hive.merge.mapredfiles", and more. Some commands are mandatory, and some are optional, for better performance.

Usually in a Hadoop cluster, Apache Kylin should be treated as a privileged user (instead of a normal user like an analyst) that can execute the necessary hadoop/hdfs/hbase/hive actions (like mkdir, create htable, etc.). To achieve this, the administrator needs to do some configuration and authorization. What we need to do is compose a document that lists these privileges.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Kylin Cube Performance
Hi Jason, could you please provide the full log, from the moment the query is sent until the result comes back? The key information is which cuboid is used for the query, whether it is a cuboid exact match or a fuzzy match, how many records are scanned, and how long it takes. Thanks.

2016-08-03 23:19 GMT+08:00 Jason Hale:

> Yes, it would have to do post-aggregation in that case, but the strange thing is that query was running fast (about 1 second), while queries with more dimensions, such as "SELECT SUM(clicks) FROM reporting GROUP BY site_id, child_id, report_date, hotel_id". This query will take about 106 seconds, but it shouldn't need to do any post-aggregation so I would think it should return much quicker than that from the respective cuboid.
>
> Here's the explain plan:
> OLAPToEnumerableConverter
> OLAPProjectRel(EXPR$0=[$4])
> OLAPAggregateRel(group=[{0, 1, 2, 3}], EXPR$0=[SUM($4)])
> OLAPProjectRel(SITE_ID=[$9], CHILD_ID=[$3], REPORT_DATE=[$0], HOTEL_ID=[$2], CLICKS=[$10])
> OLAPTableScan(table=[[DEFAULT, HPA_REPORTING2]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]])
>
> On Tue, Aug 2, 2016 at 7:46 PM, ShaoFeng Shi wrote:
>
> > In the cube definition, you defined "SITE_ID", "CHILD_ID" as "Mandatory" dimension, which means they will not be aggregated in cube build phase for all combinations.
> >
> > So when you run a query like "SELECT SUM(clicks) FROM reporting GROUP BY search_type", Kylin will use the combination "SITE_ID" + "CHILD_ID" + "SEARCH_TYPE" to serve, there will be post-aggregation in runtime; The performance is much dependent on the cardinality of "SITE_ID" and "CHILD_ID".
> >
> > 2016-08-02 23:08 GMT+08:00 Jason Hale :
> >
> > > I've looked over the optimization options before, but did not notice the rowkey ordering. I can try this and see if this helps me. This is the only thing I see that I can attempt to optimize further in the design, but I'll provide my cube design below.
I only have one measure to keep it > simple: > > > > > > { > > > "uuid": "4090b854-8f0c-4288-bd73-fc50238a6030", > > > "version": "1.5.2", > > > "name": "hpa_reporting2_cube", > > > "description": "", > > > "dimensions": [ > > > { > > > "name": "DEFAULT.HPA_REPORTING2.REPORT_DATE", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "REPORT_DATE", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.SEARCH_TYPE", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "SEARCH_TYPE", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.HOTEL_ID", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "HOTEL_ID", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.CHILD_ID", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "CHILD_ID", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.COUNTRY", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "COUNTRY", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.DEVICE_TYPE", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "DEVICE_TYPE", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.STAY_LENGTH", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "STAY_LENGTH", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.TRUE_RANK_AG", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "TRUE_RANK_AG", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.ROOM_BUNDLE", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "ROOM_BUNDLE", > > > "derived": null > > > }, > > > { > > > "name": "DEFAULT.HPA_REPORTING2.SITE_ID", > > > "table": "DEFAULT.HPA_REPORTING2", > > > "column": "SITE_ID", > > > "derived": null > > > } > > > ], > > > "measures": [ > > > { > > > "name": "_COUNT_", > > > "function": { > > > "expression": "COUNT", > > > "parameter": { > > > "type": "constant", > > > "value": 
"1", > > > "next_parameter": null > > > }, > > > "returntype": "bigint" > > > }, > > > "dependent_measure_ref": null > > > }, > > > { > > > "name": "CLICKS", > > > "function": { > > > "expression": "SUM", > > > "parameter": { > > > "type": "column", > > > "value": "CLICKS", > > > "next_parameter": null > > > }, > > > "returntype": "decimal" > > > }, > > > "dependent_measure_ref": null > > > } > > > ], > > > "rowkey": { > > >
cube query problem with kylin-1.5.3
Hi, when I installed kylin-1.5.3 and ran sample.sh, everything looked good. However, when I queried the resulting cube in "Insight", queries on the tables "KYLIN_CAL_DT" and "KYLIN_CATEGORY_GROUPINGS" did not show any error, but queries on the table "KYLIN_SALES" failed with this log:

"Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor"

At first, I followed the procedure given on the official Kylin website:

$KYLIN_HOME/bin/kylin.sh org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI $KYLIN_HOME/lib/kylin-coprocessor-*.jar all

But it did not work. Moreover, after building a cube from my own source data, I cannot query that cube either; it gives the exception:

"Error while executing SQL "*": Error in coprocessor"

So I need your help. Thanks.

PS: kylin-1.5.3, hadoop-2.7.0, hbase-1.0.1, hive-2.0.0

--
View this message in context: http://apache-kylin.74782.x6.nabble.com/cube-query-problem-with-kylin-1-5-3-tp5484.html
Sent from the Apache Kylin mailing list archive at Nabble.com.
Re: Kylin Cube Performance
Yes, it would have to do post-aggregation in that case, but the strange thing is that that query ran fast (about 1 second), while queries with more dimensions are slow, such as "SELECT SUM(clicks) FROM reporting GROUP BY site_id, child_id, report_date, hotel_id". This query takes about 106 seconds, but it shouldn't need any post-aggregation, so I would expect it to return much more quickly than that from the respective cuboid.

Here's the explain plan:
OLAPToEnumerableConverter
OLAPProjectRel(EXPR$0=[$4])
OLAPAggregateRel(group=[{0, 1, 2, 3}], EXPR$0=[SUM($4)])
OLAPProjectRel(SITE_ID=[$9], CHILD_ID=[$3], REPORT_DATE=[$0], HOTEL_ID=[$2], CLICKS=[$10])
OLAPTableScan(table=[[DEFAULT, HPA_REPORTING2]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]])

On Tue, Aug 2, 2016 at 7:46 PM, ShaoFeng Shi wrote:

> In the cube definition, you defined "SITE_ID", "CHILD_ID" as "Mandatory" dimension, which means they will not be aggregated in cube build phase for all combinations.
>
> So when you run a query like "SELECT SUM(clicks) FROM reporting GROUP BY search_type", Kylin will use the combination "SITE_ID" + "CHILD_ID" + "SEARCH_TYPE" to serve, there will be post-aggregation in runtime; The performance is much dependent on the cardinality of "SITE_ID" and "CHILD_ID".
>
> 2016-08-02 23:08 GMT+08:00 Jason Hale :
>
> > I've looked over the optimization options before, but did not notice the rowkey ordering. I can try this and see if this helps me. This is the only thing I see that I can attempt to optimize further in the design, but I'll provide my cube design below.
I only have one measure to keep it simple: > > > > { > > "uuid": "4090b854-8f0c-4288-bd73-fc50238a6030", > > "version": "1.5.2", > > "name": "hpa_reporting2_cube", > > "description": "", > > "dimensions": [ > > { > > "name": "DEFAULT.HPA_REPORTING2.REPORT_DATE", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "REPORT_DATE", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.SEARCH_TYPE", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "SEARCH_TYPE", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.HOTEL_ID", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "HOTEL_ID", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.CHILD_ID", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "CHILD_ID", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.COUNTRY", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "COUNTRY", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.DEVICE_TYPE", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "DEVICE_TYPE", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.STAY_LENGTH", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "STAY_LENGTH", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.TRUE_RANK_AG", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "TRUE_RANK_AG", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.ROOM_BUNDLE", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "ROOM_BUNDLE", > > "derived": null > > }, > > { > > "name": "DEFAULT.HPA_REPORTING2.SITE_ID", > > "table": "DEFAULT.HPA_REPORTING2", > > "column": "SITE_ID", > > "derived": null > > } > > ], > > "measures": [ > > { > > "name": "_COUNT_", > > "function": { > > "expression": "COUNT", > > "parameter": { > > "type": "constant", > > "value": "1", > > "next_parameter": null > > }, > > "returntype": "bigint" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "CLICKS", > > 
"function": { > > "expression": "SUM", > > "parameter": { > > "type": "column", > > "value": "CLICKS", > > "next_parameter": null > > }, > > "returntype": "decimal" > > }, > > "dependent_measure_ref": null > > } > > ], > > "rowkey": { > > "rowkey_columns": [ > > { > > "column": "REPORT_DATE", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "SEARCH_TYPE", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "HOTEL_ID", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "CHILD_ID", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "COUNTRY", > > "encoding": "dict",
Re: HCatalogFormat error
Thanks for the response, Li Yang. This was an EMR cluster, which I no longer have running; I switched to an HDP sandbox to get Kylin up and running for testing purposes. If I get a chance to spin up the EMR cluster again, I will look into this further.

To answer your question, though: it was the latest version of Kylin, 1.5.3, and I believe Hadoop 2.4 on EMR, so this could very well have been the issue. My understanding was that the 'kylin.job.mr.lib.dir' setting would distribute the jars through the Hadoop tmpjars property for Kylin to use. Is this not correct, or not available in this version?

On Tue, Aug 2, 2016 at 11:52 PM, Li Yang wrote:
> What's your Kylin version?
>
> If it is 1.5.x, your problem is detecting the right hive jar on the Kylin
> node.
>
> Check out bin/find-hive-dependency.sh. See if it returns the right hive path.
>
> On Thu, Jul 28, 2016 at 6:20 AM, Jason Hale wrote:
>
> > I have set up a Kylin instance on the master node of my Hadoop cluster. I
> > was trying on a separate client node, but had some permission issues, so to
> > simplify the test case, I've just installed it on master. Now I am getting
> > the below error.
> >
> > To correct this, I've tried the solution to distribute the jars in
> > https://issues.apache.org/jira/browse/KYLIN-1082 using
> > 'kylin.job.mr.lib.dir'.
> > I'm not sure how to append to 'kylin.hive.dependency' as I cannot find
> > information on that (perhaps I'm not looking in the right place). But the
> > lib dir setting did not help and it is still unable to find that class.
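One way to narrow this kind of problem down is to confirm which jars on the Kylin node actually contain the HCatInputFormat class before pointing any setting at them. Below is a minimal sketch in Python, not part of Kylin: the function name `find_jars_with_class` is hypothetical, and the directory to scan is an assumption you would replace with wherever your distribution keeps its Hive/HCatalog libraries.

```python
import sys
import zipfile
from pathlib import Path

# The class named in the NoClassDefFoundError, written as a jar entry path.
TARGET = "org/apache/hive/hcatalog/mapreduce/HCatInputFormat.class"


def find_jars_with_class(root, entry=TARGET):
    """Return sorted paths of all *.jar files under `root` that contain `entry`."""
    hits = []
    for jar in Path(root).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives rather than aborting the scan.
            continue
    return sorted(hits)


if __name__ == "__main__":
    # Assumption: the directory to scan is passed as the first argument;
    # it defaults to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_jars_with_class(root):
        print(path)
```

If the scan prints nothing for the directory you expected, the jar is probably missing from that path; if it does print a jar, the question becomes whether Kylin's classpath actually includes it.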
> >
> > On #2 Step Name: Extract Fact Table Distinct Columns
> >
> > Kylin executes with the following parameters:
> >
> > -conf /opt/kylin/bin/../conf/kylin_job_conf.xml -cubename Testing -output
> > /kylin/kylin_metadata/kylin-40827168-d18f-4b17-a613-3febe773ce2c/Testing/fact_distinct_columns
> > -segmentname 1970010100_2016073100 -statisticsenabled true
> > -statisticsoutput
> > /kylin/kylin_metadata/kylin-40827168-d18f-4b17-a613-3febe773ce2c/Testing/statistics
> > -statisticssamplingpercent 100 -jobname
> > Kylin_Fact_Distinct_Columns_Testing_Step
> >
> > Error Msg:
> >
> > 2016-07-27 21:54:03,387 ERROR [pool-6-thread-2] execution.AbstractExecutable:116 : error running Executable
> > java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
> >     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:81)
> >     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:111)
> >     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:91)
> >     at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:91)
> >     at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:121)
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
> >     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
> >     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.mapreduce.HCatInputFormat
> >     at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
> >     at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
> >     ... 12 more
> > 2016-07-27 21:54:03,399 INFO [pool-6-thread-2] manager.ExecutableManager:274 : job id:40827168-d18f-4b17-a613-3febe773ce2c-01 from RUNNING to ERROR
> > 2016-07-27 21:54:03,399 ERROR [pool-6-thread-2] execution.AbstractExecutable:116 : error running Executable
> > org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> >     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
> >     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at