Re: Kylin Cube Performance

2016-08-03 Thread Jason Hale
Sure, see kylin.log below:

2016-08-04 00:47:35,839 INFO  [http-bio-7070-exec-7]
controller.QueryController:175 : The original query:  SELECT SUM(clicks)
FROM hpa_reporting2 GROUP BY site_id, child_id, search_type, hotel_id,
report_date
2016-08-04 00:47:35,839 INFO  [http-bio-7070-exec-7]
service.QueryService:266 : The corrected query: SELECT SUM(clicks) FROM
hpa_reporting2 GROUP BY site_id, child_id, search_type, hotel_id,
report_date
LIMIT 5
2016-08-04 00:47:35,908 INFO  [http-bio-7070-exec-7] routing.QueryRouter:48
: The project manager's reference is
org.apache.kylin.metadata.project.ProjectManager@3a3735a5
2016-08-04 00:47:35,909 INFO  [http-bio-7070-exec-7] routing.QueryRouter:60
: Find candidates by table DEFAULT.HPA_REPORTING2 and project=KODDI_DEV :
org.apache.kylin.query.routing.Candidate@51ed1b3b
2016-08-04 00:47:35,909 INFO  [http-bio-7070-exec-7] routing.QueryRouter:49
: Applying rule: class
org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule,
realizations before: [hpa_reporting2_cube_clone(CUBE)], realizations after:
[hpa_reporting2_cube_clone(CUBE)]
2016-08-04 00:47:35,910 INFO  [http-bio-7070-exec-7] routing.QueryRouter:49
: Applying rule: class
org.apache.kylin.query.routing.rules.RealizationSortRule, realizations
before: [hpa_reporting2_cube_clone(CUBE)], realizations after:
[hpa_reporting2_cube_clone(CUBE)]
2016-08-04 00:47:35,910 INFO  [http-bio-7070-exec-7] routing.QueryRouter:72
: The realizations remaining: [hpa_reporting2_cube_clone(CUBE)] And the
final chosen one is the first one
2016-08-04 00:47:35,975 DEBUG [http-bio-7070-exec-7]
enumerator.OLAPEnumerator:107 : query storage...
2016-08-04 00:47:35,976 INFO  [http-bio-7070-exec-7]
v2.CubeStorageQuery:239 : exactAggregation is true
2016-08-04 00:47:35,976 INFO  [http-bio-7070-exec-7]
v2.CubeStorageQuery:357 : Enable limit 5
2016-08-04 00:47:35,977 DEBUG [http-bio-7070-exec-7]
v2.CubeHBaseEndpointRPC:257 : New scanner for current segment
hpa_reporting2_cube_clone[1970010100_2016082800] will use
SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
2016-08-04 00:47:35,979 DEBUG [http-bio-7070-exec-7]
v2.CubeHBaseEndpointRPC:313 : Serialized scanRequestBytes 836 bytes,
rawScanBytesString 56 bytes
2016-08-04 00:47:35,979 INFO  [http-bio-7070-exec-7]
v2.CubeHBaseEndpointRPC:315 : The scan 31b2dd4c for segment
hpa_reporting2_cube_clone[1970010100_2016082800] is as below with 1
separate raw scans, shard part of start/end key is set to 0
2016-08-04 00:47:35,980 INFO  [http-bio-7070-exec-7] v2.CubeHBaseRPC:271 :
Visiting hbase table KYLIN_RIK9O18H07: cuboid exact match, from 992 to 992
Start:
\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\x00\x00\x00\x00\x00\x00\x00\x00\x00
(\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\x00\x00\x00\x00\x00\x00\x00\x00\x00)
Stop:
 
\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00
(\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00),
No Fuzzy Key
2016-08-04 00:47:35,981 DEBUG [http-bio-7070-exec-7]
v2.CubeHBaseEndpointRPC:320 : Submitting rpc to 1 shards starting from
shard 2, scan range count 1
2016-08-04 00:47:35,981 INFO  [http-bio-7070-exec-7]
v2.CubeHBaseEndpointRPC:103 : Timeout for ExpectedSizeIterator is: 99000
2016-08-04 00:47:35,981 DEBUG [http-bio-7070-exec-7]
enumerator.OLAPEnumerator:127 : return TupleIterator...
2016-08-04 00:47:52,773 INFO  [pool-6-thread-1] v2.CubeHBaseEndpointRPC:351
:  Endpoint RPC returned from HTable
KYLIN_RIK9O18H07 Shard
\x4B\x59\x4C\x49\x4E\x5F\x52\x49\x4B\x39\x4F\x31\x38\x48\x30\x37\x2C\x00\x02\x2C\x31\x34\x37\x30\x31\x35\x35\x33\x31\x34\x39\x33\x37\x2E\x61\x33\x61\x35\x34\x37\x39\x61\x32\x63\x37\x61\x61\x64\x30\x36\x33\x66\x30\x33\x64\x63\x34\x65\x31\x30\x36\x33\x61\x33\x61\x37\x2E
on host: ip-10-0-0-157.ec2.internal.Total scanned row: 12306477. Total
filtered/aggred row: 0. Time elapsed in EP: 16562(ms). Server CPU usage:
0.24348086721950246, server physical mem left: 7.195234304E9, server swap
mem left:0.0.Etc message: start latency: 15@1,agg done@13760,compress
done@16562,server stats done@16562,
debugGitTag:cf4d2940b67d622eacd2ac9a913b221091a35c2e;.Normal Complete: true.
2016-08-04 00:47:54,068 DEBUG [pool-6-thread-1] util.CompressionUtils:67 :
Original: 46465726 bytes. Decompressed: 150553629 bytes. Time: 1294
2016-08-04 00:48:29,303 INFO  [pool-4-thread-1]
threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running,
0 ready, 12 others
2016-08-04 00:48:31,990 INFO  [http-bio-7070-exec-7]
service.QueryService:399 : Scan count for each storageContext: 12306477,
2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7]
controller.QueryController:197 : Stats of SQL response: isException: false,
duration: 56152, total scan count 12306477
2016-08-04 00:48:32,000 WARN  [http-bio-7070-exec-7]
sizeof.ObjectGraphWalker:209 : The configured limit of 1,000 object
references was reached while attempting to calculate the size of the object
graph. Severe performance degradation 
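
A rough way to read the log above is to turn its timestamps into per-phase durations. The sketch below only uses timestamps and byte counts copied from the excerpt; the phase labels are informal names chosen for illustration, not Kylin terminology.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S,%f"

# Timestamps copied from the kylin.log excerpt; labels are informal.
EVENTS = [
    ("query received", "2016-08-04 00:47:35,839"),
    ("endpoint RPC returned", "2016-08-04 00:47:52,773"),
    ("result decompressed", "2016-08-04 00:47:54,068"),
    ("response stats logged", "2016-08-04 00:48:31,991"),
]


def phase_durations(events):
    """Return (label, seconds) for each gap between consecutive events."""
    parsed = [(label, datetime.strptime(ts, FMT)) for label, ts in events]
    return [
        (f"{a_label} -> {b_label}", (b_ts - a_ts).total_seconds())
        for (a_label, a_ts), (b_label, b_ts) in zip(parsed, parsed[1:])
    ]


PHASES = phase_durations(EVENTS)
TOTAL = sum(seconds for _, seconds in PHASES)
# Byte counts from the CompressionUtils line (decompressed / original).
COMPRESSION_RATIO = 150553629 / 46465726

if __name__ == "__main__":
    for label, seconds in PHASES:
        print(f"{label:50s} {seconds:7.3f} s")
    print(f"total: {TOTAL:.3f} s, payload grows {COMPRESSION_RATIO:.2f}x when decompressed")
```

Read this way, roughly 17 seconds pass before the endpoint RPC returns and roughly 38 seconds pass on the query server afterwards, which suggests that most of the 56152 ms is not spent in the HBase scan itself.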

Re: cube query problem with kylin-1.5.3

2016-08-03 Thread Cheng Wang
 “Caused by: java.lang.NoSuchMethodError:
org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment.getRegion()Lorg/apache/hadoop/hbase/regionserver/Region;”

This shows that your current HBase is not compatible with this Kylin build. Try this: 
http://kylin.apache.org/docs15/howto/howto_update_coprocessor.html

If it still fails, you may need to upgrade your HBase or use an HDP/CDH 
distribution to make it work.
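
The NoSuchMethodError message itself says which method signature the coprocessor jar expects. A small sketch that decodes it follows; the helper name parse_no_such_method is made up for illustration, and it only handles methods that return an object type.

```python
import re

# Message text copied from the error quoted above.
ERROR = (
    "java.lang.NoSuchMethodError: "
    "org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment."
    "getRegion()Lorg/apache/hadoop/hbase/regionserver/Region;"
)


def parse_no_such_method(message):
    """Split a NoSuchMethodError message into owner class, method name,
    raw argument descriptor, and return type (object return types only)."""
    m = re.search(
        r"NoSuchMethodError:\s*([\w.$]+)\.(\w+)\((.*?)\)L([\w/$]+);", message
    )
    if m is None:
        raise ValueError("not a NoSuchMethodError with an object return type")
    owner, method, args, ret = m.groups()
    return {
        "owner": owner,
        "method": method,
        "args": args,
        "returns": ret.replace("/", "."),
    }


INFO = parse_no_such_method(ERROR)

if __name__ == "__main__":
    print(INFO)
```

The decoded return type is org.apache.hadoop.hbase.regionserver.Region, so the jar was compiled against an HBase in which getRegion() returns that type. To my knowledge the Region interface only exists from HBase 1.1 onward, which would explain a failure on hbase-1.0.1, but treat that as an assumption to verify against your HBase release.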




On 8/4/16, 12:14 PM, "王峰"  wrote:

OK, I ran SQL against the table "KYLIN_CATEGORY_GROUPINGS" and it succeeded,
like this:
==[QUERY]===
SQL: select * from   KYLIN_CATEGORY_GROUPINGS
User: ADMIN
Success: true
Duration: 0.026
Project: learn_kylin
Realization Names: [kylin_sales_cube_desc_clone_clone]
Cuboid Ids: []
Total scan count: 0
Result row count: 144
Accept Partial: true
Is Partial Result: false
Hit Exception Cache: false
Storage cache used: false
Message: null
==[QUERY]===

2. select * from KYLIN_SALES fails:


2016-08-04 12:02:00,604 INFO  [http-bio-7070-exec-5]
controller.QueryController:174 : Using project: learn_kylin
2016-08-04 12:02:00,604 INFO  [http-bio-7070-exec-5]
controller.QueryController:175 : The original query:  select * from
KYLIN_SALES
2016-08-04 12:02:00,605 INFO  [http-bio-7070-exec-5]
service.QueryService:269 : The corrected query: select * from   KYLIN_SALES
LIMIT 5
2016-08-04 12:02:00,623 INFO  [http-bio-7070-exec-5] routing.QueryRouter:48
: The project manager's reference is
org.apache.kylin.metadata.project.ProjectManager@1a9b1372
2016-08-04 12:02:00,623 INFO  [http-bio-7070-exec-5] routing.QueryRouter:60
: Find candidates by table DEFAULT.KYLIN_SALES and project=LEARN_KYLIN :
org.apache.kylin.query.routing.Candidate@467295b7
2016-08-04 12:02:00,623 INFO  [http-bio-7070-exec-5] routing.QueryRouter:49
: Applying rule: class
org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule,
realizations before: [kylin_sales_cube_desc_clone_clone(CUBE)],
realizations after: [kylin_sales_cube_desc_clone_clone(CUBE)]
2016-08-04 12:02:00,623 INFO  [http-bio-7070-exec-5] routing.QueryRouter:49
: Applying rule: class
org.apache.kylin.query.routing.rules.RealizationSortRule, realizations
before: [kylin_sales_cube_desc_clone_clone(CUBE)], realizations after:
[kylin_sales_cube_desc_clone_clone(CUBE)]
2016-08-04 12:02:00,623 INFO  [http-bio-7070-exec-5] routing.QueryRouter:72
: The realizations remaining: [kylin_sales_cube_desc_clone_clone(CUBE)] And
the final chosen one is the first one
2016-08-04 12:02:00,628 DEBUG [http-bio-7070-exec-5]
enumerator.OLAPEnumerator:105 : query storage...
2016-08-04 12:02:00,628 INFO  [http-bio-7070-exec-5]
enumerator.OLAPEnumerator:170 : No group by and aggregation found in this
query, will hack some result for better look of output...
2016-08-04 12:02:00,628 WARN  [http-bio-7070-exec-5]
enumerator.OLAPEnumerator:205 : SUM is not defined for measure column
DEFAULT.KYLIN_SALES.SELLER_ID, output will be meaningless.
2016-08-04 12:02:00,629 INFO  [http-bio-7070-exec-5]
gtrecord.GTCubeStorageQueryBase:247 : exactAggregation is true
2016-08-04 12:02:00,629 INFO  [http-bio-7070-exec-5]
gtrecord.GTCubeStorageQueryBase:365 : Enable limit 5
2016-08-04 12:02:00,629 DEBUG [http-bio-7070-exec-5]
v2.CubeHBaseEndpointRPC:271 : New scanner for current segment
kylin_sales_cube_desc_clone_clone[2012010100_2016080100] will use
SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
2016-08-04 12:02:00,630 DEBUG [http-bio-7070-exec-5]
v2.CubeHBaseEndpointRPC:327 : Serialized scanRequestBytes 829 bytes,
rawScanBytesString 72 bytes
2016-08-04 12:02:00,630 INFO  [http-bio-7070-exec-5]
v2.CubeHBaseEndpointRPC:329 : The scan 324ef549 for segment
kylin_sales_cube_desc_clone_clone[2012010100_2016080100] is as
below with 1 separate raw scans, shard part of start/end key is set to 0
2016-08-04 12:02:00,630 INFO  [http-bio-7070-exec-5] v2.CubeHBaseRPC:271 :
Visiting hbase table KYLIN_O3IJKXOPI6: cuboid exact match, from 99 to 99
Start:
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x63\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
(\x00\x00\x00\x00\x00\x00\x00\x00\x00c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00)
Stop:
 
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x63\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00
(\x00\x00\x00\x00\x00\x00\x00\x00\x00c\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00),
No Fuzzy Key
2016-08-04 12:02:00,630 DEBUG [http-bio-7070-exec-5]
v2.CubeHBaseEndpointRPC:334 : Submitting rpc to 1 shards starting from
shard 0, scan range count 1
2016-08-04 12:02:00,663 INFO  [http-bio-7070-exec-5]
v2.CubeHBaseEndpointRPC:110 : Timeout for ExpectedSizeIterator is: 99
2016-08-04 12:02:00,663 DEBUG [http-bio-7070-exec-5]

Re: cube query problem with kylin-1.5.3

2016-08-03 Thread 王峰
Sorry, I did not describe it correctly: my Kylin version is
apache-kylin-1.5.3-HBase1.x-bin.
I did not use apache-kylin-1.5.3-bin.tar.gz.

2016-08-04 11:37 GMT+08:00 Cheng Wang :

> I meant that you need to check your Kylin version. According to your
> environment, try this version:
> http://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-1.5.3/apache-kylin-1.5.3-HBase1.x-bin.tar.gz


Re: cube query problem with kylin-1.5.3

2016-08-03 Thread Cheng Wang
I meant that you need to check your Kylin version. According to your 
environment, try this version: 
http://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-1.5.3/apache-kylin-1.5.3-HBase1.x-bin.tar.gz

Re: cube query problem with kylin-1.5.3

2016-08-03 Thread wangfeng
As you suggested, I checked the detailed logs of kylin-1.5.3 and found the following:
==[QUERY]===
SQL: select * from KYLIN_SALES 
User: ADMIN
Success: false
Duration: 0.0
Project: learn_kylin
Realization Names: [kylin_sales_cube_desc_clone_clone]
Cuboid Ids: [99]
Total scan count: 0
Result row count: 0
Accept Partial: true
Is Partial Result: false
Hit Exception Cache: false
Storage cache used: false
Message: Error while executing SQL "select * from KYLIN_SALES LIMIT 5":
Error in coprocessor
==[QUERY]===

2016-08-04 10:53:11,819 ERROR [http-bio-7070-exec-10] controller.BasicController:44 :
org.apache.kylin.rest.exception.InternalErrorException: Error while executing SQL "select * from KYLIN_SALES LIMIT 5": Error in coprocessor
    at org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:224)
    at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:213)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:126)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:96)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:617)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:578)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
--

If HBase 0.98 works well, can I compile kylin-coprocessor-1.5.3-0.jar
against 0.98?
By the way, I previously set the coprocessor to
kylin-1.5---kylin-coprocessor-1.5.0-SNAPSHOT-0.jar, but that did not work.



--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/cube-query-problem-with-kylin-1-5-3-tp5484p5489.html
Sent from the Apache Kylin mailing list archive at Nabble.com.


[jira] [Created] (KYLIN-1938) Document for Hive/HBase/Hadoop permission required

2016-08-03 Thread Billy(Yiming) Liu (JIRA)
Billy(Yiming) Liu created KYLIN-1938:


 Summary: Document for Hive/HBase/Hadoop permission required
 Key: KYLIN-1938
 URL: https://issues.apache.org/jira/browse/KYLIN-1938
 Project: Kylin
  Issue Type: Improvement
  Components: Documentation
Affects Versions: v1.5.3
Reporter: Billy(Yiming) Liu
Priority: Minor


Kylin executes quite a few Hadoop commands when building a cube, such as dfs, 
set "mapreduce.job.reduces", set "hive.merge.mapredfiles" and more. Some 
commands are mandatory, and some are optional for better performance. 

Usually in a Hadoop cluster, Apache Kylin should be treated as a
privileged user (instead of a normal user like an analyst) that can execute
the necessary hadoop/hdfs/hbase/hive actions (like mkdir, create htable, etc.).
To achieve this, the administrator needs to do some configuration and
authorization.

What we need to do is compose a document that lists these privileges.





Re: Kylin Cube Performance

2016-08-03 Thread ShaoFeng Shi
Hi Jason, could you please provide the full log, from the moment the query is
sent until the result comes back? The key information is which cuboid is used
for the query, whether it is a cuboid exact match or a fuzzy match, how many
records are scanned, and how long it takes. Thanks.
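
The fields asked for here can be pulled out of a kylin.log excerpt with a few regular expressions. This is only a sketch: the patterns are based on the log lines posted in this thread (Kylin 1.5.x), the function name summarize_query_log is made up, and other Kylin versions may word these lines differently.

```python
import re

# Lines copied from the kylin.log posted in this thread, still mail-wrapped.
SAMPLE = """
2016-08-04 00:47:35,910 INFO  [http-bio-7070-exec-7] routing.QueryRouter:72
: The realizations remaining: [hpa_reporting2_cube_clone(CUBE)] And the
final chosen one is the first one
2016-08-04 00:47:35,980 INFO  [http-bio-7070-exec-7] v2.CubeHBaseRPC:271 :
Visiting hbase table KYLIN_RIK9O18H07: cuboid exact match, from 992 to 992
on host: ip-10-0-0-157.ec2.internal.Total scanned row: 12306477. Total
filtered/aggred row: 0. Time elapsed in EP: 16562(ms).
2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7]
controller.QueryController:197 : Stats of SQL response: isException: false,
duration: 56152, total scan count 12306477
"""


def summarize_query_log(log_text):
    """Extract cuboid, scan, and timing facts from a kylin.log excerpt."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    text = " ".join(log_text.split())

    def first_int(pattern):
        m = re.search(pattern, text)
        return int(m.group(1)) if m else None

    cube = re.search(r"realizations remaining: \[([^\]]+)\]", text)
    cuboid = re.search(r"cuboid (\w+) match, from (\d+) to (\d+)", text)
    return {
        "realization": cube.group(1) if cube else None,
        "match_type": cuboid.group(1) if cuboid else None,
        "cuboid_from": int(cuboid.group(2)) if cuboid else None,
        "cuboid_to": int(cuboid.group(3)) if cuboid else None,
        "scanned_rows": first_int(r"Total scanned row: (\d+)"),
        "endpoint_ms": first_int(r"Time elapsed in EP: (\d+)\(ms\)"),
        "duration_ms": first_int(r"duration: (\d+)"),
    }


SUMMARY = summarize_query_log(SAMPLE)

if __name__ == "__main__":
    for key, value in SUMMARY.items():
        print(f"{key}: {value}")
```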

cube query problem with kylin-1.5.3

2016-08-03 Thread wangfeng
Hi, when I installed kylin-1.5.3 and ran sample.sh, everything looked
good. However, when I queried the resulting cube in "Insight", SQL against
the tables "KYLIN_CAL_DT" and "KYLIN_CATEGORY_GROUPINGS" did not show any
error, but queries against the table "KYLIN_SALES" could not run and the log
showed: "Error while executing SQL "select * from KYLIN_SALES LIMIT 5":
Error in coprocessor"

First, I followed the procedure given on the official Kylin website:
$KYLIN_HOME/bin/kylin.sh org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI $KYLIN_HOME/lib/kylin-coprocessor-*.jar all
but it did not work.

Also, after building a cube from my own source data, I cannot query that
cube either; it gives the exception: "Error while executing SQL "*": Error
in coprocessor"

So I need your help. Thanks.

PS:
kylin-1.5.3
hadoop-2.7.0
hbase-1.0.1
hive-2.0.0





--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/cube-query-problem-with-kylin-1-5-3-tp5484.html
Sent from the Apache Kylin mailing list archive at Nabble.com.


Re: Kylin Cube Performance

2016-08-03 Thread Jason Hale
Yes, it would have to do post-aggregation in that case, but the strange
thing is that that query ran fast (about 1 second), while queries with
more dimensions are slow, for example "SELECT SUM(clicks) FROM reporting
GROUP BY site_id, child_id, report_date, hotel_id". This query takes about
106 seconds, but it shouldn't need any post-aggregation, so I would expect
it to return much more quickly from the respective cuboid.

Here's the explain plan:
OLAPToEnumerableConverter
OLAPProjectRel(EXPR$0=[$4])
OLAPAggregateRel(group=[{0, 1, 2, 3}], EXPR$0=[SUM($4)])
OLAPProjectRel(SITE_ID=[$9], CHILD_ID=[$3], REPORT_DATE=[$0],
HOTEL_ID=[$2], CLICKS=[$10])
OLAPTableScan(table=[[DEFAULT, HPA_REPORTING2]], fields=[[0, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11]])
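
For reference, the relationship between a query's GROUP BY columns, mandatory dimensions, and the cuboid id seen in kylin.log can be sketched as below. This is a simplified model under stated assumptions: the rowkey order in ROWKEY is hypothetical (the definition of the cloned cube is not shown in this thread), the first rowkey column is taken to be the highest bit of the cuboid id, and aggregation groups, hierarchies, and joint dimensions are ignored.

```python
# Hypothetical rowkey order for illustration only; the real order of the
# cloned cube is not shown in this thread.
ROWKEY = [
    "SITE_ID", "CHILD_ID", "REPORT_DATE", "HOTEL_ID", "SEARCH_TYPE",
    "COUNTRY", "DEVICE_TYPE", "STAY_LENGTH", "TRUE_RANK_AG", "ROOM_BUNDLE",
]
MANDATORY = {"SITE_ID", "CHILD_ID"}


def cuboid_id(rowkey_columns, dims):
    """Bitmask of dims; the first rowkey column maps to the highest bit."""
    n = len(rowkey_columns)
    mask = 0
    for i, col in enumerate(rowkey_columns):
        if col in dims:
            mask |= 1 << (n - 1 - i)
    return mask


def serving_dims(query_dims, mandatory):
    """Dimensions of the smallest cuboid able to answer the query when
    every cuboid must contain the mandatory dimensions."""
    return set(query_dims) | set(mandatory)


def needs_post_aggregation(query_dims, mandatory):
    """True when the serving cuboid has more dimensions than the query."""
    return serving_dims(query_dims, mandatory) != set(query_dims)


FIVE_DIMS = {"SITE_ID", "CHILD_ID", "SEARCH_TYPE", "HOTEL_ID", "REPORT_DATE"}
FOUR_DIMS = {"SITE_ID", "CHILD_ID", "REPORT_DATE", "HOTEL_ID"}

if __name__ == "__main__":
    print(cuboid_id(ROWKEY, FIVE_DIMS), bin(cuboid_id(ROWKEY, FIVE_DIMS)))
    print(needs_post_aggregation({"SEARCH_TYPE"}, MANDATORY))
    print(needs_post_aggregation(FOUR_DIMS, MANDATORY))
```

Under these assumptions, a GROUP BY on search_type alone is served by the SITE_ID + CHILD_ID + SEARCH_TYPE cuboid and needs post-aggregation, while the four-dimension query already contains both mandatory dimensions and maps to an exact cuboid.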

On Tue, Aug 2, 2016 at 7:46 PM, ShaoFeng Shi  wrote:

> In the cube definition, you defined "SITE_ID" and "CHILD_ID" as "Mandatory"
> dimensions, which means they will not be aggregated away in any combination
> during the cube build phase.
>
> So when you run a query like "SELECT SUM(clicks) FROM reporting GROUP BY
> search_type", Kylin will use the combination "SITE_ID" + "CHILD_ID" +
> "SEARCH_TYPE" to serve it, and there will be post-aggregation at runtime. The
> performance depends heavily on the cardinality of "SITE_ID" and "CHILD_ID".
>
>
> 2016-08-02 23:08 GMT+08:00 Jason Hale :
>
> > I've looked over the optimization options before, but did not notice the
> > rowkey ordering. I can try this and see if this helps me. This is the only
> > thing I see that I can attempt to optimize further in the design, but I'll
> > provide my cube design below. I only have one measure to keep it simple:
> >
> > {
> >   "uuid": "4090b854-8f0c-4288-bd73-fc50238a6030",
> >   "version": "1.5.2",
> >   "name": "hpa_reporting2_cube",
> >   "description": "",
> >   "dimensions": [
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.REPORT_DATE",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "REPORT_DATE",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.SEARCH_TYPE",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "SEARCH_TYPE",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.HOTEL_ID",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "HOTEL_ID",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.CHILD_ID",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "CHILD_ID",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.COUNTRY",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "COUNTRY",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.DEVICE_TYPE",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "DEVICE_TYPE",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.STAY_LENGTH",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "STAY_LENGTH",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.TRUE_RANK_AG",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "TRUE_RANK_AG",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.ROOM_BUNDLE",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "ROOM_BUNDLE",
> >       "derived": null
> >     },
> >     {
> >       "name": "DEFAULT.HPA_REPORTING2.SITE_ID",
> >       "table": "DEFAULT.HPA_REPORTING2",
> >       "column": "SITE_ID",
> >       "derived": null
> >     }
> >   ],
> >   "measures": [
> >     {
> >       "name": "_COUNT_",
> >       "function": {
> >         "expression": "COUNT",
> >         "parameter": {
> >           "type": "constant",
> >           "value": "1",
> >           "next_parameter": null
> >         },
> >         "returntype": "bigint"
> >       },
> >       "dependent_measure_ref": null
> >     },
> >     {
> >       "name": "CLICKS",
> >       "function": {
> >         "expression": "SUM",
> >         "parameter": {
> >           "type": "column",
> >           "value": "CLICKS",
> >           "next_parameter": null
> >         },
> >         "returntype": "decimal"
> >       },
> >       "dependent_measure_ref": null
> >     }
> >   ],
> >   "rowkey": {
> >     "rowkey_columns": [
> >       {
> >         "column": "REPORT_DATE",
> >         "encoding": "dict",
> >         "isShardBy": false
> >       },
> >       {
> >         "column": "SEARCH_TYPE",
> >         "encoding": "dict",
> >         "isShardBy": false
> >       },
> >       {
> >         "column": "HOTEL_ID",
> >         "encoding": "dict",
> >         "isShardBy": false
> >       },
> >       {
> >         "column": "CHILD_ID",
> >         "encoding": "dict",
> >         "isShardBy": false
> >       },
> >       {
> >         "column": "COUNTRY",
> >         "encoding": "dict",
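
As an illustration of the rowkey-ordering idea mentioned in the quoted
message, here is a minimal Python sketch that rearranges the
'rowkey_columns' list of a cube descriptor. The helper name
'reorder_rowkey' and the example ordering are hypothetical. Which ordering
actually helps depends on the queries and on column cardinalities, and the
sketch does not try to decide that. It only edits an in-memory dict and
does not talk to Kylin.

```python
from copy import deepcopy


def reorder_rowkey(cube_desc, preferred_order):
    """Return a copy of cube_desc with rowkey_columns rearranged.

    Columns named in preferred_order come first, in that order. Any
    remaining columns keep their original relative order.
    """
    desc = deepcopy(cube_desc)
    columns = desc["rowkey"]["rowkey_columns"]
    rank = {name: i for i, name in enumerate(preferred_order)}
    # sorted() is stable, so unlisted columns stay in their original order.
    desc["rowkey"]["rowkey_columns"] = sorted(
        columns, key=lambda c: rank.get(c["column"], len(rank))
    )
    return desc


# Abbreviated descriptor: only the first four rowkey columns quoted above.
cube = {"rowkey": {"rowkey_columns": [
    {"column": name, "encoding": "dict", "isShardBy": False}
    for name in ["REPORT_DATE", "SEARCH_TYPE", "HOTEL_ID", "CHILD_ID"]
]}}

# Hypothetical ordering, chosen only to illustrate the call.
reordered = reorder_rowkey(cube, ["HOTEL_ID", "REPORT_DATE"])
print([c["column"] for c in reordered["rowkey"]["rowkey_columns"]])
```

The original descriptor is left untouched, so the two orderings can be
compared side by side before anything is changed in the cube design.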

Re: HCatalogFormat error

2016-08-03 Thread Jason Hale
Thanks for the response, Li Yang. This was an EMR cluster, which I don't have
running now. I switched to an HDP sandbox to get things up and running for
testing purposes. If I get a chance to spin up the EMR cluster again, I will
look into this further.

To answer your question, though: it was the latest version of Kylin, 1.5.3,
and I believe Hadoop 2.4 on EMR, so this could very well have been the
issue. My understanding was that the 'kylin.job.mr.lib.dir' setting would
distribute the jars through the Hadoop tmpjars property for Kylin to use.
Is this not correct, or not available in this version?
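
For anyone hitting the same NoClassDefFoundError, one way to check whether
a given dependency path really includes the HCatalog class is to scan its
jars. The following is a minimal Python sketch, not part of Kylin: the
function name 'jars_containing' is made up for illustration, and it assumes
the path list is a plain string of entries joined by the platform path
separator (':' on Linux). It makes no assumption about the exact output of
bin/find-hive-dependency.sh, so the string to check has to be extracted
from that output by hand.

```python
import os
import zipfile

# Class file named in the stack trace, as it would appear inside a jar.
CLASS_ENTRY = "org/apache/hive/hcatalog/mapreduce/HCatInputFormat.class"


def jars_containing(classpath, entry=CLASS_ENTRY):
    """Return the jars on a path-separator-joined classpath that hold entry."""
    hits = []
    for path in classpath.split(os.pathsep):
        if not path.endswith(".jar") or not os.path.isfile(path):
            continue  # skip directories, wildcards, and missing files
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    hits.append(path)
        except zipfile.BadZipFile:
            continue  # not a readable jar, ignore it
    return hits
```

An empty result for the path Kylin was given would be consistent with the
ClassNotFoundException in the trace below, while a non-empty result would
suggest the jar exists but is not reaching the Kylin web application's
class loader.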

On Tue, Aug 2, 2016 at 11:52 PM, Li Yang  wrote:

> What's your Kylin version?
>
> If it is 1.5.x, your problem is detecting the right Hive jar on the Kylin
> node.
>
> Check out bin/find-hive-dependency.sh and see whether it returns the right
> Hive path.
>
> On Thu, Jul 28, 2016 at 6:20 AM, Jason Hale  wrote:
>
> > I have set up a Kylin instance on the master node of my Hadoop cluster. I
> > was trying on a separate client node, but had some permission issues, so to
> > simplify the test case, I've just installed it on master. Now I am getting
> > the error below.
> >
> > To correct this, I've tried the solution in
> > https://issues.apache.org/jira/browse/KYLIN-1082 of distributing the jars
> > using 'kylin.job.mr.lib.dir'. I'm not sure how to append to
> > 'kylin.hive.dependency', as I cannot find information on that (perhaps I'm
> > not looking in the right place). But the lib dir setting did not help, and
> > Kylin is still unable to find that class.
> >
> >
> > On #2 Step Name: Extract Fact Table Distinct Columns
> >
> > Kylin executes with the following parameters:
> >
> > -conf /opt/kylin/bin/../conf/kylin_job_conf.xml -cubename Testing -output
> > /kylin/kylin_metadata/kylin-40827168-d18f-4b17-a613-3febe773ce2c/Testing/fact_distinct_columns
> > -segmentname 1970010100_2016073100 -statisticsenabled true
> > -statisticsoutput
> > /kylin/kylin_metadata/kylin-40827168-d18f-4b17-a613-3febe773ce2c/Testing/statistics
> > -statisticssamplingpercent 100 -jobname
> > Kylin_Fact_Distinct_Columns_Testing_Step
> >
> > Error Msg:
> >
> > 2016-07-27 21:54:03,387 ERROR [pool-6-thread-2] execution.AbstractExecutable:116 : error running Executable
> > java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
> >     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:81)
> >     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:111)
> >     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:91)
> >     at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:91)
> >     at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:121)
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
> >     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
> >     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.mapreduce.HCatInputFormat
> >     at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
> >     at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
> >     ... 12 more
> > 2016-07-27 21:54:03,399 INFO  [pool-6-thread-2] manager.ExecutableManager:274 : job id:40827168-d18f-4b17-a613-3febe773ce2c-01 from RUNNING to ERROR
> > 2016-07-27 21:54:03,399 ERROR [pool-6-thread-2] execution.AbstractExecutable:116 : error running Executable
> > org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> >     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
> >     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > at