My best guess is an out-of-memory crash. Returning 30K HLL counters may take 200 MB to 2 GB of memory, depending on the count-distinct precision. I suggest increasing the Kylin JVM heap and trying again.
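To make the estimate concrete, here is a rough back-of-the-envelope sketch. It assumes each HLL counter is held in dense form with 2^p one-byte registers, where p is the precision; the function name and the list of precisions are only illustrative, and real memory use also depends on object overhead and on how the counters are serialized.

```python
def hll_dense_bytes(num_counters: int, precision: int) -> int:
    """Approximate bytes needed to hold `num_counters` dense HLL counters.

    Assumption: one counter = 2**precision registers, one byte per register.
    """
    registers_per_counter = 1 << precision
    return num_counters * registers_per_counter


if __name__ == "__main__":
    # About 30,000 distinct app_name values means about 30,000 counters,
    # one per group, all held in memory before ORDER BY / LIMIT is applied.
    for p in (10, 12, 14, 15, 16):
        total = hll_dense_bytes(30_000, p)
        print(f"precision {p}: {total / 1e6:,.0f} MB")
```

Under these assumptions, precision 16 comes to roughly 1.97 GB for 30,000 groups, which matches the upper end of the range above, while lower precisions need proportionally less.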

On Tue, May 17, 2016 at 10:16 AM, jyzheng <jyzh...@iflytek.com> wrote:

> 1. The cardinality of app_name is about 30,000.
>
> 2. Yes, Kylin crashed after the query failed.
>
> 3. The relevant part of kylin.log is given below.
>
> Thanks for your attention!
>
> 2016-05-17 10:13:51,531 INFO [pool-4-thread-1] threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running, 0 ready, 17 others
> 2016-05-17 10:14:20,031 DEBUG [http-bio-7070-exec-5] service.AdminService:90 : Get Kylin Runtime Config
> 2016-05-17 10:14:20,037 DEBUG [http-bio-7070-exec-8] controller.UserController:64 : authentication.getPrincipal() is org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-05-17 10:14:20,321 DEBUG [http-bio-7070-exec-5] controller.ProjectController:97 : authentication.getPrincipal() is org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-05-17 10:14:25,467 DEBUG [http-bio-7070-exec-5] controller.UserController:64 : authentication.getPrincipal() is org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-05-17 10:14:25,480 DEBUG [http-bio-7070-exec-5] controller.UserController:64 : authentication.getPrincipal() is org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-05-17 10:14:25,610 INFO [http-bio-7070-exec-1] controller.TableController:89 : Return all table metadata in 1 seconds
> 2016-05-17 10:14:28,341 DEBUG [http-bio-7070-exec-5] controller.UserController:64 : authentication.getPrincipal() is org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-05-17 10:14:28,570 DEBUG [http-bio-7070-exec-5] service.QueryService:293 : getting table metas
> 2016-05-17 10:14:28,574 DEBUG [http-bio-7070-exec-5] service.QueryService:311 : getting column metas
> 2016-05-17 10:14:28,601 DEBUG [http-bio-7070-exec-5] service.QueryService:325 : done column metas
> 2016-05-17 10:14:51,542 INFO [pool-4-thread-1] threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running, 0 ready, 17 others
> 2016-05-17 10:14:52,457 INFO [http-bio-7070-exec-10] controller.QueryController:175 : Using project: compass
> 2016-05-17 10:14:52,458 INFO [http-bio-7070-exec-10] controller.QueryController:176 : The original query: select app_name, count(distinct uid) as uv
> from sdk_log
> left join week_calendar on sdk_log.day_time = week_calendar.week_cal
> where sdk_log.day_time = date '2016-05-08'
> group by app_name
> order by uv desc
> limit 200
> 2016-05-17 10:14:52,471 INFO [http-bio-7070-exec-10] service.QueryService:269 : The corrected query: select app_name, count(distinct uid) as uv
> from sdk_log
> left join week_calendar on sdk_log.day_time = week_calendar.week_cal
> where sdk_log.day_time = date '2016-05-08'
> group by app_name
> order by uv desc
> limit 200
> 2016-05-17 10:14:53,938 INFO [http-bio-7070-exec-10] routing.QueryRouter:48 : The project manager's reference is org.apache.kylin.metadata.project.ProjectManager@58baceac
> 2016-05-17 10:14:53,941 INFO [http-bio-7070-exec-10] routing.QueryRouter:60 : Find candidates by table DEFAULT.SDK_LOG and project=COMPASS : org.apache.kylin.query.routing.Candidate@70f24f62
> 2016-05-17 10:14:53,947 INFO [http-bio-7070-exec-10] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule, realizations before: [sdk_cube(CUBE)], realizations after: [sdk_cube(CUBE)]
> 2016-05-17 10:14:53,948 INFO [http-bio-7070-exec-10] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RealizationSortRule, realizations before: [sdk_cube(CUBE)], realizations after: [sdk_cube(CUBE)]
> 2016-05-17 10:14:53,948 INFO [http-bio-7070-exec-10] routing.QueryRouter:72 : The realizations remaining: [sdk_cube(CUBE)] And the final chosen one is the first one
> 2016-05-17 10:14:54,662 DEBUG [http-bio-7070-exec-10] enumerator.OLAPEnumerator:107 : query storage...
> 2016-05-17 10:14:54,749 INFO [http-bio-7070-exec-10] cache.AbstractCacheFledgedQuery:85 : Cache for 95f6d938-b974-4a92-b858-c0522faa7ff4 initializing...
> 2016-05-17 10:14:54,797 INFO [http-bio-7070-exec-10] cache.CacheFledgedStaticQuery:58 : no existing cache to use
> 2016-05-17 10:14:54,802 INFO [http-bio-7070-exec-10] v2.CubeStorageQuery:252 : exactAggregation is true
> 2016-05-17 10:14:54,812 INFO [http-bio-7070-exec-10] v2.CubeStorageQuery:358 : Memory budget is set to: 49146
> 2016-05-17 10:14:54,834 INFO [http-bio-7070-exec-10] dict.DictionaryManager:401 : DictionaryManager(633099583) loading DictionaryInfo(loadDictObj:true) at /dict/DEFAULT.SDK_LOG/APP_NAME/21980b9c-a872-4fb6-aef4-3a96ed2ef1c8.dict
> 2016-05-17 10:14:54,959 INFO [http-bio-7070-exec-10] dict.DictionaryManager:401 : DictionaryManager(633099583) loading DictionaryInfo(loadDictObj:true) at /dict/DEFAULT.SDK_LOG/DAY_TIME/4b6b9a89-8666-440f-94ce-cf3b2d68fc44.dict
> 2016-05-17 10:14:55,037 DEBUG [http-bio-7070-exec-10] v2.CubeHBaseEndpointRPC:248 : New scanner for current segment sdk_cube[20160502000000_20170103000000] will use SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
> 2016-05-17 10:14:55,074 DEBUG [http-bio-7070-exec-10] v2.CubeHBaseEndpointRPC:283 : Serialized scanRequestBytes 216 bytes, rawScanBytesString 81 bytes
> 2016-05-17 10:14:55,075 INFO [http-bio-7070-exec-10] v2.CubeHBaseEndpointRPC:286 : The scan(s) info for current segment is as below, shard part of start/end key is set to 0
> 2016-05-17 10:14:55,082 INFO [http-bio-7070-exec-10] v2.CubeHBaseRPC:309 : Visiting hbase table KYLIN_F6NPB6IA0Z: cuboid exact match, from 5 to 5 Start: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x0B\x3C\xCB (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x0B<\xCB) Stop: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\xFF\xFF\xFF\x0B\x3C\xCB\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\xFF\xFF\xFF\x0B<\xCB\x00) Fuzzy key counts: 1. Fuzzy keys : \x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x0B\x3C\xCB \x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x01\x00\x00\x00;
> 2016-05-17 10:14:55,083 DEBUG [http-bio-7070-exec-10] v2.CubeHBaseEndpointRPC:292 : Submitting rpc to 2 shards starting from shard 1, scan requests count 1
> 2016-05-17 10:14:55,172 INFO [http-bio-7070-exec-10] v2.CubeHBaseEndpointRPC:123 : Timeout for ExpectedSizeIterator is 60000
> 2016-05-17 10:14:55,189 DEBUG [http-bio-7070-exec-10] enumerator.OLAPEnumerator:127 : return TupleIterator...
> 2016-05-17 10:14:55,190 DEBUG [http-bio-7070-exec-10] enumerator.OLAPEnumerator:128 : Storage cache used for this storage query:null
> 2016-05-17 10:15:05,367 INFO [pool-6-thread-2] v2.CubeHBaseEndpointRPC:320 : <spawned by http-bio-7070-exec-10>Endpoint RPC returned from HTable KYLIN_F6NPB6IA0Z Shard \x4B\x59\x4C\x49\x4E\x5F\x46\x36\x4E\x50\x42\x36\x49\x41\x30\x5A\x2C\x2C\x31\x34\x36\x33\x32\x32\x35\x31\x30\x33\x35\x35\x38\x2E\x32\x32\x37\x39\x34\x66\x30\x37\x39\x34\x37\x63\x39\x34\x36\x30\x34\x64\x33\x66\x66\x36\x33\x34\x31\x62\x36\x34\x65\x34\x30\x33\x2E on host: hfa-pro0043.hadoop.cpcc.iflyyun.cn.Total scanned row: 128455. Total filtered/aggred row: 0. Time elapsed in EP: 9379(ms). Server CPU usage: 0.6528155468594548, server physical mem left: 1.5867633664E10, server swap mem left:1.00663287808E11.Etc message: 1,1183,9378,9379,.
> 2016-05-17 10:15:06,390 DEBUG [pool-6-thread-2] util.CompressionUtils:65 : Original: 36540710 bytes. Decompressed: 54563500 bytes
> 2016-05-17 10:15:06,401 INFO [http-bio-7070-exec-10] lookup.SnapshotManager:178 : Loading snapshotTable from /table_snapshot/week_calendar/95bd13bd-4283-4acc-8335-22e94481b1f8.snapshot, with loadData: true
> 2016-05-17 10:15:06,436 DEBUG [http-bio-7070-exec-10] lookup.SnapshotManager:184 : Loaded snapshot at /table_snapshot/week_calendar/95bd13bd-4283-4acc-8335-22e94481b1f8.snapshot
> 2016-05-17 10:15:08,057 INFO [pool-6-thread-1] v2.CubeHBaseEndpointRPC:320 : <spawned by http-bio-7070-exec-10>Endpoint RPC returned from HTable KYLIN_F6NPB6IA0Z Shard \x4B\x59\x4C\x49\x4E\x5F\x46\x36\x4E\x50\x42\x36\x49\x41\x30\x5A\x2C\x00\x01\x2C\x31\x34\x36\x33\x32\x32\x35\x31\x30\x33\x35\x35\x38\x2E\x37\x65\x39\x63\x38\x30\x37\x33\x31\x62\x33\x34\x63\x62\x63\x32\x36\x35\x33\x34\x65\x32\x39\x64\x65\x35\x38\x39\x64\x36\x38\x66\x2E on host: hfa-pro0042.hadoop.cpcc.iflyyun.cn.Total scanned row: 127642. Total filtered/aggred row: 0. Time elapsed in EP: 11935(ms). Server CPU usage: 0.9497198437438346, server physical mem left: 1.416450048E9, server swap mem left:1.00663287808E11.Etc message: 1,1278,11935,11935,.
> 2016-05-17 10:15:09,453 DEBUG [pool-6-thread-1] util.CompressionUtils:65 : Original: 35193219 bytes. Decompressed: 53229962 bytes
>
> *From:* user-return-560-jyzheng=iflytek....@kylin.apache.org [mailto:user-return-560-jyzheng=iflytek....@kylin.apache.org] *On behalf of* Li Yang
> *Sent:* May 15, 2016 19:32
> *To:* u...@kylin.apache.org
> *Cc:* dev@kylin.apache.org
> *Subject:* Re: kylin query failed
>
> How high is the cardinality? When you say it "failed" and you had to restart, do you mean it crashed? Any clue in kylin.log?
>
> Sorry for the many questions, but it's hard to help without the details. The new 1.5.2 release will come with a diagnosis tool that can extract the necessary info into a zip, which you can share with the community for diagnosis.
>
> On Mon, May 9, 2016 at 5:12 PM, jyzheng <jyzh...@iflytek.com> wrote:
>
> When I send a SQL query to Kylin like this:
>
> select app_name, count(distinct uid) as uv
> from sdk_log
> left join week_calendar on sdk_log.day_time = week_calendar.week_cal
> where sdk_log.day_time = date '2016-05-01'
> group by app_name
> order by uv desc
> limit 200
>
> it fails, and I have to restart Kylin.
>
> The dimension `app_name` has very high cardinality. If I switch to the `app_tag` dimension, the query returns the right result. So is it not possible to get the top dimension values by a measure in Kylin? If it is, what can I do to solve this?
>
> --------------------------------
> 郑江雨, Cloud Platform
> Phone: 15155195496