I haven't applied any filters on the high-cardinality column. The account number (acct_no) is high cardinality, about 240w (2.4 million) values. What can I do when I need to retain the high-cardinality dimension (account no.)?
Are those filters applied when creating the model?

2016-09-18 9:54 GMT+08:00 hongbin ma <mahong...@apache.org>:

> Hi Mars,
>
> The query is using cuboid 15 (binary 00001111), which means your query
> involves the last four dimensions in the row key. From "Total scanned row:
> 100245548" we can see the cuboid is hardly pre-aggregated. Can you check
> the last four dimensions of the row key? Do they have very high
> cardinality?
>
> BTW, do you apply filters on the high-cardinality dimension? If so, you
> should think about redesigning the row key: usually high-cardinality
> dimensions should precede low-cardinality dimensions for filter
> effectiveness.
>
> On Sun, Sep 18, 2016 at 9:32 AM, Mars J <xujiao.myc...@gmail.com> wrote:
>
> > OK, the Kylin version is 1.5.2.1 and the fact table has 200,000,000
> > records. My cube is very simple: a fact table of transactions, a
> > dimension table of accounts, and a dimension table of branches. The
> > account table's cardinality is 240w (2.4 million) records. The tables
> > are left-joined on the acct_no column.
> >
> > 2016-09-16 23:56 GMT+08:00 Luke Han <luke...@gmail.com>:
> >
> > > Hi Mars,
> > > You are trying to query data without GROUP BY; Kylin may not
> > > perform very well without tuning your cube.
> > >
> > > And we can't help you with just "log as below..."; please offer more
> > > detailed information about your Kylin version, source data, metadata,
> > > and so on.
> > >
> > > Thanks.
> > > Luke
> > >
> > > Best Regards!
> > > ---------------------
> > >
> > > Luke Han
> > >
> > > On Fri, Sep 9, 2016 at 5:40 PM, Mars J <xujiao.myc...@gmail.com> wrote:
> > >
> > > > hello all,
> > > > My query SQL 'SELECT
> > > > A.ACCT_NO,F.BRAN_CODE,F.SET_DATE,F.ACCT_NO,F.DC_FLAG,F.TRANS_AMT
> > > > FROM NY.TRANS_FACT F LEFT JOIN NY.ACCOUNT_DIM A ON F.ACCT_NO=A.ACCT_NO
> > > > LIMIT 100' against a cube (size: 3.6G; the fact table has
> > > > 200,000,000 records) failed.
> > > >
> > > > kylin log is as follows:
> > > >
> > > > Using project: TRANS_NO_DATE
> > > > 2016-09-09 17:32:15,705 INFO [http-bio-7070-exec-7] controller.QueryController:175 : The original query: SELECT A.ACCT_NO,F.BRAN_CODE,F.SET_DATE,F.ACCT_NO,F.DC_FLAG,F.TRANS_AMT FROM NY.TRANS_FACT F LEFT JOIN NY.ACCOUNT_DIM A ON F.ACCT_NO=A.ACCT_NO LIMIT 100
> > > > 2016-09-09 17:32:15,745 INFO [http-bio-7070-exec-7] routing.QueryRouter:48 : The project manager's reference is org.apache.kylin.metadata.project.ProjectManager@1aa81aff
> > > > 2016-09-09 17:32:15,745 INFO [http-bio-7070-exec-7] routing.QueryRouter:60 : Find candidates by table NY.TRANS_FACT and project=TRANS_NO_DATE : org.apache.kylin.query.routing.Candidate@62e0ac94
> > > > 2016-09-09 17:32:15,745 INFO [http-bio-7070-exec-7] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule, realizations before: [TND1(CUBE)], realizations after: [TND1(CUBE)]
> > > > 2016-09-09 17:32:15,745 INFO [http-bio-7070-exec-7] routing.QueryRouter:49 : Applying rule: class org.apache.kylin.query.routing.rules.RealizationSortRule, realizations before: [TND1(CUBE)], realizations after: [TND1(CUBE)]
> > > > 2016-09-09 17:32:15,746 INFO [http-bio-7070-exec-7] routing.QueryRouter:72 : The realizations remaining: [TND1(CUBE)] And the final chosen one is the first one
> > > > 2016-09-09 17:32:15,756 DEBUG [http-bio-7070-exec-7] enumerator.OLAPEnumerator:107 : query storage...
> > > > 2016-09-09 17:32:15,756 INFO [http-bio-7070-exec-7] enumerator.OLAPEnumerator:181 : No group by and aggregation found in this query, will hack some result for better look of output...
> > > > 2016-09-09 17:32:15,757 INFO [http-bio-7070-exec-7] v2.CubeStorageQuery:239 : exactAggregation is true
> > > > 2016-09-09 17:32:15,757 INFO [http-bio-7070-exec-7] v2.CubeStorageQuery:357 : Enable limit 100
> > > > 2016-09-09 17:32:15,757 INFO [http-bio-7070-exec-7] dict.DictionaryManager:393 : DictionaryManager(1238461247) loading DictionaryInfo(loadDictObj:true) at /dict/NY.TRANS_FACT/SET_DATE/d8379c72-dfc6-44d1-b429-9922cbd21091.dict
> > > > 2016-09-09 17:32:15,759 INFO [http-bio-7070-exec-7] dict.DictionaryManager:393 : DictionaryManager(1238461247) loading DictionaryInfo(loadDictObj:true) at /dict/NY.TRANS_FACT/DC_FLAG/e7cdf373-2379-4313-89da-0d9b44954cd6.dict
> > > > 2016-09-09 17:32:15,761 DEBUG [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:257 : New scanner for current segment TND1[19700101000000_20161001000000] will use SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
> > > > 2016-09-09 17:32:15,762 DEBUG [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:313 : Serialized scanRequestBytes 684 bytes, rawScanBytesString 50 bytes
> > > > 2016-09-09 17:32:15,762 INFO [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:315 : The scan 38504673 for segment TND1[19700101000000_20161001000000] is as below with 1 separate raw scans, shard part of start/end key is set to 0
> > > > 2016-09-09 17:32:15,762 INFO [http-bio-7070-exec-7] v2.CubeHBaseRPC:271 : Visiting hbase table KYLIN_SL43718YJF: cuboid exact match, from 15 to 15 Start: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\x00\x00\x00\x00\x00\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\x00\x00\x00\x00\x00\x00) Stop: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\xFF\xFF\xFF\xFF\xFF\xFF\x00 (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\xFF\xFF\xFF\xFF\xFF\xFF\x00), No Fuzzy Key
> > > > 2016-09-09 17:32:15,762 DEBUG [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:320 : Submitting rpc to 2 shards starting from shard 0, scan range count 1
> > > > 2016-09-09 17:32:15,763 INFO [http-bio-7070-exec-7] v2.CubeHBaseEndpointRPC:103 : Timeout for ExpectedSizeIterator is: 9900000
> > > > 2016-09-09 17:32:15,763 DEBUG [http-bio-7070-exec-7] enumerator.OLAPEnumerator:127 : return TupleIterator...
> > > > 2016-09-09 17:33:01,574 INFO [pool-4-thread-1] threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running, 0 ready, 58 others
> > > > 2016-09-09 17:33:48,867 INFO [BadQueryDetector] service.BadQueryDetector:104 : Slow query has been running 93.161 seconds (project:TRANS_NO_DATE, thread: 0xc1) -- SELECT A.ACCT_NO,F.BRAN_CODE,F.SET_DATE,F.ACCT_NO,F.DC_FLAG,F.TRANS_AMT FROM NY.TRANS_FACT F LEFT JOIN NY.ACCOUNT_DIM A ON F.ACCT_NO=A.ACCT_NO LIMIT 100
> > > > 2016-09-09 17:33:48,875 DEBUG [BadQueryDetector] badquery.BadQueryHistoryManager:84 : Loaded 10 Bad Query(s)
> > > > 2016-09-09 17:33:48,916 DEBUG [BadQueryDetector] hbase.HBaseResourceStore:262 : Update row /bad_query/TRANS_NO_DATE.json from oldTs: 1473411958909, to newTs: 1473413628875, operation result: true
> > > > 2016-09-09 17:33:48,916 INFO [BadQueryDetector] service.BadQueryDetector:230 : Problematic thread 0xc1
> > > > at sun.misc.Unsafe.park(Native Method)
> > > > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> > > > at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> > > > at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
> > > > at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$ExpectedSizeIterator.next(CubeHBaseEndpointRPC.java:125)
> > > > at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$ExpectedSizeIterator.next(CubeHBaseEndpointRPC.java:81)
> > > > at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
> > > > at com.google.common.collect.Iterators$6.hasNext(Iterators.java:583)
> > > > at org.apache.kylin.storage.hbase.cube.v2.SequentialCubeTupleIterator.hasNext(SequentialCubeTupleIterator.java:96)
> > > > at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:74)
> > > >
> > > > 2016-09-09 17:34:01,572 INFO [pool-4-thread-1] threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running, 0 ready, 58 others
> > > > 2016-09-09 17:34:12,198 INFO [pool-6-thread-1] v2.CubeHBaseEndpointRPC:351 : <sub-thread for GTScanRequest 38504673> Endpoint RPC returned from HTable KYLIN_SL43718YJF Shard \x4B\x59\x4C\x49\x4E\x5F\x53\x4C\x34\x33\x37\x31\x38\x59\x4A\x46\x2C\x00\x01\x2C\x31\x34\x37\x33\x33\x38\x37\x30\x39\x36\x34\x30\x37\x2E\x36\x39\x33\x61\x32\x39\x61\x33\x62\x63\x63\x35\x66\x35\x66\x31\x32\x33\x64\x64\x30\x63\x32\x38\x63\x39\x39\x34\x64\x38\x38\x31\x2E on host: slave5.Total scanned row: 100245548. Total filtered/aggred row: 0. Time elapsed in EP: 107063(ms). Server CPU usage: 0.21609751440119665, server physical mem left: 4.769349632E9, server swap mem left:8.131039232E9.Etc message: start latency: 17@1,agg done@72715,compress done@107063,server stats done@107063, debugGitTag:cf4d2940b67d622eacd2ac9a913b221091a35c2e;.Normal Complete: true.
> > > >
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
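
For anyone reading this thread later: the cuboid ID in the log ("cuboid exact match, from 15 to 15") can be read as a bitmask over the row key columns. Below is a minimal Python sketch of that reading. It assumes the first row key column maps to the highest bit, which matches the reply above that ID 15 selects the last four dimensions. The column names D1..D6 and the count of six columns are hypothetical placeholders, since the actual row key of cube TND1 is not shown in this thread.

```python
from typing import List


def cuboid_dimensions(cuboid_id: int, rowkey_columns: List[str]) -> List[str]:
    """Return the row key columns whose bit is set in cuboid_id.

    Assumption: the first row key column is the most significant bit,
    so the last row key column corresponds to bit 0.
    """
    n = len(rowkey_columns)
    if cuboid_id < 0 or cuboid_id >> n:
        raise ValueError("cuboid id does not fit the number of row key columns")
    return [
        col
        for i, col in enumerate(rowkey_columns)
        if cuboid_id & (1 << (n - 1 - i))
    ]


# Hypothetical row key order; replace with the order from your cube design.
rowkey = ["D1", "D2", "D3", "D4", "D5", "D6"]
selected = cuboid_dimensions(15, rowkey)
print(format(15, "08b"), selected)
```

With the placeholder row key above, ID 15 selects D3 through D6, i.e. the last four columns. Substituting the real row key order shows which dimensions the scanned cuboid contains.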