[ 
https://issues.apache.org/jira/browse/KYLIN-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203598#comment-16203598
 ] 

zhengzfand commented on KYLIN-2937:
-----------------------------------

Kylin version: 1.6
2017-10-11 00:14:02,168 INFO  [pool-9-thread-5] dict.DictionaryManager:186 : {color:red}283 existing dictionaries of{color} the same column
2017-10-11 00:14:02,168 WARN  [pool-9-thread-5] dict.DictionaryManager:188 : Too many dictionaries under /dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID, dict count: 283
2017-10-11 00:14:02,317 INFO  [pool-9-thread-5] dict.DictionaryManager:172 : Identical dictionary content, reuse existing dictionary at /dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID/2c3b3596-7f52-4250-ae83-315f8571672e.dict
According to the log above and the code below,
        for (String existing : existings) {
            DictionaryInfo existingInfo = getDictionaryInfo(existing);
            if (existingInfo != null && dict.equals(existingInfo.getDictionaryObject())) {
                return existing;
            }
        }

Kylin may load all 283 dictionaries into memory. The total size of all UID dictionaries on HDFS is shown below:
[admin@VMS25583 ~]$ hdfs dfs -du -s -h "/kylin/kylin_metadata/resources/dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID/"
{color:red}61.1 G{color}  183.4 G  /kylin/kylin_metadata/resources/dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID
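
One possible way to avoid loading every stored dictionary during the duplicate check is to compare a small summary first and load the full dictionary only when the summary matches. The following is a minimal sketch of that idea, not Kylin code: DictionaryDedupSketch, DictMeta, describe, findIdentical and the loader function are all hypothetical names, and it assumes a cardinality and a content hash can be read cheaply for each stored dictionary, which may not hold for Kylin 1.6.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these names are Kylin classes.
class DictionaryDedupSketch {

    // Small per-dictionary summary, assumed cheap to read without loading the dictionary body.
    static final class DictMeta {
        final String path;
        final int cardinality;
        final int contentHash;

        DictMeta(String path, int cardinality, int contentHash) {
            this.path = path;
            this.cardinality = cardinality;
            this.contentHash = contentHash;
        }
    }

    // Number of full dictionary loads performed by findIdentical.
    static int fullLoads = 0;

    // Builds the summaries for an in-memory stand-in of the dictionary store.
    static List<DictMeta> describe(Map<String, List<String>> store) {
        List<DictMeta> metas = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : store.entrySet()) {
            metas.add(new DictMeta(e.getKey(), e.getValue().size(), e.getValue().hashCode()));
        }
        return metas;
    }

    // Returns the path of an identical stored dictionary, or null if there is none.
    static String findIdentical(List<DictMeta> existing, List<String> newDict,
                                Function<String, List<String>> loader) {
        int cardinality = newDict.size();
        int contentHash = newDict.hashCode();
        for (DictMeta meta : existing) {
            // Cheap rejection: a differing summary means the content differs, so nothing is loaded.
            if (meta.cardinality != cardinality || meta.contentHash != contentHash) {
                continue;
            }
            // Summaries match: load the full dictionary and confirm with a content comparison.
            fullLoads++;
            List<String> candidate = loader.apply(meta.path);
            if (newDict.equals(candidate)) {
                return meta.path;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> store = new LinkedHashMap<>();
        store.put("a.dict", Arrays.asList("u1", "u2"));
        store.put("b.dict", Arrays.asList("u1", "u2", "u3"));
        store.put("c.dict", Arrays.asList("u1", "u2", "u4"));

        String hit = findIdentical(describe(store), Arrays.asList("u1", "u2", "u4"), store::get);
        System.out.println("reused: " + hit + ", full loads: " + fullLoads);
    }
}
```

With this shape, the number of full loads depends on how many stored dictionaries share the new dictionary's summary rather than on how many files have accumulated. It does not address the accumulation itself, which still needs cleanup.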

> Intermediate data of non-partitioned cubes accumulates
> ---------------
>
>                 Key: KYLIN-2937
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2937
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: zhengzfand
>
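> After a non-partitioned cube is built, its intermediate data is not cleaned up.
> Dictionary files stored on HDFS keep accumulating. A cube build loads all of these accumulated dictionary files,
> which may cause an out-of-memory error (if the dictionary files are large and numerous enough).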



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
