[
https://issues.apache.org/jira/browse/KYLIN-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203598#comment-16203598
]
zhengzfand commented on KYLIN-2937:
-----------------------------------
Kylin version: 1.6
2017-10-11 00:14:02,168 INFO [pool-9-thread-5] dict.DictionaryManager:186 : 283 existing dictionaries of the same column
2017-10-11 00:14:02,168 WARN [pool-9-thread-5] dict.DictionaryManager:188 : Too many dictionaries under /dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID, dict count: 283
2017-10-11 00:14:02,317 INFO [pool-9-thread-5] dict.DictionaryManager:172 : Identical dictionary content, reuse existing dictionary at /dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID/2c3b3596-7f52-4250-ae83-315f8571672e.dict
According to the log above and the code below,
    for (String existing : existings) {
        DictionaryInfo existingInfo = getDictionaryInfo(existing);
        if (existingInfo != null && dict.equals(existingInfo.getDictionaryObject())) {
            return existing;
        }
    }
Kylin may load all 283 dictionaries into memory. Below is the total size of all
UID dictionaries on HDFS:
[admin@VMS25583 ~]$ hdfs dfs -du -s -h "/kylin/kylin_metadata/resources/dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID/"
61.1 G  183.4 G  /kylin/kylin_metadata/resources/dict/TMP_TICKETDB.TMP_USER_ALLUIDMOBILE_END/UID
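
To make the concern concrete, here is a minimal, self-contained sketch of the same loop shape. It is not Kylin code: `DictLoadSketch`, `FakeStore`, `load`, and `findIdentical` are hypothetical names, and a dictionary is modelled as a plain `String[]`. It only shows that a content-equality scan may have to load every existing dictionary before it finds (or fails to find) a match. Whether the loaded dictionaries all stay resident at once depends on caching behaviour that is not visible in the snippet above. With 61.1 G spread over 283 files, the average dictionary would be roughly 220 MB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DictLoadSketch {
    /** Hypothetical stand-in for the dictionary store; counts full loads. */
    static class FakeStore {
        final List<String[]> dicts = new ArrayList<>();
        int loads = 0;

        String[] load(int index) {
            loads++; // each call models reading one .dict file into memory
            return dicts.get(index);
        }
    }

    /** Same shape as the quoted loop: load each existing dictionary and compare contents. */
    static int findIdentical(FakeStore store, String[] newDict) {
        for (int i = 0; i < store.dicts.size(); i++) {
            if (Arrays.equals(newDict, store.load(i))) {
                return i; // reuse the existing dictionary
            }
        }
        return -1; // no identical dictionary found
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        int count = 283; // dict count taken from the log
        for (int i = 0; i < count; i++) {
            store.dicts.add(new String[] {"uid-" + i});
        }

        // Worst case: the only identical dictionary is the last one scanned.
        int hit = findIdentical(store, new String[] {"uid-" + (count - 1)});
        System.out.println("match index = " + hit + ", dictionaries loaded = " + store.loads);

        // Average size per dictionary, using the 61.1 G total reported by hdfs dfs -du.
        double totalGiB = 61.1;
        System.out.printf("average size per dictionary = %.1f MB%n", totalGiB * 1024 / count);
    }
}
```

In this sketch the number of loads grows linearly with the number of accumulated dictionaries, which is why the dictionary count and the total size on HDFS both matter.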
> Intermediate data of non-partitioned cubes accumulates
> ---------------
>
> Key: KYLIN-2937
> URL: https://issues.apache.org/jira/browse/KYLIN-2937
> Project: Kylin
> Issue Type: Bug
> Reporter: zhengzfand
>
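> After a non-partitioned cube is built, its intermediate data is not cleaned up.
> The dictionary files stored on HDFS keep accumulating. A cube build loads all of
> these accumulated dictionary files, which may cause an out-of-memory error (if
> the dictionary files are large and numerous enough).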