[ https://issues.apache.org/jira/browse/KYLIN-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liyang resolved KYLIN-508.
--------------------------
Resolution: Fixed
> Too high cardinality is not suitable for dictionary!
> ----------------------------------------------------
>
> Key: KYLIN-508
> URL: https://issues.apache.org/jira/browse/KYLIN-508
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Reporter: Luke Han
> Assignee: liyang
> Labels: github-import
> Fix For: v0.7.1
>
>
> Hi!
> Building a cube failed with the following error.
> ```
> [QuartzScheduler_Worker-22]:[2015-01-08 00:21:38,468][INFO][com.kylinolap.dict.DictionaryGenerator.buildDictionaryFromValueList(DictionaryGenerator.java:72)] - Dictionary cardinality 9999956
> [QuartzScheduler_Worker-22]:[2015-01-08 00:21:38,468][ERROR][com.kylinolap.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:55)] - Too high cardinality is not suitable for dictionary! Are the values stable enough for incremental load??
> java.lang.IllegalArgumentException: Too high cardinality is not suitable for dictionary! Are the values stable enough for incremental load??
>     at com.kylinolap.dict.DictionaryGenerator.buildDictionaryFromValueList(DictionaryGenerator.java:75)
>     at com.kylinolap.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:110)
>     at com.kylinolap.dict.DictionaryManager.buildDictionary(DictionaryManager.java:166)
>     at com.kylinolap.cube.CubeManager.buildDictionary(CubeManager.java:171)
> ```
> The relevant source code is:
> ```
> /**
>  * @author yangli9
>  */
> @SuppressWarnings({ "rawtypes", "unchecked" })
> public class DictionaryGenerator {
>     private static final Logger logger = LoggerFactory.getLogger(DictionaryGenerator.class);
>     private static final String[] DATE_PATTERNS = new String[] { "yyyy-MM-dd" };
>
>     public static Dictionary<?> buildDictionaryFromValueList(DictionaryInfo info, List<byte[]> values) {
>         info.setCardinality(values.size());
>         ...
>         // log a few samples
>         StringBuilder buf = new StringBuilder();
>         for (Object s : samples) {
>             if (buf.length() > 0)
>                 buf.append(", ");
>             buf.append(s.toString()).append("=>").append(dict.getIdFromValue(s));
>         }
>         logger.info("Dictionary value samples: " + buf.toString());
>         logger.info("Dictionary cardinality " + info.getCardinality());
>         if (values.size() > 1000000)
>             throw new IllegalArgumentException("Too high cardinality is not suitable for dictionary! Are the values stable enough for incremental load??");
>         return dict;
>         ...
> ```
> The limit here is 1000000. What does it mean?
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/KylinOLAP/Kylin/issues/364
> Created by: [Yancey1989|https://github.com/Yancey1989]
> Labels:
> Created at: Thu Jan 08 00:27:15 CST 2015
> State: open
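
For reference, the check in the quoted snippet can be reproduced in isolation. The following is a minimal sketch, not Kylin's code: the class `CardinalityGuard` and its `check` method are hypothetical names introduced for illustration. Only the 1000000 threshold, the strict `>` comparison, and the cardinality 9999956 come from the report above. It shows why the reported build fails: the logged cardinality is greater than the hard-coded limit, so the exception is thrown.

```java
// Hypothetical stand-alone illustration of the size check quoted in the report.
final class CardinalityGuard {
    // Threshold as it appears in the quoted DictionaryGenerator snippet.
    static final int MAX_CARDINALITY = 1000000;

    // Throws when the number of values is strictly greater than the limit,
    // mirroring the `values.size() > 1000000` comparison in the report.
    static void check(int cardinality, int limit) {
        if (cardinality > limit) {
            throw new IllegalArgumentException(
                "Too high cardinality is not suitable for dictionary! cardinality="
                    + cardinality + ", limit=" + limit);
        }
    }

    public static void main(String[] args) {
        // Exactly at the limit: accepted, because the comparison is strict.
        check(1000000, MAX_CARDINALITY);

        // The cardinality from the reported log line: rejected.
        try {
            check(9999956, MAX_CARDINALITY);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A column with exactly 1000000 values passes this check, while the 9999956 values in the failing job do not.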