[ 
https://issues.apache.org/jira/browse/KYLIN-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang resolved KYLIN-508.
--------------------------
    Resolution: Fixed

> Too high cardinality is not suitable for dictionary!
> ----------------------------------------------------
>
>                 Key: KYLIN-508
>                 URL: https://issues.apache.org/jira/browse/KYLIN-508
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>            Reporter: Luke Han
>            Assignee: liyang
>              Labels: github-import
>             Fix For: v0.7.1
>
>
> Hi!
> Building a cube failed with the following error:
> ```
> [QuartzScheduler_Worker-22]:[2015-01-08 00:21:38,468][INFO][com.kylinolap.dict.DictionaryGenerator.buildDictionaryFromValueList(DictionaryGenerator.java:72)] - Dictionary cardinality 9999956
> [QuartzScheduler_Worker-22]:[2015-01-08 00:21:38,468][ERROR][com.kylinolap.job.hadoop.dict.CreateDictionaryJob.run(CreateDictionaryJob.java:55)] - Too high cardinality is not suitable for dictionary! Are the values stable enough for incremental load??
> java.lang.IllegalArgumentException: Too high cardinality is not suitable for dictionary! Are the values stable enough for incremental load??
>         at com.kylinolap.dict.DictionaryGenerator.buildDictionaryFromValueList(DictionaryGenerator.java:75)
>         at com.kylinolap.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:110)
>         at com.kylinolap.dict.DictionaryManager.buildDictionary(DictionaryManager.java:166)
>         at com.kylinolap.cube.CubeManager.buildDictionary(CubeManager.java:171)
> ```
> In the source code:
> ```
> /**
>  * @author yangli9
>  */
> @SuppressWarnings({ "rawtypes", "unchecked" })
> public class DictionaryGenerator {
>     private static final Logger logger = LoggerFactory.getLogger(DictionaryGenerator.class);
>     private static final String[] DATE_PATTERNS = new String[] { "yyyy-MM-dd" };
>     public static Dictionary<?> buildDictionaryFromValueList(DictionaryInfo info, List<byte[]> values) {
>         info.setCardinality(values.size());
> ...
>         // log a few samples
>         StringBuilder buf = new StringBuilder();
>         for (Object s : samples) {
>             if (buf.length() > 0)
>                 buf.append(", ");
>             buf.append(s.toString()).append("=>").append(dict.getIdFromValue(s));
>         }
>         logger.info("Dictionary value samples: " + buf.toString());
>         logger.info("Dictionary cardinality " + info.getCardinality());
>         if (values.size() > 1000000)
>             throw new IllegalArgumentException("Too high cardinality is not suitable for dictionary! Are the values stable enough for incremental load??");
>         return dict;
> ...
> ```
> The limit here is 1000000. What does it mean?
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/KylinOLAP/Kylin/issues/364
> Created by: [Yancey1989|https://github.com/Yancey1989]
> Labels: 
> Created at: Thu Jan 08 00:27:15 CST 2015
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
