Technically, Kylin has no problem supporting a cardinality of 5M or above. The current cap is 2M (see DictionaryGenerator.DICT_MAX_CARDINALITY); it is there to remind us to keep cardinality sensible.
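
To illustrate the kind of check described above, here is a minimal sketch of a cardinality cap that can be overridden by configuration. It is not Kylin's actual DictionaryGenerator code: the class name CardinalityGuard, the method names, and the system property example.dict.max.cardinality are all hypothetical and invented for this example. Only the 2M default and the exception message text come from this thread.

```java
// Hypothetical sketch of a configurable cardinality cap.
// This is NOT Kylin's DictionaryGenerator; all names here are assumptions.
class CardinalityGuard {
    // Default cap of 2M, the value quoted in this thread.
    static final int DEFAULT_MAX_CARDINALITY = 2_000_000;

    // Hypothetical property name, invented for this example.
    static final String MAX_CARDINALITY_PROPERTY = "example.dict.max.cardinality";

    // Reads the cap from a JVM system property, falling back to the default.
    static int maxCardinality() {
        return Integer.getInteger(MAX_CARDINALITY_PROPERTY, DEFAULT_MAX_CARDINALITY);
    }

    // Rejects a column whose distinct-value count exceeds the cap.
    static void check(String column, long cardinality, int cap) {
        if (cardinality > cap) {
            throw new IllegalArgumentException(
                "Too high cardinality is not suitable for dictionary: column "
                + column + " has " + cardinality + " distinct values, cap is " + cap);
        }
    }

    public static void main(String[] args) {
        int cap = maxCardinality();
        check("small_dim", 1_000L, cap); // well under the cap, passes silently
        try {
            check("big_dim", 5_000_000L, cap);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a setup like this, the cap could be raised for a single run by starting the JVM with -Dexample.dict.max.cardinality=6000000, without rebuilding anything.
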
I created a JIRA to make this cap configurable.

https://issues.apache.org/jira/browse/KYLIN-660

On Mon, Apr 6, 2015 at 6:00 PM, Shi, Shaofeng <[email protected]> wrote:

> Kylin doesn't support a cardinality that high (5M). If there is such a
> column, Kylin will throw an IllegalArgumentException while building the
> dictionary, saying: Too high cardinality is not suitable for dictionary
>
> Please check whether there is such an error in your log file, and remove
> that dimension from the cube to move ahead.
>
> On 4/5/15, 6:34 AM, "Jakub Skuratowicz" <[email protected]> wrote:
>
> >Hi,
> >I am trying to create a cube with 3 dimensions (1 normal, 1 derived,
> >1 hierarchy). Out of these, the largest cardinality is about 5M. However,
> >during the #3 Step Name: Build Dimension Dictionary, Kylin dies. The web
> >interface stops responding. I cannot create a new session. What is more,
> >MapReduce does not show any job starting for phase #3. The kylin.log
> >ends with these entries:
> >
> >[Thread-9-EventThread]:[2015-04-04 17:57:44,091][INFO][org.apache.curator.framework.state.ConnectionStateManager.postState(ConnectionStateManager.java:228)] - State change: RECONNECTED
> >[pool-6-thread-1]:[2015-04-04 17:58:32,755][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 17:59:36,424][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:00:40,181][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:01:48,590][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:02:36,059][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:03:38,234][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:04:53,317][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:06:01,801][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:07:14,274][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[pool-6-thread-1]:[2015-04-04 18:08:57,283][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:117)] - Job Fetcher: 1 running, 1 actual running, 0 ready, 24 others
> >[Thread-9-SendThread(amb0.mycorp.kom:2181)]:[2015-04-04 18:13:49,068][INFO][org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1096)] - Client session timed out, have not heard from server in 30519ms for sessionid 0x14c7660580a005d, closing socket connection and attempting reconnect
> >[Thread-9-EventThread]:[2015-04-04 18:14:03,558][INFO][org.apache.curator.framework.state.ConnectionStateManager.postState(ConnectionStateManager.java:228)] - State change: SUSPENDED
> >[Thread-9-SendThread(amb0.mycorp.kom:2181)]:[2015-04-04 18:14:38,679][INFO][org.apache.zookeeper.ClientCnxn$SendThread.logStartConnect(ClientCnxn.java:975)] - Opening socket connection to server amb0.mycorp.kom/172.17.1.94:2181. Will not attempt to authenticate using SASL (unknown error)
> >[Thread-9-SendThread(amb0.mycorp.kom:2181)]:[2015-04-04 18:14:55,993][INFO][org.apache.zookeeper.ClientCnxn$SendThread.primeConnection(ClientCnxn.java:852)] - Socket connection established to amb0.mycorp.kom/172.17.1.94:2181, initiating session
> >[Thread-9-SendThread(amb0.mycorp.kom:2181)]:[2015-04-04 18:16:03,626][INFO][org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1096)] - Client session timed out, have not heard from server in 67633ms for sessionid 0x14c7660580a005d, closing socket connection and attempting reconnect
> >
> >I am using one of the latest builds (1-2 days old). There is no obvious
> >indication of an error. Can you please help me find the reason why it
> >crashes?
> >
> >regards
> >Kuba Skuratowicz
> >
