The 2 million threshold is not about the number of records. The 2 million cap in DictionaryGenerator applies to the cardinality of a column, that is, how many distinct values the column has. If you really have a column with more than 2 million distinct values, you can disable the dictionary on that column, and Kylin will still build the cube correctly.
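To make the difference between record count and cardinality concrete, here is a minimal Python sketch. It is not Kylin code: the constant `DICT_CARDINALITY_CAP`, the helper names, and the sample rows are all hypothetical and only illustrate the idea that the cap is compared against distinct values per column, not against the number of rows.

```python
# Illustration only: none of these names come from Kylin itself.
# Hypothetical constant mirroring the 2 million figure discussed above.
DICT_CARDINALITY_CAP = 2_000_000


def column_cardinality(rows, column):
    """Return the number of distinct values found in one column."""
    return len({row[column] for row in rows})


def exceeds_dictionary_cap(rows, column, cap=DICT_CARDINALITY_CAP):
    """True if the column has more distinct values than the cap allows."""
    return column_cardinality(rows, column) > cap


# 10,000 records: user_id is unique per row, country has only 3 values.
rows = [
    {"user_id": i, "country": ("CN", "US", "DE")[i % 3]}
    for i in range(10_000)
]

for name in ("user_id", "country"):
    print(name, column_cardinality(rows, name), exceeds_dictionary_cap(rows, name))
```

A table can hold billions of rows and still stay far below the cap on every column, as `country` does here. Only a column like `user_id`, whose distinct values grow with the row count, would ever approach it.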
On Fri, May 22, 2015 at 10:59 AM, 黄浩洸 <[email protected]> wrote:

> Hi Luke,
>
> I am puzzled as to why Kylin has set a threshold value of 1000000. I have
> read a report about Kylin which said that Kylin is good at OLAP on
> billion-level big data. However, I can't test it because of the threshold
> limitation. Did Kylin limit its performance purposefully? Or will Kylin
> lift the restriction later in a new edition?
>
> Thanks,
> Steve
>
> ------------------ Original Message ------------------
> From: "Luke Han";<[email protected]>;
> Sent: Thursday, May 21, 2015, 4:05 PM
> To: "[email protected]"<[email protected]>;
> Subject: Re: Kylin Cube Creation
>
> Hi Tim, could you please refer to your previous mail thread? I have left
> some comments there.
>
> Thanks.
>
> Best Regards!
> ---------------------
> Luke Han
>
> 2015-05-21 11:51 GMT+08:00 tim <[email protected]>:
>
> > Hello, I'm a student from SCUT. I'm interested in big data. Recently I
> > wanted to test the big data query ability of Kylin, and ran into some
> > problems when creating a cube. I'm new here and eager to get your help,
> > thank you!
> >
> > 1. I loaded tables into Hive with the ETL tool Kettle successfully, but
> > when I tried to load them into Kylin, it reported success while the file
> > number and size displayed zero. So I tried adding my table SQL to Kylin's
> > create_sample_tables.sql. The tables were created successfully this
> > time, but this is obviously not a good way to create tables. What is the
> > difference between these two methods? And how can I load tables into
> > Kylin from Hive in the normal way?
> >
> > 2. When I build the sample cube, it hits an error on the penultimate
> > step. The error information is:
> > org.apache.hadoop.security.AccessControlException: Permission denied.
> > user=root is not the owner of inode=null. It seems like a permission
> > problem, and I can't find a way out.
> >
> > 3. I want to test the big data query ability on about 1 billion records,
> > but Kylin sets its threshold value at 2 million. I found the threshold
> > settings in the source code, in DictionaryGenerator.java and
> > StorageContext.java. Are there any other threshold settings? And what
> > comes next after I change the threshold value?
> >
> > I built Kylin on Hadoop 2.5.0-cdh5.3.0;
> > hive-common-0.13.1-cdh5.3.0;
> > hbase-0.98.6-cdh5.3.0
> >
> > Regards
> >
> > tim.liang
