Hi xianglun,

Dictionary encoding is the recommended encoding. It fits most cases except ultra high cardinality (UHC) columns. In your case, SHOP_NO has a cardinality of 115301, which should be fine for a dictionary.
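To put rough numbers on this, here is a minimal Python sketch (not Kylin code). It assumes a dictionary simply assigns each distinct value an integer ID and stores that ID in the smallest whole number of bytes. Kylin's real dictionary may reserve extra IDs (for example for null), so treat the figures as approximate. The function name dict_id_bytes is made up for illustration; the cardinalities and fixed lengths are the ones from your message.

```python
def dict_id_bytes(cardinality: int) -> int:
    """Bytes needed to store a distinct integer ID for `cardinality` values.

    Simplifying assumption: IDs run from 0 to cardinality - 1 and no ID
    is reserved for null or other markers.
    """
    if cardinality < 1:
        raise ValueError("cardinality must be positive")
    # Bits needed to represent the largest ID (at least 1 bit).
    bits = max(1, (cardinality - 1).bit_length())
    # Round up to whole bytes.
    return (bits + 7) // 8


# Cardinalities taken from the cube description quoted in this thread.
columns = {
    "SHOP_NO": 115301,
    "SETT_DATE": 260,
    "END_SP_CHNL_NO": 38,
    "MSG_TXN_CODE": 6,
    "DELETED": 2,
}

# fixed_length sizes currently configured in the cube (from the thread).
fixed_length = {"SHOP_NO": 50, "SETT_DATE": 8}

for name, cardinality in columns.items():
    line = f"{name}: dict ~{dict_id_bytes(cardinality)} byte(s)"
    if name in fixed_length:
        line += f", fixed_length {fixed_length[name]} byte(s)"
    print(line)
```

Under that assumption a dict-encoded SHOP_NO needs about 3 bytes in the row key, against 50 bytes with fixed_length.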
Usually step 4 is fast; in my experience it takes less than 1 minute. You may need to do some analysis to identify what the CPU was doing at that time.

2017-09-06 15:01 GMT+08:00 周湘伦 <[email protected]>:

> Hi,
> The problem (out of memory) no longer arises. But step 4 (Build Dimension
> Dictionary) takes too much time (45.36 mins). The high cardinality
> column (shop_no) has been set to fixed_length.
>
> ID  Column                           Encoding          Length  Shard By
> 1   TBL_BILL_GENERAL.SHOP_NO         fixed_length[v1]  50      false
> 2   TBL_BILL_GENERAL.SETT_DATE       fixed_length[v1]  8       false
> 3   TBL_BILL_GENERAL.END_SP_CHNL_NO  dict[v1]          0       false
> 4   TBL_BILL_GENERAL.MSG_TXN_CODE    dict[v1]          0       false
> 5   TBL_OPE_BIZ_SHOP.DELETED         dict[v1]          0       false
>
> Name            Data Type     Cardinality
> SHOP_NO         varchar(50)   115301
> SETT_DATE       char(8)       260
> END_SP_CHNL_NO  varchar(200)  38
> MSG_TXN_CODE    varchar(10)   6
> DELETED         char(1)       2
>
> How can I resolve the problem? Can I use fixed_length for all
> dimensions? Maybe step 4 will take less time.
> Thank you.
>
> 2017-08-30 18:44 GMT+08:00 ShaoFeng Shi <[email protected]>:
>
> > 1. You can customize those MR parameters for Hive, like
> > "mapreduce.map.memory.mb" and "mapreduce.map.java.opts". For more
> > tuning, you can check Hive's wiki.
> >
> > 2. Yes, check and update the YARN configuration according to your
> > cluster profile.
> >
> > 3. Yes, that will help step 4. If it doesn't change the behavior, check
> > whether a dictionary is being built for a UHC column.
> >
> > 2017-08-29 14:38 GMT+08:00 周湘伦 <[email protected]>:
> >
> > > Hi,
> > > Thank you for your answer!
> > > But I am confused:
> > > 1. Which Hive configuration options can I optimize? I tried optimizing
> > > some Hive configuration options, but it didn't work.
> > > 2. Do I need to optimize Hadoop configuration options?
> > > 3. Do I need to optimize KYLIN_JVM_SETTINGS in setenv.sh? (I have
> > > already modified it, but it doesn't seem to work.)
> > >
> > > 2017-08-28 21:11 GMT+08:00 ShaoFeng Shi <[email protected]>:
> > >
> > > > Step 2 is a Hive operation; you may need to check the Hive
> > > > configuration to allocate more memory to containers. You can also
> > > > customize conf/kylin_hive_conf.xml, as it is applied before the
> > > > Hive command runs.
> > > >
> > > > Step 4 runs in the Kylin process. If this step hits OOM, you need to
> > > > check whether the cube is using "dict" encoding for a high
> > > > cardinality column. As we know, "dict" doesn't fit high cardinality;
> > > > you need to revisit the cube design and change to "fixed_length" or
> > > > "integer" encoding for that column.
> > > >
> > > > Also, check this doc, it may help:
> > > > https://kylin.apache.org/docs21/howto/howto_optimize_build.html
> > > >
> > > > 2017-08-28 13:54 GMT+08:00 周湘伦 <[email protected]>:
> > > >
> > > > > Sorry, I got these problems in step 2 (Redistribute Flat Hive
> > > > > Table) and sometimes in step 4 (Build Dimension Dictionary); it
> > > > > varies. But it occurs a little more often in step 2.
> > > > >
> > > > > 2017-08-28 11:19 GMT+08:00 ShaoFeng Shi <[email protected]>:
> > > > >
> > > > > > In which step did you get this error? This is also important
> > > > > > for analyzing the issue.
> > > > > >
> > > > > > 2017-08-28 10:01 GMT+08:00 周湘伦 <[email protected]>:
> > > > > >
> > > > > > > Hi all,
> > > > > > > When I built the cube, an error happened
> > > > > > > (java.lang.OutOfMemoryError).
> > > > > > >
> > > > > > > The versions are as below:
> > > > > > > jdk1.8.0_131, kylin-2.0.0
> > > > > > >
> > > > > > > The RAM is 8 GB.
> > > > > > >
> > > > > > > Some configuration properties are below:
> > > > > > > <property>
> > > > > > >     <name>mapreduce.map.memory.mb</name>
> > > > > > >     <value>5120</value>
> > > > > > >     <description></description>
> > > > > > > </property>
> > > > > > >
> > > > > > > <property>
> > > > > > >     <name>mapreduce.map.java.opts</name>
> > > > > > >     <value>-Xmx4096m -XX:OnOutOfMemoryError='kill -9 %p'</value>
> > > > > > >     <description></description>
> > > > > > > </property>
> > > > > > >
> > > > > > > The log is:
> > > > > > > # java.lang.OutOfMemoryError: Java heap space
> > > > > > > # -XX:OnOutOfMemoryError="kill -9 %p"
> > > > > > > #   Executing /bin/sh -c "kill -9 26605"...
> > > > > > >
> > > > > > > Then I checked the GC log:
> > > > > > > ...
> > > > > > > 2017-08-27T11:16:41.112+0800: 85553.167: [Full GC (Allocation Failure) 2017-08-27T11:16:41.112+0800: 85553.168: [CMS2017-08-27T11:16:43.369+0800: 85555.424: [CMS-concurrent-mark: 2.287/2.288 secs] [Times: user=2.34 sys=0.00, real=2.29 secs]
> > > > > > >  (concurrent mode failure): 3853567K->3853567K(3853568K), 7.3121409 secs] 4080766K->4063993K(4080768K), [Metaspace: 109873K->109873K(1148928K)], 7.3122661 secs] [Times: user=7.31 sys=0.00, real=7.32 secs]
> > > > > > > 2017-08-27T11:16:48.453+0800: 85560.508: [Full GC (Allocation Failure) 2017-08-27T11:16:48.453+0800: 85560.508: [CMS: 3853567K->3853567K(3853568K), 5.1092943 secs] 4080766K->4064637K(4080768K), [Metaspace: 109873K->109873K(1148928K)], 5.1094013 secs] [Times: user=5.09 sys=0.00, real=5.11 secs]
> > > > > > > 2017-08-27T11:16:53.563+0800: 85565.618: [GC (CMS Initial Mark) [1 CMS-initial-mark: 3853567K(3853568K)] 4064639K(4080768K), 0.0862505 secs] [Times: user=0.16 sys=0.00, real=0.08 secs]
> > > > > > > 2017-08-27T11:16:53.649+0800: 85565.704: [CMS-concurrent-mark-start]
> > > > > > > 2017-08-27T11:16:53.678+0800: 85565.734: [Full GC (Allocation Failure) 2017-08-27T11:16:53.679+0800: 85565.734: [CMS2017-08-27T11:16:55.983+0800: 85568.038: [CMS-concurrent-mark: 2.333/2.334 secs] [Times: user=2.37 sys=0.01, real=2.34 secs]
> > > > > > > ...
> > > > > > >
> > > > > > > Maybe it is a Java garbage collection error, so how can I solve
> > > > > > > the problem? (Maybe by modifying the properties?)
> > > > > > >
> > > > > > > Thanks a lot.
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > >
> > > > > > Shaofeng Shi 史少锋
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > > Shaofeng Shi 史少锋
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋

--
Best regards,

Shaofeng Shi 史少锋
