Hi,

The out-of-memory problem no longer occurs, but step 4 (Build Dimension Dictionary) now takes too long (45.36 mins). The high-cardinality column (SHOP_NO) already uses fixed_length encoding.

Rowkey settings:

ID  Column                           Encoding          Length  Shard By
1   TBL_BILL_GENERAL.SHOP_NO         fixed_length[v1]  50      false
2   TBL_BILL_GENERAL.SETT_DATE       fixed_length[v1]  8       false
3   TBL_BILL_GENERAL.END_SP_CHNL_NO  dict[v1]          0       false
4   TBL_BILL_GENERAL.MSG_TXN_CODE    dict[v1]          0       false
5   TBL_OPE_BIZ_SHOP.DELETED         dict[v1]          0       false

Column statistics:

Name            Data Type     Cardinality
SHOP_NO         varchar(50)   115301
SETT_DATE       char(8)       260
END_SP_CHNL_NO  varchar(200)  38
MSG_TXN_CODE    varchar(10)   6
DELETED         char(1)       2

How can I resolve this? Can I use fixed_length for all dimensions? That might make step 4 take less time.

Thank you.

2017-08-30 18:44 GMT+08:00 ShaoFeng Shi <[email protected]>:

> 1. You can customize those MR parameters for the Hive run, such as
> "mapreduce.map.memory.mb" and "mapreduce.map.java.opts". For more tuning,
> you can check Hive's wiki.
>
> 2. Yes, check and update the YARN configuration according to your cluster
> profile.
>
> 3. Yes, that will help step 4. If it didn't change the behavior, check
> whether a UHC column is being used to build a dictionary.
>
> 2017-08-29 14:38 GMT+08:00 周湘伦 <[email protected]>:
>
> > Hi,
> > Thank you for your answer!
> > But I am confused:
> > 1. Which Hive configuration options can I optimize? I have tried
> > optimizing some Hive configuration options, but it didn't work.
> > 2. Do I need to optimize the Hadoop configuration options?
> > 3. Do I need to optimize KYLIN_JVM_SETTINGS in setenv.sh? (I have already
> > modified it, but it doesn't seem to work.)
> >
> > 2017-08-28 21:11 GMT+08:00 ShaoFeng Shi <[email protected]>:
> >
> > > Step 2 is a Hive operation; you may need to check the Hive
> > > configuration to allocate more memory to containers. You can also
> > > customize conf/kylin_hive_conf.xml, as it will be applied before
> > > running the Hive command.
> > >
> > > Step 4 runs in the Kylin process. If this step hits OOM, you need to
> > > check whether the cube is using "dict" encoding for a high cardinality
> > > column. As we know, "dict" doesn't fit high cardinality; you need to
> > > revisit the cube design and change to "fixed_length" or "integer"
> > > encoding for that column.
> > >
> > > Also, check this doc, it may help:
> > > https://kylin.apache.org/docs21/howto/howto_optimize_build.html
> > >
> > > 2017-08-28 13:54 GMT+08:00 周湘伦 <[email protected]>:
> > >
> > > > Sorry, I got these problems in step 2 (Redistribute Flat Hive
> > > > Table) and sometimes in step 4 (Build Dimension Dictionary); it is
> > > > not consistent, but step 2 fails a little more often.
> > > >
> > > > 2017-08-28 11:19 GMT+08:00 ShaoFeng Shi <[email protected]>:
> > > >
> > > > > In which step did you get this error? This is also important for
> > > > > analyzing the issue.
> > > > >
> > > > > 2017-08-28 10:01 GMT+08:00 周湘伦 <[email protected]>:
> > > > >
> > > > > > Hi all,
> > > > > > When I built the cube, an error happened
> > > > > > (java.lang.OutOfMemoryError).
> > > > > >
> > > > > > The versions are as follows:
> > > > > > jdk1.8.0_131, kylin-2.0.0
> > > > > >
> > > > > > The RAM is 8 GB.
> > > > > >
> > > > > > Some configuration properties are below:
> > > > > > <property>
> > > > > >   <name>mapreduce.map.memory.mb</name>
> > > > > >   <value>5120</value>
> > > > > >   <description></description>
> > > > > > </property>
> > > > > >
> > > > > > <property>
> > > > > >   <name>mapreduce.map.java.opts</name>
> > > > > >   <value>-Xmx4096m -XX:OnOutOfMemoryError='kill -9 %p'</value>
> > > > > >   <description></description>
> > > > > > </property>
> > > > > >
> > > > > > The log is:
> > > > > > # java.lang.OutOfMemoryError: Java heap space
> > > > > > # -XX:OnOutOfMemoryError="kill -9 %p"
> > > > > > # Executing /bin/sh -c "kill -9 26605"...
> > > > > >
> > > > > > Then I checked the GC log:
> > > > > > ...
> > > > > > 2017-08-27T11:16:41.112+0800: 85553.167: [Full GC (Allocation Failure)
> > > > > > 2017-08-27T11:16:41.112+0800: 85553.168: [CMS2017-08-27T11:16:43.369+0800:
> > > > > > 85555.424: [CMS-concurrent-mark: 2.287/2.288 secs] [Times: user=2.34
> > > > > > sys=0.00, real=2.29 secs]
> > > > > > (concurrent mode failure): 3853567K->3853567K(3853568K), 7.3121409 secs]
> > > > > > 4080766K->4063993K(4080768K), [Metaspace: 109873K->109873K(1148928K)],
> > > > > > 7.3122661 secs] [Times: user=7.31 sys=0.00, real=7.32 secs]
> > > > > > 2017-08-27T11:16:48.453+0800: 85560.508: [Full GC (Allocation Failure)
> > > > > > 2017-08-27T11:16:48.453+0800: 85560.508: [CMS:
> > > > > > 3853567K->3853567K(3853568K), 5.1092943 secs] 4080766K->4064637K(4080768K),
> > > > > > [Metaspace: 109873K->109873K(1148928K)], 5.1094013 secs] [Times: user=5.09
> > > > > > sys=0.00, real=5.11 secs]
> > > > > > 2017-08-27T11:16:53.563+0800: 85565.618: [GC (CMS Initial Mark) [1
> > > > > > CMS-initial-mark: 3853567K(3853568K)] 4064639K(4080768K), 0.0862505 secs]
> > > > > > [Times: user=0.16 sys=0.00, real=0.08 secs]
> > > > > > 2017-08-27T11:16:53.649+0800: 85565.704: [CMS-concurrent-mark-start]
> > > > > > 2017-08-27T11:16:53.678+0800: 85565.734: [Full GC (Allocation Failure)
> > > > > > 2017-08-27T11:16:53.679+0800: 85565.734: [CMS2017-08-27T11:16:55.983+0800:
> > > > > > 85568.038: [CMS-concurrent-mark: 2.333/2.334 secs] [Times: user=2.37
> > > > > > sys=0.01, real=2.34 secs]
> > > > > > ...
> > > > > >
> > > > > > Maybe this is a Java garbage collection error. How can I solve the
> > > > > > problem? (Maybe by modifying the properties?)
> > > > > >
> > > > > > Thanks a lot.
> > > > >
> > > > > --
> > > > > Best regards,
> > > > >
> > > > > Shaofeng Shi 史少锋
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi 史少锋
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
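
A side note on the GC log quoted in the original message: each Full GC record reports the CMS old generation as before->after(capacity), and in the records shown nothing was freed. Below is a minimal Python sketch that extracts those figures, assuming the log format shown in this thread. The regular expression, the function names, and the SAMPLE string are illustrative only and are not part of Kylin or the JVM.

```python
import re

# Old-generation figures in a CMS Full GC record look like
# "<before>K-><after>K(<capacity>K)" and follow either "CMS: " or
# "(concurrent mode failure): " in the log format quoted in this thread.
OLD_GEN = re.compile(
    r"(?:CMS|\(concurrent mode failure\)): (\d+)K->(\d+)K\((\d+)K\)"
)


def old_gen_usage(log_text):
    """Return (before_kb, after_kb, capacity_kb) for each old-gen record."""
    return [tuple(int(g) for g in m.groups()) for m in OLD_GEN.finditer(log_text)]


def reclaimed_nothing(records):
    """True if at least one record exists and no record freed any memory."""
    return bool(records) and all(after >= before for before, after, _ in records)


# Two old-gen fragments copied from the quoted GC log (illustrative sample).
SAMPLE = """\
(concurrent mode failure): 3853567K->3853567K(3853568K), 7.3121409 secs]
[CMS: 3853567K->3853567K(3853568K), 5.1092943 secs]
"""

if __name__ == "__main__":
    for before, after, cap in old_gen_usage(SAMPLE):
        print(f"old gen: {after}K of {cap}K in use, freed {before - after}K")
    print("no memory reclaimed:", reclaimed_nothing(old_gen_usage(SAMPLE)))
```

If every Full GC frees nothing while the old generation is at capacity, the process needs either a larger heap or a smaller working set. That is consistent with the advice in the thread to avoid "dict" encoding on high cardinality columns.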
