Hi XiangLun,

The screenshot couldn't be displayed, but never mind. What is the configuration of your servers? Are they virtual or physical machines? A VM will be a little slower, but even with virtual machines on Azure/AWS, I haven't seen this step take that long.
If the server configuration isn't the bottleneck, you can profile with a tool like JMC to see which method/operation takes most of the CPU time.

2017-09-07 10:21 GMT+08:00 周湘伦 <[email protected]>:

> I just found that CPU usage is 99% when performing the fourth step, but the
> remaining three CPUs are at 0. Is this correct?
> I didn't have time to save the picture, though. Then I refreshed the same
> segment, and it only took three minutes this time (I almost broke down), so
> I saved the CPU diagram.
> [image: Inline image 1]
>
> 2017-09-06 22:58 GMT+08:00 ShaoFeng Shi <[email protected]>:
>
>> Hi XiangLun,
>>
>> Dictionary encoding is the recommended encoding. It fits most cases
>> except ultra-high cardinality (UHC). In your case, SHOP_NO's cardinality
>> is 115301, which should be fine for a dictionary.
>>
>> Usually step 4 is fast, less than 1 minute in my experience. You might
>> need to do some analysis to identify what the CPU was doing at that
>> moment.
>>
>> 2017-09-06 15:01 GMT+08:00 周湘伦 <[email protected]>:
>>
>>> Hi,
>>> The problem (out of memory) no longer arises, but step 4 (Build
>>> Dimension Dictionary) takes too much time (45.36 mins). The high-
>>> cardinality column (SHOP_NO) has been given fixed_length encoding:
>>>
>>> ID  Column                           Encoding          Length  Shard By
>>> 1   TBL_BILL_GENERAL.SHOP_NO         fixed_length[v1]  50      false
>>> 2   TBL_BILL_GENERAL.SETT_DATE       fixed_length[v1]  8       false
>>> 3   TBL_BILL_GENERAL.END_SP_CHNL_NO  dict[v1]          0       false
>>> 4   TBL_BILL_GENERAL.MSG_TXN_CODE    dict[v1]          0       false
>>> 5   TBL_OPE_BIZ_SHOP.DELETED         dict[v1]          0       false
>>>
>>> Name            Data Type     Cardinality
>>> SHOP_NO         varchar(50)   115301
>>> SETT_DATE       char(8)       260
>>> END_SP_CHNL_NO  varchar(200)  38
>>> MSG_TXN_CODE    varchar(10)   6
>>> DELETED         char(1)       2
>>>
>>> How can I resolve the problem? Can I use fixed_length for all
>>> dimensions? Maybe step 4 would then take less time.
>>> Thank you.
>>>
>>> 2017-08-30 18:44 GMT+08:00 ShaoFeng Shi <[email protected]>:
>>>
>>>> 1. You can customize those MR parameters for the Hive step, like
>>>> "mapreduce.map.memory.mb" and "mapreduce.map.java.opts". For more
>>>> tuning, check Hive's wiki.
>>>> 2. Yes, check and update the YARN configuration according to your
>>>> cluster profile.
>>>> 3. Yes, that will help step 4. If it didn't change the behavior, check
>>>> whether there is a UHC column being used to build a dictionary.
>>>>
>>>> 2017-08-29 14:38 GMT+08:00 周湘伦 <[email protected]>:
>>>>
>>>>> Hi,
>>>>> Thank you for your answer! But I am confused:
>>>>> 1. Which Hive configuration options can I optimize? I tried
>>>>> optimizing some Hive configuration options before, but it didn't work.
>>>>> 2. Do I need to optimize Hadoop configuration options?
>>>>> 3. Do I need to optimize KYLIN_JVM_SETTINGS in setenv.sh? (I have
>>>>> already modified it, but it doesn't seem to take effect.)
>>>>>
>>>>> 2017-08-28 21:11 GMT+08:00 ShaoFeng Shi <[email protected]>:
>>>>>
>>>>>> Step 2 is a Hive operation; you may need to check the Hive
>>>>>> configuration to allocate more memory to containers. You can also
>>>>>> customize conf/kylin_hive_conf.xml, as it will be applied before
>>>>>> running the Hive command.
>>>>>>
>>>>>> Step 4 runs in the Kylin process. If this step got an OOM, check
>>>>>> whether the cube is using "dict" encoding for a high-cardinality
>>>>>> column. As we know, "dict" doesn't fit high cardinality; you need to
>>>>>> revisit the cube design and change to "fixed_length" or "integer"
>>>>>> encoding for that column.
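To illustrate why "dict" encoding struggles as cardinality grows while "fixed_length" does not, here is a minimal sketch (not Kylin's actual implementation; the function names are invented for illustration). The dictionary's memory footprint grows with the number of distinct values, whereas fixed-length encoding simply pads or truncates each value to N bytes:

```python
def dict_encode(values):
    """Map each distinct value to a small integer id (dictionary grows
    with cardinality, which is what hurts for UHC columns)."""
    ids = {}
    encoded = []
    for v in values:
        if v not in ids:
            ids[v] = len(ids)      # new distinct value -> new dictionary entry
        encoded.append(ids[v])
    return encoded, ids

def fixed_length_encode(value, length):
    """Pad or truncate the raw bytes to a fixed width; no dictionary needed,
    so cost is independent of cardinality (but over-long values are cut)."""
    raw = value.encode("utf-8")[:length]   # truncate values longer than N
    return raw.ljust(length, b"\x00")      # pad shorter values with zeros

codes, mapping = dict_encode(["A01", "B02", "A01"])
print(codes)                               # [0, 1, 0]
print(fixed_length_encode("SHOP", 8))      # b'SHOP\x00\x00\x00\x00'
```

The trade-off shown here is why the advice in the thread is to keep dict encoding for low-cardinality columns and switch only the UHC column to fixed_length.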
>>>>>> Also, check this doc, it may help:
>>>>>> https://kylin.apache.org/docs21/howto/howto_optimize_build.html
>>>>>>
>>>>>> 2017-08-28 13:54 GMT+08:00 周湘伦 <[email protected]>:
>>>>>>
>>>>>>> Sorry, I got these problems in step 2 (Redistribute Flat Hive
>>>>>>> Table) and sometimes in step 4 (Build Dimension Dictionary); it is
>>>>>>> not deterministic, but step 2 fails a little more often.
>>>>>>>
>>>>>>> 2017-08-28 11:19 GMT+08:00 ShaoFeng Shi <[email protected]>:
>>>>>>>
>>>>>>>> In which step did you get this error? That is also important for
>>>>>>>> analyzing the issue.
>>>>>>>>
>>>>>>>> 2017-08-28 10:01 GMT+08:00 周湘伦 <[email protected]>:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>> When I built a cube, an error happened (java.lang.OutOfMemoryError).
>>>>>>>>>
>>>>>>>>> The versions are: jdk1.8.0_131, kylin-2.0.0.
>>>>>>>>> The RAM is 8 GB.
>>>>>>>>>
>>>>>>>>> Some of the configured properties:
>>>>>>>>>
>>>>>>>>> <property>
>>>>>>>>>   <name>mapreduce.map.memory.mb</name>
>>>>>>>>>   <value>5120</value>
>>>>>>>>> </property>
>>>>>>>>>
>>>>>>>>> <property>
>>>>>>>>>   <name>mapreduce.map.java.opts</name>
>>>>>>>>>   <value>-Xmx4096m -XX:OnOutOfMemoryError='kill -9 %p'</value>
>>>>>>>>> </property>
>>>>>>>>>
>>>>>>>>> The log is:
>>>>>>>>> # java.lang.OutOfMemoryError: Java heap space
>>>>>>>>> # -XX:OnOutOfMemoryError="kill -9 %p"
>>>>>>>>> # Executing /bin/sh -c "kill -9 26605"...
>>>>>>>>>
>>>>>>>>> Then I checked the GC log:
>>>>>>>>> ...
>>>>>>>>> 2017-08-27T11:16:41.112+0800: 85553.167: [Full GC (Allocation Failure) 2017-08-27T11:16:41.112+0800: 85553.168: [CMS2017-08-27T11:16:43.369+0800: 85555.424: [CMS-concurrent-mark: 2.287/2.288 secs] [Times: user=2.34 sys=0.00, real=2.29 secs]
>>>>>>>>> (concurrent mode failure): 3853567K->3853567K(3853568K), 7.3121409 secs] 4080766K->4063993K(4080768K), [Metaspace: 109873K->109873K(1148928K)], 7.3122661 secs] [Times: user=7.31 sys=0.00, real=7.32 secs]
>>>>>>>>> 2017-08-27T11:16:48.453+0800: 85560.508: [Full GC (Allocation Failure) 2017-08-27T11:16:48.453+0800: 85560.508: [CMS: 3853567K->3853567K(3853568K), 5.1092943 secs] 4080766K->4064637K(4080768K), [Metaspace: 109873K->109873K(1148928K)], 5.1094013 secs] [Times: user=5.09 sys=0.00, real=5.11 secs]
>>>>>>>>> 2017-08-27T11:16:53.563+0800: 85565.618: [GC (CMS Initial Mark) [1 CMS-initial-mark: 3853567K(3853568K)] 4064639K(4080768K), 0.0862505 secs] [Times: user=0.16 sys=0.00, real=0.08 secs]
>>>>>>>>> 2017-08-27T11:16:53.649+0800: 85565.704: [CMS-concurrent-mark-start]
>>>>>>>>> 2017-08-27T11:16:53.678+0800: 85565.734: [Full GC (Allocation Failure) 2017-08-27T11:16:53.679+0800: 85565.734: [CMS2017-08-27T11:16:55.983+0800: 85568.038: [CMS-concurrent-mark: 2.333/2.334 secs] [Times: user=2.37 sys=0.01, real=2.34 secs]
>>>>>>>>> ...
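The repeated "concurrent mode failure" and back-to-back Full GC entries in the log above show the CMS old generation is essentially full (3853567K of a 3853568K capacity), so enlarging the Kylin server heap is the usual first remedy. A sketch of what that could look like via KYLIN_JVM_SETTINGS in conf/setenv.sh; the sizes here are assumptions for an 8 GB node, not tested recommendations, and must leave headroom for the OS and other services:

```shell
# Hypothetical heap sizing for Kylin's conf/setenv.sh. The GC log in the
# thread shows a ~4 GB heap completely full, so either raise -Xmx or reduce
# what the build keeps in memory. GC logging flags kept for future diagnosis.
export KYLIN_JVM_SETTINGS="-Xms1g -Xmx6g \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:${KYLIN_HOME:-/tmp}/logs/kylin.gc.%p"
```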
>>>>>>>>> Maybe it is an error from Java garbage collection, so how can I
>>>>>>>>> solve the problem? (Maybe by modifying the properties?)
>>>>>>>>>
>>>>>>>>> Thanks a lot.

--
Best regards,

Shaofeng Shi 史少锋
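For the JMC profiling suggested at the top of the thread, a Java Flight Recorder capture can be taken from the command line and then opened in JMC to see the CPU-hot methods. A sketch only, assuming an Oracle JDK 8 JVM (the pid is a placeholder; find the real Kylin server pid with `jps`):

```shell
# Illustrative commands against a live Kylin JVM; not runnable as-is.
KYLIN_PID=12345   # placeholder: substitute the Kylin server's JVM pid (see jps)
jcmd "$KYLIN_PID" VM.unlock_commercial_features   # required on Oracle JDK 8
jcmd "$KYLIN_PID" JFR.start duration=60s filename=/tmp/kylin.jfr
# After 60s, open /tmp/kylin.jfr in Java Mission Control.
```

Triggering the capture while the slow "Build Dimension Dictionary" step is running is what makes the recording useful for this particular issue.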
