I have these configs for Spark; I guess this should be enough to process a huge volume of data. Another question: how can we use a by-segment cubing build process for Spark instead of the by-layer one? Please advise.

kylin.engine.spark-conf.spark.yarn.queue=RCMO_Pool
kylin.engine.spark-conf.spark.executor.memory=12G
kylin.engine.spark-conf.spark.executor.cores=3
kylin.engine.spark-conf.spark.executor.instances=2
kylin.engine.spark-conf.spark.eventLog.enabled=true
kylin.engine.spark-conf.spark.eventLog.dir=hdfs://sfpdev/tenants/rft/rcmo/kylin/spark-history
kylin.engine.spark-conf.spark.history.fs.logDirectory=hdfs://sfpdev/tenants/rft/rcmo/kylin/spark-history
kylin.engine.spark-conf.spark.hadoop.yarn.timeline-service.enabled=false
kylin.engine.spark-conf.spark.driver.memory=62G
kylin.engine.spark-conf.spark.storage.memoryFraction=0.1

Regards,
Manoj

From: Kumar, Manoj H
Sent: Wednesday, February 07, 2018 9:51 AM
To: [email protected]
Subject: RE: apache kylin 2.1 - Spark Cube Building

Yes, I did that. I will come back with more logs.

Regards,
Manoj

From: ShaoFeng Shi [mailto:[email protected]]
Sent: Wednesday, February 07, 2018 6:46 AM
To: user <[email protected]>
Subject: Re: apache kylin 2.1 - Spark Cube Building

The default configuration for Spark is very small. You need to tweak some parameters (like the ones below) or enable Spark dynamic resource allocation:

kylin.engine.spark-conf.spark.executor.memory=1G
kylin.engine.spark-conf.spark.executor.cores=2
kylin.engine.spark-conf.spark.executor.instances=1

If you have already taken these actions and the performance is still not good, then you need deeper tuning.

2018-02-06 12:43 GMT+08:00 Kumar, Manoj H <[email protected]>:

While running the Spark cube build, I noticed that it takes the tables of other cubes into consideration, whereas it should only consider the cube it is building. I am not sure why it is loading the data models of other cubes. We have also noticed that Spark usually takes almost the same time as MapReduce does. We assumed that Spark should be faster than MR jobs.

2018-02-05 23:35:07,825 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : 18/02/05 23:35:07 WARN CubeDescManager: Broken cube desc /cube_desc/FRI_CUBE_update.json
2018-02-05 23:35:07,825 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : java.lang.IllegalArgumentException: Table not found by LOAN_POSITION_BAL_009_SS1
2018-02-05 23:35:07,825 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.metadata.model.DataModelDesc.findTable(DataModelDesc.java:314)
2018-02-05 23:35:07,826 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.model.DimensionDesc.init(DimensionDesc.java:61)
2018-02-05 23:35:07,826 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.model.CubeDesc.init(CubeDesc.java:587)
2018-02-05 23:35:07,826 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeDescManager.loadCubeDesc(CubeDescManager.java:196)
2018-02-05 23:35:07,826 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:321)
2018-02-05 23:35:07,826 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:114)
2018-02-05 23:35:07,827 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
2018-02-05 23:35:07,827 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeManager.reloadCubeLocalAt(CubeManager.java:811)
2018-02-05 23:35:07,827 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:789)
2018-02-05 23:35:07,827 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:147)
2018-02-05 23:35:07,827 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:105)
2018-02-05 23:35:07,827 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:161)
2018-02-05 23:35:07,828 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
2018-02-05 23:35:07,828 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
2018-02-05 23:35:07,828 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2018-02-05 23:35:07,828 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
2018-02-05 23:35:07,828 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2018-02-05 23:35:07,829 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at java.lang.reflect.Method.invoke(Method.java:606)
2018-02-05 23:35:07,829 INFO [Job 53f2a470-2973-46ad-9d97-8ddcec1933cc-315] spark.SparkExecutable:38 : at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$r

Regards,
Manoj

This message is confidential and subject to terms at: http://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal privilege, viruses and monitoring of
electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

--
Best regards,
Shaofeng Shi 史少锋
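
For reference, the Spark dynamic resource allocation suggested in the reply above might be switched on in kylin.properties roughly as follows. This is only a sketch: the property names are standard Spark settings carried under the same kylin.engine.spark-conf. prefix used elsewhere in this thread, but the executor bounds and idle timeout are illustrative assumptions rather than values recommended by anyone here, and it assumes the external shuffle service is already enabled on the YARN NodeManagers.

```properties
# Let Spark grow and shrink the executor count instead of fixing it
kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
# Illustrative bounds (assumed values); size them to the YARN queue capacity
kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=1
kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=100
# Release executors that have been idle for this many seconds (assumed value)
kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
# Dynamic allocation on YARN relies on the external shuffle service
kylin.engine.spark-conf.spark.shuffle.service.enabled=true
```

With a fixed spark.executor.instances=2 and spark.executor.cores=3, as in the configuration at the top of the thread, at most 6 tasks run at the same time, which may be part of why the Spark build is no faster than MapReduce.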
