[
https://issues.apache.org/jira/browse/KYLIN-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shaofeng SHI closed KYLIN-3123.
-------------------------------
Resolution: Incomplete
> Improve Spark Cubing
> --------------------
>
> Key: KYLIN-3123
> URL: https://issues.apache.org/jira/browse/KYLIN-3123
> Project: Kylin
> Issue Type: Improvement
> Components: Spark Engine
> Affects Versions: v2.2.0
> Environment: HDP , Hbase, Spark 2.6, Centos7
> Reporter: vu thanh dat
> Labels: beginner
> Fix For: v2.2.0
>
> Attachments: dimension.bmp, measures.bmp, rowkeys.bmp,
> spark_so_slow_2.bmp
>
>
> Hi all,
> I'm using Spark to build a Kylin cube.
> The data is about 13 million rows for one step, partitioned by date, with
> 10 dimensions and no measures.
> I set the following config:
> kylin.storage.hbase.compression-codec=snappy
> kylin.engine.spark.rdd-partition-cut-mb=1000
> kylin.engine.spark.max-partition=5000
> kylin.engine.spark-conf.spark.master=yarn
> kylin.engine.spark-conf.spark.submit.deployMode=cluster
> kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
> kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=100
> kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=10240
> kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
> kylin.engine.spark-conf.spark.shuffle.service.enabled=true
> kylin.engine.spark-conf.spark.shuffle.service.port=7337
> kylin.engine.spark-conf.spark.yarn.queue=default
> kylin.engine.spark-conf.spark.executor.memory=4G
> kylin.engine.spark-conf.spark.executor.cores=4
> The Build Cube with Spark step is very slow, taking about 1 hour. Can you
> show me how to customize the Kylin config to speed up this step? I have 30
> CentOS servers with 5.87T of storage and 448 cores.
> I have attached my config.
> Best regards and thanks!
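
A quick sanity check on the executor settings quoted above may help when reading this report. The sketch below only does arithmetic on the numbers the reporter gave (448 cores, 4 cores per executor, minExecutors=100, maxExecutors=10240). It assumes, for simplicity, that every core is available to Spark executors, ignoring the driver, YARN overhead, and other workloads; the function names are illustrative and not part of Kylin or Spark.

```python
# Values copied from the reported configuration and cluster description.
TOTAL_CORES = 448        # cluster cores stated by the reporter
EXECUTOR_CORES = 4       # spark.executor.cores
MIN_EXECUTORS = 100      # spark.dynamicAllocation.minExecutors
MAX_EXECUTORS = 10240    # spark.dynamicAllocation.maxExecutors


def cores_requested(executors: int, cores_per_executor: int) -> int:
    """Cores needed to run the given number of executors at once."""
    return executors * cores_per_executor


def executors_that_fit(total_cores: int, cores_per_executor: int) -> int:
    """Upper bound on concurrent executors, assuming all cores are usable."""
    return total_cores // cores_per_executor


min_cores = cores_requested(MIN_EXECUTORS, EXECUTOR_CORES)
max_cores = cores_requested(MAX_EXECUTORS, EXECUTOR_CORES)
fit = executors_that_fit(TOTAL_CORES, EXECUTOR_CORES)

print(f"cores at minExecutors: {min_cores} of {TOTAL_CORES}")
print(f"cores at maxExecutors: {max_cores} of {TOTAL_CORES}")
print(f"executors that can fit: {fit}")
```

Under that assumption, the minimum allocation already asks for most of the cluster, and the maximum is far beyond what the hardware could ever grant, so the effective ceiling is set by the cluster rather than by maxExecutors.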