Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

ShaoFeng Shi Tue, 03 Sep 2019 21:02:43 -0700

Xiaoxiang, thank you for the help!

Best regards,


Shaofeng Shi 史少锋
Apache Kylin PMC
Email: [email protected]

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]




On Wed, Sep 4, 2019 at 11:39 AM, Xiaoxiang Yu <[email protected]> wrote:

> Dear Gourav,
>   Thank you for your update.
>
> ----------------
> Best wishes,
> Xiaoxiang Yu
>
>
> From: Gourav Gupta <[email protected]>
> Date: Wednesday, September 4, 2019, 00:09
> To: Xiaoxiang Yu <[email protected]>, Wang rupeng <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
>
> Dear Xiaoxiang,
>
> Thanks for the helpful reply. I have resolved all the issues and am now
> able to create a cube in MapReduce mode. The last error, "FAILED:
> Execution Error, return code 3 from
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", was resolved once I
> set "hive.auto.convert.join = false" in kylin-hive-site.xml.
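
A minimal sketch of the setting described above, assuming the standard Hadoop `<property>` layout used by Hive configuration files. The helper name `hive_property` is hypothetical; the file name is the one given by the poster, and its location is not stated in the thread.

```shell
# Hypothetical helper: print one Hadoop-style <property> element.
hive_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Emit the setting reported in the thread. The output is meant to be pasted
# inside the <configuration> element of kylin-hive-site.xml (file name as
# given by the poster).
hive_property hive.auto.convert.join false
```

Printing the fragment rather than editing the file in place keeps the sketch independent of where the file lives on a given installation.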
>
> Thank you for the support; I appreciate the quick response from you and
> the Kylin team. I will ask for your help again if I run into any other
> issue when building a cube in Spark mode.
>
> Best Regards,
> Gourav Gupta
>
> On Sun, Sep 1, 2019 at 10:54 AM Xiaoxiang Yu <[email protected]> wrote:
> Hi friend,
>   I am glad to hear that you have resolved some of the problems after a lot
> of effort, and it is very kind of you to share what you found about
> kylin-port-replace-util.sh with us.
>   It seems that you have hit another problem in the first step of your cube
> build, which uses Hive to create a flat table. As far as I can see, the
> message you provided, "FAILED: Execution Error, return code 3 from
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", indicates that your
> Hive is not configured correctly: your Hive command runs in local mode
> rather than YARN mode. That is strange. Is the node on which you deployed
> Kylin configured correctly? Maybe you should ask your Hadoop administrator
> for help, or could you please provide more detail about how you deployed
> Kylin?
>   If you are using Kylin for the first time and are familiar with Docker,
> you could run a Docker container to get a technical preview. Please refer
> to http://kylin.apache.org/docs/install/kylin_docker.html.
>
> ----------------
> Best wishes,
> Xiaoxiang Yu
>
>
> From: Gourav Gupta <[email protected]>
> Date: Sunday, September 1, 2019, 01:24
> To: Wang rupeng <[email protected]>, Xiaoxiang Yu <[email protected]>,
> "[email protected]" <[email protected]>
> Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
>
> Dear Wang and Xiaoxiang,
> Thank you for the suggestions and solutions to the questions I raised in
> my previous mail. Truly appreciated!
>
> Following your answers, I changed the port number with
> "$KYLIN_HOME/bin/kylin-port-replace-util.sh set", but I still faced the
> same issue (not able to access the Kylin UI). After hours of
> troubleshooting I found the cause: another Java application was already
> running on port 9009, and Kylin uses three ports: 7070, 9009 and 7443.
> Once I stopped the process that was occupying 9009, I was able to access
> the Kylin Web UI.
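
The port conflict described above can be checked before starting Kylin. The following is a rough sketch, assuming a Linux host where `ss -ltn` (from iproute2) is available; the helper names `port_in_listing` and `check_kylin_ports` are hypothetical, and the port list is the one quoted in this message.

```shell
# Ports named in the thread as used by Kylin.
KYLIN_PORTS="7070 9009 7443"

# Succeeds if the `ss -ltn` style listing on stdin shows a listener on port $1.
# Column 4 of that listing is the local address in the form addr:port.
port_in_listing() {
  awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Reads a listing on stdin and reports every Kylin port that is already taken.
check_kylin_ports() {
  listing="$(cat)"
  for port in $KYLIN_PORTS; do
    if printf '%s\n' "$listing" | port_in_listing "$port"; then
      echo "port $port is already in use"
    fi
  done
}
```

A typical invocation would be `ss -ltn | check_kylin_ports`; no output means none of the three ports has a listener.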
>
> I am now facing one more error, "FAILED: Execution Error, return code 3
> from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", when I try to
> build a cube in MapReduce mode. I searched for the error and changed the
> Kylin and Hive properties as suggested at
> https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask
> but I am still not able to resolve it.
>
> Please let me know if there is any way to resolve this issue. I am
> attaching a screenshot of the error.
>
> Thanks in advance.
>
> Best Regards,
> Gourav Gupta
>
> On Fri, Aug 30, 2019 at 1:00 PM Wang rupeng <[email protected]> wrote:
> Hi Gupta,
>     You can change the Kylin port with the following command; the new port
> is 7070 plus the number you set:
>     $KYLIN_HOME/bin/kylin-port-replace-util.sh set <number>
>     If the Kylin web UI cannot be opened, check the Kylin log at
> $KYLIN_HOME/logs/kylin.log for more details.
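
As a small illustration of the offset rule above, the sketch below computes the resulting web UI port. The helper name `kylin_http_port` and the offset value 10 are illustrative assumptions; only the base port 7070 and the "7070 plus the number you set" rule come from the message.

```shell
# Base HTTP port named in the thread.
KYLIN_BASE_PORT=7070

# Hypothetical helper: the port the web UI would listen on after
# `kylin-port-replace-util.sh set <offset>`, following the rule
# "7070 plus the number you set".
kylin_http_port() {
  echo $((KYLIN_BASE_PORT + $1))
}

kylin_http_port 10   # prints 7080
```

If the UI still does not open on the computed port, something like `tail -n 100 "$KYLIN_HOME/logs/kylin.log"` shows the most recent entries of the log file mentioned above.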
> Here are some suggestions regarding your questions:
>     1. You need to set the environment variable
> SPARK_HOME=/local/path/to/spark so that Kylin can start successfully, even
> if you don't use Spark to build cubes. It is best to use the suggested
> version of Spark (spark-2.3.2), which you can download with
> $KYLIN_HOME/bin/download-spark.sh.
>     2. The CDH versions supported by Kylin are cdh5.7+, cdh6.0 and cdh6.1,
> and you don't have to care about the HBase version if you are using CDH.
> Since you are using cdh5.16, you can download
> apache-kylin-<version>-bin-cdh57.tar.gz from
> http://kylin.apache.org/download/
>     3. You don't have to install Kylin on the master node; any other node
> in the cluster would be OK.
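
The first suggestion above can be verified with a quick pre-flight check before starting Kylin. This is only a sketch: the helper name `check_spark_home` is hypothetical, and it assumes a standard Spark layout with `bin/spark-submit` under the Spark directory.

```shell
# Hypothetical helper: succeed only if SPARK_HOME points at a directory
# that contains an executable bin/spark-submit.
check_spark_home() {
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
    echo "no executable spark-submit under $SPARK_HOME/bin" >&2
    return 1
  fi
  echo "SPARK_HOME looks usable: $SPARK_HOME"
}
```

For example, `export SPARK_HOME=/local/path/to/spark && check_spark_home` (the path is the placeholder from the message) reports whether the variable points at a usable Spark directory.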
>
> -------------------
> Best wishes,
> Rupeng Wang
>
>
> From: Gourav Gupta <[email protected]>
> Date: Friday, August 30, 2019, 02:03
> To: Wang rupeng <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
>
> Thanks a lot, Wang, for the prompt and helpful reply. Today I removed the
> old version of Kylin and successfully installed Apache Kylin 2.6 for CDH,
> but now we are unable to open the Kylin Web UI. I changed port number 7070
> to another number in server.xml (in the Tomcat directory), but I still
> face the same issue.
>
> I have some questions about configuring Kylin:
>
> 1. Should I set the path of the Spark on the master node, or the path of
> the Spark that comes with Kylin?
> 2. Which tar file is suitable for Cloudera 5.16? What is the purpose of
> the Kylin-HBase version?
> 3. Do I need to install and configure Kylin on the master node? Will an
> installation on an edge node work?
>
> We are trying to switch the visualization layer from a SQL (OLAP) -
> PowerBI pipeline to KYLIN - MEAN stack (open source/enterprise version),
> so your help is much appreciated.
>
> I look forward to your response.
>
>
> Regards,
> Gourav Gupta
>
> On Thu, Aug 29, 2019 at 5:43 PM Wang rupeng <[email protected]> wrote:
> Hi,
>     It seems the problem is the following:
>     "60505 [dispatcher-event-loop-6] ERROR
> org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 1 on
> *********: Container marked as failed:"
>     This usually happens when YARN does not have enough memory, so the
> container is killed for lack of memory. You can open the YARN
> ResourceManager web page and check the YARN log for more details.
>     If it is a memory issue, you can try to allocate more memory to the
> Spark YARN executor by changing the following configuration item in
> "$KYLIN_HOME/conf/kylin.properties":
>     kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead=384
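
One way to apply such a change from the command line is sketched below. The helper `set_property` is hypothetical, and the value 1024 is only an illustrative number, not a recommendation from the thread; the property key and the file path are the ones quoted above.

```shell
# Hypothetical helper: set KEY=VALUE in a Java-style properties file,
# replacing an existing line for that exact key or appending a new one.
# Usage: set_property FILE KEY VALUE
set_property() {
  file="$1"
  awk -F= -v k="$2" -v v="$3" '
    $1 == k { print k "=" v; done = 1; next }   # rewrite the matching key
    { print }                                   # keep every other line
    END { if (!done) print k "=" v }            # append if the key was absent
  ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

A call such as `set_property "$KYLIN_HOME/conf/kylin.properties" kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead 1024` would then raise the overhead; a commented-out line for the same key is left untouched and the new setting is appended.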
>
>
> -------------------
> Best wishes,
> Rupeng Wang
>
>
> On 2019/8/29 14:57, "Gourav Gupta" <[email protected]> wrote:
>
>     Hi Sir,
>
>     I have installed and configured Apache Kylin 2.4 on the Cloudera
>     platform to build a cube.
>
>     I am able to build a cube in MapReduce mode, but I get the error below
>     when the build runs in Spark mode. I have followed all the steps and
>     tried many remedies while debugging the problem.
>
>     Please let me know how to resolve this issue. Thanks in advance.
>
>
>
>
>
>     1091 [main] ERROR org.apache.spark.SparkContext  - Error adding jar (java.lang.IllegalArgumentException: requirement failed: JAR kylin-job-2.4.0.jar already registered.), was the --addJars option used?
>
>     [Stage 0:>                                                          (0 + 0) / 2]
>     [Stage 0:>                                                          (0 + 2) / 2]
>
>
>     60505 [dispatcher-event-loop-6] ERROR org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 1 on *********: Container marked as failed: container_e62_1566915974858_6628_01_000003 on host: *******. Exit status: 50. Diagnostics: Exception from container-launch.
>     Container id: container_e62_1566915974858_6628_01_000003
>     Exit code: 50
>     Stack trace: ExitCodeException exitCode=50:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
>     at org.apache.hadoop.util.Shell.run(Shell.java:507)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
>
>     Container exited with a non-zero exit code 50
>
>     82664 [dispatcher-event-loop-5] ERROR org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 2 on *******: Container marked as failed: container_e62_1566915974858_6628_01_000004 on host: *******. Exit status: 50. Diagnostics: Exception from container-launch.
>     Container id: container_e62_1566915974858_6628_01_000004
>     Exit code: 50
>     Stack trace: ExitCodeException exitCode=50:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
>     at org.apache.hadoop.util.Shell.run(Shell.java:507)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
>
>     Container exited with a non-zero exit code 50
>
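
Both failed containers in the log above belong to the same YARN application, and the aggregated container logs are usually the quickest way to see why an executor exited with code 50. The sketch below derives the application ID from a container ID; the helper name `app_id_from_container` is hypothetical, and it assumes the usual `container_[e<epoch>_]<clusterTimestamp>_<appNumber>_<attempt>_<container>` naming.

```shell
# Hypothetical helper: derive the YARN application ID from a container ID.
# container_e62_1566915974858_6628_01_000003 -> application_1566915974858_6628
app_id_from_container() {
  echo "$1" | sed -E 's/^container_(e[0-9]+_)?([0-9]+)_([0-9]+)_.*$/application_\2_\3/'
}

app_id_from_container container_e62_1566915974858_6628_01_000003
# prints application_1566915974858_6628
```

The result can be passed to `yarn logs -applicationId <id>`, which should return the executor logs provided log aggregation is enabled on the cluster.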
>
>     The command is:
>     export HADOOP_CONF_DIR=/etc/hadoop/conf && /usr/lib/spark/bin/spark-submit \
>       --class org.apache.kylin.common.util.SparkEntry \
>       --conf spark.executor.instances=1 \
>       --conf spark.yarn.archive=hdfs://namenode:8020/kylin/spark/spark-libs.jar \
>       --conf spark.yarn.queue=default \
>       --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=current \
>       --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history \
>       --conf spark.driver.extraJavaOptions=-Dhdp.version=current \
>       --conf spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec \
>       --conf spark.master=yarn \
>       --conf spark.executor.extraJavaOptions=-Dhdp.version=current \
>       --conf spark.hadoop.yarn.timeline-service.enabled=false \
>       --conf spark.executor.memory=4G \
>       --conf spark.eventLog.enabled=true \
>       --conf spark.eventLog.dir=hdfs:///kylin/spark-history \
>       --conf spark.executor.cores=2 \
>       --conf spark.submit.deployMode=cluster \
>       --jars /opt/apache-kylin-2.4.0-bin-cdh57/lib/kylin-job-2.4.0.jar \
>       /opt/apache-kylin-2.4.0-bin-cdh57/lib/kylin-job-2.4.0.jar \
>       -className org.apache.kylin.engine.spark.SparkCubingByLayer \
>       -hiveTable default.kylin_intermediate_kylin_sales_cube_c1526d16_9719_4dec_be41_346f43654e02 \
>       -input hdfs://nameservice1/kylin/kylin_metadata/kylin-2159d40b-f14e-4500-af95-1fbfd5a4073f/kylin_intermediate_kylin_sales_cube_c1526d16_9719_4dec_be41_346f43654e02 \
>       -segmentId c1526d16-9719-4dec-be41-346f43654e02 \
>       -metaUrl kylin_metadata@hdfs,path=hdfs://nameservice1/kylin/kylin_metadata/kylin-2159d40b-f14e-4500-af95-1fbfd5a4073f/kylin_sales_cube/metadata \
>       -output hdfs://nameservice1/kylin/kylin_metadata/kylin-2159d40b-f14e-4500-af95-1fbfd5a4073f/kylin_sales_cube/cuboid/ \
>       -cubename kylin_sales_cube
>
