[ 
https://issues.apache.org/jira/browse/KYLIN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116644#comment-17116644
 ] 

Xiaoxiang Yu commented on KYLIN-4522:
-------------------------------------

Dear [~cimolinal], have you tried another version, such as Kylin 2.5 or an 
earlier release?
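
For context on the error quoted below: on the JVM, "Could not initialize class X" usually means that X was found, but its static initializer already failed on an earlier attempt in the same JVM. The first failure (typically an ExceptionInInitializerError earlier in the executor log) is the one that tends to name the real cause. Below is a minimal Java sketch of that behaviour. The Fragile class and its simulated failure are hypothetical stand-ins for illustration only; this is not Kylin or HBase code, and it does not show what actually fails inside HFile.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInitDemo {

    // Hypothetical stand-in for a class whose static initializer fails,
    // for example because something it depends on is absent at runtime.
    static class Fragile {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("simulated missing dependency");
        }
    }

    // Touches Fragile twice and records which Throwable each attempt raises.
    static List<String> probe() {
        List<String> seen = new ArrayList<>();
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                System.out.println(Fragile.VALUE);
            } catch (Throwable t) {
                seen.add(t.getClass().getSimpleName());
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // In a fresh JVM the first attempt raises ExceptionInInitializerError
        // and the second raises NoClassDefFoundError.
        System.out.println(probe());
    }
}
```

If the same pattern applies here, the per-executor container logs (for example via `yarn logs -applicationId application_1590337422418_0043`) may contain the first failure, which the driver-side summary does not show.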

> Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile Kylin 2.6.6 
> EMR  5.19
> ---------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4522
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4522
>             Project: Kylin
>          Issue Type: Bug
>          Components: Environment , Job Engine, Others
>    Affects Versions: v2.6.6
>         Environment: Release label: emr-5.19.0
> Hadoop distribution:Amazon 2.8.5
> Applications: Hive 2.3.3, HBase 1.4.7, Spark 2.3.2, Livy 0.5.0, ZooKeeper 
> 3.4.13, Sqoop 1.4.7, Oozie 5.0.0, Pig 0.17.0, HCatalog 2.3.3
>            Reporter: Carlos Ignacio Molina López
>            Priority: Major
>         Attachments: base_2020_05_25_14_29_52.zip
>
>
> Hi,
> I've tried to build the sample kylin_sales_cube with Spark on an Amazon EMR 
> cluster. I saw issue KYLIN-3931, where the suggestion is to use the 2.6.6 
> engine for Hadoop 3. On EMR, Hadoop 3 is only available in EMR 6.0, which is 
> very recent. I tried to set up versions 2.6.6 and 3.0.2 for Hadoop 3, but in 
> both cases the Kylin site doesn't show up (Error 404 - Not Found). So I tried 
> EMR 5.19, which has the same version of Spark (2.3.2) used by Kylin 2.6.6.
> I am getting a "java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.hbase.io.hfile.HFile" error message. 
> I have already copied the following jars to the Spark jars folder, as per the 
> documentation and what I've read in the kylin-issues mailing list archives:
> /usr/lib/hbase/hbase-hadoop-compat-1.4.7.jar
> /usr/lib/hbase/hbase-hadoop2-compat-1.4.7.jar
> /usr/lib/hbase/lib/hbase-common-1.4.7-tests.jar
> /usr/lib/hbase/lib/hbase-common-1.4.7.jar
> /usr/lib/hbase/hbase-client.jar
> /usr/lib/hbase/hbase-client-1.4.7.jar
> /usr/lib/hbase/hbase-server-1.4.7.jar
>  
> This is the output shown on the step:
> {{org.apache.kylin.engine.spark.exception.SparkException: OS command error 
> exit with return code: 1, error message: 20/05/25 14:03:46 WARN SparkConf: 
> The configuration key 'spark.yarn.executor.memoryOverhead' has been 
> deprecated as of Spark 2.3 and may be removed in the future. Please use the 
> new key 'spark.executor.memoryOverhead' 
> instead.org.apache.kylin.engine.spark.exception.SparkException: OS command 
> error exit with return code: 1, error message: 20/05/25 14:03:46 WARN 
> SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has 
> been deprecated as of Spark 2.3 and may be removed in the future. Please use 
> the new key 'spark.executor.memoryOverhead' instead.20/05/25 14:03:47 INFO 
> RMProxy: Connecting to ResourceManager at 
> ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal/XXX.XXX.XXX.XXX:803220/05/25 
> 14:03:49 INFO Client: Requesting a new application from cluster with 4 
> NodeManagers20/05/25 14:03:49 INFO Client: Verifying our application has not 
> requested more than the maximum memory capability of the cluster (6144 MB per 
> container)20/05/25 14:03:49 INFO Client: Will allocate AM container, with 
> 5632 MB memory including 512 MB overhead20/05/25 14:03:49 INFO Client: 
> Setting up container launch context for our AM20/05/25 14:03:49 INFO Client: 
> Setting up the launch environment for our AM container20/05/25 14:03:49 INFO 
> Client: Preparing resources for our AM container20/05/25 14:03:51 WARN 
> Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back 
> to uploading libraries under SPARK_HOME.20/05/25 14:03:54 INFO Client: 
> Uploading resource 
> file:/mnt/tmp/spark-d26c4f1f-1b8a-4cf8-a05b-842294ce017d/__spark_libs__4034657074333893156.zip
>  -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/__spark_libs__4034657074333893156.zip20/05/25
>  14:03:54 INFO Client: Uploading resource 
> file:/usr/local/kylin/apache-kylin-2.6.6-bin-hbase1x/lib/kylin-job-2.6.6.jar 
> -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/kylin-job-2.6.6.jar20/05/25
>  14:03:55 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/hbase-common-1.4.7.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-common-1.4.7.jar20/05/25
>  14:03:55 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/hbase-server-1.4.7.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-server-1.4.7.jar20/05/25
>  14:03:55 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/hbase-client-1.4.7.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-client-1.4.7.jar20/05/25
>  14:03:55 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/hbase-protocol-1.4.7.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-protocol-1.4.7.jar20/05/25
>  14:03:55 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-hadoop-compat-1.4.7.jar20/05/25
>  14:03:56 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/htrace-core-3.1.0-incubating.jar20/05/25
>  14:03:56 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/metrics-core-2.2.0.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/metrics-core-2.2.0.jar20/05/25
>  14:03:56 WARN Client: Same path resource 
> file:///usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar added multiple times 
> to distributed cache.20/05/25 14:03:56 INFO Client: Uploading resource 
> file:/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.7.jar -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-hadoop2-compat-1.4.7.jar20/05/25
>  14:03:56 INFO Client: Uploading resource file:/etc/spark/conf/hive-site.xml 
> -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hive-site.xml20/05/25
>  14:03:56 INFO Client: Uploading resource 
> file:/mnt/tmp/spark-d26c4f1f-1b8a-4cf8-a05b-842294ce017d/__spark_conf__1997289269037988671.zip
>  -> 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/__spark_conf__.zip20/05/25
>  14:03:56 INFO SecurityManager: Changing view acls to: hadoop20/05/25 
> 14:03:56 INFO SecurityManager: Changing modify acls to: hadoop20/05/25 
> 14:03:56 INFO SecurityManager: Changing view acls groups to: 20/05/25 
> 14:03:56 INFO SecurityManager: Changing modify acls groups to: 20/05/25 
> 14:03:56 INFO SecurityManager: SecurityManager: authentication disabled; ui 
> acls disabled; users  with view permissions: Set(hadoop); groups with view 
> permissions: Set(); users  with modify permissions: Set(hadoop); groups with 
> modify permissions: Set()20/05/25 14:03:56 INFO Client: Submitting 
> application application_1590337422418_0043 to ResourceManager20/05/25 
> 14:03:56 INFO YarnClientImpl: Submitted application 
> application_1590337422418_004320/05/25 14:03:57 INFO Client: Application 
> report for application_1590337422418_0043 (state: ACCEPTED)20/05/25 14:03:57 
> INFO Client:  client token: N/A diagnostics: AM container is launched, 
> waiting for AM container to Register with RM ApplicationMaster host: N/A 
> ApplicationMaster RPC port: -1 queue: default start time: 1590415436952 final 
> status: UNDEFINED tracking URL: 
> http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
>  user: hadoop20/05/25 14:03:58 INFO Client: Application report for 
> application_1590337422418_0043 (state: ACCEPTED)20/05/25 14:03:59 INFO 
> Client: Application report for application_1590337422418_0043 (state: 
> ACCEPTED)20/05/25 14:04:00 INFO Client: Application report for 
> application_1590337422418_0043 (state: ACCEPTED)20/05/25 14:04:01 INFO 
> Client: Application report for application_1590337422418_0043 (state: 
> ACCEPTED)20/05/25 14:04:02 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:02 INFO Client: 
>  client token: N/A diagnostics: N/A ApplicationMaster host: XXX.XXX.XXX.XXX 
> ApplicationMaster RPC port: 0 queue: default start time: 1590415436952 final 
> status: UNDEFINED tracking URL: 
> http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
>  user: hadoop20/05/25 14:04:03 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:04 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:05 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:06 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:07 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:08 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:09 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:10 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:11 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:12 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:13 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:14 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:15 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:16 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:17 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:18 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:19 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:21 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:22 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:23 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:24 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:25 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:26 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:27 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:28 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:29 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:30 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:31 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:32 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:33 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:34 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:35 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:36 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:37 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:38 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:39 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:40 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:41 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:42 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:43 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:44 INFO Client: Application report for 
> application_1590337422418_0043 (state: ACCEPTED)20/05/25 14:04:44 INFO 
> Client:  client token: N/A diagnostics: AM container is launched, waiting for 
> AM container to Register with RM ApplicationMaster host: N/A 
> ApplicationMaster RPC port: -1 queue: default start time: 1590415436952 final 
> status: UNDEFINED tracking URL: 
> http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
>  user: hadoop20/05/25 14:04:45 INFO Client: Application report for 
> application_1590337422418_0043 (state: ACCEPTED)20/05/25 14:04:46 INFO 
> Client: Application report for application_1590337422418_0043 (state: 
> ACCEPTED)20/05/25 14:04:47 INFO Client: Application report for 
> application_1590337422418_0043 (state: ACCEPTED)20/05/25 14:04:48 INFO 
> Client: Application report for application_1590337422418_0043 (state: 
> ACCEPTED)20/05/25 14:04:49 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:49 INFO Client: 
>  client token: N/A diagnostics: N/A ApplicationMaster host: XXX.XXX.XXX.XXX 
> ApplicationMaster RPC port: 0 queue: default start time: 1590415436952 final 
> status: UNDEFINED tracking URL: 
> http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
>  user: hadoop20/05/25 14:04:50 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:51 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:52 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:53 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:54 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:55 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:56 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:57 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:04:58 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:04:59 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:00 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:01 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:02 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:03 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:04 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:05 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:06 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:07 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:08 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:09 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:10 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:11 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:12 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:13 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:14 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:15 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:16 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:17 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:18 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:19 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:20 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:21 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:22 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:23 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:24 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:25 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> RUNNING)20/05/25 14:05:26 INFO Client: Application report for 
> application_1590337422418_0043 (state: RUNNING)20/05/25 14:05:27 INFO Client: 
> Application report for application_1590337422418_0043 (state: 
> FINISHED)20/05/25 14:05:27 INFO Client:  client token: N/A diagnostics: User 
> class threw exception: java.lang.RuntimeException: error execute 
> org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Job aborted. 
> at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>  at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)Caused
>  by: org.apache.spark.SparkException: Job aborted. at 
> org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
>  at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
>  at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
>  at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
>  at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>  at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) at 
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
>  at 
> org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
>  at 
> org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
>  at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>  ... 6 moreCaused by: org.apache.spark.SparkException: Job aborted due to 
> stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost 
> task 1.3 in stage 1.0 (TID 15, ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal, 
> executor 3): org.apache.spark.SparkException: Task failed while writing rows 
> at 
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at 
> org.apache.spark.scheduler.Task.run(Task.scala:109) at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)Caused by: 
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.hbase.io.hfile.HFile at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:880)
>  at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:805)
>  at 
> org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:739)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.getNewWriter(HFileOutputFormat3.java:224)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:181)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:153)
>  at 
> org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
>  at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
>  ... 8 more}}
> {{Driver stacktrace: at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1803)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1791)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1790)
>  at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1790) 
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:871)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:871)
>  at scala.Option.foreach(Option.scala:257) at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:871)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2024)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1973)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1962)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:682) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:2034) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:2055) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:2087) at 
> org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
>  ... 16 moreCaused by: org.apache.spark.SparkException: Task failed while 
> writing rows at 
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at 
> org.apache.spark.scheduler.Task.run(Task.scala:109) at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)Caused by: 
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.hbase.io.hfile.HFile at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:880)
>  at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:805)
>  at 
> org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:739)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.getNewWriter(HFileOutputFormat3.java:224)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:181)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:153)
>  at 
> org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
>  at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
>  ... 8 more}}
> {{ ApplicationMaster host: XXX.XXX.XXX.XXX ApplicationMaster RPC port: 0 
> queue: default start time: 1590415436952 final status: FAILED tracking URL: 
> http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
>  user: hadoopException in thread "main" org.apache.spark.SparkException: 
> Application application_1590337422418_0043 finished with failed status at 
> org.apache.spark.deploy.yarn.Client.run(Client.scala:1165) at 
> org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520) 
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
>  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198) 
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228) at 
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137) at 
> org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)20/05/25 14:05:27 
> INFO ShutdownHookManager: Shutdown hook called20/05/25 14:05:27 INFO 
> ShutdownHookManager: Deleting directory 
> /mnt/tmp/spark-04e9eed4-d16e-406c-9fb0-972cf355db0920/05/25 14:05:27 INFO 
> ShutdownHookManager: Deleting directory 
> /mnt/tmp/spark-d26c4f1f-1b8a-4cf8-a05b-842294ce017dThe command is: export 
> HADOOP_CONF_DIR=/etc/hadoop/conf && /usr/lib/spark/bin/spark-submit --class 
> org.apache.kylin.common.util.SparkEntry  --conf spark.executor.instances=40  
> --conf spark.yarn.queue=default  --conf 
> spark.history.fs.logDirectory=hdfs:///kylin/spark-history  --conf 
> spark.master=yarn  --conf spark.hadoop.yarn.timeline-service.enabled=false  
> --conf spark.executor.memory=5G  --conf spark.eventLog.enabled=true  --conf 
> spark.eventLog.dir=hdfs:///kylin/spark-history  --conf 
> spark.yarn.executor.memoryOverhead=1024  --conf spark.driver.memory=5G  
> --conf spark.submit.deployMode=cluster  --conf 
> spark.shuffle.service.enabled=true --jars 
> /usr/lib/hbase/lib/hbase-common-1.4.7.jar,/usr/lib/hbase/lib/hbase-server-1.4.7.jar,/usr/lib/hbase/lib/hbase-client-1.4.7.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.7.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.7.jar,
>  /usr/local/kylin/apache-kylin-2.6.6-bin-hbase1x/lib/kylin-job-2.6.6.jar 
> -className org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/rowkey_stats/part-r-00000_hfile
>  -counterOutput 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/counter
>  -cubename kylin_sales_cube -output 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/hfile
>  -input 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/cuboid/
>  -segmentId 0d22a9ac-5256-02cd-a5b9-44de5247871f -metaUrl 
> kylin_metadata@hdfs,path=hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/metadata
>  -hbaseConfPath 
> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/hbase-conf.xml
>  at 
> org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:347)
>  at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>  at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at 
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)}}
>  
> {{Please suggest how I can troubleshoot this issue.}}
> Thank you and kind regards
> {{Carlos Molina.}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
