Hello,
We are testing the 2.6 RC and are facing a systematic issue when building
cubes with the Spark engine (even with the sample cube), whereas the MapReduce
engine succeeds.
The job fails at step #8, Step Name: Convert Cuboid Data to HFile, with
the following error (the full log is included below):
ClassNotFoundException: org.apache.hadoop.hbase.metrics.MetricRegistry
We run Kylin on AWS EMR 5.13 (it also failed with 5.17).
Do you have any idea why this happens?
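A quick check that can be run on the EMR master (a minimal sketch, assuming the standard EMR layout under /usr/lib/hbase/lib and that unzip is available) is to look for a jar that actually ships the missing class; on HBase 1.4.x it is expected to live in hbase-metrics-api-*.jar:

# Search every HBase jar for the class the Spark executors cannot find.
for j in /usr/lib/hbase/lib/*.jar; do
  if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/hbase/metrics/MetricRegistry.class'; then
    echo "$j"
  fi
done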
Hubert
OS command error exit with return code: 1, error message: 2019-01-11 08:39:24
WARN SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead'
has been deprecated as of Spark 2.3 and may be removed in the future. Please
use the new key 'spark.executor.memoryOverhead' instead.
SparkEntry args:-className org.apache.kylin.storage.hbase.steps.SparkCubeHFile
-partitions
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/rowkey_stats/part-r-00000_hfile
-counterOutput
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/counter
-cubename kylin_sales_cube -output
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile
-input
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/
-segmentId f944e1a8-506a-7f5e-4d6a-389a3ce53489 -metaUrl
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
-hbaseConfPath
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
Running org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/rowkey_stats/part-r-00000_hfile
-counterOutput
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/counter
-cubename kylin_sales_cube -output
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile
-input
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/
-segmentId f944e1a8-506a-7f5e-4d6a-389a3ce53489 -metaUrl
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
-hbaseConfPath
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
2019-01-11 08:39:25 WARN SparkConf:66 - The configuration key
'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and
may be removed in the future. Please use the new key
'spark.executor.memoryOverhead' instead.
2019-01-11 08:39:25 INFO SparkContext:54 - Running Spark version 2.3.2
2019-01-11 08:39:25 INFO SparkContext:54 - Submitted application: Converting
HFile for:kylin_sales_cube segment f944e1a8-506a-7f5e-4d6a-389a3ce53489
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing view acls to: hadoop
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing modify acls to: hadoop
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-11 08:39:25 INFO SecurityManager:54 - SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(hadoop); groups
with view permissions: Set(); users with modify permissions: Set(hadoop);
groups with modify permissions: Set()
2019-01-11 08:39:26 INFO Utils:54 - Successfully started service 'sparkDriver'
on port 46289.
2019-01-11 08:39:26 INFO SparkEnv:54 - Registering MapOutputTracker
2019-01-11 08:39:26 INFO SparkEnv:54 - Registering BlockManagerMaster
2019-01-11 08:39:26 INFO BlockManagerMasterEndpoint:54 - Using
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-01-11 08:39:26 INFO BlockManagerMasterEndpoint:54 -
BlockManagerMasterEndpoint up
2019-01-11 08:39:26 INFO DiskBlockManager:54 - Created local directory at
/mnt/tmp/blockmgr-985a20b9-acaf-4178-a6a8-93e18739038d
2019-01-11 08:39:26 INFO MemoryStore:54 - MemoryStore started with capacity
912.3 MB
2019-01-11 08:39:26 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2019-01-11 08:39:26 INFO log:192 - Logging initialized @2530ms
2019-01-11 08:39:26 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp:
unknown, git hash: unknown
2019-01-11 08:39:26 INFO Server:419 - Started @2639ms
2019-01-11 08:39:26 WARN Utils:66 - Service 'SparkUI' could not bind on port
4040. Attempting port 4041.
2019-01-11 08:39:26 INFO AbstractConnector:278 - Started
ServerConnector@1f12e153{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
2019-01-11 08:39:26 INFO Utils:54 - Successfully started service 'SparkUI' on
port 4041.
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@41477a6d{/jobs,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@585ac855{/jobs/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@5bb8f9e2{/jobs/job,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@5f78de22{/jobs/job/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@516ebdf8{/stages,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@4d8539de{/stages/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@3eba57a7{/stages/stage,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@67207d8a{/stages/stage/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@bcb09a6{/stages/pool,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@7c2a69b4{/stages/pool/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@a619c2{/storage,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@648ee871{/storage/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@375b5b7f{/storage/rdd,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@1813f3e9{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@28cb9120{/environment,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@3b152928{/environment/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@56781d96{/executors,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@5173200b{/executors/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@25c5e994{/executors/threadDump,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@378bd86d{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@2189e7a7{/static,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@1ee29c84{/,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@7c8326a4{/api,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@c9d82f9{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@6f012914{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at
http://ip-172-31-33-160.eu-west-1.compute.internal:4041
2019-01-11 08:39:26 INFO SparkContext:54 - Added JAR
file:/opt/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar at
spark://ip-172-31-33-160.eu-west-1.compute.internal:46289/jars/kylin-job-2.6.0.jar
with timestamp 1547195966698
2019-01-11 08:39:27 INFO RMProxy:98 - Connecting to ResourceManager at
ip-172-31-33-160.eu-west-1.compute.internal/172.31.33.160:8032
2019-01-11 08:39:27 INFO Client:54 - Requesting a new application from cluster
with 3 NodeManagers
2019-01-11 08:39:27 INFO Client:54 - Verifying our application has not
requested more than the maximum memory capability of the cluster (5760 MB per
container)
2019-01-11 08:39:27 INFO Client:54 - Will allocate AM container, with 896 MB
memory including 384 MB overhead
2019-01-11 08:39:27 INFO Client:54 - Setting up container launch context for
our AM
2019-01-11 08:39:27 INFO Client:54 - Setting up the launch environment for our
AM container
2019-01-11 08:39:27 INFO Client:54 - Preparing resources for our AM container
2019-01-11 08:39:29 WARN Client:66 - Neither spark.yarn.jars nor
spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2019-01-11 08:39:32 INFO Client:54 - Uploading resource
file:/mnt/tmp/spark-1257f53e-11f2-48d4-9ee7-65a1f9ea878b/__spark_libs__4991678983370181637.zip
->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/__spark_libs__4991678983370181637.zip
2019-01-11 08:39:33 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/hbase-common-1.4.2.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-common-1.4.2.jar
2019-01-11 08:39:33 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/hbase-server-1.4.2.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-server-1.4.2.jar
2019-01-11 08:39:33 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/hbase-client-1.4.2.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-client-1.4.2.jar
2019-01-11 08:39:33 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/hbase-protocol-1.4.2.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-protocol-1.4.2.jar
2019-01-11 08:39:34 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-hadoop-compat-1.4.2.jar
2019-01-11 08:39:34 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/htrace-core-3.1.0-incubating.jar
2019-01-11 08:39:34 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/metrics-core-2.2.0.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/metrics-core-2.2.0.jar
2019-01-11 08:39:34 WARN Client:66 - Same path resource
file:///usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar added multiple times to
distributed cache.
2019-01-11 08:39:34 INFO Client:54 - Uploading resource
file:/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.2.jar ->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-hadoop2-compat-1.4.2.jar
2019-01-11 08:39:35 INFO Client:54 - Uploading resource
file:/mnt/tmp/spark-1257f53e-11f2-48d4-9ee7-65a1f9ea878b/__spark_conf__1228777866751535130.zip
->
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/__spark_conf__.zip
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing view acls to: hadoop
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing modify acls to: hadoop
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-11 08:39:35 INFO SecurityManager:54 - SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(hadoop); groups
with view permissions: Set(); users with modify permissions: Set(hadoop);
groups with modify permissions: Set()
2019-01-11 08:39:35 INFO Client:54 - Submitting application
application_1547193611202_0020 to ResourceManager
2019-01-11 08:39:35 INFO YarnClientImpl:273 - Submitted application
application_1547193611202_0020
2019-01-11 08:39:35 INFO SchedulerExtensionServices:54 - Starting Yarn
extension services with app application_1547193611202_0020 and attemptId None
2019-01-11 08:39:36 INFO Client:54 - Application report for
application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:36 INFO Client:54 -
client token: N/A
diagnostics: AM container is launched, waiting for AM container to
Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1547195975352
final status: UNDEFINED
tracking URL:
http://ip-172-31-33-160.eu-west-1.compute.internal:20888/proxy/application_1547193611202_0020/
user: hadoop
2019-01-11 08:39:37 INFO Client:54 - Application report for
application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:38 INFO Client:54 - Application report for
application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:39 INFO Client:54 - Application report for
application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:40 INFO Client:54 - Application report for
application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:40 INFO YarnClientSchedulerBackend:54 - Add WebUI Filter.
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS ->
ip-172-31-33-160.eu-west-1.compute.internal, PROXY_URI_BASES ->
http://ip-172-31-33-160.eu-west-1.compute.internal:20888/proxy/application_1547193611202_0020),
/proxy/application_1547193611202_0020
2019-01-11 08:39:40 INFO JettyUtils:54 - Adding filter
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs,
/jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage,
/stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json,
/storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors,
/executors/json, /executors/threadDump, /executors/threadDump/json, /static, /,
/api, /jobs/job/kill, /stages/stage/kill.
2019-01-11 08:39:40 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 -
ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2019-01-11 08:39:41 INFO Client:54 - Application report for
application_1547193611202_0020 (state: RUNNING)
2019-01-11 08:39:41 INFO Client:54 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 172.31.44.79
ApplicationMaster RPC port: 0
queue: default
start time: 1547195975352
final status: UNDEFINED
tracking URL:
http://ip-172-31-33-160.eu-west-1.compute.internal:20888/proxy/application_1547193611202_0020/
user: hadoop
2019-01-11 08:39:41 INFO YarnClientSchedulerBackend:54 - Application
application_1547193611202_0020 has started running.
2019-01-11 08:39:41 INFO Utils:54 - Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port 35297.
2019-01-11 08:39:41 INFO NettyBlockTransferService:54 - Server created on
ip-172-31-33-160.eu-west-1.compute.internal:35297
2019-01-11 08:39:41 INFO BlockManager:54 - Using
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication
policy
2019-01-11 08:39:41 INFO BlockManagerMaster:54 - Registering BlockManager
BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO BlockManagerMasterEndpoint:54 - Registering block
manager ip-172-31-33-160.eu-west-1.compute.internal:35297 with 912.3 MB RAM,
BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO BlockManagerMaster:54 - Registered BlockManager
BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO BlockManager:54 - external shuffle service port = 7337
2019-01-11 08:39:41 INFO BlockManager:54 - Initialized BlockManager:
BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO JettyUtils:54 - Adding filter
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
2019-01-11 08:39:41 INFO ContextHandler:781 - Started
o.s.j.s.ServletContextHandler@5657967b{/metrics/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:41 INFO EventLoggingListener:54 - Logging events to
hdfs:/kylin/spark-history/application_1547193611202_0020
2019-01-11 08:39:46 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 -
Registered executor NettyRpcEndpointRef(spark-client://Executor)
(172.31.35.36:54814) with ID 1
2019-01-11 08:39:46 INFO BlockManagerMasterEndpoint:54 - Registering block
manager ip-172-31-35-36.eu-west-1.compute.internal:40827 with 2004.6 MB RAM,
BlockManagerId(1, ip-172-31-35-36.eu-west-1.compute.internal, 40827, None)
2019-01-11 08:39:56 INFO YarnClientSchedulerBackend:54 - SchedulerBackend is
ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime:
30000(ms)
2019-01-11 08:39:56 INFO AbstractHadoopJob:515 - Ready to load KylinConfig
from uri:
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:56 INFO KylinConfig:455 - Creating new manager instance of
class org.apache.kylin.cube.CubeManager
2019-01-11 08:39:57 INFO CubeManager:133 - Initializing CubeManager with
config
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO ResourceStore:88 - Using metadata url
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
for resource store
2019-01-11 08:39:57 INFO HDFSResourceStore:74 - hdfs meta path :
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of
class org.apache.kylin.cube.CubeDescManager
2019-01-11 08:39:57 INFO CubeDescManager:91 - Initializing CubeDescManager
with config
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of
class org.apache.kylin.metadata.project.ProjectManager
2019-01-11 08:39:57 INFO ProjectManager:81 - Initializing ProjectManager with
metadata url
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of
class org.apache.kylin.metadata.cachesync.Broadcaster
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of
class org.apache.kylin.metadata.model.DataModelManager
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of
class org.apache.kylin.metadata.TableMetadataManager
2019-01-11 08:39:57 INFO MeasureTypeFactory:117 - Checking custom measure
types from kylin config
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering
COUNT_DISTINCT(hllc), class
org.apache.kylin.measure.hllc.HLLCMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering
COUNT_DISTINCT(bitmap), class
org.apache.kylin.measure.bitmap.BitmapMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering TOP_N(topn),
class org.apache.kylin.measure.topn.TopNMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering RAW(raw), class
org.apache.kylin.measure.raw.RawMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering
EXTENDED_COLUMN(extendedcolumn), class
org.apache.kylin.measure.extendedcolumn.ExtendedColumnMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering
PERCENTILE_APPROX(percentile), class
org.apache.kylin.measure.percentile.PercentileMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering
COUNT_DISTINCT(dim_dc), class
org.apache.kylin.measure.dim.DimCountDistinctMeasureType$Factory
2019-01-11 08:39:57 INFO DataModelManager:185 - Model kylin_sales_model is
missing or unloaded yet
2019-01-11 08:39:57 INFO DataModelManager:185 - Model kylin_streaming_model is
missing or unloaded yet
2019-01-11 08:39:57 INFO SparkCubeHFile:165 - Input path:
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/
2019-01-11 08:39:57 INFO SparkCubeHFile:166 - Output path:
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile
2019-01-11 08:39:57 INFO ZlibFactory:49 - Successfully loaded & initialized
native-zlib library
2019-01-11 08:39:57 INFO CodecPool:181 - Got brand-new decompressor [.deflate]
2019-01-11 08:39:57 INFO SparkCubeHFile:174 - ------- split key:
\x00\x0A\x00\x00\x00\x00\x00\x00\x00\x00\x7F\xFF\x00\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF
2019-01-11 08:39:57 INFO SparkCubeHFile:179 - There are 1 split keys, totally
2 hfiles
2019-01-11 08:39:57 INFO SparkCubeHFile:182 - Loading HBase configuration
from:hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
2019-01-11 08:39:57 WARN Configuration:2670 -
org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to
override final parameter: fs.s3.buffer.dir; Ignoring.
2019-01-11 08:39:57 WARN Configuration:2670 -
org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to
override final parameter: mapreduce.job.end-notification.max.retry.interval;
Ignoring.
2019-01-11 08:39:57 WARN Configuration:2670 -
org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to
override final parameter: yarn.nodemanager.local-dirs; Ignoring.
2019-01-11 08:39:57 WARN Configuration:2670 -
org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to
override final parameter: mapreduce.job.end-notification.max.attempts;
Ignoring.
2019-01-11 08:39:58 INFO MemoryStore:54 - Block broadcast_0 stored as values
in memory (estimated size 310.3 KB, free 912.0 MB)
2019-01-11 08:39:58 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as
bytes in memory (estimated size 26.4 KB, free 912.0 MB)
2019-01-11 08:39:58 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in
memory on ip-172-31-33-160.eu-west-1.compute.internal:35297 (size: 26.4 KB,
free: 912.3 MB)
2019-01-11 08:39:58 INFO SparkContext:54 - Created broadcast 0 from
sequenceFile at SparkUtil.java:106
2019-01-11 08:39:58 INFO FileOutputCommitter:108 - File Output Committer
Algorithm version is 1
2019-01-11 08:39:58 INFO SparkContext:54 - Starting job: runJob at
SparkHadoopWriter.scala:78
2019-01-11 08:39:58 INFO FileInputFormat:249 - Total input paths to process :
11
2019-01-11 08:39:58 INFO DAGScheduler:54 - Registering RDD 1 (flatMapToPair at
SparkCubeHFile.java:208)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Got job 0 (runJob at
SparkHadoopWriter.scala:78) with 2 output partitions
2019-01-11 08:39:58 INFO DAGScheduler:54 - Final stage: ResultStage 1 (runJob
at SparkHadoopWriter.scala:78)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Parents of final stage:
List(ShuffleMapStage 0)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Missing parents:
List(ShuffleMapStage 0)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Submitting ShuffleMapStage 0
(MapPartitionsRDD[1] at flatMapToPair at SparkCubeHFile.java:208), which has no
missing parents
2019-01-11 08:39:59 INFO MemoryStore:54 - Block broadcast_1 stored as values
in memory (estimated size 48.0 KB, free 911.9 MB)
2019-01-11 08:39:59 INFO MemoryStore:54 - Block broadcast_1_piece0 stored as
bytes in memory (estimated size 22.4 KB, free 911.9 MB)
2019-01-11 08:39:59 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in
memory on ip-172-31-33-160.eu-west-1.compute.internal:35297 (size: 22.4 KB,
free: 912.3 MB)
2019-01-11 08:39:59 INFO SparkContext:54 - Created broadcast 1 from broadcast
at DAGScheduler.scala:1039
2019-01-11 08:39:59 INFO DAGScheduler:54 - Submitting 11 missing tasks from
ShuffleMapStage 0 (MapPartitionsRDD[1] at flatMapToPair at
SparkCubeHFile.java:208) (first 15 tasks are for partitions Vector(0, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10))
2019-01-11 08:39:59 INFO YarnScheduler:54 - Adding task set 0.0 with 11 tasks
2019-01-11 08:39:59 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0
(TID 0, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0,
NODE_LOCAL, 8023 bytes)
2019-01-11 08:39:59 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in
memory on ip-172-31-35-36.eu-west-1.compute.internal:40827 (size: 22.4 KB,
free: 2004.6 MB)
2019-01-11 08:40:00 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in
memory on ip-172-31-35-36.eu-west-1.compute.internal:40827 (size: 26.4 KB,
free: 2004.6 MB)
2019-01-11 08:40:01 INFO TaskSetManager:54 - Starting task 1.0 in stage 0.0
(TID 1, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:01 INFO TaskSetManager:54 - Finished task 0.0 in stage 0.0
(TID 0) in 2882 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(1/11)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Starting task 2.0 in stage 0.0
(TID 2, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 2,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Finished task 1.0 in stage 0.0
(TID 1) in 552 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(2/11)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Starting task 3.0 in stage 0.0
(TID 3, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 3,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Finished task 2.0 in stage 0.0
(TID 2) in 503 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(3/11)
2019-01-11 08:40:03 INFO TaskSetManager:54 - Starting task 4.0 in stage 0.0
(TID 4, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 4,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:03 INFO TaskSetManager:54 - Finished task 3.0 in stage 0.0
(TID 3) in 839 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(4/11)
2019-01-11 08:40:05 INFO TaskSetManager:54 - Starting task 5.0 in stage 0.0
(TID 5, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 5,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:05 INFO TaskSetManager:54 - Finished task 4.0 in stage 0.0
(TID 4) in 1268 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(5/11)
2019-01-11 08:40:06 INFO TaskSetManager:54 - Starting task 6.0 in stage 0.0
(TID 6, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 6,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:06 INFO TaskSetManager:54 - Finished task 5.0 in stage 0.0
(TID 5) in 1286 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(6/11)
2019-01-11 08:40:07 INFO TaskSetManager:54 - Starting task 7.0 in stage 0.0
(TID 7, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 7,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:07 INFO TaskSetManager:54 - Finished task 6.0 in stage 0.0
(TID 6) in 1319 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(7/11)
2019-01-11 08:40:08 INFO TaskSetManager:54 - Starting task 8.0 in stage 0.0
(TID 8, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 8,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:08 INFO TaskSetManager:54 - Finished task 7.0 in stage 0.0
(TID 7) in 1031 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(8/11)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Starting task 9.0 in stage 0.0
(TID 9, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 9,
NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Finished task 8.0 in stage 0.0
(TID 8) in 610 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(9/11)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Starting task 10.0 in stage 0.0
(TID 10, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 10,
NODE_LOCAL, 8025 bytes)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Finished task 9.0 in stage 0.0
(TID 9) in 182 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(10/11)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Finished task 10.0 in stage 0.0
(TID 10) in 92 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1)
(11/11)
2019-01-11 08:40:09 INFO YarnScheduler:54 - Removed TaskSet 0.0, whose tasks
have all completed, from pool
2019-01-11 08:40:09 INFO DAGScheduler:54 - ShuffleMapStage 0 (flatMapToPair at
SparkCubeHFile.java:208) finished in 10.538 s
2019-01-11 08:40:09 INFO DAGScheduler:54 - looking for newly runnable stages
2019-01-11 08:40:09 INFO DAGScheduler:54 - running: Set()
2019-01-11 08:40:09 INFO DAGScheduler:54 - waiting: Set(ResultStage 1)
2019-01-11 08:40:09 INFO DAGScheduler:54 - failed: Set()
2019-01-11 08:40:09 INFO DAGScheduler:54 - Submitting ResultStage 1
(MapPartitionsRDD[3] at mapToPair at SparkCubeHFile.java:231), which has no
missing parents
2019-01-11 08:40:09 INFO MemoryStore:54 - Block broadcast_2 stored as values
in memory (estimated size 197.9 KB, free 911.7 MB)
2019-01-11 08:40:09 INFO MemoryStore:54 - Block broadcast_2_piece0 stored as
bytes in memory (estimated size 44.8 KB, free 911.7 MB)
2019-01-11 08:40:09 INFO BlockManagerInfo:54 - Added broadcast_2_piece0 in
memory on ip-172-31-33-160.eu-west-1.compute.internal:35297 (size: 44.8 KB,
free: 912.2 MB)
2019-01-11 08:40:09 INFO SparkContext:54 - Created broadcast 2 from broadcast
at DAGScheduler.scala:1039
2019-01-11 08:40:09 INFO DAGScheduler:54 - Submitting 2 missing tasks from
ResultStage 1 (MapPartitionsRDD[3] at mapToPair at SparkCubeHFile.java:231)
(first 15 tasks are for partitions Vector(0, 1))
2019-01-11 08:40:09 INFO YarnScheduler:54 - Adding task set 1.0 with 2 tasks
2019-01-11 08:40:09 INFO TaskSetManager:54 - Starting task 0.0 in stage 1.0
(TID 11, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:09 INFO BlockManagerInfo:54 - Added broadcast_2_piece0 in
memory on ip-172-31-35-36.eu-west-1.compute.internal:40827 (size: 44.8 KB,
free: 2004.5 MB)
2019-01-11 08:40:09 INFO MapOutputTrackerMasterEndpoint:54 - Asked to send map
output locations for shuffle 0 to 172.31.35.36:54814
2019-01-11 08:40:19 INFO TaskSetManager:54 - Starting task 1.0 in stage 1.0
(TID 12, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:19 WARN TaskSetManager:66 - Lost task 0.0 in stage 1.0 (TID
11, ip-172-31-35-36.eu-west-1.compute.internal, executor 1):
org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError:
Lorg/apache/hadoop/hbase/metrics/MetricRegistry;
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
at java.lang.Class.getDeclaredFields(Class.java:1916)
at
org.apache.hadoop.util.ReflectionUtils.getDeclaredFieldsIncludingInherited(ReflectionUtils.java:323)
at
org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.initRegistry(MetricsSourceBuilder.java:92)
at
org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.<init>(MetricsSourceBuilder.java:56)
at
org.apache.hadoop.metrics2.lib.MetricsAnnotations.newSourceBuilder(MetricsAnnotations.java:43)
at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:224)
at
org.apache.hadoop.hbase.metrics.BaseSourceImpl.<init>(BaseSourceImpl.java:115)
at
org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:44)
at
org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:36)
at
org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactoryImpl.createIO(MetricsRegionServerSourceFactoryImpl.java:73)
at org.apache.hadoop.hbase.io.MetricsIO.<init>(MetricsIO.java:32)
at org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:191)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.hbase.metrics.MetricRegistry
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 30 more
2019-01-11 08:40:19 INFO TaskSetManager:54 - Starting task 0.1 in stage 1.0
(TID 13, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:19 INFO TaskSetManager:54 - Lost task 1.0 in stage 1.0 (TID
12) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1:
org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
2019-01-11 08:40:24 INFO TaskSetManager:54 - Starting task 1.1 in stage 1.0
(TID 14, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:24 INFO TaskSetManager:54 - Lost task 0.1 in stage 1.0 (TID
13) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1:
org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
2019-01-11 08:40:24 INFO TaskSetManager:54 - Starting task 0.2 in stage 1.0
(TID 15, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:24 INFO TaskSetManager:54 - Lost task 1.1 in stage 1.0 (TID
14) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1:
org.apache.spark.SparkException (Task failed while writing rows) [duplicate 3]
2019-01-11 08:40:32 INFO TaskSetManager:54 - Starting task 1.2 in stage 1.0
(TID 16, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:32 WARN TaskSetManager:66 - Lost task 0.2 in stage 1.0 (TID
15, ip-172-31-35-36.eu-west-1.compute.internal, executor 1):
org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
2019-01-11 08:40:32 INFO TaskSetManager:54 - Starting task 0.3 in stage 1.0
(TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:32 INFO TaskSetManager:54 - Lost task 1.2 in stage 1.0 (TID
16) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1:
org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
2019-01-11 08:40:40 INFO TaskSetManager:54 - Starting task 1.3 in stage 1.0
(TID 18, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1,
NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:40 INFO TaskSetManager:54 - Lost task 0.3 in stage 1.0 (TID
17) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1:
org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
2019-01-11 08:40:40 ERROR TaskSetManager:70 - Task 0 in stage 1.0 failed 4
times; aborting job
2019-01-11 08:40:40 INFO YarnScheduler:54 - Cancelling stage 1
2019-01-11 08:40:40 INFO YarnScheduler:54 - Stage 1 was cancelled
2019-01-11 08:40:40 INFO DAGScheduler:54 - ResultStage 1 (runJob at
SparkHadoopWriter.scala:78) failed in 31.175 s due to Job aborted due to stage
failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3
in stage 1.0 (TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1):
org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
Driver stacktrace:
2019-01-11 08:40:40 INFO DAGScheduler:54 - Job 0 failed: runJob at
SparkHadoopWriter.scala:78, took 41.849529 s
2019-01-11 08:40:40 ERROR SparkHadoopWriter:91 - Aborting job
job_20190111083958_0003.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID
17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1):
org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
Driver stacktrace:
at
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1651)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1639)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1638)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1638)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1872)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1821)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1810)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
at
org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at
org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
at
org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
at
org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
at
org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
2019-01-11 08:40:40 WARN TaskSetManager:66 - Lost task 1.3 in stage 1.0 (TID
18, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): TaskKilled (Stage
cancelled)
2019-01-11 08:40:40 INFO YarnScheduler:54 - Removed TaskSet 1.0, whose tasks
have all completed, from pool
2019-01-11 08:40:40 INFO AbstractConnector:318 - Stopped
Spark@1f12e153{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
2019-01-11 08:40:40 INFO SparkUI:54 - Stopped Spark web UI at
http://ip-172-31-33-160.eu-west-1.compute.internal:4041
2019-01-11 08:40:41 INFO YarnClientSchedulerBackend:54 - Interrupting monitor
thread
2019-01-11 08:40:41 INFO YarnClientSchedulerBackend:54 - Shutting down all
executors
2019-01-11 08:40:41 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Asking
each executor to shut down
2019-01-11 08:40:41 INFO SchedulerExtensionServices:54 - Stopping
SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
2019-01-11 08:40:41 INFO YarnClientSchedulerBackend:54 - Stopped
2019-01-11 08:40:41 INFO MapOutputTrackerMasterEndpoint:54 -
MapOutputTrackerMasterEndpoint stopped!
2019-01-11 08:40:41 INFO MemoryStore:54 - MemoryStore cleared
2019-01-11 08:40:41 INFO BlockManager:54 - BlockManager stopped
2019-01-11 08:40:41 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2019-01-11 08:40:41 INFO
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 -
OutputCommitCoordinator stopped!
2019-01-11 08:40:41 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" java.lang.RuntimeException: error execute
org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Job aborted.
at
org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted.
at
org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at
org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
at
org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
at
org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
at
org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
... 11 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage
1.0 (TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1):
org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
Driver stacktrace:
at
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1651)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1639)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1638)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1638)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1872)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1821)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1810)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
at
org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
... 21 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at
org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at
org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
at
org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
... 8 more
2019-01-11 08:40:41 INFO ShutdownHookManager:54 - Shutdown hook called
2019-01-11 08:40:41 INFO ShutdownHookManager:54 - Deleting directory
/mnt/tmp/spark-b5366df4-8778-4643-8f72-c661ea2298e9
2019-01-11 08:40:41 INFO ShutdownHookManager:54 - Deleting directory
/mnt/tmp/spark-1257f53e-11f2-48d4-9ee7-65a1f9ea878b
The command is:
export HADOOP_CONF_DIR=/etc/hadoop/conf &&
/opt/kylin/apache-kylin-2.6.0/spark/bin/spark-submit --class
org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=40
--conf spark.yarn.queue=default --conf
spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf
spark.master=yarn --conf spark.hadoop.yarn.timeline-service.enabled=false
--conf spark.executor.memory=4G --conf spark.eventLog.enabled=true --conf
spark.eventLog.dir=hdfs:///kylin/spark-history --conf
spark.yarn.executor.memoryOverhead=1024 --conf spark.driver.memory=2G --conf
spark.shuffle.service.enabled=true --jars
/usr/lib/hbase/lib/hbase-common-1.4.2.jar,/usr/lib/hbase/lib/hbase-server-1.4.2.jar,/usr/lib/hbase/lib/hbase-client-1.4.2.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.2.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.2.jar,
/opt/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar -className
org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/rowkey_stats/part-r-00000_hfile
-counterOutput
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/counter
-cubename kylin_sales_cube -output
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile
-input
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/
-segmentId f944e1a8-506a-7f5e-4d6a-389a3ce53489 -metaUrl
kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
-hbaseConfPath
hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
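Note that the --jars list above ships hbase-common, hbase-server, hbase-client, hbase-protocol, the hadoop-compat jars, htrace-core and metrics-core, but no jar providing org.apache.hadoop.hbase.metrics.MetricRegistry. A sketch of a possible workaround (untested, and assuming hbase-metrics-api-1.4.2.jar and hbase-metrics-1.4.2.jar exist under /usr/lib/hbase/lib on the EMR node) would be to append those jars to the submitted dependencies, e.g.:

--jars /usr/lib/hbase/lib/hbase-common-1.4.2.jar,...,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.2.jar,\
/usr/lib/hbase/lib/hbase-metrics-api-1.4.2.jar,\
/usr/lib/hbase/lib/hbase-metrics-1.4.2.jar,\
/opt/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar

Whether Kylin 2.6 exposes a configuration property to inject extra jars into this generated command (rather than editing it by hand) is something we have not verified against the configuration reference.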