[ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933961#comment-15933961 ]
KaiXu commented on HIVE-15912:
------------------------------
Hi [~lirui], I am using Hive 2.2 on Spark 2.0.2, and the issue exists there as well.
2017-03-21 03:02:30,454 Stage-5_0: 241/241 Finished Stage-6_0: 161(+1)/162 Stage-7_0: 0/2018 Stage-8_0: 0/1009 Stage-9_0: 0/1009
Failed to monitor Job[4] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
In Hive's log I also found the TimeoutException, logged at both WARN and ERROR level:
2017-03-21T03:02:31,466 INFO [RPC-Handler-3] rpc.RpcDispatcher: [ClientProtocol] Closing channel due to exception in pipeline (org.apache.hive.spark.client.SparkClientImpl$ClientProtocol.handle(io.netty.channel.ChannelHandlerContext, org.apache.hive.spark.client.rpc.Rpc$MessageHeader)).
2017-03-21T03:02:31,468 WARN [RPC-Handler-3] rpc.RpcDispatcher: [ClientProtocol] Expected RPC header, got org.apache.spark.SparkJobInfoImpl instead.
2017-03-21T03:02:31,468 INFO [RPC-Handler-3] rpc.RpcDispatcher: [ClientProtocol] Closing channel due to exception in pipeline (null).
2017-03-21T03:02:31,469 WARN [RPC-Handler-3] client.SparkClientImpl: Client RPC channel closed unexpectedly.
2017-03-21T03:03:31,457 WARN [Thread-349] impl.RemoteSparkJobStatus: Failed to get job info.
java.util.concurrent.TimeoutException
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:171) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getStageIds(RemoteSparkJobStatus.java:87) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageProgress(RemoteSparkJobStatus.java:94) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:84) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:116) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
2017-03-21T03:03:31,457 ERROR [Thread-349] status.SparkJobMonitor: Failed to monitor Job[4] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:174) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getStageIds(RemoteSparkJobStatus.java:87) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageProgress(RemoteSparkJobStatus.java:94) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:84) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:116) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
Caused by: java.util.concurrent.TimeoutException
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:171) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    ... 8 more
2017-03-21T03:03:31,458 ERROR [Thread-349] SessionState: Failed to monitor Job[4] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:174)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getStageIds(RemoteSparkJobStatus.java:87)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageProgress(RemoteSparkJobStatus.java:94)
    at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:84)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:116)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79)
Caused by: java.util.concurrent.TimeoutException
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:171)
    ... 8 more
2017-03-21T03:03:31,458 ERROR [Thread-349] spark.SparkTask: Failed to get Spark job information
java.lang.IllegalStateException: RPC channel is closed.
    at com.google.common.base.Preconditions.checkState(Preconditions.java:149) ~[guava-14.0.1.jar:?]
    at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:276) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hive.spark.client.SparkClientImpl$ClientProtocol.run(SparkClientImpl.java:580) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hive.spark.client.SparkClientImpl.run(SparkClientImpl.java:151) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:168) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getStageIds(RemoteSparkJobStatus.java:87) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.getSparkJobInfo(SparkTask.java:346) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:118) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
2017-03-21T03:03:33,233 ERROR [89254c41-8375-40ef-a364-e7b546d8ba48 main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
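For reference, the TimeoutException fires exactly 60 seconds after the RPC channel is closed (WARN at 03:02:31, timeout at 03:03:31), which looks like the default 60s Hive-on-Spark client/monitor timeout expiring on a channel that is already gone. As a stopgap I can lengthen those waits before submitting the query; this is only a workaround sketch using the HiveConf property names (the values here are arbitrary) and it does not address why the channel closes in the first place:
set hive.spark.job.monitor.timeout=300s;    -- monitor-side wait for job status (default 60s)
set hive.spark.client.future.timeout=300s;  -- timeout for client-side RPC futures such as getSparkJobInfo (default 60s)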
In the driver's log, I found an InterruptedException:
17/03/21 03:02:31 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint:
Launching task 1181 on executor id: 38 hostname: hsx-node6.
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 244.0 in stage
7.0 (TID 657) in 573 ms on hsx-node4 (343/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 432.0 in stage
7.0 (TID 845) in 285 ms on hsx-node9 (344/2018)
17/03/21 03:02:31 WARN client.RemoteDriver: Shutting down driver because RPC
channel was closed.
17/03/21 03:02:31 INFO client.RemoteDriver: Shutting down remote driver.
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Starting task 769.0 in stage
7.0 (TID 1182, hsx-node6, partition 769, PROCESS_LOCAL, 25685 bytes)
17/03/21 03:02:31 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint:
Launching task 1182 on executor id: 9 hostname: hsx-node6.
17/03/21 03:02:31 INFO scheduler.DAGScheduler: Asked to cancel job 4
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 392.0 in stage
7.0 (TID 805) in 365 ms on hsx-node10 (345/2018)
17/03/21 03:02:31 INFO client.RemoteDriver: Failed to run job 63390316-af66-4f84-8c8b-806112ef2b94
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:202)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
    at org.apache.spark.SimpleFutureAction.ready(FutureAction.scala:125)
    at org.apache.spark.SimpleFutureAction.ready(FutureAction.scala:114)
    at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
    at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.ready(package.scala:169)
    at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:264)
    at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:277)
    at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:362)
    at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:323)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Starting task 770.0 in stage
7.0 (TID 1183, hsx-node6, partition 770, PROCESS_LOCAL, 25685 bytes)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 469.0 in stage
7.0 (TID 882) in 350 ms on hsx-node7 (346/2018)
17/03/21 03:02:31 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint:
Launching task 1183 on executor id: 38 hostname: hsx-node6.
17/03/21 03:02:31 INFO cluster.YarnClusterScheduler: Cancelling stage 7
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: Listener ClientListener threw an exception
java.lang.IllegalStateException: RPC channel is closed.
    at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
    at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:276)
    at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:259)
    at org.apache.hive.spark.client.RemoteDriver$DriverProtocol.sendMetrics(RemoteDriver.java:270)
    at org.apache.hive.spark.client.RemoteDriver$ClientListener.onTaskEnd(RemoteDriver.java:490)
    at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:45)
    at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
    at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
    at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:36)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:94)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1252)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)
17/03/21 03:02:31 INFO server.ServerConnector: Stopped
ServerConnector@10328ecb{HTTP/1.1}{0.0.0.0:0}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@282715e5{/stages/stage/kill,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@6610a390{/api,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@a3e40d8{/,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@12a7530{/static,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@7d6673e4{/executors/threadDump/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@7603582f{/executors/threadDump,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@6b8c0277{/executors/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@293aa151{/executors,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@2db4db50{/environment/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@743d35fc{/environment,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@70265a43{/storage/rdd/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@7207552{/storage/rdd,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@5a10780b{/storage/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@6a2dd645{/storage,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@574406ab{/stages/pool/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@4a32fbe0{/stages/pool,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@70b816b5{/stages/stage/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@1fa44e08{/stages/stage,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@25170f92{/stages/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@5c6ccb81{/stages,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@6af0b071{/jobs/job/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@4d6efce4{/jobs/job,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@3f28c543{/jobs/json,null,UNAVAILABLE}
17/03/21 03:02:31 INFO handler.ContextHandler: Stopped
o.s.j.s.ServletContextHandler@7849dc62{/jobs,null,UNAVAILABLE}
17/03/21 03:02:31 INFO cluster.YarnClusterScheduler: Stage 7 was cancelled
17/03/21 03:02:31 INFO ui.SparkUI: Stopped Spark web UI at
http://192.168.1.4:41594
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 30.0 in stage
7.0 (TID 443) in 733 ms on hsx-node4 (347/2018)
17/03/21 03:02:31 INFO scheduler.DAGScheduler: ShuffleMapStage 7 (union at
SparkPlan.java:70) failed in 0.744 s
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 86.0 in stage
7.0 (TID 499) in 720 ms on hsx-node4 (348/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 333.0 in stage
7.0 (TID 746) in 664 ms on hsx-node5 (349/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 425.0 in stage
7.0 (TID 838) in 400 ms on hsx-node2 (350/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 324.0 in stage
7.0 (TID 737) in 667 ms on hsx-node4 (351/2018)
17/03/21 03:02:31 WARN spark.ExecutorAllocationManager: No stages are running,
but numRunningTasks != 0
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@3241bef1,org.apache.spark.executor.TaskMetrics@21953ed8)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 440.0 in stage
7.0 (TID 853) in 387 ms on hsx-node9 (352/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 119.0 in stage
7.0 (TID 532) in 715 ms on hsx-node10 (353/2018)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskStart(7,-1,org.apache.spark.scheduler.TaskInfo@3e8a68b)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@4959044b,org.apache.spark.executor.TaskMetrics@76d08da7)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 408.0 in stage
7.0 (TID 821) in 420 ms on hsx-node2 (354/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 482.0 in stage
7.0 (TID 895) in 358 ms on hsx-node8 (355/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 463.0 in stage
7.0 (TID 876) in 374 ms on hsx-node8 (356/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 68.0 in stage
7.0 (TID 481) in 731 ms on hsx-node4 (357/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 447.0 in stage
7.0 (TID 860) in 382 ms on hsx-node7 (358/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 446.0 in stage
7.0 (TID 859) in 384 ms on hsx-node8 (359/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 450.0 in stage
7.0 (TID 863) in 382 ms on hsx-node7 (360/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 137.0 in stage
7.0 (TID 550) in 716 ms on hsx-node5 (361/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 421.0 in stage
7.0 (TID 834) in 408 ms on hsx-node9 (362/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 166.0 in stage
7.0 (TID 579) in 710 ms on hsx-node4 (363/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 430.0 in stage
7.0 (TID 843) in 405 ms on hsx-node6 (364/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 117.0 in stage
7.0 (TID 530) in 723 ms on hsx-node4 (365/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 473.0 in stage
7.0 (TID 886) in 372 ms on hsx-node7 (366/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 305.0 in stage
7.0 (TID 718) in 679 ms on hsx-node10 (367/2018)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 445.0 in stage
7.0 (TID 858) in 388 ms on hsx-node9 (368/2018)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@395b13ea,org.apache.spark.executor.TaskMetrics@4b141cd3)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@6dfb0a2c,org.apache.spark.executor.TaskMetrics@5cf4c211)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@3dd14e26,org.apache.spark.executor.TaskMetrics@56482fcb)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 479.0 in stage
7.0 (TID 892) in 366 ms on hsx-node8 (369/2018)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@63d7f544,org.apache.spark.executor.TaskMetrics@19c94765)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 426.0 in stage
7.0 (TID 839) in 409 ms on hsx-node2 (370/2018)
17/03/21 03:02:31 ERROR scheduler.LiveListenerBus: SparkListenerBus has already
stopped! Dropping event
SparkListenerTaskEnd(7,0,ShuffleMapTask,Success,org.apache.spark.scheduler.TaskInfo@2e5e9a04,org.apache.spark.executor.TaskMetrics@39ffde73)
17/03/21 03:02:31 INFO scheduler.TaskSetManager: Finished task 379.0 in stage
7.0 (TID 792) in 665 ms on hsx-node8 (371/2018)
> Executor kill task and Failed to get spark memory/core info
> -----------------------------------------------------------
>
> Key: HIVE-15912
> URL: https://issues.apache.org/jira/browse/HIVE-15912
> Project: Hive
> Issue Type: Bug
> Components: Hive, Spark
> Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark2.0.2
> Hive2.2
> Reporter: KaiXu
>
> Hive on Spark failed with the following error:
> Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285
> Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> Hive's log:
> 2017-02-14T19:03:09,147 INFO [stderr-redir-1] client.SparkClientImpl: 17/02/14 19:03:09 INFO yarn.Client: Application report for application_1486905599813_0403 (state: ACCEPTED)
> 2017-02-14T19:03:10,817 WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism: Failed to get spark memory/core info
> java.util.concurrent.TimeoutException
>     at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>     at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:291) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:120) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11085) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:279) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:510) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1302) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1442) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1222) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:430) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:446) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:749) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:715) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:642) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_60]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_60]
>     at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:221) ~[hadoop-common-2.7.1.jar:?]
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:136) ~[hadoop-common-2.7.1.jar:?]
> 2017-02-14T19:03:10,817 INFO [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism: Set parallelism for reduce sink RS[24] to: 1 (calculated)
> yarn log:
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 329.0 in stage 15.0
> (TID 3865). 3228 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 201.0 in stage 15.0
> (TID 3737). 3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 9.0 in stage 15.0
> (TID 3545). 4027 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 457.0 in stage 15.0
> (TID 3993). 3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4221
> 17/02/14 19:05:36 INFO executor.Executor: Running task 687.0 in stage 15.0
> (TID 4221)
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 265.0 in stage 15.0
> (TID 3801). 3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4240
> 17/02/14 19:05:36 INFO executor.Executor: Running task 692.0 in stage 15.0
> (TID 4240)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4242
> 17/02/14 19:05:36 INFO executor.Executor: Running task 763.0 in stage 15.0
> (TID 4242)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4243
> 17/02/14 19:05:36 INFO executor.Executor: Running task 836.0 in stage 15.0
> (TID 4243)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4244
> 17/02/14 19:05:36 INFO executor.Executor: Running task 863.0 in stage 15.0
> (TID 4244)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4248
> 17/02/14 19:05:36 INFO executor.Executor: Running task 979.0 in stage 15.0
> (TID 4248)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 4280
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_687 locally
> 17/02/14 19:05:36 INFO executor.Executor: Running task 990.0 in stage 15.0
> (TID 4280)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 687.0 in stage 15.0 (TID 4221)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 73.0 in stage 15.0 (TID 3609)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 763.0 in stage 15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 979.0 in stage 15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 990.0 in stage 15.0 (TID 4280)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 863.0 in stage 15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 836.0 in stage 15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task
> 692.0 in stage 15.0 (TID 4240)
> 17/02/14 19:05:36 INFO storage.BlockManager: Removing RDD 24
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 73.0 in stage 15.0
> (TID 3609). 3214 bytes result sent to driver
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_863 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_692 locally
> 17/02/14 19:05:36 ERROR executor.Executor: Exception in task 990.0 in stage 15.0 (TID 4280)
> org.apache.spark.TaskKilledException
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:264)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 687.0 in stage
> 15.0 (TID 4221)
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_836 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_979 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_763 locally
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 863.0 in stage
> 15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 979.0 in stage
> 15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 836.0 in stage
> 15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 692.0 in stage
> 15.0 (TID 4240)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 763.0 in stage
> 15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver
> commanded a shutdown
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver from
> 192.168.1.1:35981 disconnected during shutdown
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver from
> 192.168.1.1:35981 disconnected during shutdown
> 17/02/14 19:05:36 INFO memory.MemoryStore: MemoryStore cleared
> 17/02/14 19:05:36 INFO storage.BlockManager: BlockManager stopped
> 17/02/14 19:05:36 INFO util.ShutdownHookManager: Shutdown hook called