[
https://issues.apache.org/jira/browse/SPARK-32803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190962#comment-17190962
]
Apache Spark commented on SPARK-32803:
--------------------------------------
User 'ulysses-you' has created a pull request for this issue:
https://github.com/apache/spark/pull/29652
> Catch InterruptedException when resolve rack in SparkRackResolver
> -----------------------------------------------------------------
>
> Key: SPARK-32803
> URL: https://issues.apache.org/jira/browse/SPARK-32803
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.0
> Reporter: ulysses you
> Priority: Major
>
> In Yarn mode, an INSERT INTO a Hive table can leave dirty data behind when
> the Spark application is killed. The error message is:
> ```
> java.io.IOException: java.lang.InterruptedException
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:607)
>     at org.apache.hadoop.util.Shell.run(Shell.java:507)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
>     at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
>     at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
>     at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
>     at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
>     at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:81)
>     at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:37)
>     at org.apache.spark.scheduler.TaskSetManager$$anonfun$addPendingTask$1.apply(TaskSetManager.scala:235)
>     at org.apache.spark.scheduler.TaskSetManager$$anonfun$addPendingTask$1.apply(TaskSetManager.scala:216)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:216)
>     at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:188)
>     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>     at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:187)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:250)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:208)
>     at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1215)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:1071)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:1074)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:1073)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:1073)
>     at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1014)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2069)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2061)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2050)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
> ```
> The reason is:
> 1. `CachedDNSToSwitchMapping` may execute a shell command to resolve a
> hostname; when the Spark application is killed, that command is interrupted
> and fails with an `InterruptedException`.
> 2. `DAGSchedulerEventProcessLoop` ignores the `InterruptedException`, so
> the `FileFormatWriter` action never learns that the job failed.
> 3. `FileFormatWriter` therefore does not abort the commit and leaves dirty
> data behind.
>
> So we should catch the `InterruptedException` and rethrow it as a
> `SparkException`.
>
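The proposed catch-and-rethrow pattern can be sketched as follows. This is a hedged illustration, not the actual `SparkRackResolver` code: the class and method names (`RackResolverSketch`, `shellResolve`, `resolveRack`) are hypothetical stand-ins, and `RuntimeException` stands in for Spark's `SparkException`.

```java
// Hypothetical sketch of the proposed fix; names below are illustrative,
// not the real Spark/Hadoop API.
public class RackResolverSketch {

    // Stand-in for Hadoop's script-based rack lookup, which runs a shell
    // command and can be interrupted while the application is being killed.
    static String shellResolve(String host) throws InterruptedException {
        if (Thread.currentThread().isInterrupted()) {
            throw new InterruptedException("interrupted while resolving " + host);
        }
        return "/default-rack";
    }

    // Proposed behavior: instead of letting the InterruptedException be
    // swallowed upstream, wrap it in an unchecked exception (Spark would use
    // SparkException) so callers such as FileFormatWriter see a hard failure
    // and abort the commit rather than leaving dirty data.
    public static String resolveRack(String host) {
        try {
            return shellResolve(host);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new RuntimeException("Failed to resolve rack for " + host, e);
        }
    }
}
```

Restoring the interrupt flag before rethrowing keeps the thread's interrupted status visible to any later code that checks it, which is the conventional way to convert a checked `InterruptedException` into an unchecked failure.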
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]