[
https://issues.apache.org/jira/browse/SPARK-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645331#comment-14645331
]
Alex Liu commented on SPARK-4300:
---------------------------------
We find that this issue still exists in 1.3.1:
{code}
ERROR [Thread-6] 2015-07-28 22:49:57,653 SparkWorker-0 ExternalLogger.java:96 -
Error writing stream to file
/var/lib/spark/worker/worker-0/app-20150728224954-0003/0/stderr
ERROR [Thread-6] 2015-07-28 22:49:57,653 SparkWorker-0 ExternalLogger.java:96 -
java.io.IOException: Stream closed
ERROR [Thread-6] 2015-07-28 22:49:57,654 SparkWorker-0 ExternalLogger.java:96 -
at
java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170)
~[na:1.8.0_40]
ERROR [Thread-6] 2015-07-28 22:49:57,654 SparkWorker-0 ExternalLogger.java:96 -
at java.io.BufferedInputStream.read1(BufferedInputStream.java:283)
~[na:1.8.0_40]
ERROR [Thread-6] 2015-07-28 22:49:57,654 SparkWorker-0 ExternalLogger.java:96 -
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
~[na:1.8.0_40]
ERROR [Thread-6] 2015-07-28 22:49:57,654 SparkWorker-0 ExternalLogger.java:96 -
at java.io.FilterInputStream.read(FilterInputStream.java:107)
~[na:1.8.0_40]
ERROR [Thread-6] 2015-07-28 22:49:57,655 SparkWorker-0 ExternalLogger.java:96 -
at
org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
~[spark-core_2.10-1.3.1.1.jar:1.3.1.1]
ERROR [Thread-6] 2015-07-28 22:49:57,655 SparkWorker-0 ExternalLogger.java:96 -
at
org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
[spark-core_2.10-1.3.1.1.jar:1.3.1.1]
ERROR [Thread-6] 2015-07-28 22:49:57,655 SparkWorker-0 ExternalLogger.java:96 -
at
org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
[spark-core_2.10-1.3.1.1.jar:1.3.1.1]
ERROR [Thread-6] 2015-07-28 22:49:57,655 SparkWorker-0 ExternalLogger.java:96 -
at
org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
[spark-core_2.10-1.3.1.1.jar:1.3.1.1]
ERROR [Thread-6] 2015-07-28 22:49:57,655 SparkWorker-0 ExternalLogger.java:96 -
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
[spark-core_2.10-1.3.1.1.jar:1.3.1.1]
ERROR [Thread-6] 2015-07-28 22:49:57,656 SparkWorker-0 ExternalLogger.java:96 -
at
org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
[spark-core_2.10-1.3.1.1.jar:1.3.1.1]
{code}
The relevant code is at
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/worker/ExecutorRunner.scala#L159
The process shuts down automatically, but the log appenders are still running, which produces these error log messages.
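The failure mode can be reproduced in isolation: if the stream feeding the appender is closed (as `process.destroy()` effectively does) while the appender thread is still reading, `BufferedInputStream` throws `IOException: Stream closed` from `getBufIfOpen`, exactly as in the trace above. A minimal Java sketch (the class name and harness are illustrative, not Spark code):

```java
import java.io.*;

public class StreamClosedRace {
    public static void main(String[] args) throws Exception {
        // Stand-in for the executor process's stdout pipe.
        BufferedInputStream in = new BufferedInputStream(
            new ByteArrayInputStream(new byte[0]));

        // The stream is closed during shutdown...
        in.close();

        // ...while the appender thread is still trying to read from it,
        // as FileAppender.appendStreamToFile does.
        final String[] error = new String[1];
        Thread appender = new Thread(() -> {
            try {
                in.read(new byte[1024]);
            } catch (IOException e) {
                error[0] = e.getMessage();
            }
        });
        appender.start();
        appender.join();
        System.out.println(error[0]);  // prints "Stream closed"
    }
}
```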
> Race condition during SparkWorker shutdown
> ------------------------------------------
>
> Key: SPARK-4300
> URL: https://issues.apache.org/jira/browse/SPARK-4300
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 1.1.0
> Reporter: Alex Liu
> Assignee: Sean Owen
> Priority: Minor
> Fix For: 1.2.2, 1.3.1, 1.4.0
>
>
> When a Shark job is done, error messages like the following appear in
> the log:
> {code}
> INFO 22:10:41,635 SparkMaster:
> akka.tcp://[email protected]:57641 got
> disassociated, removing it.
> INFO 22:10:41,640 SparkMaster: Removing app app-20141106221014-0000
> INFO 22:10:41,687 SparkMaster: Removing application
> Shark::ip-172-31-11-204.us-west-1.compute.internal
> INFO 22:10:41,710 SparkWorker: Asked to kill executor
> app-20141106221014-0000/0
> INFO 22:10:41,712 SparkWorker: Runner thread for executor
> app-20141106221014-0000/0 interrupted
> INFO 22:10:41,714 SparkWorker: Killing process!
> ERROR 22:10:41,738 SparkWorker: Error writing stream to file
> /var/lib/spark/work/app-20141106221014-0000/0/stdout
> ERROR 22:10:41,739 SparkWorker: java.io.IOException: Stream closed
> ERROR 22:10:41,739 SparkWorker: at
> java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
> ERROR 22:10:41,740 SparkWorker: at
> java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
> ERROR 22:10:41,740 SparkWorker: at
> java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> ERROR 22:10:41,740 SparkWorker: at
> java.io.FilterInputStream.read(FilterInputStream.java:107)
> ERROR 22:10:41,741 SparkWorker: at
> org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
> ERROR 22:10:41,741 SparkWorker: at
> org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
> ERROR 22:10:41,741 SparkWorker: at
> org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
> ERROR 22:10:41,742 SparkWorker: at
> org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
> ERROR 22:10:41,742 SparkWorker: at
> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
> ERROR 22:10:41,742 SparkWorker: at
> org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
> INFO 22:10:41,838 SparkMaster: Connected to Cassandra cluster: 4299
> INFO 22:10:41,839 SparkMaster: Adding host 172.31.11.204 (Analytics)
> INFO 22:10:41,840 SparkMaster: New Cassandra host /172.31.11.204:9042 added
> INFO 22:10:41,841 SparkMaster: Adding host 172.31.11.204 (Analytics)
> INFO 22:10:41,842 SparkMaster: Adding host 172.31.11.204 (Analytics)
> INFO 22:10:41,852 SparkMaster:
> akka.tcp://[email protected]:57641 got
> disassociated, removing it.
> INFO 22:10:41,853 SparkMaster:
> akka.tcp://[email protected]:57641 got
> disassociated, removing it.
> INFO 22:10:41,853 SparkMaster:
> akka.tcp://[email protected]:57641 got
> disassociated, removing it.
> INFO 22:10:41,857 SparkMaster:
> akka.tcp://[email protected]:57641 got
> disassociated, removing it.
> INFO 22:10:41,862 SparkMaster: Adding host 172.31.11.204 (Analytics)
> WARN 22:10:42,200 SparkMaster: Got status update for unknown executor
> app-20141106221014-0000/0
> INFO 22:10:42,211 SparkWorker: Executor app-20141106221014-0000/0 finished
> with state KILLED exitStatus 143
> {code}
> /var/lib/spark/work/app-20141106221014-0000/0/stdout is on the disk. The
> appender is trying to write to a closed IO stream.
> The Spark worker shuts down via:
> {code}
> private def killProcess(message: Option[String]) {
>   var exitCode: Option[Int] = None
>   logInfo("Killing process!")
>   process.destroy()
>   process.waitFor()
>   if (stdoutAppender != null) {
>     stdoutAppender.stop()
>   }
>   if (stderrAppender != null) {
>     stderrAppender.stop()
>   }
>   if (process != null) {
>     exitCode = Some(process.waitFor())
>   }
>   worker ! ExecutorStateChanged(appId, execId, state, message, exitCode)
> }
> {code}
> But stdoutAppender concurrently writes to the output log file, which
> creates a race condition.
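One way to silence the spurious error during shutdown is for the appender to remember that stop() was called and treat a subsequent IOException as expected rather than logging it. A minimal Java sketch of that idea (QuietAppender and its harness are hypothetical names, not Spark's actual FileAppender API):

```java
import java.io.*;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical appender that stays quiet once stop() has been called,
// so a "Stream closed" thrown during shutdown is not logged as an error.
public class QuietAppender {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final InputStream in;
    private final OutputStream out;

    QuietAppender(InputStream in, OutputStream out) { this.in = in; this.out = out; }

    // Called before (or while) the process is destroyed.
    void stop() { stopped.set(true); }

    // Copy the stream to the log; only report IOExceptions that happen
    // while the appender is still supposed to be running.
    void run() {
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            if (!stopped.get()) {
                System.err.println("Error writing stream to file: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate shutdown: the process stream is already closed when the
        // appender reads, but stop() was called first, so nothing is logged.
        BufferedInputStream closedIn =
            new BufferedInputStream(new ByteArrayInputStream(new byte[0]));
        closedIn.close();
        QuietAppender appender = new QuietAppender(closedIn, new ByteArrayOutputStream());
        appender.stop();
        appender.run();
        System.out.println("no spurious error logged");
    }
}
```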
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)