[
https://issues.apache.org/jira/browse/SPARK-10554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740904#comment-14740904
]
Sean Owen commented on SPARK-10554:
-----------------------------------
Well, the situation here is that the block manager fails to initialize and is
stopped very early, so many bets are off. Since the dir may be shared, I think
the conservative thing is to not delete files.
Yes, the condition is needed, since the driver will always clean up its local
files; it does not participate in things like the external shuffle service.
But if for some reason we don't know we're the driver, I think it's best not to
proceed. Anyway, that's the current behavior!
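For illustration, here is a minimal sketch of the conservative check described above. The object and method names, and the simplified `BlockManagerId`, are hypothetical stand-ins and not Spark's actual code:

```scala
// Hypothetical sketch of the conservative shutdown check being discussed.
object DiskCleanupGuard {
  // Stand-in for org.apache.spark.storage.BlockManagerId; only the one
  // field relevant to this discussion is modeled.
  final case class BlockManagerId(isDriver: Boolean)

  // Delete local dirs only when the block manager actually registered
  // (blockManagerId is non-null) and this process is known to be the
  // driver. If initialization failed early, blockManagerId may be null;
  // stay conservative and delete nothing, since the local dir may be
  // shared, e.g. with the external shuffle service.
  def shouldDeleteLocalDirs(blockManagerId: BlockManagerId): Boolean =
    blockManagerId != null && blockManagerId.isDriver
}
```

The null check is what the NPE below shows to be missing when the block manager is stopped before it ever registered.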
> Potential NPE with ShutdownHook
> -------------------------------
>
> Key: SPARK-10554
> URL: https://issues.apache.org/jira/browse/SPARK-10554
> Project: Spark
> Issue Type: Bug
> Components: Block Manager
> Affects Versions: 1.5.0
> Reporter: Nithin Asokan
> Priority: Minor
>
> Originally posted in user mailing list
> [here|http://apache-spark-user-list.1001560.n3.nabble.com/Potential-NPE-while-exiting-spark-shell-tt24523.html]
> I'm currently using Spark 1.3.0 on a YARN cluster deployed through CDH 5.4. My
> cluster does not have a 'default' queue, so launching 'spark-shell' submits
> a YARN application that is killed immediately because the queue does not
> exist. However, the spark-shell session remains active after throwing a
> bunch of errors while creating the SQL context. Upon issuing an 'exit'
> command, an NPE is thrown from DiskBlockManager with the following
> stack trace:
> {code}
> ERROR Utils: Uncaught exception in thread delete Spark local dirs
> java.lang.NullPointerException
>         at org.apache.spark.storage.DiskBlockManager.org$apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
> Exception in thread "delete Spark local dirs" java.lang.NullPointerException
>         at org.apache.spark.storage.DiskBlockManager.org$apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
> {code}
> I believe the problem surfaces from a shutdown hook that tries to clean up
> local directories. In this specific case, because the YARN application was
> not submitted successfully, the block manager was never registered; as a
> result it does not have a valid blockManagerId, as seen here:
> https://github.com/apache/spark/blob/v1.3.0/core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala#L161
> Has anyone faced this issue before? Could this be a problem with the way the
> shutdown hook currently behaves?
> Note: I referenced source from the Apache Spark repo rather than Cloudera's.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]