[
https://issues.apache.org/jira/browse/SPARK-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097302#comment-16097302
]
Ryan Williams commented on SPARK-16599:
---------------------------------------
Update: I think the real story here is the first stack trace above: {{Failed to
get broadcast_0_piece0 of broadcast_0}}; in my real application, I see that
error causing jobs to fail without the {{None.get}} occurring.
Curiously, setting {{log4j.logger.org.apache.spark=DEBUG}} makes the error go
away entirely in my real application; the same setting in my repro makes the
{{None.get}} go away, but the job still fails with
{{Failed to get broadcast_0_piece0 of broadcast_0}}.
[Here is full output of such a repro run with debug logging
enabled|https://gist.github.com/ryan-williams/564069ba1bd2c052c64be68f7d86c0c5].
The error seems sensitive to the order of creation of:
1. {{Broadcast}} from first {{SparkContext}}
2. Second {{SparkContext}}
The bug occurs when the {{Broadcast}} comes first:
{code}
// Make a Broadcast
val bs = sc.broadcast(Set(1, 2, 3))
// "Accidentally" create second SparkContext
println(Foo.foo)
{code}
but not when the order is reversed:
{code}
// "Accidentally" create second SparkContext
println(Foo.foo)
// Make a Broadcast
val bs = sc.broadcast(Set(1, 2, 3))
{code}
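For anyone who wants to try the failing order end to end, here is a minimal, self-contained sketch. It rests on several assumptions: the {{Foo}} object below is a hypothetical stand-in (the real one is not shown in this comment), the app names and {{local[2]}} master are illustrative, and {{spark.driver.allowMultipleContexts}} is set because Spark 2.x otherwise refuses to construct a second {{SparkContext}} in the same JVM. Whether it actually triggers the failure may depend on the Spark version and deploy mode.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for the Foo referenced in the snippets above;
// the real definition is not part of this comment.
object Foo {
  // Assumption: touching this lazy val is what "accidentally" builds
  // a second SparkContext in the same JVM.
  lazy val sc2: SparkContext = new SparkContext(
    new SparkConf()
      .setMaster("local[2]")
      .setAppName("second-context")
      // Assumption: needed on Spark 2.x to allow a second context at all.
      .set("spark.driver.allowMultipleContexts", "true")
  )

  lazy val foo: String = sc2.appName
}

object BroadcastOrderRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf()
        .setMaster("local[2]")
        .setAppName("first-context")
        .set("spark.driver.allowMultipleContexts", "true")
    )

    // Failing order: make the Broadcast first...
    val bs = sc.broadcast(Set(1, 2, 3))
    // ...then "accidentally" create the second SparkContext.
    println(Foo.foo)

    // Run a job on the first context that reads the broadcast on the
    // executor side; this is where a "Failed to get broadcast_0_piece0"
    // style error would be expected to surface if the bug is hit.
    val matches = sc.parallelize(1 to 10).filter(x => bs.value.contains(x)).count()
    println(matches)

    Foo.sc2.stop()
    sc.stop()
  }
}
```

Swapping the {{sc.broadcast(...)}} and {{println(Foo.foo)}} lines gives the second ordering, which did not fail for me.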
This also raises the question of whether everyone in this thread is seeing the
same issue/exception, since my {{None.get}} seems to be caused by a separate
block-management problem that arises when multiple {{SparkContext}}s coexist.
> java.util.NoSuchElementException: None.get at at
> org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-16599
> URL: https://issues.apache.org/jira/browse/SPARK-16599
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0
> Environment: centos 6.7 spark 2.0
> Reporter: binde
>
> run a spark job with spark 2.0, error message
> Job aborted due to stage failure: Task 0 in stage 821.0 failed 4 times, most
> recent failure: Lost task 0.3 in stage 821.0 (TID 1480, e103):
> java.util.NoSuchElementException: None.get
> at scala.None$.get(Option.scala:347)
> at scala.None$.get(Option.scala:345)
> at
> org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
> at
> org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:644)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)