[ 
https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610178#comment-17610178
 ] 

Garret Wilson edited comment on SPARK-40582 at 9/27/22 5:49 PM:
----------------------------------------------------------------

Will Spark be updated to use the newer version of Scala when it's released 
rather than just providing a workaround?

For example, it looks like [this has already been fixed in 
Scala|https://github.com/scala/bug/issues/12613] since June 2022 in v2.13.8. 
Can I just upgrade the Scala dependencies somehow, or does Spark saddle me 
with an outdated version of Scala until Spark itself decides to upgrade?

(I wish Spark weren't based on Scala for this very sort of issue. I wish we 
weren't forced to juggle the versions of both the Spark library and another 
language, and be hit with another layer of bugs because of this.)


was (Author: garretwilson):
Will Spark be updated to use the newer version of Scala when it's released 
rather than just providing a workaround?

(I wish Spark wasn't based on Scala for this very sort of issue. I wish we 
weren't forced to juggle the versions of both the Spark library and another 
language, and be hit with another layer of bugs because of this.)

> NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-40582
>                 URL: https://issues.apache.org/jira/browse/SPARK-40582
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Garret Wilson
>            Priority: Critical
>
> I'm running a simple little Spark 3.3.0 pipeline on Windows 10 using Java 17 
> and UDFs. I hardly do anything interesting, and now when I run the pipeline 
> on only 30,000 records I'm getting this:
> {noformat}
> [ERROR] Error in removing shuffle 2
> java.lang.NullPointerException: Cannot invoke "org.apache.spark.ShuffleStatus.invalidateSerializedMapOutputStatusCache()" because "shuffleStatus" is null
>         at org.apache.spark.MapOutputTrackerMaster.$anonfun$unregisterShuffle$1(MapOutputTracker.scala:882)
>         at org.apache.spark.MapOutputTrackerMaster.$anonfun$unregisterShuffle$1$adapted(MapOutputTracker.scala:881)
>         at scala.Option.foreach(Option.scala:437)
>         at org.apache.spark.MapOutputTrackerMaster.unregisterShuffle(MapOutputTracker.scala:881)
>         at org.apache.spark.storage.BlockManagerStorageEndpoint$$anonfun$receiveAndReply$1.$anonfun$applyOrElse$3(BlockManagerStorageEndpoint.scala:59)
>         at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.scala:17)
>         at org.apache.spark.storage.BlockManagerStorageEndpoint.$anonfun$doAsync$1(BlockManagerStorageEndpoint.scala:89)
>         at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:678)
>         at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:467)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>         at java.base/java.lang.Thread.run(Thread.java:833)
> {noformat}
> I searched for the principal terms in the error message and found nothing.
> It is disconcerting that Spark is breaking at what seems to be a fundamental 
> part of processing, and with a {{NullPointerException}} at that.
> I have already asked this question on [Stack 
> Overflow|https://stackoverflow.com/q/73732970], and even posted a bounty, 
> with no solutions. (The only answer so far is from someone who doesn't even 
> use Spark and just posted links.)
> _Update:_ Now it just happened with only 1000 records. But then I reran the 
> pipeline immediately with no changes, and it succeeded. So this 
> {{NullPointerException}} bug is nondeterministic. Not good at all.
> _Update:_ Now it just happened with only 10 records. But then as before I 
> reran the pipeline immediately with no changes, and it succeeded.


