[ https://issues.apache.org/jira/browse/SPARK-45616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-45616.
---------------------------------
    Fix Version/s: 3.5.1
                   4.0.0
       Resolution: Fixed

Issue resolved by pull request 43466
[https://github.com/apache/spark/pull/43466]

> Usages of ParVector are unsafe because it does not propagate ThreadLocals or SparkSession
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-45616
>                 URL: https://issues.apache.org/jira/browse/SPARK-45616
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL, Tests
>    Affects Versions: 3.5.0
>            Reporter: Ankur Dave
>            Assignee: Ankur Dave
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.5.1, 4.0.0
>
>
> CastSuiteBase and ExpressionInfoSuite use ParVector.foreach() to run Spark
> SQL queries in parallel. They incorrectly assume that each parallel operation
> will inherit the main thread’s active SparkSession. This is only true when
> these parallel operations run in freshly-created threads. However, when other
> code has already run some parallel operations before Spark was started, then
> there may be existing threads that do not have an active SparkSession. In
> that case, these tests fail with NullPointerExceptions when creating
> SparkPlans or running SQL queries.
>
> The fix is to use the existing method ThreadUtils.parmap(). This method
> creates fresh threads that inherit the current active SparkSession, and it
> propagates the Spark ThreadLocals.
>
> We should also add a scalastyle warning against use of ParVector.
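
The failure mode described in the issue can be reproduced without Spark. The following is a minimal sketch in plain Java, not code from the pull request. It assumes the active session behaves like an InheritableThreadLocal, whose value is copied to a child thread only at the moment that thread is created. The class name InheritedContextDemo, the ACTIVE variable, and the "session-1" value are hypothetical stand-ins, and the single-thread executors stand in for a pool that was warmed up before Spark started versus freshly created threads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class InheritedContextDemo {
    // Hypothetical stand-in for the "active session" slot. A child thread
    // copies the value once, when the child thread is constructed.
    static final InheritableThreadLocal<String> ACTIVE = new InheritableThreadLocal<>();

    /** Returns {value seen by pre-existing thread, by fresh thread, after explicit propagation}. */
    static String[] run() throws Exception {
        ACTIVE.remove(); // start from a clean state on the calling thread
        ExecutorService warm = Executors.newFixedThreadPool(1);
        ExecutorService fresh = Executors.newFixedThreadPool(1);
        try {
            // Force the warm pool to create its worker thread BEFORE the value is set.
            warm.submit(() -> { }).get();

            ACTIVE.set("session-1");

            Callable<String> read = ACTIVE::get;

            // The existing worker was created earlier, so it inherited nothing.
            String fromWarm = warm.submit(read).get();

            // This pool creates its worker now, so the worker inherits the value.
            String fromFresh = fresh.submit(read).get();

            // Explicit propagation: capture on the caller, install inside the task,
            // and restore the worker's previous state afterwards.
            final String captured = ACTIVE.get();
            Callable<String> propagating = () -> {
                String previous = ACTIVE.get();
                ACTIVE.set(captured);
                try {
                    return ACTIVE.get();
                } finally {
                    if (previous == null) {
                        ACTIVE.remove();
                    } else {
                        ACTIVE.set(previous);
                    }
                }
            };
            String propagated = warm.submit(propagating).get();

            return new String[] { fromWarm, fromFresh, propagated };
        } finally {
            ACTIVE.remove();
            warm.shutdown();
            fresh.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] r = run();
        System.out.println("pre-existing thread sees: " + r[0]);
        System.out.println("fresh thread sees:        " + r[1]);
        System.out.println("explicit propagation:     " + r[2]);
    }
}
```

In this sketch the pre-existing worker reads null, which is the analogue of the missing active SparkSession that leads to the NullPointerExceptions in the tests. The two remedies shown (running on freshly created threads, and capturing the value on the caller and installing it inside the task) correspond loosely to what the issue says ThreadUtils.parmap() does; how Spark implements it internally is not shown here.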