sunchao commented on code in PR #45052:
URL: https://github.com/apache/spark/pull/45052#discussion_r1501741974
##########
core/src/main/scala/org/apache/spark/storage/BlockManager.scala:
##########
@@ -177,15 +177,17 @@ private[spark] class HostLocalDirManager(
* Manager running on every node (driver and executors) which provides interfaces for putting and
* retrieving blocks both locally and remotely into various stores (memory, disk, and off-heap).
*
- * Note that [[initialize()]] must be called before the BlockManager is usable.
+ * Note that [[initialize()]] must be called before the BlockManager is usable. Also, the
+ * `memoryManager` is initialized at a later stage after DriverPlugin is loaded, to allow the
+ * plugin to overwrite memory configurations.
*/
private[spark] class BlockManager(
val executorId: String,
rpcEnv: RpcEnv,
val master: BlockManagerMaster,
val serializerManager: SerializerManager,
val conf: SparkConf,
- memoryManager: MemoryManager,
+ var memoryManager: MemoryManager,
Review Comment:
> The stack trace of the NPE that we saw earlier was part of spark context initialization ... not an access from task, right ?

Thanks @mridulm for checking! I think that stack trace doesn't reveal the root cause of the issue. I added a bunch of debugging messages to the code and found the task that was triggering the issue:
```
setting active env to org.apache.spark.SparkEnv@5ab3ee8b in pool-1-thread-1-ScalaTest-running-JobCancellationSuite
active env = org.apache.spark.SparkEnv@5ab3ee8b, thread = Executor task launch worker for task 0.0 in stage 0.0 (TID 0)
java.base/java.lang.Thread.getStackTrace(Thread.java:1619)
org.apache.spark.storage.BlockManager.memoryManager$lzycompute(BlockManager.scala:210)
org.apache.spark.storage.BlockManager.memoryManager(BlockManager.scala:204)
org.apache.spark.storage.BlockManager.memoryStore$lzycompute(BlockManager.scala:248)
org.apache.spark.storage.BlockManager.memoryStore(BlockManager.scala:247)
org.apache.spark.scheduler.Task.$anonfun$run$3(Task.scala:146)
org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
org.apache.spark.scheduler.Task.run(Task.scala:144)
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:633)
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:96)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:636)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:840)
memory manager of org.apache.spark.SparkEnv@5ab3ee8b is null, _memoryManager = null, thread = Executor task launch worker for task 0.0 in stage 0.0 (TID 0)
set memory manager for org.apache.spark.SparkEnv@5ab3ee8b, threadName = pool-1-thread-1-ScalaTest-running-JobCancellationSuite
java.base/java.lang.Thread.getStackTrace(Thread.java:1619)
org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
org.apache.spark.SparkContext.<init>(SparkContext.scala:141)
org.apache.spark.JobCancellationSuite.$anonfun$new$45(JobCancellationSuite.scala:430)
org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:155)
org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
org.scalatest.Transformer.apply(Transformer.scala:22)
org.scalatest.Transformer.apply(Transformer.scala:20)
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
```
The "setting active env" and "set memory manager" messages are logged in
`SparkContext` initialization, while the "active env =" and "memory manager of
" are logged in `BlockManager` when trying to access the `memoryManager`. The
first stack trace shows it is from the separate worker thread.
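
For readers following along, here is a minimal, self-contained sketch of that race, not Spark's actual code: the `Env` and `Manager` classes below are hypothetical stand-ins. A task thread forces a `lazy val` that reads a field the driver thread only assigns later, mirroring the `memoryManager$lzycompute` frame in the trace above.

```scala
// Hypothetical sketch of the initialization race; names do not match Spark's classes.
object LazyInitRace {
  class Env {
    // Assigned by the "driver" thread at a later stage, standing in for
    // SparkEnv's memory manager being set after DriverPlugin is loaded.
    @volatile var memoryManager: String = _
  }

  class Manager(env: Env) {
    // Forced on first access; if a task thread gets here before the driver
    // thread assigns env.memoryManager, it observes null and fails.
    lazy val memoryManager: String = {
      val mm = env.memoryManager
      require(mm != null, "memoryManager accessed before it was initialized")
      mm
    }
  }

  def main(args: Array[String]): Unit = {
    val env = new Env
    val manager = new Manager(env)

    // "Executor task" thread: touches the manager right away.
    val task = new Thread(() => {
      try println(s"task sees: ${manager.memoryManager}")
      catch {
        case e: IllegalArgumentException => println(s"task failed: ${e.getMessage}")
      }
    })
    task.start()

    // "Driver" thread: initializes the field only after a delay.
    Thread.sleep(100)
    env.memoryManager = "UnifiedMemoryManager"
    task.join()
  }
}
```

Running this typically prints the failure message because the task thread wins the race, which is the same shape as the "memory manager of ... is null" line in the log above.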
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]