cxzl25 commented on PR #1946:
URL: https://github.com/apache/auron/pull/1946#issuecomment-3788513889
> Could you share the root cause and troubleshooting clues for this q14b issue?
Judging from the call stack, the `accumUpdates` of the `TaskResult` contains null
values.
```java
ERROR TaskResultGetter: Exception while getting task result
java.lang.NullPointerException
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3.$anonfun$run$3(TaskResultGetter.scala:109)
```
The only place where `TaskMetrics#externalAccums` might be updated is
`TaskMetrics#registerAccumulator`, which is invoked via
`taskContext.registerAccumulator(this)` when `AccumulatorV2#readObject` runs
during deserialization. Reading the code, it is highly unlikely that a null
value would be written here directly.
Since `externalAccums` is an `ArrayBuffer`, which is not thread-safe, I tried
adding the `synchronized` keyword to `TaskMetrics#registerAccumulator`, and the
NPE no longer occurred. I then added logging to record which threads were
calling this method concurrently, and ultimately identified Auron's
deserialization of expressions as the likely root cause of this problem.
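
To illustrate why `synchronized` helps, here is a minimal Java sketch of the pattern. It is not Spark's code: `MetricsSketch`, `SafeAccumRegistry`, and `runConcurrentRegistrations` are hypothetical names, and a plain `ArrayList` stands in for the Scala `ArrayBuffer`. The assumption is that the real failure follows the usual growable-array race: two threads read the same size, write to the same slot, and both increment the size, which leaves one slot null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TaskMetrics; not Spark's actual class.
class MetricsSketch {
    // Plays the role of externalAccums: a plain, non-thread-safe buffer.
    private final List<Long> externalAccums = new ArrayList<>();

    // synchronized serializes appends, so two threads can never
    // claim the same slot and leave a null gap behind.
    synchronized void registerAccumulator(Long acc) {
        externalAccums.add(acc);
    }

    // Copy under the same lock so readers never see a half-updated buffer.
    synchronized List<Long> snapshot() {
        return new ArrayList<>(externalAccums);
    }
}

class SafeAccumRegistry {
    // Registers threads * perThread values concurrently and returns the result.
    static List<Long> runConcurrentRegistrations(int threads, int perThread) {
        MetricsSketch metrics = new MetricsSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final long base = (long) t * perThread;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    metrics.registerAccumulator(base + i);
                }
            });
            workers.add(w);
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return metrics.snapshot();
    }

    public static void main(String[] args) {
        List<Long> result = runConcurrentRegistrations(8, 10_000);
        System.out.println("size=" + result.size()
                + ", hasNull=" + result.contains(null));
    }
}
```

With the `synchronized` keyword removed from `registerAccumulator`, the same run can intermittently lose entries, contain nulls, or throw from inside `add`, which would be consistent with the NPE seen in `TaskResultGetter`.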