sarutak commented on code in PR #52524:
URL: https://github.com/apache/spark/pull/52524#discussion_r2616483360
##########
core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala:
##########
@@ -481,23 +481,33 @@ private[storage] class BlockInfoManager(trackingCacheVisibility: Boolean = false
val writeLocks = Option(writeLocksByTask.remove(taskAttemptId)).getOrElse(Collections.emptySet)
writeLocks.forEach { blockId =>
blockInfo(blockId) { (info, condition) =>
- assert(info.writerTask == taskAttemptId)
- info.writerTask = BlockInfo.NO_WRITER
- condition.signalAll()
+ // Check the existence of `blockId` because `unlock` may have already removed it
+ // concurrently.
+ if (writeLocks.contains(blockId)) {
+ blocksWithReleasedLocks += blockId
+ assert(info.writerTask == taskAttemptId)
+ info.writerTask = BlockInfo.NO_WRITER
+ condition.signalAll()
+ }
}
Review Comment:
@mridulm
Thank you for taking a look at this PR and sorry for the late reply.
> Do we need to make a change on the write side ?
Yes. Reading a broadcast value in a task goes through
[BlockManager#putSingle](https://github.com/apache/spark/blob/43f7936d7b3a4701e3d0fdb44663006cbe0db70b/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L296),
so a write lock can also be released on that path.
> If yes, there is a possibility of NPE in unlock [here](https://github.com/apache/spark/blob/6bf8c172f04a1706b52ab451181b6b1e98a647b6/core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala#L392)?
I was able to reproduce the NPE, so I updated the code to add a null check.
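For illustration only, here is a minimal sketch of the kind of null check described above. It is not the code in `BlockInfoManager`: `LockRecord`, `NullSafeUnlockSketch`, and all method names are hypothetical, and the sketch assumes a plain `ConcurrentHashMap` as the registry, where a lookup returns `null` once another thread has removed the entry.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical, simplified stand-in for a per-block lock record (not Spark's BlockInfo).
final class LockRecord(var writerTask: Long)

object NullSafeUnlockSketch {
  val NoWriter: Long = -1L

  // Hypothetical registry keyed by block name.
  private val records = new ConcurrentHashMap[String, LockRecord]()

  // Registers a record whose write lock is held by `taskId`.
  def lockForWriting(blockId: String, taskId: Long): Unit =
    records.put(blockId, new LockRecord(taskId))

  // Simulates a concurrent removal of the block's record.
  def remove(blockId: String): Unit =
    records.remove(blockId)

  // Returns true if a write lock held by `taskId` was released, and false if
  // the record is already gone or the lock is not held by `taskId`.
  def unlock(blockId: String, taskId: Long): Boolean = {
    val record = records.get(blockId) // null if the entry was removed concurrently
    if (record == null) {
      false
    } else record.synchronized {
      if (record.writerTask == taskId) {
        record.writerTask = NoWriter
        true
      } else {
        false
      }
    }
  }
}
```

With this shape, a caller that races with a removal gets `false` back instead of dereferencing a missing record.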