timarmstrong commented on a change in pull request #34245:
URL: https://github.com/apache/spark/pull/34245#discussion_r727407075
##########
File path: core/src/main/scala/org/apache/spark/TaskContextImpl.scala
##########
@@ -82,23 +82,33 @@ private[spark] class TaskContextImpl(
@volatile private var _fetchFailedException: Option[FetchFailedException] =
None
@GuardedBy("this")
- override def addTaskCompletionListener(listener: TaskCompletionListener)
- : this.type = synchronized {
- if (completed) {
+ override def addTaskCompletionListener(listener: TaskCompletionListener):
this.type = {
+ val needToCallListener = synchronized {
+ if (completed) {
+ true
+ } else {
+ onCompleteCallbacks += listener
+ false
+ }
+ }
Review comment:
The API doesn't intend to guarantee any ordering of when the task
completion listeners are called AFAICT. I think before this change the
implementation ends up guaranteeing that the listeners are called sequentially.
So it seems possible that some code could be accidentally depending on that.
This might be overengineering it, but we could have a scheme that avoided
the deadlock issues and guaranteed sequential execution of callbacks. You would
have at most one single thread at any point in time responsible for invoking
callbacks. If another thread needs to invoke a callback, it either delegates it
to the current callback invocation thread, or it becomes the callback execution
thread itself. This means that the callback invocation thread needs to first
invoke all of the current registered callbacks, but when it's done with those,
check to see if any more callbacks have been queued.
I think we could do that by having the callback invocation thread taking
ownership of the current callbacks list, but after invoking those callbacks
checking to see if any more have been queued. We'd also need a variable to
track if there's a current callback execution thread.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]