HeartSaVioR commented on a change in pull request #24676: [SPARK-27804]
LiveListenerBus may create multiple AsyncEventQueues with same name in
concurrent scene
URL: https://github.com/apache/spark/pull/24676#discussion_r286522772
##########
File path: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala
##########
@@ -86,44 +87,61 @@ private[spark] class LiveListenerBus(conf: SparkConf) {
}
/**
- * Add a listener to a specific queue, creating a new queue if needed.
Queues are independent
- * of each other (each one uses a separate thread for delivering events),
allowing slower
- * listeners to be somewhat isolated from others.
- */
+ * Add a listener to a specific queue, creating a new queue if needed.
Queues are independent
+ * of each other (each one uses a separate thread for delivering events),
allowing slower
+ * listeners to be somewhat isolated from others.
+ *
+ * @param listener
+ * @param name name of AsyncEventQueue, @see [[AsyncEventQueue.name]] for
more
+ */
private[spark] def addToQueue(
listener: SparkListenerInterface,
- queue: String): Unit = synchronized {
Review comment:
Here write operation on `queues` is guarded with `synchronized`, so what you
describe should not happen as I commented on JIRA issue.
You've asked in JIRA comment why using explicit thread-safe list if the list
is guarded. As I'm not the author of code I can't say, but you can trace back
to #19211 and read some comments around `queues`.
The list is expected to have very small writing operations, whereas
iteration frequently happens. If you see javadoc about `CopyOnWriteArrayList`:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/CopyOnWriteArrayList.html
```
This is ordinarily too costly, but may be more efficient than alternatives
when traversal operations vastly outnumber mutations, and is useful when you
cannot or don't want to synchronize traversals, yet need to preclude
interference among concurrent threads.
```
As you can see `postToQueues`, traversal operations are not guarded with
`synchronized`. So the list should be still thread-safe, but it is not to avoid
race condition between multiple writes (it's guarded with `synchronized`) but
race between read (traversal) and write.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]