[GitHub] [spark] tenglei commented on a diff in pull request #38181: [SPARK-40720][CORE] Fix spark-ui jobs status not updating under high concurrency scenario

GitBox Sun, 09 Oct 2022 18:40:24 -0700


tenglei commented on code in PR #38181:
URL: https://github.com/apache/spark/pull/38181#discussion_r990872972



##########
core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala:
##########
@@ -154,8 +154,9 @@ private class AsyncEventQueue(
       return
     }
 
-    eventCount.incrementAndGet()
-    if (eventQueue.offer(event)) {
+    if (eventQueue.offer(event, conf.get(LISTENER_BUS_EVENT_QUEUE_TIMEOUT),
+      TimeUnit.MILLISECONDS)) {

Review Comment:
   Blocking only occurs when the eventQueue is full. A full eventQueue means 
   that the load on the driver is already relatively high, and continuing to 
   let individual events be generated unchecked will slow down overall job 
   processing. In several tests on my end, with the queue full and the timeout 
   set to 30s, I instead saw a performance improvement of around 5% in a 
   high-concurrency scenario (compared to before the modification). So I think 
   it makes sense to have logic that allows for flexible blocking.
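
   To illustrate the behavioral difference being discussed, here is a minimal, 
   standalone sketch. It is not the Spark code: it uses a plain 
   `java.util.concurrent.LinkedBlockingQueue`, and the names `TimedOfferSketch`, 
   `postNonBlocking`, `postWithTimeout`, `capacity` and `timeoutMs`, as well as 
   the values chosen for them, are assumptions made up for the example.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

object TimedOfferSketch {
  // Hypothetical values for illustration only; these are not Spark defaults.
  val capacity = 2
  val timeoutMs = 200L

  /** Non-blocking post: returns false immediately when the queue is full. */
  def postNonBlocking(q: LinkedBlockingQueue[String], event: String): Boolean =
    q.offer(event)

  /** Timed post: waits up to timeoutMs for free space before giving up. */
  def postWithTimeout(q: LinkedBlockingQueue[String], event: String): Boolean =
    q.offer(event, timeoutMs, TimeUnit.MILLISECONDS)

  /** Returns (event rejected by plain offer, event accepted by timed offer). */
  def demo(): (Boolean, Boolean) = {
    val q = new LinkedBlockingQueue[String](capacity)
    q.offer("e1")
    q.offer("e2") // the queue is now full

    // With the plain offer, the event is rejected right away.
    val rejectedImmediately = !postNonBlocking(q, "e3")

    // A consumer frees one slot shortly afterwards, well within the timeout.
    val consumer = new Thread(new Runnable {
      def run(): Unit = {
        Thread.sleep(50)
        q.take()
      }
    })
    consumer.start()

    // The timed offer blocks the producer until the slot becomes available.
    val acceptedAfterWait = postWithTimeout(q, "e4")
    consumer.join()
    (rejectedImmediately, acceptedAfterWait)
  }
}
```

   With the plain `offer`, the producer never waits and the event is simply 
   rejected when the queue is full. With the timed `offer`, the producer is 
   blocked only while the queue is full and only up to the timeout, which is 
   the trade-off under review here.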



