Ngone51 commented on pull request #34536:
URL: https://github.com/apache/spark/pull/34536#issuecomment-964738126


   > SparkListenerExecutorAdded and SparkListenerExecutorRemoved are distinct 
from blockmanager events.
   Based on what I currently see, can you clarify why 
SparkListenerExecutorRemoved needs to be fired?
   
   First of all, note that there's a case (reported in SPARK-35011) where the 
executor doesn't exist in the scheduler backend but still exists in 
`BlockManagerMaster` (via its `BlockManager`). In this case, only a 
`SparkListenerBlockManagerAdded` event is fired, during `BlockManager` 
registration. On the `AppStatusListener` side, either a 
`SparkListenerExecutorAdded` or a `SparkListenerBlockManagerAdded` creates a 
live executor entity for the executor. Therefore, in the SPARK-35011 case the 
UI shows a live executor even though the executor is actually dead.
   
   For such registered `BlockManager`s, fortunately, 
`HeartbeatReceiver.expireDeadHosts` eventually removes them and fires a 
`SparkListenerBlockManagerRemoved` during removal. Note that no 
`SparkListenerExecutorRemoved` is fired, since the scheduler backend 
(`executorDataMap`) no longer contains the executor.
   
   However, `AppStatusListener` removes a live executor from the UI only on 
`SparkListenerExecutorRemoved`, not on `SparkListenerBlockManagerRemoved`. 
Therefore, we need to fire a separate `SparkListenerExecutorRemoved` for it.
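
   To make the event flow concrete, here is a toy model of the behavior 
described above. It is only a sketch and not Spark's actual code: the class 
`ToyAppStatusListener`, its method names, and the sample reason string are 
made up for illustration, and it assumes the listener behaves exactly as 
stated in this comment.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyAppStatusListener:
    """Hypothetical stand-in for the listener behavior described above."""

    # Executors currently shown as live in the UI, keyed by executor id.
    live_executors: Dict[str, bool] = field(default_factory=dict)
    # Loss reasons recorded for executors that were removed.
    removed_reasons: Dict[str, str] = field(default_factory=dict)

    def on_executor_added(self, executor_id: str) -> None:
        # An executor-added event creates a live executor entry.
        self.live_executors.setdefault(executor_id, True)

    def on_block_manager_added(self, executor_id: str) -> None:
        # A block-manager-added event creates a live executor entry as well.
        self.live_executors.setdefault(executor_id, True)

    def on_block_manager_removed(self, executor_id: str) -> None:
        # Assumption from the comment: this event does NOT remove the
        # live executor entry.
        pass

    def on_executor_removed(self, executor_id: str, reason: str) -> None:
        # Only this event removes the live executor; it also carries a reason.
        if self.live_executors.pop(executor_id, None) is not None:
            self.removed_reasons[executor_id] = reason


# Replay of the SPARK-35011 sequence: the executor is known only through
# its BlockManager registration.
listener = ToyAppStatusListener()
listener.on_block_manager_added("1")
listener.on_block_manager_removed("1")   # fired on heartbeat expiry
still_live = "1" in listener.live_executors
listener.on_executor_removed("1", "hypothetical loss reason")
live_after_fix = "1" in listener.live_executors
print(still_live, live_after_fix)
```

   In this model the executor stays live after the block manager removal and 
disappears only once the extra executor-removed event arrives, which is the 
gap this PR is meant to close.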
   
   > If there is downstream use of blockmanager and executor events 
interchangeably, we should fix that instead of duplicating the event? (I am 
assuming reference to AppStatusListener was for this ?)
   
   Yes, it's `AppStatusListener` that needs the event. If we fixed this in 
`AppStatusListener` instead, we'd lose the exact executor loss reason in the UI 
(`SparkListenerExecutorRemoved` contains a loss reason field but 
`SparkListenerBlockManagerRemoved` doesn't). So I chose to duplicate the event 
rather than change `AppStatusListener`.
   
   

