Github user bOOm-X commented on the issue:

    https://github.com/apache/spark/pull/18004
  
    I am against the other approach for multiple reasons. The key one is 
that it will change the synchronization paradigm, and it will clearly change 
the behavior of the current listeners, possibly causing bugs. For example, the 
StorageStatusListener and the StorageListener are dependent: the second one uses 
the "result" of the first one. If you put them in different threads, it will 
certainly change the current behavior. Whether it will cause a fatal bug, I do not know.
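
    To make the dependency concrete, here is a minimal Python sketch. It is not Spark code: the classes, the tuple-shaped events, and both dispatch functions are hypothetical stand-ins. The second listener reads state maintained by the first. A single dispatch loop that calls listeners in registration order guarantees the second one sees up-to-date state; one queue per listener does not, and the sketch simulates just one possible interleaving (the dependent listener's queue being drained first).

```python
class StatusListener:
    """Hypothetical stand-in for StorageStatusListener: tracks block sizes."""

    def __init__(self):
        self.block_sizes = {}

    def on_event(self, event):
        kind, block_id, size = event
        if kind == "blockUpdated":
            self.block_sizes[block_id] = size


class DependentListener:
    """Hypothetical stand-in for StorageListener: reads the first listener's state."""

    def __init__(self, status):
        self.status = status
        self.totals_seen = []

    def on_event(self, event):
        # Records the total size known to the status listener at this moment.
        self.totals_seen.append(sum(self.status.block_sizes.values()))


def dispatch_synchronously(events, listeners):
    # One loop, one thread: each event reaches the listeners in registration order.
    for event in events:
        for listener in listeners:
            listener.on_event(event)


def dispatch_per_listener_queue(events, listeners):
    # Simulates independent per-listener queues. This is only ONE possible
    # interleaving: the last-registered listener drains its queue first.
    for listener in reversed(listeners):
        for event in events:
            listener.on_event(event)


EVENTS = [("blockUpdated", "b1", 10), ("blockUpdated", "b2", 5)]


def run(dispatch):
    status = StatusListener()
    dependent = DependentListener(status)
    dispatch(EVENTS, [status, dependent])
    return dependent.totals_seen


print(run(dispatch_synchronously))       # [10, 15]
print(run(dispatch_per_listener_queue))  # [0, 0]
```

    With real threads the result would vary from run to run, which is the kind of behavior change described above.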
    The asynchronous mechanism would have to be implemented very differently for 
each listener. No global approach can be used, because the message types and 
their frequencies differ so much. What you can leverage at the listener level 
is the set of message types the listener is interested in (the logging 
listener ignores the blockUpdated messages, by far the most frequent ones), 
the kind of processing (for the logging listener, the processing is the same 
for all message types), and the dependencies of the listener (the logging 
listener has none).
    
    For the other significant (in terms of performance) listener - the 
StorageListener - all of the key aspects are very different: 
    It processes the blockUpdated messages.
    Each message type is processed differently.
    The StorageListener depends on the StorageStatusListener (they have to 
process messages synchronously).
    
    The key thing in the event logging listener is the ability to not queue the 
blockUpdated messages at all, and so never have to consider them.
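
    As a rough illustration of "not queueing" a message type, the following Python sketch filters events before they enter a listener's queue. The `FilteringQueue` class and the dict-shaped events are assumptions made for the example, not part of Spark or of this PR.

```python
from collections import deque


class FilteringQueue:
    """Hypothetical per-listener queue that only stores events the listener cares about."""

    def __init__(self, ignored_kinds):
        self.ignored_kinds = frozenset(ignored_kinds)
        self.pending = deque()
        self.skipped = 0

    def post(self, event):
        # Ignored events are dropped here, so they never cost queue space
        # or processing time on the listener's thread.
        if event["kind"] in self.ignored_kinds:
            self.skipped += 1
            return False
        self.pending.append(event)
        return True


# Usage: a logging-style listener that ignores blockUpdated events.
logging_queue = FilteringQueue(ignored_kinds={"blockUpdated"})
for kind in ["stageSubmitted", "blockUpdated", "blockUpdated", "stageCompleted"]:
    logging_queue.post({"kind": kind})

print([e["kind"] for e in logging_queue.pending])  # ['stageSubmitted', 'stageCompleted']
print(logging_queue.skipped)                       # 2
```

    The filter runs on the posting side, so the most frequent messages are discarded before any queueing happens.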
    For the StorageStatusListener / StorageListener pair, I think that the 
key thing is that you can batch consecutive blockUpdated messages (the other 
messages, like SparkListenerStageSubmitted, act as a barrier) to decrease 
the processing time. This optimization will be much more complex than the 
logging listener one, and much less significant in terms of performance 
improvement.
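
    The batching idea can be sketched as follows. This is only an illustration in Python with a hypothetical `batch_block_updates` helper and dict-shaped events; it shows how runs of consecutive blockUpdated events could be grouped while every other event stays on its own and acts as a barrier.

```python
from itertools import groupby


def batch_block_updates(events):
    """Group runs of consecutive blockUpdated events into one batch each.

    Any other event forms a batch of its own, so it acts as a barrier:
    updates before it and after it are never merged together.
    """
    batches = []
    for is_update, run in groupby(events, key=lambda e: e["kind"] == "blockUpdated"):
        run = list(run)
        if is_update:
            batches.append(run)               # one batch for the whole run
        else:
            batches.extend([e] for e in run)  # each barrier event stays alone
    return batches


events = [
    {"kind": "blockUpdated", "block": "b1"},
    {"kind": "blockUpdated", "block": "b2"},
    {"kind": "stageSubmitted"},
    {"kind": "blockUpdated", "block": "b3"},
]

batch_sizes = [len(batch) for batch in batch_block_updates(events)]
print(batch_sizes)  # [2, 1, 1]
```

    A real implementation would also have to process each batch through both dependent listeners in order, which is part of why this optimization is more complex than the logging listener one.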
    
     

