vanzin opened a new pull request #26670: [SPARK-30033][core] Manage shuffle IO plugins using Spark's plugin system.
URL: https://github.com/apache/spark/pull/26670
 
 
   SPARK-25299 is introducing a new plugin interface for shuffle IO; currently,
   parts of that API provide lifecycle methods that are already covered by the
   plugin API that was added in SPARK-29396.
   
   This change makes some modifications so that:
   
   - The driver and executor components of the shuffle plugin extend their
     respective counterparts in the generic plugin API.
   - The shuffle IO plugin is managed by the same code that manages other
     generic plugins.
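
   The relationship described by the two points above can be sketched roughly
   as follows. This is a minimal illustration, not Spark's code: every
   interface and class name here (GenericDriverPlugin, ShuffleDriverPart,
   DriverPluginManager, LocalDiskDriverPart, the "shuffle.sketch.backend" key)
   is a hypothetical stand-in for the real plugin and shuffle APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins, NOT Spark's real interfaces; all names are hypothetical.
class ShufflePluginSketch {

    /** Stand-in for the driver side of the generic plugin API. */
    interface GenericDriverPlugin {
        /** Returns extra configuration to hand to executors. */
        default Map<String, String> init() { return new LinkedHashMap<>(); }
        default void shutdown() {}
    }

    /** Stand-in for the executor side of the generic plugin API. */
    interface GenericExecutorPlugin {
        default void init(Map<String, String> extraConf) {}
        default void shutdown() {}
    }

    /** Shuffle driver component: it is a generic driver plugin. */
    interface ShuffleDriverPart extends GenericDriverPlugin {}

    /** Shuffle executor component: generic lifecycle plus shuffle-specific work. */
    interface ShuffleExecutorPart extends GenericExecutorPlugin {
        String writerFor(int shuffleId, long mapId);
    }

    /** One manager drives the lifecycle of every driver plugin, shuffle or not. */
    static final class DriverPluginManager {
        private final List<GenericDriverPlugin> plugins = new ArrayList<>();

        void register(GenericDriverPlugin plugin) { plugins.add(plugin); }

        Map<String, String> initAll() {
            Map<String, String> merged = new LinkedHashMap<>();
            for (GenericDriverPlugin p : plugins) {
                merged.putAll(p.init());
            }
            return merged;
        }

        void shutdownAll() {
            for (GenericDriverPlugin p : plugins) {
                p.shutdown();
            }
        }
    }

    /** Hypothetical shuffle driver component backed by local disk. */
    static final class LocalDiskDriverPart implements ShuffleDriverPart {
        boolean stopped = false;

        @Override public Map<String, String> init() {
            Map<String, String> conf = new LinkedHashMap<>();
            conf.put("shuffle.sketch.backend", "local-disk");
            return conf;
        }

        @Override public void shutdown() { stopped = true; }
    }

    public static void main(String[] args) {
        DriverPluginManager manager = new DriverPluginManager();
        LocalDiskDriverPart shuffle = new LocalDiskDriverPart();
        manager.register(shuffle);                       // the shuffle plugin
        manager.register(new GenericDriverPlugin() {});  // any other plugin
        System.out.println(manager.initAll());
        manager.shutdownAll();
        System.out.println(shuffle.stopped);
    }
}
```

   Because the shuffle component is just another GenericDriverPlugin in this
   sketch, the manager needs no shuffle-specific lifecycle code.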
   
   This simplifies and reuses similar code that exists in both implementations,
   and also provides more functionality to shuffle plugins: not only do they
   have more contextual information (without having to query APIs like
   SparkEnv) but they also have access to other functionality in the plugin API
   that would otherwise require touching internal Spark APIs.
   
   There is a small change to the generic plugin API to avoid registering an
   RPC endpoint and starting threads when not needed; plugins now must
   explicitly say they want to handle RPC messages for the endpoint to be
   created. This is done because the default shuffle plugin is now loaded by
   the plugin system, and does not need the RPC functionality. (This API hasn't
   been released yet so it's ok to make the change.)
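
   As a rough illustration of the opt-in idea, assuming a hypothetical
   wantsRpc() flag (the mechanism this PR actually uses may differ, and none of
   the names below are Spark's), a container could skip the endpoint entirely
   when no plugin asks for it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins, NOT Spark's real classes; every name here is hypothetical.
class RpcOptInSketch {

    interface Plugin {
        /** Hypothetical opt-in flag; defaults to "no RPC needed". */
        default boolean wantsRpc() { return false; }

        /** Only meaningful for plugins that opted in. */
        default Object receive(Object message) {
            throw new UnsupportedOperationException("plugin did not opt in to RPC");
        }
    }

    /** A plugin that explicitly opts in and answers messages. */
    static final class EchoPlugin implements Plugin {
        @Override public boolean wantsRpc() { return true; }
        @Override public Object receive(Object message) { return "echo: " + message; }
    }

    static final class PluginContainer {
        private final List<Plugin> rpcPlugins = new ArrayList<>();
        private final boolean endpointCreated;

        PluginContainer(List<Plugin> plugins) {
            for (Plugin p : plugins) {
                if (p.wantsRpc()) {
                    rpcPlugins.add(p);
                }
            }
            // Stands in for registering an RPC endpoint and starting its threads.
            endpointCreated = !rpcPlugins.isEmpty();
        }

        boolean hasEndpoint() { return endpointCreated; }

        /** Routes a message to the first opted-in plugin (a simplification). */
        Object dispatch(Object message) {
            if (!endpointCreated) {
                throw new IllegalStateException("no RPC endpoint was created");
            }
            return rpcPlugins.get(0).receive(message);
        }
    }

    public static void main(String[] args) {
        Plugin defaultShuffle = new Plugin() {};  // does not opt in
        PluginContainer quiet = new PluginContainer(Arrays.asList(defaultShuffle));
        System.out.println(quiet.hasEndpoint());

        PluginContainer mixed =
            new PluginContainer(Arrays.asList(defaultShuffle, new EchoPlugin()));
        System.out.println(mixed.hasEndpoint());
        System.out.println(mixed.dispatch("ping"));
    }
}
```

   With only the non-RPC plugin loaded, the sketch never creates the endpoint,
   which mirrors the motivation given above for the default shuffle plugin.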
   
   The only downside is that initialization of the SortShuffleManager in
   executors is a bit weird, because of the order in which things are
   initialized: the shuffle manager is initialized by SparkEnv, and plugin
   initialization happens after that. In any case, all initialization is done
   before any tasks are allowed to run.
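
   One possible way to picture that ordering, purely as an assumption about
   the shape of the problem and not a description of what this PR implements,
   is a manager that is constructed first and receives its executor components
   later. ShuffleManagerStandIn, ExecutorComponents and attach() are
   hypothetical names:

```java
// Simplified stand-ins, NOT Spark's real classes; every name here is hypothetical.
class LateInitSketch {

    /** Stand-in for the executor-side shuffle components supplied by a plugin. */
    interface ExecutorComponents {
        String name();
    }

    /** Stand-in for a shuffle manager that is created before plugins are loaded. */
    static final class ShuffleManagerStandIn {
        // Unset until plugin initialization has run.
        private volatile ExecutorComponents components;

        void attach(ExecutorComponents c) {
            if (components != null) {
                throw new IllegalStateException("components already attached");
            }
            components = c;
        }

        String runTask(int taskId) {
            ExecutorComponents c = components;
            if (c == null) {
                throw new IllegalStateException("plugins not initialized yet");
            }
            return "task " + taskId + " via " + c.name();
        }
    }

    public static void main(String[] args) {
        // Step 1: the environment creates the manager.
        ShuffleManagerStandIn manager = new ShuffleManagerStandIn();
        // Step 2: plugin initialization supplies the components afterwards.
        manager.attach(() -> "local-disk");
        // Step 3: only now are tasks allowed to run.
        System.out.println(manager.runTask(0));
    }
}
```

   The guard in runTask() only makes the ordering requirement explicit; as
   noted above, all initialization finishes before tasks start.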
   
   Currently, the shuffle plugin is always loaded, regardless of whether the
   sort shuffle manager is being used; this was already the case in the driver,
   but now is also the case in the executors. It shouldn't be hard to fix that
   if needed.
   
   Tested with existing and updated unit tests.
   

