hui730 commented on PR #43621:
URL: https://github.com/apache/spark/pull/43621#issuecomment-1790565946

   > > My plan is to create a new binary from the executor binary, using System.arraycopy().
   > 
   > This is what is effectively happening currently, right? The underlying serialized task array is immutable - and is repeatedly read to deserialize into the task closure.
   > 
   > I want to make sure I understand the proposal, and how it is different 
from what Spark is currently doing.
   
   Assume there are currently n tasks (from the same stage) running simultaneously in the executor.
   Today, the executor reads the remote broadcast, deserializes it, and only then starts running these n tasks: this step is serial.
   In my modification, reading the remote broadcast and deserializing it into an Array[Byte] happens asynchronously, in parallel with launchTask. This can save time.
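   To make the proposal concrete, here is a minimal, self-contained sketch of the idea (not Spark's actual executor code; `fetchAndDeserializeBroadcast` and `launchTaskSetup` are hypothetical stand-ins for the broadcast read and the per-task launch bookkeeping): the broadcast fetch runs on a Future while task launch proceeds, and each task blocks only if the bytes are not ready yet.

   ```scala
   import scala.concurrent.{Await, Future}
   import scala.concurrent.ExecutionContext.Implicits.global
   import scala.concurrent.duration._

   object AsyncBroadcastSketch {
     // Stand-in for reading the remote broadcast and deserializing it
     // into an Array[Byte]; the sleep simulates the network/deser cost.
     def fetchAndDeserializeBroadcast(): Array[Byte] = {
       Thread.sleep(50)
       Array.fill[Byte](16)(1)
     }

     // Stand-in for the per-task launch work that does not need the
     // task binary yet.
     def launchTaskSetup(taskId: Int): Unit = Thread.sleep(10)

     def runTasks(n: Int): Array[Byte] = {
       // Kick off the broadcast read/deserialization asynchronously...
       val taskBinary: Future[Array[Byte]] = Future(fetchAndDeserializeBroadcast())
       // ...while launching the n tasks proceeds in parallel.
       (1 to n).foreach(launchTaskSetup)
       // A task blocks here only if the binary is not ready yet.
       Await.result(taskBinary, 10.seconds)
     }

     def main(args: Array[String]): Unit =
       println(runTasks(4).length)
   }
   ```

   In the serial version, the fetch cost and the launch cost add up; in this sketch they overlap, so the total is roughly the maximum of the two.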


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
