mridulm commented on pull request #34158:
URL: https://github.com/apache/spark/pull/34158#issuecomment-932617201


   
   > 3. In `BlockManager.registerWithExternalShuffleServer`:
   >    When `serializer.supportsRelocationOfSerializedObjects` returns `false` 
then the executor will register with the push-based shuffle version of 
`shuffleManagerMeta` even though push-based shuffle won't be used.
   >    Does this result in correct behavior?❓
   >    I'm worried about potentially hard-to-debug issues in the rare scenario 
where push-based shuffle would be used _except_ for the fact that the 
user-configured `KryoSerializer` doesn't support relocation of serialized 
objects. In our test code we have a 
`org.apache.spark.serializer.RegistratorWithoutAutoReset` which can be used to 
test the corner-case where that might occur.
   
   For push based shuffle, we pass additional metadata as suffix to 
shuffleManager id when registering with external shuffle manager (ESS).
   If present, it initializes ESS to support receiving block pushes. But if the 
application does not push blocks subsequently (because push based shuffle was 
disabled due to insufficient mergers/serializer relocation issue/etc), it is 
just functionality that is enabled but not used.
   
   So strictly speaking, this should not change things.
   
   > 
   > Just brainstorming here, but as a possible alternative fix: what if the 
driver evaluated whether push based shuffle could really be enabled, stored the 
result of that evaluation into a private internal configuration, then passed 
that configuration to executors (letting us skip the computation on the 
executors)? I wonder if there's any precedence for doing this elsewhere in the 
Spark code or whether we'd run into different initialization order issues.
   
   This is definitely a good idea - and we do use spark conf to pass 
`spark.app.attempt.id` to executors this way for use in push based shuffle.
   We can add something like `spark.shuffle.push.internal.enabled` - which can 
be used for this case.
   
   Thoughts ?
   
   +CC @Victsm, @zhouyejoe, @otterc, @venkata91 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to