Github user YuhuWang2002 commented on the issue:

    https://github.com/apache/spark/pull/15297
  
    @tgravescs:
    Thank you for your response. When a single reduce task handles huge data, 
it runs slowly and unstably, so we split one reduce task into multiple reduce 
tasks. Consider a single reduce task computing A join B: we split it so that 
task 1 does A1 join B, task 2 does A2 join B, and so on, where A1 is the part 
of A read from a range of map outputs. For Spark SQL, each A1 is treated as a 
separate partition during processing, so multiple executors can run the 
sub-tasks and spread the processing load.
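    The splitting described above can be sketched in plain code (this is an 
illustrative sketch with hypothetical names, not the actual code from this PR): 
instead of one reducer joining all of A with B, each sub-task joins one slice 
Ai of A, taken from a range of map outputs, against B, and the union of the 
sub-results equals the full join.

```python
# Illustrative sketch (hypothetical helpers, not the PR's implementation):
# splitting one reduce-side join over A into several sub-joins, each
# handling a slice of A, while B is available to every sub-task.

def join(a_rows, b_rows):
    # Simple hash join on key: returns (key, a_val, b_val) tuples.
    b_index = {}
    for k, v in b_rows:
        b_index.setdefault(k, []).append(v)
    return [(k, av, bv) for k, av in a_rows for bv in b_index.get(k, [])]

def split_join(a_rows, b_rows, num_splits):
    # Split A into `num_splits` ranges (as if each range came from a
    # subset of map outputs) and join each slice with B independently.
    # In Spark, each slice would be a separate partition run by its
    # own task, possibly on a different executor.
    step = (len(a_rows) + num_splits - 1) // num_splits
    slices = [a_rows[i:i + step] for i in range(0, len(a_rows), step)]
    results = []
    for a_slice in slices:  # these sub-joins are independent
        results.extend(join(a_slice, b_rows))
    return results

a = [(1, "a1"), (2, "a2"), (3, "a3"), (1, "a4")]
b = [(1, "b1"), (2, "b2")]
# The union of the sub-joins matches the single big join.
assert sorted(split_join(a, b, 3)) == sorted(join(a, b))
```

Because the sub-joins share no state, no single reduce task has to hold all of 
A, which is the point of the split when A is skewed or very large.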


