ulysses you created SPARK-34226:
-----------------------------------

             Summary: Reduce RepartitionOperation num partitions to its child max row
                 Key: SPARK-34226
                 URL: https://issues.apache.org/jira/browse/SPARK-34226
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: ulysses you


Repartitioning data is pointless when the partition number is larger than the 
number of data rows; it only wastes resources on redundant tasks.

In ETL jobs, we always inject `repartition` or `distribute by` to reduce the 
number of output partitions, but the partition number may be larger than the 
number of data rows. It would be better to make a best effort to remove the 
redundant partitions.
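
The idea can be sketched as a toy model outside of Spark. This is only an 
illustration of the proposed capping, not the optimizer rule itself: the 
function name `effective_num_partitions` and the parameter `child_max_rows` 
(an upper bound on the child's row count, or None when unknown) are 
hypothetical and do not come from the Spark code base.

```python
from typing import Optional


def effective_num_partitions(requested: int, child_max_rows: Optional[int]) -> int:
    """Cap a requested partition count at the child's known row upper bound.

    Hypothetical helper: `child_max_rows` stands in for whatever upper bound
    on the row count the planner may know about the child; None means that
    no bound is known.
    """
    if requested < 1:
        raise ValueError("requested partition count must be positive")
    if child_max_rows is None:
        # No upper bound is known, so the request is kept unchanged.
        return requested
    # More partitions than rows only adds empty tasks; keep at least one
    # partition even when the child is known to be empty.
    return max(1, min(requested, child_max_rows))
```

Under these assumptions, a request for 200 partitions over a child with at 
most 10 rows is reduced to 10, while a request with no known row bound is 
left as it is.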

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

