guihuawen created SPARK-48378:
---------------------------------

             Summary: Limit the maximum number of dynamic partitions
                 Key: SPARK-48378
                 URL: https://issues.apache.org/jira/browse/SPARK-48378
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: guihuawen
             Fix For: 4.0.0


This follows up on https://issues.apache.org/jira/browse/SPARK-37217:

"Assuming that 1001 partitions are written, the data of those 1001 partitions 
will be deleted first. But because {{hive.exec.max.dynamic.partitions}} is 1000 
by default, loadDynamicPartitions will then fail, even though the data of the 
1001 partitions has already been deleted.

So we can check whether the number of dynamic partitions is greater than 
{{hive.exec.max.dynamic.partitions}} before deleting; that way it fails fast."
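The fail-fast check quoted above could be sketched roughly as follows. This is an illustrative sketch, not Spark's actual implementation; the class and method names are hypothetical, and the limit parameter stands in for {{hive.exec.max.dynamic.partitions}} (default 1000).

```java
// Hypothetical sketch of the proposed fail-fast check: compare the number of
// dynamic partitions against the configured limit (mirroring
// hive.exec.max.dynamic.partitions, default 1000) BEFORE any existing
// partition data is deleted.
public class DynamicPartitionLimitCheck {
    public static void check(int numDynamicPartitions, int maxDynamicPartitions) {
        if (numDynamicPartitions > maxDynamicPartitions) {
            throw new IllegalArgumentException(
                "Number of dynamic partitions created is " + numDynamicPartitions
                + ", which is more than " + maxDynamicPartitions
                + ". To fix this, try setting hive.exec.max.dynamic.partitions"
                + " to at least " + numDynamicPartitions + ".");
        }
    }
}
```

Run before deletion, the check turns the late loadDynamicPartitions failure into an early error with no data loss.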

 

Could this be made more comprehensive?


If a task on an execution node has already generated more partitions than 
hive.exec.max.dynamic.partitions allows, the task should be stopped on the 
execution node itself, because the cost of generating the data is high.
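The executor-side stop described above could look something like the sketch below. This is a hypothetical illustration (the class name and per-row hook are assumptions, not Spark code): each write task tracks the distinct partition values it has produced and aborts as soon as the count crosses the limit, instead of writing all the data and failing later.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical executor-side guard: a write task records each distinct
// partition value it emits and throws as soon as the count exceeds the
// limit, so the expensive data generation stops early on the executor.
public class DynamicPartitionGuard {
    private final int maxDynamicPartitions;
    private final Set<String> seenPartitions = new HashSet<>();

    public DynamicPartitionGuard(int maxDynamicPartitions) {
        this.maxDynamicPartitions = maxDynamicPartitions;
    }

    // Called once per output row (or per new output file) with that row's
    // partition value; duplicates do not increase the count.
    public void onPartition(String partitionValue) {
        if (seenPartitions.add(partitionValue)
                && seenPartitions.size() > maxDynamicPartitions) {
            throw new IllegalStateException(
                "Task created " + seenPartitions.size()
                + " dynamic partitions, exceeding the limit of "
                + maxDynamicPartitions);
        }
    }
}
```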


If no task on the executing nodes generates partitions exceeding 
hive.exec.max.dynamic.partitions, the total is still checked on the driver.


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
