guihuawen created SPARK-48378:
---------------------------------
Summary: Limit the maximum number of dynamic partitions
Key: SPARK-48378
URL: https://issues.apache.org/jira/browse/SPARK-48378
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: guihuawen
Fix For: 4.0.0
This issue follows up on https://issues.apache.org/jira/browse/SPARK-37217:
"Assuming that 1001 partitions are written, the data of those 1001 partitions
will be deleted first. But because {{hive.exec.max.dynamic.partitions}} is 1000
by default, loadDynamicPartitions will then fail, even though the data of the
1001 partitions has already been deleted. So we can check whether the number of
dynamic partitions is greater than {{hive.exec.max.dynamic.partitions}} before
deleting; the job should fail fast in that case."
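The driver-side fail-fast check described above could be sketched as follows. This is only an illustration of the idea, not actual Spark internals; the object and method names here are hypothetical.

```scala
// Hypothetical fail-fast check (illustrative names, not real Spark code):
// count the distinct dynamic partition specs about to be loaded and fail
// BEFORE any existing partition data is deleted, instead of letting
// loadDynamicPartitions fail after the deletion has already happened.
object DynamicPartitionGuard {
  def checkPartitionLimit(
      partitionSpecs: Seq[Map[String, String]],
      maxDynamicPartitions: Int): Unit = {
    val numPartitions = partitionSpecs.distinct.size
    if (numPartitions > maxDynamicPartitions) {
      throw new IllegalStateException(
        s"Number of dynamic partitions created ($numPartitions) exceeds " +
        s"hive.exec.max.dynamic.partitions ($maxDynamicPartitions); " +
        "failing fast before deleting existing partition data.")
    }
  }
}
```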
This check can be made more comprehensive:
If a task running on an executor has already generated more partitions than
hive.exec.max.dynamic.partitions, the task should be stopped on the executor,
because the cost of generating the data is high.
If no task on an executor generates more partitions than
hive.exec.max.dynamic.partitions, the check is still performed on the driver.
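The executor-side early abort proposed above could look roughly like this. The class and method names are hypothetical, purely to illustrate the proposal; real Spark write paths differ.

```scala
// Hypothetical per-task guard (illustrative, not real Spark internals):
// each write task tracks the distinct partition values it has emitted
// and aborts as soon as the count exceeds the limit, instead of
// finishing the expensive write and only failing later on the driver.
class PartitionLimitTracker(maxDynamicPartitions: Int) {
  private val seen = scala.collection.mutable.HashSet.empty[String]

  // Called once per output row with its rendered partition path,
  // e.g. "dt=2024-05-21".
  def track(partitionValue: String): Unit = {
    seen += partitionValue
    if (seen.size > maxDynamicPartitions) {
      throw new IllegalStateException(
        s"Task produced ${seen.size} dynamic partitions, exceeding " +
        s"hive.exec.max.dynamic.partitions ($maxDynamicPartitions); " +
        "aborting the task early.")
    }
  }
}
```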
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]