cxzl25 opened a new pull request #34493:
URL: https://github.com/apache/spark/pull/34493


   ### What changes were proposed in this pull request?
   SPARK-29295 introduced a mechanism for writing to external tables with 
dynamic partition overwrite, in which the data in the target partitions is 
deleted first.
   
   Assuming that 1001 partitions are written, the data of those 1001 partitions 
is deleted first. But because `hive.exec.max.dynamic.partitions` defaults to 
1000, `loadDynamicPartitions` then fails, even though the data of the 1001 
partitions has already been deleted.
   
   So we can check whether the number of dynamic partitions exceeds 
`hive.exec.max.dynamic.partitions` before deleting anything; the job should then 
fail fast, before any data is lost.
   
   ### Why are the changes needed?
   Avoid unrecoverable data loss when the job fails.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added a unit test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


