[PR] To solve the issue of generating excessively large execution plans when encountering multiple levels of subqueries while enabling DynamicPartitionPruning. [spark]

via GitHub Wed, 28 Aug 2024 04:51:27 -0700


chenfengbin2009 opened a new pull request, #47911:
URL: https://github.com/apache/spark/pull/47911


   [SPARK-48486][CORE] Instead of PartitionPruning#prune is expected
   ### What changes were proposed in this pull request?
   
   This PR modifies the `prune` method of 
org.apache.spark.sql.execution.dynamicpruning.PartitionPruning. It also adds a 
configuration entry, DYNAMIC_PARTITION_PRUNING_MAX_LENGTH, and a matching 
dynamicPartitionPruningMaxLength accessor method to 
org.apache.spark.sql.internal.SQLConf.
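
   To illustrate what a length limit on dynamic partition pruning could look 
like, here is a minimal, Spark-free sketch. It is not the code in this PR. The 
`Plan` class, the `shouldInsertPruningFilter` helper, and the rule "skip the 
pruning filter when the rendered plan exceeds `maxLength` characters" are all 
assumptions made for illustration; the actual semantics of 
DYNAMIC_PARTITION_PRUNING_MAX_LENGTH are defined by the patch itself.

```scala
// Hypothetical model of a plan-length guard; none of these names exist in Spark.
object DppLengthGuardSketch {
  // Stand-in for a logical plan node; only its rendered text matters here.
  final case class Plan(description: String, children: Seq[Plan] = Nil) {
    // Renders the node and its children as one parenthesised string.
    def render: String =
      (description +: children.map(_.render)).mkString("(", " ", ")")
  }

  // Assumed rule: insert the pruning filter only if the rendered filtering
  // plan has at most maxLength characters. A negative maxLength disables
  // the check (also an assumption).
  def shouldInsertPruningFilter(filteringPlan: Plan, maxLength: Int): Boolean =
    maxLength < 0 || filteringPlan.render.length <= maxLength

  def main(args: Array[String]): Unit = {
    val small = Plan("Scan dim")
    // Wrap the scan in 50 nested subquery levels to mimic a deeply nested query.
    val nested = (1 to 50).foldLeft(small)((p, i) => Plan(s"Subquery$i", Seq(p)))
    println(shouldInsertPruningFilter(small, maxLength = 100))
    println(shouldInsertPruningFilter(nested, maxLength = 100))
  }
}
```

   In this sketch the shallow plan passes the check while the plan with 50 
nested subquery levels does not, so only the shallow plan would receive a 
pruning filter.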
   
   ### Why are the changes needed?
   
   When a query contains many levels of subqueries, Spark's 
DynamicPartitionPruning can generate plans with excessive character length. 
These plans consume excessive memory and can lead to OOM errors. This change 
addresses that issue.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes.
   The change applies when DYNAMIC_PARTITION_PRUNING_ENABLED is set to true, 
that is, when DPP optimization is enabled. In that case the generated execution 
plan may trigger the issue described above.
   
   ### How was this patch tested?
   Test cases have been added to cover the change as far as possible.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   no


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

