gruuya opened a new pull request, #11765:
URL: https://github.com/apache/datafusion/pull/11765

   ## Which issue does this PR close?
   
   Relates to #9373 and #9375.
   
   ## Rationale for this change
   
   I'm dealing with a situation where we 
[have deeply nested plans](https://github.com/splitgraph/seafowl/blob/82a9ca4a7da9eed0d17e9fe0f6b3357af8bd97a1/src/frontend/flight/sync/writer.rs#L387-L398), 
which we want to execute and stream the data into storage (Parquet/Delta), and 
we're hitting the stack overflow problem observed in the aforementioned issues.
   
   Since this is on a write path we don't really need the analyzer/optimizer 
rules (I think), which are a part of the problem due to the tree node recursion 
that takes place there. That part is not an issue, since those rules can easily 
be opted out of via `with_analyzer_rules`/`with_optimizer_rules`. 
   
   However, the tightest bottleneck according to lldb is actually 
`ApplyFunctionRewrites`, which can't be opted out of, even though after 
https://github.com/apache/datafusion/pull/11155 it has no rewrite rules by 
default.
   
   ## What changes are included in this PR?
   
   Make `ApplyFunctionRewrites` simply bail out of the plan 
transformation/rewrite if it has no rules to apply (the default case presently).
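
To illustrate the idea, here is a minimal, self-contained sketch of the early bail-out. It is a toy model, not DataFusion's code: `Plan`, `RewriteRule`, `ApplyRewrites`, `chain`, and the `visited` counter are all hypothetical names made up for this example. The `visited` counter only exists to show that no tree traversal happens when the rule list is empty.

```rust
use std::sync::Arc;

/// Toy stand-in for a logical plan (hypothetical, not DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    name: String,
    inputs: Vec<Plan>,
}

/// Hypothetical rewrite rule: takes a node by value and returns the rewritten node.
type RewriteRule = Arc<dyn Fn(Plan) -> Plan + Send + Sync>;

/// Toy analogue of an analyzer pass that applies a list of rewrite rules.
struct ApplyRewrites {
    rules: Vec<RewriteRule>,
}

impl ApplyRewrites {
    /// Entry point: takes ownership of the plan, like an analyzer rule would.
    /// `visited` counts how many nodes the recursive rewrite touched.
    fn analyze(&self, plan: Plan, visited: &mut usize) -> Plan {
        // Early bail-out: with no rules there is nothing to rewrite, so the
        // plan is returned as-is and the recursive traversal is skipped.
        if self.rules.is_empty() {
            return plan;
        }
        self.rewrite_node(plan, visited)
    }

    /// Recursive bottom-up rewrite; each nesting level costs a stack frame.
    fn rewrite_node(&self, plan: Plan, visited: &mut usize) -> Plan {
        *visited += 1;
        let Plan { name, inputs } = plan;
        let inputs = inputs
            .into_iter()
            .map(|child| self.rewrite_node(child, visited))
            .collect();
        let mut node = Plan { name, inputs };
        for rule in &self.rules {
            node = rule(node);
        }
        node
    }
}

/// Builds a linear chain of `depth` nested nodes on top of a leaf scan.
fn chain(depth: usize) -> Plan {
    let mut plan = Plan { name: "scan".to_string(), inputs: vec![] };
    for i in 0..depth {
        plan = Plan { name: format!("project_{i}"), inputs: vec![plan] };
    }
    plan
}
```

With an empty `rules` vector, `analyze` returns the input plan without recursing, so this pass no longer contributes stack frames for deeply nested plans. With at least one rule, the recursion depth still grows with the plan depth.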
   
   ## Are these changes tested?
   
   I wanted to add a test that checks for reference equality of the in/out plans 
but then recalled that `AnalyzerRule::analyze` takes ownership of the plan. 
   
   So the only test is that I see a higher stack overflow threshold with this 
change.
   
   ## Are there any user-facing changes?
   
   None.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

