alamb commented on code in PR #10033:
URL:
https://github.com/apache/arrow-datafusion/pull/10033#discussion_r1570466057
##########
datafusion/core/src/execution/context/mod.rs:
##########
@@ -471,24 +471,37 @@ impl SessionContext {
/// [`SQLOptions::verify_plan`].
pub async fn execute_logical_plan(&self, plan: LogicalPlan) -> Result<DataFrame> {
match plan {
- LogicalPlan::Ddl(ddl) => match ddl {
- DdlStatement::CreateExternalTable(cmd) => {
- self.create_external_table(&cmd).await
- }
- DdlStatement::CreateMemoryTable(cmd) => {
- self.create_memory_table(cmd).await
- }
- DdlStatement::CreateView(cmd) => self.create_view(cmd).await,
- DdlStatement::CreateCatalogSchema(cmd) => {
- self.create_catalog_schema(cmd).await
+ LogicalPlan::Ddl(ddl) => {
+ // Box::pin avoids allocating the stack space within this function's frame
+ // for every one of these individual async functions, decreasing the risk of
+ // stack overflows.
+ match ddl {
+ DdlStatement::CreateExternalTable(cmd) => {
+ Box::pin(async move { self.create_external_table(&cmd).await })
+ as std::pin::Pin<Box<dyn futures::Future<Output = _> + Send>>
Review Comment:
> I did second-guess this change a lot ... as I expected Rust to allocate
enough stack space not for all futures, but for the biggest future out of all,
as only one of them will actually be called. But then I don't know why would
memory go down progressively with every future that I boxed.
What I have seen Rust do (in debug builds only) is allocate separate stack
space for each local variable in the function. I speculate that this is to make
debugging easier: each variable has its own slot on the stack and won't get
overwritten with values from other variables, wherever execution currently is.
When I have worked with C/C++ in the past (mostly gcc), the slots in the
stack frame were reused among local variables, which makes debugging challenging
(sometimes several variables in the debugger look like they change even when
only one is "live" at any point).
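For anyone who wants to see the related effect of `Box::pin` in isolation, here is a minimal sketch using only the standard library. It does not reproduce the measurements from this PR. It only illustrates the general mechanism that an inline `.await` embeds the callee's state in the caller's future, while a boxed future costs the caller only a pointer. The names `big_step`, `inline_caller`, `boxed_caller`, `inline_size`, and `boxed_size` are hypothetical and not part of DataFusion, and the 4096-byte buffer is an arbitrary stand-in for a large async function.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::Pin;

// Hypothetical stand-in for a large async function such as one of the
// DDL handlers in the diff above.
async fn big_step() -> u8 {
    let buf = [1u8; 4096];
    // `buf` is used after this await point, so it has to be stored
    // inside the future's state.
    std::future::ready(()).await;
    buf[0]
}

// Awaits the inner future inline: its state is embedded in this future.
async fn inline_caller() -> u8 {
    big_step().await
}

// Boxes the inner future: this future only stores a heap pointer.
async fn boxed_caller() -> u8 {
    let fut: Pin<Box<dyn Future<Output = u8> + Send>> = Box::pin(big_step());
    fut.await
}

// Size in bytes of the (not yet polled) future returned by each caller.
fn inline_size() -> usize {
    size_of_val(&inline_caller())
}

fn boxed_size() -> usize {
    size_of_val(&boxed_caller())
}

fn main() {
    println!("inline caller future: {} bytes", inline_size());
    println!("boxed caller future:  {} bytes", boxed_size());
}
```

The exact sizes depend on the compiler version, so none are quoted here. The inline variant should be at least as large as the buffer, and the boxed variant should be much smaller.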
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.