alamb commented on PR #11453:
URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2227453348
> It doesn't look like it's a single struct/enum filling the stack, but
> multiple small structs/enums being copied, which kinda explains why the release
> build passes, but not the debug build.
I think Rust debug builds keep space on the stack for all local variables to
make debugging easier, but that also makes for much larger stack frames in
debug builds.
>
> Maybe we've been slightly increasing the stack usage with every change and
> this PR caused it to finally overflow.
Yes, I think this is a likely explanation.
>
> I think we should temporarily increase the stack size to something > 2 MB
> (I'm thinking 2.5 MB) for now and try to find the problematic test in a
> follow-up, just to keep things moving.
>
> To replicate, change `datafusion/sqllogictest/bin/sqllogictests.rs:54` from
>
> ```rust
> #[tokio::main]
> #[cfg(not(target_family = "windows"))]
> pub async fn main() -> Result<()> {
>     run_tests().await
> }
> ```
>
> to
>
> ```rust
> #[cfg(not(target_family = "windows"))]
> fn main() -> Result<()> {
>     tokio::runtime::Builder::new_multi_thread()
>         .thread_stack_size(2 * 1024 * 1024) // 2 MB
>         // .thread_stack_size(2 * 1024 * 1024 - 128 * 1024) // 1.9 MB
>         // .thread_stack_size(2 * 1024 * 1024 + 128 * 1024) // 2.1 MB
>         .enable_all()
>         .build()
>         .unwrap()
>         .block_on(run_tests())
> }
> ```
>
> and run the tests `RUST_BACKTRACE=1 cargo test --lib --tests --bins --features avro,json,backtrace`
>
> Also, why do we spawn a thread in the windows main function that spawns
> tokio threads? Is there a reason why we don't spawn the tokio threads directly?
> `datafusion/sqllogictest/bin/sqllogictests.rs:36`
>
> ```rust
> #[cfg(target_family = "windows")]
> pub fn main() {
>     // Tests from `tpch/tpch.slt` fail with stackoverflow with the default stack size.
>     thread::Builder::new()
>         .stack_size(2 * 1024 * 1024) // 2 MB
>         .spawn(move || {
>             tokio::runtime::Builder::new_multi_thread()
>                 .enable_all()
>                 .build()
>                 .unwrap()
>                 .block_on(async { run_tests().await })
>                 .unwrap()
>         })
>         .unwrap()
>         .join()
>         .unwrap();
> }
> ```
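
For anyone who wants to experiment with the stack-size effect outside of tokio, here is a minimal std-only sketch. It is not code from the DataFusion repository: `run_with_stack` and `recurse` are hypothetical names made up for illustration, and the 1 KiB buffer per frame is just an assumed stand-in for the "many small structs being copied" pattern described above. It only shows how `std::thread::Builder::stack_size` can be used to run a stack-hungry function with an explicit budget.

```rust
use std::hint::black_box;
use std::thread;

/// Hypothetical helper: run `f` on a fresh thread with an explicit stack size.
/// Note that a real stack overflow aborts the process; it is not a panic
/// that `join` could report.
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

/// Hypothetical stack-hungry function: every frame keeps a 1 KiB local alive.
/// `black_box` discourages the optimizer from removing the buffer.
fn recurse(depth: usize) -> usize {
    let buf = black_box([1u8; 1024]);
    if depth == 0 {
        buf[0] as usize
    } else {
        buf[0] as usize + recurse(depth - 1)
    }
}

fn main() {
    // 8 MiB is deliberately generous for 513 frames; shrinking this value
    // (or raising the depth) is one way to probe where a build overflows.
    let total = run_with_stack(8 * 1024 * 1024, || recurse(512));
    println!("recursion finished, result = {total}");
}
```

Comparing how small the stack can get in `cargo run` versus `cargo run --release` before the process aborts should give a rough feel for the debug/release difference in frame size, though the exact numbers will depend on the compiler version and platform.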
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]