adamreeve commented on PR #16738: URL: https://github.com/apache/datafusion/pull/16738#issuecomment-3177700851
I've pushed an example fix for the deadlock to my fork here: https://github.com/adamreeve/datafusion/commit/68f55a88d55647182b7d81b31c40a8e0805b15de

This isn't a great solution, as it means that creating the column writers could be blocked for a long time while a row group is being written. I'm not sure how we can work around this without exposing more of the low-level API in arrow-rs. If we could take an `ArrowRowGroupWriterFactory` in `spawn_parquet_parallel_serialization_task`, that would avoid the need for this locking. It should also make it possible to use the correct row group index.
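To make the trade-off concrete, here is a minimal sketch using only the Rust standard library. It is not the code from the linked commit and does not use the arrow-rs or DataFusion APIs: `FileWriter`, `RowGroupWriterFactory`, and all function names are hypothetical stand-ins, and the row group write is simulated with a sleep. It only illustrates why sharing one lock makes column writer creation wait for a row group write, and why a separate factory would not.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a file writer that both hands out column
/// writers and writes finished row groups. Not an arrow-rs type.
struct FileWriter {
    next_row_group: usize,
}

impl FileWriter {
    /// Pretend to create the column writers for the next row group.
    fn create_column_writers(&mut self) -> usize {
        let index = self.next_row_group;
        self.next_row_group += 1;
        index
    }

    /// Pretend to write a row group; the sleep simulates slow I/O.
    fn write_row_group(&mut self, write_time: Duration) {
        thread::sleep(write_time);
    }
}

/// Hypothetical factory that creates column writers without touching the
/// file writer, given an explicit row group index (an assumption).
struct RowGroupWriterFactory;

impl RowGroupWriterFactory {
    fn create_column_writers(&self, row_group_index: usize) -> usize {
        row_group_index
    }
}

/// Spawn a thread that takes the writer lock and holds it for the whole
/// simulated write. Returns only once the lock is definitely held.
fn start_row_group_write(
    writer: &Arc<Mutex<FileWriter>>,
    write_time: Duration,
) -> thread::JoinHandle<()> {
    let (locked_tx, locked_rx) = mpsc::channel();
    let writer = Arc::clone(writer);
    let handle = thread::spawn(move || {
        let mut guard = writer.lock().unwrap();
        locked_tx.send(()).unwrap();
        guard.write_row_group(write_time);
    });
    locked_rx.recv().unwrap();
    handle
}

/// Time spent waiting to create column writers when they come from the
/// same locked writer that is busy writing a row group.
fn wait_with_shared_lock(write_time: Duration) -> Duration {
    let writer = Arc::new(Mutex::new(FileWriter { next_row_group: 0 }));
    let handle = start_row_group_write(&writer, write_time);
    let start = Instant::now();
    let _index = writer.lock().unwrap().create_column_writers();
    let waited = start.elapsed();
    handle.join().unwrap();
    waited
}

/// Same scenario, but the column writers come from a separate factory,
/// so no lock on the file writer is needed.
fn wait_with_factory(write_time: Duration) -> Duration {
    let writer = Arc::new(Mutex::new(FileWriter { next_row_group: 0 }));
    let factory = RowGroupWriterFactory;
    let handle = start_row_group_write(&writer, write_time);
    let start = Instant::now();
    let _index = factory.create_column_writers(1);
    let waited = start.elapsed();
    handle.join().unwrap();
    waited
}
```

Calling `wait_with_shared_lock` and `wait_with_factory` with the same simulated write time shows the difference: the first waits for roughly the full write, while the second returns almost immediately. Passing the row group index explicitly to the factory is also an assumption here, chosen to mirror the point about using the correct row group index.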