westonpace commented on issue #35649: URL: https://github.com/apache/arrow/issues/35649#issuecomment-1585013681
Yes, I think this is an example of a non-R thread calling R code. This is an Arrow worker thread running the query. The Arrow worker calls DuckDB and asks for the next batch; DuckDB then calls into R and asks for the next batch. Perhaps DuckDB needs something like `SafeCallIntoR`, but that wouldn't be trivial, as you probably know better than I do. If Acero knew that the source was going to call into R, then Acero could use `SafeCallIntoR` to fetch the next batch but, from Acero's perspective, it's just grabbing a batch from "some kind of C data producer". Another possibility would be to run the entire Acero plan synchronously (both I/O threads and CPU threads). Acero would never spawn a thread, so you would always be on the R thread, but that will tank performance (especially running the I/O tasks synchronously).
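
For readers unfamiliar with the pattern, here is a minimal C++ sketch of the general idea behind something like `SafeCallIntoR`: a worker thread that needs a callback run on one specific thread posts the call to a queue and waits, while the owning thread services the queue. This is not the arrow R package's implementation; `MainThreadQueue`, `SafeCall`, `RunUntilStopped`, `Stop`, and `DemoWorkerCallsBackOnOwnerThread` are hypothetical names, and the owner thread here only stands in for the R thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for "the R thread": the thread that constructs this
// object is the owner, and queued calls only ever run on that thread.
class MainThreadQueue {
 public:
  MainThreadQueue() : owner_(std::this_thread::get_id()) {}

  // May be called from any thread. On the owner thread the call runs inline;
  // otherwise it is queued and the caller can block on the returned future.
  std::future<int> SafeCall(std::function<int()> fn) {
    std::packaged_task<int()> task(std::move(fn));
    std::future<int> result = task.get_future();
    if (std::this_thread::get_id() == owner_) {
      task();
      return result;
    }
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
    return result;
  }

  // Must be called on the owner thread. Runs queued calls until Stop() has
  // been called and the queue is empty.
  void RunUntilStopped() {
    assert(std::this_thread::get_id() == owner_);
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      cv_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
      if (tasks_.empty()) return;  // only reachable once stopped_ is set
      std::packaged_task<int()> task = std::move(tasks_.front());
      tasks_.pop();
      lock.unlock();  // do not hold the lock while running user code
      task();
      lock.lock();
    }
  }

  void Stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopped_ = true;
    }
    cv_.notify_one();
  }

 private:
  const std::thread::id owner_;
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::packaged_task<int()>> tasks_;
  bool stopped_ = false;
};

// A worker thread asks for the "next batch" (here just an int); the producer
// callback must run on the owner thread. Returns true if it did.
bool DemoWorkerCallsBackOnOwnerThread() {
  MainThreadQueue owner_queue;
  const std::thread::id owner_id = std::this_thread::get_id();
  std::thread::id ran_on;
  int batch = 0;

  std::thread worker([&] {
    batch = owner_queue.SafeCall([&] {
      ran_on = std::this_thread::get_id();
      return 42;  // placeholder for a real batch
    }).get();
    owner_queue.Stop();
  });

  owner_queue.RunUntilStopped();  // the owner thread services the request
  worker.join();
  return ran_on == owner_id && batch == 42;
}
```

The sketch only works because the owner thread is sitting in `RunUntilStopped()` while the worker waits. If the owner thread were blocked on something else, the worker would wait forever, which is one reason retrofitting this into a third-party producer is not trivial.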
