alamb opened a new pull request #9605:
URL: https://github.com/apache/arrow/pull/9605


   # Rationale
   
   As @edrevo pointed out in 
https://github.com/apache/arrow/pull/9523#issuecomment-786911328, mixing 
`crossbeam` channels (which are not designed for `async` code and can block 
task threads) with `async` code such as DataFusion can lead to deadlock.
   
   At least one of the crossbeam uses predates DataFusion being async (e.g. the 
one in the parquet reader). The use of crossbeam in the repartition operator in 
#8982 may have resulted from the re-use of the same pattern.
   
   # Changes
   
   1. Removes the use of crossbeam channels from DataFusion (in 
`RepartitionExec` and `ParquetExec`) and replaces them with tokio channels 
(which are designed for `async` code).
   2. Removes the `crossbeam` dependency entirely
   3. Removes use of the `multi_thread` executor in tests (e.g. 
`#[tokio::test(flavor = "multi_thread")]`), which can mask hangs
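
   To illustrate why a channel whose receive yields instead of blocking matters on a single-threaded executor, here is a minimal std-only sketch. It is not tokio's implementation and not the code in this PR: `Queue`, `Recv`, `YieldNow`, `NoopWake`, and `run_single_threaded` are hypothetical names, and the executor is a toy that busy-polls its tasks. The consumer is polled first while the queue is empty. Because `Recv` returns `Poll::Pending` rather than blocking the thread, the producer still gets a turn. A blocking receive at the same point would stall the only thread and hang.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical toy queue shared by producer and consumer on one thread.
// A queued `None` marks the end of the stream.
type Queue = Rc<RefCell<VecDeque<Option<i32>>>>;

/// Resolves to the next queued item, or returns `Pending` (yielding to the
/// executor) when the queue is empty.
struct Recv(Queue);

impl Future for Recv {
    type Output = Option<i32>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        match self.0.borrow_mut().pop_front() {
            Some(item) => Poll::Ready(item),
            None => Poll::Pending, // do not block: let other tasks run
        }
    }
}

/// Returns `Pending` once so that other tasks get a turn.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

async fn producer(q: Queue) {
    for i in 0..3 {
        q.borrow_mut().push_back(Some(i));
        YieldNow(false).await;
    }
    q.borrow_mut().push_back(None); // signal end of stream
}

async fn consumer(q: Queue, out: Rc<RefCell<Vec<i32>>>) {
    while let Some(v) = Recv(q.clone()).await {
        out.borrow_mut().push(v);
    }
}

// Waker that does nothing; the toy executor below simply re-polls every task.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Runs both tasks on the current thread with round-robin polling.
fn run_single_threaded() -> Vec<i32> {
    let q: Queue = Rc::new(RefCell::new(VecDeque::new()));
    let out = Rc::new(RefCell::new(Vec::new()));

    let mut tasks: Vec<Pin<Box<dyn Future<Output = ()>>>> = Vec::new();
    // The consumer goes first, so its first poll sees an empty queue.
    tasks.push(Box::pin(consumer(q.clone(), out.clone())));
    tasks.push(Box::pin(producer(q)));

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Poll every task in turn and drop the ones that have finished.
    while !tasks.is_empty() {
        tasks.retain_mut(|t| t.as_mut().poll(&mut cx).is_pending());
    }

    let result = out.borrow().clone();
    result
}
```

   Calling `run_single_threaded()` returns the three produced values in order. A real runtime such as tokio wakes tasks through the `Waker` rather than re-polling them in a loop, but the property this sketch shows is the same: an empty receive hands control back to the executor.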
   
   # Kudos / Thanks
   
   This PR incorporates the work of @seddonm1 from 
https://github.com/apache/arrow/pull/9603 and @edrevo in 
https://github.com/edrevo/arrow/tree/remove-crossbeam (namely commit 
97c256c4f76b8185311f36a7b27e317588904a3a). A big thanks to both of them for 
their help in this endeavor.
   
   



