joemarshall opened a new pull request, #35471:
URL: https://github.com/apache/arrow/pull/35471

   As previously discussed in #35176, this is a patch that adds an option 
`ARROW_DISABLE_THREADING`. When it is turned on, Arrow's thread pool and serial 
executors don't spawn threads, and instead run tasks in the main thread when 
futures are waited for.
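
   To illustrate the "run tasks when futures are waited for" idea, here is a 
minimal conceptual sketch in Python. It is not the code in this patch: the 
names `SerialNoThreadExecutor`, `DeferredFuture` and `run_one` are hypothetical, 
and error handling is left out. It only shows that submitting a task queues it, 
and that waiting on a future drains the queue on the calling thread.

```python
from collections import deque


class DeferredFuture:
    """Hypothetical future whose result is produced only when someone waits."""

    def __init__(self, executor):
        self._executor = executor
        self._done = False
        self._result = None

    def _set_result(self, value):
        self._result = value
        self._done = True

    def result(self):
        # Waiting drives the executor's queue on the calling thread
        # until this particular future has been completed.
        while not self._done:
            if not self._executor.run_one():
                raise RuntimeError("no pending tasks left to complete this future")
        return self._result


class SerialNoThreadExecutor:
    """Hypothetical executor that never spawns a thread."""

    def __init__(self):
        self._pending = deque()

    def submit(self, fn, *args):
        # Submitting only queues the task; nothing runs yet.
        future = DeferredFuture(self)
        self._pending.append((future, fn, args))
        return future

    def run_one(self):
        # Run the oldest queued task, if any, on the calling thread.
        if not self._pending:
            return False
        future, fn, args = self._pending.popleft()
        future._set_result(fn(*args))
        return True


log = []


def square(x):
    log.append(x)
    return x * x


executor = SerialNoThreadExecutor()
a = executor.submit(square, 2)
b = executor.submit(square, 3)
ran_before_wait = list(log)  # still empty: no task has run so far
b_value = b.result()         # runs the task for `a` first, then the one for `b`
a_value = a.result()         # already complete, returns immediately
```

   Tasks run in submission order, so waiting on a later future also completes 
the earlier ones. Nothing ever runs concurrently, which is why parts of Arrow 
are slower in this mode.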
   
   It doesn't mess with threading in projects included as dependencies (e.g. 
multithreaded malloc implementations), because if you're building for a 
non-threaded environment, you can't use those anyway.
   
   Where this is at right now: it runs the test suite okay, and I think it 
should work well enough to be a backend for pandas on emscripten/pyodide, so 
subject to reviews etc. it is worth merging (and would be jolly convenient for 
me if it was).
   
   What this means is:
   1) It is possible to use arrow in non-threaded emscripten/webassembly 
environments (with some build patches specific to emscripten which I'll put in 
once this is in)
   2) Most of arrow just works, albeit slower in parts.
   
   Things that don't work and probably won't:
   1) Server stuff that relies on threads. Not a massive problem, I think, 
because environments with threading restrictions are currently typically also 
restricted from making servers anyway (i.e. they are web browsers).
   2) Anything that relies on actually doing two things at once (for obvious 
reasons)
   
   Things that don't work yet and could be fixed in future:
   1) Use of asynchronous file/network APIs in emscripten, which would mean I/O 
could work efficiently in one thread.
   2) asofjoin - right now the implementation relies on `std::thread`. It needs 
refactoring to work with the thread pool like everything else in Arrow, but I'm 
not sure I am expert enough in the codebase to do it well.
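
   As a rough illustration of what "work with the thread pool" means for a 
component like asofjoin, here is a conceptual Python sketch. It does not 
describe the actual asofjoin code: `InlineExecutor` and `JoinWorker` are 
hypothetical names, and the "join" is just a sum per batch. The point is that a 
component which only calls `submit` on whatever executor it is given never 
needs a thread of its own.

```python
class InlineExecutor:
    """Hypothetical stand-in for a no-thread executor.

    Each submitted task runs immediately on the calling thread.
    """

    def submit(self, fn, *args):
        fn(*args)


class JoinWorker:
    """Hypothetical component that processes batches through an executor
    instead of owning a dedicated processing thread."""

    def __init__(self, executor):
        self._executor = executor
        self.outputs = []

    def on_batch(self, batch):
        # Schedule the processing as a task instead of handing the batch
        # to a thread that this object created itself.
        self._executor.submit(self._process, batch)

    def _process(self, batch):
        # Placeholder for the real per-batch work.
        self.outputs.append(sum(batch))


worker = JoinWorker(InlineExecutor())
worker.on_batch([1, 2, 3])
worker.on_batch([4, 5, 6])
```

   Python's `concurrent.futures.ThreadPoolExecutor` has a compatible `submit` 
method, so the same `JoinWorker` could be handed a real thread pool where 
threads are available. A component that starts its own thread has no such 
fallback when threads cannot be created.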


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
