joemarshall commented on PR #35672: URL: https://github.com/apache/arrow/pull/35672#issuecomment-1591883165
> Thank you for working on this. This will be a great feature. I haven't fully understood everything but I think I am getting the picture. Could you just give a general description of how you might have multiple serial executors? Is this from multiple user threads? Or is this because the CPU executor and the I/O executor need to be two distinct executors for some reason?

There are a couple of ways to end up with multiple executors. Firstly, there's lots of code that assumes that while an executor is running a CPU-based task, it can fire off I/O calls and wait for them. In order to support this kind of waiting in CPU processing code, we still have to keep track of things in multiple executors. I think there is also some code elsewhere that uses multiple executors to run things in parallel whilst keeping each set of executions neatly serialised, and I think there are things that use nested executors.

I found it easier to keep the multiple executors in existence logically, even if in reality there is only ever one thing happening at any time. I did try putting everything in a single executor and a lot of tests failed, so I figured it made more sense to keep the logical structure.

The only other difference is that when a serial executor is held up by I/O or whatever, instead of waiting on a condition it loops through all the other executors to see if they have any tasks that need running. Because of that, the concept of a current executor, i.e. the innermost executor being called, is used instead of the thread id to detect whether a particular executor is running the currently executing task.

The other potential advantage of keeping the logical structure like this is that in future there is potential to support asynchronous I/O even on systems without threading, e.g. in WebAssembly there is an extension (currently experimental only, but standards track) which enables creating and waiting on JavaScript futures.
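
As a rough illustration of the "loop through the other executors instead of waiting on a condition" idea described above, here is a minimal sketch. It is written in Python rather than Arrow's C++, and every name in it (`SerialExecutor`, `all_executors`, `current`, `run_until`, `demo`) is made up for the sketch and is not taken from the PR.

```python
from collections import deque


class SerialExecutor:
    """Toy single-threaded executor (hypothetical, not Arrow's class)."""

    all_executors = []  # assumed registry of every logical executor
    current = None      # innermost executor currently running a task

    def __init__(self, name):
        self.name = name
        self.tasks = deque()
        SerialExecutor.all_executors.append(self)

    def spawn(self, fn):
        self.tasks.append(fn)

    def owns_current_task(self):
        # Stands in for a thread-id comparison: with no threads,
        # "am I running this task?" means "am I the current executor?".
        return SerialExecutor.current is self

    def run_one(self):
        """Run one queued task, if any; return True if a task ran."""
        if not self.tasks:
            return False
        task = self.tasks.popleft()
        previous = SerialExecutor.current
        SerialExecutor.current = self
        try:
            task()
        finally:
            # Restore the outer executor so nested calls unwind correctly.
            SerialExecutor.current = previous
        return True

    def run_until(self, is_done):
        """Run tasks until is_done() is true, without ever blocking."""
        while not is_done():
            if self.run_one():
                continue
            # Held up: rather than waiting on a condition variable,
            # give the other executors a chance to make progress.
            progressed = any(
                other.run_one()
                for other in SerialExecutor.all_executors
                if other is not self
            )
            if not progressed:
                raise RuntimeError("no executor has a runnable task")


def demo():
    SerialExecutor.all_executors.clear()
    cpu, io = SerialExecutor("cpu"), SerialExecutor("io")
    result, log = {}, []

    def read_block():
        log.append(("io", SerialExecutor.current.name))
        result["data"] = b"abc"

    def cpu_task():
        log.append(("cpu start", SerialExecutor.current.name))
        io.spawn(read_block)
        # A CPU task firing an I/O call and waiting for it.
        cpu.run_until(lambda: "data" in result)
        log.append(("cpu resumed", SerialExecutor.current.name))
        result["length"] = len(result["data"])

    cpu.spawn(cpu_task)
    cpu.run_until(lambda: "length" in result)
    return result, log
```

In `demo()`, the CPU task's wait is what drives the I/O executor's task, and `current` is saved and restored around each task so that it always names the innermost executor.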
With that extension, whilst you still often don't have threads, you can do I/O in parallel with computation. I think that could integrate with this code with minimal modifications.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
