tustvold commented on issue #13692:
URL: https://github.com/apache/datafusion/issues/13692#issuecomment-2543174655

   I've pushed a simple example to 
[io_stall](https://github.com/tustvold/io_stall/blob/main/src/rayon.rs) that 
glues together [rayon](https://docs.rs/rayon/latest/rayon/) and 
[async_task](https://docs.rs/async-task/latest/async_task/) to yield an async 
scheduler that can accommodate CPU-bound tasks without starving IO.
   
   ```
   cargo run --release --bin tokio -- --cpu-duration 1s --concurrency 7
       Finished `release` profile [optimized] target(s) in 0.05s
        Running `target/release/tokio --cpu-duration 1s --concurrency 7`
   Average duration of 1002 ms (IO 2 ms) over 1 samples, throughput 0.9975502 rps
   Average duration of 2002 ms (IO 1002 ms) over 1 samples, throughput 0.9996063 rps
   Average duration of 3002 ms (IO 2002 ms) over 1 samples, throughput 0.9998498 rps
   Average duration of 3000 ms (IO 2000 ms) over 1 samples, throughput 0.999554 rps
   Average duration of 5003 ms (IO 4003 ms) over 1 samples, throughput 0.9995086 rps
   Average duration of 6003 ms (IO 5003 ms) over 1 samples, throughput 0.9998254 rps
   Average duration of 4001 ms (IO 3001 ms) over 1 samples, throughput 0.99941516 rps
   ```
   
   vs
   
   ```
   cargo run --release --bin rayon -- --cpu-duration 1s --concurrency 7
      Compiling io_stall v0.1.0 (/home/raphael/repos/scratch/io_stall)
       Finished `release` profile [optimized] target(s) in 0.45s
        Running `target/release/rayon --cpu-duration 1s --concurrency 7`
   Average duration of 1002 ms (IO 2 ms) over 1 samples, throughput 0.9976903 rps
   Average duration of 1002 ms (IO 2 ms) over 7 samples, throughput 6.994286 rps
   Average duration of 1000 ms (IO 0 ms) over 7 samples, throughput 6.9929976 rps
   Average duration of 1000 ms (IO 0 ms) over 7 samples, throughput 6.9927454 rps
   Average duration of 1000 ms (IO 0 ms) over 7 samples, throughput 6.994525 rps
   Average duration of 1000 ms (IO 0 ms) over 7 samples, throughput 6.993674 rps
   Average duration of 1000 ms (IO 0 ms) over 7 samples, throughput 6.993697 rps
   Average duration of 1000 ms (IO 0 ms) over 7 samples, throughput 6.9924793 rps
   ```
   
   I am sure there are ways to improve this, but I think it has some quite 
interesting properties, in particular:
   
   * Mostly a drop-in replacement for tokio::spawn
   * Less than 100 lines of code to maintain
   * Avoids IO starvation
   * Allows using rayon's very ergonomic parallelism options
   * Preserves the thread-locality originating from the way non-blocking 
operators are recursively "composed"
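
To make the shape of the idea concrete, here is a minimal, std-only sketch. It is not the code in the linked repository: a plain thread pool stands in for rayon, a hand-written `Wake` implementation stands in for async_task, and the names `Pool`, `Task`, `YieldNow` and `run_demo` are hypothetical. The point it illustrates is that "scheduling" a task just means pushing it back onto the CPU pool's queue, so a future that returns `Pending` frees its worker instead of occupying it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send>>;

/// A spawned future plus a handle to the pool's run queue (hypothetical type).
struct Task {
    future: Mutex<Option<BoxFuture>>,
    queue: Mutex<Sender<Arc<Task>>>,
}

impl Wake for Task {
    /// Waking a task re-queues it on the pool; this plays the role of the
    /// schedule function in the rayon + async_task approach.
    fn wake(self: Arc<Self>) {
        let queue = self.queue.lock().unwrap().clone();
        let _ = queue.send(self);
    }
}

/// A tiny worker pool standing in for rayon (assumption, not rayon's API).
struct Pool {
    queue: Sender<Arc<Task>>,
}

impl Pool {
    fn new(workers: usize) -> Self {
        let (tx, rx) = channel::<Arc<Task>>();
        let rx = Arc::new(Mutex::new(rx));
        for _ in 0..workers {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // The queue lock is released at the end of this statement,
                // so other workers can dequeue while this one polls.
                let next = rx.lock().unwrap().recv();
                let task = match next {
                    Ok(task) => task,
                    Err(_) => break, // every sender is gone: shut down
                };
                let mut slot = task.future.lock().unwrap();
                if let Some(mut fut) = slot.take() {
                    let waker = Waker::from(Arc::clone(&task));
                    let mut cx = Context::from_waker(&waker);
                    // Poll once; if still pending, park the future until woken.
                    if fut.as_mut().poll(&mut cx).is_pending() {
                        *slot = Some(fut);
                    }
                }
            });
        }
        Pool { queue: tx }
    }

    /// Spawn a future on the pool; the receiver yields its output.
    fn spawn<F, T>(&self, future: F) -> Receiver<T>
    where
        F: Future<Output = T> + Send + 'static,
        T: Send + 'static,
    {
        let (tx, rx) = channel();
        let boxed: BoxFuture = Box::pin(async move {
            let _ = tx.send(future.await);
        });
        let task = Arc::new(Task {
            future: Mutex::new(Some(boxed)),
            queue: Mutex::new(self.queue.clone()),
        });
        let _ = self.queue.send(task);
        rx
    }
}

/// Returns `Pending` once and immediately requests a re-poll.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Spawns seven small tasks that each yield once mid-computation.
fn run_demo() -> Vec<u64> {
    let pool = Pool::new(4);
    let handles: Vec<Receiver<u64>> = (0..7u64)
        .map(|i| {
            pool.spawn(async move {
                let partial: u64 = (0..=i).sum();
                YieldNow(false).await; // hand the worker back to the pool
                partial * 2
            })
        })
        .collect();
    handles.into_iter().map(|h| h.recv().unwrap()).collect()
}

fn main() {
    println!("{:?}", run_demo());
}
```

In the real thing the queue, work stealing and thread management would presumably come from rayon, and the task/waker plumbing from async_task, which is why so little code is needed.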
   
   However, it is important to highlight that with this approach IO will still 
degrade badly once CPU resources are saturated. Where there is a clear IO 
boundary, e.g. `AsyncFileReader::get_bytes`, it may still be worthwhile to spawn 
that as a dedicated task so that it can run to completion without needing to 
"find time" on the rayon pool. That can be done later as an optimisation if 
people run into such issues. Ultimately this fixes the major issue, where 
concurrency nosedives long before CPU resources are saturated, with limited 
shenanigans.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

