Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/21402

> StreamRequest will not block the server netty handler thread.

Hmm, I'm not so sure that's accurate. I think the main difference is that there is currently no code path that sends a `StreamRequest` to the shuffle service. But if, for example, many executors request files from the driver simultaneously, you could potentially end up in the same situation. It's a less serious issue, since I think it's a lot less common for large files to be transferred that way, at least after startup.

I took a look at the code, and it seems the actual change that avoids the disk thrashing is the synchronization done in the chunk fetch handler; it only allows a certain number of threads to actually do disk reads simultaneously. That's an improvement already, but a couple of questions popped into my head when I read your comment:

- how does that relate to `maxChunksBeingTransferred()`? Aren't both settings effectively a limit on the number of requests being serviced, making the existing one a little redundant?
- would there be benefits in trying to add some sort of disk affinity to these threads? e.g. send fetch requests hitting different disks to different queues.
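For readers following along, the throttling idea being discussed can be sketched roughly like this. This is a hypothetical, simplified illustration (class and method names are invented, not Spark's actual `ChunkFetchRequestHandler` code): a semaphore caps how many handler threads perform disk reads at once, so a flood of fetch requests queues up instead of thrashing the disks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the throttling pattern discussed above:
// at most `maxConcurrentReads` threads may be inside a disk read at a time.
public class ThrottledChunkReads {
    private final Semaphore diskPermits;

    public ThrottledChunkReads(int maxConcurrentReads) {
        this.diskPermits = new Semaphore(maxConcurrentReads);
    }

    // Wraps a (simulated) disk read; excess callers block on the semaphore
    // rather than issuing more concurrent reads.
    public void readChunk(Runnable diskRead) throws InterruptedException {
        diskPermits.acquire();
        try {
            diskRead.run();
        } finally {
            diskPermits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        final int maxReads = 2;
        ThrottledChunkReads throttle = new ThrottledChunkReads(maxReads);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger observedMax = new AtomicInteger();

        // 8 "handler" threads race to issue 32 simulated chunk reads.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 32; i++) {
            pool.submit(() -> {
                try {
                    throttle.readChunk(() -> {
                        int now = inFlight.incrementAndGet();
                        observedMax.accumulateAndGet(now, Math::max);
                        try { Thread.sleep(5); } catch (InterruptedException ignored) { }
                        inFlight.decrementAndGet();
                    });
                } catch (InterruptedException ignored) { }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);

        // The semaphore guarantees concurrency never exceeds the cap.
        if (observedMax.get() > maxReads) {
            throw new IllegalStateException("cap exceeded: " + observedMax.get());
        }
        System.out.println("max concurrent reads observed: " + observedMax.get());
    }
}
```

Note this caps concurrency globally; the disk-affinity question above would instead partition requests into per-disk queues, each with its own cap.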