marin-ma commented on PR #11722:
URL: https://github.com/apache/gluten/pull/11722#issuecomment-4037894211

   @guowangy In general, random I/O is considered a bottleneck in shuffle, 
which is why remote shuffle service projects such as Celeborn and Uniffle 
exist. A remote shuffle service usually coalesces the shuffle outputs from the 
mapper side to reduce random I/O. However, the design in this PR seems to go 
in the opposite direction, since it may introduce more random I/O during reads.
   
   Directly writing the segments to the data file would make the partition 
writer logic simpler, but we intentionally didn't choose that approach because 
of the consideration above. I'm not sure whether your test was run on a single 
node or on a cluster. If it was on a single node where disk I/O is not the 
bottleneck, then the solution may not be practical in real use cases.
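
   To make the random I/O concern concrete, here is a minimal sketch. It is 
not Gluten code and does not describe the actual file format in this PR. It 
assumes a simplified model in which a data file is just a sequence of 
per-partition segments, and every contiguous run of one partition costs a 
reducer roughly one seek. The function name `count_read_ranges` and the sizes 
used are hypothetical.

```python
def count_read_ranges(layout, partition):
    """Count contiguous runs of `partition` in a data file layout.

    `layout` lists partition ids in on-disk order, one entry per segment.
    In this simplified model each run costs one seek plus a sequential read.
    """
    runs = 0
    previous = None
    for pid in layout:
        # A new run starts whenever we enter the partition from another one.
        if pid == partition and previous != partition:
            runs += 1
        previous = pid
    return runs


# Hypothetical mapper output: 3 reduce partitions, 4 write rounds.
num_partitions = 3
num_rounds = 4

# Assumed "direct write" layout: segments appended in arrival order,
# so every round contributes one segment per partition.
interleaved = [pid for _ in range(num_rounds) for pid in range(num_partitions)]

# Assumed coalesced layout: all segments of a partition are contiguous.
coalesced = [pid for pid in range(num_partitions) for _ in range(num_rounds)]

interleaved_ranges = [count_read_ranges(interleaved, p) for p in range(num_partitions)]
coalesced_ranges = [count_read_ranges(coalesced, p) for p in range(num_partitions)]

print("read ranges per partition, interleaved:", interleaved_ranges)
print("read ranges per partition, coalesced:  ", coalesced_ranges)
```

   Under these assumptions, the number of separate ranges a reducer has to 
fetch in the interleaved layout grows with the number of write rounds, while 
in the coalesced layout it stays at one per partition. Whether that matters in 
practice depends on the storage device and on how many segments are written.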
   
   Besides, based on our experience, the external shuffle service is usually 
enabled in real production environments because it provides better stability 
when an executor process goes down, so it is effectively a must-have feature 
that the shuffle framework should support.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

