zhengchenyu commented on PR #1660: URL: https://github.com/apache/incubator-uniffle/pull/1660#issuecomment-2072211403
> Thanks for your work. @zhengchenyu After reviewing the design doc and concrete code briefly, I have some questions about this feature.
>
> ### Spark client
> 1. What's the difference between RecordBlob and RecordBuffer? The combine and sort difference is not reflected in the names.
> 2. Why not make `RMWriteBufferManager` extend `WriteBufferManager`? `RMWriteBufferManager` can almost replace `WriteBufferManager`; we could control whether to sort by config.
> 3. Is the local sort at block level enough? Can we make the block bigger if it spills to file?
>
> ### Shuffle-Server
> 1. If the merge fails, the reducer should fail over to the original mechanism.
> 2. The merge process will introduce too much random read. If this happens on HDD, the whole process is terrible. So the key point is to make the block bigger (sort-merge them before flushing to disk to create a bigger block?). But it looks like this will break the original blockId mechanism.

Spark client

(1) RecordBlob is used for combine and RecordBuffer is used for sort.
(2) RMWriteBufferManager should extend WriteBufferManager. I just didn't want to affect other code. If the proposal is accepted, I will do it later.
(3) If we want bigger blocks, the best way is to increase the block size.

Shuffle-Server

(1) Yes. We need to fail over to the original RSS, especially if we can't load the class of the key, value, or comparator.
(2) I first tested on HDD, and it was terrible. Then I tested on an SSD cluster, and it was better. We can try to merge the blocks, but that needs a lot of modification. I just wanted to do it quickly, so I reused the original mechanism.
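
For illustration only (this is not code from the PR): a minimal sketch of the kind of merge discussed under Shuffle-Server (2), assuming every block arrives already sorted by key from the client-side local sort. The class `SortedBlockMerge`, the `Cursor` helper, and the in-memory `List` blocks are hypothetical stand-ins; the real implementation reads blocks from storage and uses the job's own key, value, and comparator classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: k-way merge of blocks that are each sorted already.
final class SortedBlockMerge {

  // One read position per block; 'head' is the next unmerged key of that block.
  private static final class Cursor<K> {
    private final Iterator<K> it;
    private K head;

    Cursor(Iterator<K> it) {
      this.it = it;
      this.head = it.next(); // caller guarantees the block is non-empty
    }

    // Moves to the next key; returns false when the block is exhausted.
    boolean advance() {
      if (!it.hasNext()) {
        return false;
      }
      head = it.next();
      return true;
    }
  }

  public static <K> List<K> merge(List<List<K>> blocks, Comparator<? super K> cmp) {
    // Order cursors by their current head so the smallest key is polled first.
    Comparator<Cursor<K>> byHead = (a, b) -> cmp.compare(a.head, b.head);
    PriorityQueue<Cursor<K>> heap = new PriorityQueue<>(byHead);

    for (List<K> block : blocks) {
      if (!block.isEmpty()) {
        heap.add(new Cursor<>(block.iterator()));
      }
    }

    List<K> merged = new ArrayList<>();
    while (!heap.isEmpty()) {
      Cursor<K> smallest = heap.poll();
      merged.add(smallest.head);
      // Re-insert the cursor only if its block still has keys left.
      if (smallest.advance()) {
        heap.add(smallest);
      }
    }
    return merged;
  }
}
```

In this sketch the heap holds one cursor per block, so with many small blocks the merge keeps switching between blocks. If each block sits at a different place on disk, that switching is where the random reads come from, which is consistent with the HDD result above; fewer, larger blocks mean fewer cursors.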
