[GitHub] [incubator-uniffle] advancedxy commented on pull request #435: [Feature] Support multiple disk selection for one partition when using local storage

GitBox Mon, 26 Dec 2022 00:00:50 -0800


advancedxy commented on PR #435:
URL: 
https://github.com/apache/incubator-uniffle/pull/435#issuecomment-1364975415


   > > Seems like we haven't get rid of storageId in proto.
   > 
   > > I haven't thought it thoroughly, but it should be possible to eliminate 
the need of storageId.
   > 
   > Looking forward to better design.
   
   I went through this PR today. It may require some id to indicate which 
shuffle_data/shuffle_index current points, but I don't think it's a good design 
to have the client passing around the storageIndex/storageId.
   
   One possible solution would be similar with how hdfs handles multiple 
IndexFile/DataFile. 
   For local shuffle client:
   1. Get all index data once, resulting a list of <ShuffleIndexResult>
   2. Create a list of `LocalFileClientReaderHandler`, which corresponding a 
list of shuffle data file.
   3. Iterate through sequentially/parallelly to fetch the data.
   
   In step 2 and 3, you may pass a `ShuffleDataFileId`/ `ShuffleIndexId` to 
indicate which shuffle file is fetched. It may be effectively the same as 
StorageId, but I believe it's more natural to use `ShuffleDataFileId` here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [incubator-uniffle] advancedxy commented on pull request #435: [Feature] Support multiple disk selection for one partition when using local storage

Reply via email to