zuston opened a new issue, #297:
URL: https://github.com/apache/incubator-uniffle/issues/297

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the [issues](https://github.com/apache/incubator-uniffle/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the bug
   
   When the MEMORY_LOCALFILE storage type is enabled on a Uniffle shuffle server (here with 4 disks), the first event of (appId:x, shuffleId:x, partition:1) is flushed from memory to a local file.
   When `LocalStorageManager` selects the storage, the disk returned by
`localStorages.get(ShuffleStorageUtils.getStorageIndex(localStorages.size(),event.getAppId(),event.getShuffleId(),event.getStartPartition()))`
   may be marked as corrupted, for example because it has reached the high watermark (suppose this is disk0), so the flush falls back to disk1.
   
   By the time the second event of (appId:x, shuffleId:x, partition:1) is flushed, disk0 has recovered, so the second event's data is flushed to disk0.
   
   The reading client then fetches the data from disk0 directly and ignores the data on disk1, so the application loses part of its data.
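   
   The sequence above can be reproduced with a minimal sketch. Everything in it is hypothetical: `Disk`, `storageIndex`, `select` and `simulate` are illustrative stand-ins rather than Uniffle classes, and the hash is an assumption that only mimics the idea of `ShuffleStorageUtils.getStorageIndex`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Illustration only: none of these types are Uniffle's real classes.
class StaticSelectionDemo {

    // Hypothetical stand-in for one local disk.
    static final class Disk {
        final String name;
        boolean corrupted;
        final List<String> blocks = new ArrayList<>();

        Disk(String name) {
            this.name = name;
        }
    }

    // Assumed hash; the real getStorageIndex may compute the index differently.
    static int storageIndex(int diskCount, String appId, int shuffleId, int partition) {
        return Math.floorMod(Objects.hash(appId, shuffleId, partition), diskCount);
    }

    // Static selection: start at the hashed disk, fall back to the next healthy one.
    static Disk select(List<Disk> disks, String appId, int shuffleId, int partition) {
        int start = storageIndex(disks.size(), appId, shuffleId, partition);
        for (int i = 0; i < disks.size(); i++) {
            Disk candidate = disks.get((start + i) % disks.size());
            if (!candidate.corrupted) {
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy disk available");
    }

    // Returns {blocks written, blocks visible to a reader that only checks the hashed disk}.
    static int[] simulate() {
        List<Disk> disks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            disks.add(new Disk("disk" + i));
        }
        Disk preferred = disks.get(storageIndex(disks.size(), "app-x", 0, 1));

        preferred.corrupted = true;                          // first flush: hashed disk is unusable
        select(disks, "app-x", 0, 1).blocks.add("block-1");  // lands on the fallback disk

        preferred.corrupted = false;                         // disk recovers before the second flush
        select(disks, "app-x", 0, 1).blocks.add("block-2");  // lands on the hashed disk

        int written = 0;
        for (Disk disk : disks) {
            written += disk.blocks.size();
        }
        int visible = select(disks, "app-x", 0, 1).blocks.size();
        return new int[] {written, visible};
    }

    public static void main(String[] args) {
        int[] result = simulate();
        System.out.println("written=" + result[0] + " visibleToReader=" + result[1]);
    }
}
```

   In this sketch two blocks are written for the partition, but a reader that resolves the disk through the same hash sees only one of them.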
   
   ### Affects Version(s)
   
   master
   
   ### Uniffle Server Log Output
   
   _No response_
   
   ### Uniffle Engine Log Output
   
   _No response_
   
   ### Uniffle Server Configurations
   
   _No response_
   
   ### Uniffle Engine Configurations
   
   _No response_
   
   ### Additional context
   
   Currently, the storage that an event's data is flushed to is determined by the hash of appId, shuffleId and partitionId together with the number of local storages. This is a static strategy.
   To solve the corrupted-storage problem, we should record which storage each partition has been flushed to and keep using it for that partition.
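   
   One possible shape for that state, as a rough sketch and not the actual patch: remember the disk chosen at a partition's first flush and return it for every later flush. `StickySelectionSketch`, `Disk`, `storageIndex`, `select` and `removeApp` are hypothetical names, and the hash is again only an assumption about how the index is derived.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: a sticky per-partition disk assignment, not Uniffle code.
class StickySelectionSketch {

    // Hypothetical stand-in for one local disk.
    static final class Disk {
        final String name;
        volatile boolean corrupted;

        Disk(String name) {
            this.name = name;
        }
    }

    private final List<Disk> disks;
    // Remembers which disk each (appId, shuffleId, partition) was first flushed to.
    private final Map<String, Disk> assigned = new ConcurrentHashMap<>();

    StickySelectionSketch(List<Disk> disks) {
        this.disks = disks;
    }

    // Assumed hash; the real getStorageIndex may compute the index differently.
    static int storageIndex(int diskCount, String appId, int shuffleId, int partition) {
        return Math.floorMod(Objects.hash(appId, shuffleId, partition), diskCount);
    }

    // The first call picks a healthy disk; later calls return the remembered one,
    // even if the originally hashed disk has recovered in the meantime.
    Disk select(String appId, int shuffleId, int partition) {
        String key = appId + "/" + shuffleId + "/" + partition;
        return assigned.computeIfAbsent(key, k ->
            firstHealthy(storageIndex(disks.size(), appId, shuffleId, partition)));
    }

    private Disk firstHealthy(int start) {
        for (int i = 0; i < disks.size(); i++) {
            Disk candidate = disks.get((start + i) % disks.size());
            if (!candidate.corrupted) {
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy disk available");
    }

    // Drops the remembered assignments once an application's data is purged.
    void removeApp(String appId) {
        assigned.keySet().removeIf(key -> key.startsWith(appId + "/"));
    }
}
```

   The sketch deliberately leaves open what should happen when the remembered disk itself becomes corrupted later, and a reader would need access to the same assignment to locate the data.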
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!

