[ https://issues.apache.org/jira/browse/HDDS-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blough resolved HDDS-12169.
--------------------------------
    Resolution: Duplicate

Naturally I don't find the issue successfully until I search for my own. Resolving as a duplicate of HDDS-7265.

> Add a RackScatter container placement policy for RATIS
> ------------------------------------------------------
>
>                 Key: HDDS-12169
>                 URL: https://issues.apache.org/jira/browse/HDDS-12169
>             Project: Apache Ozone
>          Issue Type: New Feature
>    Affects Versions: 1.4.0
>            Reporter: Ryan Blough
>            Priority: Minor
>
> I observed a scenario where a commercial Ozone user had rack awareness
> configured, using standard RATIS 3 containers, and they experienced data loss
> due to a series of unfortunate events.
> The key bad event for them is that their hardware was managed by a vendor,
> who shut down one of their racks for maintenance by surprise. This took 2 out
> of 3 replicas down for long enough to trigger replication, and in multiple
> cases the remaining containers had been impacted by the opened-twice because
> of volume failure problem, resulting ultimately in data loss.
> It seems like this problem would have been averted if the loss of a single
> rack didn't make a majority of replicas unavailable.
> I believe the solution is to provide a container placement policy for RATIS
> like the RackScatter policy for EC.