On 11/06/2012 11:24 AM, Gandalf Corvotempesta wrote:
> 2012/11/6 Wido den Hollander <[email protected]>:
>> The setup described on that page has 90 nodes, so one node failing is a
>> little over 1% of the cluster which fails.
> I think I'm missing something.
> In case of a failure, they will always have to resync 36 TB of data,
> no matter if they have 90 servers.
> Each server is 36 TB, so every time they need to resync the whole server.

Well, you have to keep in mind that when a node fails, the PGs that resided on that node have to be redistributed over all the other nodes. So you begin moving about 1% of the data between all the remaining nodes/OSDs (each copy goes from an OSD that holds the remaining replica of the PG to the new OSD that will get a replica). Data moves from and to all the remaining OSDs, which gives you a lot of aggregate bandwidth and therefore fast recovery to a consistent state.
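
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is not anything Ceph itself runs. It assumes the failed node was completely full (36 TB, the worst case from the question), that its PGs spread perfectly evenly over the survivors, and that each node can absorb recovery traffic at a made-up 100 MB/s. The function name and the throughput figure are illustrative assumptions only.

```python
def recovery_estimate(nodes, tb_per_node, per_node_mb_s):
    """Compare recovering onto one node with recovering across all survivors.

    Assumptions (hypothetical, for illustration): the failed node was full,
    data is spread perfectly evenly, and every node sustains the same
    recovery throughput with no throttling or client load.
    """
    survivors = nodes - 1
    fraction_lost = 1 / nodes                 # share of the cluster that failed
    tb_per_survivor = tb_per_node / survivors  # data each survivor must take in

    mb_per_tb = 1_000_000
    # Everything funnelled into a single replacement server:
    single_hours = tb_per_node * mb_per_tb / per_node_mb_s / 3600
    # Every survivor receives only its own share, all in parallel:
    parallel_hours = tb_per_survivor * mb_per_tb / per_node_mb_s / 3600

    return fraction_lost, tb_per_survivor, parallel_hours, single_hours


if __name__ == "__main__":
    frac, share, parallel, single = recovery_estimate(90, 36.0, 100.0)
    print(f"failed share of cluster : {frac:.2%}")
    print(f"data per surviving node : {share:.2f} TB")
    print(f"recovery, one target    : {single:.1f} h")
    print(f"recovery, all survivors : {parallel:.1f} h")
```

The total amount of data moved is still 36 TB, as you say. The difference is that under these assumptions each of the 89 survivors only handles about 0.4 TB, so the wall-clock time shrinks by roughly the number of surviving nodes.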

Stefan
