2010/5/6 Martin Fick <[email protected]>:
>> Yeah, you've got it right.  The rbd image is striped
>> over small objects, which are independently assigned
>> to OSDs.  The load should be very well distributed.
>
> How can that be on a 2 OSD setup with double redundancy?
> In this case, if all of a replica's smaller objects are
> not on a single node, how will it recover from an OSD
> failure?
>
> The only way I see this possible is if file foo is
> split into small objects A1 A2 A3 A4 and replicas B1
> B2 B3 B4 and you spread those across 2 OSDs like this:
>
> replica 1 (A1 B2 A3 B4)
> replica 2 (B1 A2 B3 A4)
>
> but then A1 has to know that it is the same as B1.  Is
> that the case?
The hashing probably isn't quite even enough to alternate the objects,
but yes -- different objects (even those forming a single "file") will
have different primary replicas even in a small system.
Since the default RBD object size is 4MB, and the disk is presumably
anywhere from several to hundreds of gigabytes, the image spans a
thousand or more objects, so you've got a reasonably well-striped
system.
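
To make the point about primaries concrete, here is a minimal sketch in
Python. It is not Ceph's actual placement algorithm: the MD5-based
hash, the object naming scheme ("foo.<index>"), and every function name
below are assumptions made up for illustration. It only shows that with
2 OSDs and 2x replication each object has a copy on both OSDs, while
the primary role is spread across them by hashing.

```python
import hashlib

OBJECT_SIZE = 4 * 1024 * 1024  # 4MB default RBD object size, as stated above
NUM_OSDS = 2                   # the two-OSD setup under discussion


def object_count(image_bytes):
    """Number of objects needed to hold an image (ceiling division)."""
    return (image_bytes + OBJECT_SIZE - 1) // OBJECT_SIZE


def primary_osd(object_name):
    """Toy placement: hash the object name to choose a primary OSD.

    Hypothetical stand-in for the real placement logic, not Ceph's code.
    """
    digest = hashlib.md5(object_name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_OSDS


def replica_osds(object_name):
    """Return (primary, secondary); with 2 OSDs the secondary is the other one."""
    primary = primary_osd(object_name)
    return primary, (primary + 1) % NUM_OSDS


def primary_histogram(image_bytes, prefix="foo"):
    """Count how many objects of an image have each OSD as their primary."""
    counts = [0] * NUM_OSDS
    for index in range(object_count(image_bytes)):
        counts[primary_osd(f"{prefix}.{index:08x}")] += 1
    return counts


if __name__ == "__main__":
    ten_gb = 10 * 1024 ** 3
    print(object_count(ten_gb))       # 2560 objects for a 10GB image
    print(primary_histogram(ten_gb))  # roughly even, but not strictly alternating
```

Under these assumptions, losing either OSD still leaves one copy of
every object on the survivor; the per-object primary only decides which
OSD serves that object while both are up.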
-Greg