Re: [ceph-users] How objects are reshuffled on addition of new OSD

2015-09-08 Thread Gregory Farnum
On Tue, Sep 1, 2015 at 2:31 AM, Shesha Sreenivasamurthy wrote:
> I had a question regarding how OSD locations are determined by CRUSH.
>
> From the CRUSH paper I gather that the replica locations of an object (A) are
> a vector (v) given by the function c(r,x) = (hash(x) + rp) mod m.

CRUSH does rely on hashing, but I don't think that formula is quite
right. Objects are first hashed (quickly, using rjenkins or something)
into a placement group. The CRUSH function is then run on that
placement group to assign it to a vector of OSDs; this step is pretty
configurable and takes a tree as input (with a choice of bucket types:
straw, list, etc.).
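
A rough sketch of that two-step mapping, in Python. This is not Ceph's
code: sha256 stands in for rjenkins, the plain modulo and the flat
"rank every OSD by score" placement stand in for the real PG mapping and
CRUSH tree walk, and all function names here are made up for illustration.

```python
import hashlib

def stable_hash(*parts):
    """Deterministic 64-bit hash of the given parts (stand-in for rjenkins)."""
    data = "/".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def object_to_pg(object_name, pg_num):
    """Step 1: hash the object name into one of pg_num placement groups."""
    return stable_hash("obj", object_name) % pg_num

def pg_to_osds(pg_id, osds, replicas):
    """Step 2: map the PG (not the object) to an ordered list of OSDs.

    Hypothetical flat placement: score every OSD for this PG and keep
    the top `replicas`. Real CRUSH walks a weighted tree of buckets.
    """
    ranked = sorted(osds, key=lambda osd: stable_hash("pg", pg_id, osd),
                    reverse=True)
    return ranked[:replicas]

def locate(object_name, pg_num, osds, replicas=3):
    """Return (pg_id, [osd, ...]) for an object name."""
    pg = object_to_pg(object_name, pg_num)
    return pg, pg_to_osds(pg, osds, replicas)

if __name__ == "__main__":
    osds = [f"osd.{i}" for i in range(6)]
    print(locate("rbd_data.1234", 128, osds))
```

The point is only that the object name picks a placement group, and the
placement group, not the individual object, is what gets placed on OSDs.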

>
> Now when new OSDs are added, objects are shuffled to maintain uniform data
> distribution. What in the above equation changes so that only minimal
> movement is achieved? More specifically, if nothing in the above equation
> changes, then all the objects again map to the same locations. If p is
> changed, then lots of object locations can change. Therefore, how does
> CRUSH guarantee only minimal data movement?

Like I said, that's not the equation. It's more like you have three
doors to choose from at each of three levels, and when you add a new
door somewhere in the tree, you only move a little bit of the data
around.
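
To make the "new door" idea concrete, here is a toy Python sketch of one
level of such a choice. It assumes a single flat bucket of equally
weighted OSDs and a straw-like rule (every OSD draws a pseudo-random
straw per PG, longest straw wins); sha256 is a stand-in hash and the
function names are invented for the example, so treat it as an
illustration rather than what CRUSH actually executes.

```python
import hashlib

def straw(pg_id, osd):
    """Pseudo-random 'straw length' for one (PG, OSD) pair (sha256 stand-in)."""
    digest = hashlib.sha256(f"{pg_id}/{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def choose(pg_id, osds):
    """Pick the OSD that draws the longest straw for this PG."""
    return max(osds, key=lambda osd: straw(pg_id, osd))

def moved_pgs(pg_num, old_osds, new_osds):
    """Return {pg: (old_osd, new_osd)} for every PG whose choice changed."""
    moves = {}
    for pg in range(pg_num):
        before, after = choose(pg, old_osds), choose(pg, new_osds)
        if before != after:
            moves[pg] = (before, after)
    return moves

if __name__ == "__main__":
    old = [f"osd.{i}" for i in range(9)]
    new = old + ["osd.9"]          # add one "door"
    moves = moved_pgs(10000, old, new)
    print(f"{len(moves)} of 10000 PGs moved")
    print("all moves land on osd.9:",
          all(dst == "osd.9" for _, dst in moves.values()))
```

Adding osd.9 does not change any existing straw, so a PG moves only if
the new OSD draws the longest straw for it. With ten equal OSDs that is
roughly a tenth of the PGs, and every one of them moves onto the new OSD
rather than between old ones.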

>
> Follow-up question: if there is ongoing IO to an object, the primary
> replica is the one that will be getting updated. Does the reshuffling in
> that case avoid moving currently hot objects?

It definitely does not consider heat. Everything is based on the
object names (locators, more specifically, but they're generally the
same). Responsibility for maintaining the IO lives in layers above
CRUSH.
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How objects are reshuffled on addition of new OSD

2015-08-31 Thread Shesha Sreenivasamurthy
I had a question regarding how OSD locations are determined by CRUSH.

From the CRUSH paper I gather that the replica locations of an object (A)
are a vector (v) given by the function *c(r,x) = (hash(x) + rp) mod m*.

Now when new OSDs are added, objects are shuffled to maintain uniform data
distribution. What in the above equation changes so that only minimal
movement is achieved? More specifically, if nothing in the above equation
changes, then all the objects again map to the same locations. If p is
changed, then lots of object locations can change. Therefore, how does
CRUSH guarantee only minimal data movement?

Follow-up question: if there is ongoing IO to an object, the primary
replica is the one that will be getting updated. Does the reshuffling in
that case avoid moving currently hot objects?

Thanking you sincerely,
Shesha