On Fri, 7 Mar 2014, Dan van der Ster wrote:
> On Thu, Mar 6, 2014 at 9:30 PM, Sage Weil <[email protected]> wrote:
> > Sheldon just pointed out a talk from ATC that discusses the basic problem:
> >
> > https://www.usenix.org/conference/atc13/technical-sessions/presentation/cidon
> >
> > The situation with CRUSH is slightly better, I think, because the number
> > of peers for a given OSD in a large cluster is bounded (pg_num /
> > num_osds), but I think we may still be able to improve things.
> 
> I'm surprised they didn't cite Ceph -- aren't copysets ~= placement groups?

I think so (I didn't listen to the whole talk :).  My ears did perk up 
when Carlos (who was part of the original team at UCSC) asked the question 
about the CRUSH paper at the end, though. :)

Anyway, now I'm thinking that this *is* really just all about tuning 
pg_num/pgp_num.  And of course managing failure domains in the CRUSH map 
as best we can to align placement with expected sources of correlated 
failure.  But again, I would appreciate any confirmation from others' 
intuitions or (better yet) a proper mathematical model.  This bit of my 
brain is full of cobwebs, and wasn't particularly strong here to begin 
with.
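
As a starting point for that kind of model, here is a rough sketch that
counts how many distinct peers each OSD ends up with as pg_num changes.
It assumes uniform random placement with no failure domains, which is
only a stand-in for CRUSH and not how CRUSH actually maps PGs. The
function name, the replica count of 3, and the cluster sizes are all
made up for illustration.

```python
import random
from collections import defaultdict

def peer_counts(num_osds, pg_num, size=3, seed=0):
    """Return (pgs_per_osd, peers_per_osd) under uniform random placement.

    Assumption: each PG picks `size` distinct OSDs uniformly at random.
    This ignores the CRUSH hierarchy and weights entirely.
    """
    rng = random.Random(seed)
    pgs = defaultdict(int)     # OSD id -> number of PG replicas it holds
    peers = defaultdict(set)   # OSD id -> set of OSDs it shares a PG with
    for _ in range(pg_num):
        acting = rng.sample(range(num_osds), size)
        for osd in acting:
            pgs[osd] += 1
            peers[osd].update(o for o in acting if o != osd)
    return dict(pgs), {osd: len(p) for osd, p in peers.items()}

# Hypothetical cluster: 1000 OSDs, 32768 PGs, 3 replicas.
pgs, peers = peer_counts(num_osds=1000, pg_num=32768, size=3)
print(f"avg PGs per OSD:            {sum(pgs.values()) / len(pgs):.1f}")
print(f"avg distinct peers per OSD: {sum(peers.values()) / len(peers):.1f}")
print(f"max distinct peers per OSD: {max(peers.values())}")
```

In this toy model an OSD can never have more peers than (PG replicas it
holds) * (size - 1), so the peer count scales with pg_num / num_osds
rather than with cluster size. Rerunning with different pg_num values
shows how quickly the peer set grows toward "everyone".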

sage