On 05/04/2017 03:58 PM, Xavier Villaneau wrote:
> Hello Loïc,
> 
> On Thu, May 4, 2017 at 8:30 AM Loic Dachary wrote:
> 
>     Is there a way to calculate the optimum nearfull ratio for a given
>     crushmap?
> 
> 
> This is a question that I was planning to cover in those calculations I was 
> working on for python-crush. I've currently shelved the work for a few weeks 
> but intend to look at it again as time frees up.

Of course! Now I see how the two are related.

Thanks.

> Basically, I see this as a five-fold uncertainty problem:
> 1. CRUSH mappings are pseudo-random and therefore (usually) uneven
> 2. Object distribution between placement groups has the exact same issue
> 3. Object size within a given pool can also vary greatly (from bytes to 
> megabytes)
> 4. Failures and the following re-balancing are also random.
> 5. Finally, pools can occupy different and overlapping sets of OSDs, and hold 
> independent sets of objects.
> 
> Thanks to your new CRUSH tools, I think #1 and #4 are solved respectively by 
> the ability to:
> - generate a CRUSH map for a precise (and even) distribution of PGs;
> - test mappings for every scenario of N failures and find the worst-case 
> scenario (very expensive calculation, but possible).
> 
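The worst-case search over N failures can be sketched as below. This is only
a toy model: place_pg() is a hypothetical stand-in that uses rendezvous
hashing, not the real CRUSH algorithm or the python-crush API, and the OSD
count, PG count and replica count are made-up values.

```python
import hashlib
from itertools import combinations


def place_pg(pg, osds, replicas):
    """Toy placement (rendezvous hashing, NOT CRUSH): keep the OSDs with the lowest hash."""
    def score(osd):
        return hashlib.sha256(f"{pg}:{osd}".encode()).hexdigest()
    return sorted(osds, key=score)[:replicas]


def pg_counts(osds, num_pgs, replicas):
    """Number of PG replicas mapped to each OSD."""
    counts = {osd: 0 for osd in osds}
    for pg in range(num_pgs):
        for osd in place_pg(pg, osds, replicas):
            counts[osd] += 1
    return counts


def worst_case(osds, num_pgs, replicas, failures):
    """Try every set of `failures` failed OSDs and return the scenario
    that leaves a surviving OSD with the most PG replicas."""
    worst = (0, None, ())  # (pg_count, osd, failed_set)
    for failed in combinations(osds, failures):
        alive = [o for o in osds if o not in failed]
        counts = pg_counts(alive, num_pgs, replicas)
        osd, count = max(counts.items(), key=lambda kv: kv[1])
        if count > worst[0]:
            worst = (count, osd, failed)
    return worst


osds = list(range(10))
baseline = pg_counts(osds, num_pgs=256, replicas=3)
count, osd, failed = worst_case(osds, num_pgs=256, replicas=3, failures=2)
print("fullest OSD before failures:", max(baseline.values()), "PGs")
print("worst case after 2 failures:", count, "PGs on OSD", osd, "with", failed, "down")
```

The number of scenarios grows as "number of OSDs choose N", which is why the
exhaustive search gets very expensive on a real cluster.
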
> Issues #2 and #3 are trickier. The big picture is that a given amount of 
> data is placed more evenly the more objects there are, and there should be a 
> way to use statistics to quantify that. Variance in object size then brings 
> in more uncertainty, but I think that metric is difficult to quantify outside 
> of very specific use cases where object sizes are known.
> 
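For issue #2 alone, one way to put a number on it, assuming every object has
the same size and lands in a PG uniformly at random: with n objects and k
PGs, the count per PG is binomial with mean n/k, so the relative standard
deviation is sqrt((k - 1) / n). The sketch below compares that prediction
with a simple simulation. The function names and the values of n and k are
illustrative only.

```python
import math
import random
from statistics import mean, pstdev


def predicted_cv(num_objects, num_pgs):
    """Relative standard deviation of objects per PG under uniform random placement."""
    return math.sqrt((num_pgs - 1) / num_objects)


def simulated_cv(num_objects, num_pgs, rng):
    """Drop objects into PGs at random and measure the spread of the counts."""
    counts = [0] * num_pgs
    for _ in range(num_objects):
        counts[rng.randrange(num_pgs)] += 1
    return pstdev(counts) / mean(counts)


rng = random.Random(42)
for n in (1_000, 10_000, 100_000):
    print(n, round(predicted_cv(n, 64), 4), round(simulated_cv(n, 64, rng), 4))
```

The spread shrinks as 1/sqrt(n), so 100 times more objects gives roughly 10
times less relative imbalance between PGs. Variable object sizes (issue #3)
are not covered by this model.
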
> Finally, this might all be made redundant by the new auto-rebalancing feature 
> that Sage is planning for Luminous. If we can assume even data placement at 
> all times, then #4 is the only thing we need to worry about. That would be 
> very different for performance-based placement, however. And if pools have 
> overlapping OSD sets, that could be fairly tricky too.
> 
> Maybe some other users here already have a rule of thumb or actual 
> calculations for that. I was planning to get into the statistical 
> calculations of data placement, assuming a single uniform object size, as 
> the next step for the paper I am working on. Would there be a need for such 
> tools?
> 
> Regards,
> -- 
> Xavier Villaneau
> Storage Software Eng. at Concurrent Computer Corp.
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com