I think that before we start fixing bugs or try to get rid of the ruleset 
concept, we should define a reasonable use case: how do we expect users to 
work with rules and pools? There is no CLI to create or modify a ruleset and, 
even worse, you cannot get the ruleset id without dumping a rule. 

Currently the command flow is really strange: a user writes a rule and, when 
he wants to use it, he needs to find out which ruleset contains the rule and 
then assign that ruleset to a pool. If the ruleset contains only one rule, the 
ruleset concept is confusing and useless; if it contains more than one rule, 
the user runs the risk that Ceph selects a rule in the ruleset other than the 
one he wants...
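
To illustrate the risk with multi-rule rulesets, here is a minimal sketch of 
how I understand a rule gets picked: the first rule whose ruleset, type and 
size range match wins. This is a simplified model, not the actual CRUSH code; 
the Rule struct and find_rule() are hypothetical names made up for this mail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of a rule: only the fields that matter
// for selection are kept.
struct Rule {
    std::string name;
    int ruleset;
    int type;      // e.g. 1 for replicated (value chosen for illustration)
    int min_size;
    int max_size;
};

// Returns the rule id (position in `rules`) of the first rule that belongs
// to `ruleset`, has the requested `type`, and whose [min_size, max_size]
// range contains `size`. Returns -1 when nothing matches.
int find_rule(const std::vector<Rule>& rules, int ruleset, int type, int size) {
    for (std::size_t id = 0; id < rules.size(); ++id) {
        const Rule& r = rules[id];
        if (r.ruleset == ruleset && r.type == type &&
            r.min_size <= size && size <= r.max_size) {
            return static_cast<int>(id);
        }
    }
    return -1;
}
```

With two rules in the same ruleset, which one a pool ends up using depends on 
the pool size and on the order of the rules, not on anything the user chose by 
name.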



On 2014-8-8, 22:34, "Loic Dachary" <[email protected]> wrote:

> 
> 
> On 08/08/2014 16:12, Sage Weil wrote:
>> On Fri, 8 Aug 2014, Loic Dachary wrote:
>>> Hi,
>>> 
>>> As you noticed, there are places where ruleset and ruleno / ruleid are used 
>>> interchangeably although they are not. This is a source of subtle bugs that 
>>> can be hard to trace. By default ruleid and ruleset are the same, but 
>>> dumping a crush map including
>>> 
>>> rule data {
>>>        ruleset 0
>>>        type replicated
>>>        min_size 1
>>>        max_size 10
>>>        step take default
>>>        step chooseleaf firstn 0 type host
>>>        step emit
>>> }
>>> rule metadata {
>>>        ruleset 1
>>>        type replicated
>>>        min_size 1
>>>        max_size 10
>>>        step take default
>>>        step chooseleaf firstn 0 type host
>>>        step emit
>>> }
>>> 
>>> and swapping the rules as follows
>>> 
>>> rule metadata {
>>>        ruleset 1
>>>        type replicated
>>>        min_size 1
>>>        max_size 10
>>>        step take default
>>>        step chooseleaf firstn 0 type host
>>>        step emit
>>> }
>>> 
>>> rule data {
>>>        ruleset 0
>>>        type replicated
>>>        min_size 1
>>>        max_size 10
>>>        step take default
>>>        step chooseleaf firstn 0 type host
>>>        step emit
>>> }
>>> 
>>> will have ruleset 1 with rule id 0 and ruleset 0 with rule id 1
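
The swap described above can be modelled with a small sketch. It assumes, as 
the example states, that the rule id is simply the position of the rule in the 
map while the ruleset is the number written in the rule body. ParsedRule, 
rule_id_of() and ruleset_of() are made-up names, not part of CrushWrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, stripped-down view of a rule as read from a text crush map.
struct ParsedRule {
    std::string name;
    int ruleset;  // the "ruleset N" line inside the rule body
};

// Assumption taken from the example above: the rule id is the position of
// the rule in the map. Returns -1 if no rule has that name.
int rule_id_of(const std::vector<ParsedRule>& rules, const std::string& name) {
    for (std::size_t i = 0; i < rules.size(); ++i) {
        if (rules[i].name == name) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Returns the ruleset declared by the named rule, or -1 if it is missing.
int ruleset_of(const std::vector<ParsedRule>& rules, const std::string& name) {
    const int id = rule_id_of(rules, name);
    return id < 0 ? -1 : rules[static_cast<std::size_t>(id)].ruleset;
}
```

Feeding it the two orderings shows that only the rule id moves: "metadata" 
keeps ruleset 1 in both maps, but its rule id goes from 1 to 0 after the swap.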
>>> 
>>> Since the ruleset is the only reliable number, from the user point of 
>>> view, we could simply change CrushWrapper.h to never return the rule id 
>>> and assume only rulesets are given as arguments, even where it currently 
>>> claims to be a rule id.
>> 
>> I'm worried about making that sort of change in an internal interface.  
>> And, more generally, about CRUSH maps in the wild that may have odd 
>> mappings that we don't want to break with subtle changes (even fixes).  :/
>> 
>>> The downside is that looking up the ruleset implies iterating over all 
>>> the rules, but that's probably not an issue.
>>> 
>>> What do you think ?
>> 
>> I sat down a few months ago and tried to figure out if we could get rid of 
>> the ruleset concept entirely and simply map pools directly to rules 
>> (which are the things the user conceptually thinks about, we name, etc.).  
>> The original motivation for a ruleset was to be able to adjust the pool 
>> replication factor and have the system adjust the placement behavior 
>> accordingly, but in reality that is a pretty useless capability: num_rep 
>> rarely changes, and when it does you can simply adjust the placement rule 
>> at the same time.  Unfortunately, I didn't come up with any easy and 
>> clean way to do it and gave up.
>> 
>> I think we should try again.  Getting rid of this particular wart will 
>> save us a lot of confusion and complexity and improve the user/admin 
>> experience significantly...
>> 
>> My suspicion is that we may need to have an explicit 'upgrade' validation 
>> step that rejiggers an existing CRUSH map to remap ruleids and rulesets to 
>> map to each other, and enforce that constraint on the cluster.  Then we 
>> could get away with renaming the field and clean up all the admin tools 
>> and such based on that constraint.  Then, in a year or two, we can change 
>> the actual placement code to drop the ruleset logic.  Otherwise we'll need 
>> to set incompatible feature bits and force clients to update and so on, 
>> which we want to avoid...
> 
> Understood. Even before going into this, it looks like we need a way to find 
> all bugs like http://tracker.ceph.com/issues/9044 and fix them. Reading the 
> code won't be enough I'm afraid. What about changing ruleno and ruleset into 
> structs so that compilation shows where they are used interchangeably when 
> they should not ? 
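
A rough sketch of what the struct idea could look like, under the assumption 
that each number gets its own wrapper type with an explicit constructor. 
RuleId, RulesetId and ruleset_of() are hypothetical names; nothing like this 
exists in the tree as far as this thread shows.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical wrapper types. The explicit constructors mean a plain int
// cannot silently become either kind of id.
struct RuleId {
    explicit RuleId(int v) : value(v) {}
    int value;
};

struct RulesetId {
    explicit RulesetId(int v) : value(v) {}
    int value;
};

// Hypothetical lookup table: the index is the rule id, the element is the
// ruleset of that rule. Throws std::out_of_range for an unknown rule id.
RulesetId ruleset_of(const std::vector<int>& ruleset_by_rule, RuleId id) {
    return RulesetId(ruleset_by_rule.at(static_cast<std::size_t>(id.value)));
}

// The next line would be rejected by the compiler, which is the point:
// there is no conversion from RulesetId to RuleId.
//   ruleset_of(table, RulesetId(1));
```

Every call site that passes a bare int or the wrong kind of id then fails to 
build, so the compiler produces the list of places to review.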
> 
> Cheers
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> 