I guess most users just think of the ruleset and never find out that there are two different numbers with slightly different semantics.
For me the use case is, 100% of the time: creating a rule via the command line and getting the ruleset via dump, OR creating and updating a rule via dump / load of the osdmap, in which case I diligently (for no reason, just because it seemed right) increment the ruleset and keep them in order. I have no use for rule ids and only use rulesets.

Cheers

On 08/08/2014 16:54, Chen, Xiaoxi wrote:
> I think before we start fixing bugs or trying to get rid of the ruleset
> concept, we can start by defining a reasonable use case: how we expect users
> to work with rules and pools. There is no CLI to create/modify a ruleset and,
> even worse, you are not able to get the ruleset id without dumping a rule.
>
> Currently the logic of the command flow is really strange: the user writes a
> rule, and when he wants to use the rule he needs to find out which ruleset
> contains the rule and specify that ruleset for a pool. If the ruleset only
> contains one rule, the concept of ruleset is confusing and useless; if the
> ruleset contains more than one rule, the user runs the risk that ceph selects
> a rule in the ruleset, but not the one he wants...
>
> On 2014-8-8, at 22:34, "Loic Dachary" <[email protected]> wrote:
>
>> On 08/08/2014 16:12, Sage Weil wrote:
>>> On Fri, 8 Aug 2014, Loic Dachary wrote:
>>>> Hi,
>>>>
>>>> As you noticed, there are places where ruleset and ruleno / ruleid are
>>>> used interchangeably although they are not. This is a source of subtle
>>>> bugs that can be hard to trace. By default ruleid and ruleset are the
>>>> same, but dumping a crush map including
>>>>
>>>> rule data {
>>>>         ruleset 0
>>>>         type replicated
>>>>         min_size 1
>>>>         max_size 10
>>>>         step take default
>>>>         step chooseleaf firstn 0 type host
>>>>         step emit
>>>> }
>>>> rule metadata {
>>>>         ruleset 1
>>>>         type replicated
>>>>         min_size 1
>>>>         max_size 10
>>>>         step take default
>>>>         step chooseleaf firstn 0 type host
>>>>         step emit
>>>> }
>>>>
>>>> and swapping the rules as follows
>>>>
>>>> rule metadata {
>>>>         ruleset 1
>>>>         type replicated
>>>>         min_size 1
>>>>         max_size 10
>>>>         step take default
>>>>         step chooseleaf firstn 0 type host
>>>>         step emit
>>>> }
>>>>
>>>> rule data {
>>>>         ruleset 0
>>>>         type replicated
>>>>         min_size 1
>>>>         max_size 10
>>>>         step take default
>>>>         step chooseleaf firstn 0 type host
>>>>         step emit
>>>> }
>>>>
>>>> will have ruleset 1 with rule id 0 and ruleset 0 with rule id 1
>>>>
>>>> Since the ruleset is the only reliable number, from the user point of
>>>> view, we could simply change CrushWrapper.h to never return the rule id
>>>> and assume only rulesets are given as arguments, even where it currently
>>>> claims to be a rule id.
>>>
>>> I'm worried about making that sort of change in an internal interface.
>>> And, more generally, about CRUSH maps in the wild that may have odd
>>> mappings that we don't want to break with subtle changes (even fixes). :/
>>>
>>>> The downside is that looking up the ruleset implies iterating over all
>>>> the rules, but that's probably not an issue.
>>>>
>>>> What do you think ?
>>>
>>> I sat down a few months ago and tried to figure out if we could get rid of
>>> the ruleset concept entirely and simply map pools directly to rules
>>> (which are the things the user conceptually thinks about, we name, etc.).
>>> The original motivation for a ruleset was to be able to adjust the pool
>>> replication factor and have the system adjust the placement behavior
>>> accordingly, but in reality that is a pretty useless capability: num_rep
>>> rarely changes, and when it does you can simply adjust the placement rule
>>> at the same time. Unfortunately, I didn't come up with any easy and
>>> clean way to do it and gave up.
>>>
>>> I think we should try again. Getting rid of this particular wart will
>>> save us a lot of confusion and complexity and improve the user/admin
>>> experience significantly...
>>>
>>> My suspicion is that we may need to have an explicit 'upgrade' validation
>>> step that rejiggers an existing CRUSH map to remap ruleids and rulesets to
>>> map to each other, and enforce that constraint on the cluster. Then we
>>> could get away with renaming the field and clean up all the admin tools
>>> and such based on that constraint. Then, in a year or two, we can change
>>> the actual placement code to drop the ruleset logic. Otherwise we'll need
>>> to set incompatible feature bits and force clients to update and so on,
>>> which we want to avoid...
>>
>> Understood. Even before going into this, it looks like we need a way to find
>> all bugs like http://tracker.ceph.com/issues/9044 and fix them. Reading the
>> code won't be enough I'm afraid. What about changing ruleno and ruleset into
>> structs so that compilation shows where they are used interchangeably when
>> they should not ?
>>
>> Cheers
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>

--
Loïc Dachary, Artisan Logiciel Libre
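
PS: to make the struct idea from the quoted reply a bit more concrete, here is a minimal C++ sketch. It is not Ceph code and does not reflect what CrushWrapper.h actually does: RuleId, Ruleset, Rule, find_rule_id, swapped_rules and check_swapped_example are all hypothetical names. It also assumes, as in the swapped example above, that the rule id is simply the position of the rule in the map. Wrapping the two numbers in distinct structs means passing one where the other is expected fails to compile, and the lookup shows the "iterate over all the rules" cost mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical wrapper types (not from Ceph): two distinct structs, so a
// rule id cannot silently be used where a ruleset is expected.
struct RuleId  { int value; };
struct Ruleset { int value; };

static_assert(!std::is_convertible<RuleId, Ruleset>::value &&
              !std::is_convertible<Ruleset, RuleId>::value,
              "rule ids and rulesets must not convert into each other");

// Hypothetical, stripped-down view of a rule: its name and "ruleset N" line.
struct Rule {
    std::string name;
    Ruleset ruleset;
};

// Assumption for this sketch: the rule id is the position of the rule in the
// map. Looking up by ruleset means scanning every rule; the first match wins.
// Returns RuleId{-1} when no rule carries the requested ruleset.
RuleId find_rule_id(const std::vector<Rule>& rules, Ruleset wanted) {
    for (std::size_t i = 0; i < rules.size(); ++i) {
        if (rules[i].ruleset.value == wanted.value) {
            return RuleId{static_cast<int>(i)};
        }
    }
    return RuleId{-1};
}

// The swapped map from the example: "metadata" first, then "data".
std::vector<Rule> swapped_rules() {
    return {Rule{"metadata", Ruleset{1}}, Rule{"data", Ruleset{0}}};
}

// Usage: after the swap, ruleset 1 sits at rule id 0 and ruleset 0 at rule id 1.
void check_swapped_example() {
    const std::vector<Rule> rules = swapped_rules();
    assert(find_rule_id(rules, Ruleset{1}).value == 0);
    assert(find_rule_id(rules, Ruleset{0}).value == 1);
    // find_rule_id(rules, RuleId{0});  // would not compile: RuleId is not a Ruleset
}
```

With plain ints the commented-out call would compile and quietly return the wrong rule, which is the kind of mix-up the struct change is meant to surface.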
