On Fri, 8 Aug 2014, Chen, Xiaoxi wrote: > Make sense, would you mind me to take this job? I will start with the > conversion function in monitor.
That would be great! I would make it an OSDMap method, as everything it needs is encapsulated there. And I would make an OSDMap or CrushWrapper method that is just a check to verify that all rules have ruleset == their rule id so we can enforce that for any new maps. sage > > ? 2014-8-9?0:08?"Sage Weil" <[email protected]> ??? > > > On Fri, 8 Aug 2014, Chen, Xiaoxi wrote: > >> For my side, I have seen some guys(actually more than 80% of the user I > >> have seen in university/company) do as the following way: > >> > >> 0.What they want to do is : create a pool that located in a specified rack. > >> > >> 1.Create a rule with "ceph osd crush rule create-simple myrule1 rack2 host > >> firstn > >> 2.Create a pool named mypool1 > >> 3.Set the ruleset of the pool, but they aren't that clear about the > >> difference in concepts of rule and ruleset... Here they just need an ID, > >> so they use "ceph osd crush rule ls" to list all the rules(they may > >> imaging rule=ruleset), and then start to count, 0, 1,2, 3, aha, the ID for > >> myrule1 is 3. So they simply type in "ceph osd pool set mypool1 > >> crush_ruleset 3.... > >> > >> In most case, this works, but actually this is not the right way to do..... > > > > Yeah, this is exactly the sort of flow we should fix. > > > > How about this: > > > > Starting with Giant (or whatever), we enforce that ruleset == rule id. > > That is, the ruleset must be unique. We have a conversion function in the > > monitor that will take an existing OSDMap (w/ embedded CRUSH map) and make > > any changes needed to make this true by giving a new ruleset to any > > rules that share, and adjusting the pools accordingly. Moving forward, > > the mon will refuse to accept an injected CRUSH map that doesn't satisfy > > this constraint, and the new crushtool will refuse to compile one. > > > > We set a flag on the OSDMap indicating that this invariant (one rule per > > ruleset and rule id == ruleset id) is now true. If this flag is set, the > > mapping code can skip the old rule resolution (which searches all rules > > for a rule with the right ruleset and size) and look up the rule directly. > > > > We also adjust crushtool decompile to say > > > > rule replicated_ruleset { > > id 0 # do not change unnecessarily > > type replicated > > > > instead of > > > > rule replicated_ruleset { > > ruleset 0 > > type replicated > > > > We can continue to recognize "ruleset" instead of "id" when compiling but > > issue a warning. We can also drop the min/max size values when > > decompiling and ignore them when compiling (and always set them to large > > numbers, like 0/255). > > > > Then we adjust all of the mon commands to take rule ids (== ruleset ids) > > or rule name. > > > > What do you think? > > sage > > > > > >> > >> -----Original Message----- > >> From: Loic Dachary [mailto:[email protected]] > >> Sent: Friday, August 8, 2014 11:11 PM > >> To: Chen, Xiaoxi > >> Cc: Sage Weil; Ma Jianpeng; Ceph Development > >> Subject: Re: Resolving the ruleno / ruleset confusion > >> > >> I guess most users just think of ruleset and never finds out there are two > >> different numbers with slightly different semantic. > >> > >> For me the use case is, 100% of the time : creating a rule via the command > >> line and get the ruleset via dump OR create and update a rule via dump / > >> load the osdmap, in which case I diligently (for no reason, just because > >> it seemed right) increment the ruleset and keep them in order. > >> > >> I have no use of rule ids and only use rulesets. > >> > >> Cheers > >> > >> On 08/08/2014 16:54, Chen, Xiaoxi wrote: > >>> I think before we start bug fix or try to get rid of ruleset concept, we > >>> can start with define a reasonable use case. How we expect user to play > >>> with rule and pools. there is no CLI to create/modify a ruleset, even > >>> worse , you are not able to get the ruleset id without dump a rule. > >>> > >>> currently the logic of command flow is really strange, user writes a > >>> rule, when he wants to use the rule,he need to find out the ruleset who > >>> contains the rule, and specified the ruleset to a pool. If the ruleset > >>> only contains a rule, the concept of ruleset is confusing and useless, > >>> if the ruleset contains more than one rules, user may have the risk that > >>> ceph select a rule in the ruleset, but not the one he want... > >>> > >>> > >>> > >>> ? 2014-8-8?22:34?"Loic Dachary" <[email protected]> ??? > >>> > >>>> > >>>> > >>>> On 08/08/2014 16:12, Sage Weil wrote: > >>>>> On Fri, 8 Aug 2014, Loic Dachary wrote: > >>>>>> Hi, > >>>>>> > >>>>>> As you noticed, there are places where ruleset and ruleno / ruleid > >>>>>> are used interchangeably although they are not. This is a source of > >>>>>> subtle bugs that can be hard to trace. By default ruleid and > >>>>>> ruleset are the same, but dumping a crush map including > >>>>>> > >>>>>> rule data { > >>>>>> ruleset 0 > >>>>>> type replicated > >>>>>> min_size 1 > >>>>>> max_size 10 > >>>>>> step take default > >>>>>> step chooseleaf firstn 0 type host > >>>>>> step emit > >>>>>> } > >>>>>> rule metadata { > >>>>>> ruleset 1 > >>>>>> type replicated > >>>>>> min_size 1 > >>>>>> max_size 10 > >>>>>> step take default > >>>>>> step chooseleaf firstn 0 type host > >>>>>> step emit > >>>>>> } > >>>>>> > >>>>>> and swapping the rules as follows > >>>>>> > >>>>>> rule metadata { > >>>>>> ruleset 1 > >>>>>> type replicated > >>>>>> min_size 1 > >>>>>> max_size 10 > >>>>>> step take default > >>>>>> step chooseleaf firstn 0 type host > >>>>>> step emit > >>>>>> } > >>>>>> > >>>>>> rule data { > >>>>>> ruleset 0 > >>>>>> type replicated > >>>>>> min_size 1 > >>>>>> max_size 10 > >>>>>> step take default > >>>>>> step chooseleaf firstn 0 type host > >>>>>> step emit > >>>>>> } > >>>>>> > >>>>>> will have ruleset 1 with rule id 0 and ruleset 0 with rule id 1 > >>>>>> > >>>>>> Since the ruleset is the only reliable number, from the user point > >>>>>> of view, we could simply change CrushWrapper.h to never return the > >>>>>> rule id and assume only ruleset are given in argument, even where > >>>>>> it currently claims to be a rule id. > >>>>> > >>>>> I'm worried about making that sort of change in an internal interface. > >>>>> And, more generally, about CRUSH maps in the wild that may have odd > >>>>> mappings that we don't want to break with subtle changes (even > >>>>> fixes). :/ > >>>>> > >>>>>> The downside is that looking up the ruleset implies iterating over > >>>>>> all the rules, but that's probably not an issue. > >>>>>> > >>>>>> What do you think ? > >>>>> > >>>>> I sat down a few months ago and tried to figure out if we could get > >>>>> rid of the ruleset concept entirely and simply map pools directly to > >>>>> rules (which are the things the user conceptually thinks about, we > >>>>> name, etc.). > >>>>> The original motivation for a ruleset was to be able to adjust the > >>>>> pool replication factor and have the system adjust the placement > >>>>> behavior accordingly, but in reality that is a pretty useless > >>>>> capability: num_rep rarely changes, and when it does you can simply > >>>>> adjust the placement rule at the same time. Unfortunately, I didn't > >>>>> come up with any easy and clean way to do it and gave up. > >>>>> > >>>>> I think we should try again. Getting rid of this particular wart > >>>>> will save us a lot of confusion and complexity and improve the > >>>>> user/admin experience significantly... > >>>>> > >>>>> My suspicion is that we may need to have a explicit 'upgrade' > >>>>> validation step that rejiggers an existing CRUSH map to remap > >>>>> ruleids and rulesets to map to each other, and enforce that > >>>>> constraint on the cluster. Then we could get away with renaming the > >>>>> field and clean up all the admin tools and such based on that > >>>>> constraint. Then, in a year or two, we can change the actual > >>>>> placement code to drop the ruleset logic. Otherwise we'll need to > >>>>> set incompatible feature bits and force clients to update and so on, > >>>>> which we want to avoid... > >>>> > >>>> Understood. Even before going into this, it looks like we need a way to > >>>> find all bugs like http://tracker.ceph.com/issues/9044 and fix them. > >>>> Reading the code won't be enough I'm afraid. What about changing ruleno > >>>> and ruleset into structs so that compilation shows where they are used > >>>> interchangeably when they should not ? > >>>> > >>>> Cheers > >>>> > >>>> -- > >>>> Lo?c Dachary, Artisan Logiciel Libre > >> > >> -- > >> Lo?c Dachary, Artisan Logiciel Libre > >> > >> N?????r??y??????X???v???)?{.n?????z?]z????ay?????j??f???h??????w??? > >> ???j:+v???w????????????zZ+???????j"????i > N????y????b?????v?????{.n??????z??ay????????j???f????????????????:+v??????????zZ+??????"?!? -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
