Sam, Greg,

A simpler proposal is documented at:
https://github.com/dachary/ceph/commit/ff11902bdc26aa35c70dd2f4d9de31f4cd207519#diff-5518964bc98a094a784ce2d17a5b0cc1R1
which is part of the proposed implementation for locally repairable code
https://github.com/ceph/ceph/pull/1921
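For this thread, here is a minimal model of what the layered description means, under assumptions the proposal does not spell out (each character position is one chunk, '_' is a chunk outside the layer, lowercase letters are the layer's local coding chunks, uppercase letters its data chunks):

```python
# Hedged model of the layered chunk description quoted later in this
# thread.  Assumptions NOT confirmed by the proposal: each character
# position is one chunk, '_' means the chunk is outside the layer,
# lowercase letters mark the layer's local coding chunks and uppercase
# letters its data chunks.

def layer_chunks(mapping):
    """Indices of all chunks participating in a layer."""
    return {i for i, c in enumerate(mapping) if c != '_'}

def parity_count(mapping):
    """Assumed number of coding chunks in a layer (lowercase letters)."""
    return sum(1 for c in mapping if c.islower())

def brute_force_recover(layers, available, wanted):
    """Model of the 'third strategy' discussed below: keep sweeping all
    layers, repairing any layer whose erasures fit within its own parity,
    until the wanted chunks are available or no layer makes progress."""
    available = set(available)
    progress = True
    while progress and not wanted <= available:
        progress = False
        for mapping in layers:
            chunks = layer_chunks(mapping)
            missing = chunks - available
            if missing and len(missing) <= parity_count(mapping):
                available |= chunks  # this layer repairs its own erasures
                progress = True
    return wanted <= available

layers = ["_aAAA_aAA_", "b_BBB_____", "_____cCCC_", "_____DDDDd"]
# e.g. losing chunks 0 and 2: layer 'a' repairs 2, then layer 'b' repairs 0
print(brute_force_recover(layers, set(range(10)) - {0, 2}, set(range(10))))
```

The "brute force" third strategy debated below behaves like this sweep; a truly minimal decoder would instead search for the cheapest combination of layers.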
Hopefully it makes sense ;-)
Cheers
On 09/06/2014 22:38, Samuel Just wrote:
> I'm finding that I don't really understand how the LRC specification
> works. Is there a doc somewhere I can read?
> -Sam
>
> On Mon, Jun 9, 2014 at 1:18 PM, Gregory Farnum <[email protected]> wrote:
>> On Fri, Jun 6, 2014 at 7:30 AM, Loic Dachary <[email protected]> wrote:
>>> Hi Andreas,
>>>
>>> On 06/06/2014 13:46, Andreas Joachim Peters wrote:
>>>> Hi Loic,
>>>> the basic implementation looks very clean.
>>>>
>>>> I have few comments/ideas:
>>>>
>>>> - the reconstruction strategy using the three levels is certainly
>>>> efficient enough for standard cases but does not guarantee always the
>>>> minimum decoding (in cases where one layer is not enough to reconstruct)
>>>> since your third algorithm is just brute-force to reconstruct everything
>>>> through all layers until we have what we need ...
>>>
>>> The third strategy is indeed brute force. Do you think it is worth changing
>>> it to be minimal? It would be nice to quantify the percentage of cases it
>>> addresses. Do you know how to do that? It looks like a very small
>>> percentage, but there is no proof that it is small ;-)
>>>
>>>> - the whole LRC configuration actually does not describe the placement -
>>>> it still looks disconnected from the placement strategy/crush rules ...
>>>> wouldn't it make sense to have the crush rule implicit in the description
>>>> or a function to derive it automatically based on the LRC configuration?
>>>> Maybe you have this already done in another way and I didn't see it ...
>>>
>>> Good catch.
>>>
>>> What about this:
>>>
>>> " [ \"_aAAA_aAA_\", \"set choose datacenter 2\","
>>> " \"_aXXX_aXX_\" ],"
>>> " [ \"b_BBB_____\", \"set choose host 5\","
>>> " \"baXXX_____\" ],"
>>> " [ \"_____cCCC_\", \"\","
>>> " \"baXXXcaXX_\" ],"
>>> " [ \"_____DDDDd\", \"\","
>>> " \"baXXXcaXXd\" ],"
>>>
>>> Which translates into
>>>
>>> take root
>>> set choose datacenter 2
>>> set choose host 5
>>>
>>> In other words, the ruleset is created by concatenating the strings from
>>> the description, without any kind of smart computation. It is up to the
>>> person who creates the description to add the ruleset near a description
>>> that makes sense. There is going to be minimal checking to make sure the
>>> ruleset can actually be used to get the required number of chunks.
>>>
>>> It probably is very difficult and very confusing to automate the generation
>>> of the ruleset. If it is implicit rather than explicit as above, the
>>> operator will have to somehow understand and learn how it is computed to
>>> make sure it does what is desired. With an explicit set of crush rules
>>> loosely coupled to chunk mapping, the operator can read the crush
>>> documentation instead of guessing.
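
The concatenation Loic describes above can be sketched in a few lines (the layer list is copied from his example; the join itself is my reading of the proposal, not actual plugin code):

```python
# Sketch of the proposed ruleset construction: the crush step attached to
# each layer is concatenated verbatim after "take root", with no smart
# computation.  The layer tuples are copied from the example above;
# build_ruleset() is a hypothetical helper, not plugin code.

layers = [
    ("_aAAA_aAA_", "set choose datacenter 2"),
    ("b_BBB_____", "set choose host 5"),
    ("_____cCCC_", ""),
    ("_____DDDDd", ""),
]

def build_ruleset(layers):
    """Concatenate the non-empty crush step of every layer after 'take root'."""
    return ["take root"] + [step for _, step in layers if step]

print("\n".join(build_ruleset(layers)))
```

Run on the example, this reproduces the three-line ruleset quoted above, which is the point: the operator writes the steps, the plugin only glues them together.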
>>
>> I think I'm missing some context for this discussion (maybe I haven't
>> been reading other threads closely enough); can you discuss this in
>> more detail?
>> Matching up CRUSH rulesets and the EC plugin formulas is very
>> important and demonstrated to be difficult, but I don't really
>> understand what you're suggesting here, which makes me think it's not
>> quite the right idea. ;)
>>
>>>
>>>> - should the plug-in have the ability to select reconstruction based on
>>>> proximity, or should this be up to the higher layer, which would provide
>>>> chunks in a way that makes reconstruction select the 'closest' layer? You
>>>> will understand the relevance of the question better in the next point ....
>>>>
>>>> - I remember we had this 3 data centre example with (8,4) where you can
>>>> reconstruct every object as long as 2 data centres are up. Another appealing
>>>> example, which avoids remote access when reading an object, is to have 2
>>>> data centres each holding a copy of e.g. (4,2) encoded objects. Can you
>>>> describe in your LRC configuration language how to store the same chunk
>>>> twice, like __ABCCBA__ ?
>>>
>>> Unless I'm mistaken that would require the caller of the plugin to support
>>> duplicate data chunks and provide a kind of proximity check. Since this is
>>> not currently supported by the OSD logic, it is difficult to figure out how
>>> an erasure code plugin could provide support for this use case.
>>
>> I haven't looked at the EC plugin interface at all, but I thought the
>> OSD told the plugin what chunks it could access, and the plugin tells
>> it which ones to fetch. So couldn't the plugin simply output duplicate
>> chunks, and not have the OSD retrieve both of them?
>> -Greg
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Loïc Dachary, Artisan Logiciel Libre
