On 05/04/2013 08:47 PM, Noah Watkins wrote:
> 
> On May 4, 2013, at 11:36 AM, Loic Dachary <[email protected]> wrote:
> 
>>
>>
>> On 05/04/2013 08:27 PM, Noah Watkins wrote:
>>>
>>> On May 4, 2013, at 10:16 AM, Loic Dachary <[email protected]> wrote:
>>>
>>>> it would be great to get feedback before the ceph summit to address the 
>>>> most prominent issues.
>>>
>>> One thing that has been in the back of my mind is how this proposal is 
>>> influenced (if at all) by a future that includes declustered per-file RAID 
>>> in CephFS. I realize that may be a distant future, but it seems as though 
>>> there could be a lot of overlap for the (non-client-driven) 
>>> rebuild/recovery component of such an architecture.
>>
>> Hi Noah,
>>
>> I'm not sure what declustered per-file RAID is, which means it had no 
>> influence on this proposal ;-) Would you be so kind as to educate me?
> 
> I'm definitely far from an expert on the topic. But briefly the way I think 
> about it is:
> 
> Currently CephFS stripes a file byte stream across a set of objects (e.g. 
> first MB in object 0, second in object 1, etc.), and each of these objects is 
> in turn replicated. Following a failure, PGs re-replicate objects.
> 
> In client-driven RAID the striping algorithm is changed, and clients 
> calculate and distribute parity. In this case parity, rather than 
> replication, provides redundancy. So, one might consider storing objects in a 
> pool with replication size 1. However, the standard PG that does replication 
> wouldn't be able to handle faults correctly (it would need to rebuild from 
> parity rather than re-replicate), and a smart PG like the ErasureCodedPG 
> would be needed.
> 
> So it seems like the problems are related, but I'm not sure exactly how much 
> overlap there is :)

Are you referring to 
http://ceph.com/docs/master/architecture/#how-ceph-clients-stripe-data when 
talking about client-driven RAID? My understanding is that striping is designed 
to maximize throughput and is done in the client library (gateway, rbd or 
cephfs). Since erasure coding is about recovering from failures and would be 
implemented in libosd (next to ReplicatedPG), I am under the impression that 
there is no overlap.
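
To make the two layers concrete, here is a rough Python sketch of how I picture 
them. It is only an illustration: the function names and the 1 MB stripe unit 
are assumptions taken from the example above, not Ceph APIs, and a single XOR 
parity chunk stands in for whatever erasure code would really be used.

```python
from functools import reduce

# Assumed stripe unit: 1 MB per object, as in the "first MB in object 0" example.
STRIPE_UNIT = 1 << 20


def object_for_offset(offset, stripe_unit=STRIPE_UNIT):
    """Client-side striping: map a file byte offset to
    (object index, offset inside that object)."""
    return divmod(offset, stripe_unit)


def xor_parity(chunks):
    """Single parity chunk: byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_missing(surviving_chunks, parity):
    """Recovery: XOR of the parity with all surviving chunks
    yields the one chunk that was lost."""
    return xor_parity(list(surviving_chunks) + [parity])


if __name__ == "__main__":
    # Striping only decides where bytes go.
    print(object_for_offset(STRIPE_UNIT + 5))

    # Parity only matters once a chunk is lost.
    chunks = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_parity(chunks)
    print(rebuild_missing([chunks[0], chunks[2]], parity) == chunks[1])
```

The first function is all the client library needs for striping. The last one 
is the part that has to run somewhere after a failure, which is the piece I 
would expect to live next to ReplicatedPG.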

What do you think?

> 
> -Noah
> 
> 
>> Cheers
>>
>>> -Noah
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to [email protected]
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> -- 
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do nothing.
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.

