On 05/04/2013 08:47 PM, Noah Watkins wrote:
>
> On May 4, 2013, at 11:36 AM, Loic Dachary <[email protected]> wrote:
>
>>
>> On 05/04/2013 08:27 PM, Noah Watkins wrote:
>>>
>>> On May 4, 2013, at 10:16 AM, Loic Dachary <[email protected]> wrote:
>>>
>>>> it would be great to get feedback before the ceph summit to address the
>>>> most prominent issues.
>>>
>>> One thing that has been in the back of my mind is how this proposal is
>>> influenced (if at all) by a future that includes declustered per-file raid
>>> in CephFS. I realize that may be a distant future, but it seems as though
>>> there could be a lot of overlap for the (non-client driven)
>>> rebuild/recovery component of such an architecture.
>>
>> Hi Noah,
>>
>> I'm not sure what declustered per-file raid is, which means it had no
>> influence on this proposal ;-) Would you be so kind as to educate me ?
>
> I'm definitely far from an expert on the topic. But briefly the way I think
> about it is:
>
> Currently CephFS stripes a file byte stream across a set of objects (e.g.
> first MB in object 0, 2nd in object 1, etc..), and each of these objects is
> in turn replicated. Following a failure, PGs re-replicate objects.
>
> In client drive raid the striping algorithm is changed, and clients are
> calculating and distributing parity. In this case the parity rather than
> replication provides redundancy. So, one might consider storing objects in a
> pool with replication size 1. However, the standard PG that does replication
> wouldn't be able to handle faults correctly (parity rebuild, rather than
> re-replication), and a smart PG like the ErasureCodedPG would be needed.
>
> So it seems like the problems are related, but I'm not sure exactly how much
> overlap there is :)
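
For readers following the thread, the two mechanisms described in the quoted text above can be sketched roughly as follows. This is only an illustration in Python, not Ceph code: the function names (stripe, xor_parity, rebuild) and the single-XOR-parity scheme are assumptions chosen for brevity, and neither CephFS striping nor the proposed ErasureCodedPG is claimed to work this way in detail.

```python
from functools import reduce


def stripe(data: bytes, unit: int, n_objects: int) -> list[bytes]:
    """Round-robin striping: stripe unit i is appended to object i % n_objects."""
    objects = [bytearray() for _ in range(n_objects)]
    for offset in range(0, len(data), unit):
        objects[(offset // unit) % n_objects] += data[offset:offset + unit]
    return [bytes(o) for o in objects]


def xor_parity(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of the chunks; shorter chunks are zero-padded."""
    width = max(len(c) for c in chunks)
    padded = [c.ljust(width, b"\0") for c in chunks]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*padded))


def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover one lost chunk from the survivors plus the parity chunk.

    The result has the padded width, so a chunk that was originally shorter
    comes back with trailing zero bytes.
    """
    return xor_parity(surviving + [parity])


if __name__ == "__main__":
    data = bytes(range(12))
    objs = stripe(data, unit=2, n_objects=3)
    parity = xor_parity(objs)
    # Pretend object 1 was lost and rebuild it from the others.
    recovered = rebuild([objs[0], objs[2]], parity)
    print(recovered == objs[1])
```

With replication, recovery after a failure is a copy of a surviving replica. With parity, recovery is a computation over the surviving chunks, which is the "parity rebuild, rather than re-replication" distinction drawn above.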

Do you refer to
http://ceph.com/docs/master/architecture/#how-ceph-clients-stripe-data
when talking about client driven raid? My understanding is that this striping
is designed to maximize throughput and is done in the client library
(gateway, rbd or cephfs). Since erasure coding is about recovering from
failures and would be implemented in libosd (next to ReplicatedPG), I am
under the impression that there is no overlap. What do you think?

>
> -Noah
>
>
>> Cheers
>>
>>> -Noah
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do nothing.
>>

--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.
