Hi koleosfuscus, A simpler proposal was made a few days ago. As you rightfully point out, the previous one was a bit complicated to understand ;-)
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/19753 Cheers On 02/06/2014 20:28, Koleos Fuskus wrote: > Hi Loic, > I am trying to understand your proposal on > http://pad.ceph.com/p/cdsgiant-pyramid-erasure-code > Is the mapping specification a new feature on CRUSH to support Pyramid Codes? > I don't follow from line 72, when you are talking about "crush > multidatacenter mapping". > Adding a failure domain typically adds a new level in the pyramid using xor? > > "** if one chunk is missing, will return that all chunks from the local > cluster are needed" > If one chunk is missing, it recovers it using xor instead of jerasure? > "** if two chunks are missing in the same local cluster, it will defer to the > global level" > In this case it has the pyramid code doesn't help, does it? > ** if two chunks are missing, each of them in a different local cluster, it > will return that it needs all chunks from both local cluster but will not > defer to the upper level > > Best, > koleos > > > > On Monday, June 2, 2014 3:14 PM, Loic Dachary <[email protected]> wrote: > Hi Andreas, > > On 02/06/2014 14:20, Andreas Joachim Peters wrote:> Hi Loic, >> >> I think this gives all the flexibility to define any possible combination >> for encoding ... >> >> When one constructs the steps one has just to be aware that the 'most local' >> encoding should happen in the end, right? > > Yes. > >> >> It would be usefule to have a tool which outputs then for each data aND >> parity chunk the achieved 'redundancy' and the overall volume and maximal >> reconstruction 'overhead'. > > Right. I'm kind of hoping koleosfuscus (cc'ed) will be able to fit that into > the reliability model, but we've not discussed that yet. In any case you are > right, a small command line tool would be helpful. Something that would > explain: if you loose one of the chunks you need four to recover. If you lose > two you need all of them. That's more humanly readable and understandable > than the full description ;-) > > Cheers > >> >> Cheers Andreas. >> >> ________________________________________ >> From: Loic Dachary [[email protected]] >> Sent: 31 May 2014 19:10 >> To: Andreas Joachim Peters >> Cc: Ceph Development >> Subject: Pyramid erasure code description revisited >> >> Hi Andreas, >> >> After a few weeks and a fresh eye, I revisited the way pyramid erasure code >> could be described by the system administrator. Here is a proposal that is >> hopefully more intuitive than the one from the last CDS ( >> http://pad.ceph.com/p/cdsgiant-pyramid-erasure-code ). >> >> These are the steps to create all coding chunks. The upper case letters are >> data chunks and the lower case letters are coding chunks. >> >> "__ABC__DE_" data chunks placement >> >> Step 1 >> "__ABC__DE_" >> "_yVWX_zYZ_" K=5, M=2 >> "_aABC_bDE_" >> >> Step 2 >> "_aABC_bDE_" >> "z_XYZ_____" K=3, M=1 >> "caABC_bDE_" >> >> Step 3 >> "caABC_bDE_" >> "_____zXYZ_" K=3, M=1 >> "caABCdbDE_" >> >> Step 4 >> "caABCdbDE_" >> "_____WXYZz" K=4, M=1 >> "caABCdbDEe" >> >> The interpretation of Step 3 is as follows: >> >> Given the output of the previous step ( "caABC_bDE_" ), the bDE chunks are >> considered to be data chunks at this stage and they are marked with XYZ. A >> K=3, M=1 coding chunk is calculated and placed in the chunk marked with z ( >> "_____zXYZ_" ). The output of this coding step is the previous step plus the >> coding chunk that was just calculated, named d ( "caABCdbDE_" ). >> >> This gives the flexibility of deciding wether or not a coding chunk from a >> previous step is used as data to compute the coding chunk of the next step. >> It also allows for unbalanced steps such as step 4. >> >> For decoding, the steps are walked from the bottom up. If E is missing, it >> can be reconstructed from dbD.e in step 4 and the other steps are skipped >> because it was the only missing chunk. If AB are missing, all steps that >> have not be used to encode it are ignored, up to step 2 that will fail to >> recover them because M=1 and yeild to step 1 that will use a..CbDE >> successfully because M=2. >> >> Giving up the recursion and favor iteration seems to simplify how it can be >> explained. And I suspect the implementation is also simpler. What do you >> think ? >> >> Cheers >> >> -- >> Loïc Dachary, Artisan Logiciel Libre >> >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to [email protected] >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- Loïc Dachary, Artisan Logiciel Libre
signature.asc
Description: OpenPGP digital signature
