On 06/18/2013 07:22 AM, Loic Dachary wrote:
Hi Ceph,
TL;DR: use jerasure 1.2 with Reed-Solomon to code/decode/repair an object, and
upgrade to 2.0 when available.
Disclaimer: I'm no expert ;-) The terms are explained in wikipedia[1].
Using Reed-Solomon object O is encoded by dividing it into consecutive chuncks
O1, O2, ... ON and computing parity blocks P1, P2, ... PK. Reading the
original content of object O is a simple concatenation of O1, O2, ... ON. If O2
or P2 are lost, they can be repaired/reconstructed using O1 ... ON and P1 ...
PK. If the use case is mostly reading objects and repairs are at least 1000
times less likely than normal operations, being able to read the object from
non-coded chuncks is attractive.
Reed-Solomon is significantly more expensive to encode ( 100MB/s order of
magnitude on a single 2.5Ghz core ) than fountain codes with the current
jerasure implementation[2]. However, gf-complete[3] that will be used in the
upcoming version of jerasure significantly improves performances ( 2 to 10
times faster ) and the difference becomes negligible.
One thing that we might consider is that ARM is very quickly becoming an
option for Ceph. It may be very important to have our erasure coding
scheme be viable on that platform and CPU is going to be the primary
bottleneck. It may be worth a quick look at NEON to see if there are
any things we should be thinking about now.
Reed-Solomon coding family is the only one that can keep the chuncks unencoded
and therefore concatenable.
The jerasure library is packaged and being worked on by the author at the
moment. All other Free Software implementations are either not packaged or not
maintained.
The license[4] of jerasure is compatible with the license of Ceph.
Performances depend on the parameters to the Reed-Solomon functions but they
will also be influenced by the buffer sizes used when calling the encoding
functions: smaller buffers will mean more calls and more overhead.
Open questions:
* Does Mojette Transform [5] have compelling qualities compared to other code
families ?
* Do hierarchical codes [6] have compelling qualities ? Implementing them would
require a different API. To be effective they need to take into account the
context in which an object is stored where the other code only require the
object itself.
* I have not experiemented with the jerasure API yet
Feedback and criticisms are welcome :-)
[1] http://en.wikipedia.org/wiki/Erasure_code
[2] jerasure 1.2 http://web.eecs.utk.edu/~plank/plank/papers/CS-08-627.html
[3] gf-complete http://web.eecs.utk.edu/~plank/plank/papers/CS-13-703.html
[4] jerasure license https://github.com/tsuraan/Jerasure/blob/master/License.txt
[5] Mojette Transform http://en.wikipedia.org/wiki/Mojette_Transform
[6] hierarchical codes http://www.e-biersack.eu/BPublished/nc_springer.pdf
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html