On 06/19/2013 01:33 PM, Mark Nelson wrote:
> On 06/18/2013 07:22 AM, Loic Dachary wrote:
>> Hi Ceph,
>>
>> TL;DR: use jerasure 1.2 with Reed-Solomon to code/decode/repair an object, 
>> and upgrade to 2.0 when available.
>>
>> Disclaimer: I'm no expert ;-) The terms are explained in wikipedia[1].
>>
>> Using Reed-Solomon object O is encoded by dividing it into consecutive 
>> chuncks O1, O2, ... ON and computing parity blocks P1, P2, ... PK.  Reading 
>> the original content of object O is a simple concatenation of O1, O2, ... 
>> ON. If O2 or P2 are lost, they can be repaired/reconstructed using O1 ... ON 
>> and P1 ... PK. If the use case is mostly reading objects and repairs are at 
>> least 1000 times less likely than normal operations, being able to read the 
>> object from non-coded chuncks is attractive.
>>
>> Reed-Solomon is significantly more expensive to encode ( 100MB/s order of 
>> magnitude on a single 2.5Ghz core ) than fountain codes with the current 
>> jerasure implementation[2]. However, gf-complete[3] that will be used in the 
>> upcoming version of jerasure significantly improves performances ( 2 to 10 
>> times faster ) and the difference becomes negligible.
> 
> One thing that we might consider is that ARM is very quickly becoming an 
> option for Ceph.  It may be very important to have our erasure coding scheme 
> be viable on that platform and CPU is going to be the primary bottleneck.  It 
> may be worth a quick look at NEON to see if there are any things we should be 
> thinking about now.

Hi Mark,

In another thread James Plank wrote that CPU usage is not going to be a problem 
as long as we're not trying to slice an object into more than 2^16 chunks ( the 
actual sentence is "I agree that the CPU burden of the GF arithmetic will not 
be a bottleneck in your system, regardless of which implementation you use, as 
long as you stay at or below GF(2^16)." 
http://article.gmane.org/gmane.comp.file-systems.ceph.devel/15650 ). It looks 
like we're aiming for something in the order of 10 data chunks + 5 parity 
chunks, i.e. much lower than 2^16. My hunch is that using more than 100 OSDs to 
code a single object would be problematic for reasons that are unrelated to the 
maths involved in coding it anyway.

That being said I can look for/write benchmark code based on jerasure to run on 
ARM and get a rough idea of the CPU footprint, if you think it's worth it. 

Cheers
> 
>>
>> Reed-Solomon coding family is the only one that can keep the chuncks 
>> unencoded and therefore concatenable.
>>
>> The jerasure library is packaged and being worked on by the author at the 
>> moment. All other Free Software implementations are either not packaged or 
>> not maintained.
>>
>> The license[4] of jerasure is compatible with the license of Ceph.
>>
>> Performances depend on the parameters to the Reed-Solomon functions but they 
>> will also be influenced by the buffer sizes used when calling the encoding 
>> functions: smaller buffers will mean more calls and more overhead.
>>
>> Open questions:
>>
>> * Does Mojette Transform [5] have compelling qualities compared to other 
>> code families ?
>> * Do hierarchical codes [6] have compelling qualities ? Implementing them 
>> would require a different API. To be effective they need to take into 
>> account the context in which an object is stored where the other code only 
>> require the object itself.
>> * I have not experiemented with the jerasure API yet
>>
>> Feedback and criticisms are welcome :-)
>>
>> [1] http://en.wikipedia.org/wiki/Erasure_code
>> [2] jerasure 1.2 http://web.eecs.utk.edu/~plank/plank/papers/CS-08-627.html
>> [3] gf-complete http://web.eecs.utk.edu/~plank/plank/papers/CS-13-703.html
>> [4] jerasure license 
>> https://github.com/tsuraan/Jerasure/blob/master/License.txt
>> [5] Mojette Transform http://en.wikipedia.org/wiki/Mojette_Transform
>> [6] hierarchical codes http://www.e-biersack.eu/BPublished/nc_springer.pdf
>>
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.


Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to