Re: CEPH Erasure Encoding + OSD Scalability

Loic Dachary Mon, 09 Dec 2013 08:46:06 -0800

Hi,

Mark Nelson suggested we use perf ( linux-tools ) for benchmarking. It looks 
like it would indeed help : the benchmark program would only concern itself 
with doing some work according to its options, and performance would be 
measured from the outside, using tools that are familiar to people doing 
benchmarking.
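
As a rough sketch of what measuring "from the outside" could look like, the 
snippet below builds a perf stat command line around one benchmark run. It 
only uses options that appear in the usage text further down (-r, -x, -o, -e). 
The binary name ./ec_benchmark, its --technique option, the output file name 
and the event names are assumptions made for illustration; 'perf list' shows 
which events a given machine actually supports.

```python
import shlex


def perf_stat_argv(benchmark_argv, repeat=5,
                   events=("task-clock", "cycles", "instructions"),
                   output="perf-stat.csv"):
    """Build a `perf stat` command line that wraps one benchmark run."""
    argv = [
        "perf", "stat",
        "-r", str(repeat),       # repeat the command, print average + stddev
        "-x", ",",               # machine-readable output, comma separated
        "-o", output,            # write the counters to a file
        "-e", ",".join(events),  # counters to collect (assumed event names)
    ]
    # "--" separates the perf options from the measured command.
    return argv + ["--"] + list(benchmark_argv)


# Hypothetical benchmark binary and option, for illustration only.
cmd = perf_stat_argv(["./ec_benchmark", "--technique", "cauchy_good"])
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list could be handed to subprocess.run() as is, so the 
benchmark program itself would not need any timing code.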


What do you think ?

Cheers

$ perf stat -e
  Error: switch `e' requires a value

 usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
        --filter <filter>
                          event filter
    -i, --no-inherit      child tasks do not inherit counters
    -p, --pid <pid>       stat events on existing process id
    -t, --tid <tid>       stat events on existing thread id
    -a, --all-cpus        system-wide collection from all CPUs
    -g, --group           put the counters into a counter group
    -c, --scale           scale/normalize counters
    -v, --verbose         be more verbose (show counter open errors, etc)
    -r, --repeat <n>      repeat command and print average + stddev (max: 100, forever: 0)
    -n, --null            null run - dont start any counters
    -d, --detailed        detailed run - start a lot of events
    -S, --sync            call sync() before starting a run
    -B, --big-num         print large numbers with thousands' separators
    -C, --cpu <cpu>       list of cpus to monitor in system-wide
    -A, --no-aggr         disable CPU count aggregation
    -x, --field-separator <separator>
                          print counts with custom separator
    -G, --cgroup <name>   monitor event in cgroup name only
    -o, --output <file>   output file name
        --append          append to the output file
        --log-fd <n>      log output to fd, instead of stderr
        --pre <command>   command to run prior to the measured command
        --post <command>  command to run after to the measured command
    -I, --interval-print <n>
                          print counts at regular interval in ms (>= 100)
        --per-socket      aggregate counts per processor socket
        --per-core        aggregate counts per physical processor core


On 12/11/2013 19:06, Loic Dachary wrote:
> Hi Andreas,
> 
> On 12/11/2013 02:11, Andreas Joachim Peters wrote:
>> Hi Loic,
>>
>> I am finally writing the benchmark tool and I found a bunch of wrong 
>> parameter checks which can make the whole thing SEGV.
>>
>> All the RAID-6 codes have restrictions on their parameters, but they are not 
>> correctly enforced for Liberation & Blaum-Roth codes in the CEPH wrapper 
>> class ... see this text from the PDF:
>>
>> "Minimal Density RAID-6 codes are MDS codes based on binary matrices which 
>> satisfy a lower-bound on the number of non-zero entries. Unlike Cauchy 
>> coding, the bit-matrix elements do not correspond to elements in GF(2^w). 
>> Instead, the bit-matrix itself has the proper MDS property. Minimal Density 
>> RAID-6 codes perform faster than Reed-Solomon and Cauchy Reed-Solomon codes 
>> for the same parameters. Liberation coding, Liber8tion coding, and 
>> Blaum-Roth coding are three examples of this kind of coding that are 
>> supported in jerasure.
>>
>> With each of these codes, m must be equal to two and k must be less than or 
>> equal to w. The value of w has restrictions based on the code:
>>
>> • With Liberation coding, w must be a prime number [Pla08b].
>> • With Blaum-Roth coding, w + 1 must be a prime number [BR99].
>> • With Liber8tion coding, w must equal 8 [Pla08a].
>>
>> ...
>>
>> Will you add these fixes?
> 
> Nice catch. I created and assigned to myself : 
> http://tracker.ceph.com/issues/6754
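
For reference, the restrictions quoted above could be checked along these 
lines before any encoding is attempted. This is only a sketch in Python, not 
the code of the Ceph wrapper class : the function names, the technique 
strings ( "blaum_roth" in particular ) and the error messages are made up for 
the example.

```python
def is_prime(n):
    """Trial division; enough for the small word sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def check_raid6_parameters(technique, k, m, w):
    """Return a list of problems; an empty list means the parameters are OK."""
    if technique not in ("liberation", "blaum_roth", "liber8tion"):
        raise ValueError("not a minimal density RAID-6 technique: %s" % technique)
    errors = []
    # Restrictions shared by the three codes.
    if m != 2:
        errors.append("m=%d must be equal to 2" % m)
    if k > w:
        errors.append("k=%d must be less than or equal to w=%d" % (k, w))
    # Restriction on w that depends on the code.
    if technique == "liberation" and not is_prime(w):
        errors.append("w=%d must be a prime number" % w)
    elif technique == "blaum_roth" and not is_prime(w + 1):
        errors.append("w+1=%d must be a prime number" % (w + 1))
    elif technique == "liber8tion" and w != 8:
        errors.append("w=%d must be equal to 8" % w)
    return errors


print(check_raid6_parameters("liberation", k=7, m=2, w=7))
print(check_raid6_parameters("liberation", k=9, m=2, w=8))
```

Returning every problem at once, instead of stopping at the first one, would 
let a benchmark skip invalid configurations and report why.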
>>
>> The benchmark suite currently runs 308 different configurations for the 2 
>> algorithms which make sense from the performance point of view, and 
>> produces this output:
>>
>>
>> # -----------------------------------------------------------------
>> # Erasure Coding Benchmark - (C) CERN 2013 - [email protected]
>> # Ram-Size=12614856704 Allocation-Size=100000000
>> # -----------------------------------------------------------------
>> # [ -BENCH- ] [       ] technique=memcpy                                                            speed=5.408 [GB/s] latency=9.245 ms
>> # [ -BENCH- ] [       ] technique=d=a^b^c-xor                                                       speed=4.377 [GB/s] latency=17.136 ms
>> # [ -BENCH- ] [001/304] technique=cauchy_good:k=05:m=2:w=8:lp=0:packet=00064:size=50000000          speed=1.308 [GB/s] latency=038   [ms] size-overhead=40   [%]
>> ..
>> ..
>> # [ -BENCH- ] [304/304] technique=liberation:k=24:m=2:w=29:lp=2:packet=65536:size=50000000          speed=0.083 [GB/s] latency=604   [ms] size-overhead=16   [%]
>> # -----------------------------------------------------------------
>> # Erasure Code Performance Summary::
>> # -----------------------------------------------------------------
>> # RAM:                   12.61 GB
>> # Allocation-Size        0.10 GB
>> # -----------------------------------------------------------------
>> # Byte Initialization:   29.35 MB/s
>> # Memcpy:                5.41 GB/s
>> # Triple-XOR:            4.38 GB/s
>> # -----------------------------------------------------------------
>> # Fastest RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=04096:size=50000000
>> # Fastest Triple Failure 0.96 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=04096:size=50000000
>> # Fastest Quadr. Failure 0.66 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=04096:size=50000000
>> # -----------------------------------------------------------------
>> # .................................................................
>> # Top 1  RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=04096:size=50000000
>> # Top 2  RAID6          2.72 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=16384:size=50000000
>> # Top 3  RAID6          2.64 GB/s liber8tion:k=06:m=2:w=8:lp=0:packet=65536:size=50000000
>> # Top 4  RAID6          2.60 GB/s liberation:k=07:m=2:w=7:lp=0:packet=16384:size=50000000
>> # Top 5  RAID6          2.59 GB/s liberation:k=05:m=2:w=7:lp=0:packet=04096:size=50000000
>> # .................................................................
>> # Top 1  Triple         0.96 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=04096:size=50000000
>> # Top 2  Triple         0.94 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=16384:size=50000000
>> # Top 3  Triple         0.93 GB/s cauchy_good:k=06:m=3:w=8:lp=0:packet=65536:size=50000000
>> # Top 4  Triple         0.89 GB/s cauchy_good:k=07:m=3:w=8:lp=0:packet=04096:size=50000000
>> # Top 5  Triple         0.87 GB/s cauchy_good:k=05:m=3:w=8:lp=0:packet=04096:size=50000000
>> # .................................................................
>> # Top 1  Quadr.         0.66 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=04096:size=50000000
>> # Top 2  Quadr.         0.65 GB/s cauchy_good:k=07:m=4:w=8:lp=0:packet=04096:size=50000000
>> # Top 3  Quadr.         0.64 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=16384:size=50000000
>> # Top 4  Quadr.         0.64 GB/s cauchy_good:k=05:m=4:w=8:lp=0:packet=04096:size=50000000
>> # Top 5  Quadr.         0.64 GB/s cauchy_good:k=06:m=4:w=8:lp=0:packet=65536:size=50000000
>> # .................................................................
>>
>> It takes around 30 seconds on my box. 
> 
> 
> That looks great :-) If I understand correctly, it means 
> https://github.com/ceph/ceph/pull/740 will no longer have benchmarks as they 
> are moved to a separate program. Correct ?
> 
>> I will add a measurement of how the XOR and the 3 top algorithms scale with 
>> the number of cores and make the object-size configurable from the command 
>> line. Anything else ? 
> 
> It would be convenient to run this from a "workunit" ( i.e. a script in 
> ceph/qa/workunits/ ) so that it can later be run by teuthology integration 
> tests. That could be used to show regression.
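
A workunit could, for instance, compare the speeds of a run against a stored 
baseline. The sketch below is only an illustration in Python : it assumes 
each result is on a single line in the format of the output quoted above, 
and the 10% tolerance and the function names are arbitrary choices, not 
anything the benchmark or teuthology provides.

```python
import re

# Matches e.g. "technique=cauchy_good:k=05:...   speed=1.308 [GB/s]"
RESULT_RE = re.compile(
    r"technique=(?P<technique>\S+)\s+speed=(?P<speed>[0-9.]+) \[GB/s\]")


def parse_speeds(output):
    """Map each technique string to its reported speed in GB/s."""
    return {m.group("technique"): float(m.group("speed"))
            for m in RESULT_RE.finditer(output)}


def find_regressions(baseline, current, tolerance=0.10):
    """List (technique, old, new) where speed dropped by more than tolerance."""
    regressions = []
    for technique, old in sorted(baseline.items()):
        new = current.get(technique)
        if new is not None and new < old * (1.0 - tolerance):
            regressions.append((technique, old, new))
    return regressions


sample = (
    "# [ -BENCH- ] [001/304] technique=cauchy_good:k=05:m=2:w=8:lp=0:"
    "packet=00064:size=50000000          speed=1.308 [GB/s] latency=038   [ms]\n"
    "# [ -BENCH- ] [304/304] technique=liberation:k=24:m=2:w=29:lp=2:"
    "packet=65536:size=50000000          speed=0.083 [GB/s] latency=604   [ms]\n"
)
baseline = parse_speeds(sample)

# Hypothetical later run in which the first configuration got slower.
key = "cauchy_good:k=05:m=2:w=8:lp=0:packet=00064:size=50000000"
current = dict(baseline)
current[key] = 1.0
regressions = find_regressions(baseline, current)
print(regressions)
```

A script in ceph/qa/workunits/ would then only need to exit with a non-zero 
status when the list is not empty.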
> 
>> Shall I add the possibility to test a single user-specified configuration 
>> via command line arguments?
>>
> I would need to play with it to comment usefully.
> 
> Cheers
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
